Data Lake, Data Warehouse and Metadata

Enriching metadata for accurate text-to-SQL generation for Amazon Athena

AWS Big Data

OCTOBER 14, 2024

Enterprise data is brought into data lakes and data warehouses to carry out analytical, reporting, and data science use cases using AWS analytical services like Amazon Athena , Amazon Redshift , Amazon EMR , and so on. Table metadata is fetched from AWS Glue. The generated Athena SQL query is run.

Metadata

Metadata Data Lake Modeling Data Warehouse

Understanding the Differences Between Data Lakes and Data Warehouses

Smart Data Collective

AUGUST 28, 2021

Data lakes and data warehouses are probably the two most widely used structures for storing data. Data Warehouses and Data Lakes in a Nutshell. A data warehouse is used as a central storage space for large amounts of structured data coming from various sources.

Enriching metadata for accurate text-to-SQL generation for Amazon Athena

Understanding the Differences Between Data Lakes and Data Warehouses

Webinars

Trending Sources

Recap of Amazon Redshift key product announcements in 2024

Webinars

Run Apache XTable in AWS Lambda for background conversion of open table formats

Seamless integration of data lake and data warehouse using Amazon Redshift Spectrum and Amazon DataZone

Expand data access through Apache Iceberg using Delta Lake UniForm on AWS

The next generation of Amazon SageMaker: The center for all your data, analytics, and AI

Write queries faster with Amazon Q generative SQL for Amazon Redshift

Use Apache Iceberg in a data lake to support incremental data processing

Use Apache Iceberg in your data lake with Amazon S3, AWS Glue, and Snowflake

Query your Iceberg tables in data lake using Amazon Redshift (Preview)

How Cloudinary transformed their petabyte scale streaming data lake with Apache Iceberg and AWS Analytics

Accelerate Amazon Redshift Data Lake queries with AWS Glue Data Catalog Column Statistics

Design a data mesh pattern for Amazon EMR-based data lakes using AWS Lake Formation with Hive metastore federation

Accelerate SQL code migration from Google BigQuery to Amazon Redshift using BladeBridge

Build a serverless transactional data lake with Apache Iceberg, Amazon EMR Serverless, and Amazon Athena

Choosing an open table format for your transactional data lake on AWS

How EUROGATE established a data mesh architecture using Amazon DataZone

Simplify data integration with AWS Glue and zero-ETL to Amazon SageMaker Lakehouse

What is a Data Mesh?

Simplify operational data processing in data lakes using AWS Glue and Apache Hudi

Top analytics announcements of AWS re:Invent 2024

The Data Lakehouse: Blending Data Warehouses and Data Lakes

Cloudera Data Warehouse Demonstrates Best-in-Class Cloud-Native Price-Performance

How Morningstar used tag-based access controls in AWS Lake Formation to manage permissions for an Amazon Redshift data warehouse

Improve operational efficiencies of Apache Iceberg tables built on Amazon S3 data lakes

Manage your data warehouse cost allocations with Amazon Redshift Serverless tagging

How ANZ Institutional Division built a federated data platform to enable their domain teams to build data products to support business outcomes

Build a real-time GDPR-aligned Apache Iceberg data lake

Amazon SageMaker Lakehouse now supports attribute-based access control

The Increasing Importance of Open Table Formats

Data Lakes: What Are They and Who Needs Them?

Implement tag-based access control for your data lake and Amazon Redshift data sharing with AWS Lake Formation

Building a Beautiful Data Lakehouse

Salesforce debuts Zero Copy Partner Network to ease data integration

Amazon Redshift announcements at AWS re:Invent 2023 to enable analytics on all your data

Empower your Jira data in a data lake with Amazon AppFlow and AWS Glue

The Future of the Data Lakehouse – Open

Open Data Lakehouse powered by Iceberg for all your Data Warehouse needs

Informatica’s new data management clouds target health, finance services

Read and write S3 Iceberg table using AWS Glue Iceberg Rest Catalog from Open Source Apache Spark

The Future of the Data Lakehouse – Open

Data governance in the age of generative AI

How to modernize data lakes with a data lakehouse architecture

Stay Connected