Blog, Data Lake and Metadata - Data Leaders Brief

Enriching metadata for accurate text-to-SQL generation for Amazon Athena

AWS Big Data

OCTOBER 14, 2024

Enterprise data is brought into data lakes and data warehouses to carry out analytical, reporting, and data science use cases using AWS analytical services like Amazon Athena , Amazon Redshift , Amazon EMR , and so on. Table metadata is fetched from AWS Glue. The generated Athena SQL query is run.

Metadata

Metadata Data Lake Modeling Data Warehouse

Migrate an existing data lake to a transactional data lake using Apache Iceberg

AWS Big Data

OCTOBER 3, 2023

A data lake is a centralized repository that you can use to store all your structured and unstructured data at any scale. You can store your data as-is, without having to first structure the data and then run different types of analytics for better business insights.

Enriching metadata for accurate text-to-SQL generation for Amazon Athena

Migrate an existing data lake to a transactional data lake using Apache Iceberg

Webinars

Trending Sources

Monitoring Apache Iceberg metadata layer using AWS Lambda, AWS Glue, and AWS CloudWatch

Webinars

Use Apache Iceberg in a data lake to support incremental data processing

Recap of Amazon Redshift key product announcements in 2024

Expand data access through Apache Iceberg using Delta Lake UniForm on AWS

How BMW streamlined data access using AWS Lake Formation fine-grained access control

How Cloudinary transformed their petabyte scale streaming data lake with Apache Iceberg and AWS Analytics

How Volkswagen streamlined access to data across multiple data lakes using Amazon DataZone – Part 1

Data Lakes on Cloud & it’s Usage in Healthcare

Multicloud data lake analytics with Amazon Athena

Query your Iceberg tables in data lake using Amazon Redshift (Preview)

Simplify data integration with AWS Glue and zero-ETL to Amazon SageMaker Lakehouse

Expanding data analysis and visualization options: Amazon DataZone now integrates with Tableau, Power BI, and more

Use open table format libraries on AWS Glue 5.0 for Apache Spark

Cloudera and Snowflake Partner to Deliver the Most Comprehensive Open Data Lakehouse

Simplify operational data processing in data lakes using AWS Glue and Apache Hudi

What is a Data Mesh?

Improve operational efficiencies of Apache Iceberg tables built on Amazon S3 data lakes

The next generation of Amazon SageMaker: The center for all your data, analytics, and AI

Orca Security’s journey to a petabyte-scale data lake with Apache Iceberg and AWS Analytics

Efficiently crawl your data lake and improve data access with an AWS Glue crawler using partition indexes

Addressing Data Mesh Technical Challenges with DataOps

Empower your Jira data in a data lake with Amazon AppFlow and AWS Glue

Implement tag-based access control for your data lake and Amazon Redshift data sharing with AWS Lake Formation

Doing Cloud Migration and Data Governance Right the First Time

Demystify data sharing and collaboration patterns on AWS: Choosing the right tool for the job

How to modernize data lakes with a data lakehouse architecture

Apache Iceberg optimization: Solving the small files problem in Amazon EMR

The Data Lakehouse: Blending Data Warehouses and Data Lakes

Introducing Apache Hudi support with AWS Glue crawlers

Data Cataloging in the Data Lake: Alation + Kylo

How Cargotec uses metadata replication to enable cross-account data sharing

Driving Business Value and ROI from a Hybrid Cloud Data Lake

Constructing A Digital Transformation Strategy: Putting the Data in Digital Transformation

How ATPCO enables governed self-service data access to accelerate innovation with Amazon DataZone

Where Do Data Catalogs Fit in Metadata Management?

Apache Ozone and Dense Data Nodes

The Future of the Data Lakehouse – Open

Migrate Hive data from CDH to CDP public cloud

Query AWS Glue Data Catalog views using Amazon Athena and Amazon Redshift

NVIDIA RAPIDS in Cloudera Machine Learning

How Cloudera Data Flow Enables Successful Data Mesh Architectures

The Security Challenges of Data Warehousing in the Cloud

Stay Connected