This is part two of a three-part series where we show how to build a data lake on AWS using a modern data architecture. This post shows how to load data from a legacy database (SQL Server) into a transactional data lake (Apache Iceberg) using AWS Glue.
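The post walks through that pipeline step by step; as a rough sketch of the pattern (assuming a Glue PySpark job launched with the Iceberg connector and a Spark catalog named `glue_catalog` backed by the AWS Glue Data Catalog; the JDBC URL, credentials, and table names are placeholders):

```python
# Sketch of a Glue PySpark job: read a legacy SQL Server table over JDBC and
# write it to an Apache Iceberg table registered in the AWS Glue Data Catalog.
# All connection details and identifiers below are illustrative placeholders.
from awsglue.context import GlueContext
from pyspark.context import SparkContext

glue_context = GlueContext(SparkContext.getOrCreate())
spark = glue_context.spark_session

# Pull the source table from SQL Server (JDBC driver assumed to be available).
orders = (
    spark.read.format("jdbc")
    .option("url", "jdbc:sqlserver://legacy-db.example.com:1433;databaseName=sales")
    .option("dbtable", "dbo.orders")
    .option("user", "etl_user")
    .option("password", "***")
    .option("driver", "com.microsoft.sqlserver.jdbc.SQLServerDriver")
    .load()
)

# Write to a transactional Iceberg table; assumes the job was started with an
# Iceberg-enabled Spark session and a catalog named glue_catalog.
orders.writeTo("glue_catalog.sales_db.orders").using("iceberg").createOrReplace()
```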
Enterprise data is brought into data lakes and data warehouses to carry out analytical, reporting, and data science use cases using AWS analytical services like Amazon Athena, Amazon Redshift, Amazon EMR, and so on. Subsequently, we’ll explore strategies for overcoming these challenges.
In our previous post Backtesting index rebalancing arbitrage with Amazon EMR and Apache Iceberg, we showed how to use Apache Iceberg in the context of strategy backtesting. Our analysis shows that Iceberg can accelerate query performance by up to 52%, reduce operational costs, and significantly improve data management at scale.
In modern data architectures, Apache Iceberg has emerged as a popular table format for data lakes, offering key features including ACID transactions and concurrent write support. However, commits can still fail if the latest metadata is updated after the base metadata version is established.
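In practice, Iceberg handles this with optimistic concurrency: a failed commit is retried against the refreshed metadata. How aggressively a table retries can be tuned through table properties; a minimal sketch, assuming an Iceberg-enabled Spark session and a placeholder table name:

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Raise Iceberg's commit retry settings on a heavily written table so that
# concurrent writers are more likely to succeed after a metadata conflict.
spark.sql("""
    ALTER TABLE glue_catalog.sales_db.orders SET TBLPROPERTIES (
        'commit.retry.num-retries' = '10',
        'commit.retry.min-wait-ms' = '200'
    )
""")
```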
A data lake is a centralized repository that you can use to store all your structured and unstructured data at any scale. You can store your data as-is, without having to first structure it, and run different types of analytics to gain better business insights.
Apache Iceberg is an open table format for very large analytic datasets, which captures metadata information on the state of datasets as they evolve and change over time. Iceberg has become very popular for its support for ACID transactions in data lakes and features like schema and partition evolution, time travel, and rollback.
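Those features are exposed directly through Spark SQL; a minimal sketch of time travel and rollback (assuming Spark 3.3 or later, an Iceberg catalog named `glue_catalog`, and a placeholder table and snapshot ID):

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Time travel: query the table as it existed at an earlier point in time.
spark.sql("""
    SELECT * FROM glue_catalog.sales_db.orders TIMESTAMP AS OF '2024-01-01 00:00:00'
""").show()

# Inspect available snapshots via the Iceberg metadata table...
spark.sql("SELECT snapshot_id, committed_at FROM glue_catalog.sales_db.orders.snapshots").show()

# ...and roll the table back to one of them.
spark.sql("CALL glue_catalog.system.rollback_to_snapshot('sales_db.orders', 1234567890123456789)")
```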
In the era of big data, data lakes have emerged as a cornerstone for storing vast amounts of raw data in its native format. They support structured, semi-structured, and unstructured data, offering a flexible and scalable environment for data ingestion from multiple sources.
A modern data strategy redefines and enables sharing data across the enterprise and allows for both reading and writing of a singular instance of the data using an open table format. When evolving such a partition definition, the data in the table prior to the change is unaffected, as is its metadata.
Data silos across business units and countries led to inefficiencies in data governance and access control. To address this, they aimed to break down those silos and centralize the data into the BMW Cloud Data Hub (CDH).
A modern data architecture enables companies to ingest virtually any type of data through automated pipelines into a data lake, which provides highly durable and cost-effective object storage at petabyte or exabyte scale.
Today, Amazon Redshift is used by customers across all industries for a variety of use cases, including data warehouse migration and modernization, near real-time analytics, self-service analytics, data lake analytics, machine learning (ML), and data monetization.
Under the hood, UniForm generates Iceberg metadata files (including metadata and manifest files) that are required for Iceberg clients to access the underlying data files in Delta Lake tables. Both Delta Lake and Iceberg metadata files reference the same data files.
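Enabling UniForm comes down to Delta table properties; a minimal sketch, assuming a Spark session with a recent Delta Lake runtime and a placeholder table name:

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Turn on UniForm so Iceberg clients can read this Delta table; Delta remains
# the writer, and Iceberg metadata is generated alongside the Delta log.
spark.sql("""
    ALTER TABLE sales_db.orders_delta SET TBLPROPERTIES (
        'delta.enableIcebergCompatV2' = 'true',
        'delta.universalFormat.enabledFormats' = 'iceberg'
    )
""")
```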
Amazon Redshift enables you to efficiently query and retrieve structured and semi-structured data from open format files in an Amazon S3 data lake without having to load the data into Amazon Redshift tables. Amazon Redshift extends SQL capabilities to your data lake, enabling you to run analytical queries.
Amazon Redshift enables you to directly access data stored in Amazon Simple Storage Service (Amazon S3) using SQL queries and join data across your data warehouse and data lake. With Amazon Redshift, you can query the data in your S3 data lake using a central AWS Glue metastore from your Redshift data warehouse.
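A typical pattern maps an external schema to the Glue Data Catalog and then joins lake and warehouse tables in one query; a sketch using the redshift_connector driver (cluster endpoint, IAM role, and table names are placeholders):

```python
# Sketch: query S3 data lake tables from Redshift through an external schema
# backed by the AWS Glue Data Catalog, joined with a local warehouse table.
import redshift_connector

conn = redshift_connector.connect(
    host="my-cluster.abc123.us-east-1.redshift.amazonaws.com",
    database="dev",
    user="awsuser",
    password="***",
)
conn.autocommit = True  # external schema DDL should not run in an explicit transaction
cur = conn.cursor()

cur.execute("""
    CREATE EXTERNAL SCHEMA IF NOT EXISTS lake
    FROM DATA CATALOG DATABASE 'sales_db'
    IAM_ROLE 'arn:aws:iam::123456789012:role/RedshiftSpectrumRole'
""")

cur.execute("""
    SELECT c.customer_id, SUM(o.amount) AS total_amount
    FROM lake.orders o
    JOIN public.customers c ON o.customer_id = c.customer_id
    GROUP BY c.customer_id
""")
print(cur.fetchall())
```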
I aim to outline pragmatic strategies to elevate data quality into an enterprise-wide capability. We also examine how centralized, hybrid, and decentralized data architectures support scalable, trustworthy ecosystems. This challenge remains overlooked despite its profound impact on strategy and execution.
Since the deluge of big data over a decade ago, many organizations have learned to build applications to process and analyze petabytes of data. Data lakes have served as a central repository to store structured and unstructured data at any scale and in various formats.
Amazon DataZone now supports authentication through the Amazon Athena JDBC driver, allowing data users to seamlessly query their subscribed data lake assets via popular business intelligence (BI) and analytics tools like Tableau, Power BI, Excel, SQL Workbench, DBeaver, and more.
Organizations are increasingly using a multi-cloud strategy to run their production workloads. We often see requests from customers who have started their data journey by building data lakes on Microsoft Azure and want to extend access to that data to AWS services. For this post, we use the Shared Key authentication method.
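With Shared Key authentication, an AWS-hosted Spark engine can reach Azure Data Lake Storage Gen2 through the ABFS connector; a minimal sketch, assuming the hadoop-azure libraries are on the classpath and using placeholder account, container, and key values:

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Authenticate to the ADLS Gen2 storage account with its shared access key.
spark.conf.set(
    "fs.azure.account.key.mystorageacct.dfs.core.windows.net",
    "<shared-access-key>",
)

# Read Parquet files directly from the Azure container over abfss://.
df = spark.read.parquet("abfss://landing@mystorageacct.dfs.core.windows.net/sales/orders/")
df.show(5)
```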
A modern data architecture is an evolutionary architecture pattern designed to integrate a data lake, data warehouse, and purpose-built stores with a unified governance model. The company wanted the ability to continue processing operational data in the secondary Region in the rare event of primary Region failure.
Open table formats are emerging in the rapidly evolving domain of big data management, fundamentally altering the landscape of data storage and analysis. By providing a standardized framework for data representation, open table formats break down data silos, enhance data quality, and accelerate analytics at scale.
This post explores how the shift to a data product mindset is being implemented, the challenges faced, and the early wins that are shaping the future of data management in the Institutional Division. Divisions decide how many domains to have within their node; some may have one, others many.
For many organizations, this centralized data store follows a data lake architecture. Although data lakes provide a centralized repository, making sense of this data and extracting valuable insights can be challenging. The process creates a JSON file with the original_content and summary fields.
When you build your transactional data lake using Apache Iceberg to solve your functional use cases, you also need to focus on operational use cases for your S3 data lake to keep the production environment optimized for availability. You still need to set appropriate EMRFS retries to provide additional resiliency.
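EMRFS retry behavior is typically adjusted through the emrfs-site configuration classification when the cluster is defined; a hedged sketch (the property value is illustrative, not a recommendation):

```python
# Sketch: an EMR configuration classification that raises the EMRFS retry limit,
# passed in the Configurations list of the cluster definition (for example via
# boto3's run_job_flow). The value shown here is a placeholder.
emrfs_retry_config = [
    {
        "Classification": "emrfs-site",
        "Properties": {
            "fs.s3.maxRetries": "20",
        },
    }
]
```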
Having a clearly defined digital transformation strategy is an essential best practice for successful digital transformation. But what makes a viable digital transformation strategy? Constructing A Digital Transformation Strategy: Data Enablement. With automation, data quality is systemically assured.
Data lakes are a popular choice for today’s organizations to store their data around their business activities. As a best practice of data lake design, data should be immutable once stored. A data lake built on AWS uses Amazon Simple Storage Service (Amazon S3) as its primary storage environment.
With data becoming the driving force behind many industries today, having a modern data architecture is pivotal for organizations to be successful. In this post, we describe Orca’s journey building a transactional data lake using Amazon Simple Storage Service (Amazon S3), Apache Iceberg, and AWS Analytics.
Analytics remained one of the key focus areas this year, with significant updates and innovations aimed at helping businesses harness their data more efficiently and accelerate insights. From enhancing data lakes to empowering AI-driven analytics, AWS unveiled new tools and services that are set to shape the future of data and analytics.
As enterprises collect increasing amounts of data from various sources, the structure and organization of that data often need to change over time to meet evolving analytical needs. Schema evolution enables adding, deleting, renaming, or modifying columns without needing to rewrite existing data.
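With Apache Iceberg, for example, these are metadata-only DDL operations; a sketch, assuming an Iceberg-enabled Spark session and a placeholder table and columns:

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Iceberg schema evolution: each statement updates table metadata without
# rewriting existing data files.
spark.sql("ALTER TABLE glue_catalog.sales_db.orders ADD COLUMN discount_pct double")
spark.sql("ALTER TABLE glue_catalog.sales_db.orders RENAME COLUMN amount TO gross_amount")
spark.sql("ALTER TABLE glue_catalog.sales_db.orders DROP COLUMN legacy_flag")
```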
I previously wrote about the importance of open table formats to the evolution of data lakes into data lakehouses. The concept of the data lake was initially proposed as a single environment where data could be combined from multiple sources to be stored and processed to enable analysis by multiple users for multiple purposes.
However, enterprises often encounter challenges with data silos, insufficient access controls, poor governance, and quality issues. Embracing data as a product is key to addressing these challenges and fostering a data-driven culture. To achieve this, they plan to use machine learning (ML) models to extract insights from data.
Data-driven organizations treat data as an asset and use it across different lines of business (LOBs) to drive timely insights and better business decisions. This leads to data spread across many data warehouse and data lake instances in a modern data architecture across separate AWS accounts.
Data lakes have been around for well over a decade now, supporting the analytic operations of some of the world’s largest corporations. Such data volumes are not easy to move, migrate, or modernize. The challenges of a monolithic data lake architecture: data lakes are, at a high level, single repositories of data at scale.
However, they do contain effective data management, organization, and integrity capabilities. As a result, users can easily find what they need, and organizations avoid the operational and cost burdens of storing unneeded or duplicate data copies. Warehouse and data lake convergence: meet the data lakehouse.
These tools include enterprise service bus (ESB) products; data integration tools; extract, transform, and load (ETL) tools; procedural code; application programming interfaces (APIs); file transfer protocol (FTP) processes; and even business intelligence (BI) reports that further aggregate and transform data.
But most important of all, the assumed dormant value in the unstructured data is a question mark that can only be answered after these sophisticated techniques have been applied. Therefore, there is a need to be able to analyze and extract value from the data economically and flexibly. The solution integrates data in three tiers.
AWS-powered data lakes, supported by the unmatched availability of Amazon Simple Storage Service (Amazon S3), can handle the scale, agility, and flexibility required to combine different data and analytics approaches. When expiring old Iceberg snapshots, the process will never remove files that are still required by a non-expired snapshot.
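A minimal sketch of that housekeeping step using the Iceberg expire_snapshots Spark procedure (catalog, table, and retention values are placeholders):

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Expire snapshots older than a cutoff while always retaining the last 10;
# data files still referenced by any remaining snapshot are left untouched.
spark.sql("""
    CALL glue_catalog.system.expire_snapshots(
        table => 'sales_db.orders',
        older_than => TIMESTAMP '2024-01-01 00:00:00',
        retain_last => 10
    )
""")
```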
At the same time, they need to optimize operational costs to unlock the value of this data for timely insights, and do so with consistent performance. With this massive data growth, data proliferation across your data stores, data warehouses, and data lakes can become equally challenging.
Data is your generative AI differentiator, and a successful generative AI implementation depends on a robust data strategy incorporating a comprehensive data governance approach. As part of the transformation, the objects need to be treated to ensure data privacy (for example, PII redaction).
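As a simple illustration of that treatment step, a hedged sketch of record-level PII redaction (field names and the regex are illustrative; managed options such as AWS Glue sensitive data detection or Amazon Comprehend PII detection are common alternatives):

```python
import re

# Small illustration: redact known PII fields and scrub email addresses from
# free-text fields before the records land in the data lake.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")

def redact_record(record: dict, pii_fields=("customer_name", "ssn")) -> dict:
    cleaned = dict(record)
    for field in pii_fields:
        if field in cleaned:
            cleaned[field] = "[REDACTED]"
    if isinstance(cleaned.get("notes"), str):
        cleaned["notes"] = EMAIL_RE.sub("[EMAIL]", cleaned["notes"])
    return cleaned

print(redact_record({"customer_name": "Jane Doe", "notes": "contact jane@example.com"}))
```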
A Gartner Marketing survey found only 14% of organizations have successfully implemented a C360 solution, due to lack of consensus on what a 360-degree view means, challenges with data quality, and lack of cross-functional governance structure for customer data. Then, you transform this data into a concise format.
Amazon DataZone allows you to simply and securely govern end-to-end data assets stored in your Amazon Redshift data warehouses or data lakes cataloged with the AWS Glue Data Catalog. Note that a managed data asset is an asset for which Amazon DataZone can manage permissions.
The data architect also “provides a standard common business vocabulary, expresses strategic requirements, outlines high-level integrated designs to meet those requirements, and aligns with enterprise strategy and related business architecture,” according to DAMA International’s Data Management Body of Knowledge.
Today’s enterprise data analytics teams are constantly looking to get the best out of their platforms. Storage plays one of the most important roles in a data platform strategy; it provides the basis for all compute engines and applications to be built on top of it. Metadata in the cluster is disjoint across components.
This solution only replicates metadata in the Data Catalog, not the actual underlying data. To have a redundant data lake using Lake Formation and AWS Glue in an additional Region, we recommend replicating the Amazon S3-based storage using S3 Replication, S3 sync, aws-s3-copy-sync-using-batch, or the S3 Batch Replication process.
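A sketch of the metadata side of that pattern with boto3, copying one Glue table definition to a second Region (database, table, and Regions are placeholders, and the target database is assumed to already exist):

```python
import boto3

SOURCE_REGION, TARGET_REGION = "us-east-1", "us-west-2"
glue_src = boto3.client("glue", region_name=SOURCE_REGION)
glue_dst = boto3.client("glue", region_name=TARGET_REGION)

# Fetch the table definition from the source Data Catalog.
table = glue_src.get_table(DatabaseName="sales_db", Name="orders")["Table"]

# Keep only the fields accepted by TableInput when re-creating the table.
allowed_keys = {
    "Name", "Description", "Owner", "Retention", "StorageDescriptor",
    "PartitionKeys", "TableType", "Parameters",
}
table_input = {k: v for k, v in table.items() if k in allowed_keys}

# Recreate the table in the target Region; the underlying S3 objects are
# replicated separately (for example with S3 Replication).
glue_dst.create_table(DatabaseName="sales_db", TableInput=table_input)
```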
For Shared database’s region, choose the Data Catalog view source Region. The Shared database and Shared database’s owner ID fields are populated manually from the database metadata. The resource link appears on the Databases page on the Lake Formation console.