This is part two of a three-part series where we show how to build a data lake on AWS using a modern data architecture. This post shows how to load data from a legacy database (SQL Server) into a transactional data lake (Apache Iceberg) using AWS Glue.
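To make the excerpt above more concrete, here is a minimal sketch of the kind of AWS Glue PySpark job such a load could use, assuming a Glue JDBC connection to SQL Server and an Iceberg-enabled Glue Data Catalog; the connection, database, table, and catalog names are illustrative, not the ones from the post.

```python
# Illustrative Glue PySpark sketch (not the post's exact code): read a SQL Server
# table over JDBC and append it to an Iceberg table in the Glue Data Catalog.
import sys
from awsglue.context import GlueContext
from awsglue.job import Job
from awsglue.utils import getResolvedOptions
from pyspark.context import SparkContext

args = getResolvedOptions(sys.argv, ["JOB_NAME"])
glue_context = GlueContext(SparkContext.getOrCreate())
job = Job(glue_context)
job.init(args["JOB_NAME"], args)

# Read the legacy SQL Server table through a Glue JDBC connection (hypothetical names).
source_df = glue_context.create_dynamic_frame.from_options(
    connection_type="sqlserver",
    connection_options={
        "useConnectionProperties": "true",
        "connectionName": "sqlserver-connection",  # assumed Glue connection
        "dbtable": "dbo.orders",
    },
).toDF()

# Append into an Iceberg table (assumed to exist) registered in the Glue Data Catalog.
# The job is assumed to be configured for Iceberg, e.g. with --datalake-formats iceberg.
source_df.writeTo("glue_catalog.sales_db.orders_iceberg").append()

job.commit()
```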
This week on the keynote stages at AWS re:Invent 2024, you heard Matt Garman, CEO of AWS, and Swami Sivasubramanian, VP of AI and Data at AWS, speak about the next generation of Amazon SageMaker, the center for all of your data, analytics, and AI. The relationship between analytics and AI is rapidly evolving.
Today, Amazon Redshift is used by customers across all industries for a variety of use cases, including data warehouse migration and modernization, near real-time analytics, self-service analytics, data lake analytics, machine learning (ML), and data monetization.
Apache Iceberg is an open table format for very large analytic datasets, which captures metadata information on the state of datasets as they evolve and change over time. Iceberg has become very popular for its support for ACID transactions in data lakes and features like schema and partition evolution, time travel, and rollback.
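For readers unfamiliar with these features, the following hedged PySpark snippets illustrate schema evolution, time travel, and rollback against an Iceberg table; they assume a Spark session configured with the Iceberg SQL extensions and a Glue catalog named glue_catalog, and all table names and values are placeholders.

```python
# Illustrative Iceberg feature snippets; catalog, database, table, timestamp,
# and snapshot ID are hypothetical.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("iceberg-features").getOrCreate()

# Schema evolution: add a column without rewriting existing data files.
spark.sql("ALTER TABLE glue_catalog.db.events ADD COLUMN country STRING")

# Time travel (Spark 3.3+): query the table as of an earlier point in time.
spark.sql(
    "SELECT * FROM glue_catalog.db.events "
    "TIMESTAMP AS OF '2024-01-01 00:00:00'"
).show()

# Rollback: restore the table to a previous snapshot (requires Iceberg SQL extensions).
spark.sql(
    "CALL glue_catalog.system.rollback_to_snapshot('db.events', 1234567890123456789)"
)
```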
A modern data architecture enables companies to ingest virtually any type of data through automated pipelines into a data lake, which provides highly durable and cost-effective object storage at petabyte or exabyte scale.
In this blog post, we dive into different data aspects and how Cloudinary addresses the two concerns of vendor lock-in and cost-efficient data analytics by using Apache Iceberg, Amazon Simple Storage Service (Amazon S3), Amazon Athena, Amazon EMR, and AWS Glue. This concept makes Iceberg extremely versatile.
Many organizations operate data lakes spanning multiple cloud data stores. In these cases, you may want an integrated query layer to seamlessly run analytical queries across these diverse cloud stores and streamline your data analytics processes. This serves as the S3 data lake data for this post.
In addition to real-time analytics and visualization, the data needs to be shared for long-term data analytics and machine learning applications. From here, the metadata is published to Amazon DataZone by using the AWS Glue Data Catalog. This process is shown in the following figure.
Amazon Redshift enables you to efficiently query and retrieve structured and semi-structured data from open format files in an Amazon S3 data lake without having to load the data into Amazon Redshift tables. Amazon Redshift extends SQL capabilities to your data lake, enabling you to run analytical queries.
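As a rough illustration of that pattern, the sketch below uses the Amazon Redshift Data API (boto3) to create an external schema over the AWS Glue Data Catalog and query files in S3 without loading them; the workgroup, database, IAM role, and table names are assumptions rather than values from the post.

```python
# A hedged sketch of querying open-format files in S3 from Amazon Redshift via an
# external schema. All identifiers and ARNs below are placeholders.
import boto3

client = boto3.client("redshift-data")

ddl = """
CREATE EXTERNAL SCHEMA IF NOT EXISTS lake
FROM DATA CATALOG
DATABASE 'sales_db'
IAM_ROLE 'arn:aws:iam::123456789012:role/redshift-spectrum-role';
"""

query = """
SELECT order_date, SUM(amount) AS revenue
FROM lake.orders            -- external table backed by files in Amazon S3
GROUP BY order_date
ORDER BY order_date;
"""

for sql in (ddl, query):
    client.execute_statement(
        WorkgroupName="analytics-wg",  # or ClusterIdentifier=... for a provisioned cluster
        Database="dev",
        Sql=sql,
    )
```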
With this new functionality, customers can create up-to-date replicas of their data from applications such as Salesforce, ServiceNow, and Zendesk in an Amazon SageMaker Lakehouse and Amazon Redshift. SageMaker Lakehouse gives you the flexibility to access and query your data in-place with all Apache Iceberg compatible tools and engines.
Organizations have multiple Hive data warehouses across EMR clusters, where the metadata gets generated. To address this challenge, organizations can deploy a data mesh using AWS Lake Formation that connects the multiple EMR clusters. An entity can act both as a producer of data assets and as a consumer of data assets.
For many organizations, this centralized data store follows a data lake architecture. Although data lakes provide a centralized repository, making sense of this data and extracting valuable insights can be challenging. Clean up: to avoid incurring future charges, delete the resources you created.
We often see requests from customers who have started their data journey by building data lakes on Microsoft Azure and want to extend access to the data to AWS services. In such scenarios, data engineers face challenges in connecting and extracting data from storage containers on Microsoft Azure.
Amazon Q generative SQL for Amazon Redshift uses generative AI to analyze user intent, query patterns, and schema metadata to identify common SQL query patterns directly within Amazon Redshift, accelerating the query authoring process for users and reducing the time required to derive actionable data insights.
First-generation – expensive, proprietary enterprise data warehouse and business intelligence platforms maintained by a specialized team drowning in technical debt. Second-generation – gigantic, complex data lake maintained by a specialized team drowning in technical debt. See the pattern?
In August, we wrote about how in a future where distributed data architectures are inevitable, unifying and managing operational and business metadata is critical to successfully maximizing the value of data, analytics, and AI.
With data becoming the driving force behind many industries today, having a modern data architecture is pivotal for organizations to be successful. In this post, we describe Orca’s journey building a transactional data lake using Amazon Simple Storage Service (Amazon S3), Apache Iceberg, and AWS Analytics.
These nodes can implement analytical platforms like data lakehouses, data warehouses, or data marts, all united by producing data products. The following diagram illustrates the building blocks of the Institutional Data & AI Platform.
For the last 30 years, whenever you want to do analytics, the first step is to rip the data out of the operational applications and try to move it to a different environment—so data warehousing, data lakes, data lakehouses, and now data clouds.
Analytics remained one of the key focus areas this year, with significant updates and innovations aimed at helping businesses harness their data more efficiently and accelerate insights. This zero-ETL integration reduces the complexity and operational burden of data replication to let you focus on deriving insights from your data.
The domain also includes code that acts upon the data, including tools, pipelines, and other artifacts that drive analytics execution. The domain requires a team that creates/updates/runs the domain, and we can’t forget metadata: catalogs, lineage, test results, processing history, and so on.
Although Jira Cloud provides reporting capability, loading this data into a data lake will facilitate enrichment with other business data, as well as support the use of business intelligence (BI) tools and artificial intelligence (AI) and machine learning (ML) applications. For InitialRunFlag, choose Setup.
Applying artificial intelligence (AI) to data analytics for deeper, better insights and automation is a growing enterprise IT priority. But the data repository options that have been around for a while tend to fall short in their ability to serve as the foundation for big data analytics powered by AI.
In today’s data-driven world, organizations are constantly seeking efficient ways to process and analyze vast amounts of information across data lakes and warehouses. This post will showcase how this data can also be queried by other data teams using Amazon Athena. Verify that you have Python version 3.7.
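By way of example, another team could run a query like the following through the Athena API with boto3; the database, table, and results bucket are hypothetical.

```python
# A minimal sketch of querying the shared data with Amazon Athena.
# Database, table, and output location are placeholders.
import time
import boto3

athena = boto3.client("athena")

execution = athena.start_query_execution(
    QueryString="SELECT status, COUNT(*) AS cnt FROM orders GROUP BY status",
    QueryExecutionContext={"Database": "sales_db"},
    ResultConfiguration={"OutputLocation": "s3://example-athena-results/"},
)
query_id = execution["QueryExecutionId"]

# Poll until the query finishes, then fetch the results.
while True:
    state = athena.get_query_execution(QueryExecutionId=query_id)["QueryExecution"]["Status"]["State"]
    if state in ("SUCCEEDED", "FAILED", "CANCELLED"):
        break
    time.sleep(1)

if state == "SUCCEEDED":
    rows = athena.get_query_results(QueryExecutionId=query_id)["ResultSet"]["Rows"]
    for row in rows:
        print([col.get("VarCharValue") for col in row["Data"]])
```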
But most important of all, the assumed dormant value in the unstructured data is a question mark, which can only be answered after these sophisticated techniques have been applied. Therefore, there is a need to be able to analyze and extract value from the data economically and flexibly. The solution integrates data in three tiers.
For the past 5 years, BMS has used a custom framework called Enterprise Data Lake Services (EDLS) to create ETL jobs for business users. EDLS job steps and metadata: every EDLS job comprises one or more job steps chained together and run in a predefined order orchestrated by the custom ETL framework.
BladeBridge offers a comprehensive suite of tools that automate much of the complex conversion work, allowing organizations to quickly and reliably transition their data analytics capabilities to the scalable Amazon Redshift data warehouse. times better price performance than other cloud data warehouses.
AWS-powered data lakes, supported by the unmatched availability of Amazon Simple Storage Service (Amazon S3), can handle the scale, agility, and flexibility required to combine different data and analytics approaches. It will never remove files that are still required by a non-expired snapshot.
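The sentence about non-expired snapshots refers to Iceberg table maintenance; a minimal sketch of snapshot expiration in Spark SQL looks roughly like this, where the catalog, table, timestamp, and retention values are placeholders.

```python
# Hedged illustration of Iceberg snapshot expiration: the procedure only deletes
# data files that are no longer referenced by any retained snapshot.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("iceberg-maintenance").getOrCreate()

# Expire snapshots older than the given timestamp, keeping at least the last 10.
spark.sql(
    "CALL glue_catalog.system.expire_snapshots("
    "table => 'db.events', "
    "older_than => TIMESTAMP '2024-06-01 00:00:00', "
    "retain_last => 10)"
).show()
```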
This solution only replicates metadata in the Data Catalog, not the actual underlying data. To have a redundant data lake using Lake Formation and AWS Glue in an additional Region, we recommend replicating the Amazon S3-based storage using S3 replication, S3 sync, aws-s3-copy-sync-using-batch, or the S3 Batch Replication process.
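For the S3 replication option, a hedged boto3 sketch of a replication rule might look like the following; the bucket names, role ARN, and prefix are assumptions, and both buckets must already have versioning enabled.

```python
# A sketch (not the post's exact setup) of enabling S3 replication so the data files
# behind the replicated Data Catalog also exist in the second Region.
import boto3

s3 = boto3.client("s3")

s3.put_bucket_replication(
    Bucket="primary-datalake-bucket",  # placeholder source bucket (versioning enabled)
    ReplicationConfiguration={
        "Role": "arn:aws:iam::123456789012:role/s3-replication-role",  # placeholder role
        "Rules": [
            {
                "ID": "replicate-lake-data",
                "Status": "Enabled",
                "Priority": 1,
                "Filter": {"Prefix": "lake/"},
                "DeleteMarkerReplication": {"Status": "Disabled"},
                "Destination": {
                    "Bucket": "arn:aws:s3:::secondary-datalake-bucket"  # placeholder target
                },
            }
        ],
    },
)
```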
Cargotec captures terabytes of IoT telemetry data from their machinery operated by numerous customers across the globe. This data needs to be ingested into a data lake, transformed, and made available for analytics, machine learning (ML), and visualization.
“Then the data is consumed by SaaS-based computational tools, but it still sits within our organization and sits within the controls of our cloud-based solutions.” Much of Regeneron’s data, of course, is confidential. For that reason, many of its data tools — and even its data lake — were built in-house using AWS.
To provide a response that includes the enterprise context, each user prompt needs to be augmented with a combination of insights from structured data from the data warehouse and unstructured data from the enterprise data lake.
Cloudera customers run some of the biggest data lakes on earth. These lakes power mission-critical, large-scale data analytics, business intelligence (BI), and machine learning use cases, including enterprise data warehouses. On data warehouses and data lakes.
Today’s enterprise data analytics teams are constantly looking to get the best out of their platforms. Storage plays one of the most important roles in a data platform strategy; it provides the basis for all compute engines and applications to be built on top of it. Metadata in the cluster is disjoint across components.
Today’s data lakes are expanding across lines of business operating in diverse landscapes and using various engines to process and analyze data. Traditionally, SQL views have been used to define and share filtered data sets that meet the requirements of these lines of business for easier consumption.
The biggest challenge is broken data pipelines due to highly manual processes. Figure 1 shows a manually executed data analytics pipeline. First, a business analyst consolidates data from some public websites, an SFTP server, and some downloaded email attachments, all into Excel.
An Amazon DataZone domain contains an associated business data catalog for search and discovery, a set of metadata definitions to decorate the data assets that are used for discovery purposes, and data projects with integrated analytics and ML tools for users and groups to consume and publish data assets.
In this post, we show how Ruparupa implemented an incrementally updated data lake to get insights into their business using Amazon Simple Storage Service (Amazon S3), AWS Glue, Apache Hudi, and Amazon QuickSight. An AWS Glue ETL job, using the Apache Hudi connector, updates the S3 data lake hourly with incremental data.
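As an illustration of that kind of hourly upsert (not Ruparupa’s actual job), a Glue or EMR Spark job writing with the Hudi connector could look roughly like this; the record key, precombine field, partition field, and S3 paths are placeholders.

```python
# Illustrative PySpark write using the Apache Hudi connector for an incremental load.
# Table name, key fields, and S3 locations are hypothetical.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("hudi-incremental-load").getOrCreate()

incremental_df = spark.read.parquet("s3://example-staging/orders/latest/")

hudi_options = {
    "hoodie.table.name": "orders",
    "hoodie.datasource.write.recordkey.field": "order_id",
    "hoodie.datasource.write.precombine.field": "updated_at",
    "hoodie.datasource.write.partitionpath.field": "order_date",
    "hoodie.datasource.write.operation": "upsert",
}

# Upsert the latest batch into the S3 data lake; existing keys are updated in place.
(incremental_df.write.format("hudi")
    .options(**hudi_options)
    .mode("append")
    .save("s3://example-datalake/orders/"))
```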
When global technology company Lenovo started utilizing data analytics, it helped identify a new market niche for its gaming laptops and powered remote diagnostics so its customers got the most from their servers and other devices. “Without those templates, it’s hard to add such information after the fact.”
We split the solution into two primary components: generating Spark job metadata and running the SQL on Amazon EMR. The first component (metadata setup) consumes existing Hive job configurations and generates metadata such as the number of parameters, the number of actions (steps), and the file formats. sql_path: SQL file name.
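A highly simplified, hypothetical sketch of that metadata-setup step might scan job configuration files and emit the counts described above; the JSON layout shown here is an assumption, not the actual framework’s format.

```python
# Hypothetical sketch: summarize Hive job configurations into simple metadata records.
import json
from pathlib import Path

def summarize_job_config(path: Path) -> dict:
    """Return basic metadata (parameter count, step count, file formats) for one job config."""
    config = json.loads(path.read_text())
    steps = config.get("steps", [])
    return {
        "job_name": config.get("name", path.stem),
        "num_parameters": len(config.get("parameters", [])),
        "num_steps": len(steps),
        "file_formats": sorted({step.get("format", "unknown") for step in steps}),
    }

if __name__ == "__main__":
    # Assumed directory of per-job JSON configs.
    for cfg in Path("hive_job_configs").glob("*.json"):
        print(summarize_job_config(cfg))
```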
You can use its built-in transformations and recipes, as well as integrations with the AWS Glue Data Catalog and Amazon Simple Storage Service (Amazon S3), to preprocess the data in your landing zone, clean it up, and send it downstream for analytical processing. For Matching conditions, choose Match all conditions.
New feature: Custom AWS service blueprints. Previously, Amazon DataZone provided default blueprints that created the AWS resources required for data lake, data warehouse, and machine learning use cases. You can build projects and subscribe to both unstructured and structured data assets within the Amazon DataZone portal.
Another IDC study showed that while 2/3 of respondents reported using AI-driven data analytics, most reported that less than half of the data under management is available for this type of analytics. from 2022 to 2026. New insights and relationships are found in this combination. All of this supports the use of AI.