Data Lake and Data Processing - Data Leaders Brief

Modernize your legacy databases with AWS data lakes, Part 2: Build a data lake using AWS DMS data on Apache Iceberg

AWS Big Data

OCTOBER 30, 2024

This is part two of a three-part series showing how to build a data lake on AWS using a modern data architecture. This post shows how to load data from a legacy database (SQL Server) into a transactional data lake (Apache Iceberg) using AWS Glue.

Migrate an existing data lake to a transactional data lake using Apache Iceberg

Important Considerations When Migrating to a Data Lake

Oracle Wants to Be the Database for AI

How BMW streamlined data access using AWS Lake Formation fine-grained access control

Design a data mesh pattern for Amazon EMR-based data lakes using AWS Lake Formation with Hive metastore federation

Enrich your serverless data lake with Amazon Bedrock

Build a serverless transactional data lake with Apache Iceberg, Amazon EMR Serverless, and Amazon Athena

Expanding data analysis and visualization options: Amazon DataZone now integrates with Tableau, Power BI, and more

Orca Security’s journey to a petabyte-scale data lake with Apache Iceberg and AWS Analytics

Amazon Q data integration adds DataFrame support and in-prompt context-aware job creation

Eight Top DataOps Trends for 2022

Demystify data sharing and collaboration patterns on AWS: Choosing the right tool for the job

Scaling RISE with SAP data and AWS Glue

Create an Apache Hudi-based near-real-time transactional data lake using AWS DMS, Amazon Kinesis, AWS Glue streaming ETL, and data visualization using Amazon QuickSight

Build a data lake with Apache Flink on Amazon EMR

Introducing a new unified data connection experience with Amazon SageMaker Lakehouse unified data connectivity

How EUROGATE established a data mesh architecture using Amazon DataZone

The success of GenAI models lies in your data management strategy

Achieve data resilience using Amazon OpenSearch Service disaster recovery with snapshot and restore

Petabyte-scale log analytics with Amazon S3, Amazon OpenSearch Service, and Amazon OpenSearch Ingestion

Query your Apache Hive metastore with AWS Lake Formation permissions

Data Management Requirements for the Enterprise Data Lake

Your New Cloud for AI May Be Inside a Colo

Top 15 data management platforms

The essential check list for effective data democratization

Configure cross-Region table access with the AWS Glue Catalog and AWS Lake Formation

Governing data in relational databases using Amazon DataZone

AWS Glue crawlers support cross-account crawling to support data mesh architecture

Introducing AWS Glue crawler and create table support for Apache Iceberg format

BMC on BMC: How the company enables IT observability with BMC Helix and AIOps

How Gilead used Amazon Redshift to quickly and cost-effectively load third-party medical claims data

10 Things AWS Can Do for Your SaaS Company

Create your Private Data Warehousing Environment Using Azure Kubernetes Service

Join a streaming data source with CDC data for real-time serverless data analytics using AWS Glue, AWS DMS, and Amazon DynamoDB

Habib Bank manages data at scale with Cloudera Data Platform

Modernize your ETL platform with AWS Glue Studio: A case study from BMS

Use AWS Glue to streamline SFTP data processing

DS Smith sets a single-cloud agenda for sustainability

Preparing the foundations for Generative AI

How smava makes loans transparent and affordable using Amazon Redshift Serverless

Empowering data-driven excellence: How the Bluestone Data Platform embraced data mesh for success

Access Amazon Athena in your applications using the WebSocket API

Migrate Hive data from CDH to CDP public cloud
