Data Lake, Management and Optimization

Multicloud data lake analytics with Amazon Athena

AWS Big Data

MARCH 18, 2024

Many organizations operate data lakes that span multiple cloud data stores. In these cases, you may want an integrated query layer that runs analytical queries across those stores and streamlines your data analytics processes. The AWS Glue Data Catalog holds the metadata for data in both Amazon S3 and Google Cloud Storage (GCS).

Modernize your legacy databases with AWS data lakes, Part 2: Build a data lake using AWS DMS data on Apache Iceberg

Accelerate Amazon Redshift Data Lake queries with AWS Glue Data Catalog Column Statistics

How Salesforce optimized their detection and response platform using AWS managed services

The Unexpected Cost of Data Copies

Choosing an open table format for your transactional data lake on AWS

Incremental refresh for Amazon Redshift materialized views on data lake tables

How Cloudinary transformed their petabyte scale streaming data lake with Apache Iceberg and AWS Analytics

Seamless integration of data lake and data warehouse using Amazon Redshift Spectrum and Amazon DataZone

From data lakes to insights: dbt adapter for Amazon Athena now supported in dbt Cloud

Use Apache Iceberg in a data lake to support incremental data processing

Query your Iceberg tables in data lake using Amazon Redshift (Preview)

Optimize your workloads with Amazon Redshift Serverless AI-driven scaling and optimization

Enrich your serverless data lake with Amazon Bedrock

The success of GenAI models lies in your data management strategy

Synchronize data lakes with CDC-based UPSERT using open table format, AWS Glue, and Amazon MSK

Migrate Delta tables from Azure Data Lake Storage to Amazon S3 using AWS Glue

Build a serverless transactional data lake with Apache Iceberg, Amazon EMR Serverless, and Amazon Athena

Apache Iceberg optimization: Solving the small files problem in Amazon EMR

Orca Security’s journey to a petabyte-scale data lake with Apache Iceberg and AWS Analytics

Top 15 data management platforms

The AWS Glue Data Catalog now supports storage optimization of Apache Iceberg tables

Simplify operational data processing in data lakes using AWS Glue and Apache Hudi

Centralize Your Data Processes With a DataOps Process Hub

Perform upserts in a data lake using Amazon Athena and Apache Iceberg

Improve operational efficiencies of Apache Iceberg tables built on Amazon S3 data lakes

Cloudera Lakehouse Optimizer Makes it Easier Than Ever to Deliver High-Performance Iceberg Tables

Denodo Provides a Logical Approach to Data Management

What is data architecture? A framework to manage data

Drug Launch Case Study: Amazing Efficiency Using DataOps

DIY cloud cost management: The strategic case for building your own tools

Implementing a Pharma Data Mesh using DataOps

Automate replication of relational sources into a transactional data lake with Apache Iceberg and AWS Glue

Speed up queries with the cost-based optimizer in Amazon Athena

Enable business users to analyze large datasets in your data lake with Amazon QuickSight

Monitoring Apache Iceberg metadata layer using AWS Lambda, AWS Glue, and AWS CloudWatch

IBM showcases Gen AI-driven Concert to monitor and manage enterprise applications

MLOps and DevOps: Why Data Makes It Different

AWS Glue Data Catalog supports automatic optimization of Apache Iceberg tables through your Amazon VPC

Build a transactional data lake using Apache Iceberg, AWS Glue, and cross-account data shares using AWS Lake Formation and Amazon Athena

Efficiently crawl your data lake and improve data access with an AWS Glue crawler using partition indexes

Top 15 data management platforms available today

Petabyte-scale log analytics with Amazon S3, Amazon OpenSearch Service, and Amazon OpenSearch Ingestion

Analyzing the business-case approach Perdue Farms takes to derive value from data
