Cost-Benefit, Data Lake and Metadata

Run Apache XTable in AWS Lambda for background conversion of open table formats

AWS Big Data

NOVEMBER 26, 2024

Initially, data warehouses were the go-to solution for structured data and analytical workloads, but they were limited by proprietary storage formats and their inability to handle unstructured data. Eventually, transactional data lakes emerged to add the transactional consistency and performance of a data warehouse to the data lake.

Metadata, Data Lake, Snapshot, Data Warehouse

Bridging the gap between mainframe data and hybrid cloud environments

CIO Business Intelligence

FEBRUARY 27, 2025

A high hurdle many enterprises have yet to overcome is accessing mainframe data via the cloud. Mainframes hold an enormous amount of critical and sensitive business data including transactional information, healthcare records, customer data, and inventory metrics. Four key challenges prevent them from doing so: 1.

Metadata, Data Lake, Cost-Benefit, Forecasting

Build a high-performance quant research platform with Apache Iceberg

AWS Big Data

JANUARY 9, 2025

Our experiments are based on real-world historical full order book data, provided by our partner CryptoStruct, and compare the trade-offs between these choices, focusing on performance, cost, and quant developer productivity. Data management is the foundation of quantitative research.

Metadata, Snapshot, Cost-Benefit, Optimization

How Cloudinary transformed their petabyte scale streaming data lake with Apache Iceberg and AWS Analytics

AWS Big Data

JUNE 10, 2024

In this blog post, we dive into different data aspects and how Cloudinary addresses the two concerns of vendor lock-in and cost-efficient data analytics by using Apache Iceberg, Amazon Simple Storage Service (Amazon S3), Amazon Athena, Amazon EMR, and AWS Glue.

Use Apache Iceberg in a data lake to support incremental data processing

Accelerate Amazon Redshift Data Lake queries with AWS Glue Data Catalog Column Statistics

How EUROGATE established a data mesh architecture using Amazon DataZone

Data Lakes on Cloud & its Usage in Healthcare

Choosing an open table format for your transactional data lake on AWS

Multicloud data lake analytics with Amazon Athena

Use Apache Iceberg in your data lake with Amazon S3, AWS Glue, and Snowflake

What is a Data Mesh?

Build a serverless transactional data lake with Apache Iceberg, Amazon EMR Serverless, and Amazon Athena

Enrich your serverless data lake with Amazon Bedrock

Simplify data integration with AWS Glue and zero-ETL to Amazon SageMaker Lakehouse

Data’s dark secret: Why poor quality cripples AI and growth

How ANZ Institutional Division built a federated data platform to enable their domain teams to build data products to support business outcomes

Efficiently crawl your data lake and improve data access with an AWS Glue crawler using partition indexes

Empower your Jira data in a data lake with Amazon AppFlow and AWS Glue

How to modernize data lakes with a data lakehouse architecture

Apache Iceberg optimization: Solving the small files problem in Amazon EMR

Modernize your ETL platform with AWS Glue Studio: A case study from BMS

Driving Business Value and ROI from a Hybrid Cloud Data Lake

Constructing A Digital Transformation Strategy: Putting the Data in Digital Transformation

Access Amazon Redshift data from Salesforce Data Cloud with Zero Copy Data Federation

Apache Ozone and Dense Data Nodes

Lay the groundwork now for advanced analytics and AI

The Future of the Data Lakehouse – Open

Modernize your data observability with Amazon OpenSearch Service zero-ETL integration with Amazon S3

Unlock data across organizational boundaries using Amazon DataZone – now generally available

How Ruparupa gained updated insights with an Amazon S3 data lake, AWS Glue, Apache Hudi, and Amazon QuickSight

Achieve your AI goals with an open data lakehouse approach

Don’t Fear Artificial Intelligence; Embrace it Through Data Governance

How Data Governance Protects Sensitive Data

Salesforce readies Einstein Copilot to unleash generative AI across its offerings

Governing data in relational databases using Amazon DataZone

How Cloudera Data Flow Enables Successful Data Mesh Architectures

Accelerate Your Data Mesh in the Cloud with Cloudera Data Engineering and Modak Nabu™

How smava makes loans transparent and affordable using Amazon Redshift Serverless

How Fujitsu implemented a global data mesh architecture and democratized data

Enforce fine-grained access control on Open Table Formats via Amazon EMR integrated with AWS Lake Formation

6 BI challenges IT teams must address

Accelerate HiveQL with Oozie to Spark SQL migration on Amazon EMR
