Cost-Benefit, Data Lake and Metadata

Run Apache XTable in AWS Lambda for background conversion of open table formats

AWS Big Data

NOVEMBER 26, 2024

Initially, data warehouses were the go-to solution for structured data and analytical workloads, but they were limited by proprietary storage formats and their inability to handle unstructured data. Eventually, transactional data lakes emerged to bring the transactional consistency and performance of a data warehouse to the data lake.

Metadata Data Lake Snapshot Data Warehouse

Build a high-performance quant research platform with Apache Iceberg

AWS Big Data

JANUARY 9, 2025

Our experiments are based on real-world historical full order book data, provided by our partner CryptoStruct, and compare the trade-offs between these choices, focusing on performance, cost, and quant developer productivity. Data management is the foundation of quantitative research.

Metadata Snapshot Cost-Benefit Optimization

Bridging the gap between mainframe data and hybrid cloud environments

CIO Business Intelligence

FEBRUARY 27, 2025

A high hurdle many enterprises have yet to overcome is accessing mainframe data via the cloud. Mainframes hold an enormous amount of critical and sensitive business data including transactional information, healthcare records, customer data, and inventory metrics. Four key challenges prevent them from doing so: 1.

Metadata Data Lake Cost-Benefit Forecasting

Use Apache Iceberg in a data lake to support incremental data processing

AWS Big Data

MARCH 2, 2023

Apache Iceberg is an open table format for very large analytic datasets, which captures metadata information on the state of datasets as they evolve and change over time. Iceberg has become very popular for its support for ACID transactions in data lakes and features like schema and partition evolution, time travel, and rollback.

How Cloudinary transformed their petabyte scale streaming data lake with Apache Iceberg and AWS Analytics

Use Apache Iceberg in your data lake with Amazon S3, AWS Glue, and Snowflake

Multicloud data lake analytics with Amazon Athena

How EUROGATE established a data mesh architecture using Amazon DataZone

Accelerate Amazon Redshift Data Lake queries with AWS Glue Data Catalog Column Statistics

Choosing an open table format for your transactional data lake on AWS

Data Lakes on Cloud & it’s Usage in Healthcare

Enrich your serverless data lake with Amazon Bedrock

Simplify data integration with AWS Glue and zero-ETL to Amazon SageMaker Lakehouse

Build a serverless transactional data lake with Apache Iceberg, Amazon EMR Serverless, and Amazon Athena

What is a Data Mesh?

How ANZ Institutional Division built a federated data platform to enable their domain teams to build data products to support business outcomes

Empower financial analytics by creating structured knowledge bases using Amazon Bedrock and Amazon Redshift

Efficiently crawl your data lake and improve data access with an AWS Glue crawler using partition indexes

Data’s dark secret: Why poor quality cripples AI and growth

Empower your Jira data in a data lake with Amazon AppFlow and AWS Glue

Amazon SageMaker Lakehouse now supports attribute-based access control

How to modernize data lakes with a data lakehouse architecture

Apache Iceberg optimization: Solving the small files problem in Amazon EMR

Modernize your ETL platform with AWS Glue Studio: A case study from BMS

Access Amazon Redshift data from Salesforce Data Cloud with Zero Copy Data Federation

Driving Business Value and ROI from a Hybrid Cloud Data Lake

Constructing A Digital Transformation Strategy: Putting the Data in Digital Transformation

Modernize your data observability with Amazon OpenSearch Service zero-ETL integration with Amazon S3

Apache Ozone and Dense Data Nodes

Lay the groundwork now for advanced analytics and AI

The Future of the Data Lakehouse – Open

Governing data in relational databases using Amazon DataZone

Unlock data across organizational boundaries using Amazon DataZone – now generally available

How Ruparupa gained updated insights with an Amazon S3 data lake, AWS Glue, Apache Hudi, and Amazon QuickSight

Achieve your AI goals with an open data lakehouse approach

Don’t Fear Artificial Intelligence; Embrace it Through Data Governance

How smava makes loans transparent and affordable using Amazon Redshift Serverless

How Data Governance Protects Sensitive Data

Salesforce readies Einstein Copilot to unleash generative AI across its offerings

Accelerate HiveQL with Oozie to Spark SQL migration on Amazon EMR

How Cloudera Data Flow Enables Successful Data Mesh Architectures

Accelerate Your Data Mesh in the Cloud with Cloudera Data Engineering and Modak Nabu™

Enforce fine-grained access control on Open Table Formats via Amazon EMR integrated with AWS Lake Formation
