How Volkswagen streamlined access to data across multiple data lakes using Amazon DataZone – Part 1

AWS Big Data

Over the years, organizations have invested in creating purpose-built, cloud-based data lakes that are siloed from one another. A major challenge is enabling cross-organization discovery and access to data across these multiple data lakes, each built on different technology stacks.

Data Lake 130

Seamless integration of data lake and data warehouse using Amazon Redshift Spectrum and Amazon DataZone

AWS Big Data

Unlocking the true value of data often gets impeded by siloed information. Traditional data management—wherein each business unit ingests raw data in separate data lakes or warehouses—hinders visibility and cross-functional analysis. Amazon DataZone natively supports data sharing for Amazon Redshift data assets.

Data Lake 121

Use Apache Iceberg in your data lake with Amazon S3, AWS Glue, and Snowflake

AWS Big Data

Apache Iceberg is an Apache-licensed, 100% open-source data table format that helps simplify data processing on large datasets stored in data lakes. Data engineers use Apache Iceberg because it’s fast, efficient, and reliable at any scale and keeps records of how datasets change over time.

Data Lake 124

Design a data mesh pattern for Amazon EMR-based data lakes using AWS Lake Formation with Hive metastore federation

AWS Big Data

Use cases for Hive metastore federation for Amazon EMR. Hive metastore federation for Amazon EMR is applicable to the following use cases: Governance of Amazon EMR-based data lakes – Producers generate data within their AWS accounts using an Amazon EMR-based data lake supported by EMRFS on Amazon Simple Storage Service (Amazon S3) and HBase.

Data Lake 114

Achieve data resilience using Amazon OpenSearch Service disaster recovery with snapshot and restore

AWS Big Data

Sesha Sanjana Mylavarapu is an Associate Data Lake Consultant at AWS Professional Services. She specializes in cloud-based data management and collaborates with enterprise clients to design and implement scalable data lakes. For instructions, see Creating an IAM role (console).

Snapshot 110

Use AWS Glue Data Catalog views to analyze data

AWS Big Data

Additionally, you can use the power of SQL in a view to express complex boundaries in data across multiple tables that can’t be expressed with simpler permissions. Data lakes provide customers the flexibility required to derive useful insights from data across many sources and many use cases.

Data Lake 110

Expand data access through Apache Iceberg using Delta Lake UniForm on AWS

AWS Big Data

Note that the extra package (delta-iceberg) is required to create a UniForm table in AWS Glue Data Catalog. The extra package is also required to generate Iceberg metadata along with Delta Lake metadata for the UniForm table. He’s passionate about helping customers use Apache Iceberg for their data lakes on AWS.

Metadata 122