Remove Data Processing Remove Data Warehouse Remove Metadata
article thumbnail

How EUROGATE established a data mesh architecture using Amazon DataZone

AWS Big Data

An extract, transform, and load (ETL) process using AWS Glue is triggered once a day to extract the required data and transform it into the required format and quality, following the data product principle of data mesh architectures. From here, the metadata is published to Amazon DataZone by using AWS Glue Data Catalog.

IoT 111
article thumbnail

How ANZ Institutional Division built a federated data platform to enable their domain teams to build data products to support business outcomes

AWS Big Data

These nodes can implement analytical platforms like data lake houses, data warehouses, or data marts, all united by producing data products. The Institutional Data & AI platform adopts a federated approach to data while centralizing the metadata to facilitate simpler discovery and sharing of data products.

Metadata 105
Insiders

Sign Up for our Newsletter

This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.

article thumbnail

Federate to Amazon Redshift Query Editor v2 with Microsoft Entra ID

AWS Big Data

Amazon Redshift is a fast, petabyte-scale, cloud data warehouse that tens of thousands of customers rely on to power their analytics workloads. With its massively parallel processing (MPP) architecture and columnar data storage, Amazon Redshift delivers high price-performance for complex analytical queries against large datasets.

Sales 105
article thumbnail

Governing data in relational databases using Amazon DataZone

AWS Big Data

It also makes it easier for engineers, data scientists, product managers, analysts, and business users to access data throughout an organization to discover, use, and collaborate to derive data-driven insights. The producer also needs to manage and publish the data asset so it’s discoverable throughout the organization.

Metadata 107
article thumbnail

Design a data mesh pattern for Amazon EMR-based data lakes using AWS Lake Formation with Hive metastore federation

AWS Big Data

Organizations have multiple Hive data warehouses across EMR clusters, where the metadata gets generated. To address this challenge, organizations can deploy a data mesh using AWS Lake Formation that connects the multiple EMR clusters. An entity can act both as a producer of data assets and as a consumer of data assets.

Data Lake 116
article thumbnail

CIOs are (still) closer than ever to their dream data lakehouse

CIO Business Intelligence

“The data catalog is critical because it’s where business manages its metadata,” said Venkat Rajaji, Senior Vice President of Product Management at Cloudera. There’s been a ton of innovation lately around the Iceberg REST catalog because the data turf war is over. But the metadata turf war is just getting started.”

Metadata 117
article thumbnail

Implement disaster recovery with Amazon Redshift

AWS Big Data

Amazon Redshift is a fully managed, petabyte-scale data warehouse service in the cloud. You can start with just a few hundred gigabytes of data and scale to a petabyte or more. This enables you to use your data to acquire new insights for your business and customers. Document the entire disaster recovery process.