Need for a data mesh architecture Because entities in the EUROGATE group generate vast amounts of data from various sources (across departments, locations, and technologies), the traditional centralized data architecture struggles to keep up with the demands for real-time insights, agility, and scalability.
The data transformation imperative What Denso and other industry leaders realise is that for IT-OT convergence to be realised, and the benefits of AI unlocked, data transformation is vital. The company can also unify its knowledge base and promote search and information use that better meets its needs.
With complex data architectures and systems within so many organizations, tracking data in motion and data at rest is daunting to say the least. Harvesting the data through automation seamlessly removes ambiguity and speeds up processing and time to market.
It seamlessly consolidates data from various data sources within AWS, including AWS Cost Explorer (and forecasting with Cost Explorer), AWS Trusted Advisor, and AWS Compute Optimizer. Data providers and consumers are the two fundamental users of a CDH dataset. You might notice that this differs slightly from traditional ETL.
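To make the consolidation step concrete, here is a minimal sketch of pulling one of those sources, month-to-date cost per service, from the Cost Explorer API via boto3; the date range and grouping are illustrative assumptions, not the article's actual query.

```python
import boto3

# Hedged sketch: query Cost Explorer for cost per service over one month.
# The time period and grouping dimension are assumptions for illustration.
ce = boto3.client("ce")

response = ce.get_cost_and_usage(
    TimePeriod={"Start": "2024-06-01", "End": "2024-06-30"},
    Granularity="MONTHLY",
    Metrics=["UnblendedCost"],
    GroupBy=[{"Type": "DIMENSION", "Key": "SERVICE"}],
)

for group in response["ResultsByTime"][0]["Groups"]:
    service = group["Keys"][0]
    amount = group["Metrics"]["UnblendedCost"]["Amount"]
    print(f"{service}: ${float(amount):.2f}")
```

A consolidation job would merge results like these with Trusted Advisor and Compute Optimizer output before handing them to consumers.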
These tools empower analysts and data scientists to easily collaborate on the same data, with their choice of tools and analytic engines. No more lock-in, unnecessary data transformations, or data movement across tools and clouds just to extract insights out of the data.
Independent data products often only have value if you can connect them, join them, and correlate them to create a higher-order data product that creates additional insights. A modern data architecture is critical in order to become a data-driven organization.
Data Vault 2.0 allows for the following:
- Agile data warehouse development
- Parallel data ingestion
- A scalable approach to handle multiple data sources, even on the same entity
- A high level of automation
- Historization
- Full lineage support

However, Data Vault 2.0
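As a minimal sketch of the modeling pattern behind those properties: Data Vault 2.0 separates stable business keys (hubs) from changing attributes (satellites), which is what enables parallel loads and historization. The hypothetical hub-and-satellite pair below uses SQLite for illustration; all table and column names are invented.

```python
import sqlite3

# Minimal Data Vault sketch: a hub holds the business key, a satellite
# holds descriptive attributes keyed by load timestamp (historization).
# Table and column names are hypothetical illustrations.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE hub_customer (
    customer_hk   TEXT PRIMARY KEY,   -- hash key of the business key
    customer_id   TEXT NOT NULL,      -- business key from the source
    load_dts      TEXT NOT NULL,      -- load timestamp
    record_source TEXT NOT NULL       -- lineage: where the row came from
);
CREATE TABLE sat_customer_details (
    customer_hk   TEXT NOT NULL REFERENCES hub_customer(customer_hk),
    load_dts      TEXT NOT NULL,      -- each change is a new row
    name          TEXT,
    email         TEXT,
    PRIMARY KEY (customer_hk, load_dts)
);
""")
```

Because hubs, links, and satellites are loaded independently, multiple sources describing the same entity can be ingested in parallel without coordination.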
With data becoming the driving force behind many industries today, having a modern data architecture is pivotal for organizations to be successful. This ensures that the data is suitable for training purposes. These robust capabilities ensure that data within the data lake remains accurate, consistent, and reliable.
For example, GPS, social media, and cell phone handoffs are modeled as graphs, while data catalogs, data lineage, and MDM tools leverage knowledge graphs for linking metadata with semantics. RDF is used extensively for data publishing and data interchange and is based on W3C and other industry standards.
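As a small illustration of linking metadata with semantics in RDF, here is a sketch using the rdflib Python library; the namespace and catalog triples are hypothetical, chosen only to show the triple pattern.

```python
from rdflib import Graph, Literal, Namespace, RDF

# Hypothetical namespace for illustration; the article does not name a
# specific toolkit, only the W3C RDF standard.
EX = Namespace("http://example.org/catalog/")

g = Graph()
g.bind("ex", EX)

# Link a dataset to its owning domain and a lineage predecessor as triples.
g.add((EX.orders_dataset, RDF.type, EX.Dataset))
g.add((EX.orders_dataset, EX.ownedBy, EX.sales_domain))
g.add((EX.orders_dataset, EX.derivedFrom, EX.raw_orders))
g.add((EX.orders_dataset, EX.description, Literal("Curated orders data product")))

print(g.serialize(format="turtle"))
```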
This adds an additional ETL step, making the data even more stale. Data lakehouse was created to solve these problems. The data warehouse storage layer is removed from lakehouse architectures. Instead, continuous data transformation is performed within the BLOB storage. Data mesh: A mostly new culture.
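One way to picture "transformation within the BLOB storage" is a job that rewrites Parquet files in place of loading a warehouse. Below is a minimal pyarrow sketch; the paths and column names are hypothetical, and a production lakehouse would normally put a table format (Delta, Iceberg, Hudi) on top of the object store.

```python
import pyarrow.compute as pc
import pyarrow.parquet as pq

# Minimal sketch: read a raw Parquet file, derive a column, and write the
# curated result back to storage. Paths and columns are hypothetical; in a
# lakehouse these would typically be S3/ADLS/GCS locations.
raw = pq.read_table("raw/events.parquet")
curated = raw.append_column(
    "amount_usd", pc.multiply(raw["amount_cents"], 0.01)
)
pq.write_table(curated, "curated/events.parquet")
```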
Data ingestion – Steps 1 and 2 use AWS DMS, which connects to the source database and moves full and incremental data (CDC) to Amazon S3 in Parquet format. Data transformation – Steps 3 and 4 represent an EMR Serverless Spark application (Amazon EMR 6.9). Monjumi Sarma is a Data Lab Solutions Architect at AWS.
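The excerpt doesn't include the job code, but a Spark step in that position typically reads the DMS output and reconciles the change records. Here is a hedged PySpark sketch; the bucket paths, key, and timestamp columns are assumptions, though the Op column (I/U/D) is what DMS emits for CDC rows.

```python
from pyspark.sql import SparkSession, functions as F
from pyspark.sql.window import Window

spark = SparkSession.builder.appName("cdc-transform").getOrCreate()

# Read the full-load + CDC Parquet that AWS DMS wrote to the landing bucket.
# Bucket and column names (order_id, transact_ts) are hypothetical.
cdc = spark.read.parquet("s3://landing-bucket/orders/")

# Keep the newest change per primary key, then drop deletes ("D" rows).
w = Window.partitionBy("order_id").orderBy(F.col("transact_ts").desc())
latest = (
    cdc.withColumn("rn", F.row_number().over(w))
       .filter("rn = 1 AND Op <> 'D'")
       .drop("rn")
)

latest.write.mode("overwrite").parquet("s3://curated-bucket/orders/")
```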
“You had to be an expert in the programming language that interacts with that data, and understand the relationships of each data element within each data source, let alone understand its relation to elements in other data sources,” he says. “Without those templates, it’s hard to add such information after the fact.”
The data mesh framework In the dynamic landscape of data management, the search for agility, scalability, and efficiency has led organizations to explore new, innovative approaches. One such innovation gaining traction is the data mesh framework. This empowers individual teams to own and manage their data.
Overview of solution As a data-driven company, smava relies on the AWS Cloud to power their analytics use cases. smava ingests data from various external and internal data sources into a landing stage on the data lake based on Amazon Simple Storage Service (Amazon S3).
This was, without question, a significant departure from traditional analytic environments, which often meant vendor lock-in and the inability to work with data at scale. Another unexpected challenge was the introduction of Spark as a processing framework for big data. Comprehensive data security and data governance (i.e.
For many organizations, a centralized data platform will fall short as it gives data teams much less autonomy over managing increasingly diverse and voluminous datasets. A centralized data engineering team focuses on building a governed self-service infrastructure, while domain teams use the services to build full-stack data products.
We use the built-in features of Data Firehose, including AWS Lambda for necessary data transformation and Amazon Simple Notification Service (Amazon SNS) for near real-time alerts. AWS Glue – The AWS Glue Data Catalog is your persistent technical metadata store in the AWS Cloud.
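The transformation Lambda itself isn't shown in the excerpt, but Firehose's record-transformation contract is fixed: the function receives base64-encoded records and must return each one with a recordId, result, and data field. Here is a minimal sketch; the payload field (speed_kmh) and threshold are invented for illustration.

```python
import base64
import json

def lambda_handler(event, context):
    """Data Firehose transformation Lambda: decode each record, add a
    derived field, and return it re-encoded. The speed_kmh field is a
    hypothetical example; the recordId/result/data envelope is the
    Firehose transformation contract."""
    output = []
    for record in event["records"]:
        payload = json.loads(base64.b64decode(record["data"]))
        # Example enrichment: flag records above a speed threshold so a
        # downstream consumer (e.g., SNS) can alert in near real time.
        payload["over_limit"] = payload.get("speed_kmh", 0) > 80
        output.append({
            "recordId": record["recordId"],
            "result": "Ok",
            "data": base64.b64encode(json.dumps(payload).encode()).decode(),
        })
    return {"records": output}
```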
AWS Glue establishes a secure connection to HubSpot using OAuth for authorization and TLS for data encryption in transit. AWS Glue can also apply complex data transformations, enabling efficient data integration and preparation to meet your needs.
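A heavily hedged sketch of what such a Glue job body could look like: read through a pre-created connection holding the OAuth secrets, rename and retype fields, and write Parquet. The connection name, option keys, and field names below are assumptions for illustration, not the connector's documented API.

```python
from awsglue.context import GlueContext
from awsglue.transforms import ApplyMapping
from pyspark.context import SparkContext

glue_context = GlueContext(SparkContext.getOrCreate())

# Read from a hypothetical "hubspot-connection" that stores the OAuth
# credentials; connector type and option keys are assumptions.
contacts = glue_context.create_dynamic_frame.from_options(
    connection_type="hubspot",
    connection_options={
        "connectionName": "hubspot-connection",
        "ENTITY_NAME": "contacts",
    },
)

# Example of a declarative transformation: rename/retype selected fields.
mapped = ApplyMapping.apply(
    frame=contacts,
    mappings=[
        ("id", "string", "contact_id", "string"),
        ("properties.email", "string", "email", "string"),
    ],
)

glue_context.write_dynamic_frame.from_options(
    frame=mapped,
    connection_type="s3",
    connection_options={"path": "s3://curated-bucket/hubspot/contacts/"},
    format="parquet",
)
```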
To capture a more complete picture of the data’s journey, it is important to have a DataOps Observability system in place. Data lineage is static and often lags by weeks or months. Data lineage is often considered static because it is typically based on snapshots of data and metadata taken at a specific time.
Data Environment First off, the solutions you consider should be compatible with your current data architecture. We have outlined the requirements that most providers ask for. Data sources – Strategic objective: Use native connectivity optimized for the data source. addresses).