1) What Is Data Quality Management? 4) Data Quality Best Practices. 5) How Do You Measure Data Quality? 6) Data Quality Metrics Examples. 7) Data Quality Control: Use Case. 8) The Consequences Of Bad Data Quality. 9) 3 Sources Of Low-Quality Data.
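To make the measurement question concrete, here is a minimal sketch (not from the article) of two common data quality metrics, completeness and uniqueness, computed with pandas; the dataset and column names are hypothetical.

```python
import pandas as pd

# Hypothetical customer extract; column names are illustrative only.
df = pd.DataFrame({
    "customer_id": [1, 2, 2, 4, None],
    "email": ["a@x.com", None, "b@x.com", "c@x.com", "d@x.com"],
})

# Completeness: share of non-null values per column.
completeness = df.notna().mean()

# Uniqueness: share of distinct non-null values in the key column.
uniqueness = df["customer_id"].nunique() / df["customer_id"].notna().sum()

print(completeness)                              # customer_id 0.8, email 0.8
print(f"customer_id uniqueness: {uniqueness:.2f}")  # 0.75 (one duplicate)
```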
An extract, transform, and load (ETL) process using AWS Glue is triggered once a day to extract the required data and transform it into the required format and quality, following the data product principle of data mesh architectures. This process is shown in the following figure.
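As an illustration of what such a daily job can look like, here is a minimal AWS Glue PySpark skeleton; the catalog database, table, field mappings, and S3 path are hypothetical placeholders, not values from the article.

```python
import sys
from awsglue.transforms import ApplyMapping
from awsglue.utils import getResolvedOptions
from awsglue.context import GlueContext
from awsglue.job import Job
from pyspark.context import SparkContext

args = getResolvedOptions(sys.argv, ["JOB_NAME"])
glue_context = GlueContext(SparkContext.getOrCreate())
job = Job(glue_context)
job.init(args["JOB_NAME"], args)

# Extract: read the source table registered in the Glue Data Catalog
# ("sales_db" and "orders_raw" are hypothetical names).
source = glue_context.create_dynamic_frame.from_catalog(
    database="sales_db", table_name="orders_raw"
)

# Transform: keep and rename only the fields the data product exposes.
mapped = ApplyMapping.apply(
    frame=source,
    mappings=[
        ("order_id", "string", "order_id", "string"),
        ("amount", "double", "order_amount", "double"),
    ],
)

# Load: write the curated output as Parquet (bucket name is a placeholder).
glue_context.write_dynamic_frame.from_options(
    frame=mapped,
    connection_type="s3",
    connection_options={"path": "s3://example-curated-bucket/orders/"},
    format="parquet",
)
job.commit()
```

The once-a-day cadence would typically be attached as a scheduled (cron-based) Glue trigger on the job rather than coded in the script itself.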
In addition to using native managed AWS services that BMS didn’t need to worry about upgrading, BMS was looking to offer an ETL service to non-technical business users who could visually compose data transformation workflows and seamlessly run them on the AWS Glue Apache Spark-based serverless data integration engine.
An understanding of the data’s origins and history helps answer questions about the provenance of data in Key Performance Indicator (KPI) reports, including: How are the report tables and columns defined in the metadata? Who are the data owners? What are the transformation rules? Data Governance.
Data analysts and engineers use dbt to transform, test, and document data in the cloud data warehouse. Yet every dbt transformation contains vital metadata that is not captured – until now. Data Transformation in the Modern Data Stack. How exactly was the data transformed?
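One place that transformation metadata does live is dbt's build artifact, target/manifest.json. The sketch below is an illustration rather than anything from the article: it walks that file to list each model and its upstream dependencies, and the path assumes a dbt project's default target directory.

```python
import json
from pathlib import Path

# dbt writes a manifest describing every node after `dbt compile` / `dbt run`.
manifest = json.loads(Path("target/manifest.json").read_text())

for node_id, node in manifest["nodes"].items():
    if node["resource_type"] != "model":
        continue
    upstream = node["depends_on"]["nodes"]
    print(f"{node['name']} <- {', '.join(upstream) or '(no upstream nodes)'}")
```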
Here are six benefits of automating end-to-end data lineage: Reduced Errors and Operational Costs. Data quality is crucial to every organization. Automated data capture can significantly reduce errors when compared to manual entry. Automating data capture frees up resources to focus on more strategic and useful tasks.
There are countless examples of big data transforming many different industries. There is no disputing the fact that the collection and analysis of massive amounts of unstructured data has been a huge breakthrough. How does Data Virtualization manage data quality requirements?
The entire generative AI pipeline hinges on the data pipelines that empower it, making it imperative to take the correct precautions. Four key components ensure reliable data ingestion. Data quality and governance: data quality means ensuring the security of data sources, maintaining holistic data, and providing clear metadata.
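As a concrete, purely illustrative example of an ingestion-time quality gate, the sketch below validates an incoming batch against a minimal expected schema before it is accepted; the schema and field names are assumptions.

```python
from typing import Any

# Hypothetical expected schema for an incoming batch: field -> type.
EXPECTED_SCHEMA = {"event_id": str, "user_id": str, "amount": float}

def validate_record(record: dict[str, Any]) -> list[str]:
    """Return a list of quality violations for one ingested record."""
    errors = []
    for field, expected_type in EXPECTED_SCHEMA.items():
        if field not in record or record[field] is None:
            errors.append(f"missing field: {field}")
        elif not isinstance(record[field], expected_type):
            errors.append(f"bad type for {field}: {type(record[field]).__name__}")
    return errors

batch = [
    {"event_id": "e1", "user_id": "u1", "amount": 9.99},
    {"event_id": "e2", "user_id": None, "amount": "oops"},
]
for record in batch:
    problems = validate_record(record)
    print(record["event_id"], "OK" if not problems else problems)
```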
Prior to the creation of the data lake, Orca’s data was distributed among various data silos, each owned by a different team with its own data pipelines and technology stack. Moreover, running advanced analytics and ML on disparate data sources proved challenging.
This is done by visualizing the Azure Data Factory pipelines’ full column-level lineage with source-to-target traceability through different data transformations at the most detailed level. Octopai can fully map the BI landscape and trace metadata movement in a mixed environment, including complex multi-vendor landscapes.
Organizations have spent a lot of time and money trying to harmonize data across diverse platforms, including cleansing, uploading metadata, converting code, defining business glossaries, tracking data transformations and so on. So questions linger about whether transformed data can be trusted.
We chatted about industry trends, why decentralization has become a hot topic in the data world, and how metadata drives many data-centric use cases. But, through it all, Mohan says it’s critical to view everything through the same lens: gaining business value from data. Data fabric is a technology architecture.
Just as a navigation app provides a detailed map of roads, guiding you from your starting point to your destination while highlighting every turn and intersection, data flow lineage offers a comprehensive view of data movement and transformations throughout its lifecycle. Open Source Data Lineage Tools.
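To make the map analogy concrete, here is a small illustrative sketch (not tied to any specific tool) that models dataset-level lineage as a directed graph and walks a table's upstream sources; the node names are hypothetical.

```python
import networkx as nx

# Each edge points from an upstream dataset to the dataset derived from it.
lineage = nx.DiGraph()
lineage.add_edges_from([
    ("crm.customers_raw", "staging.customers"),
    ("erp.orders_raw", "staging.orders"),
    ("staging.customers", "mart.customer_orders"),
    ("staging.orders", "mart.customer_orders"),
])

# Every turn and intersection on the way to the report table:
upstream = nx.ancestors(lineage, "mart.customer_orders")
print(sorted(upstream))
# ['crm.customers_raw', 'erp.orders_raw', 'staging.customers', 'staging.orders']
```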
OntoRefine is a data transformation tool that lets you unite plenty of data formats and get them into your triplestore. One of the core upsides of storing your data in that format is inference. You can think of that as metadata about the data, describing its relationships. Inferring new knowledge.
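For a flavor of what inference over triples looks like, here is a minimal, purely illustrative sketch using rdflib with the owlrl reasoner (neither is part of OntoRefine); the example triples are made up.

```python
from rdflib import Graph, Namespace, RDF, RDFS
import owlrl

EX = Namespace("http://example.org/")

g = Graph()
# Explicit facts: Fido is a Dog, and Dog is a subclass of Animal.
g.add((EX.Fido, RDF.type, EX.Dog))
g.add((EX.Dog, RDFS.subClassOf, EX.Animal))

# Apply RDFS semantics: the reasoner materializes the implied triples.
owlrl.DeductiveClosure(owlrl.RDFS_Semantics).expand(g)

# Inferred, never stated directly: Fido is also an Animal.
print((EX.Fido, RDF.type, EX.Animal) in g)  # True
```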
For example, GPS, social media, and cell phone handoffs are modeled as graphs, while data catalogs, data lineage, and MDM tools leverage knowledge graphs for linking metadata with semantics. RDF is used extensively for data publishing and data interchange and is based on W3C and other industry standards.
As the latest iteration in this pursuit of high-quality data sharing, DataOps combines a range of disciplines. It synthesizes all we’ve learned about agile, data quality, and ETL/ELT. This produces end-to-end lineage so business and technology users alike can understand the state of a data lake and/or lakehouse.
Background: A successful data-driven organization recognizes data as a key enabler of increased and sustained innovation. The goal of a data product is to solve the long-standing issue of data silos and data quality. This solution solves the interoperability and linkage problem for data products.
For HealthCo, this meant they could finally see how data moved from its source through various transformations to its final destination. This visibility was crucial for identifying and rectifying data quality issues quickly, ensuring consistent and reliable insights. This is where Octopai excels.
Businesses face significant hurdles when preparing data for artificial intelligence (AI) applications. The existence of data silos and duplication, alongside apprehensions regarding data quality, presents a multifaceted environment for organizations to manage.
In addition to drivers like digital transformation and compliance, it’s really important to look at the effect of poor data on enterprise efficiency/productivity. Then the data is accessible and understandable via role-based, contextual views so stakeholders can make strategic decisions based on accurate insights.
Datasphere goes beyond the “big three” data usage end-user requirements (ease of discovery, access, and delivery) to include data orchestration (DataOps and data transformations) and business data contextualization (semantics, metadata, catalog services).
Domain ownership recognizes that the teams generating the data have the deepest understanding of it and are therefore best suited to manage, govern, and share it effectively. This principle makes sure data accountability remains close to the source, fostering higher data quality and relevance.
In this blog, we’ll delve into the critical role of governance and data modeling tools in supporting a seamless data mesh implementation and explore how erwin tools can be used in that role. erwin also provides data governance, metadata management and data lineage software called erwin Data Intelligence by Quest.
This is especially beneficial when teams need to increase data product velocity with trust and data quality, reduce communication costs, and help data solutions align with business objectives. In most enterprises, data is needed and produced by many business units but owned and trusted by no one.
What Is Data Governance In The Public Sector? Effective data governance for the public sector enables entities to ensure dataquality, enhance security, protect privacy, and meet compliance requirements. With so much focus on compliance, democratizing data for self-service analytics can present a challenge.
For data management teams, achieving more with fewer resources has become a familiar challenge. While efficiency is a priority, data quality and security remain non-negotiable. Developing and maintaining data transformation pipelines are among the first tasks to be targeted for automation.
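One common automation pattern, sketched below purely as an illustration, is to declare the pipeline as configuration so that new steps can be added or generated without hand-writing pipeline code; the step names and data are made up.

```python
import pandas as pd

# Hypothetical pipeline declared as data rather than code.
PIPELINE = [
    ("drop_nulls", {"subset": ["customer_id"]}),
    ("rename", {"columns": {"amt": "order_amount"}}),
]

STEPS = {
    "drop_nulls": lambda df, subset: df.dropna(subset=subset),
    "rename": lambda df, columns: df.rename(columns=columns),
}

def run_pipeline(df: pd.DataFrame) -> pd.DataFrame:
    """Apply each declared step to the frame, in order."""
    for name, params in PIPELINE:
        df = STEPS[name](df, **params)
    return df

raw = pd.DataFrame({"customer_id": [1, None], "amt": [10.0, 5.0]})
print(run_pipeline(raw))  # one row, with `amt` renamed to `order_amount`
```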
It allows organizations to see how data is being used, where it is coming from, its quality, and how it is being transformed. DataOps Observability includes monitoring and testing the data pipeline, data quality, data testing, and alerting. Data lineage is static and often lags by weeks or months.
The quick and dirty definition of data mapping is the process of connecting different types of data from various data sources. Data mapping is a crucial step in data modeling and can help organizations achieve their business goals by enabling data integration, migration, transformation, and quality.
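As a toy illustration of field-level mapping (not from the article), the sketch below maps records from a hypothetical legacy source schema onto a target schema, with a per-field transformation function attached to each mapping.

```python
from datetime import datetime

# Hypothetical mapping: target field -> (source field, transform).
FIELD_MAP = {
    "customer_id": ("CustID", int),
    "full_name": ("Name", str.strip),
    "signup_date": ("Created", lambda s: datetime.strptime(s, "%m/%d/%Y").date()),
}

def map_record(source: dict) -> dict:
    """Rename and convert one source record into the target schema."""
    return {target: fn(source[src]) for target, (src, fn) in FIELD_MAP.items()}

legacy_row = {"CustID": "42", "Name": "  Ada Lovelace ", "Created": "07/01/2024"}
print(map_record(legacy_row))
# {'customer_id': 42, 'full_name': 'Ada Lovelace', 'signup_date': datetime.date(2024, 7, 1)}
```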
Jet’s interface lets you handle data administration easily, without advanced coding skills. You don’t need technical skills to manage complex data workflows in the Fabric environment. Data Lineage and Documentation Jet Analytics simplifies the process of documenting data assets and tracking data lineage in Fabric.