The Race for Data Quality in a Medallion Architecture
The medallion architecture pattern is gaining traction among data teams. It is a layered approach to managing and transforming data. It sounds great, but how do you prove the data is correct at each layer? How do you ensure data quality in every layer?
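One way to make "prove the data is correct at each layer" concrete is a programmatic gate between layers. Below is a minimal sketch, assuming pandas and hypothetical table and column names (order_id, amount), of a check a team might run before promoting a silver-layer table to gold; it illustrates the idea rather than any particular article's implementation.

```python
import pandas as pd

# Hypothetical gate between the silver and gold layers: block promotion
# unless basic quality invariants hold. Column names are illustrative.
def validate_silver_orders(df: pd.DataFrame) -> list[str]:
    failures = []
    if df["order_id"].isnull().any():
        failures.append("order_id contains nulls")
    if df["order_id"].duplicated().any():
        failures.append("order_id is not unique")
    if (df["amount"] < 0).any():
        failures.append("amount contains negative values")
    return failures

silver = pd.DataFrame({"order_id": [1, 2, 2], "amount": [10.0, -5.0, 7.5]})
problems = validate_silver_orders(silver)
if problems:
    raise ValueError(f"Silver layer failed validation: {problems}")
```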
Data Observability and Data Quality Testing Certification Series
We are excited to invite you to a free four-part webinar series that will elevate your understanding and skills in data observability and data quality testing. Slides and recordings will be provided.
With the growing emphasis on data, organizations are constantly seeking more efficient and agile ways to integrate their data, especially from a wide variety of applications. We take care of the ETL for you by automating the creation and management of data replication. What’s the difference between zero-ETL and Glue ETL?
As technology and business leaders, your strategic initiatives, from AI-powered decision-making to predictive insights and personalized experiences, are all fueled by data. Yet, despite growing investments in advanced analytics and AI, organizations continue to grapple with a persistent and often underestimated challenge: poor data quality.
Thousands of organizations build data integration pipelines to extract and transform data. They establish data quality rules, such as a fixed threshold on daily sales, to ensure the extracted data is of high quality for accurate business decisions. After a few months, daily sales surpassed 2 million dollars, rendering the threshold obsolete.
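A common fix for exactly this failure mode is to replace the static threshold with a rule derived from recent history, so the expectation moves with the business. A minimal sketch, with hypothetical sales figures:

```python
import statistics

# Instead of a fixed "daily sales must stay below $2M" rule, compare today's
# value to a rolling baseline so the check adapts as sales grow.
def within_expected_range(history: list[float], today: float, k: float = 3.0) -> bool:
    mean = statistics.mean(history)
    stdev = statistics.stdev(history)
    return abs(today - mean) <= k * stdev

recent_daily_sales = [1.8e6, 1.9e6, 2.1e6, 2.0e6, 2.2e6]  # hypothetical history
print(within_expected_range(recent_daily_sales, today=2.3e6))  # True: growth, not an anomaly
```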
Hundreds of thousands of organizations build data integration pipelines to extract and transform data. They establish data quality rules to ensure the extracted data is of high quality for accurate business decisions. We also show how to take action based on the data quality results.
RightData – A self-service suite of applications that help you achieve Data Quality Assurance, Data Integrity Audit, and Continuous Data Quality Control with automated validation and reconciliation capabilities. QuerySurge – Continuously detect data issues in your delivery pipelines. Data breaks.
We are excited to announce the General Availability of AWS Glue Data Quality. Our journey started by working backward from our customers who create, manage, and operate data lakes and data warehouses for analytics and machine learning. It takes days for data engineers to identify and implement data quality rules.
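Glue Data Quality expresses rules in DQDL, its rule definition language. As a rough sketch, assuming the boto3 Glue client's create_data_quality_ruleset call and hypothetical database and table names, registering a ruleset might look like this:

```python
import boto3

# Hypothetical ruleset in DQDL (Glue Data Quality's rule language);
# the database and table names below are placeholders.
ruleset = """
Rules = [
    RowCount > 0,
    IsComplete "order_id",
    IsUnique "order_id",
    ColumnValues "amount" >= 0
]
"""

glue = boto3.client("glue")
glue.create_data_quality_ruleset(
    Name="orders_baseline_rules",
    Ruleset=ruleset,
    TargetTable={"DatabaseName": "sales_db", "TableName": "orders"},
)
```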
Several weeks ago (prior to the Omicron wave), I got to attend my first conference in roughly two years: Dataversity’s Data Quality and Information Quality Conference. Ryan Doupe, Chief Data Officer of American Fidelity, held a thought-provoking session that resonated with me. Step 2: Data Definitions.
Companies are no longer wondering if data visualizations improve analyses but what is the best way to tell each data-story. 2020 will be the year of data quality management and data discovery: clean and secure data combined with a simple and powerful presentation. 1) Data Quality Management (DQM).
AWS Glue is a serverless data integration service that makes it simple to discover, prepare, and combine data for analytics, machine learning (ML), and application development. Hundreds of thousands of customers use data lakes for analytics and ML to make data-driven business decisions.
These layers help teams delineate different stages of data processing, storage, and access, offering a structured approach to data management. In the context of Data in Place, validating data quality automatically with Business Domain Tests is imperative for ensuring the trustworthiness of your data assets.
The Third of Five Use Cases in Data Observability
Data Evaluation: This involves evaluating and cleansing new datasets before they are added to production. This process is critical as it ensures data quality from the outset. Examples include regular loading of CRM data and anomaly detection.
Extrinsic Control Deficit: Many of these changes stem from tools and processes beyond the immediate control of the data team. Unregulated ETL/ELT Processes: The absence of stringent data quality tests in ETL (Extract, Transform, Load) or ELT (Extract, Load, Transform) processes further exacerbates the problem.
In this blog post, we’ll discuss how the metadata layer of Apache Iceberg can be used to make data lakes more efficient. You will learn about an open-source solution that can collect important metrics from the Iceberg metadata layer. This ensures that each change is tracked and reversible, enhancing data governance and auditability.
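For context on what collecting metrics from the metadata layer can look like: Iceberg exposes its metadata as queryable tables (snapshots, files, history, and so on). A sketch, assuming a Spark session already configured with an Iceberg catalog and a hypothetical table db.orders:

```python
from pyspark.sql import SparkSession

# Assumes Spark is configured with an Iceberg catalog; db.orders is hypothetical.
spark = SparkSession.builder.getOrCreate()

# Snapshot lineage: when each change was committed and by what operation,
# which is what makes changes trackable and reversible.
spark.sql("SELECT committed_at, snapshot_id, operation FROM db.orders.snapshots").show()

# Per-file metrics: useful for spotting small-file buildup that hurts scan efficiency.
spark.sql("SELECT file_path, record_count, file_size_in_bytes FROM db.orders.files").show()
```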
Have you ever experienced that sinking feeling, where you sense that if you don’t find data quality issues, data quality issues will find you? These discussions are a critical prerequisite for determining data usage, standards, and the business-relevant metrics for measuring and improving data quality.
cycle_end";') con.close() With this, as the data lands in the curated data lake (Amazon S3 in parquet format) in the producer account, the data science and AI teams gain instant access to the source data eliminating traditional delays in the data availability.
When implementing automated validation, AI-driven regression testing, real-time canary pipelines, synthetic data generation, freshness enforcement, KPI tracking, and CI/CD automation, organizations can shift from reactive data observability to proactive data quality assurance.
Residual plots place input data and predictions into a two-dimensional visualization where influential outliers, data-quality problems, and other types of bugs often become plainly visible. For model training and selection, we recommend considering fairness metrics when selecting hyperparameters and decision cutoff thresholds.
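A minimal, self-contained example of such a plot, using synthetic data with a few corrupted records injected so the outlier band is visible:

```python
import numpy as np
import matplotlib.pyplot as plt

# Synthetic stand-in for a real model's output: predictions on the x-axis,
# residuals on the y-axis, with a handful of bad records injected.
rng = np.random.default_rng(0)
y_true = rng.normal(100, 15, 500)
y_pred = y_true + rng.normal(0, 5, 500)
y_pred[::50] += 40  # corrupt a few records so outliers pop out

residuals = y_true - y_pred
plt.scatter(y_pred, residuals, s=8, alpha=0.5)
plt.axhline(0, color="red", linewidth=1)
plt.xlabel("Predicted value")
plt.ylabel("Residual (actual - predicted)")
plt.title("Residual plot: outliers and data-quality issues stand out")
plt.show()
```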
In a sea of questionable data, how do you know what to trust? Data quality tells you the answer. It signals what data is trustworthy, reliable, and safe to use. It empowers engineers to oversee data pipelines that deliver trusted data to the wider organization. Today, as part of its 2022.2
Some will argue that observability is nothing more than testing and monitoring applications using tests, metrics, logs, and other artifacts. That’s a fair point, and it places emphasis on what is most important – what best practices should data teams employ to apply observability to data analytics. It’s not about data quality.
What is Data Quality? Data quality is defined as the degree to which data meets a company’s expectations of accuracy, validity, completeness, and consistency. By tracking data quality, a business can pinpoint potential issues harming quality, and ensure that shared data is fit to be used for a given purpose.
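Those dimensions can be scored mechanically. A small illustrative sketch in pandas, with hypothetical columns and rules standing in for a company's real expectations:

```python
import pandas as pd

# Hypothetical frame; the rules below stand in for real business expectations.
df = pd.DataFrame({
    "email": ["a@example.com", None, "not-an-email"],
    "age": [34, -2, 51],
})

completeness = df.notna().all(axis=None)                   # no missing values anywhere
validity = (df["email"].str.contains("@", na=False).all()  # emails look like emails
            and (df["age"] >= 0).all())                    # ages are non-negative
print(f"complete={completeness}, valid={validity}")        # complete=False, valid=False
```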
The Matillion data integration and transformation platform enables enterprises to perform advanced analytics and business intelligence using cross-cloud platform-as-a-service offerings such as Snowflake. DataOps recommends that tests monitor data continuously in addition to checks performed when pipelines are run on demand.
Beyond mere data collection, BI consulting helps businesses create a cohesive data strategy that aligns with organizational goals. This approach involves everything from identifying key metrics to implementing analytics systems and designing dashboards.
Another way to look at the five pillars is to see them in the context of a typical complex data estate. Monitoring is another pillar of Data Journeys, extending down the stack. Moreover, cost monitoring ensures that your data operations stay within budget and that resources are used efficiently. Donkey: Oh, they have layers.
The application supports custom workflows to allow demand and supply planning teams to collaborate, plan, source, and fulfill customer orders, then track fulfillment metrics via persona-based operational and management reports and dashboards. The data quality (DQ) checks are managed using DQ configurations stored in Aurora PostgreSQL tables.
Here, I’ll highlight the where and why of these important “data integration points” that are key determinants of success in an organization’s data and analytics strategy. Layering technology on the overall data architecture introduces more complexity. Data and cloud strategy must align.
Despite soundings on this from leading thinkers such as Andrew Ng, the AI community remains largely oblivious to the important data management capabilities, practices, and – importantly – the tools that ensure the success of AI development and deployment. Further, data management activities don’t end once the AI model has been developed.
Working with large language models (LLMs) for enterprise use cases requires the implementation of quality and privacy considerations to drive responsible AI. However, enterprise data generated from siloed sources combined with the lack of a data integration strategy creates challenges for provisioning the data for generative AI applications.
Data Journeys track and monitor all levels of the data stack, from data to tools to code to tests across all critical dimensions. A Data Journey supplies real-time statuses and alerts on start times, processing durations, test results, and infrastructure events, among other metrics.
Prior to the creation of the data lake, Orca’s data was distributed among various data silos, each owned by a different team with its own data pipelines and technology stack. Moreover, running advanced analytics and ML on disparate data sources proved challenging.
Data quality for account and customer data – Altron wanted to enable data quality and data governance best practices. Goals – Lay the foundation for a data platform that can be used in the future by internal and external stakeholders.
In 2022, AWS commissioned a study conducted by the American Productivity and Quality Center (APQC) to quantify the Business Value of Customer 360. The following figure shows some of the metrics derived from the study. Data exploration helps unearth inconsistencies, outliers, or errors.
IT should be involved to ensure governance, knowledge transfer, data integrity, and the actual implementation. Clean data in, clean analytics out. Cleaning your data may not be quite as simple, but it will ensure the success of your BI. Indeed, every year low-quality data is estimated to cost over $9.7
With the help of Hawk-Eye, Statcast tracks and quantifies all manner of data: pitching (including velocity, spin rate and direction, and movement), hitting (exit velocity, launch angle, batted ball distance), running (sprint speed, base-to-base times), and fielding (arm strength, catch probability, catcher pop time).
Financial Performance Dashboard The financial performance dashboard provides a comprehensive overview of key metrics related to your balance sheet, shedding light on the efficiency of your capital expenditure. While sales dashboards focus on future prospects, accounting primarily focuses on analyzing the same metrics retrospectively.
Different departments managed their data independently, leading to silos and inconsistencies. For instance, aligning patient care data from Oracle databases with operational metrics from Power BI was daunting without clear data lineage. Establishing that lineage led to better integration and consistency across the organization.
An HR dashboard functions as an advanced analytics tool that utilizes interactive data visualizations to present crucial HR metrics. Similar to various other business departments, human resources is gradually transforming into a data-centric function. Otherwise, it may become a ‘vanity metric’ in the HR dashboard.
Photo by Markus Spiske on Unsplash Introduction Senior data engineers and data scientists are increasingly incorporating artificial intelligence (AI) and machine learning (ML) into data validation procedures to increase the quality, efficiency, and scalability of data transformations and conversions.
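As a flavor of what that looks like in practice, here is a minimal sketch, assuming scikit-learn and hypothetical per-run pipeline metrics (row count, null ratio, load duration), of letting an unsupervised model learn what a normal run looks like instead of hand-tuning a threshold per metric:

```python
import numpy as np
from sklearn.ensemble import IsolationForest

# Hypothetical metrics from past healthy runs: [row_count, null_ratio, load_seconds].
history = np.array([
    [100_000, 0.010, 120],
    [101_500, 0.020, 118],
    [ 99_800, 0.010, 125],
    [100_900, 0.015, 121],
])
model = IsolationForest(contamination=0.1, random_state=0).fit(history)

todays_run = np.array([[52_000, 0.30, 119]])  # sudden row drop and null spike
print(model.predict(todays_run))  # -1 marks the run as anomalous, 1 as normal
```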
Key Influencer Analytics – Understand interrelationships and the impact of data columns on each other and on target columns. Sentiment Analysis – This sophisticated analytical technique goes beyond quantitative questionnaires and surveys to capture the real opinions, feelings, and sentiments of consumers, employees, and other stakeholders.
Creating a single view of any data, however, requires the integration of data from disparate sources. Data integration is valuable for businesses of all sizes due to the many benefits of analyzing data from different sources. But data integration is not trivial. Establishes Trust in Data.
It has been well documented since the State of DevOps 2019 DORA metrics were published that, with DevOps, companies can deploy software 208 times more often and 106 times faster, recover from incidents 2,604 times faster, and release 7 times fewer defects. Finally, data integrity is of paramount importance.
They are going to have different ways of combining numbers into metrics. We can almost guarantee you different results from each, and you end up with no data integrity whatsoever. Data quality issues. Here’s the ugly truth: Everybody has a data quality problem. Learn how to prepare your data for BI.
Security and privacy – When all data scientists and AI models are given access to data through a single point of entry, data integrity and security are improved. They can also spot and root out bias and drift proactively by monitoring, cataloging and governing their models.