The Race For Data Quality In A Medallion Architecture: The Medallion architecture pattern is gaining traction among data teams. It is a layered approach to managing and transforming data. It sounds great, but how do you prove the data is correct at each layer? How do you ensure data quality in every layer?
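To make that question concrete, here is a minimal sketch, assuming pandas DataFrames and invented column names such as order_id, amount, and daily_revenue, of what per-layer quality checks in a Medallion pipeline could look like. It is an illustration under those assumptions, not the article's implementation.

```python
# Illustrative per-layer checks for a Medallion (bronze/silver/gold) pipeline.
# Column names and thresholds are assumptions for the sketch.
import pandas as pd

def check_bronze(df: pd.DataFrame) -> list[str]:
    """Raw layer: only verify that rows arrived and key columns exist."""
    issues = []
    if df.empty:
        issues.append("bronze: no rows ingested")
    for col in ("order_id", "event_ts"):  # assumed columns
        if col not in df.columns:
            issues.append(f"bronze: missing column {col}")
    return issues

def check_silver(df: pd.DataFrame) -> list[str]:
    """Cleaned layer: enforce uniqueness and completeness."""
    issues = []
    if df["order_id"].duplicated().any():
        issues.append("silver: duplicate order_id values")
    if df["amount"].isna().mean() > 0.01:  # tolerate at most 1% nulls
        issues.append("silver: too many null amounts")
    return issues

def check_gold(df: pd.DataFrame) -> list[str]:
    """Curated layer: business-level sanity checks."""
    issues = []
    if (df["daily_revenue"] < 0).any():
        issues.append("gold: negative daily revenue")
    return issues
```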
1) What Is Data Quality Management? 4) Data Quality Best Practices. 5) How Do You Measure Data Quality? 6) Data Quality Metrics Examples. 7) Data Quality Control: Use Case. 8) The Consequences Of Bad Data Quality. 9) 3 Sources Of Low-Quality Data.
We’ve identified two distinct types of data teams: process-centric and data-centric. Understanding this framework offers valuable insights into team efficiency, operational excellence, and data quality. Process-centric data teams focus their energies predominantly on orchestrating and automating workflows.
As technology and business leaders, your strategic initiatives, from AI-powered decision-making to predictive insights and personalized experiences, are all fueled by data. Yet, despite growing investments in advanced analytics and AI, organizations continue to grapple with a persistent and often underestimated challenge: poor data quality.
They establish data quality rules to ensure the extracted data is of high quality for accurate business decisions. These rules commonly assess the data based on fixed criteria reflecting the current business state. In this post, we demonstrate how this feature works with an example.
They establish data quality rules to ensure the extracted data is of high quality for accurate business decisions. These rules assess the data based on fixed criteria reflecting current business states. We are excited to talk about how to use dynamic rules, a new capability of AWS Glue Data Quality.
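For illustration only, the sketch below registers a DQDL ruleset that mixes fixed and dynamic rules through boto3. The ruleset, database, and table names are placeholders, and the exact dynamic-rule syntax should be verified against the AWS Glue Data Quality documentation.

```python
# Hedged sketch: create a Glue Data Quality ruleset with a dynamic rule
# that compares the current row count against recent runs.
import boto3

glue = boto3.client("glue")

ruleset = """
Rules = [
    IsComplete "order_id",
    ColumnValues "amount" >= 0,
    RowCount > avg(last(3))
]
"""

glue.create_data_quality_ruleset(
    Name="orders_dynamic_rules",  # assumed name
    Ruleset=ruleset,
    TargetTable={"DatabaseName": "sales_db", "TableName": "orders"},  # placeholders
)
```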
Alerts and notifications play a crucial role in maintaining data quality because they facilitate prompt and efficient responses to any data quality issues that may arise within a dataset. This proactive approach helps mitigate the risk of making decisions based on inaccurate information.
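As a hedged example of wiring check results to a notification, the snippet below posts the names of failed checks to a webhook. The URL and the check results are hypothetical, not part of the post.

```python
# Minimal sketch: turn failed data quality checks into an alert message.
import requests

WEBHOOK_URL = "https://hooks.example.com/data-quality"  # placeholder endpoint

def notify_on_failures(check_results: dict[str, bool]) -> None:
    """Send one alert listing every check that did not pass."""
    failed = [name for name, passed in check_results.items() if not passed]
    if failed:
        requests.post(
            WEBHOOK_URL,
            json={"text": f"Data quality checks failed: {', '.join(failed)}"},
            timeout=10,
        )

# Example usage with made-up results
notify_on_failures({"row_count": True, "null_rate_amount": False})
```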
Companies are no longer wondering if data visualizations improve analyses but what is the best way to tell each data-story. 2020 will be the year of data quality management and data discovery: clean and secure data combined with a simple and powerful presentation. 1) Data Quality Management (DQM).
RightData – A self-service suite of applications that help you achieve Data Quality Assurance, Data Integrity Audit and Continuous Data Quality Control with automated validation and reconciliation capabilities. QuerySurge – Continuously detect data issues in your delivery pipelines. Data breaks.
Data debt that undermines decision-making: In Digital Trailblazer, I share a story of a private company that reported a profitable year to the board, only to return after the holiday to find that data quality issues and calculation mistakes turned it into an unprofitable one.
In recent years, data lakes have become a mainstream architecture, and data quality validation is a critical factor to improve the reusability and consistency of the data. In this post, we provide benchmark results of running increasingly complex data quality rulesets over a predefined test dataset.
While RAG leverages nearest-neighbor metrics based on the relative similarity of texts, graphs allow for better recall of less intuitive connections. The approach decomposes a complex task into a graph of subtasks, then uses LLMs to answer the subtasks while optimizing for costs across the graph.
We are excited to announce the General Availability of AWS Glue Data Quality. Our journey started by working backward from our customers who create, manage, and operate data lakes and data warehouses for analytics and machine learning. It takes days for data engineers to identify and implement data quality rules.
So it’s Monday, and you lead a data analytics team of perhaps 30 people. But wait, she asks you for your team metrics. Like most leaders of data analytics teams, you have been doing very little to quantify your team’s success. Where is your metrics report? What should be in that report about your data team?
The balance sheet gives an overview of the main metrics which can easily define trends and the way company assets are being managed. Operational optimization and forecasting. Cost optimization. Another important factor to consider is cost optimization. Enhanced data quality. It doesn’t stop here.
Furthermore, you can gain insights into the performance of your data transformations with detailed execution logs and metrics, all accessible through the dbt Cloud interface. Cost management and optimization – Because Athena charges based on the amount of data scanned by each query, cost optimization is critical.
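A small illustrative helper for that cost lever might look like the following; the $5-per-TB rate is an assumption and should be replaced with your region's current Athena pricing.

```python
# Hedged sketch: estimate the cost of an Athena query from bytes scanned.
import boto3

PRICE_PER_TB = 5.00  # assumed USD per TB scanned; check current regional pricing

def query_cost(query_execution_id: str) -> float:
    """Return an approximate USD cost for a finished Athena query."""
    athena = boto3.client("athena")
    result = athena.get_query_execution(QueryExecutionId=query_execution_id)
    scanned_bytes = result["QueryExecution"]["Statistics"]["DataScannedInBytes"]
    return scanned_bytes / 1_000_000_000_000 * PRICE_PER_TB
```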
The company has already rolled out a gen AI assistant and is also looking to use AI and LLMs to optimize every process. “We’re doing two things,” he says. “One is going through the big areas where we have operational services and looking at every process to be optimized using artificial intelligence and large language models.”
Data consumers lose trust in data if it isn’t accurate and recent, making data quality essential for making optimal and correct decisions. Evaluation of the accuracy and freshness of data is a common task for engineers. Currently, various tools are available to evaluate data quality.
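One simple, tool-agnostic way to express a freshness check is sketched below; the event_ts column and the six-hour SLA are assumptions made purely for illustration.

```python
# Illustrative freshness check: flag a dataset as stale if its newest
# record is older than an agreed SLA.
from datetime import datetime, timedelta, timezone
import pandas as pd

FRESHNESS_SLA = timedelta(hours=6)  # assumed SLA

def is_fresh(df: pd.DataFrame, ts_column: str = "event_ts") -> bool:
    """Return True if the newest timestamp is within the freshness SLA."""
    latest = pd.to_datetime(df[ts_column], utc=True).max()
    return datetime.now(timezone.utc) - latest <= FRESHNESS_SLA
```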
In a previous post , we noted some key attributes that distinguish a machine learning project: Unlike traditional software where the goal is to meet a functional specification, in ML the goal is to optimize a metric. Quality depends not just on code, but also on data, tuning, regular updates, and retraining.
At measurement-obsessed companies, every part of their product experience is quantified and adjusted to optimize user experience. These companies eventually moved beyond using data to inform product design decisions. Without large amounts of good raw and labeled training data, solving most AI problems is not possible.
L1 is usually the raw, unprocessed data ingested directly from various sources; L2 is an intermediate layer featuring data that has undergone some form of transformation or cleaning; and L3 contains highly processed, optimized data that is typically ready for analytics and decision-making. What is Data in Use?
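A toy pandas sketch of that L1 to L2 to L3 flow, with invented data and column names, just to make the layering tangible:

```python
# Toy example: raw ingest (L1) -> cleaned intermediate (L2) -> analytics-ready (L3).
import pandas as pd

raw = pd.DataFrame({"order_id": [1, 1, 2], "amount": ["10", "10", "bad"]})  # L1: as ingested

silver = (
    raw.drop_duplicates("order_id")                                          # L2: deduplicate
       .assign(amount=lambda d: pd.to_numeric(d["amount"], errors="coerce"))  # coerce bad values
       .dropna(subset=["amount"])                                             # drop unparseable rows
)

gold = silver.groupby("order_id", as_index=False)["amount"].sum()            # L3: aggregated view
```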
Despite their advantages, traditional data lake architectures often grapple with challenges such as understanding deviations from the most optimal state of the table over time, identifying issues in data pipelines, and monitoring a large number of tables. It is essential for optimizing read and write performance.
The application supports custom workflows to allow demand and supply planning teams to collaborate, plan, source, and fulfill customer orders, then track fulfillment metrics via persona-based operational and management reports and dashboards. The data quality (DQ) checks are managed using DQ configurations stored in Aurora PostgreSQL tables.
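A hedged sketch of how such table-driven DQ configurations might be read at runtime follows; the connection details, table, and columns (dq_config, check_name, rule_expression, and so on) are hypothetical, not the schema described in the post.

```python
# Hedged sketch: load enabled data quality check configurations from PostgreSQL.
import psycopg2

def load_dq_configs():
    """Fetch (check_name, target_table, rule_expression, threshold) rows."""
    conn = psycopg2.connect(
        host="aurora-endpoint", dbname="dq", user="dq_reader", password="***"  # placeholders
    )
    with conn, conn.cursor() as cur:
        cur.execute(
            "SELECT check_name, target_table, rule_expression, threshold "
            "FROM dq_config WHERE enabled"
        )
        return cur.fetchall()
```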
At Workiva, they recognized that they are only as good as their data, so they centered their initial DataOps efforts around lowering errors. Hodges commented, “Our first focus was to up our game around data quality and lowering errors in production.” Multiple Metrics for Success. At GSK, success is all about adoption.
Domain ownership recognizes that the teams generating the data have the deepest understanding of it and are therefore best suited to manage, govern, and share it effectively. This principle makes sure data accountability remains close to the source, fostering higher data quality and relevance.
For the first time, we’re consolidating data to create real-time dashboards for revenue forecasting, resource optimization, and labor utilization. Data literacy across the company was a challenge because, as is often the case, we were all describing our business data a little differently. How is the new platform helping?
For container terminal operators, data-driven decision-making and efficient data sharing are vital to optimizing operations and boosting supply chain efficiency. The data science and AI teams are able to explore and use new data sources as they become available through Amazon DataZone.
While sometimes it’s okay to follow your instincts, the vast majority of your business-based decisions should be backed by metrics, facts, or figures related to your aims, goals, or initiatives that can ensure a stable backbone to your management reports and business operations. In most cases, relying on gut feeling alone can prove detrimental to the business.
For example, McKinsey suggests five metrics for digital CEOs , including the financial return on digital investments, the percentage of leaders’ incentives linked to digital, and the percentage of the annual tech budget spent on bold digital initiatives. As a result, outcome-based metrics should be your guide.
A manufacturing Key Performance Indicator (KPI) or metric is a well-defined and quantifiable measure that the manufacturing industry uses to gauge its performance over time. Manufacturing companies specifically use KPIs to monitor, analyze, and optimize operations, often comparing their efficiencies to those of competitors in the same sector.
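As a worked example of one widely used manufacturing KPI, Overall Equipment Effectiveness (OEE) multiplies availability, performance, and quality; the figures below are invented for the illustration.

```python
# Worked example: Overall Equipment Effectiveness (OEE).
# OEE = Availability x Performance x Quality.
def oee(run_time_h, planned_time_h, actual_output, ideal_output, good_units, total_units):
    availability = run_time_h / planned_time_h     # share of planned time actually running
    performance = actual_output / ideal_output      # actual vs. ideal throughput
    quality = good_units / total_units              # share of units without defects
    return availability * performance * quality

print(oee(run_time_h=7, planned_time_h=8,
          actual_output=900, ideal_output=1000,
          good_units=880, total_units=900))
# ~0.77, i.e. roughly 77% effective with these made-up numbers
```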
While data tends to be used in tactical-operational areas such as HR reporting and controlling, there is still room for improvement in the strategic area of people analytics. Most use master data to make daily processes more efficient and to optimize the use of existing resources.
Data quality for account and customer data – Altron wanted to enable data quality and data governance best practices. Goals – Lay the foundation for a data platform that can be used in the future by internal and external stakeholders.
The questions reveal a bunch of things we used to worry about, and continue to, like data quality and creating data-driven cultures. That means: All of these metrics are off. I can use that to hypothesize what an optimal budget allocation might look like.
Unlike traditional approaches, deep automation is holistic, adaptive, and evolutive, prioritizing human-machine partnership and customer experience for optimal efficiency and impact. AI-integrated tractors, planters, and harvesters form a data-driven team, optimizing tasks and empowering farmers.
Migrating to Amazon Redshift offers organizations the potential for improved price-performance, enhanced data processing, faster query response times, and better integration with technologies such as machine learning (ML) and artificial intelligence (AI).
Data without context is just meaningless noise, and any effort to improve or extract value from your data without considering the larger business context is doomed to fall short. Unfortunately, traditional approaches to data remediation often focus on technical data quality in isolation from the broader data and business ecosystem.
As Dan Jeavons, Data Science Manager at Shell, stated: “what we try to do is to think about minimal viable products that are going to have a significant business impact immediately and use that to inform the KPIs that really matter to the business.” Business intelligence and analytics allow users to know their businesses on a deeper level.
Some will argue that observability is nothing more than testing and monitoring applications using tests, metrics, logs, and other artifacts. That’s a fair point, and it places emphasis on what is most important – what best practices should data teams employ to apply observability to data analytics. It’s not about data quality.
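To make that distinction concrete, here is a minimal, assumption-laden sketch of instrumenting a pipeline step with a log line carrying metrics, which is closer to observability than to a pass/fail quality test; the function and field names are made up.

```python
# Illustrative instrumentation: emit row counts and timing as structured log fields.
import logging
import time

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("pipeline.orders")

def load_orders(rows):
    """Load order records, dropping any without an order_id, and log what happened."""
    start = time.monotonic()
    loaded = [r for r in rows if r.get("order_id") is not None]
    log.info(
        "loaded_rows=%d dropped_rows=%d duration_s=%.2f",
        len(loaded), len(rows) - len(loaded), time.monotonic() - start,
    )
    return loaded

# Example usage with made-up records
load_orders([{"order_id": 1}, {"order_id": None}])
```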
Prior to the creation of the data lake, Orca’s data was distributed among various data silos, each owned by a different team with its own data pipelines and technology stack. Moreover, running advanced analytics and ML on disparate data sources proved challenging.
The main reasons that a company’s data strategy and governance protocols fail to deliver are somewhat universal, regardless of the industry sector. Without a doubt, no company can achieve lasting profitability and sustainable growth with a poorly constructed data governance methodology. Incomplete data. Lack of commitment.
Result: 40%-50% fewer UAT issues. Streamlining workflows: GenAI analyzes post-deployment metrics to optimize SDLC workflows for faster, more reliable development. Invest in data quality: GenAI models are only as good as the data they’re trained on – with GenAI, mistakes can be amplified at speed.
To provide a variety of products, services, and solutions that are better suited to customers and society in each region, we have built business processes and systems that are optimized for each region and its market. Foundation – This role encompasses the data steward and governance team. Each role has sub-roles.
Beyond mere data collection, BI consulting helps businesses create a cohesive data strategy that aligns with organizational goals. This approach involves everything from identifying key metrics to implementing analytics systems and designing dashboards.
The next step in every organization’s data strategy, Guan says, should be investing in and leveraging artificial intelligence and machine learning to unlock more value out of their data. “CIOs should first understand the different approaches to observing data and how it differs from quality management,” he notes.