1) What Is Data Quality Management? 4) Data Quality Best Practices. 5) How Do You Measure Data Quality? 6) Data Quality Metrics Examples. 7) Data Quality Control: Use Case. 8) The Consequences Of Bad Data Quality. 9) 3 Sources Of Low-Quality Data.
Amazon SageMaker Lakehouse, now generally available, unifies all your data across Amazon Simple Storage Service (Amazon S3) data lakes and Amazon Redshift data warehouses, helping you build powerful analytics and AI/ML applications on a single copy of data. Having confidence in your data is key.
As technology and business leaders, your strategic initiatives, from AI-powered decision-making to predictive insights and personalized experiences, are all fueled by data. Yet, despite growing investments in advanced analytics and AI, organizations continue to grapple with a persistent and often underestimated challenge: poor data quality.
Unifying these necessitates additional data processing, requiring each business unit to provision and maintain a separate data warehouse. This burdens business units that are focused solely on consuming the curated data for analysis and are not concerned with data management tasks such as cleansing or comprehensive data processing.
Data quality is crucial in data pipelines because it directly impacts the validity of the business insights derived from the data. Today, many organizations use AWS Glue Data Quality to define and enforce data quality rules on their data at rest and in transit.
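For illustration, here is a minimal boto3 sketch of what defining such rules can look like with AWS Glue Data Quality's DQDL (Data Quality Definition Language). The database, table, ruleset name, and thresholds are hypothetical placeholders, not details from the article.

```python
# Hedged sketch: define a Glue Data Quality ruleset with boto3.
# All names (database, table, ruleset) and thresholds are hypothetical.
import boto3

glue = boto3.client("glue", region_name="us-east-1")

# DQDL rules for an assumed "orders" table
ruleset = """
Rules = [
    IsComplete "order_id",
    IsUnique "order_id",
    ColumnValues "status" in ["PENDING", "SHIPPED", "DELIVERED"],
    Completeness "customer_id" > 0.95
]
"""

glue.create_data_quality_ruleset(
    Name="orders-quality-rules",        # hypothetical ruleset name
    Ruleset=ruleset,
    TargetTable={
        "DatabaseName": "sales_db",     # hypothetical Glue database
        "TableName": "orders",          # hypothetical Glue table
    },
)
```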
Domain ownership recognizes that the teams generating the data have the deepest understanding of it and are therefore best suited to manage, govern, and share it effectively. This principle makes sure data accountability remains close to the source, fostering higher data quality and relevance.
Plug-and-play integration: A seamless, plug-and-play integration between data producers and consumers should facilitate rapid use of new data sets and enable quick proofs of concept, such as those run by data science teams. As part of the required data, CHE data is shared using Amazon DataZone.
Data consumers lose trust in data if it isn't accurate and recent, making data quality essential for sound, correct decision-making. Evaluating the accuracy and freshness of data is a common task for engineers. Currently, various tools are available to evaluate data quality.
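As a tool-agnostic illustration, a freshness check can be as simple as comparing the newest record's timestamp against an agreed SLA. This pandas sketch assumes a hypothetical `updated_at` column and a 24-hour SLA.

```python
# Generic freshness check, not tied to any specific data quality tool.
from datetime import timedelta

import pandas as pd

FRESHNESS_SLA = timedelta(hours=24)  # assumed SLA; tune per dataset

def is_fresh(df: pd.DataFrame, ts_column: str = "updated_at") -> bool:
    """Return True if the newest record falls within the freshness SLA."""
    latest = pd.to_datetime(df[ts_column], utc=True).max()
    age = pd.Timestamp.now(tz="UTC") - latest
    return age <= FRESHNESS_SLA
```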
Our next book is dedicated to anyone who wants to start a career as a data scientist and is looking to gain the necessary knowledge and skills in an accessible, well-structured way. Originally published in 2018, the book received a second edition in January 2022. 4) "SQL Performance Explained" by Markus Winand.
Poor-quality data can lead to incorrect insights, bad decisions, and lost opportunities. AWS Glue Data Quality measures and monitors the quality of your dataset. It supports both data quality at rest and data quality in AWS Glue extract, transform, and load (ETL) pipelines.
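For the at-rest path, a hedged boto3 sketch of triggering an evaluation run against a ruleset like the one sketched earlier might look as follows; the role ARN, ruleset, and table names are placeholders.

```python
# Hedged sketch: run an at-rest Glue Data Quality evaluation with boto3.
import boto3

glue = boto3.client("glue")

run = glue.start_data_quality_ruleset_evaluation_run(
    DataSource={
        "GlueTable": {"DatabaseName": "sales_db", "TableName": "orders"}
    },
    Role="arn:aws:iam::123456789012:role/GlueDQRole",  # placeholder ARN
    RulesetNames=["orders-quality-rules"],             # hypothetical ruleset
)

# Poll the run for its status and overall result
result = glue.get_data_quality_ruleset_evaluation_run(RunId=run["RunId"])
print(result["Status"])
```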
This also includes building an industry-standard integrated data repository as a single source of truth, operational reporting through real-time metrics, data quality monitoring, a 24/7 helpdesk, and revenue forecasting through financial projections and supply availability projections.
Large-scale data warehouse migration to the cloud is a complex and challenging endeavor that many organizations undertake to modernize their data infrastructure, enhance data management capabilities, and unlock new business opportunities. This makes sure the new data platform can meet current and future business goals.
With this new functionality, customers can create up-to-date replicas of their data from applications such as Salesforce, ServiceNow, and Zendesk in an Amazon SageMaker Lakehouse and Amazon Redshift. SageMaker Lakehouse gives you the flexibility to access and query your data in place with all Apache Iceberg-compatible tools and engines.
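As a sketch of that in-place access, the following PySpark snippet assumes a session configured with the Apache Iceberg runtime and an AWS Glue catalog; the catalog alias and table names are hypothetical, and settings such as the warehouse location and packaged JARs are elided.

```python
# Hedged sketch: query an Iceberg table in place from PySpark.
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder.appName("lakehouse-query")
    # Register a hypothetical "lakehouse" catalog backed by AWS Glue
    .config("spark.sql.catalog.lakehouse",
            "org.apache.iceberg.spark.SparkCatalog")
    .config("spark.sql.catalog.lakehouse.catalog-impl",
            "org.apache.iceberg.aws.glue.GlueCatalog")
    .getOrCreate()
)

# Plain SQL against the Iceberg table; no copy of the data is made
recent_orders = spark.sql("""
    SELECT customer_id, COUNT(*) AS order_count
    FROM lakehouse.sales_db.orders
    WHERE order_date >= date_sub(current_date(), 30)
    GROUP BY customer_id
""")
recent_orders.show()
```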
It also makes it easier for engineers, data scientists, product managers, analysts, and business users to access data throughout an organization to discover, use, and collaborate to derive data-driven insights. The producer also needs to manage and publish the data asset so it’s discoverable throughout the organization.
Data virtualization is ideal in any situation where the following is necessary: information coming from diverse data sources, or multi-channel publishing of data services. How does Data Virtualization manage data quality requirements? How does Data Virtualization complement Data Warehousing and SOA Architectures?
Data lakes are more focused on storing and maintaining all the data in an organization in one place. And unlike data warehouses, which are primarily analytical stores, a data hub is a combination of all types of repositories—analytical, transactional, operational, reference, and data I/O services, along with governance processes.
Solution: To address the challenge, ATPCO sought inspiration from a modern data mesh architecture. In Amazon DataZone, data owners can publish their data and its business catalog (metadata) to ATPCO's DataZone domain. Data consumers can then search for relevant data assets using these human-friendly metadata terms.
Griffin is an open source data quality solution for big data that supports both batch and streaming modes. In today's data-driven landscape, where organizations deal with petabytes of data, the need for automated data validation frameworks has become increasingly critical.
Because it is uniquely metadata-driven, the abstraction layer of a data fabric makes it easier to model, integrate, and query any data source, build data pipelines, and integrate data in real time. This improves data engineering productivity and time-to-value for data consumers. What's a data mesh?
Given the importance of data in the world today, organizations face the dual challenges of managing large-scale, continuously incoming data while vetting its quality and reliability. AWS Glue is a serverless data integration service that you can use to effectively monitor and manage data quality through AWS Glue Data Quality.
Layering technology on the overall data architecture introduces more complexity. Today, data architecture challenges and integration complexity impact the speed of innovation, data quality, data security, data governance, and just about anything important around generating value from data.
Data governance is increasingly top-of-mind for customers as they recognize data as one of their most important assets. Effective data governance enables better decision-making by improving data quality, reducing data management costs, and ensuring secure access to data for stakeholders.
This is the promise of the modern data lakehouse architecture. As analyst Sumit Pal wrote in "Exploring Lakehouse Architecture and Use Cases," published January 11, 2022: "Data lakehouses integrate and unify the capabilities of data warehouses and data lakes, aiming to support AI, BI, ML, and data engineering on a single platform."
If I am moved to write research about a vendor, I'll write it and publish it behind our paywall, on the assumption the advice is valuable. If you read my blog regularly, then you know I rarely write about IT vendors. No single application vendor solves single source of truth since, by their own definition, they sell multiple applications.
It proposes a technological, architectural, and organizational approach to solving data management problems by breaking up the monolithic data platform and decentralizing data management across different domain teams and services. Some examples of data products are data sets, tables, machine learning models, and APIs.
These real-time sources generate data streams that require fresh data and ML models to support accurate decisions. Data quality is crucial for real-time actions because such decisions often can't be taken back.
Data cleansing is the process of identifying and correcting errors, inconsistencies, and inaccuracies in a dataset to ensure its quality, accuracy, and reliability. This process is crucial for businesses that rely on data-driven decision-making, as poor dataquality can lead to costly mistakes and inefficiencies.
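A small, generic pandas pass illustrates the kinds of corrections involved; the column names are hypothetical.

```python
# Generic cleansing sketch: normalize, coerce, de-duplicate, drop incomplete.
import pandas as pd

def cleanse(df: pd.DataFrame) -> pd.DataFrame:
    df = df.copy()
    # Normalize obvious formatting inconsistencies
    df["email"] = df["email"].str.strip().str.lower()
    # Coerce bad numeric entries to NaN instead of failing the whole load
    df["amount"] = pd.to_numeric(df["amount"], errors="coerce")
    # Remove exact duplicates, keeping the first occurrence
    df = df.drop_duplicates()
    # Drop rows missing required fields
    return df.dropna(subset=["customer_id", "amount"])
```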
And as new technology allowed for more publishers and created a higher volume of content, information curation thrived. In today's data-driven world, many data workers are struggling with high volumes of often redundant data… and many long for a data user's version of Wikipedia. Data quality can change with time.
He has published two books on the subject, The Performance Management Revolution: Business Results through Insight and Action, and Profiles in Performance: Business Intelligence Journeys and the Roadmap for Change.
Data mesh solves this by promoting data autonomy, allowing users to make decisions about domains without a centralized gatekeeper. It also improves development velocity, strengthens data governance and access, and yields data quality better aligned with business needs.
It has been well publicized, since the State of DevOps 2019 DORA metrics were published, that with DevOps, companies can deploy software 208 times more often and 106 times faster, recover from incidents 2,604 times faster, and release 7 times fewer defects. Finally, data integrity is of paramount importance.
You may be interested to know that TechJury reports seven out of ten businesses rate data discovery as very important, and that the top three business intelligence trends are data visualization, data quality management, and self-service business intelligence.
1) Offer The Right Tools: Data stewardship is greatly simplified when the right tools are on hand. So ask yourself: does your steward have the software to spot issues with data quality, for example? 2) Always Remember Compliance: There are now many different data privacy and security laws worldwide.
This was for the Chief Data Officer, or head of data and analytics. Gartner also published the same piece of research for other roles, such as Application and Software Engineering. See the recorded webinars: Emerging Practices for a Data-driven Strategy, and Link Data to Business Outcomes. Very interesting.
The data governance, however, is still pretty much over on the data warehouse. Toward the end of the 2000s is when you first started getting teams in industry, as Josh Willis was showing really brilliantly last night, identified as "data science" teams. You know what?
In other words, your talk didn't quite stand out enough to put onstage, but you still get "publish or perish" credits for presenting. That approach probably created data silos between divisions, due to costs, budgets, accounting procedures, etc. A free mini-book about the second survey, Evolving Data Infrastructure, has just been published.
Data quality has always been at the heart of financial reporting, but with rampant growth in data volumes, more complex reporting requirements, and increasingly diverse data sources, there is a palpable sense that some data may be eluding everyday data governance and control. Data Quality Audit.
The key components of a data pipeline are typically: Data Sources: The origin of the data, such as a relational database, data warehouse, data lake, file, API, or other data store. This can include tasks such as data ingestion, cleansing, filtering, aggregation, or standardization.
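The following toy pipeline sketches those stages end to end in pandas; the file path and column names are made up for illustration.

```python
# Toy pipeline: ingest -> cleanse/filter/standardize -> aggregate.
import pandas as pd

def ingest(path: str) -> pd.DataFrame:
    """Data source: read raw records (could equally be a DB, API, or stream)."""
    return pd.read_csv(path)

def transform(df: pd.DataFrame) -> pd.DataFrame:
    """Cleansing, filtering, and standardization in one pass."""
    df = df.dropna(subset=["order_id"])          # cleansing
    df = df[df["amount"] > 0]                    # filtering
    df["currency"] = df["currency"].str.upper()  # standardization
    return df

def aggregate(df: pd.DataFrame) -> pd.DataFrame:
    """Aggregate to the grain the downstream store expects."""
    return df.groupby("customer_id", as_index=False)["amount"].sum()

if __name__ == "__main__":
    print(aggregate(transform(ingest("orders.csv"))))  # hypothetical file
```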
Preventing Data Swamps: Best Practices for Clean Data. Preventing data swamps is crucial to preserving the value and usability of data lakes, as unmanaged data can quickly become chaotic and undermine decision-making.
The quick and dirty definition of data mapping is the process of connecting different types of data from various data sources. Data mapping is a crucial step in data modeling and can help organizations achieve their business goals by enabling data integration, migration, transformation, and quality.
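As a minimal illustration, a field-level mapping can be little more than a rename dictionary applied per record; every field name here is hypothetical.

```python
# Toy data mapping: translate source field names to a target schema.
SOURCE_TO_TARGET = {
    "cust_nm":   "customer_name",
    "dob":       "date_of_birth",
    "addr_ln_1": "address_line_1",
}

def map_record(source: dict) -> dict:
    """Apply the mapping, keeping only fields the target schema defines."""
    return {target: source[src]
            for src, target in SOURCE_TO_TARGET.items() if src in source}

print(map_record({"cust_nm": "Ada", "dob": "1990-01-01", "ignored": 1}))
# -> {'customer_name': 'Ada', 'date_of_birth': '1990-01-01'}
```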
This trend, coupled with evolving work patterns like remote work and the gig economy, has significantly impacted traditional talent acquisition and retention strategies, making it increasingly challenging to find and retain qualified finance talent.
Because this book was published recently, no written reviews are available yet. 4) Big Data: Principles and Best Practices of Scalable Real-Time Data Systems by Nathan Marz and James Warren. Best for: the seasoned BI professional who is ready to think deeply about important issues in data analytics and big data.
A Centralized Hub for Data: Data silos are the number one inhibitor to commerce success, regardless of your business model. Through effective workflow, data quality, and governance tools, a PIM ensures that disparate content is transformed into a company-wide strategic asset. Publish with Ease: Publishing from a PIM is easy.
If your finance team is using JD Edwards (JDE) and Oracle E-Business Suite (EBS), they likely rely on well-maintained and accurate master data to drive meaningful insights through reporting. For these teams, data quality is critical. Ensuring that data is integrated seamlessly for reporting purposes can be a daunting task.