The Race For Data Quality in a Medallion Architecture

DataKitchen

The Medallion architecture pattern is gaining traction among data teams. It is a layered approach to managing and transforming data. It sounds great, but how do you prove the data is correct at each layer? How do you ensure data quality in every layer?
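
For a concrete illustration, here is a minimal sketch of a per-layer quality gate in Python, assuming pandas DataFrames and illustrative column names; the post itself does not prescribe an implementation:

    # Minimal sketch: a quality gate applied between medallion layers.
    # All names (quality_gate, the "id" column, the checks) are assumptions.
    import pandas as pd

    def quality_gate(df: pd.DataFrame, layer: str) -> pd.DataFrame:
        """Fail fast if a layer's output violates basic expectations."""
        checks = {
            "no_empty_frame": len(df) > 0,
            "no_null_keys": df["id"].notna().all(),
            "unique_keys": df["id"].is_unique,
        }
        failures = [name for name, passed in checks.items() if not passed]
        if failures:
            raise ValueError(f"{layer} layer failed checks: {failures}")
        return df

    bronze = pd.DataFrame({"id": [1, 2, 3], "amount": [10.0, None, 7.5]})
    silver = quality_gate(bronze.dropna(subset=["amount"]), layer="silver")
    gold = quality_gate(silver.groupby("id", as_index=False).sum(), layer="gold")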

Build Write-Audit-Publish pattern with Apache Iceberg branching and AWS Glue Data Quality

AWS Big Data

Equally crucial is the ability to segregate and audit problematic data, not just for maintaining data integrity, but also for regulatory compliance, error analysis, and potential data recovery. We discuss two common strategies to verify the quality of published data.
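
As a rough sketch of the Write-Audit-Publish pattern, the steps can look like this in PySpark against an Iceberg table; the catalog, table, and DataFrame names are assumptions, not the post's exact code:

    from pyspark.sql import SparkSession

    # Assumes the Iceberg Spark runtime and SQL extensions are on the classpath,
    # "glue_catalog" is configured as an Iceberg catalog, and the table exists.
    spark = SparkSession.builder.appName("wap-demo").getOrCreate()

    # Write: create an audit branch and direct this session's writes at it.
    spark.sql("ALTER TABLE glue_catalog.db.orders CREATE BRANCH IF NOT EXISTS audit")
    spark.conf.set("spark.wap.branch", "audit")
    incoming = spark.createDataFrame([(1, "pending")], ["order_id", "status"])
    incoming.writeTo("glue_catalog.db.orders").append()

    # Audit: run quality checks against the staged branch, not main.
    staged = spark.read.option("branch", "audit").table("glue_catalog.db.orders")
    assert staged.filter("order_id IS NULL").count() == 0, "null keys in staged data"

    # Publish: fast-forward main to the audited branch head.
    spark.sql("CALL glue_catalog.system.fast_forward('db.orders', 'main', 'audit')")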

The Ultimate Guide to Modern Data Quality Management (DQM) For An Effective Data Quality Control Driven by The Right Metrics

datapine

1) What Is Data Quality Management? 4) Data Quality Best Practices. 5) How Do You Measure Data Quality? 6) Data Quality Metrics Examples. 7) Data Quality Control: Use Case. 8) The Consequences Of Bad Data Quality. 9) 3 Sources Of Low-Quality Data.
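
On the measurement question (items 5 and 6 above), a toy sketch of three standard metrics, completeness, uniqueness, and validity, might look like this in pandas; the column names and the validity rule are assumptions:

    # Illustrative sketch: computing three common data quality metrics.
    import pandas as pd

    df = pd.DataFrame({
        "email": ["a@x.com", None, "b@x.com", "b@x.com"],
        "age": [34, 29, -1, 41],
    })

    completeness = df["email"].notna().mean()      # share of non-null values
    uniqueness = df["email"].nunique() / len(df)   # distinct values per row
    validity = df["age"].between(0, 120).mean()    # share passing a range rule

    print(f"completeness={completeness:.0%} uniqueness={uniqueness:.0%} validity={validity:.0%}")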

Amazon DataZone now integrates with AWS Glue Data Quality and external data quality solutions

AWS Big Data

Today, we are pleased to announce that Amazon DataZone is now able to present data quality information for data assets. Some organizations monitor the quality of their data through third-party solutions. Additionally, Amazon DataZone now offers APIs for importing data quality scores from external systems.
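
A hedged sketch of what importing an external score could look like with boto3's DataZone time-series API is below; the identifiers are placeholders, and the form name, type identifier, and content fields are assumptions to verify against the current API reference:

    # Sketch: pushing an external data quality result into Amazon DataZone.
    import json
    from datetime import datetime, timezone

    import boto3

    datazone = boto3.client("datazone")

    datazone.post_time_series_data_points(
        domainIdentifier="dzd_EXAMPLE",          # placeholder domain ID
        entityIdentifier="asset-id-EXAMPLE",     # placeholder asset ID
        entityType="ASSET",
        forms=[{
            "formName": "quality_result",        # assumed form name
            "typeIdentifier": "amazon.datazone.DataQualityResultFormType",  # assumed
            "timestamp": datetime.now(timezone.utc),
            "content": json.dumps({              # assumed content shape
                "evaluationsCount": 10,
                "passingPercentage": 90.0,
            }),
        }],
    )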

What Is Entity Resolution? How It Works & Why It Matters

Sometimes referred to as data matching or fuzzy matching, entity resolution is critical for data quality, analytics, graph visualization, and AI. Advanced entity resolution using AI is crucial because it efficiently solves many of today’s data quality and analytics problems.
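
As a toy illustration of the fuzzy-matching idea (not any vendor's algorithm), here is a sketch using Python's standard library; the 0.85 threshold is an arbitrary assumption:

    # Toy entity resolution via fuzzy string similarity.
    from difflib import SequenceMatcher
    from itertools import combinations

    records = ["Acme Corp.", "ACME Corporation", "Globex Inc", "Acme Corp"]

    def similarity(a: str, b: str) -> float:
        return SequenceMatcher(None, a.lower(), b.lower()).ratio()

    # Pair up records that likely refer to the same real-world entity.
    matches = [(a, b) for a, b in combinations(records, 2) if similarity(a, b) >= 0.85]
    print(matches)  # e.g. [('Acme Corp.', 'Acme Corp')]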

Implement data quality checks on Amazon Redshift data assets and integrate with Amazon DataZone

AWS Big Data

Data quality is crucial in data pipelines because it directly impacts the validity of the business insights derived from the data. Today, many organizations use AWS Glue Data Quality to define and enforce data quality rules on their data at rest and in transit.
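
For a flavor of what such rules look like, here is a hedged sketch of creating a ruleset in Glue's Data Quality Definition Language (DQDL) with boto3; the database, table, rules, and thresholds are placeholder assumptions:

    # Sketch: defining a Glue Data Quality ruleset in DQDL via boto3.
    import boto3

    glue = boto3.client("glue")

    ruleset = """
    Rules = [
        IsComplete "order_id",
        Uniqueness "order_id" > 0.99,
        ColumnValues "status" in ["pending", "shipped", "returned"]
    ]
    """

    glue.create_data_quality_ruleset(
        Name="orders_quality_rules",
        Ruleset=ruleset,
        TargetTable={"DatabaseName": "sales_db", "TableName": "orders"},
    )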

Navigating the Storm: How Data Engineering Teams Can Overcome a Data Quality Crisis

DataKitchen

Ah, the data quality crisis. It’s that moment when your carefully crafted data pipelines start spewing out numbers that make as much sense as a cat trying to bark. You’ve got yourself a recipe for data disaster.