The Race for Data Quality in a Medallion Architecture. The medallion architecture pattern, a layered approach to managing and transforming data, is gaining traction among data teams. It sounds great, but how do you prove the data is correct at each layer?
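One way to answer that question is to attach explicit validation to every promotion between layers, so each layer's guarantees are tested rather than assumed. The sketch below is a minimal, hypothetical illustration in Python with pandas; the layer rules and column names (event_id, event_ts) are assumptions, not a prescribed medallion implementation.

```python
# A minimal sketch of per-layer validation in a medallion pipeline.
# The checks and column names (event_id, event_ts) are hypothetical.
import pandas as pd

def validate_bronze(df: pd.DataFrame) -> None:
    # Bronze: raw but loadable; every record needs an identifier column.
    assert not df.empty, "bronze batch is empty"
    assert "event_id" in df.columns, "bronze batch missing event_id"

def validate_silver(df: pd.DataFrame) -> None:
    # Silver: cleaned and de-duplicated; keys unique, timestamps parseable.
    assert df["event_id"].notna().all(), "silver: null event_id"
    assert not df["event_id"].duplicated().any(), "silver: duplicate event_id"
    pd.to_datetime(df["event_ts"])  # raises on unparseable timestamps

def promote_to_silver(bronze: pd.DataFrame) -> pd.DataFrame:
    validate_bronze(bronze)
    silver = bronze.dropna(subset=["event_id"]).drop_duplicates("event_id")
    validate_silver(silver)
    return silver
```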
The path to achieving AI at scale is paved with myriad challenges: data quality and availability, deployment, and integration with existing systems among them. Another challenge here stems from the existing architecture within these organizations.
Today, we are pleased to announce that Amazon DataZone can now present data quality information for data assets. Many organizations monitor the quality of their data through third-party solutions. Accordingly, Amazon DataZone now offers APIs for importing data quality scores from external systems.
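For teams wiring up that import, the call might look roughly like the sketch below using boto3's DataZone client. The domain and asset identifiers are placeholders, and the form name, type identifier, and content schema shown are assumptions; consult the DataZone API reference for the exact data quality form type.

```python
# A rough sketch of pushing an externally computed data quality score into
# Amazon DataZone via post_time_series_data_points. The identifiers below
# (domain, asset) are placeholders, and the form name / type identifier /
# content schema are assumptions; check the DataZone API reference.
import json
from datetime import datetime, timezone

import boto3

datazone = boto3.client("datazone")

datazone.post_time_series_data_points(
    domainIdentifier="dzd_example123",    # placeholder domain ID
    entityIdentifier="asset_example456",  # placeholder asset ID
    entityType="ASSET",
    forms=[
        {
            "formName": "ExternalDataQuality",  # assumed form name
            "typeIdentifier": "amazon.datazone.DataQualityResultFormType",  # assumed
            "timestamp": datetime.now(timezone.utc),
            "content": json.dumps({"passingPercentage": 98.5}),  # assumed schema
        }
    ],
)
```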
We have lots of data conferences here. I've taken to asking a question at these conferences: what does data quality mean for unstructured data? Over the years, I've seen a trend: more and more emphasis on AI. This is my version of […]
Poor data quality is one of the top barriers faced by organizations aspiring to be more data-driven. Ill-timed business decisions and misinformed business processes, missed revenue opportunities, failed business initiatives, and complex data systems can all stem from data quality issues.
AWS Glue Data Quality allows you to measure and monitor the quality of data in your data repositories. It's important for business users to be able to see quality scores and metrics to make confident business decisions and to debug data quality issues. An AWS Glue crawler can then crawl the results.
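For context, Glue Data Quality rules are written in DQDL, and rulesets can be created programmatically. The sketch below is a minimal boto3 example; the database and table names and the specific rules are hypothetical.

```python
# A minimal sketch of creating an AWS Glue Data Quality ruleset with boto3.
# The database/table names and the DQDL rules are hypothetical examples.
import boto3

glue = boto3.client("glue")

# DQDL: Glue's Data Quality Definition Language.
dqdl = """Rules = [
    RowCount > 0,
    IsComplete "order_id",
    IsUnique "order_id",
    ColumnValues "amount" >= 0
]"""

glue.create_data_quality_ruleset(
    Name="orders-basic-checks",
    Description="Baseline completeness and validity checks for orders",
    Ruleset=dqdl,
    TargetTable={"DatabaseName": "sales_db", "TableName": "orders"},
)
```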
Some customers build custom in-house data parity frameworks to validate data during migration. Others use open source data quality products for data parity use cases. Either way, building and maintaining a data parity framework diverts important person-hours from the actual migration effort.
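The core of such a parity framework is usually small: compare row counts and a cheap aggregate fingerprint between source and target. The sketch below is a minimal illustration with hypothetical table and key names; sqlite3 stands in for the real engines.

```python
# A minimal sketch of the kind of check a data parity framework performs
# during migration: compare row counts and a cheap aggregate fingerprint
# between source and target. Table and key names are hypothetical; sqlite3
# stands in for the real source and target engines.
import sqlite3

def parity_report(src: sqlite3.Connection, dst: sqlite3.Connection,
                  table: str, key: str) -> dict:
    # table/key should come from trusted config, not untrusted input,
    # since they are interpolated into SQL.
    count_sql = f"SELECT COUNT(*) FROM {table}"
    # Summing a numeric key is a cheap (if imperfect) content fingerprint.
    sum_sql = f"SELECT COALESCE(SUM({key}), 0) FROM {table}"
    src_rows = src.execute(count_sql).fetchone()[0]
    dst_rows = dst.execute(count_sql).fetchone()[0]
    src_sum = src.execute(sum_sql).fetchone()[0]
    dst_sum = dst.execute(sum_sql).fetchone()[0]
    return {
        "rows_match": src_rows == dst_rows,
        "fingerprint_match": src_sum == dst_sum,
        "source_rows": src_rows,
        "target_rows": dst_rows,
    }
```

A checksum over a single key column will miss some differences, which is why production frameworks typically hash full rows or compare column-level aggregates as well.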
With this launch, you can query data regardless of where it is stored with support for a wide range of use cases, including analytics, ad-hoc querying, data science, machine learning, and generative AI. We've simplified data architectures, saving you time and costs on unnecessary data movement, data duplication, and custom solutions.
The benefits are clear, and there's plenty of potential that comes with AI adoption. But even against the backdrop of an AI-dominated future, many organizations still find themselves struggling with everything from managing data volumes and complexity to security concerns to rapidly proliferating data silos and governance challenges.
Data has continued to grow both in scale and in importance through this period, and today telecommunications companies are increasingly seeing data architecture as an independent organizational challenge, not merely an item on an IT checklist. Why telcos should consider modern data architecture. The challenges.
The data mesh design pattern breaks giant, monolithic enterprise data architectures into subsystems or domains, each managed by a dedicated team. Communication between business units and data professionals is usually incomplete and inconsistent. (Introduction to data mesh; source: Thoughtworks.)
Due to the volume, velocity, and variety of data being ingested in data lakes, it can get challenging to develop and maintain policies and procedures to ensure data governance at scale for your data lake. Data confidentiality and data quality are the two essential themes for data governance.
Supply chain management is a complex process involving suppliers, logistics, quality control, and delivery. This post describes how HPE Aruba automated its supply chain management pipeline and re-architected and deployed its data solution by adopting a modern data architecture on AWS.
To improve the way they model and manage risk, institutions must modernize their data management and data governance practices. Implementing a modern data architecture makes it possible for financial institutions to break down legacy data silos, simplifying data management, governance, and integration, and driving down costs.
Data debt that undermines decision-making. In Digital Trailblazer, I share a story of a private company that reported a profitable year to the board, only to return after the holiday to find that data quality issues and calculation mistakes turned it into an unprofitable one.
Legacy data sharing involves proliferating copies of data, creating data management and security challenges. Data quality issues deter trust and hinder accurate analytics (Forrester).
1. Investigate. Data quality is not exactly a riddle wrapped in a mystery inside an enigma. However, understanding your data is essential to using it effectively and improving its quality. To make sense of those data elements, you require business context.
When we talk about data integrity, we're referring to the overarching completeness, accuracy, consistency, accessibility, and security of an organization's data. Together, these factors determine the reliability of the organization's data. Data quality. Data quality is essentially the measure of data integrity.
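To make one of those dimensions concrete, the sketch below scores completeness for a pandas DataFrame as the share of non-null values; it is an illustrative metric, not a standard definition.

```python
# An illustrative sketch: scoring the completeness dimension of data
# quality as the fraction of non-null cells, overall and per column.
import pandas as pd

def completeness(df: pd.DataFrame) -> dict:
    per_column = df.notna().mean()          # share of non-null values per column
    overall = df.notna().to_numpy().mean()  # share of non-null cells overall
    return {"overall": float(overall), "per_column": per_column.to_dict()}

df = pd.DataFrame({"id": [1, 2, 3], "email": ["a@x.com", None, "c@x.com"]})
print(completeness(df))  # overall ~0.83; email column ~0.67
```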
To help you identify and resolve these mistakes, we've put together this guide on the various big data mistakes that marketers tend to make. Big Data Mistakes You Must Avoid. Here are some common big data mistakes you must avoid to ensure that your campaigns aren't affected. Ignoring Data Quality.
dbt has emerged as a leading framework, allowing data teams to transform and manage data pipelines effectively. This enables you to extract insights from your data without the complexity of managing infrastructure. With dbt, teams can define data quality checks and access controls as part of their transformation workflow.
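As a rough illustration, the sketch below runs dbt's built-in generic tests (not_null, unique, accepted_values) from a Python pipeline step; the model and column names are hypothetical.

```python
# A minimal sketch of wiring dbt data quality tests into a pipeline run.
# The model and column names (orders, order_id, status) are hypothetical;
# not_null/unique/accepted_values are dbt built-in generic tests, declared
# in a schema.yml like:
#
#   models:
#     - name: orders
#       columns:
#         - name: order_id
#           tests: [not_null, unique]
#         - name: status
#           tests:
#             - accepted_values:
#                 values: ['placed', 'shipped', 'returned']
import subprocess
import sys

def run_dbt_tests(select: str) -> None:
    """Run dbt tests for the selected models and fail the pipeline on error."""
    result = subprocess.run(
        ["dbt", "test", "--select", select],
        capture_output=True,
        text=True,
    )
    print(result.stdout)
    if result.returncode != 0:
        sys.exit(f"dbt data quality tests failed for '{select}'")

if __name__ == "__main__":
    run_dbt_tests("orders")
```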
Too often the design of new data architectures is based on old principles: they are still very data-store-centric. They consist of many physical data stores in which data is stored repeatedly and redundantly. Over time, new types of data stores […]
The Business Application Research Center (BARC) warns that data governance is a highly complex, ongoing program, not a "big bang initiative," and it runs the risk of participants losing trust and interest over time. Informatica Axon. Informatica Axon is a collection hub and data marketplace for supporting programs.
Need for a data mesh architecture. Because entities in the EUROGATE group generate vast amounts of data from various sources (across departments, locations, and technologies), the traditional centralized data architecture struggles to keep up with the demands for real-time insights, agility, and scalability.
It also helps enterprises put these strategic capabilities into action by: understanding their business, technology, and data architectures and their interrelationships, aligning them with their goals, and defining the people, processes, and technologies required to achieve compliance.
More than that, though, harnessing the potential of these technologies requires quality data; without it, the output from an AI implementation can end up inefficient or wholly inaccurate. Meaningful results and a scalable, flexible data architecture demand a "true" hybrid cloud approach to data management.
They conveniently store data in a flat architecture that can be queried in aggregate and offer the speed and lower cost required for big data analytics. On the other hand, they don't support transactions or enforce data quality, and each ETL step risks introducing failures or bugs that reduce data quality.
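A common mitigation is to guard each ETL step with explicit expectations, so a bad batch fails fast instead of landing in the lake. The sketch below is a minimal pandas illustration; the column names and rules are hypothetical.

```python
# A minimal sketch of guarding an ETL write with explicit data quality
# expectations. Column names (order_id, amount) are hypothetical.
import pandas as pd

def check_quality(df: pd.DataFrame) -> list[str]:
    """Return a list of violated expectations for this batch."""
    problems = []
    if df["order_id"].isna().any():
        problems.append("order_id contains nulls")
    if df["order_id"].duplicated().any():
        problems.append("order_id contains duplicates")
    if (df["amount"] < 0).any():
        problems.append("amount contains negative values")
    return problems

def write_to_lake(df: pd.DataFrame, path: str) -> None:
    # Refuse to persist a batch that fails any expectation.
    problems = check_quality(df)
    if problems:
        raise ValueError(f"refusing to write bad batch: {problems}")
    df.to_parquet(path, index=False)
```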
The complexities of metadata management can be addressed with a strong data management strategy coupled with metadata management software to enable the data quality the business requires. Organizations then can take a data-driven approach to business transformation, speed to insights, and risk management.
The first step to fixing any problem is to understand that problem; this is a significant point of failure when it comes to data. Most organizations agree that they have data issues, categorized as data quality. However, this definition is […]
The phrase "data architecture" often has different connotations across an organization depending on a person's job role. For instance, most of my earlier career roles were within IT, though for the last decade or so I have primarily worked with business-line staff.
A sea of complexity. For years, data ecosystems have gotten more complex due to discrete (and not necessarily strategic) data-platform decisions aimed at addressing new projects, use cases, or initiatives. Layering technology on the overall data architecture introduces more complexity.
A DataOps implementation project consists of three steps. First, you must understand the existing challenges of the data team, including the data architecture and end-to-end toolchain. Based on business rules, additional data quality tests check the dimensional model after the ETL job completes.
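A typical post-ETL test of that kind checks referential integrity: every foreign key in the fact table must resolve to a dimension row. The sketch below is a minimal illustration with hypothetical table and column names, using sqlite3 as a stand-in for the warehouse.

```python
# A minimal sketch of a post-ETL data quality test on a dimensional model:
# every foreign key in the fact table must resolve to a dimension row.
# Table and column names (fact_sales, dim_customer, customer_id) are
# hypothetical; sqlite3 stands in for the warehouse connection.
import sqlite3

ORPHAN_FK_SQL = """
SELECT COUNT(*)
FROM fact_sales f
LEFT JOIN dim_customer d ON f.customer_id = d.customer_id
WHERE d.customer_id IS NULL
"""

def test_fact_has_no_orphan_customers(conn: sqlite3.Connection) -> None:
    orphans = conn.execute(ORPHAN_FK_SQL).fetchone()[0]
    assert orphans == 0, f"{orphans} fact rows reference missing customers"
```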
A well-designed data architecture should support business intelligence and analysis, automation, and AI, all of which can help organizations to quickly seize market opportunities, build customer value, drive major efficiencies, and respond to risks such as supply chain disruptions.
Enterprise Data Management Methodology: DG is foundational to enterprise data management. Without the other essential components (e.g., metadata management, enterprise data architecture, data quality management), DG will be a struggle.
Migrating to Amazon Redshift offers organizations the potential for improved price-performance, enhanced data processing, faster query response times, and better integration with technologies such as machine learning (ML) and artificial intelligence (AI).
A few years ago, Gartner found that "organizations estimate the average cost of poor data quality at $12.8 million per year." Beyond lost revenue, data quality issues can also result in wasted resources and a damaged reputation. Learn more about data architectures in my article here.
While traditional extract, transform, and load (ETL) processes have long been a staple of data integration due to their flexibility, for common use cases such as replication and ingestion they often prove time-consuming, complex, and less adaptable to the fast-changing demands of modern data architectures.
Modern data platforms can stop enterprises from drowning in a sea of data by integrating AI and ML to enable more efficient, accessible data. It also helps to overcome the challenges of shadow data, which enterprise security policies do not recognize or cover.
The consumption of the data should be supported through an elastic delivery layer that aligns with demand, but also provides the flexibility to present the data in a physical format that aligns with the analytic application, ranging from the more traditional data warehouse view to a graph view in support of relationship analysis.
Here are six benefits of automating end-to-end data lineage: Reduced Errors and Operational Costs. Data quality is crucial to every organization. Automated data capture can significantly reduce errors when compared to manual entry.
Modernizing a utility's data architecture. "These capabilities allow us to reduce business risk as we move off of our monolithic, on-premises environments and provide cloud resiliency and scale," the CIO says, noting National Grid also has a major data center consolidation under way as it moves more data to the cloud.
With data becoming the driving force behind many industries today, having a modern data architecture is pivotal for organizations to be successful. Prior to the creation of the data lake, Orca's data was distributed among various data silos, each owned by a different team with its own data pipelines and technology stack.
The ability to leverage data to understand and plan for those behaviors is extremely important. How did you improve the organization's data literacy? Once we set up a data architecture that provides data liquidity, where data can go everywhere, we had to teach people how to use it.
Uncomfortable truth incoming: Most people in your organization don’t think about the quality of their data from intake to production of insights. However, as a data team member, you know how important data integrity (and a whole host of other aspects of data management) is.
As data continues to proliferate, so does the need for data and analytics initiatives to make sense of it all. Quicker Project Delivery: Accelerate Big Data deployments, Data Vaults, data warehouse modernization, cloud migration, etc., by up to 70 percent.