We suspected that data quality was a topic brimming with interest. The responses show an abundance of concerns around data quality and some uncertainty about how best to address those concerns. Key survey results: The C-suite is engaged with data quality. Data quality might get worse before it gets better.
1) What Is Data Quality Management? 4) Data Quality Best Practices. 5) How Do You Measure Data Quality? 6) Data Quality Metrics Examples. 7) Data Quality Control: Use Case. 8) The Consequences Of Bad Data Quality. 9) 3 Sources Of Low-Quality Data.
As technology and business leaders, your strategic initiatives, from AI-powered decision-making to predictive insights and personalized experiences, are all fueled by data. Yet, despite growing investments in advanced analytics and AI, organizations continue to grapple with a persistent and often underestimated challenge: poor data quality.
“If you're not keeping up with the fundamentals of data and data management, your ability to adopt AI, at whatever stage you are in your AI journey, will be impacted,” Kulkarni points out. This in turn stimulates a more agile and adaptable approach to AI, which can accelerate its uptake and the returns that the organisation can expect.
Domain ownership recognizes that the teams generating the data have the deepest understanding of it and are therefore best suited to manage, govern, and share it effectively. This principle ensures data accountability remains close to the source, fostering higher data quality and relevance.
Not Every Graph is a Knowledge Graph: Schemas and Semantic Metadata Matter. To automate these operations and maintain sufficient data quality, enterprises have started implementing so-called data fabrics, which employ diverse metadata sourced from different systems, provenance being one such example.
Data quality is crucial in data pipelines because it directly impacts the validity of the business insights derived from the data. Today, many organizations use AWS Glue Data Quality to define and enforce data quality rules on their data at rest and in transit.
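To make this concrete, here is a minimal sketch of registering a DQDL ruleset against a Glue Data Catalog table with boto3; the database name, table name, and the specific rules are illustrative assumptions, not details from the excerpt.

```python
import boto3

# Hedged sketch: register a DQDL ruleset for a catalog table.
# "sales_db", "orders", and the rules below are hypothetical placeholders.
glue = boto3.client("glue", region_name="us-east-1")

ruleset = """
Rules = [
    IsComplete "order_id",
    ColumnValues "status" in ["NEW", "SHIPPED", "CANCELLED"],
    RowCount > 0
]
"""

glue.create_data_quality_ruleset(
    Name="orders-basic-checks",
    Description="Baseline completeness and validity checks",
    Ruleset=ruleset,
    TargetTable={"DatabaseName": "sales_db", "TableName": "orders"},
)
```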
Today, we are pleased to announce that Amazon DataZone is now able to present data quality information for data assets. Other organizations monitor the quality of their data through third-party solutions. Additionally, Amazon DataZone now offers APIs for importing data quality scores from external systems.
Plug-and-play integration: A seamless, plug-and-play integration between data producers and consumers should facilitate rapid use of new data sets and enable quick proofs of concept, such as in the data science teams. As part of the required data, CHE data is shared using Amazon DataZone.
Data teams struggle to find a unified approach that enables effortless discovery, understanding, and assurance of data quality and security across various sources. Collaboration is seamless, with straightforward publishing and subscribing workflows, fostering a more connected and efficient work environment.
These formats, exemplified by Apache Iceberg, Apache Hudi, and Delta Lake, address persistent challenges in traditional data lake structures by offering an advanced combination of flexibility, performance, and governance capabilities. They are useful for flexible data lifecycle management.
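As a sketch of what that lifecycle flexibility looks like in practice, the PySpark snippet below creates an Iceberg table and reads an earlier snapshot (time travel); the catalog configuration, table name, and warehouse path are assumptions, and the Iceberg runtime jar must be on the Spark classpath.

```python
from pyspark.sql import SparkSession

# Hedged sketch: an Iceberg catalog named "demo" backed by a local warehouse.
spark = (
    SparkSession.builder
    .appName("iceberg-lifecycle-demo")
    .config("spark.sql.catalog.demo", "org.apache.iceberg.spark.SparkCatalog")
    .config("spark.sql.catalog.demo.type", "hadoop")
    .config("spark.sql.catalog.demo.warehouse", "/tmp/iceberg-warehouse")
    .getOrCreate()
)

# Create a table and commit two separate batches (two snapshots).
spark.sql("CREATE TABLE IF NOT EXISTS demo.db.events (id BIGINT, status STRING) USING iceberg")
spark.sql("INSERT INTO demo.db.events VALUES (1, 'NEW')")
spark.sql("INSERT INTO demo.db.events VALUES (2, 'SHIPPED')")

# Time travel: read the table as of the first snapshot.
first = spark.sql(
    "SELECT snapshot_id FROM demo.db.events.snapshots ORDER BY committed_at"
).collect()[0]["snapshot_id"]
spark.read.option("snapshot-id", first).table("demo.db.events").show()
```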
Unraveling Data Complexities with Metadata Management. Marrying the epidemiological data to the population data requires a tremendous amount of data intelligence about the source of the data, the currency of the data, the quality of the data, and the data lineage to support impact analysis.
As I recently noted, the term “data intelligence” has been used by multiple providers across analytics and data for several years and is becoming more widespread as software providers respond to the need to provide enterprises with a holistic view of data production and consumption.
Figure 2: Example data pipeline with DataOps automation. In this project, I automated data extraction from SFTP, public websites, and email attachments. The automated orchestration published the data to an Amazon S3 data lake. Monitoring Job Metadata. Adding Tests to Reduce Stress.
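A minimal sketch of that SFTP-to-S3 extraction step might look like the following; the host, credentials, paths, and bucket are hypothetical placeholders, not details from the project.

```python
import boto3
import paramiko

# Hedged sketch: pull one file from an SFTP server and land it in S3.
SFTP_HOST = "sftp.example.com"
SFTP_USER = "etl_user"
KEY_PATH = "/home/etl/.ssh/id_rsa"
REMOTE_FILE = "/outbound/daily_extract.csv"
LOCAL_FILE = "/tmp/daily_extract.csv"
BUCKET = "example-data-lake-raw"

# Download the extract from the SFTP server.
transport = paramiko.Transport((SFTP_HOST, 22))
transport.connect(
    username=SFTP_USER,
    pkey=paramiko.RSAKey.from_private_key_file(KEY_PATH),
)
sftp = paramiko.SFTPClient.from_transport(transport)
sftp.get(REMOTE_FILE, LOCAL_FILE)
sftp.close()
transport.close()

# Publish it to the raw zone of the S3 data lake.
boto3.client("s3").upload_file(LOCAL_FILE, BUCKET, "raw/daily_extract.csv")
```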
A dictionary entry includes a word's spelling, pronunciation, and examples of usage, which is a good example of one of the many uses of metadata: to provide a definition, description, and context for data. In practice, I haven't encountered a metadata dictionary that could deliver on that promise.
Metadata enrichment is about scaling the onboarding of new data into a governed data landscape by taking data and applying the appropriate business terms, data classes and quality assessments so it can be discovered, governed and utilized effectively. Public API. See the official public APIs.
Solution overview: OneData defines three personas. Publisher – This role includes the organizational and management team of systems that serve as data sources. Responsibilities include loading raw data from the data source system at the appropriate frequency. It is crucial in data governance and data management.
Data intelligence software is continuously evolving to enable organizations to efficiently and effectively advance new data initiatives. With a variety of providers and offerings addressing data intelligence and governance needs, it can be easy to feel overwhelmed in selecting the right solution for your enterprise.
Aptly named, metadata management is the process by which BI and analytics teams manage metadata, which is the data that describes other data. In other words, data is the content and metadata is the context. Without metadata, BI teams are unable to understand the data’s full story.
It also helps enterprises put these strategic capabilities into action by understanding their business, technology and data architectures and their interrelationships, aligning them with their goals, and defining the people, processes and technologies required to achieve compliance. How erwin Can Help.
You might establish a baseline by replicating collaborative filtering models published by teams that built recommenders for MovieLens, Netflix, and Amazon. It may even be faster to launch this new recommender system, because the Disney data team has access to published research describing what worked for other teams.
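To illustrate what such a baseline might look like, here is a toy collaborative-filtering sketch: plain matrix factorization trained with SGD on a tiny, made-up ratings matrix (real baselines would use MovieLens-scale data and a tuned library).

```python
import numpy as np

# Hedged sketch: a 4x4 user-item ratings matrix, 0 meaning "unrated".
ratings = np.array([
    [5, 3, 0, 1],
    [4, 0, 0, 1],
    [1, 1, 0, 5],
    [0, 0, 5, 4],
], dtype=float)

n_users, n_items = ratings.shape
k = 2                                          # latent factors
rng = np.random.default_rng(0)
P = rng.normal(scale=0.1, size=(n_users, k))   # user factors
Q = rng.normal(scale=0.1, size=(n_items, k))   # item factors
lr, reg = 0.01, 0.02

# SGD over the observed ratings only.
for _ in range(2000):
    for u, i in zip(*ratings.nonzero()):
        err = ratings[u, i] - P[u] @ Q[i]
        P[u] += lr * (err * Q[i] - reg * P[u])
        Q[i] += lr * (err * P[u] - reg * Q[i])

# Predict user 0's score for the item they have not rated (item 2).
print(round(float(P[0] @ Q[2]), 2))
```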
It’s the preferred choice when customers need more control and customization over the data integration process or require complex transformations. This flexibility makes Glue ETL suitable for scenarios where data must be transformed or enriched before analysis. The status and statistics of the CDC load are published into CloudWatch.
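The excerpt does not show how those statistics are published, but a hedged sketch using boto3 custom metrics could look like this (the namespace, metric names, and values are invented for illustration):

```python
import boto3

# Hedged sketch: emit CDC load statistics as custom CloudWatch metrics.
cloudwatch = boto3.client("cloudwatch", region_name="us-east-1")

cloudwatch.put_metric_data(
    Namespace="CustomETL/CDC",  # hypothetical namespace
    MetricData=[
        {"MetricName": "RowsInserted", "Value": 1250, "Unit": "Count"},
        {"MetricName": "RowsUpdated", "Value": 310, "Unit": "Count"},
        {"MetricName": "LoadDurationSeconds", "Value": 42.5, "Unit": "Seconds"},
    ],
)
```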
Instead of a central data platform team with a data warehouse or data lake serving as the clearinghouse of all data across the company, a data mesh architecture encourages distributed ownership of data by data producers who publish and curate their data as products, which can then be discovered, requested, and used by data consumers.
BCBS 239 is a document published by the Basel Committee on Banking Supervision entitled Principles for Effective Risk Data Aggregation and Risk Reporting. The document, first published in 2013, outlines best practices for global and domestic banks to identify, manage, and report risks, including credit, market, liquidity, and operational risks.
This also includes building an industry-standard integrated data repository as a single source of truth, operational reporting through real-time metrics, data quality monitoring, a 24/7 helpdesk, and revenue forecasting through financial projections and supply availability projections.
This post explains how you can extend the governance capabilities of Amazon DataZone to data assets hosted in relational databases based on MySQL, PostgreSQL, Oracle or SQL Server engines. Second, the data producer needs to consolidate the data asset’s metadata in the business catalog and enrich it with business metadata.
Automated data enrichment: To create the knowledge catalog, you need automated data stewardship services. These services include the ability to auto-discover and classify data, to detect sensitive information, to analyze data quality, to link business terms to technical metadata and to publish data to the knowledge catalog.
The medical insurance company wasn’t hacked, but its customers’ data was compromised through a third-party vendor’s employee. In the 2020 O’Reilly Data Quality survey, only 20% of respondents say their organizations publish information about data provenance or data lineage internally. From Bad to Worse.
Data has become an invaluable asset for businesses, offering critical insights to drive strategic decision-making and operational optimization. The business end-users were given a tool to discover data assets produced within the mesh and seamlessly self-serve on their data sharing needs.
What, then, should users look for in a data modeling product to support their governance/intelligence requirements in the data-driven enterprise? Nine Steps to Data Modeling. Provide metadata and schema visualization regardless of where data is stored.
The second one is the Linked Open Data (LOD): a cloud of interlinked structured datasets published without centralized control across thousands of servers. There are more than 80 million pages with semantic, machine-interpretable metadata, according to the Schema.org standard. Take this restaurant, for example.
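As a small sketch of that kind of semantic, machine-interpretable metadata, the following builds a Schema.org description of a hypothetical restaurant with rdflib and serializes it to Turtle; the URI and property values are made up.

```python
from rdflib import Graph, Literal, Namespace, URIRef
from rdflib.namespace import RDF

# Hedged sketch: describe a hypothetical restaurant in Schema.org terms.
SCHEMA = Namespace("https://schema.org/")
g = Graph()

restaurant = URIRef("https://example.com/restaurants/42")
g.add((restaurant, RDF.type, SCHEMA.Restaurant))
g.add((restaurant, SCHEMA["name"], Literal("The Example Bistro")))
g.add((restaurant, SCHEMA.servesCuisine, Literal("Italian")))
g.add((restaurant, SCHEMA.telephone, Literal("+1-555-0100")))

# Turtle output is the machine-interpretable metadata a crawler could consume.
print(g.serialize(format="turtle"))
```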
Data virtualization is ideal in any situation where the following is necessary: information coming from diverse data sources; real-time information; multi-channel publishing of data services; and agile requirements and fast deployment times. How does data virtualization manage data quality requirements?
DataOps is an approach to best practices for data management that increases the quantity of data analytics products a data team can develop and deploy in a given time while drastically improving data quality. Did you just have a spectacular new idea for a data analytics product?
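One concrete DataOps practice behind that quality improvement is automated data testing between pipeline steps; here is a minimal sketch with pandas, where the file path, columns, and allowed values are hypothetical.

```python
import pandas as pd

def test_orders_extract(path: str = "/tmp/daily_extract.csv") -> None:
    """Hedged sketch of an inter-step data test; the schema is hypothetical."""
    df = pd.read_csv(path)

    # Row-count sanity check: an empty extract usually means a broken feed.
    assert len(df) > 0, "extract is empty"

    # Completeness: key fields must never be null.
    assert df["order_id"].notna().all(), "null order_id values found"

    # Validity: status must come from the known vocabulary.
    allowed = {"NEW", "SHIPPED", "CANCELLED"}
    assert set(df["status"].unique()) <= allowed, "unexpected status values"

if __name__ == "__main__":
    test_orders_extract()
    print("all data tests passed")
```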
Easily and securely prepare, share, and query data – This session shows how you can use Lake Formation and the AWS Glue Data Catalog to share data without copying, transform and prepare data without coding, and query data. This enhancement simplifies many use cases by avoiding metadata duplication.
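A hedged sketch of the "share without copying" step is a Lake Formation permission grant on a catalog table; the account ID, role, database, and table below are placeholders.

```python
import boto3

# Hedged sketch: grant SELECT on a Data Catalog table to an analyst role,
# so the data can be queried in place rather than copied.
lakeformation = boto3.client("lakeformation", region_name="us-east-1")

lakeformation.grant_permissions(
    Principal={
        "DataLakePrincipalIdentifier": "arn:aws:iam::111122223333:role/AnalystRole"
    },
    Resource={
        "Table": {"DatabaseName": "sales_db", "Name": "orders"}
    },
    Permissions=["SELECT"],
)
```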
Gartner defines a data fabric as “a design concept that serves as an integrated layer of data and connecting processes.” The data fabric architectural approach can simplify data access in an organization and facilitate self-service data consumption at scale.
As data analysts or data scientists, we would all love to be able to do all these things, and much more. This is the promise of the modern data lakehouse architecture. Schema evolution: With fast-moving data and real-time data ingestion, we need new ways to keep up with data quality, consistency, accuracy, and overall integrity.
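To show what schema evolution can look like in a lakehouse table format, here is a sketch using Delta Lake's mergeSchema option; the path, columns, and Spark configuration are assumptions, and the delta-spark package must be installed.

```python
from pyspark.sql import SparkSession

# Hedged sketch: a local Spark session with the Delta Lake extensions enabled.
spark = (
    SparkSession.builder
    .appName("schema-evolution-demo")
    .config("spark.sql.extensions", "io.delta.sql.DeltaSparkSessionExtension")
    .config("spark.sql.catalog.spark_catalog",
            "org.apache.spark.sql.delta.catalog.DeltaCatalog")
    .getOrCreate()
)

path = "/tmp/delta/events"
spark.createDataFrame([(1, "NEW")], ["id", "status"]).write.format("delta").save(path)

# A later batch arrives with an extra "channel" column; mergeSchema evolves
# the table schema instead of failing the write.
new_batch = spark.createDataFrame([(2, "SHIPPED", "web")], ["id", "status", "channel"])
(new_batch.write.format("delta")
    .mode("append")
    .option("mergeSchema", "true")
    .save(path))

spark.read.format("delta").load(path).printSchema()
```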
In this blog, we’ll delve into the critical role of governance and data modeling tools in supporting a seamless data mesh implementation and explore how erwin tools can be used in that role. erwin also provides data governance, metadata management and data lineage software called erwin Data Intelligence by Quest.
Business units can simply share data and collaborate by publishing and subscribing to the data assets. The Central IT team (Spoke N) subscribes to the data from individual business units and consumes it using Redshift Spectrum. Similarly, individual business units produce their own domain-specific data.
Limiting growth by (data integration) complexity: Most operational IT systems in an enterprise have been developed to serve a single business function, and they use the simplest possible model for this. In both cases, semantic metadata is the glue that turns knowledge graphs into hubs of data, metadata, and content.
It also adds flexibility in accommodating new kinds of data, including metadata about existing data points that lets users infer new relationships and other facts about the data in the graph. Schemas are an example of how the right metadata can add value to the data it describes.
Sources: Data can be loaded from multiple sources, such as systems of record, data generated from applications, operational data stores, enterprise-wide reference data and metadata, data from vendors and partners, machine-generated data, social sources, and web sources.
It has been over a decade since the Federal Reserve Board (FRB) and the Office of the Comptroller of the Currency (OCC) published their seminal guidance focused on Model Risk Management (SR 11-7 and OCC Bulletin 2011-12, respectively). To reference SR 11-7:
Given the importance of data in the world today, organizations face the dual challenges of managing large-scale, continuously incoming data while vetting its quality and reliability. AWS Glue is a serverless data integration service that you can use to effectively monitor and manage data quality through AWS Glue Data Quality.
Analysts didn’t just want to catalog data sources; they wanted to include dashboards, reports, and visualizations. Why start with a data source and build a visualization, if you can just find a visualization that already exists, complete with metadata about it? Data engineers want to catalog data pipelines.