1) What Is Data Quality Management? 4) Data Quality Best Practices. 5) How Do You Measure Data Quality? 6) Data Quality Metrics Examples. 7) Data Quality Control: Use Case. 8) The Consequences Of Bad Data Quality. 9) 3 Sources Of Low-Quality Data.
For technology and business leaders, strategic initiatives, from AI-powered decision-making to predictive insights and personalized experiences, are all fueled by data. Yet, despite growing investments in advanced analytics and AI, organizations continue to grapple with a persistent and often underestimated challenge: poor data quality.
Alerts and notifications play a crucial role in maintaining data quality because they facilitate prompt and efficient responses to any data quality issues that may arise within a dataset. This proactive approach helps mitigate the risk of making decisions based on inaccurate information.
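To make that concrete, here is a minimal sketch of threshold-based alerting over a pandas DataFrame; the `send_alert` hook and the column names are hypothetical stand-ins for whatever notification channel and dataset you actually use:

```python
import pandas as pd

def send_alert(message: str) -> None:
    # Stand-in for a real notification channel (email, Slack, PagerDuty, ...).
    print(f"[DATA QUALITY ALERT] {message}")

def check_null_rate(df: pd.DataFrame, column: str, max_null_rate: float = 0.05) -> bool:
    """Alert when the share of missing values in `column` exceeds a threshold."""
    null_rate = df[column].isna().mean()
    if null_rate > max_null_rate:
        send_alert(f"{column}: {null_rate:.1%} nulls exceeds the {max_null_rate:.0%} limit")
        return False
    return True

orders = pd.DataFrame({"customer_id": [1, 2, None, 4, None]})
check_null_rate(orders, "customer_id")  # fires: 40.0% nulls exceeds the 5% limit
```

Running such checks right after each load is what turns a silent data defect into a prompt, actionable notification.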
Data collections are the ones and zeroes that encode the actionable insights (patterns, trends, relationships) that we seek to extract from our data through machine learning and data science. The insights are used to produce informative content for stakeholders (decision-makers, business users, and clients).
Domain ownership recognizes that the teams generating the data have the deepest understanding of it and are therefore best suited to manage, govern, and share it effectively. This principle ensures that data accountability remains close to the source, fostering higher data quality and relevance.
For container terminal operators, data-driven decision-making and efficient data sharing are vital to optimizing operations and boosting supply chain efficiency. In addition to real-time analytics and visualization, the data needs to be shared for long-term data analytics and machine learning applications.
Extrinsic Control Deficit: Many of these changes stem from tools and processes beyond the immediate control of the data team. Unregulated ETL/ELT Processes: The absence of stringent data quality tests in ETL (Extract, Transform, Load) or ELT (Extract, Load, Transform) processes further exacerbates the problem.
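As an illustration of what a stringent in-pipeline test can look like, here is a minimal sketch of a transform step that fails fast instead of loading bad rows; the table and column names are invented for the example:

```python
import pandas as pd

def transform(raw: pd.DataFrame) -> pd.DataFrame:
    out = raw.dropna(subset=["order_id"]).copy()
    out["amount"] = out["amount"].astype(float)

    # Quality gates: halt the load rather than propagate bad data downstream.
    if out.empty:
        raise ValueError("transform produced zero rows")
    if not out["order_id"].is_unique:
        raise ValueError("duplicate order_id values after transform")
    if (out["amount"] < 0).any():
        raise ValueError("negative amounts detected")
    return out

raw = pd.DataFrame({"order_id": [1, 2, 2], "amount": ["10.5", "3.0", "3.0"]})
# transform(raw)  # would raise: duplicate order_id values after transform
```

Raising exceptions (rather than logging and continuing) is the point: an unregulated ELT job keeps loading, a regulated one stops.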
Managing tests of complex data transformations when automated data testing tools lack important features? Introduction: Data transformations are at the core of modern business intelligence, blending and converting disparate datasets into coherent, reliable outputs.
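When a testing tool lacks the feature you need, a plain unit test over a small, hand-built fixture often fills the gap. A minimal sketch; the pivot logic is a stand-in transformation, not the article's actual pipeline:

```python
import pandas as pd

def pivot_daily_sales(df: pd.DataFrame) -> pd.DataFrame:
    """Stand-in transformation: one row per day, one column per region."""
    wide = df.pivot_table(index="day", columns="region", values="sales",
                          aggfunc="sum", fill_value=0)
    return wide.reset_index()

def test_pivot_daily_sales():
    fixture = pd.DataFrame({
        "day": ["2024-01-01", "2024-01-01", "2024-01-02"],
        "region": ["EU", "US", "EU"],
        "sales": [10, 20, 5],
    })
    result = pivot_daily_sales(fixture)
    # Assert on known expected outputs instead of eyeballing production data.
    assert list(result.columns) == ["day", "EU", "US"]
    assert result.loc[result["day"] == "2024-01-02", "US"].item() == 0

test_pivot_daily_sales()  # runs standalone or under pytest
```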
Alation and Bigeye have partnered to bring data observability and data quality monitoring into the data catalog. Read to learn how our newly combined capabilities put more trustworthy, quality data into the hands of those who are best equipped to leverage it. trillion each year due to poor data quality.
Selecting the strategies and tools for validating data transformations and data conversions in your data pipelines. Introduction: Data transformations and data conversions are crucial to ensure that raw data is organized, processed, and ready for useful analysis.
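One widely used validation strategy is source-to-target reconciliation: comparing row counts and simple aggregates before and after a transformation. A minimal sketch, with invented frames standing in for real pipeline stages:

```python
import pandas as pd

def reconcile(source: pd.DataFrame, target: pd.DataFrame, amount_col: str = "amount") -> list:
    """Return a list of discrepancies; an empty list means the stages agree."""
    issues = []
    if len(source) != len(target):
        issues.append(f"row count mismatch: {len(source)} source vs {len(target)} target")
    # Totals should survive a lossless transformation (with float tolerance).
    src_total, tgt_total = source[amount_col].sum(), target[amount_col].sum()
    if abs(src_total - tgt_total) > 1e-6:
        issues.append(f"{amount_col} total drifted: {src_total} -> {tgt_total}")
    return issues

source = pd.DataFrame({"amount": [10.0, 2.5, 7.5]})
target = pd.DataFrame({"amount": [10.0, 2.5]})  # a row was silently dropped
print(reconcile(source, target))
```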
AI is transforming how senior data engineers and data scientists validate data transformations and conversions. Artificial intelligence-based verification approaches aid in the detection of anomalies, the enforcement of data integrity, and the optimization of pipelines for improved efficiency.
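As one concrete flavor of such verification, here is a minimal sketch that flags anomalous records with scikit-learn's IsolationForest; the synthetic amounts and the contamination rate are illustrative assumptions:

```python
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(seed=42)
# Mostly well-behaved transaction amounts, plus a few corrupted outliers.
amounts = np.concatenate([rng.normal(50, 10, 500), [5000.0, -900.0, 12000.0]])

model = IsolationForest(contamination=0.01, random_state=42)
labels = model.fit_predict(amounts.reshape(-1, 1))  # -1 = anomaly, 1 = normal

print("flagged for review:", amounts[labels == -1])
```

In a pipeline, rows labeled -1 would be quarantined for review rather than loaded, which is the anomaly-detection half of the verification story.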
Business terms and data policies should be implemented through standardized and documented business rules. Compliance with these business rules can be tracked through data lineage, incorporating auditability and validation controls across data transformations and pipelines to generate alerts when there are non-compliant data instances.
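A minimal sketch of one way to encode business rules as named, auditable predicates that every transformation stage can run; the rule names and fields are hypothetical:

```python
import pandas as pd

# Each rule is a named predicate; keeping them in one registry aids auditability.
RULES = {
    "order_date_not_in_future": lambda df: df["order_date"] <= pd.Timestamp.today(),
    "country_code_is_iso2": lambda df: df["country"].str.fullmatch(r"[A-Z]{2}"),
}

def audit(df: pd.DataFrame, stage: str) -> None:
    """Run every rule and alert on non-compliant rows, tagged with the pipeline stage."""
    for name, predicate in RULES.items():
        ok = predicate(df).fillna(False).astype(bool)
        if not ok.all():
            print(f"[{stage}] rule '{name}' violated by {(~ok).sum()} row(s)")

orders = pd.DataFrame({
    "order_date": pd.to_datetime(["2024-01-05", "2999-01-01"]),
    "country": ["DE", "Germany"],
})
audit(orders, stage="post-load")  # both rules report one violation each
```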
There are countless examples of big data transforming many different industries. There is no disputing the fact that the collection and analysis of massive amounts of unstructured data has been a huge breakthrough. Data Virtualization allows accessing them from a single point, replicating them only when strictly necessary.
In this post, we delve into a case study for a retail use case, exploring how the Data Build Tool (dbt) was used effectively within an AWS environment to build a high-performing, efficient, and modern data platform. It does this by helping teams handle the T in ETL (extract, transform, and load) processes.
In addition to using native managed AWS services that BMS didn’t need to worry about upgrading, BMS was looking to offer an ETL service to non-technical business users that could visually compose data transformation workflows and seamlessly run them on the AWS Glue Apache Spark-based serverless data integration engine.
Similar to disaster recovery, business continuity, and information security, data strategy needs to be well thought out and defined to inform the rest, while providing a foundation from which to build a strong business. Overlooking these data resources is a big mistake.
For years, IT and business leaders have been talking about breaking down the data silos that exist within their organizations. Given the importance of sharing information among diverse disciplines in the era of digital transformation, this concept is arguably as important as ever. There’s also the issue of bias.
However, as a data team member, you know how important data integrity (and a whole host of other aspects of data management) is. In this article, we’ll dig into the core aspects of data integrity, what processes ensure it, and how to deal with data that doesn’t meet your standards.
But to augment its various businesses with ML and AI, Iyengar’s team first had to break down data silos within the organization and transform the company’s data operations. “Digitizing was our first stake at the table in our data journey,” he says. The offensive side?
However, you might face significant challenges when planning for a large-scale data warehouse migration. Discovery of workload and integrations: Conducting discovery and assessment for migrating a large on-premises data warehouse to Amazon Redshift is a critical step in the migration process.
But when IT-driven data management and business-oriented data governance work together in terms of personnel, processes, and technology, decisions can be made and their impacts determined based on a full inventory of reliable information. Virginia residents also would be able to opt out of data collection.
Replace manual and recurring tasks for fast, reliable data lineage and overall data governance. It’s paramount that organizations understand the benefits of automating end-to-end data lineage. Critically, it makes it easier to get a clear view of how information is created and flows into, across and outside an enterprise.
The techniques for managing organisational data in a standardised approach that minimises inefficiency. Extract, Transform, Load (ETL): the extraction of raw data, transformation into a format suitable for business needs, and loading into a data warehouse. Data transformation. Microsoft Azure.
It’s common to ingest multiple data sources into Amazon Redshift to perform analytics. Often, each data source will have its own processes of creating and maintaining data, which can lead to data quality challenges within and across sources. Answering questions as simple as “How many unique customers do we have?”
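That question is a small entity-resolution problem. A minimal sketch of the naive first step: normalize an identifying field across sources, then count distinct values (real matching logic is usually much richer):

```python
import pandas as pd

# Two sources that maintain customer data in their own way (hypothetical).
crm = pd.DataFrame({"email": ["Ana@Example.com", "bo@example.com"]})
web = pd.DataFrame({"email": ["ana@example.com ", "cy@example.com"]})

combined = pd.concat([crm, web], ignore_index=True)
# Normalize before comparing: trim whitespace, lowercase.
combined["email_norm"] = combined["email"].str.strip().str.lower()

print("raw rows:", len(combined))                             # 4
print("unique customers:", combined["email_norm"].nunique())  # 3
```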
The entire generative AI pipeline hinges on the data pipelines that empower it, making it imperative to take the correct precautions. 4 key components to ensure reliable data ingestion. Data quality and governance: Data quality means ensuring the security of data sources, maintaining holistic data and providing clear metadata.
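For the clear-metadata component, a minimal sketch of an ingestion gate that rejects datasets whose descriptors are incomplete; the required keys are illustrative, not a standard:

```python
REQUIRED_METADATA = {"owner", "source_system", "refresh_frequency", "pii_classification"}

def validate_metadata(descriptor: dict) -> None:
    """Refuse to ingest a dataset when governance metadata is missing."""
    missing = REQUIRED_METADATA - descriptor.keys()
    if missing:
        raise ValueError(f"refusing to ingest: missing metadata {sorted(missing)}")

validate_metadata({
    "owner": "payments-team",
    "source_system": "stripe",
    "refresh_frequency": "hourly",
    "pii_classification": "restricted",
})  # passes; remove any key and the gate raises
```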
Before we dive in, let’s define strands of AI, Machine Learning and Data Science: Business intelligence (BI) leverages software and services to transform data into actionable insights that inform an organization’s strategic and tactical business decisions.
Prior to the creation of the data lake, Orca’s data was distributed among various data silos, each owned by a different team with its own data pipelines and technology stack. Moreover, running advanced analytics and ML on disparate data sources proved challenging.
By streamlining data-related workflows and enabling real-time collaboration, DataOps can help organizations to quickly turn data into insights, and to put those insights into action. DataOps observability is a critical aspect of modern data analytics and machine learning.
OntoRefine is a data transformation tool that lets you unite plenty of data formats and get them into your triplestore. Now that the data is in the database, we can start benefiting from the RDF technology’s strengths. One of the core upsides of storing your data in that format is inference.
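To illustrate the inference point independently of OntoRefine, here is a minimal sketch using the rdflib and owlrl Python libraries (assumes `pip install rdflib owlrl`) to materialize an RDFS entailment:

```python
from rdflib import Graph, Namespace, RDF, RDFS
from owlrl import DeductiveClosure, RDFS_Semantics

EX = Namespace("http://example.org/")
g = Graph()
g.add((EX.Dog, RDFS.subClassOf, EX.Animal))  # schema triple
g.add((EX.rex, RDF.type, EX.Dog))            # instance triple

# Materialize RDFS entailments: rex is a Dog and Dogs are Animals,
# so the reasoner adds the triple "rex is an Animal".
DeductiveClosure(RDFS_Semantics).expand(g)

print((EX.rex, RDF.type, EX.Animal) in g)  # True, without ever asserting it
```

Triplestores typically perform this kind of reasoning at load or query time; the sketch just makes the mechanism visible.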
This is especially beneficial when teams need to increase data product velocity with trust and data quality, reduce communication costs, and help data solutions align with business objectives. What does data mesh do that other approaches can’t?
Making this data visible in the data catalog will let data teams share their work, support re-use, and empower everyone to better understand and trust data. Data Transformation in the Modern Data Stack. Data engineering plays a critical role in distributing data to a wide audience.
On the other hand, centralized data management emphasizes a more structured and governed approach. Data is managed and controlled by a dedicated team of data professionals, ensuring data quality, security, and compliance. This approach offers greater control and reduces the risk of data inconsistencies.
In our last blog, we delved into the seven most prevalent data challenges that can be addressed with effective data governance. Today we will share our approach to developing a data governance program to drive data transformation and fuel a data-driven culture.
Background: A successful data-driven organization recognizes data as a key enabler of increased and sustained innovation. The goal of a data product is to solve the long-standing issue of data silos and data quality. Suppose a consumer is browsing the Customer data product in the data mesh marketplace.
Every data professional knows that ensuring data quality is vital to producing usable query results. Streaming data can be extra challenging in this regard, as it tends to be “dirty,” with new fields that are added without warning and frequent mistakes in the data collection process.
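A minimal sketch of one defensive pattern for that: pin an expected schema per stream and surface drift instead of silently dropping records or crashing (the field names are hypothetical):

```python
EXPECTED_FIELDS = {"event_id", "user_id", "ts"}

def process_event(event: dict) -> dict:
    """Accept one streaming record, flagging schema drift rather than failing."""
    unexpected = set(event) - EXPECTED_FIELDS
    missing = EXPECTED_FIELDS - set(event)
    if unexpected:
        print(f"schema drift: new fields {sorted(unexpected)} need review")
    if missing:
        print(f"dirty record: missing {sorted(missing)}")
    # Pass along only the contracted fields so downstream consumers stay stable.
    return {k: event.get(k) for k in EXPECTED_FIELDS}

process_event({"event_id": 1, "user_id": 7, "ts": "2024-06-01T12:00:00Z",
               "utm_campaign": "spring"})  # reports utm_campaign as drift
```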
AWS Glue provides both visual and code-based interfaces to make data integration effortless. Using a native AWS Glue connector increases agility, simplifies data movement, and improves data quality. For more information, see Setting up networking for development for AWS Glue. Choose Create connection.
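For readers who script this rather than click through the console, a minimal sketch of creating a Glue connection with boto3; every identifier below is a placeholder, and real credentials belong in AWS Secrets Manager, not inline:

```python
import boto3

glue = boto3.client("glue", region_name="us-east-1")

# All names, URLs, and IDs are illustrative placeholders.
glue.create_connection(
    ConnectionInput={
        "Name": "my-postgres-connection",
        "ConnectionType": "JDBC",
        "ConnectionProperties": {
            "JDBC_CONNECTION_URL": "jdbc:postgresql://db.example.internal:5432/analytics",
            "USERNAME": "glue_user",
            "PASSWORD": "replace-me",  # prefer a Secrets Manager reference
        },
        # Networking set up per the AWS Glue development networking guide.
        "PhysicalConnectionRequirements": {
            "SubnetId": "subnet-0123456789abcdef0",
            "SecurityGroupIdList": ["sg-0123456789abcdef0"],
        },
    }
)
```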
In this blog, we’ll delve into the critical role of governance and data modeling tools in supporting a seamless data mesh implementation and explore how erwin tools can be used in that role. erwin also provides data governance, metadata management and data lineage software called erwin Data Intelligence by Quest.
In today’s data-driven world, businesses are drowning in a sea of information. Traditional data integration methods struggle to bridge these gaps, hampered by high costs, data quality concerns, and inconsistencies. This allows your teams to make informed decisions based on real data, not just intuition.
The decision will come down to a database vs a data warehouse—but let’s start by explaining what each is and why they are used. All About That (Data)Base. A database is, by definition, ‘any collection of data organized for storage, accessibility, and retrieval.’ Let’s look at why: Data Quality and Consistency.
As your users become accustomed to augmented analytics, they will want the ability to quickly, and easily, gather data, integrated from disparate data sources. Users can then prepare that data – transforming, shaping, reducing, combining, exploring, cleaning, sampling, and aggregating data, to get the dataset users wish to analyze.
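Those preparation verbs map naturally onto a dataframe pipeline. A minimal sketch in pandas covering a few of them: combining, cleaning, sampling, and aggregating (all column names invented for illustration):

```python
import pandas as pd

sales = pd.DataFrame({"store": ["A", "A", "B", "B"], "revenue": [100, None, 80, 120]})
stores = pd.DataFrame({"store": ["A", "B"], "region": ["North", "South"]})

prepared = (
    sales.merge(stores, on="store")          # combine disparate sources
         .dropna(subset=["revenue"])         # clean missing values
         .sample(frac=1.0, random_state=0)   # sample reproducibly (here: all rows)
         .groupby("region", as_index=False)  # shape to the analysis grain
         .agg(total_revenue=("revenue", "sum"))
)
print(prepared)
```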
What Is Data Governance In The Public Sector? Effective data governance for the public sector enables entities to ensure dataquality, enhance security, protect privacy, and meet compliance requirements. With so much focus on compliance, democratizing data for self-service analytics can present a challenge.
Organizations have spent a lot of time and money trying to harmonize data across diverse platforms, including cleansing, uploading metadata, converting code, defining business glossaries, tracking data transformations and so on. So questions linger about whether transformed data can be trusted.
Just as a navigation app provides a detailed map of roads, guiding you from your starting point to your destination while highlighting every turn and intersection, data flow lineage offers a comprehensive view of data movement and transformations throughout its lifecycle.
Leaders are asking how they might use data to drive smarter decision making to support this new model and improve medical treatments that lead to better outcomes. Healthcare organizations need to manage and protect sensitive information in a consistent, secure, and organized way. Why Is Data Governance in Healthcare Important?