1) What Is Data Quality Management? 4) Data Quality Best Practices. 5) How Do You Measure Data Quality? 6) Data Quality Metrics Examples. 7) Data Quality Control: Use Case. 8) The Consequences Of Bad Data Quality. 9) 3 Sources Of Low-Quality Data.
As technology and business leaders, your strategic initiatives, from AI-powered decision-making to predictive insights and personalized experiences, are all fueled by data. Yet, despite growing investments in advanced analytics and AI, organizations continue to grapple with a persistent and often underestimated challenge: poor data quality.
At AWS, we are committed to empowering organizations with tools that streamline data analytics and transformation processes. This integration enables data teams to efficiently transform and manage data using Athena with dbt Cloud’s robust features, enhancing the overall data workflow experience.
Alerts and notifications play a crucial role in maintaining data quality because they facilitate prompt and efficient responses to any data quality issues that may arise within a dataset. This proactive approach helps mitigate the risk of making decisions based on inaccurate information.
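As a concrete illustration, here is a minimal sketch of such an alert in Python, assuming a pandas DataFrame and a hypothetical null-rate threshold; in practice the message would go to a channel like Slack or PagerDuty rather than stdout:

```python
import pandas as pd

NULL_RATE_THRESHOLD = 0.05  # assumed threshold: alert if more than 5% of a column is null

def check_null_rates(df: pd.DataFrame) -> list:
    """Return alert messages for columns whose null rate breaches the threshold."""
    alerts = []
    for column in df.columns:
        null_rate = df[column].isna().mean()
        if null_rate > NULL_RATE_THRESHOLD:
            alerts.append(f"{column}: {null_rate:.1%} nulls exceeds {NULL_RATE_THRESHOLD:.0%}")
    return alerts

orders = pd.DataFrame({"order_id": [1, 2, 3, 4], "amount": [9.99, None, None, 24.50]})
for message in check_null_rates(orders):
    print(f"ALERT: {message}")  # stand-in for a real notification hook
```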
Here are just 10 of the many key features of Datasphere that were covered during the launch day announcements: Datasphere works with the SAP Analytics Cloud and runs on the existing SAP BTP (Business Technology Platform), with all the essential features: security, access control, high availability. Datasphere is not just for data managers.
What is DataOps? DataOps (data operations) is an agile, process-oriented methodology for developing and delivering analytics. It brings together DevOps teams with data engineers and data scientists to provide the tools, processes, and organizational structures to support the data-focused enterprise.
Domain ownership recognizes that the teams generating the data have the deepest understanding of it and are therefore best suited to manage, govern, and share it effectively. This principle ensures data accountability remains close to the source, fostering higher data quality and relevance.
For container terminal operators, data-driven decision-making and efficient data sharing are vital to optimizing operations and boosting supply chain efficiency. Enhance agility by localizing changes within business domains and establishing clear data contracts. Eliminate centralized bottlenecks and complex data pipelines.
By implementing automated validation, AI-driven regression testing, real-time canary pipelines, synthetic data generation, freshness enforcement, KPI tracking, and CI/CD automation, organizations can shift from reactive data observability to proactive data quality assurance. Summary: Why this order?
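Freshness enforcement in particular is easy to sketch. The following is a minimal example, assuming the pipeline can read a last-load timestamp and that a six-hour SLA applies (both are illustrative):

```python
from datetime import datetime, timedelta, timezone

FRESHNESS_SLA = timedelta(hours=6)  # assumed SLA for this dataset

def is_fresh(last_updated: datetime) -> bool:
    """Check whether the most recent load falls within the freshness SLA."""
    return datetime.now(timezone.utc) - last_updated <= FRESHNESS_SLA

last_load = datetime.now(timezone.utc) - timedelta(hours=8)
if not is_fresh(last_load):
    raise RuntimeError(f"Freshness SLA breached: last load at {last_load.isoformat()}")
```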
However, Great Expectations (GX) sets itself apart as a robust, open-source framework that helps data teams maintain consistent and transparent data quality standards. Instead of relying on ad hoc scripts or manual checks, Great Expectations codifies data quality rules into structured Expectation Suites.
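A minimal sketch of one expectation in code, using the pandas convenience API from pre-1.0 releases of Great Expectations (entry points differ across GX versions, so treat this as illustrative):

```python
import great_expectations as ge
import pandas as pd

df = pd.DataFrame({"user_id": [1, 2, None], "country": ["US", "DE", "FR"]})
ge_df = ge.from_pandas(df)  # wrap the frame so expectations can run against it

# A named, reusable rule instead of an ad hoc assert
result = ge_df.expect_column_values_to_not_be_null("user_id")
print(result.success)  # False: one user_id is null
```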
Complex Data Transformations: Test Planning Best Practices. Ensuring data accuracy with structured testing and best practices. Data transformations and conversions are crucial for data pipelines, enabling organizations to process, integrate, and refine raw data into meaningful insights.
How dbt Core helps data teams test, validate, and monitor complex data transformations and conversions. dbt Core, an open-source framework for developing, testing, and documenting SQL-based data transformations, has become a must-have tool for modern data teams as the complexity of data pipelines grows.
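For example, a CI job can run one model’s tests through the programmatic dbtRunner added in dbt Core 1.5; the model name below is a placeholder:

```python
from dbt.cli.main import dbtRunner, dbtRunnerResult

runner = dbtRunner()

# Run only the tests attached to one model, as a CI gate might
result: dbtRunnerResult = runner.invoke(["test", "--select", "stg_orders"])
if not result.success:
    raise SystemExit("dbt tests failed; blocking the deployment")
```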
In early April 2021, DataKitchen sat down with Jonathan Hodges, VP Data Management & Analytics at Workiva; Chuck Smith, VP of R&D Data Strategy at GlaxoSmithKline (GSK); and Chris Bergh, CEO and Head Chef at DataKitchen, to find out about their enterprise DataOps transformation journey, including key successes and lessons learned.
AI is transforming how senior data engineers and data scientists validate data transformations and conversions. Artificial intelligence-based verification approaches aid in the detection of anomalies, the enforcement of data integrity, and the optimization of pipelines for improved efficiency.
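One common pattern is flagging outlier rows after a transformation with an unsupervised model. Here is a minimal sketch using scikit-learn’s IsolationForest, with synthetic data standing in for transformed output:

```python
import numpy as np
from sklearn.ensemble import IsolationForest

# Synthetic numeric features, e.g. order amount and item count
rng = np.random.default_rng(42)
normal = rng.normal(loc=[50.0, 3.0], scale=[10.0, 1.0], size=(500, 2))
corrupted = np.array([[5000.0, 1.0]])  # e.g. a unit-conversion bug
rows = np.vstack([normal, corrupted])

model = IsolationForest(contamination=0.01, random_state=0).fit(rows)
labels = model.predict(rows)  # -1 marks suspected anomalies
print(f"{(labels == -1).sum()} suspicious rows out of {len(rows)}")
```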
In this post, we’ll see the fundamental procedures, tools, and techniques that data engineers, data scientists, and QA/testing teams use to ensure high-quality data as soon as it’s deployed. First, we look at how unit and integration tests uncover transformation errors at an early stage.
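A minimal sketch of such a unit test, written with pytest against a hypothetical currency-normalization transform:

```python
import pytest

def to_cents(amount_str: str) -> int:
    """Hypothetical transform: parse a dollar string like '$12.34' into integer cents."""
    cleaned = amount_str.strip().lstrip("$")
    dollars, _, cents = cleaned.partition(".")
    return int(dollars) * 100 + int(cents.ljust(2, "0")[:2])

def test_whole_dollars():
    assert to_cents("$12") == 1200

def test_fractional_amounts():
    assert to_cents("$12.34") == 1234

def test_rejects_garbage():
    with pytest.raises(ValueError):
        to_cents("twelve dollars")
```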
As the volume and complexity of analytics workloads continue to grow, customers are looking for more efficient and cost-effective ways to ingest and analyse data. OpenSearch Service is used for multiple purposes, such as observability, search analytics, consolidation, cost savings, compliance, and integration.
Common challenges and practical mitigation strategies for reliable data transformations. Data transformations are important processes in data engineering, enabling organizations to structure, enrich, and integrate data for analytics, reporting, and operational decision-making.
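One widely used mitigation is validating the input schema before transforming, so upstream drift fails fast. A minimal sketch with illustrative column names:

```python
import pandas as pd

EXPECTED_COLUMNS = {"order_id": "int64", "amount": "float64", "currency": "object"}

def validate_schema(df: pd.DataFrame) -> None:
    """Fail fast if upstream data drifts from the contract this transform expects."""
    missing = set(EXPECTED_COLUMNS) - set(df.columns)
    if missing:
        raise ValueError(f"Missing columns: {sorted(missing)}")
    for column, dtype in EXPECTED_COLUMNS.items():
        if str(df[column].dtype) != dtype:
            raise TypeError(f"{column}: expected {dtype}, got {df[column].dtype}")

df = pd.DataFrame({"order_id": [1, 2], "amount": [9.99, 24.5], "currency": ["USD", "EUR"]})
validate_schema(df)  # raises before bad data reaches downstream consumers
```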
One key component that plays a central role in modern data architectures is the data lake, which allows organizations to store and analyze large amounts of data in a cost-effective manner and run advanced analytics and machine learning (ML) at scale. To overcome these issues, Orca decided to build a data lake.
Yet as companies fight for skilled analysts who can use data to make better decisions, they often fall short in improving the data supply chain and the resulting data quality. Without solid data supply-chain management practices in place, data quality often suffers. First mile/last mile impacts.
Business terms and data policies should be implemented through standardized and documented business rules. Compliance with these business rules can be tracked through data lineage, incorporating auditability and validation controls across data transformations and pipelines to generate alerts when there are non-compliant data instances.
It does this by helping teams handle the T in ETL (extract, transform, and load) processes. It allows users to write data transformation code, run it, and test the output, all within the framework it provides. As part of their cloud modernization initiative, they sought to migrate and modernize their legacy data platform.
In addition to using native managed AWS services that BMS didn’t need to worry about upgrading, BMS was looking to offer an ETL service to non-technical business users that could visually compose data transformation workflows and seamlessly run them on the AWS Glue Apache Spark-based serverless data integration engine.
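Under the hood, a visually composed workflow of this kind roughly compiles to a Glue PySpark script along these lines; the catalog database, table, and S3 path below are placeholders:

```python
import sys
from awsglue.context import GlueContext
from awsglue.job import Job
from awsglue.utils import getResolvedOptions
from pyspark.context import SparkContext

args = getResolvedOptions(sys.argv, ["JOB_NAME"])
glue_context = GlueContext(SparkContext.getOrCreate())
job = Job(glue_context)
job.init(args["JOB_NAME"], args)

# Read from the Glue Data Catalog (placeholder names)
orders = glue_context.create_dynamic_frame.from_catalog(
    database="sales_db", table_name="raw_orders"
)

# A simple transformation step: keep only completed orders
completed = orders.filter(f=lambda row: row["status"] == "completed")

glue_context.write_dynamic_frame.from_options(
    frame=completed,
    connection_type="s3",
    connection_options={"path": "s3://example-bucket/curated/orders/"},
    format="parquet",
)
job.commit()
```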
“Building a successful data strategy at scale goes beyond collecting and analyzing data,” says Ryan Swann, chief data analytics officer at financial services firm Vanguard. Establishing data governance rules helps organizations comply with these regulations, reducing the risk of legal and financial penalties.
There are countless examples of big data transforming many different industries. There is no disputing the fact that the collection and analysis of massive amounts of unstructured data has been a huge breakthrough. When it comes to big data, data visualization is crucial to more successfully drive high-level decision making.
Given the importance of sharing information among diverse disciplines in the era of digital transformation, this concept is arguably as important as ever. The aim is to normalize, aggregate, and eventually make data that originates in various pockets of the enterprise available to analysts across the organization.
The extraction of raw data, its transformation into a format suitable for business needs, and its loading into a data warehouse. Data transformation: this process turns raw data into clean data that can be analysed and aggregated. Data analytics and visualisation.
But to augment its various businesses with ML and AI, Iyengar’s team first had to break down data silos within the organization and transform the company’s data operations. “Digitizing was our first stake at the table in our data journey,” he says.
Uncomfortable truth incoming: Most people in your organization don’t think about the quality of their data from intake to production of insights. However, as a data team member, you know how important data integrity (and a whole host of other aspects of data management) is.
Azure ML can become a part of the data ecosystem in an organization, but this requires a shift in mindset from working with Business Intelligence to more advanced analytics. How can we adopt this shift from Business Intelligence to advanced analytics using Azure ML?
Airbus was conceiving an ambitious plan to develop an open aviation data platform, Skywise, as a single platform of reference for all major aviation players that would enable them to improve their operational performance and business results and support Airbus’ own digital transformation.
As data volumes continue to grow exponentially, traditional data warehousing solutions may struggle to keep up with the increasing demands for scalability, performance, and advanced analytics. However, you might face significant challenges when planning for a large-scale data warehouse migration.
Amazon Redshift enables you to run complex SQL analytics at scale and performance on terabytes to petabytes of structured and semi-structured data, and make the insights widely available through popular business intelligence (BI) and analytics tools. Answering questions as simple as “How many unique customers do we have?”
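That question is a single aggregate query. Here is a sketch submitting it through the Redshift Data API with boto3; the cluster, database, and table identifiers are placeholders:

```python
import boto3

client = boto3.client("redshift-data", region_name="us-east-1")

# COUNT(DISTINCT ...) answers "How many unique customers do we have?"
response = client.execute_statement(
    ClusterIdentifier="example-cluster",  # placeholder identifiers throughout
    Database="analytics",
    DbUser="analyst",
    Sql="SELECT COUNT(DISTINCT customer_id) FROM sales.orders;",
)
statement_id = response["Id"]  # poll describe_statement, then fetch get_statement_result
print(statement_id)
```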
Why should companies care about data governance? erwin’s 2020 State of Data Governance and Automation report found that better decision-making is the primary driver for data governance (62 percent), with analytics secondary (51 percent), and regulatory compliance coming in third (48 percent).
The entire generative AI pipeline hinges on the data pipelines that empower it, making it imperative to take the correct precautions. 4 key components to ensure reliable data ingestion. Data quality and governance: data quality means ensuring the security of data sources, maintaining holistic data and providing clear metadata.
Dealing with Data is your window into the ways data teams are tackling the challenges of this new world to help their companies and their customers thrive. Streaming data analytics is expected to grow into a $38.6 Every data professional knows that ensuring data quality is vital to producing usable query results.
Query> DataOps. ChatGPT> DataOps, or data operations, is a set of practices and technologies that organizations use to improve the speed, quality, and reliability of their data analytics processes. Overall, DataOps is an essential component of modern data-driven organizations.
Picture this – you start with the perfect use case for your data analytics product. And all of them are asking hard questions: “Can you integrate my data, with my particular format?”, “How well can you scale?”, “How many visualizations do you offer?”. Nowadays, data analytics doesn’t exist on its own.
This is especially beneficial when teams need to increase data product velocity with trust and data quality, reduce communication costs, and help data solutions align with business objectives. In most enterprises, data is needed and produced by many business units but owned and trusted by no one.
In this blog, we’ll delve into the critical role of governance and data modeling tools in supporting a seamless data mesh implementation and explore how erwin tools can be used in that role. erwin also provides data governance, metadata management and data lineage software called erwin Data Intelligence by Quest.
In our last blog, we delved into the seven most prevalent data challenges that can be addressed with effective data governance. Today we will share our approach to developing a data governance program to drive data transformation and fuel a data-driven culture.
The Right Self-Serve Data Preparation Solution is Sophisticated, Easy-to-Use and Ensures User Adoption! When your enterprise decides to roll out analytics for business users, it is important to implement the right solution. Sophisticated Functionality – Don’t sacrifice functionality to get ease-of-use.
Making this data visible in the data catalog will let data teams share their work, support re-use, and empower everyone to better understand and trust data. Data Transformation in the Modern Data Stack. Data engineering plays a critical role in distributing data to a wide audience.
Background: A successful data-driven organization recognizes data as a key enabler for increasing and sustaining innovation. The goal of a data product is to solve the long-standing issue of data silos and data quality. It follows what is called a distributed system architecture.
At a high level, a data warehouse is a collection of business data from multiple sources, optimized for reporting, analytics, and decision making. This automated process of extracting, transforming, and loading data into a data warehouse is commonly called ETL, and it’s a huge advantage for analyzing your data.
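A toy end-to-end version of that ETL flow, with pandas and SQLite standing in for the source system and warehouse (names are illustrative):

```python
import sqlite3
import pandas as pd

# Extract: raw records from a source system (inlined here for illustration)
raw = pd.DataFrame({"customer": ["a@x.com", "B@X.COM"], "amount": ["9.99", "24.50"]})

# Transform: normalize types and values for reporting
raw["customer"] = raw["customer"].str.lower()
raw["amount"] = raw["amount"].astype(float)

# Load: write the cleaned table into the warehouse
with sqlite3.connect("warehouse.db") as conn:
    raw.to_sql("fact_sales", conn, if_exists="replace", index=False)
```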