1) What Is Data Quality Management?
4) Data Quality Best Practices.
5) How Do You Measure Data Quality?
6) Data Quality Metrics Examples.
7) Data Quality Control: Use Case.
8) The Consequences Of Bad Data Quality.
9) 3 Sources Of Low-Quality Data.
10) Data Quality Solutions: Key Attributes.
At AWS, we are committed to empowering organizations with tools that streamline data analytics and transformation processes. This integration enables data teams to efficiently transform and manage data using Athena with dbt Cloud’s robust features, enhancing the overall data workflow experience.
Amazon DataZone is a data management service that makes it faster and easier for customers to catalog, discover, share, and govern data stored across AWS, on premises, and from third-party sources. Using Amazon DataZone lets us avoid building and maintaining an in-house platform, allowing our developers to focus on tailored solutions.
We live in a data-rich, insights-rich, and content-rich world. Data collections are the ones and zeroes that encode the actionable insights (patterns, trends, relationships) that we seek to extract from our data through machine learning and data science. Plus, AI can also help find key insights encoded in data.
Data is the foundation of innovation, agility and competitive advantage in today's digital economy. As technology and business leaders, your strategic initiatives, from AI-powered decision-making to predictive insights and personalized experiences, are all fueled by data. Data quality is no longer a back-office concern.
While customers can perform some basic analysis within their operational or transactional databases, many still need to build custom data pipelines that use batch or streaming jobs to extract, transform, and load (ETL) data into their data warehouse for more comprehensive analysis.
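As a minimal illustration of the batch pattern described above, here is a self-contained extract-transform-load sketch in Python, using pandas and SQLite as stand-ins for a real source system and warehouse; all table and column names are hypothetical.

```python
# A minimal batch ETL sketch. pandas and SQLite stand in for a real
# operational source and data warehouse; names are hypothetical.
import sqlite3
import pandas as pd

def extract() -> pd.DataFrame:
    # In practice this would read from an operational database or object store.
    return pd.DataFrame({
        "order_id": [1, 2, 3],
        "amount_cents": [1250, 830, 4100],
        "created_at": ["2024-01-01", "2024-01-02", "2024-01-02"],
    })

def transform(df: pd.DataFrame) -> pd.DataFrame:
    out = df.copy()
    out["amount_usd"] = out["amount_cents"] / 100  # normalize units
    out["created_at"] = pd.to_datetime(out["created_at"])
    return out.drop(columns=["amount_cents"])

def load(df: pd.DataFrame, conn: sqlite3.Connection) -> None:
    df.to_sql("fact_orders", conn, if_exists="append", index=False)

with sqlite3.connect("warehouse.db") as conn:
    load(transform(extract()), conn)
```

A streaming job follows the same extract-transform-load shape, just applied per event or micro-batch rather than per run.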
Through a visual designer, you can configure custom AI search flows: a series of AI-driven data enrichments performed during ingestion and search. Each processor applies a type of data transform, such as encoding text into vector embeddings or summarizing search results with a chatbot AI service.
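To make the enrichment idea concrete, here is a small Python sketch of an ingest-time processor that attaches a vector embedding to each document before indexing. The embed() callable is a placeholder for whatever embedding model or service such a flow would invoke; the toy implementation exists only so the snippet runs.

```python
# Sketch of an ingest-time enrichment processor: each document gets a
# vector embedding attached before it is indexed for k-NN search.
from typing import Callable

def enrich_for_ingest(doc: dict, embed: Callable[[str], list]) -> dict:
    enriched = dict(doc)
    enriched["text_vector"] = embed(doc["text"])  # field used for vector search
    return enriched

# Toy embedding: character-frequency vector, purely for demonstration.
def toy_embed(text: str) -> list:
    return [text.lower().count(c) / max(len(text), 1) for c in "abcdefgh"]

print(enrich_for_ingest({"id": "1", "text": "data quality"}, toy_embed))
```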
Amazon DataZone now supports authentication through the Amazon Athena JDBC driver, allowing data users to seamlessly query their subscribed data lake assets via popular business intelligence (BI) and analytics tools like Tableau, Power BI, Excel, SQL Workbench, DBeaver, and more.
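The JDBC driver is the route for BI tools; from Python, an analogous query might look like the following sketch using the PyAthena library, assuming the subscribed asset is exposed as an Athena table. The results bucket, region, and table names are hypothetical.

```python
# Hedged sketch: querying an Athena table from Python with PyAthena.
# Bucket, region, database, and table names are hypothetical.
from pyathena import connect

conn = connect(
    s3_staging_dir="s3://my-athena-results/",  # hypothetical results bucket
    region_name="us-east-1",
)
cur = conn.cursor()
cur.execute("SELECT * FROM analytics_db.subscribed_asset LIMIT 10")
for row in cur.fetchall():
    print(row)
```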
How dbt Core helps data teams test, validate, and monitor complex data transformations and conversions. dbt Core, an open-source framework for developing, testing, and documenting SQL-based data transformations, has become a must-have tool for modern data teams as the complexity of data pipelines grows.
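dbt expresses such tests declaratively in YAML and SQL; purely as an illustration of the idea, here is what its built-in not_null and unique checks amount to, sketched in Python against a pandas DataFrame with hypothetical column names.

```python
# What dbt's built-in not_null and unique tests boil down to,
# sketched in Python for illustration.
import pandas as pd

def assert_not_null(df: pd.DataFrame, column: str) -> None:
    bad = df[column].isna().sum()
    assert bad == 0, f"{bad} null values in {column}"

def assert_unique(df: pd.DataFrame, column: str) -> None:
    dupes = df[column].duplicated().sum()
    assert dupes == 0, f"{dupes} duplicate values in {column}"

orders = pd.DataFrame({"order_id": [1, 2, 3], "status": ["paid", "paid", "open"]})
assert_not_null(orders, "order_id")
assert_unique(orders, "order_id")
```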
Replace manual and recurring tasks for fast, reliable data lineage and overall data governance. It’s paramount that organizations understand the benefits of automating end-to-end data lineage. The importance of end-to-end data lineage is widely understood and ignoring it is risky business. Doing Data Lineage Right.
Selecting the strategies and tools for validating data transformations and data conversions in your data pipelines. Data transformations and data conversions are crucial to ensure that raw data is organized, processed, and ready for useful analysis.
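One common validation strategy is reconciliation: after a transform, row counts and key totals should still match the source. The sketch below shows the idea in Python; the column names are illustrative.

```python
# Reconciliation-style validation: row counts and monetary totals
# should survive a transform unchanged. Column names are illustrative.
import pandas as pd

def reconcile(source: pd.DataFrame, target: pd.DataFrame, amount_col: str) -> None:
    assert len(source) == len(target), "row count drifted during transform"
    diff = abs(source[amount_col].sum() - target[amount_col].sum())
    assert diff < 1e-6, f"{amount_col} totals differ by {diff}"

raw = pd.DataFrame({"amount_usd": [10.0, 20.0, 30.0]})
transformed = raw.copy()  # stand-in for a real transformation step
reconcile(raw, transformed, "amount_usd")
```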
Common challenges and practical mitigation strategies for reliable data transformations. Data transformations are important processes in data engineering, enabling organizations to structure, enrich, and integrate data for analytics, reporting, and operational decision-making.
Data lineage is the journey data takes from its creation through its transformations over time. Tracing the source of data is an arduous task. With so many diverse data sources, and with systems integrated with one another, it is difficult to understand the complicated data web they form, much less get a simple visual flow.
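As a sketch of the underlying idea, lineage can be modeled as a directed graph whose nodes are datasets and whose edges are transformations; networkx is one convenient way to query ancestry. The dataset names below are hypothetical.

```python
# Lineage as a directed graph: nodes are datasets, edges are transforms.
# nx.ancestors() then answers "where did this table come from?".
import networkx as nx

lineage = nx.DiGraph()
lineage.add_edge("crm.contacts", "staging.contacts", transform="nightly extract")
lineage.add_edge("erp.orders", "staging.orders", transform="nightly extract")
lineage.add_edge("staging.contacts", "mart.customer_360", transform="join + dedupe")
lineage.add_edge("staging.orders", "mart.customer_360", transform="join + dedupe")

# All upstream sources feeding the mart table:
print(nx.ancestors(lineage, "mart.customer_360"))
```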
This means we can double down on our strategy – continuing to win the Hybrid Data Cloud battle in the IT department AND building new, easy-to-use cloud solutions for the line of business. It also means we can complete our business transformation with the systems, processes and people that support a new operating model.
For years, IT and business leaders have been talking about breaking down the data silos that exist within their organizations. Given the importance of sharing information among diverse disciplines in the era of digital transformation, this concept is arguably as important as ever. What are the challenges and potential rewards?
Today’s best-performing organizations embrace data for strategic decision-making. Because of the criticality of the data they deal with, we think that finance teams should lead the enterprise adoption of data and analytics solutions. This is because accurate data is “table stakes” for finance teams.
OpenSearch is an open source, distributed search engine suitable for a wide array of use cases such as ecommerce search, enterprise search (content management search, document search, knowledge management search, and so on), site search, application search, and semantic search. OpenSearch also includes capabilities to ingest and analyze data.
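For a flavor of the API, here is a minimal index-and-search round trip with the official opensearch-py client, assuming a local, unsecured development cluster on port 9200; the index name and document are made up.

```python
# Minimal index-and-search round trip against a local OpenSearch cluster.
# Assumes an unsecured development cluster on localhost:9200.
from opensearchpy import OpenSearch

client = OpenSearch(hosts=[{"host": "localhost", "port": 9200}])

client.index(index="products", id="1",
             body={"name": "trail running shoe", "price": 89.0})
client.indices.refresh(index="products")  # make the doc searchable now

res = client.search(index="products",
                    body={"query": {"match": {"name": "shoe"}}})
print(res["hits"]["hits"])
```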
dbt allows data teams to produce trusted data sets for reporting, ML modeling, and operational workflows using SQL, with a simple workflow that follows software engineering best practices like modularity, portability, and continuous integration/continuous delivery (CI/CD).
Organizations are flooded with data, so they’re scrambling to find ways to derive meaningful insights from it – and then act on them to improve the bottom line. In today’s data-driven business, enabling employees to access and understand the data that’s relevant to their roles allows them to use data and put those insights into action.
As the world is gradually becoming more dependent on data, the services, tools and infrastructure are all the more important for businesses in every sector. Data management has become a fundamental business concern, and especially for businesses that are going through a digital transformation. What is data management?
In recent years, driven by the commoditization of data storage and processing solutions, the industry has seen a growing number of systematic investment management firms switch to alternative data sources to drive their investment decisions. Each team is the sole owner of its AWS account.
In today’s data-driven world, the ability to seamlessly integrate and utilize diverse data sources is critical for gaining actionable insights and driving innovation. Consider a large ecommerce company that relies heavily on data-driven insights to optimize its operations, marketing strategies, and customer experiences.
Data mesh is a new approach to data management. Companies across industries are using a data mesh to decentralize data management to improve data agility and get value from data. This is especially true in a large enterprise with thousands of data products.
Amazon Redshift is a fast, scalable, secure, and fully managed cloud data warehouse that makes it simple and cost-effective to analyze all your data using standard SQL and your existing ETL (extract, transform, and load), business intelligence (BI), and reporting tools.
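As a hedged sketch of running standard SQL against Redshift from Python, the snippet below uses the Redshift Data API via boto3. The workgroup, database, and table names are hypothetical, and the API is asynchronous, so the statement is polled until it completes.

```python
# Hedged sketch: running SQL on Redshift via the asynchronous Data API.
# Workgroup, database, and table names are hypothetical.
import time
import boto3

client = boto3.client("redshift-data", region_name="us-east-1")
resp = client.execute_statement(
    WorkgroupName="analytics-wg",  # hypothetical Redshift Serverless workgroup
    Database="dev",
    Sql="SELECT product_id, SUM(amount) AS total FROM sales GROUP BY product_id",
)

# Poll until the statement finishes, then fetch the rows.
while True:
    status = client.describe_statement(Id=resp["Id"])["Status"]
    if status in ("FINISHED", "FAILED", "ABORTED"):
        break
    time.sleep(1)

if status == "FINISHED":
    print(client.get_statement_result(Id=resp["Id"])["Records"])
```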
In the fast-evolving landscape of data science and machine learning, efficiency is not just desirable—it’s essential. Imagine a world where every data practitioner, from seasoned data scientists to budding developers, has an intelligent assistant at their fingertips.
Today’s healthcare providers use a wide variety of applications and data across a broad ecosystem of partners to manage their daily workflows. Integrating these applications and data is critical to their success, allowing them to deliver patient care efficiently and effectively. What is HL7? What are the benefits of FHIR?
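For a sense of what FHIR data looks like on the wire, here is a minimal FHIR R4 Patient resource as plain JSON; in practice it would be POSTed to a FHIR server's /Patient endpoint. The server URL in the comment is hypothetical.

```python
# A minimal FHIR R4 Patient resource as plain JSON.
import json

patient = {
    "resourceType": "Patient",
    "name": [{"family": "Smith", "given": ["Jan"]}],
    "birthDate": "1980-05-01",
    "gender": "female",
}
print(json.dumps(patient, indent=2))
# In practice: requests.post("https://fhir.example.com/Patient", json=patient)
# (hypothetical server URL)
```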
In today’s data-driven world, the ability to effortlessly move and analyze data across diverse platforms is essential. Amazon AppFlow , a fully managed data integration service, has been at the forefront of streamlining data transfer between AWS services, software as a service (SaaS) applications, and now Google BigQuery.
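As a hedged sketch, a configured AppFlow flow (for example, BigQuery to Amazon S3) can be triggered on demand from Python with boto3; the flow itself is defined beforehand in the console or via create_flow, and the flow name below is hypothetical.

```python
# Hedged sketch: triggering a pre-configured AppFlow flow on demand.
# The flow name is hypothetical; the flow must already exist.
import boto3

appflow = boto3.client("appflow", region_name="us-east-1")
resp = appflow.start_flow(flowName="bigquery-to-s3-daily")
print(resp["flowStatus"], resp.get("executionId"))
```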
Data integration is the foundation of robust data analytics. It encompasses the discovery, preparation, and composition of data from diverse sources. In the modern data landscape, accessing, integrating, and transforming data from diverse sources is a vital process for data-driven decision-making.
In today’s data-driven world, seamless integration and transformation of data across diverse sources into actionable insights is paramount. With AWS Glue, you can discover and connect to hundreds of diverse data sources and manage your data in a centralized data catalog.
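As a small illustration of the catalog side, boto3 can list what Glue has already cataloged; the database name and region below are hypothetical.

```python
# Sketch: listing tables in the AWS Glue Data Catalog via boto3.
# Database name and region are hypothetical.
import boto3

glue = boto3.client("glue", region_name="us-east-1")
paginator = glue.get_paginator("get_tables")
for page in paginator.paginate(DatabaseName="sales_db"):
    for table in page["TableList"]:
        print(table["Name"],
              table.get("StorageDescriptor", {}).get("Location"))
```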
In our last blog, we delved into the seven most prevalent data challenges that can be addressed with effective data governance. Today we will share our approach to developing a data governance program to drive data transformation and fuel a data-driven culture. Don’t try to do everything at once!
Organizations have spent a lot of time and money trying to harmonize data across diverse platforms, including cleansing, uploading metadata, converting code, defining business glossaries, tracking data transformations and so on. So questions linger about whether transformed data can be trusted.
In today’s data-centric world, organizations often tout data as their most valuable asset. However, many struggle to maintain reliable, trustworthy data amidst complex, evolving environments. This challenge is especially critical for executives responsible for data strategy and operations.
Today, in order to accelerate and scale data analytics, companies are looking for an approach to minimize infrastructure management and predict computing needs for different types of workloads, including spikes and ad hoc analytics. The integration of Talend with Amazon Redshift adds new features and capabilities.
We just announced the general availability of Cloudera DataFlow Designer , bringing self-service data flow development to all CDP Public Cloud customers. In this blog post we will put these capabilities in context and dive deeper into how the built-in, end-to-end data flow life cycle enables self-service data pipeline development.
As we review data transformation and modernization strategies with our clients, we find many are investigating Snowflake as a data warehouse solution due to its ease of use, speed, and increased flexibility over a traditional data warehouse offering. In this post, we focus on data migration and ongoing transformation.
Collecting and using data to make informed decisions is the new foundation for businesses. The key term here is usable: anyone can be data rich, and collect vast troves of data. This is where metadata, or the data about data, comes into play. A metadata management framework does the same for your data analysts.
What Is Data Governance In The Public Sector? Effective data governance for the public sector enables entities to ensure data quality, enhance security, protect privacy, and meet compliance requirements. With so much focus on compliance, democratizing data for self-service analytics can present a challenge.
Chances are, you’ve heard of the term “modern data stack” before. In this article, I will explain the modern data stack in detail, list some benefits, and discuss what the future holds. What Is the Modern Data Stack? The modern data stack is known to have benefits in handling data due to its robustness, speed, and scalability.
These initiatives utilize interconnected devices and automated machines that create an explosive increase in data volumes. This type of growth has stressed legacy data management systems and made it nearly impossible to implement a profitable data-centered solution. High-level example of a common machine learning lifecycle.
In this post, we share how the AWS Data Lab helped Tricentis to improve their software as a service (SaaS) Tricentis Analytics platform with insights powered by Amazon Redshift. Although Tricentis has amassed such data over a decade, the data remains untapped for valuable insights.
Many thanks to AWP Pearson for the permission to excerpt “Manual Feature Engineering: Manipulating Data for Fun and Profit” from the book Machine Learning with Python for Everyone by Mark E. Fenner. Feature engineering is useful for data scientists when assessing tradeoff decisions regarding the impact of their ML models.
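In the spirit of the excerpt, here is a small pandas sketch of manual feature engineering: deriving new model inputs from raw columns. The columns are hypothetical.

```python
# Manual feature engineering sketch: deriving new features from raw columns.
import pandas as pd

df = pd.DataFrame({
    "signup_date": pd.to_datetime(["2024-01-05", "2024-03-12"]),
    "last_seen": pd.to_datetime(["2024-04-01", "2024-04-02"]),
    "purchases": [3, 0],
})

# Derived features: account tenure in days, and a binary activity flag.
df["tenure_days"] = (df["last_seen"] - df["signup_date"]).dt.days
df["is_active_buyer"] = (df["purchases"] > 0).astype(int)
print(df)
```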
The quick and dirty definition of data mapping is the process of connecting different types of data from various data sources. Data mapping is a crucial step in data modeling and can help organizations achieve their business goals by enabling data integration, migration, transformation, and quality.
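At its simplest, a data mapping is a lookup from source fields to target fields, as in the Python sketch below; both schemas here are hypothetical.

```python
# A minimal field-mapping sketch: renaming source fields to a target schema.
# Both schemas are hypothetical.
FIELD_MAP = {
    "cust_nm": "customer_name",
    "dob": "date_of_birth",
    "amt": "amount_usd",
}

def map_record(source: dict) -> dict:
    return {target: source.get(src) for src, target in FIELD_MAP.items()}

print(map_record({"cust_nm": "Ada", "dob": "1990-01-01", "amt": "42.50"}))
```

Real mappings also carry type conversions and unit normalization, which is where mapping shades into transformation.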
By leveraging data analysis to solve high-value business problems, these teams become more efficient. This is in contrast to traditional BI, which extracts insight from data outside of the app, using a separate platform that gathers data from many sources. These tools prep that data for analysis and then provide reporting on it from a central viewpoint.
When extracting your financial and operational reporting data from a cloud ERP, your enterprise organization needs accurate, cost-efficient, user-friendly insights into that data. While real-time extraction is historically faster, your team needs the reliability of the replication process for your cloud data extraction.