Data Leaders Brief

Bigeye Enable Monitoring, Quality and Lineage of Data

David Menninger's Analyst Perspectives

NOVEMBER 19, 2024

I previously explained that data observability software has become a critical component of data-driven decision-making. Data observability addresses one of the most significant impediments to generating value from data by providing an environment for monitoring the quality and reliability of data on a continual basis.

Data Quality

Data Quality Dashboards Data-driven Machine Learning

The state of data quality in 2020

O'Reilly on Data

FEBRUARY 11, 2020

We suspected that data quality was a topic brimming with interest. The responses show a surfeit of concerns around data quality and some uncertainty about how best to address those concerns. Key survey results: The C-suite is engaged with data quality. Data quality might get worse before it gets better.

Data Quality

Data Quality Metadata Data Governance Publishing

Why companies are in need of data lineage solutions

O'Reilly on Data

APRIL 25, 2019

The O’Reilly Data Show Podcast: Neelesh Salian on data lineage, data governance, and evolving data platforms. In this episode of the Data Show, I spoke with Neelesh Salian, software engineer at Stitch Fix, a company that combines machine learning and human expertise to personalize shopping.

Machine Learning

Machine Learning Data Governance Software Technology

The Symbiotic Relationship Between Data Governance and AI

David Menninger's Analyst Perspectives

MAY 14, 2025

Data governance has always been a critical part of the data and analytics landscape. However, for many years, it was seen as a preventive function to limit access to data and ensure compliance with security and data privacy requirements. Data governance is integral to an overall data intelligence strategy.

Data Governance

Data Governance Data Quality Data-driven Metadata

Octopai Acquisition Enhances Metadata Management to Trust Data Across Entire Data Estate

Cloudera

NOVEMBER 13, 2024

We are excited to announce the acquisition of Octopai, a leading data lineage and catalog platform that provides data discovery and governance for enterprises to enhance their data-driven decision making.

Metadata

Metadata Management Data Governance Data-driven

Becoming a machine learning company means investing in foundational technologies

O'Reilly on Data

MAY 21, 2019

Companies successfully adopt machine learning either by building on existing data products and services, or by modernizing existing models and algorithms. In this post, I share slides and notes from a keynote I gave at the Strata Data Conference in London earlier this year. Use ML to unlock new data types—e.g.,

Machine Learning

Machine Learning Technology Deep Learning Data Science

Deep automation in machine learning

O'Reilly on Data

DECEMBER 19, 2018

We need to do more than automate model building with autoML; we need to automate tasks at every stage of the data pipeline. In a previous post, we talked about applications of machine learning (ML) to software development, which included a tour through sample tools in data science and for managing data infrastructure.

Machine Learning

Machine Learning Software Metadata Testing

Core technologies and tools for AI, big data, and cloud computing

O'Reilly on Data

FEBRUARY 11, 2019

In a forthcoming survey, “Evolving Data Infrastructure,” we found strong interest in machine learning (ML) among respondents across geographic regions. Many companies are just beginning to address the interplay between their suite of AI, big data, and cloud technologies. Temporal data and time-series analytics. Deep Learning.

Big Data

Big Data Technology Machine Learning Deep Learning

Artificial intelligence and machine learning adoption in European enterprise

O'Reilly on Data

FEBRUARY 4, 2019

In a recent survey, we explored how companies were adjusting to the growing importance of machine learning and analytics, while also preparing for the explosion in the number of data sources. (You can find full results from the survey in the free report “Evolving Data Infrastructure”.)

Machine Learning

Machine Learning Enterprise IoT Big Data

Bridging the gap between mainframe data and hybrid cloud environments

CIO Business Intelligence

FEBRUARY 27, 2025

A high hurdle many enterprises have yet to overcome is accessing mainframe data via the cloud. Mainframes hold an enormous amount of critical and sensitive business data including transactional information, healthcare records, customer data, and inventory metrics.

Metadata

Metadata Data Lake Cost-Benefit Forecasting

How ANZ Institutional Division built a federated data platform to enable their domain teams to build data products to support business outcomes

AWS Big Data

DECEMBER 4, 2024

In today’s rapidly evolving financial landscape, data is the bedrock of innovation, enhancing customer and employee experiences and securing a competitive edge. Like many large financial institutions, ANZ Institutional Division operated with siloed data practices and centralized data management teams.

Metadata

Metadata Data Governance Data Quality Data-driven

Data’s dark secret: Why poor quality cripples AI and growth

CIO Business Intelligence

APRIL 8, 2025

Data is the foundation of innovation, agility and competitive advantage in today’s digital economy. As technology and business leaders, your strategic initiatives, from AI-powered decision-making to predictive insights and personalized experiences, are all fueled by data. Data quality is no longer a back-office concern.

Data Quality

Data Quality Data-driven Key Performance Indicator Metadata

Addressing Data Mesh Technical Challenges with DataOps

DataKitchen

AUGUST 9, 2021

Below is our third post (3 of 5) on combining data mesh with DataOps to foster greater innovation while addressing the challenges of a decentralized architecture. We’ve talked about data mesh in organizational terms (see our first post, “What is a Data Mesh?”) and how team structure supports agility. Source: Thoughtworks.

Testing

Testing Data Lake Metadata Publishing

SAP Datasphere Powers Business at the Speed of Data

Rocket-Powered Data Science

MARCH 20, 2023

We live in a data-rich, insights-rich, and content-rich world. Data collections are the ones and zeroes that encode the actionable insights (patterns, trends, relationships) that we seek to extract from our data through machine learning and data science. Plus, AI can also help find key insights encoded in data.

Data Warehouse

Data Warehouse Metadata Digital Transformation Machine Learning

AI adoption in the enterprise 2020

O'Reilly on Data

MARCH 18, 2020

The update sheds light on what AI adoption looks like in the enterprise (hint: deployments are shifting from prototype to production), the popularity of specific techniques and tools, the challenges experienced by adopters, and so on. Most companies that were evaluating or experimenting with AI are now using it in production deployments.

Enterprise

Enterprise Deep Learning Data Governance Risk

7 data trends on our radar

O'Reilly on Data

JANUARY 8, 2019

From infrastructure to tools to training, Ben Lorica looks at what’s ahead for data. Whether you’re a business leader or a practitioner, here are key data trends to watch and explore in the months ahead. Increasing focus on building data culture, organization, and training. Cloud for data infrastructure.

Machine Learning

Machine Learning IoT Internet of Things Data Science

The DataOps Vendor Landscape, 2021

DataKitchen

APRIL 13, 2021

This is not surprising given that DataOps enables enterprise data teams to generate significant business value from their data. Companies that implement DataOps find that they are able to reduce cycle times from weeks (or months) to days, virtually eliminate data errors, increase collaboration, and dramatically improve productivity.

Testing

Testing Machine Learning Consulting Data Science

Demystify data sharing and collaboration patterns on AWS: Choosing the right tool for the job

AWS Big Data

OCTOBER 21, 2024

Data is the most significant asset of any organization. However, enterprises often encounter challenges with data silos, insufficient access controls, poor governance, and quality issues. Embracing data as a product is the key to addressing these challenges and fostering a data-driven culture.

Sales

Sales Data-driven Data Processing Key Performance Indicator

Collibra Provides a Platform for Data Intelligence

David Menninger's Analyst Perspectives

OCTOBER 8, 2024

As I recently noted, the term “data intelligence” has been used by multiple providers across analytics and data for several years and is becoming more widespread as software providers respond to the need to provide enterprises with a holistic view of data production and consumption.

Data Quality

Data Quality Data Governance Enterprise Visualization

Amazon DataZone introduces OpenLineage-compatible data lineage visualization in preview

AWS Big Data

JULY 8, 2024

We are excited to announce the preview of API-driven, OpenLineage-compatible data lineage in Amazon DataZone to help you capture, store, and visualize lineage of data movement and transformations of data assets on Amazon DataZone.

Visualization

Visualization Metadata Publishing Sales

The next generation of Amazon SageMaker: The center for all your data, analytics, and AI

AWS Big Data

DECEMBER 4, 2024

This week on the keynote stages at AWS re:Invent 2024, you heard Matt Garman, CEO, AWS, and Swami Sivasubramanian, VP of AI and Data, AWS, speak about the next generation of Amazon SageMaker, the center for all of your data, analytics, and AI. The relationship between analytics and AI is rapidly evolving.

Data Analytics

Data Analytics Analytics Data Lake Data Quality

Build a high-performance quant research platform with Apache Iceberg

AWS Big Data

JANUARY 9, 2025

In this post, we focus on data management implementation options such as accessing data directly in Amazon Simple Storage Service (Amazon S3), using popular data formats like Parquet, or using open table formats like Iceberg. Data management is the foundation of quantitative research.

Metadata

Metadata Snapshot Cost-Benefit Optimization

Accelerating Drug Discovery and Development with DataOps

DataKitchen

AUGUST 13, 2021

If a company can use data to identify compounds more quickly and accelerate the development process, it can monetize its drug pipeline more effectively. DataOps automation provides a way to boost innovation and improve collaboration related to data in pharmaceutical research and development (R&D). Mastery of Heterogeneous Tools.

Testing

Testing Dashboards Marketing Measurement

Managing machine learning in the enterprise: Lessons from banking and health care

O'Reilly on Data

JULY 15, 2019

As companies use machine learning (ML) and AI technologies across a broader suite of products and services, it’s clear that new tools, best practices, and new organizational structures will be needed. Regulators behind SR 11-7 also emphasize the importance of data—specifically data quality , relevance , and documentation.

Machine Learning

Machine Learning Management Enterprise Risk Management

Map and Monitor Your Data Journey

DataKitchen

OCTOBER 12, 2022

Can you draw a map of all the paths data takes from source systems to production insight delivery? How many tools, technologies, configurations, and paths do your data take during its production process? What is the ‘run-time lineage’ of data in your organization?

Technology

Technology IT

Data as a Product: Needs and Requirements

David Menninger's Analyst Perspectives

OCTOBER 2, 2024

I previously wrote about data mesh as a cultural and organizational approach to distributed data processing. Data mesh has four key principles—domain-oriented ownership, data as a product, self-serve data infrastructure and federated governance—each of which is being widely adopted.

Key Performance Indicator

Key Performance Indicator Data Governance Data Warehouse Enterprise

How Cloudera Data Flow Enables Successful Data Mesh Architectures

Cloudera

OCTOBER 7, 2021

In this blog, I will demonstrate the value of Cloudera DataFlow (CDF), the edge-to-cloud streaming data platform available on the Cloudera Data Platform (CDP), as a Data integration and Democratization fabric. Introduction to the Data Mesh Architecture and its Required Capabilities. Components of a Data Mesh.

Metadata

Metadata Cost-Benefit Enterprise Interactive

Doing Cloud Migration and Data Governance Right the First Time

erwin

OCTOBER 8, 2020

So if you’re going to move your data from on-premises legacy data stores and warehouse systems to the cloud, you should do it right the first time. And as you make this transition, you need to understand what data you have, know where it is located, and govern it along the way. Then you must bulk load the legacy data.

Data Governance

Data Governance Metadata Testing Data Lake

Data Observability and Monitoring with DataOps

DataKitchen

MAY 10, 2021

Data errors impact decision-making. Data errors infringe on work-life balance. Data errors also affect careers. If you have been in the data profession for any length of time, you probably know what it means to face a mob of stakeholders who are angry about inaccurate or late analytics.

Testing

Testing Manufacturing Data Quality Statistics

What is Data Lineage? Top 5 Benefits of Data Lineage

erwin

APRIL 29, 2020

Data lineage is the journey data takes from its creation through its transformations over time. Tracing the source of data is an arduous task. With all these diverse data sources, and if systems are integrated, it is difficult to understand the complicated data web they form, much less get a simple visual flow.

Key Performance Indicator

Key Performance Indicator Metadata Data Governance Data Quality

A Data Prediction for 2025

DataKitchen

FEBRUARY 2, 2023

We’ve read many predictions for 2023 in the data field: they cover excellent topics like data mesh, observability, governance, lakehouses, LLMs, etc. What will the world of data tools be like at the end of 2025? Central IT Data Teams focus on standards, compliance, and cost reduction. Recession: the party is over.

Metadata

Metadata Testing Data Science Risk

The Role of Model Governance in Machine Learning and Artificial Intelligence

Domino Data Lab

AUGUST 6, 2021

This includes: model lineage, from data acquisition to model building; model versions in production, as they are updated based on new data; model health in production with model monitoring principles; model usage and basic functionality in production; and model costs. First is the data the model is using.

Machine Learning

Machine Learning Modeling Testing Data Science

Do I Need a Data Catalog?

erwin

JUNE 26, 2020

If you’re serious about a data-driven strategy, you’re going to need a data catalog. Organizations need a data catalog because it enables them to create a seamless way for employees to access and consume data and business assets in an organized manner. This also diminishes the value of data as an asset.

Metadata

Metadata Cost-Benefit Measurement Data-driven

Introducing Native Connector for Google BigQuery: Boosting Data Lineage, Migration, and Discovery

Octopai

APRIL 24, 2023

This new native integration enhances our data lineage solution by providing seamless integration with one of the most powerful cloud-based data warehouses, benefiting data teams and enabling support for a broader range of data lineage, discovery, and catalog.

Cost-Benefit

Cost-Benefit Data Warehouse Data-driven Data Governance

7 Benefits of Metadata Management

erwin

FEBRUARY 19, 2021

Metadata management is key to wringing all the value possible from data assets. However, most organizations don’t use all the data at their disposal to reach deeper conclusions about how to drive revenue, achieve regulatory compliance or accomplish other strategic objectives. Quite simply, metadata is data about data.

Metadata

Metadata Management Data Quality Cost-Benefit

The Need For Personalized Data Journeys for Your Data Consumers

DataKitchen

OCTOBER 20, 2023

In today’s data-driven landscape, Data and Analytics Teams increasingly face a unique set of challenges presented by Demanding Data Consumers who require a personalized level of Data Observability. Data Observability platforms often need to deliver this level of customization.

Insurance

Insurance Metadata Data-driven Data Quality

The quest for high-quality data

O'Reilly on Data

JUNE 18, 2019

Machine learning solutions for data integration, cleaning, and data generation are beginning to emerge. “AI starts with ‘good’ data” is a statement that receives wide agreement from data scientists, analysts, and business owners. The problem is even more magnified in the case of structured enterprise data.

Machine Learning

Machine Learning Data Quality Statistics Modeling

Why data observability is essential to AI governance

erwin

DECEMBER 9, 2024

When it comes to using AI and machine learning across your organization, there are many good reasons to provide your data and analytics community with an intelligent data foundation. For instance, Large Language Models (LLMs) are known to ultimately perform better when data is structured. Let’s give a for instance.

Metadata

Metadata Data Quality Sales Modeling

Data Intelligence and Its Role in Combating Covid-19

erwin

MARCH 30, 2020

Data intelligence has a critical role to play in the supercomputing battle against Covid-19. While leveraging supercomputing power is a tremendous asset in our fight to combat this global pandemic, in order to deliver life-saving insights, you really have to understand what data you have and where it came from.

Metadata

Metadata IT Data Governance Data Quality

Top analytics announcements of AWS re:Invent 2024

AWS Big Data

FEBRUARY 26, 2025

This premier event showcased groundbreaking advancements, keynotes from AWS leadership, hands-on technical sessions, and exciting product launches. Analytics remained one of the key focus areas this year, with significant updates and innovations aimed at helping businesses harness their data more efficiently and accelerate insights.

Analytics

Analytics Data Lake Metadata Data Warehouse

2024 Gartner Market Guide To DataOps

DataKitchen

AUGUST 16, 2024

As the pioneer in the DataOps category, we are proud to have laid the groundwork for what has become an essential approach to managing data operations in today’s fast-paced business environment. At DataKitchen, we think of this as a ‘meta-orchestration’ of the code and tools acting upon the data.

Marketing

Marketing Data Quality Testing Metadata

How to Use a Data Lineage Tool to Ensure Data Quality

Octopai

MARCH 23, 2022

Dirty Meat… and Dirty Data. Mass production. But even though “dirty meat” is a small concern, “dirty data” is the scourge of any industry that relies heavily on information systems. While “dirty data” doesn’t sound as threatening as “dirty meat” (after all, it’s your computer ingesting it, not you), don’t be deceived.

Data Quality

Data Quality Reporting Modeling Interactive

Data Governance and Metadata Management: You Can’t Have One Without the Other

erwin

FEBRUARY 13, 2020

When an organization’s data governance and metadata management programs work in harmony, then everything is easier. Data governance is a complex but critical practice. There’s always more data to handle, much of it unstructured; more data sources, like IoT, more points of integration, and more regulatory compliance requirements.

Metadata

Metadata Data Governance Management Cost-Benefit

What is Data Mesh?

Octopai

APRIL 20, 2023

Data mesh is an approach to data architecture that is intentionally distributed, where data is owned and governed by domain-specific teams who treat the data as a product to be consumed by other domain-specific teams. What are the principles behind data mesh architecture?

Data-driven

Data-driven Data Architecture Sales Interactive
