Reasons for using RAG are clear: large language models (LLMs), which are effectively syntax engines, tend to “hallucinate” by inventing answers from pieces of their training data. See the primary source “REALM: Retrieval-Augmented Language Model Pre-Training” by Kelvin Guu et al. Split each document into chunks.
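As a minimal sketch of that chunking step, assuming plain-text documents; the chunk_size and overlap values are illustrative choices, not taken from the article:

```python
# Sketch of the "split each document into chunks" step in a RAG pipeline.
# chunk_size/overlap are illustrative; tune them to your embedding model.

def chunk_document(text: str, chunk_size: int = 500, overlap: int = 50) -> list[str]:
    """Split text into overlapping character windows so retrieval can
    return passages that fit an embedding model's input limit."""
    if chunk_size <= overlap:
        raise ValueError("chunk_size must exceed overlap")
    chunks, start = [], 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        start += chunk_size - overlap
    return chunks

docs = ["Retrieval-augmented generation grounds an LLM in external text..."]
all_chunks = [c for doc in docs for c in chunk_document(doc)]
```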
1) What Is Data Quality Management? 4) Data Quality Best Practices. 5) How Do You Measure Data Quality? 6) Data Quality Metrics Examples. 7) Data Quality Control: Use Case. 8) The Consequences Of Bad Data Quality. 9) 3 Sources Of Low-Quality Data.
Once the province of the data warehouse team, data management has increasingly become a C-suite priority, with data quality seen as key for both customer experience and business performance. But along with siloed data and compliance concerns, poor data quality is holding back enterprise AI projects.
As technology and business leaders, your strategic initiatives, from AI-powered decision-making to predictive insights and personalized experiences, are all fueled by data. Yet, despite growing investments in advanced analytics and AI, organizations continue to grapple with a persistent and often underestimated challenge: poor data quality.
Not least is the broadening realization that ML models can fail. And that’s why model debugging, the art and science of understanding and fixing problems in ML models, is so critical to the future of ML. Because all ML models make mistakes, everyone who cares about ML should also care about model debugging. [1]
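One common model-debugging tactic, offered here only as a sketch rather than the article's own method, is slicing error rates by segment to see where a model fails; the column names and data below are invented for illustration:

```python
import pandas as pd

# Hypothetical evaluation frame: true labels, model predictions, and a
# segment column for slicing. All values are fabricated for the example.
df = pd.DataFrame({
    "segment": ["new_customer", "new_customer", "returning", "returning"],
    "y_true":  [1, 0, 1, 1],
    "y_pred":  [0, 0, 1, 1],
})
df["error"] = (df["y_true"] != df["y_pred"]).astype(int)

# Per-segment error rates often reveal slices where a model quietly fails
# even when the aggregate metric looks healthy.
print(df.groupby("segment")["error"].mean())
```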
Data debt that undermines decision-making. In Digital Trailblazer, I share a story of a private company that reported a profitable year to the board, only to return after the holiday to find that data quality issues and calculation mistakes turned it into an unprofitable one.
In recent posts, we described the foundational technologies needed to sustain machine learning practices within organizations, and specialized tools for model development, model governance, and model operations/testing/monitoring. Sources of model risk. Model risk management.
There has been a significant increase in our ability to build complex AI models for predictions, classifications, and various analytics tasks, and there’s an abundance of (fairly easy-to-use) tools that allow data scientists and analysts to provision complex models within days. Data integration and cleaning.
Companies are no longer wondering if data visualizations improve analyses but what is the best way to tell each data story. 2020 will be the year of data quality management and data discovery: clean and secure data combined with a simple and powerful presentation. 1) Data Quality Management (DQM).
In a world focused on buzzword-driven models and algorithms, you’d be forgiven for forgetting about the unreasonable importance of data preparation and quality: your models are only as good as the data you feed them. The model and the data specification become more important than the code.
“We actually started our AI journey using agents almost right out of the gate,” says Gary Kotovets, chief data and analytics officer at Dun & Bradstreet. The knowledge management systems are up to date and support API calls, but gen AI models communicate in plain English. That's what Cisco is doing.
Now, with support for dbt Cloud, you can access a managed, cloud-based environment that automates and enhances your data transformation workflows. This upgrade allows you to build, test, and deploy data models in dbt with greater ease and efficiency, using all the features that dbt Cloud provides.
DataKitchen Training and Certification Offerings. For individual contributors with a background in data analytics/science/engineering: overall ideas and principles of DataOps; the DataOps Cookbook (200-page book, over 30,000 readers, free); DataOps Certification (3 hours, online, free, sign up online); the DataOps Manifesto (over 30,000 signatures) (…)
And everyone has opinions about how these language models and art generation programs are going to change the nature of work, usher in the singularity, or perhaps even doom the human race. 16% of respondents working with AI are using open source models. A few have even tried out Bard or Claude, or run LLaMA 1 on their laptop.
The hype around large language models (LLMs) is undeniable. They promise to revolutionize how we interact with data, generating human-quality text, understanding natural language and transforming data in ways we never thought possible. In life sciences, simple statistical software can analyze patient data.
Working software over comprehensive documentation. Business intelligence is moving away from the traditional engineering model: analysis, design, construction, testing, and implementation. In the traditional model, communication between developers and business users is not a priority. Finalize documentation, where necessary.
Digital transformation started creating a digital presence of everything we do in our lives, and artificial intelligence (AI) and machine learning (ML) advancements in the past decade dramatically altered the data landscape. Implementing ML capabilities can help find the right thresholds.
While generative AI has been around for several years, the arrival of ChatGPT (a conversational AI tool for all business occasions, built and trained from large language models) has been like a brilliant torch brought into a dark room, illuminating many previously unseen opportunities. So, if you have 1 trillion data points (e.g., …
Similarly, in “Building Machine Learning Powered Applications: Going from Idea to Product,” Emmanuel Ameisen states: “Indeed, exposing a model to users in production comes with a set of challenges that mirrors the ones that come with debugging a model.”
We will explore Iceberg's concurrency model, examine common conflict scenarios, and provide practical implementation patterns, covering both automatic retry mechanisms and situations that require custom conflict-resolution logic, for building resilient data pipelines. This scenario applies to any type of update on an Iceberg table.
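A minimal sketch of the automatic-retry pattern under Iceberg-style optimistic concurrency; note that CommitConflictError and commit_fn are stand-ins for whatever exception and commit callable your Iceberg client exposes, not the actual API:

```python
import random
import time

class CommitConflictError(Exception):
    """Stand-in for the conflict exception your Iceberg client raises."""

def commit_with_retry(commit_fn, max_attempts: int = 5) -> None:
    """Retry an optimistic commit with exponential backoff plus jitter.
    Iceberg writers validate against the latest table snapshot, so a
    commit that lost the race can often simply be retried."""
    for attempt in range(1, max_attempts + 1):
        try:
            commit_fn()   # re-reads the current snapshot, then commits
            return
        except CommitConflictError:
            if attempt == max_attempts:
                raise     # a real conflict: needs custom resolution logic
            time.sleep((2 ** attempt) * 0.1 + random.random() * 0.1)
```

The backoff-with-jitter choice is the usual way to keep concurrent writers from re-colliding on every retry; the final re-raise is where the custom conflict-resolution path mentioned above would take over.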
We need to do more than automate model building with autoML; we need to automate tasks at every stage of the data pipeline. In a previous post , we talked about applications of machine learning (ML) to software development, which included a tour through sample tools in data science and for managing data infrastructure.
While a lot of effort and content is now available, it tends to be high level, so work is still required to create a governance model specific to your organization. Governance is action, and there are many actions an organization can take to create and implement an effective AI governance model.
Data modeling supports collaboration among business stakeholders, with different job roles and skills, to coordinate with business objectives. Data resides everywhere in a business, on-premises and in private or public clouds. A single source of data truth helps companies begin to leverage data as a strategic asset.
Data collections are the ones and zeroes that encode the actionable insights (patterns, trends, relationships) that we seek to extract from our data through machine learning and data science. The insights are used to produce informative content for stakeholders (decision-makers, business users, and clients).
The main reason is that it is difficult and time-consuming to consolidate, process, label, clean, and protect the information at scale to train AI models. An aircraft engine provider uses AI to manage thousands of technical documents required for engine certification, reducing administration time from 3-6 months to a few weeks.
In natural language processing (NLP) and computational linguistics, the Gold Standard typically represents a corpus of text or a set of documents annotated or tagged with the desired results for the analysis, be it designation of the corresponding part of speech, syntactic parsing, or concepts and relationships.
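As a toy illustration of how a gold standard is used, here is an accuracy check of predicted part-of-speech tags against gold annotations; the sentence and tags are invented for the example:

```python
# Score a hypothetical tagger against gold-standard POS annotations.
gold = [("time", "NOUN"), ("flies", "VERB"), ("like", "ADP"),
        ("an", "DET"), ("arrow", "NOUN")]
pred = [("time", "NOUN"), ("flies", "NOUN"), ("like", "VERB"),
        ("an", "DET"), ("arrow", "NOUN")]

# Each prediction is counted correct only if it matches the gold tag.
correct = sum(g == p for (_, g), (_, p) in zip(gold, pred))
print(f"tagging accuracy vs. gold standard: {correct / len(gold):.0%}")  # 60%
```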
Addressing the Key Mandates of a Modern Model Risk Management (MRM) Framework When Leveraging Machine Learning. The regulatory guidance presented in these documents laid the foundation for evaluating and managing model risk for financial institutions across the United States.
Crucial data resides in hundreds of emails sent and received every day, on spreadsheets, in PowerPoint presentations, on videos, in pictures, in reports with graphs, in text documents, on web pages, in purchase orders, in utility bills, and on PDFs. That data is free flowing and does not reside in one place.
Data is your generative AI differentiator, and a successful generative AI implementation depends on a robust data strategy incorporating a comprehensive data governance approach. Data governance is a critical building block across all these approaches, and we see two emerging areas of focus.
To help alleviate the complexity and extract insights, the foundation, using different AI models, is building an analytics layer on top of this database, having partnered with Databricks and DataRobot. Some of the models are traditional machine learning (ML), and some, LaRovere says, are gen AI, including the new multi-modal advances.
With business process modeling (BPM) being a key component of data governance , choosing a BPM tool is part of a dilemma many businesses either have or will soon face. Historically, BPM didn’t necessarily have to be tied to an organization’s data governance initiative. Choosing a BPM Tool: An Overview. Silo Buster.
Data visualization enables you to make sense of the distributional characteristics of variables, easily identify data entry issues, choose suitable variables for data analysis, assess the outcome of predictive models, and communicate the results to those interested. The central concept is the idea of a document.
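A minimal matplotlib sketch of the second point, spotting data entry issues from a distribution; the ages list, including the impossible 530, is fabricated for illustration:

```python
import matplotlib.pyplot as plt

# Fabricated ages; 530 is a typo-style outlier a histogram makes obvious.
ages = [34, 29, 41, 38, 52, 47, 33, 530, 45, 39, 28, 36]

plt.hist(ages, bins=20)
plt.xlabel("age")
plt.ylabel("count")
plt.title("Distribution check: an isolated bar near 530 flags a data entry issue")
plt.show()
```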
While these will remain big data governance trends for 2020, we anticipate organizations will finally begin tapping into the true value of data as the foundation of the digital business model. Data Modeling: Drive Business Value and Underpin Governance with an Enterprise Data Model.
It encompasses the people, processes, and technologies required to manage and protect data assets. The Data Management Association (DAMA) International defines it as the “planning, oversight, and control over management of data and the use of data and data-related sources.”
This AI-augmented approach ensures that no critical feature falls through the cracks and that accurate requirements documents reduce the likelihood of defects. Invest in data quality: GenAI models are only as good as the data they're trained on; with GenAI, mistakes can be amplified at speed.
Worse is when prioritized initiatives don’t have a documented shared vision, including a definition of the customer, targeted value propositions, and achievable success criteria. One recent study shows that only 50% follow a product-centric operating model focusing on customer centricity and delivering delightful customer experiences.
It will do this, it said, with bidirectional integration between its platform and Salesforce's to seamlessly deliver data governance and end-to-end lineage within Salesforce Data Cloud. “We look at the entire landscape of information that an enterprise has,” Sangani said. “As
This can include a multitude of processes, like data profiling, data quality management, or data cleaning, but we will focus on tips and questions to ask when analyzing data to gain the most cost-effective solution for an effective business strategy. 4) How can you ensure data quality?
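A sketch of the data-profiling kind of check mentioned above, using pandas on an invented orders table; the column names and thresholds are illustrative, not prescribed by the article:

```python
import pandas as pd

# Invented example table; column names are illustrative.
orders = pd.DataFrame({
    "order_id": [1, 2, 2, 4],
    "amount":   [19.99, None, 5.00, -3.50],
    "country":  ["US", "DE", "DE", None],
})

# Cheap profile: row count, duplicate keys, missingness, and range violations.
profile = {
    "rows": len(orders),
    "duplicate_ids": int(orders["order_id"].duplicated().sum()),
    "missing_rate": orders.isna().mean().round(2).to_dict(),
    "negative_amounts": int((orders["amount"] < 0).sum()),
}
print(profile)  # checks like these catch issues before they reach a model
```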
A strong data management strategy and supporting technology enables the data quality the business requires, including data cataloging (integration of data sets from various sources), mapping, versioning, business rules and glossary maintenance, and metadata management (associations and lineage). Map data flows.
It’s embedded in the applications we use every day and the security model overall is pretty airtight. Microsoft has also made investments beyond OpenAI, for example in Mistral and Meta’s LLaMA models, in its own small language models like Phi, and by partnering with providers like Cohere, Hugging Face, and Nvidia. That’s risky.”
Working from datasets you already have, a Time Series Forecasting model can help you better understand seasonality and cyclical behavior and make future-facing decisions, such as reducing inventory or planning staffing. Prepare your data for Time Series Forecasting. Configuring an ML project. Settings for Time Series projects.
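A small pandas-only sketch of that preparation step, with an invented daily sales series; it exposes the weekly seasonality mentioned above and adds a seasonal-naive baseline, which is a generic technique rather than the article's specific workflow:

```python
import pandas as pd

# Invented daily sales; a real project would load historical data instead.
idx = pd.date_range("2024-01-01", periods=56, freq="D")
sales = pd.Series([20 + (d.weekday() >= 5) * 15 for d in idx], index=idx)

# Day-of-week means expose weekly seasonality (weekends sell ~15 more units).
print(sales.groupby(sales.index.day_name()).mean())

# Seasonal-naive baseline: forecast each day as the value one week earlier.
# Any trained model should at least beat this before it informs decisions.
forecast = sales.shift(7)
print(forecast.tail(7))
```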
With seven operating centers, nine research facilities, and more than 18,000 staff, the agency continually generates an overwhelming amount of data, which it stores in more than 30 science data repositories across five topical areas — astrophysics, heliophysics, biological science, physical science, earth science, and planetary science.
The results of our new research show that organizations are still trying to master data governance, including adjusting their strategies to address changing priorities and overcoming challenges related to data discovery, preparation, quality and traceability. Organizations still depend too much on manual data management.
It seems anyone can make an AI model these days. Even if you don’t have the training data or programming chops, you can take your favorite open source model, tweak it, and release it under a new name. And these models, though they lag behind the big commercial ones, are improving quickly.