For all the excitement about machine learning (ML), there are serious impediments to its widespread adoption. Not least is the broadening realization that ML models can fail. In addition to newer innovations, the practice borrows from model risk management, traditional model diagnostics, and software testing, including techniques such as residual analysis.
As companies use machine learning (ML) and AI technologies across a broader suite of products and services, it’s clear that new tools, best practices, and new organizational structures will be needed. What cultural and organizational changes will be needed to accommodate the rise of machine learning and AI?
Data is typically organized into project-specific schemas optimized for business intelligence (BI) applications, advanced analytics, and machine learning. This involves setting up automated, column-by-column quality tests to quickly identify deviations from expected values and catch emerging issues before they impact downstream layers.
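As a sketch of what such column-by-column quality tests can look like, here is a minimal, hypothetical example in Python. The column names, expected bounds, and null-rate limits are assumed values that would normally be learned from historical loads, not part of the original article.

```python
# Hypothetical column profiles: expected bounds learned from historical loads.
EXPECTED = {
    "order_total": {"min": 0.0, "max": 10_000.0, "max_null_rate": 0.01},
    "quantity":    {"min": 1,   "max": 500,      "max_null_rate": 0.0},
}

def check_column(name, values, profile):
    """Return a list of human-readable violations for one column."""
    violations = []
    nulls = sum(v is None for v in values)
    if values and nulls / len(values) > profile["max_null_rate"]:
        violations.append(f"{name}: null rate {nulls / len(values):.2%} exceeds limit")
    present = [v for v in values if v is not None]
    if present and (min(present) < profile["min"] or max(present) > profile["max"]):
        violations.append(f"{name}: values outside expected range")
    return violations

def check_batch(batch):
    """Run every column test; a non-empty result blocks promotion downstream."""
    issues = []
    for name, profile in EXPECTED.items():
        issues += check_column(name, batch.get(name, []), profile)
    return issues

batch = {"order_total": [19.99, 250.0, -5.0], "quantity": [1, 2, 3]}
print(check_batch(batch))  # flags the negative order_total
```

Running checks like these on each incoming batch, before it lands in downstream layers, is one way to catch deviations from expected values early.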
Testing and Data Observability. We have also included vendors for the specific use cases of ModelOps, MLOps, DataGovOps, and DataSecOps, which apply DataOps principles to machine learning, AI, data governance, and data security operations. Dagster / ElementL: a data orchestrator for machine learning, analytics, and ETL.
A look at the landscape of tools for building and deploying robust, production-ready machine learning models. Our surveys over the past couple of years have shown growing interest in machine learning (ML) among organizations from diverse industries. Model operations, testing, and monitoring.
We’ve seen this across dozens of companies, and the teams that break out of this trap all adopt some version of Evaluation-Driven Development (EDD), where testing, monitoring, and evaluation drive every decision from the start. People have been building data products and machine learning products for the past couple of decades.
Introduction: One of the most important applications of statistics is looking into how two or more variables relate. Hypothesis testing is used to check whether there is a significant relationship, and we report it using a p-value. Measuring the strength of that relationship […].
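To make the idea concrete, here is a small, self-contained sketch of that workflow: measure the strength of a relationship with the Pearson correlation coefficient, then estimate a p-value with a permutation test. The data values are illustrative and not from the original article.

```python
import math
import random

def pearson_r(xs, ys):
    """Sample Pearson correlation coefficient (strength of linear relationship)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

def permutation_p_value(xs, ys, trials=2000, seed=0):
    """Two-sided p-value: how often does shuffling ys give |r| at least as large?"""
    rng = random.Random(seed)
    observed = abs(pearson_r(xs, ys))
    ys = list(ys)
    hits = 0
    for _ in range(trials):
        rng.shuffle(ys)
        if abs(pearson_r(xs, ys)) >= observed:
            hits += 1
    return hits / trials

xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [2.1, 3.9, 6.2, 8.1, 9.8, 12.2, 13.9, 16.1]  # roughly 2x, with noise
r = pearson_r(xs, ys)
p = permutation_p_value(xs, ys)
```

A small p-value here means a correlation this strong is very unlikely to arise by chance under shuffling, which is the significance question the excerpt describes.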
As a result, many data teams were not as productive as they might be, with time and effort spent on manually troubleshooting data-quality issues and testing data pipelines. The ability to monitor and measure improvements in data quality relies on instrumentation.
Wetmur says Morgan Stanley has been using modern data science, AI, and machine learning for years to analyze data and activity, pinpoint risks, and initiate mitigation, noting that teams at the firm have earned patents in this space. “I am excited about the potential of generative AI, particularly in the security space,” she says.
If you’re already a software product manager (PM), you have a head start on becoming a PM for artificial intelligence (AI) or machine learning (ML). AI products are automated systems that collect and learn from data to make user-facing decisions. We won’t go into the mathematics or engineering of modern machine learning here.
This role includes everything a traditional PM does, but also requires an operational understanding of machinelearning software development, along with a realistic view of its capabilities and limitations. In addition, the Research PM defines and measures the lifecycle of each research product that they support.
Much has been written about the struggles of deploying machine learning projects to production. This approach has worked well for software development, so it is reasonable to assume that it could address struggles related to deploying machine learning in production too. An Overarching Concern: Correctness and Testing.
Most of these rules focus on the data, since data is ultimately the fuel, the input, the objective evidence, and the source of informative signals that are fed into all data science, analytics, machine learning, and AI models. Test early and often. Test and refine the chatbot. (Suggestion: take a look at MACH architecture.)
Download the Machine Learning Project Checklist. Planning Machine Learning Projects. Machine learning and AI empower organizations to analyze data, discover insights, and drive decision making from troves of data. More organizations are investing in machine learning than ever before.
We are very excited to announce the release of five, yes FIVE, new AMPs, now available in Cloudera Machine Learning (CML). In addition to the UI interface, Cloudera Machine Learning exposes a REST API that can be used to programmatically perform operations related to Projects, Jobs, Models, and Applications.
Similarly, in “Building Machine Learning Powered Applications: Going from Idea to Product,” Emmanuel Ameisen states: “Indeed, exposing a model to users in production comes with a set of challenges that mirrors the ones that come with debugging a model.” Models in production require not only disclosure, but also monitored testing.
GSK had been pursuing DataOps capabilities such as automation, containerization, automated testing and monitoring, and reusability, for several years. Workiva also prioritized improving the data lifecycle of machine learning models, which otherwise can be very time consuming for the team to monitor and deploy.
Fractal’s recommendation is to take an incremental, test-and-learn approach to analytics to fully demonstrate the program’s value before making larger capital investments. There is usually a steep learning curve in terms of “doing AI right,” and that learning is invaluable. What is the most common mistake people make around data?
DataOps introduces agility by advocating for: Measuring data quality early: data quality leaders should begin measuring and assessing data quality even before perfect standards are in place. Early measurements provide valuable insights that can guide future improvements. Measuring and refining: DataOps is an iterative process.
Technical sophistication: Sophistication measures a team’s ability to use advanced tools and techniques (e.g., PyTorch, TensorFlow, reinforcement learning, self-supervised learning). Technical competence: Competence measures a team’s ability to successfully deliver on initiatives and projects. Conclusion.
A DataOps Engineer can make test data available on demand. We have automated testing and a system for exception reporting, where tests identify issues that need to be addressed. It then autogenerates QC tests based on those rules. You can track, measure, and create graphs and reports in an automated way.
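One way to picture "autogenerating QC tests from rules" is a declarative rule spec that gets compiled into test functions, with failures collected into an exception report. This is a minimal sketch under assumed rule names and data, not the vendor's actual implementation.

```python
# Hypothetical rule spec; each rule becomes one generated QC test.
RULES = [
    {"column": "email", "check": "not_null"},
    {"column": "age", "check": "between", "lo": 0, "hi": 120},
]

def make_test(rule):
    """Turn a declarative rule into a row-level test function."""
    if rule["check"] == "not_null":
        return lambda row: row.get(rule["column"]) is not None
    if rule["check"] == "between":
        return lambda row: rule["lo"] <= row.get(rule["column"], rule["lo"] - 1) <= rule["hi"]
    raise ValueError(f"unknown check: {rule['check']}")

def exception_report(rows):
    """Run every generated test on every row; the report feeds exception handling."""
    tests = [(rule, make_test(rule)) for rule in RULES]
    return [
        (i, rule["column"], rule["check"])
        for i, row in enumerate(rows)
        for rule, test in tests
        if not test(row)
    ]

rows = [{"email": "a@b.com", "age": 34}, {"email": None, "age": 150}]
print(exception_report(rows))  # row 1 fails both rules
```

The appeal of this pattern is that adding a quality check is a one-line data change rather than new test code.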
Fortunately, new advances in machine learning technology can help mitigate many of these risks. Therefore, you will want to make sure that your cryptocurrency wallet or service is protected by machine learning technology. In 2018, researchers used data mining and machine learning to detect Ponzi schemes in Ethereum.
Machine learning is playing a very important role in improving the functionality of task management applications. “However, recent advances in applying transfer learning to NLP allows us to train a custom language model in a matter of minutes on a modest GPU, using relatively small datasets,” writes author Euan Wielewski.
This kind of humility is likely to deliver more meaningful progress and a more measured understanding of such progress. Learning how to ace Space Invaders does not interfere with or displace the ability to carry out a chat conversation. For example, how many training examples does it take to learn something?
In this post, we outline planning a POC to measure media effectiveness in a paid advertising campaign. We chose to start this series with media measurement because “Results & Measurement” was the top ranked use case for data collaboration by customers in a recent survey the AWS Clean Rooms team conducted.
The objective of this blog is to show how to use Cloudera Machine Learning (CML), running on Cloudera Data Platform (CDP), to build a predictive maintenance model based on advanced machine learning concepts. The Process. Airlines design their aircraft to operate at 99.999% reliability. Fig 1: Turbofan jet engine.
This type of structure is foundational at REA for building microservices and timely data processing for real-time and batch use cases like time-sensitive outbound messaging, personalization, and machinelearning (ML). We obtained a more comprehensive understanding of the cluster’s performance by conducting these various test scenarios.
Third-party testing and validation can help CIOs find security products that do what they say they do and meet the specific infrastructure needs of their organization. Even worse, some technology testing firms still allow vendors to manipulate their methodologies to skew the test results in their favor.
Model developers will test for AI bias as part of their pre-deployment testing. Quality test suites will enforce “equity” like any other performance metric. Continuous testing, monitoring, and observability will prevent biased models from deploying or continuing to operate.
If you don’t believe me, feel free to test it yourself with the six popular NLP cloud services and libraries listed below. In a test conducted in December 2018, the only medical term recognized (and only by two of the six engines) was Tylenol, identified as a product. IBM Watson NLU. Azure Text Analytics. spaCy Named Entity Visualizer.
Data quality must be embedded into how data is structured, governed, measured and operationalized. Implementing Service Level Agreements (SLAs) for data quality and availability sets measurable standards, promoting responsibility and trust in data assets. Continuous measurement of data quality. Accountability and embedded SLAs.
In our previous post, we talked about how red AI means adding computational power to “buy” more accurate models in machine learning, and especially in deep learning. We also talked about the increased interest in green AI, in which we measure the quality of a model not only by its accuracy but also by how big and complex it is.
There is measurable progress, however, as data from the company’s connected products are collected in its own platform, where customers have access to information via a portal. The company is also applying machine learning (ML) to gather information from various public sources that can be used internally for market and product analysis.
Some will argue that observability is nothing more than testing and monitoring applications using tests, metrics, logs, and other artifacts. Below we will explain how to virtually eliminate data errors using DataOps automation and the simple building blocks of data and analytics testing and monitoring. . Tie tests to alerts.
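The "tie tests to alerts" idea can be sketched in a few lines: every failed data test immediately raises an alert instead of silently logging a result. The test names and alert channel (a WARNING log entry) below are assumptions for illustration.

```python
import logging

logging.basicConfig(level=logging.WARNING, format="%(levelname)s %(message)s")

# Hypothetical tests: each returns (passed, message) for one pipeline stage.
def row_count_test(batch):
    return len(batch) > 0, f"row count = {len(batch)}"

def schema_test(batch):
    ok = all(set(row) == {"id", "value"} for row in batch)
    return ok, "schema check"

TESTS = [row_count_test, schema_test]

def run_with_alerts(batch):
    """Every failed test raises an alert; here an alert is a WARNING log entry."""
    failures = []
    for test in TESTS:
        passed, message = test(batch)
        if not passed:
            logging.warning("ALERT %s failed: %s", test.__name__, message)
            failures.append(test.__name__)
    return failures

print(run_with_alerts([{"id": 1, "value": 2}, {"id": 2}]))  # schema_test fails
```

In a real pipeline the WARNING call would be swapped for a pager, chat, or incident-tracking integration; the point is that the alert is wired directly to the test, not bolted on afterwards.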
In addition, they can use statistical methods, algorithms, and machine learning to more easily establish correlations and patterns, and thus make predictions about future developments and scenarios. If a database already exists, the available data must be tested and corrected.
In this example, the Machine Learning (ML) model struggles to differentiate between a chihuahua and a muffin. How well a human can see what the model keys on (e.g., blueberry spacing) is a measure of the model’s interpretability. Machine Learning Model Lineage. Machine Learning Model Visibility. Figure 04: Applied Machine Learning Prototypes (AMPs).
Machine learning (ML) models are computer programs that draw inferences from data — usually lots of data. As the industry’s understanding of AI bias matures, model developers are getting better at defining and measuring bias. When you buy a car, you can be sure that the factory has tested every component and subsystem.
Often cast as the ultimate foe or friend of the human race in movies (Skynet in Terminator, the Machines of The Matrix, or the Master Control Program of Tron), AI is not yet on the verge of destroying us, in spite of the legitimate warnings of some reputed scientists and tech entrepreneurs. 1 for data analytics trends in 2020.
Organizations are able to monitor integrity, quality drift, performance trends, real-time demand, SLA (service level agreement) compliance metrics, and anomalous behaviors (in devices, applications, and networks) to provide timely alerting, early warnings, and other confidence measures. “Don’t be a SOAR loser!”
You can use Amazon Redshift to analyze structured and semi-structured data and seamlessly query data lakes and operational databases, using AWS designed hardware and automated machine learning (ML)-based tuning to deliver top-tier price performance at scale. Amazon Redshift delivers price performance right out of the box.
In this paper, I show you how marketers can improve their customer retention efforts by 1) integrating disparate data silos and 2) employing machine learning predictive analytics. Your marketing strategy is only as good as your ability to deliver measurable results. (genetic counseling, genetic testing).
In the context of Data in Place, validating data quality automatically with Business Domain Tests is imperative for ensuring the trustworthiness of your data assets. Running these automated tests as part of your DataOps and Data Observability strategy allows for early detection of discrepancies or errors.
Yet, before any serious data interpretation inquiry can begin, it should be understood that visual presentations of data findings are irrelevant unless a sound decision is made regarding scales of measurement. Interval: a measurement scale where data is grouped into categories with orderly and equal distances between the categories.
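The interval-scale definition is worth a concrete illustration. In the following sketch (example values assumed), Celsius temperatures show why interval data supports differences but not ratios: the zero point is arbitrary, so "twice as hot" is only meaningful after converting to a ratio scale with a true zero, such as Kelvin.

```python
# Interval scale: Celsius temperatures. Differences are meaningful;
# ratios are not, because 0 degrees C is an arbitrary zero point.
t_monday, t_tuesday = 10.0, 20.0

difference = t_tuesday - t_monday    # 10 degrees warmer: meaningful
naive_ratio = t_tuesday / t_monday   # "twice as hot": NOT meaningful

# Converting to Kelvin (a ratio scale with a true zero) shows why:
k_monday, k_tuesday = t_monday + 273.15, t_tuesday + 273.15
true_ratio = k_tuesday / k_monday    # about 1.035, nowhere near 2.0
```

Choosing the right scale of measurement up front determines which statistics (means, ratios, rank tests) are even valid, which is the point the excerpt makes about data interpretation.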