Machine learning adds uncertainty. Underneath this uncertainty lies further uncertainty in the development process itself. There are strategies for dealing with all of this uncertainty, starting with the proverb from the early days of Agile: "do the simplest thing that could possibly work."
To win in business you need to follow this process: Metrics > Hypothesis > Experiment > Act. We are far too enamored with data collection and with reporting the standard metrics we love because others love them, and others love them because someone said they were worthwhile years ago. That metric is tied to a KPI.
Ideally, AI PMs would steer development teams to incorporate I/O validation into the initial build of the production system, along with the instrumentation needed to monitor model accuracy and other technical performance metrics. But in practice, it is common for model I/O validation steps to be added later, when scaling an AI product.
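A minimal sketch of the kind of I/O validation described above, written under assumptions of my own: the feature names, score range, and helper functions are illustrative, not a prescribed standard.

```python
# Sketch of model I/O validation hooks for a production AI system.
# Feature names, expected ranges, and function names are hypothetical.
import math

def validate_input(features: dict) -> list[str]:
    """Return a list of problems found in a single inference request."""
    problems = []
    for name in ("user_age", "session_length_sec"):  # assumed feature schema
        if name not in features:
            problems.append(f"missing feature: {name}")
        elif not isinstance(features[name], (int, float)) or math.isnan(features[name]):
            problems.append(f"non-numeric or NaN feature: {name}")
    return problems

def validate_output(score: float) -> list[str]:
    """Check that the model's score is usable before it reaches the product."""
    problems = []
    if math.isnan(score) or math.isinf(score):
        problems.append("score is NaN or infinite")
    elif not 0.0 <= score <= 1.0:  # assumed score range
        problems.append(f"score {score} outside expected [0, 1] range")
    return problems

# In serving code, validation failures would be logged alongside accuracy and
# other technical performance metrics so both are monitored from the start.
```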
Here, $X$ is a vector of tuning parameters that control the system's operating characteristics (e.g., the weight given to Likes in our video recommendation algorithm), while $Y$ is a vector of outcome measures such as different metrics of user experience. Crucially, it takes into account the uncertainty inherent in our experiments.
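A small sketch of this $X \to Y$ setup under assumptions of my own: the parameter (a Like weight), the two outcome metrics, and all numbers are hypothetical, and a real system would measure $Y$ from live experiments rather than simulate it.

```python
# Illustrative X -> Y parametrization: X is a tuning parameter, Y a vector of
# outcome metrics estimated with experimental uncertainty (standard errors).
import numpy as np

rng = np.random.default_rng(0)

def run_experiment(like_weight: float, n_users: int = 50_000):
    """Pretend experiment at one setting of X; returns (Y estimate, standard errors)."""
    # Hypothetical per-user outcomes; in practice these come from a live experiment.
    watch_time = rng.normal(10 + 0.5 * like_weight, 5.0, n_users)
    dislikes = rng.normal(0.2 + 0.1 * like_weight, 0.5, n_users)
    y_hat = np.array([watch_time.mean(), dislikes.mean()])
    y_se = np.array([watch_time.std(ddof=1), dislikes.std(ddof=1)]) / np.sqrt(n_users)
    return y_hat, y_se

for x in (0.0, 0.5, 1.0):
    y_hat, y_se = run_experiment(x)
    print(f"x={x:.1f}  watch_time={y_hat[0]:.2f}±{2*y_se[0]:.2f}  dislikes={y_hat[1]:.3f}±{2*y_se[1]:.3f}")
```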
by AMIR NAJMI & MUKUND SUNDARARAJAN
Data science is about decision making under uncertainty. Some of that uncertainty is the result of statistical inference, i.e., using a finite sample of observations for estimation. But there are other kinds of uncertainty, at least as important, that are not statistical in nature.
Experiment with the “highly visible and highly hyped”: Gartner repeatedly pointed out that organisations that innovate during tough economic times “stay ahead of the pack”, with Mesaglio in particular calling for such experimentation to be public and visible.
Although the absolute metrics of the sparse vector model can't surpass those of the best dense vector models, it possesses unique and advantageous characteristics. Experimental data selection: for retrieval evaluation, we used datasets from BeIR, and we care most about the recall metric.
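For concreteness, here is a minimal sketch of recall@k, the metric emphasized above, for a BeIR-style evaluation where qrels map each query to its set of relevant document ids. The data structures and example values are assumptions for illustration.

```python
# recall@k over a BeIR-style qrels structure (query id -> relevant doc ids).

def recall_at_k(retrieved: dict[str, list[str]], qrels: dict[str, set[str]], k: int = 10) -> float:
    """Average fraction of relevant documents found in the top-k results per query."""
    scores = []
    for qid, relevant in qrels.items():
        if not relevant:
            continue
        top_k = set(retrieved.get(qid, [])[:k])
        scores.append(len(top_k & relevant) / len(relevant))
    return sum(scores) / len(scores) if scores else 0.0

# Example: one query with two relevant docs, one of which appears in the top 10.
print(recall_at_k({"q1": ["d3", "d7", "d9"]}, {"q1": {"d3", "d42"}}, k=10))  # 0.5
```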
by MICHAEL FORTE
Large-scale live experimentation is a big part of online product development, but the statistical power that comes with that scale is not available to a small and growing product, which therefore has to use experimentation differently and very carefully. This blog post is about experimentation in this regime. Such decisions involve an actual hypothesis test on specific metrics.
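As a sketch of the kind of hypothesis test such decisions rest on, here is a two-sample z-test on a conversion-style metric; the test choice, group sizes, and counts are assumptions for illustration rather than the post's own method.

```python
# Two-proportion z-test for an A/B experiment on a conversion metric.
from math import sqrt
from statistics import NormalDist

def two_proportion_z_test(conv_a: int, n_a: int, conv_b: int, n_b: int) -> tuple[float, float]:
    """Return (z statistic, two-sided p-value) for H0: p_a == p_b."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    p_pool = (conv_a + conv_b) / (n_a + n_b)
    se = sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value

# Hypothetical counts: 20k users per arm, slightly higher conversion in arm B.
z, p = two_proportion_z_test(conv_a=1_020, n_a=20_000, conv_b=1_115, n_b=20_000)
print(f"z = {z:.2f}, p = {p:.3f}")
```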
Skomoroch proposes that managing ML projects is challenging for organizations because shipping ML projects requires an experimental culture that fundamentally changes how many companies approach building and shipping software. Another pattern I've seen in good PMs is that they're very metric-driven.
Unlike experimentation in some other areas, LSOS experiments present a surprising challenge to statisticians: even though we operate in the realm of "big data", the statistical uncertainty in our experiments can be substantial. We must therefore maintain statistical rigor in quantifying experimental uncertainty.
Despite a very large number of experimental units, the experiments conducted by an LSOS cannot presume statistical significance of all effects they deem practically significant. The result is that experimenters can't afford to be sloppy about quantifying uncertainty. Often we also want to analyze effects within segments of the population; at Google, we tend to refer to them as slices.
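A small sketch of why the uncertainty stays substantial once the data is cut into slices: the confidence interval half-width for a proportion metric grows as each slice shrinks. The baseline rate and traffic numbers are hypothetical.

```python
# How CI width for a proportion metric grows as "big data" is split into slices.
from math import sqrt

def ci_half_width(p: float, n: int, z: float = 1.96) -> float:
    """Approximate 95% confidence interval half-width for a proportion."""
    return z * sqrt(p * (1 - p) / n)

baseline_rate = 0.05          # e.g. a 5% click-through rate (assumed)
total_units = 10_000_000      # experimental units at the whole-experiment level (assumed)

for n_slices in (1, 20, 200):
    n = total_units // n_slices
    hw = ci_half_width(baseline_rate, n)
    print(f"{n_slices:>3} slices -> n per slice = {n:>10,}, CI half-width ≈ {hw / baseline_rate:.1%} of baseline")
```

Even at ten million units overall, a metric sliced 200 ways carries roughly a ±4% relative confidence interval, which can swamp an effect considered practically significant.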
A geo experiment is an experiment in which the experimental units are defined by geographic regions: non-overlapping, geo-targetable regions ("geos"). This means it is possible to specify exactly in which geos an ad campaign will be served (or withheld, by turning campaigns off), and to observe the ad spend and the response metric at the geo level.
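A simplified sketch of a geo experiment readout under assumptions of my own: geos are assigned to treatment or control, campaigns are turned off in control geos, and we compare the response metric per unit of ad spend. The data, column layout, and helper function are hypothetical, and a real analysis would adjust for pre-period response and quantify uncertainty.

```python
# Toy geo experiment analysis: compare response between treated and held-out geos.
import numpy as np

# Hypothetical geo-level observations: (geo_id, group, ad_spend, response)
geo_data = [
    ("geo-01", "treatment", 12_000.0, 58_000.0),
    ("geo-02", "treatment",  9_500.0, 47_500.0),
    ("geo-03", "control",        0.0, 41_000.0),
    ("geo-04", "control",        0.0, 39_500.0),
]

def group_mean(rows, group, field_index):
    """Mean of one column over the rows belonging to a group."""
    return float(np.mean([r[field_index] for r in rows if r[1] == group]))

lift = group_mean(geo_data, "treatment", 3) - group_mean(geo_data, "control", 3)
spend = group_mean(geo_data, "treatment", 2)
print(f"response lift per geo ≈ {lift:,.0f}, incremental response per $ of spend ≈ {lift / spend:.2f}")
```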
It is important to make clear distinctions among these, and to advance the state of knowledge through concerted observation, modeling, and experimentation. Note also that this account does not involve ambiguity due to statistical uncertainty. We sliced and diced the experimental data in many, many ways.
Cultivating high-performance teams, recruiting leaders, retaining talent, and continuously improving digital KPIs are hallmarks of strong IT cultures, but their metrics lag the CIO's culture-improving programs. When changes are made without transparency or input from the team, it breeds uncertainty and resentment.