Statistics, Testing and Uncertainty

Uncertainties: Statistical, Representational, Interventional

The Unofficial Google Data Science Blog

DECEMBER 14, 2021

by AMIR NAJMI & MUKUND SUNDARARAJAN

Data science is about decision making under uncertainty. Some of that uncertainty is the result of statistical inference, i.e., using a finite sample of observations for estimation. But there are other kinds of uncertainty, at least as important, that are not statistical in nature.

Uncertainty Statistics Measurement Cost-Benefit

What you need to know about product management for AI

O'Reilly on Data

MARCH 31, 2020

All you need to know for now is that machine learning uses statistical techniques to give computer systems the ability to “learn” by being trained on existing data. Machine learning adds uncertainty. This has serious implications for software testing, versioning, deployment, and other core development processes.

Management Machine Learning Experimentation Metrics

Regulatory uncertainty overshadows gen AI despite pace of adoption

CIO Business Intelligence

AUGUST 24, 2023

It’s no surprise, then, that according to a June KPMG survey, uncertainty about the regulatory environment was the top barrier to implementing gen AI. So here are some of the strategies organizations are using to deploy gen AI in the face of regulatory uncertainty. “We’re still in the pilot phases of evaluating LLMs,” he says.

Uncertainty Risk Testing Enterprise

Why HR professionals struggle with big data

CIO Business Intelligence

FEBRUARY 20, 2025

In addition, they can use statistical methods, algorithms and machine learning to more easily establish correlations and patterns, and thus make predictions about future developments and scenarios. If a database already exists, the available data must be tested and corrected. Subsequently, the reporting should be set up properly.

Big Data Measurement Visualization Machine Learning

Humans-in-the-loop forecasting: integrating data science and business planning

The Unofficial Google Data Science Blog

DECEMBER 4, 2019

This classification is based on the purpose, horizon, update frequency and uncertainty of the forecast. With those stakes and the long forecast horizon, we do not rely on a single statistical model based on historical trends. A single model may also not shed light on the uncertainty range we actually face.

Forecasting Data Science Statistics Uncertainty

What are decision support systems? Sifting data for better business decisions

CIO Business Intelligence

NOVEMBER 14, 2022

A DSS supports the management, operations, and planning levels of an organization in making better decisions by assessing the significance of uncertainties and the tradeoffs involved in making one decision over another. Commonly used models include: Statistical models. They emphasize access to and manipulation of a model.

Data mining Data-driven Statistics OLAP

Generative AI readiness is shockingly low – these 5 tips will boost it

CIO Business Intelligence

FEBRUARY 12, 2024

As genAI caught fire in 2023, many organizations rushed to test and learn from the technology and harness it to grow productivity and improve processes. Such bleak statistics suggest that indecision around how to proceed with genAI is paralyzing organizations and preventing them from developing strategies that will unlock value.

IT Consulting Cost-Benefit Uncertainty

Hackers beware: Bootstrap sampling may be harmful

Data Science and Beyond

JANUARY 7, 2019

Bootstrap sampling techniques are very appealing, as they don’t require knowing much about statistics and opaque formulas. Instead, all one needs to do is resample the given data many times, and calculate the desired statistics. Don’t compare confidence intervals visually. Pitfall #1: Inaccurate confidence intervals.

Statistics Uncertainty Testing Modeling
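
The resampling procedure described in the bootstrap excerpt above can be sketched in a few lines. This is a minimal illustration of a percentile bootstrap interval, not code from the linked article: the function name `bootstrap_percentile_ci`, its parameters, and the sample values are all assumptions made for the example, and the percentile method is only one of several ways to form a bootstrap interval.

```python
import random
import statistics


def bootstrap_percentile_ci(data, stat=statistics.mean, n_resamples=2000,
                            alpha=0.05, seed=0):
    """Return a (low, high) percentile bootstrap interval for stat(data).

    Hypothetical helper for illustration only.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(data)
    # Resample the data with replacement many times and recompute the statistic.
    estimates = sorted(
        stat([data[rng.randrange(n)] for _ in range(n)])
        for _ in range(n_resamples)
    )
    # Take the empirical alpha/2 and 1 - alpha/2 quantiles of the estimates.
    lo_idx = int((alpha / 2) * n_resamples)
    hi_idx = int((1 - alpha / 2) * n_resamples) - 1
    return estimates[lo_idx], estimates[hi_idx]


# Made-up sample values, used only to exercise the function.
sample = [2.1, 3.4, 1.9, 5.6, 4.2, 3.8, 2.7, 4.9, 3.1, 3.6]
low, high = bootstrap_percentile_ci(sample)
print(f"95% percentile bootstrap interval for the mean: [{low:.2f}, {high:.2f}]")
```

With a sample this small, the interval may be noticeably inaccurate, which is in line with the first pitfall the excerpt names.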

In AI we trust? Why we Need to Talk About Ethics and Governance (part 2 of 2)

Cloudera

DECEMBER 3, 2021

Systems should be designed with bias, causality and uncertainty in mind. Uncertainty is a measure of our confidence in the predictions made by a system. We need to understand and provide the greatest human oversight on systems with the greatest levels of uncertainty. System Design. Human Judgement & Oversight.

Uncertainty Measurement Metrics Risk

11 dark secrets of data management

CIO Business Intelligence

JUNE 28, 2022

More importantly, we also have statistical models that draw error bars that delineate the limits of our analysis. Others are philosophical, testing our ability to reason about abstract qualities. Good data scientists can also reduce some of this uncertainty through cleansing.

Management Internet of Things Statistics Data-driven

The Lean Analytics Cycle: Metrics > Hypothesis > Experiment > Act

Occam's Razor

APRIL 8, 2013

Sometimes, we escape the clutches of this suboptimal existence and do pick good metrics or engage in simple A/B testing. Testing out a new feature. Identify, hypothesize, test, react. But at the same time, they had to have a real test of an actual feature. You don’t need a beautiful beast to go out and test.

Metrics KPI Analytics Key Performance Indicator

Belcorp reimagines R&D with AI

CIO Business Intelligence

JUNE 28, 2023

“These circumstances have induced uncertainty across our entire business value chain,” says Venkat Gopalan, chief digital, data and technology officer, Belcorp. “As […] That, in turn, led to a slew of manual processes to make descriptive analysis of the test results. This allowed us to derive insights more easily.”

Digital Transformation Cost-Benefit Informatics Data mining

Quantitative and Qualitative Data: A Vital Combination

Sisense

OCTOBER 6, 2020

Most commonly, we think of data as numbers that show information such as sales figures, marketing data, payroll totals, financial statistics, and other data that can be counted and measured objectively. All descriptive statistics can be calculated using quantitative data. Digging into quantitative data. This is quantitative data.

Statistics Unstructured Data Data-driven Visualization

Changing assignment weights with time-based confounders

The Unofficial Google Data Science Blog

JULY 22, 2020

For example, imagine a fantasy football site is considering displaying advanced player statistics. A ramp-up strategy may mitigate the risk of upsetting the site’s loyal users who perhaps have strong preferences for the current statistics that are shown. We offer two examples where this may be the case.

Experimentation Statistics Testing Knowledge Discovery

Generative AI that’s tailored for your business needs with watsonx.ai

IBM Big Data Hub

SEPTEMBER 28, 2023

Based on initial IBM Research evaluations and testing, across 11 different financial tasks, the results show that by training Granite-13B models with high-quality finance data, they are some of the top performing models on finance tasks, and have the potential to achieve either similar or even better performance than much larger models.

Testing Finance Enterprise Modeling

Towards optimal experimentation in online systems

The Unofficial Google Data Science Blog

APRIL 23, 2024

If $Y$ at that point is (statistically and practically) significantly better than our current operating point, and that point is deemed acceptable, we update the system parameters to this better value. Crucially, it takes into account the uncertainty inherent in our experiments.

Experimentation Optimization Uncertainty Metrics

4 Ways to Attract Top Talent by Combating Job Seeker’s Fears

Insight

MAY 15, 2020

In this time of terrifying uncertainty, some might focus on their own career journey over others. Wishlists are especially off-putting to women, who statistically will only apply to job opportunities if they meet 100% of the listed requirements, versus men applying when they meet 60%. However, many are looking out for their colleagues.

Uncertainty Data Science Statistics Marketing

Product Management for AI

Domino Data Lab

JUNE 23, 2019

As a result, Skomoroch advocates getting “designers and data scientists, machine learning folks together and using real data and prototyping and testing” as quickly as possible. As quickly as possible, you want to get designers and data scientists, machine learning folks together and using real data and prototyping and testing.

Management Machine Learning Experimentation Metrics

Tackling changed requirements with comprehensive modernization

BI-Survey

FEBRUARY 14, 2022

Overnight, the impact of uncertainty, dynamics and complexity on markets could no longer be ignored. Local events in an increasingly interconnected economy and uncertainties such as the climate crisis will continue to create high volatility and even chaos. The COVID-19 pandemic caught most companies unprepared.

Forecasting Uncertainty Measurement Cost-Benefit

Getting ready for artificial general intelligence with examples

IBM Big Data Hub

APRIL 18, 2024

LLMs like ChatGPT are trained on massive amounts of text data, allowing them to recognize patterns and statistical relationships within language. The AGI would need to handle uncertainty and make decisions with incomplete information. NLP techniques help them parse the nuances of human language, including grammar, syntax and context.

Cost-Benefit Manufacturing Modeling Interactive

Using random effects models in prediction problems

The Unofficial Google Data Science Blog

MARCH 31, 2016

We often use statistical models to summarize the variation in our data, and random effects models are well suited for this — they are a form of ANOVA after all. In the context of prediction problems, another benefit is that the models produce an estimate of the uncertainty in their predictions: the predictive posterior distribution.

Modeling Statistics Advertising Testing

Estimating causal effects using geo experiments

The Unofficial Google Data Science Blog

MAY 31, 2016

Similarly, we could test the effectiveness of a search ad compared to showing only organic search results. Structure of a geo experiment A typical geo experiment consists of two distinct time periods: pretest and test. After the test period finishes, the campaigns in the treatment group are reset to their original configurations.

Advertising Testing Sales Statistics

Data scientist as scientist

The Unofficial Google Data Science Blog

OCTOBER 21, 2015

The beliefs of this community are always evolving, and the process of thoughtfully generating, testing, refuting and accepting ideas looks a lot like Science. Note also that this account does not involve ambiguity due to statistical uncertainty. (… the power grid, a streaming music service, the human body, the weather).

Slice and Dice Experimentation Data-driven Data Science

The trinity of errors in applying confidence intervals: An exploration using Statsmodels

O'Reilly on Data

DECEMBER 9, 2019

Because of this trifecta of errors, we need dynamic models that quantify the uncertainty inherent in our financial estimates and predictions. Practitioners in all social sciences, especially financial economics, use confidence intervals to quantify the uncertainty in their estimates and predictions.

Statistics Uncertainty Risk Marketing

COVID-19 Data Spreads like a Virus

Juice Analytics

APRIL 17, 2020

He goes on to clarify by saying that some estimate that 80% of the cases don’t get tested, so we don’t actually know the real number of cases, but it’s likely very much higher. Forecasts are built by experts, using lots of assumptions, based on very complex and specifically applied statistical models.

Forecasting Snapshot Dashboards Uncertainty

Variance and significance in large-scale online services

The Unofficial Google Data Science Blog

JANUARY 14, 2016

Unlike experimentation in some other areas, LSOS experiments present a surprising challenge to statisticians — even though we operate in the realm of “big data”, the statistical uncertainty in our experiments can be substantial. We must therefore maintain statistical rigor in quantifying experimental uncertainty.

Experimentation Statistics Metrics Measurement

Measuring Validity and Reliability of Human Ratings

The Unofficial Google Data Science Blog

JULY 18, 2023

Editor's note: The relationship between reliability and validity is somewhat analogous to that between the notions of statistical uncertainty and representational uncertainty introduced in an earlier post. But for more complicated metrics like xRR, our preference is to bootstrap when measuring uncertainty.

Measurement Metrics Uncertainty Slice and Dice

Misadventures in experiments for growth

The Unofficial Google Data Science Blog

APRIL 16, 2019

Such decisions involve an actual hypothesis test on specific metrics (e.g. …). On the other hand, fledgling products often have neither the statistical power to identify the effects of small incremental changes, nor the luxury to contemplate small improvements. Are the potential improvements realized and worthwhile?

Experimentation Sales Metrics Measurement

Predicting Movie Profitability and Risk at the Pre-production Phase

Insight

FEBRUARY 19, 2020

I held out 20% of this as a test set and used the remainder for training and validation. Below is the result of a single XGBoost model trained on 80% of the data and tested on the unseen held-out 20%. Scatterplot of the predicted ROI vs. the true ROI for the hold-out test set. Even then, some manual cleaning was needed (e.g.,

Risk ROI Modeling Metrics
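
The 80/20 hold-out split mentioned in the excerpt above can be illustrated as follows. This is only a sketch of the splitting step under stated assumptions: the helper `holdout_split`, its parameters, and the placeholder rows are hypothetical, and the linked article's own code (which trains an XGBoost model) is not reproduced here.

```python
import random


def holdout_split(rows, test_fraction=0.2, seed=42):
    """Shuffle rows and return (train, test). Hypothetical helper for illustration."""
    rng = random.Random(seed)      # fixed seed so the split is reproducible
    shuffled = list(rows)          # copy, so the caller's data is left untouched
    rng.shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    # The first n_test shuffled rows form the held-out test set;
    # the remainder is available for training and validation.
    return shuffled[n_test:], shuffled[:n_test]


# Placeholder rows standing in for the real records.
rows = list(range(100))
train, test = holdout_split(rows)
print(len(train), len(test))
```

Keeping the test rows completely out of training and validation is what makes the reported hold-out performance an estimate on unseen data.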

Misleading Statistics Examples – Discover The Potential For Misuse of Statistics & Data In The Digital Age

datapine

DECEMBER 28, 2021

1) What Is A Misleading Statistic? 2) Are Statistics Reliable? 3) Misleading Statistics Examples In Real Life. 4) How Can Statistics Be Misleading. 5) How To Avoid & Identify The Misuse Of Statistics? If all this is true, what is the problem with statistics? What Is A Misleading Statistic?

Statistics Advertising Visualization Data mining

Take Advantage Of The Best Interactive & Effective Data Visualization Examples

datapine

SEPTEMBER 4, 2023

Your Chance: Want to test a powerful data visualization software? The pioneering visual is regarded by many as the “greatest statistical graphic ever drawn,” and while such a statement is subjective, it’s nothing short of inspirational.

Interactive Visualization Cost-Benefit Dashboards

Quantifying the statistical skills needed to be a Google Data Scientists

The Unofficial Google Data Science Blog

MARCH 24, 2025

This role has several explicit requirements including statistical expertise, programming/ML, communication, data analysis/intuition. Focusing narrowly on the first of these, the description currently states that candidates will bring scientific rigor and statistical methods to the challenges of product creation.

Statistics Testing Interactive Sales

A Field Guide to Rapidly Improving AI Products

O'Reilly on Data

APRIL 15, 2025

Instead of reaching for new tools, they looked at actual conversation logs, categorized the types of date-handling failures, built specific tests to catch these issues, and measured improvement on these metrics. The result? LLMs can generate realistic test cases that cover the range of scenarios your AI will encounter.

Experimentation Testing Metrics Measurement

Decision-Making in a Time of Crisis

O'Reilly on Data

JUNE 16, 2020

We know, statistically, that doubling down on an 11 is a good (and common) strategy in blackjack. But when making a decision under uncertainty about the future, two things dictate the outcome: (1) the quality of the decision and (2) chance. Mike had made the common error of equating a bad outcome with a bad decision.

Uncertainty Testing Risk Reporting
