What breaks your app in production isn't always what you tested for in dev! The way out? We've seen this across dozens of companies, and the teams that break out of this trap all adopt some version of Evaluation-Driven Development (EDD), where testing, monitoring, and evaluation drive every decision from the start.
Product Managers are responsible for the successful development, testing, release, and adoption of a product, and for leading the team that implements those milestones. Without clarity in metrics, it’s impossible to do meaningful experimentation. When a measure becomes a target, it ceases to be a good measure (Goodhart’s Law).
Since you're reading a blog on advanced analytics, I'm going to assume that you have been exposed to the magical and amazing awesomeness of experimentation and testing. And yet, chances are you really don’t know anyone directly who uses experimentation as a part of their regular business practice. Wah wah wah waaah.
This post is a primer on the delightful world of testing and experimentation (A/B, Multivariate, and a new term from me: Experience Testing). Experimentation and testing help us figure out where we are wrong, quickly and repeatedly, and if you think about it, that is a great thing for our customers and for our employers.
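As a hedged illustration of the simplest case, an A/B test on conversion rate can be evaluated with a two-proportion z-test; the traffic counts below are made-up numbers, not data from any post excerpted here.

```python
from math import sqrt
from statistics import NormalDist

def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Two-sided z-test for a difference in conversion rates."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    p_pool = (conv_a + conv_b) / (n_a + n_b)          # pooled rate under H0
    se = sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value

# Hypothetical traffic: 10,000 visitors per arm.
z, p = two_proportion_z_test(conv_a=500, n_a=10_000, conv_b=560, n_b=10_000)
print(f"z = {z:.2f}, p = {p:.4f}")  # reject H0 at the 0.05 level if p < 0.05
```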
AI PMs should enter feature development and experimentation phases only after deciding what problem they want to solve as precisely as possible, and placing the problem into one of these categories. Experimentation: It’s just not possible to create a product by building, evaluating, and deploying a single model.
Balancing the rollout with proper training, adoption, and careful measurement of costs and benefits is essential, particularly while securing company assets in tandem, says Ted Kenney, CIO of tech company Access. Our success will be measured by user adoption, a reduction in manual tasks, and an increase in sales and customer satisfaction.
ML apps need to be developed through cycles of experimentation: due to the constant exposure to data, we don’t learn the behavior of ML apps through logical reasoning but through empirical observation. An Overarching Concern: Correctness and Testing. This approach is not novel. Why did something break? Who did what and when?
Encouraging (and rewarding) a culture of experimentation across the organization. Keep it agile, with short design, develop, test, release, and feedback cycles: keep it lean, and build on incremental changes. Test early and often. Encourage and reward a culture of experimentation that learns from failure: “Test, or get fired!”
Leading expert Ronny Kohavi, drawing from his 20+ years of experience, will walk you through the ins and outs of experimentation, identifying key insights and working through live demos in his live course, Accelerating Innovation with A/B Testing, starting January 30th.
Testing and Data Observability. It orchestrates complex pipelines, toolchains, and tests across teams, locations, and data centers. Prefect Technologies — Open-source data engineering platform that builds, tests, and runs data workflows. Production Monitoring and Development Testing.
This: you understand all the environmental variables currently in play, you carefully choose more than one group of "like type" subjects, you expose them to a different mix of media, you measure differences in outcomes, and you prove/disprove your hypothesis (DO FACEBOOK NOW!!!). The nice thing is that you can also test that!
Centralizing analytics helps the organization standardize enterprise-wide measurements and metrics. Develop/execute regression testing. Test data management and other functions provided "as a service". Central DataOps process measurement function with reports. Agile ticketing/Kanban tools. Deploy to production.
Proof that even the most rigid of organizations are willing to explore generative AI arrived this week when the US Department of the Air Force (DAF) launched an experimental initiative aimed at Guardians, Airmen, civilian employees, and contractors.
Two years of experimentation may have given rise to several valuable use cases for gen AI, but during the same period, IT leaders have also learned that the new, fast-evolving technology isn't something to jump into blindly. The next thing is to make sure they have an objective way of testing the outcome and measuring success.
Technical sophistication: Sophistication measures a team’s ability to use advanced tools and techniques. Technical competence: Competence measures a team’s ability to successfully deliver on initiatives and projects. They’re not new to the field; they’ve solved problems and have discovered what does and doesn’t work.
Fractal’s recommendation is to take an incremental, test and learn approach to analytics to fully demonstrate the program value before making larger capital investments. A properly set framework will ensure quality, timeliness, scalability, consistency, and industrialization in measuring and driving the return on investment.
This has serious implications for software testing, versioning, deployment, and other core development processes. The need for an experimental culture implies that machine learning is currently better suited to the consumer space than it is to enterprise companies.
In Bringing an AI Product to Market , we distinguished the debugging phase of product development from pre-deployment evaluation and testing. During testing and evaluation, application performance is important, but not critical to success. require not only disclosure, but also monitored testing. Debugging AI Products.
Mostly because short-term goals drive a lot of what we do, and if you are selling something on your website, it seems only logical to measure conversion rate and push it up as high as we can, as fast as we can. Even though we should not obsess about conversion rate, we do. So measure the Bounce Rate of your website as well.
Sometimes we escape the clutches of this suboptimal existence and do pick good metrics or engage in simple A/B testing. First, you figure out what you want to improve; then you create an experiment; then you run the experiment; then you measure the results and decide what to do. Testing out a new feature? Form a hypothesis.
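To make the "create an experiment" step concrete, here is a minimal sketch of the sample-size calculation that usually precedes it, using the standard normal-approximation formula for comparing two proportions; the baseline rate and minimum detectable effect are hypothetical inputs.

```python
from math import ceil
from statistics import NormalDist

def sample_size_per_arm(p_base, mde, alpha=0.05, power=0.8):
    """Visitors needed per arm to detect an absolute lift `mde`
    over baseline rate `p_base` (two-sided test, normal approximation)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    p1, p2 = p_base, p_base + mde
    var = p1 * (1 - p1) + p2 * (1 - p2)
    return ceil(var * ((z_alpha + z_beta) / mde) ** 2)

# Hypothetical: 5% baseline conversion, detect a 1-point absolute lift.
print(sample_size_per_arm(p_base=0.05, mde=0.01))  # visitors per arm
```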
DataOps enables: Rapid experimentation and innovation for the fastest delivery of new insights to customers. Clear measurement and monitoring of results. Instead of focusing on a narrowly defined task with minimal testing and feedback, DataOps focuses on adding value. Create tests. Measure success. Low error rates.
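As a hedged sketch of what "create tests" can mean in a DataOps pipeline, the checks below assert basic invariants on a batch of records before it is published; the field names and checks are illustrative, not from any specific toolchain.

```python
def run_data_tests(rows: list[dict]) -> list[str]:
    """Return failed-test messages for one batch (empty list = pass).
    Assumes each row carries an `order_id` and an `amount` field."""
    failures = []
    if not rows:
        return ["batch is empty"]
    if any(r.get("order_id") is None for r in rows):
        failures.append("null order_id found")          # completeness test
    if any(r["amount"] < 0 for r in rows):
        failures.append("negative amount found")        # validity test
    ids = [r["order_id"] for r in rows]
    if len(ids) != len(set(ids)):
        failures.append("duplicate order_id found")     # uniqueness test
    return failures

batch = [{"order_id": 1, "amount": 9.99}, {"order_id": 2, "amount": 14.50}]
assert run_data_tests(batch) == []   # gate the deploy on passing tests
```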
But continuous deployment isn’t always appropriate for your business, stakeholders don’t always understand the costs of implementing robust continuous testing, and end-users don’t always tolerate frequent app deployments during peak usage. CrowdStrike recently made the news when a failed deployment impacted 8.5 million Windows devices.
Key To Your Digital Success: Web Analytics Measurement Model. Measuring Incrementality: Controlled Experiments to the Rescue! Barriers To An Effective Web Measurement Strategy [+ Solutions!]. Measuring Online Engagement: What Role Does Web Analytics Play? How Do I Measure Success?
Here $X$ might be a system parameter (e.g., the weight given to Likes in our video recommendation algorithm) while $Y$ is a vector of outcome measures, such as different metrics of user experience. Taking measurements at parameter settings further from the control setting leads to a lower-variance estimate of the slope of the line relating the metric to the parameter.
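A minimal sketch of that slope estimate, assuming a single parameter x and a single metric y measured in several experiment arms (the arm offsets and metric values below are invented): the ordinary-least-squares standard error of the slope shrinks as the x values spread further from the control setting.

```python
import numpy as np

# Hypothetical arms: parameter setting (x) and measured metric (y) per arm.
x = np.array([-2.0, -1.0, 0.0, 1.0, 2.0])   # offsets from the control setting
y = np.array([9.1, 9.6, 10.0, 10.3, 10.9])  # metric observed in each arm

slope, intercept = np.polyfit(x, y, 1)       # fit y = slope * x + intercept
residuals = y - (slope * x + intercept)
# OLS standard error of the slope: wider spread in x -> smaller SE.
se_slope = np.sqrt(residuals.var(ddof=2) / ((x - x.mean()) ** 2).sum())
print(f"slope = {slope:.3f} +/- {se_slope:.3f}")
```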
Pilots can offer value beyond just experimentation, of course. McKinsey reports that industrial design teams using LLM-powered summaries of user research and AI-generated images for ideation and experimentation sometimes see a reduction upward of 70% in product development cycle times. What are you measuring?
While the focus at these three levels differs, CIOs should provide a consistent definition of high performance and how it’s measured. Emerging leaders who may be agile team leaders and product owners should prioritize developing business acumen and improving facilitation skills to lead self-organizing teams.
Experimentation on networks: A/B testing is a standard method of measuring the effect of changes by randomizing samples into different treatment groups, but when users are connected, this could create confusion. We present data from Google Cloud Platform (GCP) as an example of how we use A/B testing when users are connected.
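A common way to implement the randomization the excerpt describes is deterministic hashing of a user ID, so each user always lands in the same arm across sessions; this is a generic sketch, not GCP's actual assignment code.

```python
import hashlib

def assign_arm(user_id: str, experiment: str,
               arms=("control", "treatment")) -> str:
    """Deterministically map a user to an arm via a salted hash."""
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode()).hexdigest()
    bucket = int(digest, 16) % 1000          # 1000 fine-grained buckets
    return arms[bucket * len(arms) // 1000]  # split buckets evenly across arms

print(assign_arm("user-42", "new-ranker"))  # same user, same arm, every call
```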
Unmonitored AI tools can lead to decisions or actions that undermine regulatory and corporate compliance measures, particularly in sectors where data handling and processing are tightly regulated, such as finance and healthcare. Review and integrate successful experimental AI projects into the company’s main operational framework.
Early use cases include code generation and documentation, test case generation and test automation, as well as code optimization and refactoring, among others. The maturity of any development organization can easily be measured in terms of the size and type of investment made in QA,” he says.
Another reason to use ramp-up is to test whether a website's infrastructure can handle deploying a new arm to all of its users. The website wants to make sure it has the infrastructure to handle the feature while testing whether engagement increases enough to justify the infrastructure. We offer two examples where this may be the case.
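A hedged sketch of the ramp-up idea: exposure is capped by a stage-dependent percentage, reusing the same deterministic bucketing so users already exposed stay exposed as the ramp widens. The stage percentages are illustrative, not from the excerpted post.

```python
import hashlib

RAMP_SCHEDULE = [1, 5, 25, 50, 100]  # percent of users exposed per stage

def in_rollout(user_id: str, feature: str, stage: int) -> bool:
    """True if this user falls inside the current ramp percentage."""
    digest = hashlib.sha256(f"{feature}:{user_id}".encode()).hexdigest()
    bucket = int(digest, 16) % 100               # stable bucket in [0, 100)
    return bucket < RAMP_SCHEDULE[stage]         # monotone: stays in as ramp grows

# Stage 1 -> 5% of users see the new arm while infrastructure is watched.
print(in_rollout("user-42", "new-feed", stage=1))
```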
Researchers/scientists perform experiments to validate their hypotheses or to test a new product. Suppose we want to test the effectiveness of a new drug against a particular disease. Reliability: it means measurements should have repeatable results. For example, you measure the blood pressure of a person.
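To ground the drug example, here is a minimal two-sample t-test sketch comparing treatment and control outcomes; the measurements are fabricated for illustration and the sketch assumes scipy is available.

```python
from scipy import stats

# Hypothetical blood-pressure reductions (mmHg) after the trial period.
treatment = [12.1, 9.8, 14.3, 11.0, 13.5, 10.2, 12.8, 9.5]
control = [4.2, 5.1, 3.8, 6.0, 4.9, 5.5, 3.2, 4.7]

# Welch's t-test: does not assume equal variances between groups.
t_stat, p_value = stats.ttest_ind(treatment, control, equal_var=False)
print(f"t = {t_stat:.2f}, p = {p_value:.4f}")  # small p -> effect unlikely by chance
```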
Too many new things are happening too fast and those of us charged with measuring it have to change the wheels while the bicycle is moving at 30 miles per hour (and this bicycle will become a car before we know it – all while it keeps moving, ever faster). Usually at least a test. And I doubt it is going to happen soon.
On one hand, they must foster an environment encouraging innovation, allowing for experimentation, evaluation, and learning with new technologies. This structured approach allows for controlled experimentation while mitigating the risks of over-adoption or dependency on unproven technologies. Assume unknown unknowns.
Certifications measure your knowledge and skills against industry- and vendor-specific benchmarks to prove to employers that you have the right skillset. Organization: AWS Price: US$300 How to prepare: Amazon offers free exam guides, sample questions, practice tests, and digital training.
Tokens: ChatGPT’s sense of “context”—the amount of text that it considers when it’s in conversation—is measured in “tokens,” which are also used for billing. Tokens are significant parts of a word. It’s by far the most convincing example of a conversation with a machine; it has certainly passed the Turing test.
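A hedged sketch of counting tokens for billing purposes, assuming the tiktoken library is installed and that the cl100k_base encoding (used by several OpenAI chat models) applies to your model.

```python
import tiktoken  # assumption: installed via `pip install tiktoken`

enc = tiktoken.get_encoding("cl100k_base")  # encoding used by several chat models
text = "Tokens are significant parts of a word."
tokens = enc.encode(text)                   # list of integer token IDs
print(len(tokens))                          # billing is roughly per token,
                                            # counting prompt plus completion
```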
If you have evolved to a stage that you need behavior targeting then get Omniture Test and Target or Sitespect. You'll measure Task Completion Rate in 4Q (below). You'll measure Share of Search using Insights for Search (below). Experimentation and Testing Tools [The "Why" – Part 1].
You just have to have the right mental model (see Seth Godin above) and you have to… wait for it… wait for it… measure everything you do! For everything you do, it is important to measure the effectiveness of all three phases of your effort: Acquisition. You’re trying to measure how well you are doing to: Send emails.
Phase 0 is the first to involve human testing. Phase I involves dialing-in the proper dosage and further testing in a larger patient pool. An open and impartial AI model should be able to inject a measure of transparency into this process along with the obvious efficiency advantages.
Start with measuring these Outcomes metrics (revenue, leads, profit margins, improved product mix, number of new customers, etc.). Be incessantly focused on your company’s customers, dragging their voice to the table (for example, via experimentation and testing, or via open-ended survey questions). Reporting is not analysis.
Making that available across the division will spur more robust experimentation and innovation, he notes. In the meantime, as enterprises move toward more advanced development of gen AI models, CIOs will have a lot to manage in terms of vendor partnerships, procurement, costs, development, measuring outcomes, and security.
Deploy a dense vector model: To get more valuable test results, we selected Cohere-embed-multilingual-v3.0, which is one of several popular models used in production for dense vectors. Experimental data selection: For retrieval evaluation, we used the datasets from BeIR. How to combine dense and sparse?
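One common answer to "how to combine dense and sparse" is reciprocal rank fusion (RRF); this is a generic sketch under that assumption, not necessarily the method the excerpted post used, and the document IDs are hypothetical.

```python
def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Fuse several ranked doc-ID lists; k=60 is the conventional constant."""
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

dense = ["d3", "d1", "d7"]   # hypothetical dense-vector ranking
sparse = ["d1", "d5", "d3"]  # hypothetical sparse/BM25 ranking
print(reciprocal_rank_fusion([dense, sparse]))  # d1 and d3 rise to the top
```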
by HENNING HOHNHOLD, DEIRDRE O'BRIEN, and DIANE TANG. In this post we discuss the challenges in measuring and modeling the long-term effect of ads on user behavior. A/B testing is used widely in information technology companies to guide product development and improvements.
Transformational leaders must ensure their organizations have the expertise to integrate new technologies effectively and the follow-through to test and troubleshoot thoroughly before going live. Leaders must clearly define what they want to achieve through digital transformation and how they plan to do it.
As today’s great leaders recognize, true success is not solely measured by the bottom line but also by the impact a business has on its stakeholders, including employees, partners, and the environment. Here are some ways leaders can cultivate innovation: Build a culture of experimentation. Invest in technology. Use data and metrics.