
Escaping POC Purgatory: Evaluation-Driven Development for AI Systems

O'Reilly on Data

Let's be real: building LLM applications today feels like purgatory. We've seen this across dozens of companies, and the teams that break out of this trap all adopt some version of Evaluation-Driven Development (EDD), where testing, monitoring, and evaluation drive every decision from the start. Leadership gets excited.
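As a rough illustration of what "evaluation drives every decision" can mean in practice, here is a minimal sketch in Python. The generate_answer() stub, the test cases, and the pass-rate threshold are all hypothetical placeholders, not the article's implementation.

```python
# Minimal sketch of an evaluation-driven gate, assuming a hypothetical
# generate_answer() wrapper around the application's LLM call. The cases
# and the 90% threshold are illustrative placeholders.

EVAL_CASES = [
    {"prompt": "What is the refund window?", "must_include": "30 days"},
    {"prompt": "Which plan includes SSO?", "must_include": "Enterprise"},
]

def generate_answer(prompt: str) -> str:
    # Placeholder: replace with the real LLM call used by the application.
    return "Refunds are accepted within 30 days; SSO requires the Enterprise plan."

def eval_pass_rate() -> float:
    """Fraction of cases whose answer contains the expected fact."""
    passed = sum(
        1 for case in EVAL_CASES
        if case["must_include"].lower() in generate_answer(case["prompt"]).lower()
    )
    return passed / len(EVAL_CASES)

if __name__ == "__main__":
    rate = eval_pass_rate()
    print(f"Eval pass rate: {rate:.0%}")
    # In an EDD workflow, a check like this runs before any prompt or model change ships.
    assert rate >= 0.9, "Pass rate below threshold; block the release"
```

The point of a gate like this is not the specific metric but that it exists before the feature does, so every prompt, model, or retrieval change is judged against it.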


How to Evaluate a Large Language Model (LLM)?

Analytics Vidhya

Introduction With the release of ChatGPT and other large language models (LLMs), there has been a significant increase in the number of models available. New LLMs are being released every other day. Despite this, there is still no fixed or standardized way to evaluate the quality of these large language models.


Moving Beyond Guesswork: How to Evaluate LLM Quality

Dataiku

Ninety percent of leaders are already investing in Generative AI in some way, but there's a common challenge: How can you objectively measure whether an LLM's output is actually "good enough"? For instance, imagine you’re using an LLM to power a conversational Q&A chatbot.
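One hedged sketch of how "good enough" can be made measurable for such a chatbot: score each answer against a reference answer with a simple token-overlap F1 and apply a threshold. The questions, reference answers, and the 0.5 cutoff below are illustrative assumptions, not Dataiku's method.

```python
# Score chatbot answers against reference answers with token-overlap F1.
# Questions, references, and the acceptance threshold are placeholders.
import re

def tokens(text: str) -> list[str]:
    return re.findall(r"[a-z0-9]+", text.lower())

def token_f1(prediction: str, reference: str) -> float:
    pred, ref = set(tokens(prediction)), set(tokens(reference))
    if not pred or not ref:
        return 0.0
    common = pred & ref
    if not common:
        return 0.0
    precision = len(common) / len(pred)
    recall = len(common) / len(ref)
    return 2 * precision * recall / (precision + recall)

qa_pairs = [
    ("How do I reset my password?",
     "Go to Settings, choose Security, then click Reset password.",   # chatbot answer
     "Open Settings > Security and select Reset password."),          # reference answer
]

for question, answer, reference in qa_pairs:
    score = token_f1(answer, reference)
    verdict = "good enough" if score >= 0.5 else "needs review"
    print(f"{question} -> F1={score:.2f} ({verdict})")
```

Lexical overlap is a crude proxy; the same harness could swap in semantic similarity or an LLM-as-judge score, but the structure stays the same: a reference set, a metric, and an explicit bar for "good enough".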


Beyond the hype: Do you really need an LLM for your data?

CIO Business Intelligence

The hype around large language models (LLMs) is undeniable. They promise to revolutionize how we interact with data, generating human-quality text, understanding natural language and transforming data in ways we never thought possible. In life sciences, LLMs can analyze mountains of research papers to accelerate drug discovery.


The New O’Reilly Answers: The R in “RAG” Stands for “Royalties”

O'Reilly on Data

And what will happen to the quality of content in a future of LLMs? RAG engines are not generative AI models so much as they are directed reasoning systems and pipelines that use generative LLMs to create answers grounded in sources. Can hallucinations really be controlled? It is possible.
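A minimal sketch of that retrieve-then-generate shape, with a toy keyword retriever and a stubbed llm_generate(); both are illustrative assumptions, not the actual O'Reilly Answers implementation.

```python
# Retrieve-then-generate (RAG) pipeline sketch: rank documents against the
# query, then pass the retrieved context to a (stubbed) generator.

DOCUMENTS = [
    {"id": "book-a-ch1", "text": "Retrieval-augmented generation grounds answers in source documents."},
    {"id": "book-b-ch4", "text": "Evaluation-driven development uses tests to guide LLM application changes."},
]

def retrieve(query: str, k: int = 1) -> list[dict]:
    # Toy scoring: rank documents by word overlap with the query.
    words = set(query.lower().split())
    ranked = sorted(
        DOCUMENTS,
        key=lambda d: len(words & set(d["text"].lower().split())),
        reverse=True,
    )
    return ranked[:k]

def llm_generate(prompt: str) -> str:
    # Placeholder for a real LLM call; here it just echoes the grounded context.
    return "Grounded answer: " + prompt.split("Context:")[-1].strip()

def answer(query: str) -> str:
    sources = retrieve(query)
    context = " ".join(doc["text"] for doc in sources)
    prompt = f"Question: {query}\nContext: {context}"
    # Keeping the source ids alongside the answer is what makes attribution
    # (and, in principle, royalty accounting) possible downstream.
    return f"{llm_generate(prompt)} [sources: {', '.join(d['id'] for d in sources)}]"

print(answer("What does retrieval-augmented generation do?"))
```

Because every answer carries its retrieved sources, the pipeline can both constrain hallucination and attribute the content it draws on.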


Synthetic data’s fine line between reward and disaster

CIO Business Intelligence

Up to 20% of the data used for training AI is already synthetic, that is, generated rather than obtained by observing the real world, with LLMs using millions of synthesized samples. Technically, though, any output you get from an LLM is synthetic data.


What’s Next for AI and Sales?

David Menninger's Analyst Perspectives

The mathematics was sound, the demos impressive, yet adoption faltered because little thought was given to how sellers should use this information. The root cause of the problem came down to data quality. Yet the success of any agent, no matter how sophisticated, depends on the depth and accuracy of the information it ingests.
