article thumbnail

Beyond “Prompt and Pray”

O'Reilly on Data

Your companys AI assistant confidently tells a customer its processed their urgent withdrawal requestexcept it hasnt, because it misinterpreted the API documentation. These are systems that engage in conversations and integrate with APIs but dont create stand-alone content like emails, presentations, or documents.

article thumbnail

The Race For Data Quality in a Medallion Architecture

DataKitchen

Finally, the challenge we are addressing in this document – is how to prove the data is correct at each layer.? Get Off The Blocks Fast: Data Quality In The Bronze Layer Effective Production QA techniques begin with rigorous automated testing at the Bronze layer , where raw data enters the lakehouse environment.

Insiders

Sign Up for our Newsletter

This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.

article thumbnail

Escaping POC Purgatory: Evaluation-Driven Development for AI Systems

O'Reilly on Data

Weve seen this across dozens of companies, and the teams that break out of this trap all adopt some version of Evaluation-Driven Development (EDD), where testing, monitoring, and evaluation drive every decision from the start. What breaks your app in production isnt always what you tested for in dev! The way out?

Testing 174
article thumbnail

5 predictions for emerging ’25 technology trends

CIO Business Intelligence

Advances in AI and ML will automate the compliance, testing, documentation and other tasks which can occupy 40-50% of a developers time. There will be productivity boosts for documentations, test cases the biggest value add immediately is human-in-the-loop internal efficiency use cases.

article thumbnail

Drug Launch Case Study: Amazing Efficiency Using DataOps

DataKitchen

data quality tests every day to support a cast of analysts and customers. DataKitchen loaded this data and implemented data tests to ensure integrity and data quality via statistical process control (SPC) from day one. The numbers speak for themselves: working towards the launch, an average of 1.5

article thumbnail

Generative AI for Farming

O'Reilly on Data

While RAG is conceptually simple—look up relevant documents and construct a prompt that tells the model to build its response from them—in practice, it’s more complex. Keep in mind that, for Digital Green, this problem is both multilingual and multimodal: relevant documents can turn up in any of the languages or modes that they use.

Testing 318
article thumbnail

5 top business use cases for AI agents

CIO Business Intelligence

Meanwhile, in December, OpenAIs new O3 model, an agentic model not yet available to the public, scored 72% on the same test. Mitre has also tested dozens of commercial AI models in a secure Mitre-managed cloud environment with AWS Bedrock. That adds up to millions of documents a month that need to be processed.

Software 143