Enterprise data is brought into data lakes and data warehouses to carry out analytical, reporting, and data science use cases using AWS analytical services like Amazon Athena, Amazon Redshift, Amazon EMR, and so on. We use Anthropic’s Claude 2.1 foundation model (FM) in Amazon Bedrock as the LLM.
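As a minimal sketch of that last piece (the region, prompt, and generation parameters below are placeholders, not details from the post), invoking Claude 2.1 through Amazon Bedrock from Python looks roughly like this:

```python
import json
import boto3

# Sketch: call Anthropic's Claude 2.1 via the Bedrock runtime API.
# The request body follows Bedrock's Claude text-completions format.
bedrock = boto3.client("bedrock-runtime", region_name="us-east-1")

body = json.dumps({
    "prompt": "\n\nHuman: Summarize the key tables in our sales schema.\n\nAssistant:",
    "max_tokens_to_sample": 512,
    "temperature": 0.2,
})

response = bedrock.invoke_model(
    modelId="anthropic.claude-v2:1",
    body=body,
    contentType="application/json",
    accept="application/json",
)
print(json.loads(response["body"].read())["completion"])
```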
Metadata used to be a secret shared between system programmers and the data. Metadata described the data in terms of cardinality, data types such as strings vs integers, and primary or foreign key relationships. Inevitably, the information that could and needed to be expressed by metadata increased in complexity.
Amazon Redshift, launched in 2013, has undergone significant evolution since its inception, allowing customers to expand the horizons of data warehousing and SQL analytics. Industry-leading price-performance: Amazon Redshift offers up to three times better price-performance than alternative cloud data warehouses.
Introduction: In the real world, obtaining high-quality annotated data remains a challenge. This blog post summarizes our findings, focusing on NER as a first-step key task for knowledge extraction. Data: In Natural Language Processing (NLP), domain-specific knowledge plays a crucial role in the accuracy of tasks like NER.
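For a feel of the task itself, here is a minimal NER sketch using spaCy’s general-purpose English model (an assumption for illustration; domain-specific NER typically needs a model fine-tuned on annotated in-domain data):

```python
import spacy

# Requires the small English model: python -m spacy download en_core_web_sm
nlp = spacy.load("en_core_web_sm")

doc = nlp("Amazon Redshift was launched by AWS in 2013 in Seattle.")
for ent in doc.ents:
    print(ent.text, ent.label_)  # e.g. "AWS" ORG, "2013" DATE
```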
There are countless examples of big data transforming many different industries. There is no disputing the fact that the collection and analysis of massive amounts of unstructured data has been a huge breakthrough. This is something that you can learn more about in just about any technology blog.
Producing insights from raw data is a time-consuming process. The importance of exploratory analytics in the data science lifecycle: exploratory analysis is a critical component of the data science lifecycle, and it is inherently iterative and difficult to scope.
This blog post is co-written with Raj Samineni from ATPCO. In today’s data-driven world, companies across industries recognize the immense value of data in making decisions, driving innovation, and building new products to serve their customers.
Next, I will explain how knowledge graphs help them get a unified view of data derived from multiple sources and gain richer insights in less time.
Many data catalog initiatives fail. According to the latest report from Eckerson Group, Deep Dive on Data Catalogs, shoppers must match the goals of their organizations to the capabilities of their chosen catalog. A data catalog’s approach is key. Finding a trustworthy asset in a sea of data can take analysts months.
When I think about unstructured data, I see my colleague Rob Gerbrandt (an information governance genius) walking into a customer’s conference room where tubes of core samples line three walls. While most of us would see dirt and rock, Rob sees unstructured data. have encouraged the creation of unstructured data.
Data modeling supports collaboration among business stakeholders – with different job roles and skills – to coordinate with business objectives. Data resides everywhere in a business, on-premises and in private or public clouds. A single source of data truth helps companies begin to leverage data as a strategic asset.
We started with our marketing content and quickly expanded that to also integrate a set of workflows for data and content management. Through Ontotext Metadata Studio (OMDS), we then apply semantic content enrichment using text analysis based on our marketing vocabularies.
Year after year, IBM Consulting works with the United States Tennis Association (USTA) to transform massive amounts of data into meaningful insight for tennis fans. This year, the USTA is using watsonx, IBM’s new AI and data platform for business.
AI and machine learning are the future of every industry, especially data and analytics. Reading through the Gartner Top 10 Trends in Data and Analytics for 2020, I was struck by how different terms mean different things to different audiences under different contexts. But what do we really mean when we talk about these issues?
Ever since Hippocrates founded his school of medicine in ancient Greece some 2,500 years ago, writes Hannah Fry in her book Hello World: Being Human in the Age of Algorithms , what has been fundamental to healthcare (as she calls it “the fight to keep us healthy”) was observation, experimentation and the analysis of data.
Apache Spark is a powerful big data engine used for large-scale data analytics. You can use Apache Spark to process streaming data from a variety of streaming sources, including Amazon Kinesis Data Streams for use cases like clickstream analysis, fraud detection, and more. Starting with Amazon EMR 7.1, the Amazon Kinesis Data Streams connector for Spark Structured Streaming ships with the release.
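A rough sketch of what such a streaming read can look like (the source name and option keys follow recent Kinesis connectors for Spark, but they differ across connector versions, so treat them as assumptions and check your connector’s documentation):

```python
from pyspark.sql import SparkSession

# Sketch: read a Kinesis stream with Spark Structured Streaming and echo
# records to the console. Stream name, region, and endpoint are placeholders.
spark = SparkSession.builder.appName("clickstream-demo").getOrCreate()

clicks = (
    spark.readStream
    .format("aws-kinesis")                                # assumed source name
    .option("kinesis.streamName", "clickstream")          # hypothetical stream
    .option("kinesis.region", "us-east-1")
    .option("kinesis.endpointUrl", "https://kinesis.us-east-1.amazonaws.com")
    .option("kinesis.startingPosition", "LATEST")
    .load()
)

query = clicks.writeStream.format("console").start()
query.awaitTermination()
```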
What Makes a Data Fabric? ‘Data Fabric’ has reached where ‘Cloud Computing’ and ‘Grid Computing’ once trod. Data Fabric hit the Gartner top ten in 2019. This multiplicity of data leads to the growth of silos, which in turn increases the cost of integration. It is a buzzword.
An area of AI that Ontotext has been working on for over 20 years is text analytics. This data can then be easily analyzed to provide insights or used to train machine learning models. In text analytics, the human benchmark is a set of documents manually annotated by human experts. You can read more about it in this blog post.
dbt is an open source, SQL-first templating engine that allows you to write repeatable and extensible data transforms in Python and SQL. dbt is predominantly used by customers of data warehouses (such as Amazon Redshift) who are looking to keep their data transform logic separate from storage and engine.
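As an illustrative sketch of the Python side (the model and column names are invented, and Python models require a dbt adapter that supports them):

```python
# models/customer_orders.py -- a hypothetical dbt Python model.
# dbt.ref() resolves an upstream model, keeping transform logic in the
# dbt project rather than scattered across the warehouse.
def model(dbt, session):
    dbt.config(materialized="table")

    orders = dbt.ref("stg_orders")  # assumed upstream staging model

    # Aggregate order totals per customer; the exact dataframe API here
    # depends on the adapter (pandas-like on some, Snowpark on others).
    return orders.groupby("customer_id").sum()
```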
In much the same way, in the context of Artificial Intelligence (AI) systems, the Gold Standard refers to a set of data that has been manually prepared or verified and that represents “the objective truth” as closely as possible. And this is a challenge, as today’s data comes in huge volumes and from various sources.
It enriched their understanding of the full spectrum of knowledge graph business applications and the technology partner ecosystem needed to turn data into a competitive advantage. Content and data management solutions based on knowledge graphs are becoming increasingly important across enterprises.
The integration of DataRobot and Azure OpenAI Service breaks down a barrier that has long existed between data teams and business stakeholders. Traditionally, developing appropriate data science code and interpreting the results to solve a use case is done manually by data scientists.
At the same time, most data management (DM) applications require 100% correct retrieval, 0% hallucination! And getting a free text summary of the results, instead of just a table. I am very optimistic!
In this blog, I will cover: What is watsonx.ai? What capabilities are included in watsonx.ai? What is watsonx.data? IBM software products are embedding watsonx capabilities across digital labor, IT automation, security, sustainability, and application modernization to help unlock new levels of business value for clients.
In today’s digital age, data is at the heart of every organization’s success. One of the most commonly used formats for exchanging data is XML. Analyzing XML files can help organizations gain insights into their data, allowing them to make better decisions and improve their operations.
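A small sketch of where such analysis can start, using only the Python standard library (the orders/order structure is invented):

```python
import xml.etree.ElementTree as ET

# Parse a small XML document and extract fields for analysis.
xml_doc = """
<orders>
  <order id="1001"><customer>Acme</customer><total>250.00</total></order>
  <order id="1002"><customer>Globex</customer><total>99.50</total></order>
</orders>
"""

root = ET.fromstring(xml_doc.strip())
for order in root.findall("order"):
    print(order.get("id"), order.findtext("customer"), order.findtext("total"))
```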
The three roles of a knowledge graph A knowledge graph is a versatile way of organizing and using data. Like a database, knowledge graphs have schemas and users can apply complex structured queries to extract specific data needed. Because of the formal semantics attached to the data, knowledge graphs can act as a knowledge base.
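A minimal sketch of that database-like role using rdflib (the tiny graph and the worksFor property are invented):

```python
from rdflib import Graph

# Load a toy RDF graph and run a structured (SPARQL) query against it.
g = Graph()
g.parse(data="""
@prefix ex: <http://example.org/> .
ex:alice ex:worksFor ex:acme .
ex:bob   ex:worksFor ex:acme .
""", format="turtle")

for row in g.query("""
    SELECT ?person
    WHERE { ?person <http://example.org/worksFor> <http://example.org/acme> }
"""):
    print(row.person)
```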
Sentiment analysis invites us to consider the sentence “You’re so smart!” In fact, when presented with a piece of text, sometimes even humans disagree about its tonality, especially if there’s not a good deal of informative context provided to help rule out incorrect interpretations. Sentiment analysis datasets: It provides 1.6
Organizations that invest time and resources to improve the knowledge and capabilities of their employees perform better. The risk is that the organization creates a valuable asset, with years of expertise and experience directly relevant to the organization, that can one day cross the street to your competitors.
In this post, we explain how you can enable business users to ask and answer questions about data using their everyday business language by using the Amazon QuickSight natural language query function, Amazon QuickSight Q. Q uses the same QuickSight datasets you use for your dashboards and reports so your data is governed and secured.
This is part of Ontotext’s AI-in-Action initiative aimed at enabling data scientists and engineers to benefit from the AI capabilities of our products. RED’s focus on news content serves a pivotal function: identifying, extracting, and structuring data on events, parties involved, and subsequent impacts.
In 2023, data leaders and enthusiasts were enamored of — and often distracted by — initiatives such as generative AI and cloud migration. I expect to see the following data and knowledge management trends emerge in 2024.
It uses Amazon Simple Storage Service (Amazon S3) as the primary data storage for indexes, adding durability for your data. When you create a serverless collection, you set a collection type.
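A sketch of creating such a collection with boto3 (the collection name, region, and description are placeholders; the type must be one of SEARCH, TIMESERIES, or VECTORSEARCH):

```python
import boto3

# Create an OpenSearch Serverless collection of a given type.
client = boto3.client("opensearchserverless", region_name="us-east-1")

response = client.create_collection(
    name="demo-logs",                      # hypothetical collection name
    type="TIMESERIES",
    description="Example collection for log analytics",
)
print(response["createCollectionDetail"]["status"])
```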
Companies collect and analyze vast amounts of data to make informed business decisions. From product development to customer satisfaction, nearly every aspect of a business uses data and analytics to measure success and define strategies. When choosing between qualitative and quantitative data, think about what you want to learn.
According to an article in Harvard Business Review, cross-industry studies show that, on average, big enterprises actively use less than half of their structured data and sometimes about 1% of their unstructured data. The many data warehouse systems designed in the last 30 years present significant difficulties in that respect.
That’s the equivalent of 1 petabyte (ComputerWeekly) – the amount of unstructured data available within our large pharmaceutical client’s business. Then imagine the insights that are locked in that massive amount of data. Ensure content can be reused within the data hub to support pharmaceutical use cases.
Entity linking is the process of automatically linking entity mentions from text to the corresponding entries in a knowledge base. It has been an important capability for Ontotext ever since we dove into Natural Language Processing (NLP), as it is a crucial aspect of the interplay between text analysis and knowledge graphs.
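As a toy sketch of the idea (the knowledge-base entries, aliases, and disambiguation rule are all invented), entity linking maps a surface mention to a canonical entry:

```python
# Toy entity linking: map surface mentions in text to canonical KB
# identifiers. Real systems add candidate generation, context-based
# disambiguation, and NIL detection for mentions missing from the KB.
KNOWLEDGE_BASE = {
    "kb:Paris_City": "Paris, capital of France",
    "kb:Paris_Myth": "Paris, prince of Troy",
}
ALIASES = {
    "paris": ["kb:Paris_City", "kb:Paris_Myth"],  # one mention, two candidates
}

def link(mention: str, context: str) -> str:
    candidates = ALIASES.get(mention.lower(), [])
    if not candidates:
        return "NIL"  # mention is not in the knowledge base
    # Crude disambiguation based on keyword overlap with the context.
    if "troy" in context.lower():
        return "kb:Paris_Myth"
    return candidates[0]

print(link("Paris", "Paris is the capital of France."))   # kb:Paris_City
print(link("Paris", "Paris carried Helen off to Troy."))  # kb:Paris_Myth
```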
We’ve already discussed that enterprise knowledge graphs bring together and harmonize all-important organizational knowledge and metadata. We’ve already discussed that enterprise knowledge graphs bring together and harmonize all-important organizational knowledge and metadata. Building a single data graph across the three services.
Consider the following practices that, until recently, were relegated to the R&D department: Data-driven decision making – the collection and analysis of data to guide decisions that improve success. Complicating matters is the increasing focus on data protection and the far-reaching implications of IoT (e.g.
Ontotext’s GraphDB is an enterprise-ready semantic graph database (also called an RDF triplestore because it stores data in RDF triples). It provides the core infrastructure for solutions where modelling agility, data integration, relationship exploration, and cross-enterprise data publishing and consumption are critical.
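A minimal sketch of talking to a GraphDB repository over its SPARQL HTTP endpoint (the host, repository name, and query are assumptions for illustration):

```python
import requests

# GraphDB exposes each repository as a SPARQL endpoint under /repositories/<id>.
GRAPHDB_ENDPOINT = "http://localhost:7200/repositories/demo"

query = "SELECT ?s ?p ?o WHERE { ?s ?p ?o } LIMIT 10"

resp = requests.get(
    GRAPHDB_ENDPOINT,
    params={"query": query},
    headers={"Accept": "application/sparql-results+json"},
)
resp.raise_for_status()

for b in resp.json()["results"]["bindings"]:
    print(b["s"]["value"], b["p"]["value"], b["o"]["value"])
```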
In today’s fast-changing environment, enterprises that have transitioned from being focused on applications to becoming data-driven gain a significant competitive edge. There are four groups of data that are naturally siloed: structured data (e.g., internal metadata, industry ontologies, etc.)
Data leaders today are faced with an almost impossible challenge. They are expected to understand the entire data landscape and generate business-moving insights while facing the voracious needs of different teams and the constraints of technology architecture and compliance.
It’s no secret that data scientists and researchers spend 80% of their time on the less glamorous tasks of chasing down data, cleaning it up, and making sure it’s not full of nonsense. During the target identification phase of drug development, several challenges related to data can impede progress.
The similarity indices are a fuzzy match heuristic based on statistical semantics, which is particularly useful when retrieving the closest related texts or when grouping a cluster of graph nodes based on their topology. Let’s go through some of the main types of text semantic similarity searches with a simple but representative example.
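As one simple, representative illustration of the fuzzy text-matching side (the documents and query are invented, and this TF-IDF sketch only approximates what a production similarity index does):

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Embed texts as TF-IDF vectors and rank them by cosine similarity to a query.
docs = [
    "Knowledge graphs integrate data from multiple sources.",
    "Graph databases store data as nodes and edges.",
    "Quarterly revenue grew across all regions.",
]
query = ["How do knowledge graphs combine data sources?"]

vectorizer = TfidfVectorizer().fit(docs + query)
scores = cosine_similarity(vectorizer.transform(query), vectorizer.transform(docs))[0]

for doc, score in sorted(zip(docs, scores), key=lambda p: -p[1]):
    print(f"{score:.2f}  {doc}")
```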