The landscape of big data management has been transformed by the rising popularity of open table formats such as Apache Iceberg, Apache Hudi, and Linux Foundation Delta Lake. These formats, designed to address the limitations of traditional data storage systems, have become essential in modern data architectures.
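To make this concrete, here is a minimal sketch of creating an Iceberg table through Amazon Athena's Iceberg DDL support using boto3. The database, bucket paths, and schema are hypothetical placeholders, not details from the post.

```python
import boto3

# Hypothetical example: create an Apache Iceberg table via Amazon Athena.
# "analytics_db", the S3 locations, and the columns are placeholders.
athena = boto3.client("athena")

ddl = """
CREATE TABLE analytics_db.orders_iceberg (
  order_id string,
  amount   double,
  order_ts timestamp
)
LOCATION 's3://my-example-bucket/iceberg/orders/'
TBLPROPERTIES ('table_type' = 'ICEBERG')
"""

athena.start_query_execution(
    QueryString=ddl,
    ResultConfiguration={"OutputLocation": "s3://my-example-bucket/athena-results/"},
)
```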
1) What Is Data Quality Management? 4) Data Quality Best Practices. 5) How Do You Measure Data Quality? 6) Data Quality Metrics Examples. 7) Data Quality Control: Use Case. 8) The Consequences Of Bad Data Quality. 9) 3 Sources Of Low-Quality Data. 10) Data Quality Solutions: Key Attributes.
In this post, we demonstrate how to streamline data discovery with precise technical identifier search in Amazon SageMaker Unified Studio. Searching on exact identifiers yields exact-match results, dramatically improving the speed and accuracy of data discovery.
Amazon Redshift is a fully managed, AI-powered cloud data warehouse that delivers the best price-performance for your analytics workloads at any scale. It provides a conversational interface where users can submit queries in natural language within the scope of their current data permissions. Your data is not shared across accounts.
Data management is the foundation of quantitative research. In this post, we focus on data management implementation options such as accessing data directly in Amazon Simple Storage Service (Amazon S3), using popular data formats like Parquet, or using open table formats like Iceberg.
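As a sketch of the first option, reading Parquet data directly from Amazon S3: pandas (with s3fs or pyarrow installed) can load an S3 prefix straight into a DataFrame. The bucket, prefix, and column names below are assumptions for illustration only.

```python
import pandas as pd

# Hypothetical bucket/prefix and columns; requires s3fs (or fsspec) so
# pandas can resolve s3:// URLs.
df = pd.read_parquet("s3://my-example-bucket/research/trades/2024/")

# A typical quantitative-research slice: filter one symbol, aggregate daily.
daily_volume = (
    df[df["symbol"] == "AMZN"]
    .groupby(df["trade_ts"].dt.date)["quantity"]
    .sum()
)
print(daily_volume.head())
```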
Amazon Redshift, launched in 2013, has undergone significant evolution since its inception, allowing customers to expand the horizons of data warehousing and SQL analytics. It offers industry-leading price-performance: up to three times better than alternative cloud data warehouses.
And yeah, the real-world relationships among the entities represented in the data had to be fudged a bit to fit in the counterintuitive model of tabular data, but, in trade, you get reliability and speed. Ironically, relational databases only imply relationships between data points by whatever row or column they exist in.
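A small sketch makes the point. In SQLite (table names and rows invented), nothing in the storage model itself links a customer to their orders; the relationship exists only as a shared value and is re-derived at query time by a join predicate.

```python
import sqlite3

# The "relationship" between customers and orders is only a shared value
# (customer_id) sitting in two tables; a JOIN reconstructs it.
con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE orders (order_id INTEGER PRIMARY KEY,
                     customer_id INTEGER REFERENCES customers(customer_id),
                     amount REAL);
INSERT INTO customers VALUES (1, 'Ada'), (2, 'Grace');
INSERT INTO orders VALUES (10, 1, 99.5), (11, 1, 12.0), (12, 2, 45.0);
""")

# Nothing in storage links the rows; the JOIN predicate does.
for row in con.execute("""
    SELECT c.name, SUM(o.amount)
    FROM customers c JOIN orders o ON o.customer_id = c.customer_id
    GROUP BY c.name
"""):
    print(row)
```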
It’s time to consider data-driven enterprise architecture. The traditional approach to enterprise architecture – the analysis, design, planning and implementation of IT capabilities for the successful execution of enterprise strategy – seems to be missing something … data. That’s right. This is what we call the Mezzo.
Miso’s cofounders, Lucky Gunasekara and Andy Hsieh, are veterans of the Small Data Lab at Cornell Tech, which is devoted to private AI approaches for immersive personalization and content-centric explorations. The platform required a more effective way to connect learners directly to the key information that they sought.
Open table formats are emerging in the rapidly evolving domain of big data management, fundamentally altering the landscape of data storage and analysis. By providing a standardized framework for data representation, open table formats break down data silos, enhance data quality, and accelerate analytics at scale.
What Is Metadata? Metadata is information about data. A clothing catalog and a dictionary are both examples of metadata repositories. Indeed, a popular online catalog like Amazon offers rich metadata around products to guide shoppers: ratings, reviews, and product details are all examples of metadata.
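As a toy illustration (every value here is invented), a catalog entry is just a structured record describing some underlying data, and catalog search is, at heart, a filter over such records.

```python
# Hypothetical metadata record describing a dataset file -- none of these
# values come from a real catalog.
dataset_metadata = {
    "name": "orders_2024.parquet",
    "owner": "analytics-team",
    "format": "parquet",
    "row_count": 1_250_000,
    "columns": ["order_id", "customer_id", "amount", "order_ts"],
    "last_updated": "2024-06-01T12:00:00Z",
    "tags": ["sales", "pii:none"],
}

# A catalog search reduces to filtering records like this one.
def matches(meta: dict, tag: str) -> bool:
    return tag in meta["tags"]

print(matches(dataset_metadata, "sales"))  # True
```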
Organizations cannot hope to make the most of a data-driven strategy without at least some degree of metadata-driven automation. The volume and variety of data have snowballed, and so has their velocity. So it’s safe to say that organizations can’t reap the rewards of their data without automation.
I recently saw an informal online survey that asked users which types of data (tabular, text, images, or “other”) are being used in their organization’s analytics applications. The results showed that (among those surveyed) approximately 90% of enterprise analytics applications are being built on tabular data.
Data is the most significant asset of any organization. However, enterprises often encounter challenges with data silos, insufficient access controls, poor governance, and quality issues. Embracing data as a product is the key to address these challenges and foster a data-driven culture.
Amazon SageMaker Lakehouse now supports attribute-based access control (ABAC) with AWS Lake Formation, using AWS Identity and Access Management (IAM) principals and session tags to simplify data access, grant creation, and maintenance. You can then query, analyze, and join the data using Redshift, Amazon Athena, Amazon EMR, and AWS Glue.
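A minimal sketch of the session-tag side of ABAC, assuming a hypothetical role ARN and tag key; Lake Formation would separately be configured to match these session tags against LF-Tags on catalog resources.

```python
import boto3

# Hypothetical role and tag: access decisions hinge on the session's
# attributes, not on per-user grants.
sts = boto3.client("sts")
creds = sts.assume_role(
    RoleArn="arn:aws:iam::123456789012:role/AnalystRole",
    RoleSessionName="abac-demo",
    Tags=[{"Key": "team", "Value": "marketing"}],
)["Credentials"]

# Query with the tagged session via Athena.
athena = boto3.client(
    "athena",
    aws_access_key_id=creds["AccessKeyId"],
    aws_secret_access_key=creds["SecretAccessKey"],
    aws_session_token=creds["SessionToken"],
)
athena.start_query_execution(
    QueryString="SELECT * FROM marketing_db.campaigns LIMIT 10",
    ResultConfiguration={"OutputLocation": "s3://my-example-bucket/results/"},
)
```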
Organizational data is often fragmented across multiple lines of business, leading to inconsistent and sometimes duplicate datasets. This fragmentation can delay decision-making and erode trust in available data. This solution enhances governance and simplifies access to unstructured data assets across the organization.
We need to do more than automate model building with autoML; we need to automate tasks at every stage of the data pipeline. In a previous post, we talked about applications of machine learning (ML) to software development, which included a tour through sample tools in data science and for managing data infrastructure.
If you’re serious about a data-driven strategy , you’re going to need a data catalog. Organizations need a data catalog because it enables them to create a seamless way for employees to access and consume data and business assets in an organized manner. Three Types of Metadata in a Data Catalog.
Q: Is data modeling cool again? A: It always was, and it’s getting cooler! In today’s fast-paced digital landscape, data reigns supreme. The data-driven enterprise relies on accurate, accessible, and actionable information to make strategic decisions and drive innovation.
With the growing emphasis on data, organizations are constantly seeking more efficient and agile ways to integrate their data, especially from a wide variety of applications. In addition, as organizations rely on an increasingly diverse array of digital systems, data fragmentation has become a significant challenge.
Organizations with legacy, on-premises, near-real-time analytics solutions typically rely on self-managed relational databases as their data store for analytics workloads. Near-real-time streaming analytics captures the value of operational data and metrics to provide new insights to create business opportunities.
AI products are automated systems that collect and learn from data to make user-facing decisions. All you need to know for now is that machine learning uses statistical techniques to give computer systems the ability to “learn” by being trained on existing data. Why AI software development is different.
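A minimal sketch of that training loop with scikit-learn, on invented data: the model is fit on labeled historical examples, and the "user-facing decision" of an AI product reduces to a predict call.

```python
from sklearn.linear_model import LogisticRegression

# Made-up feature vectors and historical outcomes the system learns from.
X_train = [[0.1, 1.2], [0.9, 0.3], [0.2, 1.0], [1.1, 0.1]]
y_train = [0, 1, 0, 1]

model = LogisticRegression().fit(X_train, y_train)

# The automated decision for a new, unseen input:
print(model.predict([[0.8, 0.2]]))  # e.g. [1]
```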
Once you’ve determined what part(s) of your business you’ll be innovating, the next step in a digital transformation strategy is using data to get there. Constructing a Digital Transformation Strategy: Data Enablement. Many organizations prioritize data collection as part of their digital transformation strategy.
In today’s data-driven landscape, data and analytics teams increasingly face a unique set of challenges presented by demanding data consumers who require a personalized level of data observability. Data observability platforms often need to deliver this level of customization.
Understanding the data governance trends for the year ahead will give business leaders and data professionals a competitive edge … Happy New Year! Regulatory compliance and data breaches have driven the data governance narrative during the past few years.
DynamoDB offers built-in security, continuous backups, automated multi-Region replication, in-memory caching, and data import and export tools. The scalability and flexible data schema of DynamoDB make it well-suited for a variety of use cases. Data stored in DynamoDB is the basis for valuable business intelligence (BI) insights.
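For illustration, here are basic reads and writes with boto3. The table name and key schema are assumptions: an "orders" table with an "order_id" partition key is presumed to already exist.

```python
import boto3

# Hypothetical table: "orders", partition key "order_id".
dynamodb = boto3.resource("dynamodb")
table = dynamodb.Table("orders")

# Flexible schema: items need only share the key attribute.
table.put_item(Item={"order_id": "o-123", "customer": "Ada", "amount": 42})

resp = table.get_item(Key={"order_id": "o-123"})
print(resp["Item"])
```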
Amazon DataZone has announced a set of new data governance capabilities—domain units and authorization policies—that enable you to create business unit-level or team-level organization and manage policies according to your business needs. Organizations can adopt different approaches when defining and structuring domains and domain units.
Metadata management performs a critical role within the modern data management stack. It helps break down data silos and empowers data and analytics teams to better understand the context and quality of data. This, in turn, builds trust in data and the decision-making that follows. Improve data discovery.
Although the terms data fabric and data mesh are often used interchangeably, I previously explained that they are distinct but complementary. The popularity of data fabric and data mesh has highlighted the importance of software providers, such as Denodo, that utilize data virtualization to enable logical data management.
The need to integrate diverse data sources has grown exponentially, but there are several common challenges when integrating and analyzing data from multiple sources, services, and applications. First, you need to create and maintain independent connections to the same data source for different services.
We are excited to announce the preview of API-driven, OpenLineage-compatible data lineage in Amazon DataZone to help you capture, store, and visualize lineage of data movement and transformations of data assets on Amazon DataZone. The lineage visualized includes activities inside the Amazon DataZone business data catalog.
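For a sense of the format, here is a hedged sketch of an OpenLineage RunEvent payload of the kind an OpenLineage-compatible lineage API consumes; the job and dataset names are invented placeholders.

```python
import json
import uuid
from datetime import datetime, timezone

# Hypothetical OpenLineage RunEvent: a job run that read one dataset
# and wrote another. Names and URIs are placeholders.
event = {
    "eventType": "COMPLETE",
    "eventTime": datetime.now(timezone.utc).isoformat(),
    "run": {"runId": str(uuid.uuid4())},
    "job": {"namespace": "etl", "name": "orders_daily_load"},
    "inputs": [{"namespace": "s3://my-example-bucket", "name": "raw/orders"}],
    "outputs": [{"namespace": "s3://my-example-bucket", "name": "curated/orders"}],
    "producer": "https://example.com/my-pipeline",
    "schemaURL": "https://openlineage.io/spec/1-0-5/OpenLineage.json",
}

print(json.dumps(event, indent=2))  # payload to post to the lineage API
```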
Data governance is a key enabler for teams adopting a data-driven culture and operational model to drive innovation with data. Amazon DataZone allows you to simply and securely govern end-to-end data assets stored in your Amazon Redshift data warehouses or data lakes cataloged with the AWS Glue data catalog.
Data governance tools used to occupy a niche in an organization’s tech stack, but those days are gone. The rise of data-driven business and the complexities that come with it ushered in a soft mandate for data governance and data governance tools. It is also used to make data more easily understood and secure.
Gartner predicts that “by 2020, 50% of information governance initiatives will be enacted with policies based on metadata alone” (Magic Quadrant for Metadata Management Solutions, Guido de Simoni and Roxane Edjlali, August 10, 2017). Metadata management no longer refers to a static technical repository.
Businesses are constantly evolving, and data leaders are challenged every day to meet new requirements. Customers are using AWS and Snowflake to develop purpose-built data architectures that provide the performance required for modern analytics and artificial intelligence (AI) use cases.
Metadata management is essential to becoming a data-driven organization and reaping the competitive advantage your organization’s data offers. Gartner refers to metadata as data that is used to enhance the usability, comprehension, utility or functionality of any other data point.
The Semantic Web, both as a research field and a technology stack, is seeing mainstream industry interest, especially with the knowledge graph concept emerging as a pillar of well-managed, efficiently used data. And what are the commercial implications of semantic technologies for enterprise data?
This post is co-authored by Vijay Gopalakrishnan, Director of Product, Salesforce Data Cloud. In today’s data-driven business landscape, organizations collect a wealth of data across various touch points and unify it in a central data warehouse or a data lake to deliver business insights.
In March 2024, we announced the general availability of the generative artificial intelligence (AI) generated data descriptions in Amazon DataZone. In this post, we share what we heard from our customers that led us to add the AI-generated data descriptions and discuss specific customer use cases addressed by this capability.
Because every sector of business is changing and becoming more competitive, the benefits of business intelligence and the proper use of data analytics are key to outperforming the competition. BI software uses algorithms to extract actionable insights from a company’s data and guide its strategic decisions.
In the era of digital transformation and data-driven decision making, organizations must rapidly harness insights from their data to deliver exceptional customer experiences and gain competitive advantage. Solution overview Salesforce Data Cloud provides a point-and-click experience to share data with a customer’s AWS account.
Data quality is crucial in data pipelines because it directly impacts the validity of the business insights derived from the data. Today, many organizations use AWS Glue Data Quality to define and enforce data quality rules on their data at rest and in transit.
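As a sketch, rules are expressed in Glue's Data Quality Definition Language (DQDL) and can be registered against a Glue Data Catalog table with boto3; the database, table, and rule thresholds below are hypothetical.

```python
import boto3

glue = boto3.client("glue")

# Hypothetical DQDL ruleset: completeness, value range, and volume checks.
ruleset = """
Rules = [
    IsComplete "order_id",
    ColumnValues "amount" > 0,
    RowCount > 1000
]
"""

glue.create_data_quality_ruleset(
    Name="orders-basic-checks",
    Ruleset=ruleset,
    TargetTable={"DatabaseName": "analytics_db", "TableName": "orders"},
)
```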
We’re living in the age of real-time data and insights, driven by low-latency data streaming applications. The volume of time-sensitive data produced is increasing rapidly, with different formats of data being introduced across new businesses and customer use cases.
Large organizations processing huge volumes of data usually store it in Amazon Simple Storage Service (Amazon S3) and query the data to make data-driven business decisions using distributed analytics engines such as Amazon Athena. You can partition your data by any column.
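A brief sketch of why partitioning pays off: if the hypothetical table below is partitioned by a "dt" column in S3, a filter on that column prunes the objects Athena has to scan, cutting both latency and cost. All names are placeholders.

```python
import boto3

athena = boto3.client("athena")

# Filtering on the partition column limits the scan to one S3 prefix.
athena.start_query_execution(
    QueryString="""
        SELECT customer_id, SUM(amount)
        FROM analytics_db.orders          -- partitioned by dt=YYYY-MM-DD in S3
        WHERE dt = '2024-06-01'           -- prunes to a single partition
        GROUP BY customer_id
    """,
    ResultConfiguration={"OutputLocation": "s3://my-example-bucket/athena-results/"},
)
```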