Analytics, Data Integration and Metadata

Simplify data integration with AWS Glue and zero-ETL to Amazon SageMaker Lakehouse

AWS Big Data

DECEMBER 4, 2024

With the growing emphasis on data, organizations are constantly seeking more efficient and agile ways to integrate their data, especially from a wide variety of applications. SageMaker Lakehouse gives you the flexibility to access and query your data in-place with all Apache Iceberg compatible tools and engines.

Data Integration

Data Integration Data Lake Statistics Data-driven

Bridging the gap between mainframe data and hybrid cloud environments

CIO Business Intelligence

FEBRUARY 27, 2025

A high hurdle many enterprises have yet to overcome is accessing mainframe data via the cloud. Mainframes hold an enormous amount of critical and sensitive business data including transactional information, healthcare records, customer data, and inventory metrics.

Metadata

Metadata Data Lake Cost-Benefit Forecasting

7 Benefits of Metadata Management

erwin

FEBRUARY 19, 2021

Metadata management is key to wringing all the value possible from data assets. However, most organizations don’t use all the data at their disposal to reach deeper conclusions about how to drive revenue, achieve regulatory compliance or accomplish other strategic objectives. What Is Metadata? Harvest data.

Metadata

Metadata Management Data Quality Cost-Benefit

Webinars

Automation, Evolved: Your New Playbook For Smarter Knowledge Work

MORE WEBINARS

RDF-Star: Metadata Complexity Simplified

Ontotext

JUNE 10, 2021

To handle such scenarios you need a transalytical graph database – a database engine that can deal with both frequent updates (OLTP workload) as well as with graph analytics (OLAP). Not Every Graph is a Knowledge Graph: Schemas and Semantic Metadata Matter. Metadata about Relationships Come in Handy. Schemas are powerful.

Metadata

Metadata Cost-Benefit OLAP Modeling

Build a high-performance quant research platform with Apache Iceberg

AWS Big Data

JANUARY 9, 2025

Iceberg offers distinct advantages through its metadata layer over Parquet, such as improved data management, performance optimization, and integration with various query engines. Unlike direct Amazon S3 access, Iceberg supports these operations on petabyte-scale data lakes without requiring complex custom code.

Metadata

Metadata Snapshot Cost-Benefit Optimization

Monitoring Apache Iceberg metadata layer using AWS Lambda, AWS Glue, and AWS CloudWatch

AWS Big Data

JULY 29, 2024

Data lakes provide a unified repository for organizations to store and use large volumes of data. This enables more informed decision-making and innovative insights through various analytics and machine learning applications. This ensures that each change is tracked and reversible, enhancing data governance and auditability.

Metadata

Metadata Snapshot Data Lake Metrics

The next generation of Amazon SageMaker: The center for all your data, analytics, and AI

AWS Big Data

DECEMBER 4, 2024

This week on the keynote stages at AWS re:Invent 2024, you heard from Matt Garman, CEO, AWS, and Swami Sivasubramanian, VP of AI and Data, AWS, speak about the next generation of Amazon SageMaker , the center for all of your data, analytics, and AI. The relationship between analytics and AI is rapidly evolving.

Data Analytics

Data Analytics Analytics Data Lake Data Quality

How EUROGATE established a data mesh architecture using Amazon DataZone

AWS Big Data

JANUARY 15, 2025

For container terminal operators, data-driven decision-making and efficient data sharing are vital to optimizing operations and boosting supply chain efficiency. Enhance agility by localizing changes within business domains and clear data contracts. Eliminate centralized bottlenecks and complex data pipelines.

IoT

IoT Machine Learning Metadata Data-driven

Data’s dark secret: Why poor quality cripples AI and growth

CIO Business Intelligence

APRIL 8, 2025

As technology and business leaders, your strategic initiatives, from AI-powered decision-making to predictive insights and personalized experiences, are all fueled by data. Yet, despite growing investments in advanced analytics and AI, organizations continue to grapple with a persistent and often underestimated challenge: poor data quality.

Data Quality

Data Quality Data-driven Key Performance Indicator Metadata

Metadata Management, Data Governance and Automation

erwin

NOVEMBER 6, 2019

According to IDC’s “Data Intelligence in Context” Technology Spotlight sponsored by erwin, “professionals who work with data spend 80 percent of their time looking for and preparing data and only 20 percent of their time on analytics.”. IT teams need the ability to smoothly generate hundreds of mappings and ETL jobs.

Metadata

Metadata Data Governance Management Cost-Benefit

How companies are building sustainable AI and ML initiatives

O'Reilly on Data

JANUARY 29, 2019

Companies are building or evaluating solutions in foundational technologies needed to sustain success in analytics and AI.

Deep Learning

Deep Learning Machine Learning Data Science Metadata

The Missing Link in Enterprise Data Governance: Metadata

Octopai

JUNE 26, 2020

In order to figure out why the numbers in the two reports didn’t match, Steve needed to understand everything about the data that made up those reports – when the report was created, who created it, any changes made to it, which system it was created in, etc. Enterprise data governance. Metadata in data governance.

Metadata

Metadata Data Governance Enterprise Reporting

How Metadata Makes Data Meaningful

erwin

DECEMBER 12, 2019

Metadata is an important part of data governance, and as a result, most nascent data governance programs are rife with project plans for assessing and documenting metadata. But in many scenarios, it seems that the underlying driver of metadata collection projects is that it’s just something you do for data governance.

Metadata

Metadata Data Governance Digital Transformation Data Quality

Top analytics announcements of AWS re:Invent 2024

AWS Big Data

FEBRUARY 26, 2025

Analytics remained one of the key focus areas this year, with significant updates and innovations aimed at helping businesses harness their data more efficiently and accelerate insights. This premier event showcased groundbreaking advancements, keynotes from AWS leadership, hands-on technical sessions, and exciting product launches.

Analytics

Analytics Data Lake Metadata Data Warehouse

What Is a Metadata Catalog? (And How it Can Dramatically Improve Your Data Accuracy)

Octopai

JANUARY 31, 2022

If you’re a mystery lover, I’m sure you’ve read that classic tale: Sherlock Holmes and the Case of the Deceptive Data, and you know how a metadata catalog was a key plot element. In The Case of the Deceptive Data, Holmes is approached by B.I. He goes on to explain: Reasons for inaccurate data. Big data is BIG.

Metadata

Metadata IT Unstructured Data IoT

Are You Content with Your Organization’s Content Strategy?

Rocket-Powered Data Science

JULY 6, 2021

This is accomplished through tags, annotations, and metadata (TAM). granules) of the data collection for fast search, access, and retrieval is also important for efficient orchestration and delivery of the data that fuels AI, automation, and machine learning operations. Collect, curate, and catalog (i.e.,

Strategy

Strategy Machine Learning Metadata Knowledge Discovery

How Cloudinary transformed their petabyte scale streaming data lake with Apache Iceberg and AWS Analytics

AWS Big Data

JUNE 10, 2024

In this blog post, we dive into different data aspects and how Cloudinary breaks the two concerns of vendor locking and cost efficient data analytics by using Apache Iceberg, Amazon Simple Storage Service (Amazon S3 ), Amazon Athena , Amazon EMR , and AWS Glue. This concept makes Iceberg extremely versatile.

Data Lake

Data Lake Metadata Snapshot Analytics

Recap of Amazon Redshift key product announcements in 2024

AWS Big Data

DECEMBER 17, 2024

Amazon Redshift , launched in 2013, has undergone significant evolution since its inception, allowing customers to expand the horizons of data warehousing and SQL analytics. This allowed customers to scale read analytics workloads and offered isolation to help maintain SLAs for business-critical applications.

Data Lake

Data Lake Data Warehouse Data-driven Optimization

Demystify data sharing and collaboration patterns on AWS: Choosing the right tool for the job

AWS Big Data

OCTOBER 21, 2024

Let’s briefly describe the capabilities of the AWS services we referred above: AWS Glue is a fully managed, serverless, and scalable extract, transform, and load (ETL) service that simplifies the process of discovering, preparing, and loading data for analytics. Amazon Athena is used to query, and explore the data.

Sales

Sales Data-driven Data Processing Key Performance Indicator

5 Ways Data Modeling Is Critical to Data Governance

erwin

JANUARY 9, 2020

That’s because it’s the best way to visualize metadata , and metadata is now the heart of enterprise data management and data governance/ intelligence efforts. So here’s why data modeling is so critical to data governance. erwin Data Modeler: Where the Magic Happens.

Data Governance

Data Governance Modeling Metadata Unstructured Data

Introducing MongoDB Atlas metadata collection with AWS Glue crawlers

AWS Big Data

FEBRUARY 6, 2023

AWS Glue is a serverless data integration service that makes it simple to discover, prepare, move, and integrate data from multiple sources for analytics, machine learning (ML), and application development. MongoDB Atlas is a developer data service from AWS technology partner MongoDB, Inc.

Metadata

Metadata Data Lake Machine Learning Big Data

Amazon OpenSearch Service Under the Hood : OpenSearch Optimized Instances(OR1)

AWS Big Data

APRIL 17, 2024

Today, customers widely use OpenSearch Service for operational analytics because of its ability to ingest high volumes of data while also providing rich and interactive analytics. As your operational analytics data velocity and volume of data grows, bottlenecks may emerge.

Optimization

Optimization Snapshot Metadata Cost-Benefit

Data integrity vs. data quality: Is there a difference?

IBM Big Data Hub

JULY 13, 2023

When we talk about data integrity, we’re referring to the overarching completeness, accuracy, consistency, accessibility, and security of an organization’s data. Together, these factors determine the reliability of the organization’s data. In short, yes.

Data Quality

Data Quality Data Integration Metadata Cost-Benefit

The Power of Active Metadata

Data Virtualization

JULY 28, 2023

Reading Time: 2 minutes As the volume, variety, and velocity of data continue to surge, organizations still struggle to gain meaningful insights. This is where active metadata comes in. Listen to “Why is Active Metadata Management Essential?” What is Active Metadata? ” on Spreaker.

Metadata

Metadata Data Integration Management Data Science

What’s the Current State of Data Governance and Automation?

erwin

JANUARY 30, 2020

1 reason to implement data governance. However, the latest group of survey participants say better decision-making is their primary driver (62 percent), with analytics secondary (51 percent), and regulatory compliance coming in third (48 percent). And close to 50 percent have deployed data catalogs and business glossaries.

Data Governance

Data Governance Metadata Cost-Benefit Digital Transformation

Why data observability is essential to AI governance

erwin

DECEMBER 9, 2024

When it comes to using AI and machine learning across your organization, there are many good reasons to provide your data and analytics community with an intelligent data foundation. For instance, Large Language Models (LLMs) are known to ultimately perform better when data is structured.

Metadata

Metadata Data Quality Sales Modeling

Why Your Business Should Use a Data Catalog to Organize Its Data

Smart Data Collective

JULY 15, 2021

A data catalog serves the same purpose. By using metadata (or short descriptions), data catalogs help companies gather, organize, retrieve, and manage information. You can think of a data catalog as an enhanced Access database or library card catalog system. What Does a Data Catalog Do?

Metadata

Metadata IT Data-driven Data Quality

Amazon Redshift announcements at AWS re:Invent 2023 to enable analytics on all your data

AWS Big Data

NOVEMBER 29, 2023

This cloud service was a significant leap from the traditional data warehousing solutions, which were expensive, not elastic, and required significant expertise to tune and operate. Amazon Redshift Serverless, generally available since 2021, allows you to run and scale analytics without having to provision and manage the data warehouse.

Data Warehouse

Data Warehouse Analytics Data Lake Machine Learning

How Metadata Makes Data Meaningful

erwin

DECEMBER 12, 2019

Metadata is an important part of data governance, and as a result, most nascent data governance programs are rife with project plans for assessing and documenting metadata. But in many scenarios, it seems that the underlying driver of metadata collection projects is that it’s just something you do for data governance.

Metadata

Metadata Data Governance Digital Transformation Data Quality

Becoming a machine learning company means investing in foundational technologies

O'Reilly on Data

MAY 21, 2019

We found that companies that have successfully adopted machine learning do so either by building on existing data products and services, or by modernizing existing models and algorithms. Not surprisingly, data integration and ETL were among the top responses, with 60% currently building or evaluating solutions in this area.

Machine Learning

Machine Learning Technology Deep Learning Data Science

Migrate an existing data lake to a transactional data lake using Apache Iceberg

AWS Big Data

OCTOBER 3, 2023

You can store your data as-is, without having to first structure the data and then run different types of analytics for better business insights. Analytics use cases on data lakes are always evolving. In this method, the metadata are recreated in an isolated environment and colocated with the existing data files.

Data Lake

Data Lake Metadata Snapshot Recreation/Entertainment

What is data governance? Best practices for managing data assets

CIO Business Intelligence

MARCH 24, 2023

The program must introduce and support standardization of enterprise data. Programs must support proactive and reactive change management activities for reference data values and the structure/use of master data and metadata. One of its key capabilities, TrustCheck, provides real-time “guardrails” to workflows.

Data Governance

Data Governance Management Metadata Data Quality

How Cargotec uses metadata replication to enable cross-account data sharing

AWS Big Data

JUNE 7, 2023

This data needs to be ingested into a data lake, transformed, and made available for analytics, machine learning (ML), and visualization. For this, Cargotec built an Amazon Simple Storage Service (Amazon S3) data lake and cataloged the data assets in AWS Glue Data Catalog.

Metadata

Metadata Data Lake Machine Learning Big Data

The Need For Personalized Data Journeys for Your Data Consumers

DataKitchen

OCTOBER 20, 2023

In today’s data-driven landscape, Data and Analytics Teams i ncreasingly face a unique set of challenges presented by Demanding Data Consumers who require a personalized level of Data Observability.

Insurance

Insurance Metadata Data-driven Data Quality

Denodo Provides a Logical Approach to Data Management

David Menninger's Analyst Perspectives

OCTOBER 24, 2024

At the heart of the Denodo Platform is a data virtualization access layer that provides a virtual representation of physical data sources, including operational and analytic data platforms, applications, file systems and application programming interfaces.

Management

Management Data-driven Data Governance Data Lake

What Is Data Modeling? Data Modeling Best Practices for Data-Driven Organizations

erwin

JANUARY 17, 2020

Data modeling is a process that enables organizations to discover, design, visualize, standardize and deploy high-quality data assets through an intuitive, graphical interface. Data models provide visualization, create additional metadata and standardize data design across the enterprise. SQL or NoSQL?

Data-driven

Data-driven Modeling Metadata Data Governance

Constructing A Digital Transformation Strategy: Putting the Data in Digital Transformation

erwin

JULY 17, 2019

Your organization won’t be able to take complete advantage of analytics tools to become data-driven unless you establish a foundation for agile and complete data management. You need automated data mapping and cataloging through the integration lifecycle process, inclusive of data at rest and data in motion.

Digital Transformation

Digital Transformation Strategy Metadata Data-driven

How to Do Data Modeling the Right Way

erwin

MAY 27, 2020

What, then, should users look for in a data modeling product to support their governance/intelligence requirements in the data-driven enterprise? Nine Steps to Data Modeling. Provide metadata and schema visualization regardless of where data is stored. naming and database standards, formatting options, and so on.

Modeling

Modeling Metadata Data Governance Visualization

Use Apache Iceberg in your data lake with Amazon S3, AWS Glue, and Snowflake

AWS Big Data

APRIL 3, 2024

They understand that a one-size-fits-all approach no longer works, and recognize the value in adopting scalable, flexible tools and open data formats to support interoperability in a modern data architecture to accelerate the delivery of new solutions.

Data Lake

Data Lake Snapshot Metadata Data Architecture

SAP Datasphere review: turning data from a technical problem to a business data product.

Jen Stirrup

MARCH 29, 2023

Put together, these factors help to provide the company with a competitive advantage since better data – and better use of data – leads to better strategies. Data as a product as well as a foundation The data foundation is critical for all types of reporting and analytics. Why is this interesting?

Data Warehouse

Data Warehouse Metadata Data Integration Business Intelligence

Extracting key insights from Amazon S3 access logs with AWS Glue for Ray

AWS Big Data

SEPTEMBER 7, 2023

We will partition and format the server access logs with Amazon Web Services (AWS) Glue , a serverless data integration service, to generate a catalog for access logs and create dashboards for insights. Both the user data and logs buckets must be in the same AWS Region and owned by the same account.

Metadata

Metadata Dashboards Metrics Visualization

Data confidence begins at the edge

CIO Business Intelligence

SEPTEMBER 23, 2024

For sectors such as industrial manufacturing and energy distribution, metering, and storage, embracing artificial intelligence (AI) and generative AI (GenAI) along with real-time data analytics, instrumentation, automation, and other advanced technologies is the key to meeting the demands of an evolving marketplace, but it’s not without risks.

Manufacturing

Manufacturing Internet of Things Metadata Risk

Why next-generation execs should care about data governance

IBM Big Data Hub

APRIL 9, 2018

There’s a general need for next-gen executives to not only understand corporate regulations, but be able to adhere to and follow them using metadata solutions like data governance. As the business world’s top asset becomes data, data governance will ensure that data and information being handled is consistent, reliable and trustworthy.

Data Governance

Data Governance Metadata Data Integration Analytics

Collibra Provides a Platform for Data Intelligence

David Menninger's Analyst Perspectives

OCTOBER 8, 2024

As I recently noted , the term “data intelligence” has been used by multiple providers across analytics and data for several years and is becoming more widespread as software providers respond to the need to provide enterprises with a holistic view of data production and consumption.

Data Quality

Data Quality Data Governance Enterprise Visualization

Simplify data integration with AWS Glue and zero-ETL to Amazon SageMaker Lakehouse

Bridging the gap between mainframe data and hybrid cloud environments

Webinars

Trending Sources

7 Benefits of Metadata Management

Webinars

RDF-Star: Metadata Complexity Simplified

Build a high-performance quant research platform with Apache Iceberg

Monitoring Apache Iceberg metadata layer using AWS Lambda, AWS Glue, and AWS CloudWatch

The next generation of Amazon SageMaker: The center for all your data, analytics, and AI

How EUROGATE established a data mesh architecture using Amazon DataZone

Data’s dark secret: Why poor quality cripples AI and growth

Metadata Management, Data Governance and Automation

How companies are building sustainable AI and ML initiatives

The Missing Link in Enterprise Data Governance: Metadata

How Metadata Makes Data Meaningful

Top analytics announcements of AWS re:Invent 2024

What Is a Metadata Catalog? (And How it Can Dramatically Improve Your Data Accuracy)

Are You Content with Your Organization’s Content Strategy?

How Cloudinary transformed their petabyte scale streaming data lake with Apache Iceberg and AWS Analytics

Recap of Amazon Redshift key product announcements in 2024

Demystify data sharing and collaboration patterns on AWS: Choosing the right tool for the job

5 Ways Data Modeling Is Critical to Data Governance

Introducing MongoDB Atlas metadata collection with AWS Glue crawlers

Amazon OpenSearch Service Under the Hood : OpenSearch Optimized Instances(OR1)

Data integrity vs. data quality: Is there a difference?

The Power of Active Metadata

What’s the Current State of Data Governance and Automation?

Why data observability is essential to AI governance

Why Your Business Should Use a Data Catalog to Organize Its Data

Amazon Redshift announcements at AWS re:Invent 2023 to enable analytics on all your data

How Metadata Makes Data Meaningful

Becoming a machine learning company means investing in foundational technologies

Migrate an existing data lake to a transactional data lake using Apache Iceberg

What is data governance? Best practices for managing data assets

How Cargotec uses metadata replication to enable cross-account data sharing

The Need For Personalized Data Journeys for Your Data Consumers

Denodo Provides a Logical Approach to Data Management

What Is Data Modeling? Data Modeling Best Practices for Data-Driven Organizations

Constructing A Digital Transformation Strategy: Putting the Data in Digital Transformation

How to Do Data Modeling the Right Way

Use Apache Iceberg in your data lake with Amazon S3, AWS Glue, and Snowflake

SAP Datasphere review: turning data from a technical problem to a business data product.

Extracting key insights from Amazon S3 access logs with AWS Glue for Ray

Data confidence begins at the edge

Why next-generation execs should care about data governance

Collibra Provides a Platform for Data Intelligence

Stay Connected