Untapped data, if mined, represents tremendous potential for your organization. While there has been a lot of talk about big data over the years, the real hero in unlocking the value of enterprise data is metadata, or the data about the data. Metadata Is the Heart of Data Intelligence.
The landscape of big data management has been transformed by the rising popularity of open table formats such as Apache Iceberg, Apache Hudi, and Linux Foundation Delta Lake. These formats, designed to address the limitations of traditional data storage systems, have become essential in modern data architectures.
To improve the way they model and manage risk, institutions must modernize their data management and data governance practices. Up your liquidity risk management game. Historically, technological limitations made it difficult for financial institutions to accurately forecast and manage liquidity risk.
Their terminal operations rely heavily on seamless data flows and the management of vast volumes of data. With the addition of these technologies alongside existing systems like terminal operating systems (TOS) and SAP, the number of data producers has grown substantially. This process is shown in the following figure.
The data mesh design pattern breaks giant, monolithic enterprise data architectures into subsystems or domains, each managed by a dedicated team. Third-generation – more or less like the previous generation but with streaming data, cloud, machine learning and other (fill-in-the-blank) fancy tools. See the pattern?
Metadata is an important part of data governance, and as a result, most nascent data governance programs are rife with project plans for assessing and documenting metadata. But in many scenarios, it seems that the underlying driver of metadata collection projects is that it’s just something you do for data governance.
It encompasses the people, processes, and technologies required to manage and protect data assets. The Data Management Association (DAMA) International defines it as the “planning, oversight, and control over management of data and the use of data and data-related sources.”
In August, we wrote about how in a future where distributed data architectures are inevitable, unifying and managing operational and business metadata is critical to successfully maximizing the value of data, analytics, and AI.
Their large inventory requires extensive supply chain management to source parts, make products, and distribute them globally. This post describes how HPE Aruba automated their supply chain management pipeline, and re-architected and deployed their data solution by adopting a modern data architecture on AWS.
Additionally, we show how to use AWS AI/ML services for analyzing unstructured data. Why it’s challenging to process and manage unstructured data: Unstructured data makes up a large proportion of the data in the enterprise that can’t be stored in traditional relational database management systems (RDBMS).
In light of recent, high-profile data breaches, it’s past time we re-examined strategic data governance and its role in managing regulatory requirements, such as alleged violations of the European Union’s General Data Protection Regulation (GDPR). Manage policies and rules. Five Steps to GDPR/CCPA Compliance.
That should be easy, but when agencies don’t share data or applications, they don’t have a unified view of people. As such, managers at different agencies need to sort through multiple systems to make sure these documents are delivered correctly—even though they all apply to the same individuals.” Modern data architectures.
In the realm of big data, securing data on cloud applications is crucial. This post explores the deployment of Apache Ranger for permission management within the Hadoop ecosystem on Amazon EKS. The following diagram illustrates the solution architecture.
While traditional extract, transform, and load (ETL) processes have long been a staple of data integration due to their flexibility, for common use cases such as replication and ingestion, they often prove time-consuming, complex, and less adaptable to the fast-changing demands of modern data architectures. What is zero-ETL?
They understand that a one-size-fits-all approach no longer works, and recognize the value in adopting scalable, flexible tools and open data formats to support interoperability in a modern data architecture to accelerate the delivery of new solutions.
Enterprises are trying to manage data chaos. They also face increasing regulatory pressure because of global data regulations, such as the European Union’s General Data Protection Regulation (GDPR) and the new California Consumer Privacy Act (CCPA), that went into effect last week on Jan.
This solution only replicates metadata in the Data Catalog, not the actual underlying data. To have a redundant data lake using Lake Formation and AWS Glue in an additional Region, we recommend replicating the Amazon S3-based storage using S3 replication, S3 sync, aws-s3-copy-sync-using-batch or S3 Batch replication process.
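To make the storage-replication side of this concrete, here is a minimal sketch of an S3 cross-Region replication configuration. The bucket names and IAM role ARN are hypothetical placeholders, not values from the post.

```python
# Sketch: an S3 replication rule for keeping a redundant copy of a data lake
# bucket in a second Region. Bucket names and role ARN are hypothetical.
replication_config = {
    "Role": "arn:aws:iam::123456789012:role/s3-replication-role",  # hypothetical
    "Rules": [
        {
            "ID": "datalake-dr-replication",
            "Status": "Enabled",
            "Priority": 1,
            "Filter": {"Prefix": ""},  # empty prefix: replicate the whole bucket
            "DeleteMarkerReplication": {"Status": "Disabled"},
            "Destination": {
                "Bucket": "arn:aws:s3:::my-datalake-replica-us-west-2",  # hypothetical
                "StorageClass": "STANDARD",
            },
        }
    ],
}

# With boto3 this configuration would be applied roughly like:
#   s3 = boto3.client("s3")
#   s3.put_bucket_replication(
#       Bucket="my-datalake-us-east-1",
#       ReplicationConfiguration=replication_config,
#   )
```

Note that S3 replication only covers objects written after the rule is enabled; existing objects need S3 Batch Replication or a one-time sync, which is why the post lists those options separately.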
They use data better. How does Spotify win against a competitor like Apple? Using machine learning and AI, Spotify creates value for their users by providing a more personalized experience.
The role of data modeling (DM) has expanded to support enterprise data management, including data governance and intelligence efforts. After all, you can’t manage or govern what you can’t see, much less use it to make smart decisions. Types of Data Models: Conceptual, Logical and Physical.
Aptly named, metadata management is the process in which BI and Analytics teams manage metadata, which is the data that describes other data. In other words, data is the content and metadata is the context. Without metadata, BI teams are unable to understand the data’s full story.
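A small illustration of the content/context distinction (not any specific product's API): the records themselves are the data, while the description of what those records mean, who owns them, and when they were updated is the metadata. All names and values below are hypothetical.

```python
# The "content": the data itself.
data = [
    {"customer_id": 101, "order_total": 42.50},
    {"customer_id": 102, "order_total": 17.99},
]

# The "context": metadata describing the data (hypothetical values).
metadata = {
    "dataset_name": "orders",          # what the data is
    "owner": "sales-analytics",        # who is responsible for it
    "columns": {
        "customer_id": "integer, foreign key to customers",
        "order_total": "decimal, USD",
    },
    "row_count": len(data),
    "last_updated": "2024-01-15",
}

# A data catalog answers questions from metadata without reading the data:
def describe(meta):
    return f"{meta['dataset_name']}: {meta['row_count']} rows, owned by {meta['owner']}"

print(describe(metadata))  # orders: 2 rows, owned by sales-analytics
```

This is why metadata is what makes data discoverable: the catalog entry alone tells a BI team whether the dataset is relevant, current, and trustworthy.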
Ask questions in plain English to find the right datasets, automatically generate SQL queries, or create data pipelines without writing code. With this launch, you can query data regardless of where it is stored with support for a wide range of use cases, including analytics, ad-hoc querying, data science, machine learning, and generative AI.
Nowadays, we no longer use the term DD/DS, but “data catalog” or simply “metadata system”. The post Metadata, the Neglected Stepchild of IT appeared first on Data Virtualization blog - Data Integration and Modern Data Management Articles, Analysis and Information. It was written by L.
Data architect role Data architects are senior visionaries who translate business requirements into technology requirements and define data standards and principles, often in support of data or digital transformations. Data architects are frequently part of a data science team and tasked with leading data system projects.
Employing Enterprise Data Management (EDM). What is enterprise data management? Companies looking to do more with data and insights need an effective EDM setup in place. The team in charge of your company’s EDM is focused on a set of processes, practices, and activities across the entire data lineage process.
Each of these trends claims to be a complete model for data architectures that solves the “everything everywhere all at once” problem. Data teams are confused as to whether they should get on the bandwagon of just one of these trends or pick a combination. First, we describe how data mesh and data fabric could be related.
Where all data – structured, semi-structured, and unstructured – is sourced, unified, and exploited in automated processes, AI tools and by highly skilled, but over-stretched, employees. Legacy data management is holding back manufacturing transformation. Until now, however, this vision has remained out of reach.
Data fabric and data mesh are emerging data management concepts that are meant to address the organizational change and complexities of understanding, governing and working with enterprise data in a hybrid multicloud ecosystem. The good news is that both data architecture concepts are complementary.
The Zurich Cyber Fusion Center management team faced similar challenges, such as balancing licensing costs for ingestion against long-term retention requirements for both business application log and security log data within the existing SIEM architecture. Previously, P2 logs were ingested into the SIEM.
Several factors determine the quality of your enterprise data, like accuracy, completeness, and consistency, to name a few. But there’s another factor of data quality that doesn’t get the recognition it deserves: your data architecture. How the right data architecture improves data quality.
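The quality dimensions named above can be made concrete with simple per-field checks. This is a minimal sketch with hypothetical records and field names, not a production data-quality framework.

```python
# Hypothetical records exhibiting common quality problems.
records = [
    {"id": 1, "email": "a@example.com", "country": "US"},
    {"id": 2, "email": None,            "country": "US"},   # incomplete
    {"id": 1, "email": "c@example.com", "country": "USA"},  # duplicate id, inconsistent code
]

def completeness(rows, field):
    """Fraction of rows where the field is present (not None)."""
    return sum(1 for r in rows if r.get(field) is not None) / len(rows)

def uniqueness(rows, field):
    """True if no two rows share a value for the field."""
    values = [r[field] for r in rows]
    return len(values) == len(set(values))

def consistency(rows, field, allowed):
    """Fraction of rows whose value comes from an agreed vocabulary."""
    return sum(1 for r in rows if r.get(field) in allowed) / len(rows)

print(completeness(records, "email"))            # 2 of 3 rows have an email
print(uniqueness(records, "id"))                 # False: id 1 appears twice
print(consistency(records, "country", {"US"}))   # "USA" breaks the convention
```

The architecture point follows from checks like these: when data flows through many disconnected copies, each copy drifts on these metrics independently, whereas a well-designed architecture gives you one place to enforce them.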
Amazon SageMaker Lakehouse provides an open data architecture that reduces data silos and unifies data across Amazon Simple Storage Service (Amazon S3) data lakes, Redshift data warehouses, and third-party and federated data sources. With AWS Glue 5.0,
The main goal of creating an enterprise data fabric is not new. It is the ability to deliver the right data at the right time, in the right shape, and to the right data consumer, irrespective of how and where it is stored. Data fabric is the common “net” that stitches integrated data from multiple data […].
Over the past decade, the successful deployment of large-scale data platforms at our customers has acted as a big data flywheel driving demand to bring in even more data, apply more sophisticated analytics, and onboard many new data practitioners, from business analysts to data scientists.
While many organizations are aware of the need to implement a formal data governance initiative, many have faced obstacles getting started. Creating a Culture of Data Governance. Team Resources: Most successful organizations have established a formal data management group at the enterprise level. Data Security.
Standards exist for naming conventions, abbreviations and other pertinent metadata properties. Consistent business meaning is important because distinctions between business terms are not typically well defined or documented. What are the standards for writing […].
The Ozone Manager is a critical component of Ozone. It is a replicated, highly-available service that is responsible for managing the metadata for all objects stored in Ozone. As Ozone scales to exabytes of data, it is important to ensure that Ozone Manager can perform at scale.
The open table format accelerates companies’ adoption of a modern data strategy because it allows them to use various tools on top of a single copy of the data. A solution based on Apache Iceberg encompasses complete datamanagement, featuring simple built-in table optimization capabilities within an existing storage solution.
Over the years, data lakes on Amazon Simple Storage Service (Amazon S3) have become the default repository for enterprise data and are a common choice for a large set of users who query data for a variety of analytics and machine learning use cases. Analytics use cases on data lakes are always evolving.
BladeBridge offers a comprehensive suite of tools that automate much of the complex conversion work, allowing organizations to quickly and reliably transition their data analytics capabilities to the scalable Amazon Redshift data warehouse. Amazon Redshift is a fully managed data warehouse service offered by Amazon Web Services (AWS).
The cause is hybrid data – the massive amounts of data created everywhere businesses operate – in clouds, on-prem, and at the edge. Only a fraction of data created is actually stored and managed, with analysts estimating it to be between 4–6 ZB in 2020. Today, we are leading the way in hybrid data.
We needed a solution to manage our data at scale, to provide greater experiences to our customers. With Cloudera Data Platform, we aim to unlock value faster and offer consistent data security and governance to meet this goal. Aqeel Ahmed Jatoi, Lead – Architecture, Governance and Control, Habib Bank Limited.
First, you must understand the existing challenges of the data team, including the dataarchitecture and end-to-end toolchain. The delays impact delivery of the reports to senior management, who are responsible for making business decisions based on the dashboard. Monitoring Job Metadata. Using Version Control.
They chose AWS Glue as their preferred data integration tool due to its serverless nature, low maintenance, ability to control compute resources in advance, and scale when needed. To share the datasets, they needed a way to share access to the data and access to catalog metadata in the form of tables and views.
This is part two of a three-part series where we show how to build a data lake on AWS using a modern data architecture. This post shows how to load data from a legacy database (SQL Server) into a transactional data lake (Apache Iceberg) using AWS Glue. datalake-formats – This sets the data format to iceberg.
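For context on the `datalake-formats` parameter mentioned above: in AWS Glue 4.0 and later, passing `--datalake-formats` as a job parameter loads the corresponding table-format libraries. The fragment below is a hedged sketch of how such a job's default arguments might look; the Spark extension setting shown is one common way to enable Iceberg's SQL support, and should be checked against the post's actual configuration.

```json
{
  "Comment": "Hypothetical AWS Glue job default arguments (not the post's exact config)",
  "DefaultArguments": {
    "--datalake-formats": "iceberg",
    "--conf": "spark.sql.extensions=org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions"
  }
}
```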