Data Governance, Data Lake and Metadata

Data Governance

Data Lake

Metadata

Collibra Brings Effective Data Governance to Line-of-Business

David Menninger's Analyst Perspectives

SEPTEMBER 28, 2021

Collibra is a data governance software company that offers tools for metadata management and data cataloging. The software enables organizations to find data quickly, identify its source and assure its integrity.

Data Governance

Data Governance Metadata Software Management

How BMW streamlined data access using AWS Lake Formation fine-grained access control

AWS Big Data

OCTOBER 29, 2024

To achieve this, they aimed to break down data silos and centralize data from various business units and countries into the BMW Cloud Data Hub (CDH). However, the initial version of CDH supported only coarse-grained access control to entire data assets, and hence it was not possible to scope access to data asset subsets.

Data Lake

Data Lake Sales Metadata Machine Learning

Join 42,000+

Insiders

This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.

Webinars

Agent Tooling: Connecting AI to Your Tools, Systems & Data

Automation, Evolved: Your New Playbook for Smarter Knowledge Work

Data Talks, CFOs Listen: Why Analytics Are Key To Better Spend Management

Mastering Apache Airflow® 3.0: What’s New (and What’s Next) for Data Orchestration

MORE WEBINARS

Trending Sources

How EUROGATE established a data mesh architecture using Amazon DataZone

AWS Big Data

JANUARY 15, 2025

Data landscape in EUROGATE and current challenges faced in data governance The EUROGATE Group is a conglomerate of container terminals and service providers, providing container handling, intermodal transports, maintenance and repair, and seaworthy packaging services. Eliminate centralized bottlenecks and complex data pipelines.

IoT

IoT Machine Learning Metadata Data-driven

Webinars

Agent Tooling: Connecting AI to Your Tools, Systems & Data

Automation, Evolved: Your New Playbook for Smarter Knowledge Work

Data Talks, CFOs Listen: Why Analytics Are Key To Better Spend Management

Mastering Apache Airflow® 3.0: What’s New (and What’s Next) for Data Orchestration

MORE WEBINARS

Monitoring Apache Iceberg metadata layer using AWS Lambda, AWS Glue, and AWS CloudWatch

AWS Big Data

JULY 29, 2024

In the era of big data, data lakes have emerged as a cornerstone for storing vast amounts of raw data in its native format. They support structured, semi-structured, and unstructured data, offering a flexible and scalable environment for data ingestion from multiple sources.

Metadata

Metadata Snapshot Data Lake Metrics

Doing Cloud Migration and Data Governance Right the First Time

erwin

OCTOBER 8, 2020

That means your cloud data assets must be available for use by the right people for the right purposes to maximize their security, quality and value. Why You Need Cloud Data Governance. Regulatory compliance is also a major driver of data governance (e.g., GDPR, CCPA, HIPAA, SOX, PIC DSS).

Data Governance

Data Governance Metadata Testing Data Lake

Integrating Data Governance and Enterprise Architecture

erwin

SEPTEMBER 3, 2020

Why should you integrate data governance (DG) and enterprise architecture (EA)? Data governance provides time-sensitive, current-state architecture information with a high level of quality. Data governance provides time-sensitive, current-state architecture information with a high level of quality.

Data Governance

Data Governance Enterprise Risk Data Lake

Data governance in the age of generative AI

AWS Big Data

FEBRUARY 29, 2024

Data is your generative AI differentiator, and a successful generative AI implementation depends on a robust data strategy incorporating a comprehensive data governance approach. Data governance is a critical building block across all these approaches, and we see two emerging areas of focus.

Data Governance

Data Governance Unstructured Data Metadata Data Lake

Expanding data analysis and visualization options: Amazon DataZone now integrates with Tableau, Power BI, and more

AWS Big Data

OCTOBER 30, 2024

Amazon DataZone now launched authentication supports through the Amazon Athena JDBC driver, allowing data users to seamlessly query their subscribed data lake assets via popular business intelligence (BI) and analytics tools like Tableau, Power BI, Excel, SQL Workbench, DBeaver, and more.

Visualization

Visualization Data Lake Testing Data Governance

Design a data mesh pattern for Amazon EMR-based data lakes using AWS Lake Formation with Hive metastore federation

AWS Big Data

JUNE 10, 2024

In this post, we delve into the key aspects of using Amazon EMR for modern data management, covering topics such as data governance, data mesh deployment, and streamlined data discovery. Organizations have multiple Hive data warehouses across EMR clusters, where the metadata gets generated.

Data Lake

Data Lake Metadata Data Warehouse Data Processing

Seamless integration of data lake and data warehouse using Amazon Redshift Spectrum and Amazon DataZone

AWS Big Data

AUGUST 15, 2024

Unlocking the true value of data often gets impeded by siloed information. Traditional data management—wherein each business unit ingests raw data in separate data lakes or warehouses—hinders visibility and cross-functional analysis. Amazon DataZone natively supports data sharing for Amazon Redshift data assets.

Data Lake

Data Lake Data Warehouse Data Governance Publishing

How Volkswagen streamlined access to data across multiple data lakes using Amazon DataZone – Part 1

AWS Big Data

JULY 18, 2024

Over the years, organizations have invested in creating purpose-built, cloud-based data lakes that are siloed from one another. A major challenge is enabling cross-organization discovery and access to data across these multiple data lakes, each built on different technology stacks.

Data Lake

Data Lake Publishing Metadata Data-driven

Choosing an open table format for your transactional data lake on AWS

AWS Big Data

JUNE 9, 2023

A modern data architecture enables companies to ingest virtually any type of data through automated pipelines into a data lake, which provides highly durable and cost-effective object storage at petabyte or exabyte scale.

Data Lake

Data Lake Metadata Statistics Optimization

How ANZ Institutional Division built a federated data platform to enable their domain teams to build data products to support business outcomes

AWS Big Data

DECEMBER 4, 2024

Under the federated mesh architecture, each divisional mesh functions as a node within the broader enterprise data mesh, maintaining a degree of autonomy in managing its data products. These nodes can implement analytical platforms like data lake houses, data warehouses, or data marts, all united by producing data products.

Metadata

Metadata Data Governance Data Quality Data-driven

Data Governance Makes Data Security Less Scary

erwin

OCTOBER 31, 2019

The Regulatory Rationale for Integrating Data Management & Data Governance. Now, as Cybersecurity Awareness Month comes to a close – and ghosts and goblins roam the streets – we thought it a good time to resurrect some guidance on how data governance can make data security less scary.

Data Governance

Data Governance Metadata Risk Data Lake

HEMA accelerates their data governance journey with Amazon DataZone

AWS Big Data

DECEMBER 19, 2024

Initially, the data inventories of different services were siloed within isolated environments, making data discovery and sharing across services manual and time-consuming for all teams involved. Implementing robust data governance is challenging. The following figure illustrates the data mesh architecture.

Data Governance

Data Governance Publishing Data-driven Metadata

How Data Governance Protects Sensitive Data

erwin

APRIL 2, 2021

How can companies protect their enterprise data assets, while also ensuring their availability to stewards and consumers while minimizing costs and meeting data privacy requirements? Data Security Starts with Data Governance. Lack of a solid data governance foundation increases the risk of data-security incidents.

Data Governance

Data Governance Cost-Benefit Metadata Risk

Build a serverless transactional data lake with Apache Iceberg, Amazon EMR Serverless, and Amazon Athena

AWS Big Data

MARCH 10, 2023

Since the deluge of big data over a decade ago, many organizations have learned to build applications to process and analyze petabytes of data. Data lakes have served as a central repository to store structured and unstructured data at any scale and in various formats.

Data Lake

Data Lake Sales Data Warehouse Snapshot

Don’t Fear Artificial Intelligence; Embrace it Through Data Governance

CIO Business Intelligence

APRIL 29, 2022

This would be straightforward task were it not for the fact that, during the digital-era, there has been an explosion of data – collected and stored everywhere – much of it poorly governed, ill-understood, and irrelevant. Further, data management activities don’t end once the AI model has been developed.

Data Governance

Data Governance IT Data Lake Risk

Cloudera and Snowflake Partner to Deliver the Most Comprehensive Open Data Lakehouse

Cloudera

OCTOBER 23, 2024

In August, we wrote about how in a future where distributed data architectures are inevitable, unifying and managing operational and business metadata is critical to successfully maximizing the value of data, analytics, and AI.

Metadata

Metadata Data Lake Dashboards Interactive

AWS Lake Formation 2023 year in review

AWS Big Data

JANUARY 18, 2024

AWS Lake Formation and the AWS Glue Data Catalog form an integral part of a data governance solution for data lakes built on Amazon Simple Storage Service (Amazon S3) with multiple AWS analytics services integrating with them. In 2022 , we talked about the enhancements we had done to these services.

Data Lake

Data Lake Metadata Data Governance Statistics

Top analytics announcements of AWS re:Invent 2024

AWS Big Data

FEBRUARY 26, 2025

Analytics remained one of the key focus areas this year, with significant updates and innovations aimed at helping businesses harness their data more efficiently and accelerate insights. From enhancing data lakes to empowering AI-driven analytics, AWS unveiled new tools and services that are set to shape the future of data and analytics.

Analytics

Analytics Data Lake Metadata Data Warehouse

Pillars of Knowledge, Best Practices for Data Governance

Cloudera

AUGUST 4, 2021

And if data security tops IT concerns, data governance should be their second priority. Not only is it critical to protect data, but data governance is also the foundation for data-driven businesses and maximizing value from data analytics. But it’s still not easy. But it’s still not easy.

Data Governance

Data Governance Metadata Data-driven Enterprise

Data Lakes: What Are They and Who Needs Them?

Jet Global

JULY 2, 2019

To address the flood of data and the needs of enterprise businesses to store, sort, and analyze that data, a new storage solution has evolved: the data lake. What’s in a Data Lake? Data warehouses do a great job of standardizing data from disparate sources for analysis. Taking a Dip.

Data Lake

Data Lake Data Warehouse Big Data Machine Learning

Demystify data sharing and collaboration patterns on AWS: Choosing the right tool for the job

AWS Big Data

OCTOBER 21, 2024

However, enterprises often encounter challenges with data silos, insufficient access controls, poor governance, and quality issues. Embracing data as a product is the key to address these challenges and foster a data-driven culture. Amazon Athena is used to query, and explore the data.

Sales

Sales Data-driven Data Processing Key Performance Indicator

Implement tag-based access control for your data lake and Amazon Redshift data sharing with AWS Lake Formation

AWS Big Data

JULY 21, 2023

Data-driven organizations treat data as an asset and use it across different lines of business (LOBs) to drive timely insights and better business decisions. This leads to having data across many instances of data warehouses and data lakes using a modern data architecture in separate AWS accounts.

Data Lake

Data Lake Data Warehouse Marketing Management

How to modernize data lakes with a data lakehouse architecture

IBM Big Data Hub

JULY 5, 2023

Data Lakes have been around for well over a decade now, supporting the analytic operations of some of the largest world corporations. Such data volumes are not easy to move, migrate or modernize. The challenges of a monolithic data lake architecture Data lakes are, at a high level, single repositories of data at scale.

Data Lake

Data Lake Metadata Cost-Benefit Data Warehouse

Read and write S3 Iceberg table using AWS Glue Iceberg Rest Catalog from Open Source Apache Spark

AWS Big Data

DECEMBER 4, 2024

In today’s data-driven world , organizations are constantly seeking efficient ways to process and analyze vast amounts of information across data lakes and warehouses. This post will showcase how this data can also be queried by other data teams using Amazon Athena. Verify that you have Python version 3.7

Data Lake

Data Lake Metadata Insurance Data-driven

Constructing A Digital Transformation Strategy: Putting the Data in Digital Transformation

erwin

JULY 17, 2019

Your organization won’t be able to take complete advantage of analytics tools to become data-driven unless you establish a foundation for agile and complete data management. You need automated data mapping and cataloging through the integration lifecycle process, inclusive of data at rest and data in motion.

Digital Transformation

Digital Transformation Strategy Metadata Data-driven

Denodo Provides a Logical Approach to Data Management

David Menninger's Analyst Perspectives

OCTOBER 24, 2024

Data silos are a perennial data management problem for enterprises, with almost three-quarters (73%) of participants in ISG Research’s Data Governance Benchmark Research citing disparate data sources and systems as a data governance challenge.

Management

Management Data-driven Data Governance Data Lake

Governing data in relational databases using Amazon DataZone

AWS Big Data

MAY 7, 2024

Data governance is a key enabler for teams adopting a data-driven culture and operational model to drive innovation with data. Amazon DataZone allows you to simply and securely govern end-to-end data assets stored in your Amazon Redshift data warehouses or data lakes cataloged with the AWS Glue data catalog.

Metadata

Metadata Data Lake Data Processing Data-driven

Introducing AWS Glue crawler and create table support for Apache Iceberg format

AWS Big Data

AUGUST 16, 2023

Iceberg has become very popular for its support for ACID transactions in data lakes and features like schema and partition evolution, time travel, and rollback. Iceberg captures metadata information on the state of datasets as they evolve and change over time. Choose Create.

Data Lake

Data Lake Metadata Snapshot Analytics

Build a multi-Region and highly resilient modern data architecture using AWS Glue and AWS Lake Formation

AWS Big Data

JANUARY 24, 2023

AWS Lake Formation helps with enterprise data governance and is important for a data mesh architecture. It works with the AWS Glue Data Catalog to enforce data access and governance. This solution only replicates metadata in the Data Catalog, not the actual underlying data.

Data Architecture

Data Architecture Metadata Data Lake Snapshot

Driving Business Value and ROI from a Hybrid Cloud Data Lake

Alation

FEBRUARY 20, 2020

For many enterprises, a hybrid cloud data lake is no longer a trend, but becoming reality. With an on-premise deployment, enterprises have full control over data security, data access, and data governance. Data that needs to be tightly controlled (e.g. The Problem with Hybrid Cloud Environments.

Data Lake

Data Lake ROI Metadata Cost-Benefit

AWS Lake Formation 2022 year in review

AWS Big Data

JANUARY 31, 2023

Data governance is the collection of policies, processes, and systems that organizations use to ensure the quality and appropriate handling of their data throughout its lifecycle for the purpose of generating business value.

Data Lake

Data Lake Data Governance Data Architecture Machine Learning

Unlock data across organizational boundaries using Amazon DataZone – now generally available

AWS Big Data

OCTOBER 4, 2023

Collaboration – Analysts, data scientists, and data engineers often own different steps within the end-to-end analytics journey but do not have an simple way to collaborate on the same governed data, using the tools of their choice.

Metadata

Metadata Data Lake Publishing Data Governance

Where Do Data Catalogs Fit in Metadata Management?

Alation

FEBRUARY 13, 2020

In an earlier blog, I defined a data catalog as “a collection of metadata, combined with data management and search tools, that helps analysts and other data users to find the data that they need, serves as an inventory of available data, and provides information to evaluate fitness data for intended uses.”.

Metadata

Metadata Management Data Lake Data Governance

erwin, Microsoft and the Power of the Common Data Model

erwin

DECEMBER 17, 2020

Insights: Given the meaning of the data is the same, regardless of the domain it came from, an organization can use its data to power business insights. Compliance: It improves data governance to comply with such regulations as the General Data Protection Regulation (GDPR).

Modeling

Modeling Metadata Data-driven Data Lake

How gaming companies can use Amazon Redshift Serverless to build scalable analytical applications faster and easier

AWS Big Data

MARCH 7, 2023

A data hub is a center of data exchange that constitutes a hub of data repositories and is supported by data engineering, data governance, security, and monitoring services. A data hub contains data at multiple levels of granularity and is often not integrated.

Analytics

Analytics Data Warehouse Data Lake Metadata

Access Amazon Redshift data from Salesforce Data Cloud with Zero Copy Data Federation

AWS Big Data

JUNE 25, 2024

This post is co-authored by Vijay Gopalakrishnan, Director of Product, Salesforce Data Cloud. In today’s data-driven business landscape, organizations collect a wealth of data across various touch points and unify it in a central data warehouse or a data lake to deliver business insights.

Data Lake

Data Lake Cost-Benefit Data-driven Data Warehouse

Cross-account data collaboration with Amazon DataZone and AWS analytical tools

AWS Big Data

MARCH 5, 2025

Quick setup enables two default blueprints and creates the default environment profiles for the data lake and data warehouse default blueprints. You will then publish the data assets from these data sources. Add an AWS Glue data source to publish the new AWS Glue table. Review and choose Create.

Analytics

Analytics Publishing Metadata Sales

Amazon DataZone announces custom blueprints for AWS services

AWS Big Data

JUNE 26, 2024

New feature: Custom AWS service blueprints Previously, Amazon DataZone provided default blueprints that created AWS resources required for data lake, data warehouse, and machine learning use cases. You can build projects and subscribe to both unstructured and structured data assets within the Amazon DataZone portal.

Data Lake

Data Lake Data Warehouse Unstructured Data Data Governance

What is a data architect? Skills, salaries, and how to become a data framework master

CIO Business Intelligence

OCTOBER 13, 2023

Solutions data architect: These individuals design and implement data solutions for specific business needs, including data warehouses, data marts, and data lakes. Application data architect: The application data architect designs and implements data models for specific software applications.

Data Architecture

Data Architecture Data Warehouse Statistics Visualization

Data Governance for Dummies: Your Questions, Answered

Alation

FEBRUARY 17, 2023

This past week, I had the pleasure of hosting Data Governance for Dummies author Jonathan Reichental for a fireside chat , along with Denise Swanson , Data Governance lead at Alation. Can you have proper data management without establishing a formal data governance program?

Data Governance

Data Governance Data Quality Metadata Cost-Benefit

6 BI challenges IT teams must address

CIO Business Intelligence

DECEMBER 21, 2022

Jim Hare, distinguished VP and analyst at Gartner, says that some people think they need to take all the data siloed in systems in various business units and dump it into a data lake. But what they really need to do is fundamentally rethink how data is managed and accessed,” he says.

IT Business Intelligence Sales Key Performance Indicator

Collibra Brings Effective Data Governance to Line-of-Business

How BMW streamlined data access using AWS Lake Formation fine-grained access control

Webinars

Trending Sources

How EUROGATE established a data mesh architecture using Amazon DataZone

Webinars

Monitoring Apache Iceberg metadata layer using AWS Lambda, AWS Glue, and AWS CloudWatch

Doing Cloud Migration and Data Governance Right the First Time

Integrating Data Governance and Enterprise Architecture

Data governance in the age of generative AI

Expanding data analysis and visualization options: Amazon DataZone now integrates with Tableau, Power BI, and more

Design a data mesh pattern for Amazon EMR-based data lakes using AWS Lake Formation with Hive metastore federation

Seamless integration of data lake and data warehouse using Amazon Redshift Spectrum and Amazon DataZone

How Volkswagen streamlined access to data across multiple data lakes using Amazon DataZone – Part 1

Choosing an open table format for your transactional data lake on AWS

How ANZ Institutional Division built a federated data platform to enable their domain teams to build data products to support business outcomes

Data Governance Makes Data Security Less Scary

HEMA accelerates their data governance journey with Amazon DataZone

How Data Governance Protects Sensitive Data

Build a serverless transactional data lake with Apache Iceberg, Amazon EMR Serverless, and Amazon Athena

Don’t Fear Artificial Intelligence; Embrace it Through Data Governance

Cloudera and Snowflake Partner to Deliver the Most Comprehensive Open Data Lakehouse

AWS Lake Formation 2023 year in review

Top analytics announcements of AWS re:Invent 2024

Pillars of Knowledge, Best Practices for Data Governance

Data Lakes: What Are They and Who Needs Them?

Demystify data sharing and collaboration patterns on AWS: Choosing the right tool for the job

Implement tag-based access control for your data lake and Amazon Redshift data sharing with AWS Lake Formation

How to modernize data lakes with a data lakehouse architecture

Read and write S3 Iceberg table using AWS Glue Iceberg Rest Catalog from Open Source Apache Spark

Constructing A Digital Transformation Strategy: Putting the Data in Digital Transformation

Denodo Provides a Logical Approach to Data Management

Governing data in relational databases using Amazon DataZone

Introducing AWS Glue crawler and create table support for Apache Iceberg format

Build a multi-Region and highly resilient modern data architecture using AWS Glue and AWS Lake Formation

Driving Business Value and ROI from a Hybrid Cloud Data Lake

AWS Lake Formation 2022 year in review

Unlock data across organizational boundaries using Amazon DataZone – now generally available

Where Do Data Catalogs Fit in Metadata Management?

erwin, Microsoft and the Power of the Common Data Model

How gaming companies can use Amazon Redshift Serverless to build scalable analytical applications faster and easier

Access Amazon Redshift data from Salesforce Data Cloud with Zero Copy Data Federation

Cross-account data collaboration with Amazon DataZone and AWS analytical tools

Amazon DataZone announces custom blueprints for AWS services

What is a data architect? Skills, salaries, and how to become a data framework master

Data Governance for Dummies: Your Questions, Answered

6 BI challenges IT teams must address

Stay Connected