2012, Data Governance and Data Lake

Seamless integration of data lake and data warehouse using Amazon Redshift Spectrum and Amazon DataZone

AWS Big Data

AUGUST 15, 2024

Siloed information often keeps organizations from unlocking the true value of their data. Traditional data management, in which each business unit ingests raw data into separate data lakes or warehouses, hinders visibility and cross-functional analysis. Amazon DataZone natively supports data sharing for Amazon Redshift data assets.

Data Lake, Data Warehouse, Data Governance, Publishing

Design a data mesh pattern for Amazon EMR-based data lakes using AWS Lake Formation with Hive metastore federation

AWS Big Data

JUNE 10, 2024

In this post, we delve into the key aspects of using Amazon EMR for modern data management, covering topics such as data governance, data mesh deployment, and streamlined data discovery. Organizations have multiple Hive data warehouses across EMR clusters, where the metadata gets generated.

Data Lake, Metadata, Data Warehouse, Data Processing

How Volkswagen streamlined access to data across multiple data lakes using Amazon DataZone – Part 1

AWS Big Data

JULY 18, 2024

Over the years, organizations have invested in creating purpose-built, cloud-based data lakes that are siloed from one another. A major challenge is enabling cross-organization discovery and access to data across these multiple data lakes, each built on different technology stacks.

Data Lake, Publishing, Metadata, Data-driven

Apply enterprise data governance and management using AWS Lake Formation and AWS IAM Identity Center

AWS Big Data

SEPTEMBER 26, 2024

In this blog post, there are three personas: Data Lake Administrator (with admin-level access), User Silver from the Data Engineering group, and User Lead Auditor from the Auditor group. You will see how different personas in an organization can access the data without the need to modify their existing enterprise entitlements.

Data Governance, Enterprise, Management, Data Lake

Amazon DataZone announces custom blueprints for AWS services

AWS Big Data

JUNE 26, 2024

New feature: custom AWS service blueprints. Previously, Amazon DataZone provided default blueprints that created AWS resources required for data lake, data warehouse, and machine learning use cases. You can build projects and subscribe to both unstructured and structured data assets within the Amazon DataZone portal.

Data Lake, Data Warehouse, Unstructured Data, Data Governance

How Novo Nordisk built distributed data governance and control at scale

AWS Big Data

APRIL 28, 2023

The first post of this series describes the overall architecture and how Novo Nordisk built a decentralized data mesh architecture, including Amazon Athena as the data query engine. The third post will show how end-users can consume data from their tool of choice, without compromising data governance.

Data Governance, Management, Data-driven, Analytics

Simplify external object access in Amazon Redshift using automatic mounting of the AWS Glue Data Catalog

AWS Big Data

JULY 28, 2023

Amazon Redshift now makes it easier for you to run queries in AWS data lakes by automatically mounting the AWS Glue Data Catalog. You no longer have to create an external schema in Amazon Redshift to use the data lake tables cataloged in the Data Catalog. Additional changes to the IAM policy are required.
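As a rough illustration of what querying without an external schema can look like, the sketch below builds a three-part table reference against the auto-mounted catalog, which Redshift exposes under the name `awsdatacatalog`. The helper names `catalog_table` and `preview_query`, and the database and table names `sales_db` and `orders`, are hypothetical and not taken from the post. The identifier check is deliberately strict and would reject Glue names that need quoting.

```python
import re

# Accept only plain identifiers; names that need quoting (for example,
# Glue databases containing hyphens) are out of scope for this sketch.
_IDENT = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def catalog_table(database: str, table: str, catalog: str = "awsdatacatalog") -> str:
    """Return a three-part name: catalog.database.table."""
    for part in (catalog, database, table):
        if not _IDENT.match(part):
            raise ValueError(f"unsupported identifier: {part!r}")
    return f"{catalog}.{database}.{table}"


def preview_query(database: str, table: str, limit: int = 10) -> str:
    """Build a small SELECT against a table in the mounted Data Catalog."""
    if limit < 1:
        raise ValueError("limit must be a positive integer")
    return f"SELECT * FROM {catalog_table(database, table)} LIMIT {int(limit)};"


# Hypothetical database and table names, for illustration only.
print(preview_query("sales_db", "orders", limit=5))
```

The resulting statement would be run from a Redshift SQL client by a user whose IAM role already has the required Data Catalog permissions.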

Data Lake, Data Governance, Data Warehouse, Data-driven

Handle UPSERT data operations using open-source Delta Lake and AWS Glue

AWS Big Data

JANUARY 30, 2023

Many customers need an ACID transaction (atomic, consistent, isolated, durable) data lake that can log change data capture (CDC) from operational data sources. There is also demand for merging real-time data into batch data. The Delta Lake framework provides these two capabilities.
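To make the upsert idea concrete, here is a minimal pure-Python sketch of merging CDC records into a batch snapshot. It only illustrates the semantics (update on key match, insert otherwise, drop on delete). It does not use the Delta Lake or AWS Glue APIs, and the function name `apply_cdc`, the `op` field, and its `I`/`U`/`D` codes are assumptions made for this example.

```python
from typing import Any, Dict, Iterable, List

Row = Dict[str, Any]


def apply_cdc(batch: Iterable[Row], changes: Iterable[Row], key: str = "id") -> List[Row]:
    """Apply CDC records to a batch snapshot and return the merged rows.

    Each change carries an assumed "op" field: "I" (insert), "U" (update)
    or "D" (delete). Changes are applied in the order given.
    """
    # Copy rows so the caller's batch is left untouched.
    merged: Dict[Any, Row] = {row[key]: dict(row) for row in batch}
    for change in changes:
        op = change.get("op", "U")
        record = {k: v for k, v in change.items() if k != "op"}
        if op == "D":
            merged.pop(record[key], None)
        else:
            # Upsert: overwrite matching columns, or insert a new row.
            merged[record[key]] = {**merged.get(record[key], {}), **record}
    return sorted(merged.values(), key=lambda r: r[key])


batch = [{"id": 1, "name": "a"}, {"id": 2, "name": "b"}]
changes = [
    {"op": "U", "id": 2, "name": "B"},
    {"op": "I", "id": 3, "name": "c"},
    {"op": "D", "id": 1},
]
print(apply_cdc(batch, changes))
# [{'id': 2, 'name': 'B'}, {'id': 3, 'name': 'c'}]
```

A transactional table format performs the equivalent merge atomically on storage, which this in-memory version does not attempt.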

Insurance, Data Lake, Data-driven, Analytics

Themes and Conferences per Pacoid, Episode 8

Domino Data Lab

APRIL 3, 2019

Paco Nathan’s latest column dives into data governance. This month’s article features updates from one of the early data conferences of the year, Strata Data Conference, which was held just last week in San Francisco. In particular, here’s my Strata SF talk “Overview of Data Governance” presented in article form.

Machine Learning, Data Governance, Metadata, Data Science

Design a data mesh on AWS that reflects the envisioned organization

AWS Big Data

JANUARY 22, 2024

Discussions with users showed they were happier to have faster access to data in a simpler way, a more structured data organization, and a clear mapping of who the producer is. A lot of progress has been made to advance their data-driven culture (data literacy, data sharing, and collaboration across business units).

Data-driven, Advertising, Metadata, Data Architecture

Themes and Conferences per Pacoid, Episode 12

Domino Data Lab

AUGUST 8, 2019

In this episode I’ll cover themes from Sci Foo and important takeaways that data science teams should be tracking. First and foremost: there’s substantial overlap between what the scientific community is working toward for scholarly infrastructure and some of the current needs of data governance in industry.

Data Science, Machine Learning, Data Governance, Statistics

Why We Started the Data Intelligence Project

Alation

JULY 7, 2022

To answer these questions we need to look at how data roles within the job market have evolved, and how academic programs have changed to meet new workforce demands. In the 2010s, the growing scope of the data landscape gave rise to a new profession: the data scientist.

Metadata, Data-driven, Insurance, Statistics

Simplify AWS Glue job orchestration and monitoring with Amazon MWAA

AWS Big Data

MAY 19, 2023

Organizations across all industries have complex data processing requirements for their analytical use cases across different analytics systems, such as data lakes on AWS, data warehouses (Amazon Redshift), search (Amazon OpenSearch Service), NoSQL (Amazon DynamoDB), machine learning (Amazon SageMaker), and more.

Machine Learning, Metrics, Big Data, Management

Configure cross-account access of Amazon SageMaker Lakehouse multi-catalog tables using AWS Glue 5.0 Spark

AWS Big Data

MAY 9, 2025

Many organizations build and operate enterprise-wide data mesh architectures using the AWS Glue Data Catalog and AWS Lake Formation for their Amazon Simple Storage Service (Amazon S3) based data lakes. AWS Glue is a serverless service that makes data integration simpler, faster, and cheaper.

Data Lake, Data Warehouse, Marketing, Management

Data Leaders Brief
