2012, Data Governance and Metadata

Becoming a machine learning company means investing in foundational technologies

O'Reilly on Data

MAY 21, 2019

Use ML to unlock new data types. Consider deep learning, a specific form of machine learning that resurfaced in 2011/2012 due to record-setting models in speech and computer vision. You also need solutions that let you understand what data you have and who can access it. Metadata and artifacts are needed for audits.

Machine Learning

Machine Learning Technology Deep Learning Data Science

Design a data mesh pattern for Amazon EMR-based data lakes using AWS Lake Formation with Hive metastore federation

AWS Big Data

JUNE 10, 2024

In this post, we delve into the key aspects of using Amazon EMR for modern data management, covering topics such as data governance, data mesh deployment, and streamlined data discovery. Organizations have multiple Hive data warehouses across EMR clusters, where the metadata gets generated.

Data Lake

Data Lake Metadata Data Warehouse Data Processing

How Volkswagen streamlined access to data across multiple data lakes using Amazon DataZone – Part 1

AWS Big Data

JULY 18, 2024

The current method is largely manual, relying on emails and general communication, which not only increases overhead but also varies from one use case to another in terms of data governance. Data domain producers publish data assets using datasource run to Amazon DataZone in the Central Governance account.

Data Lake

Data Lake Publishing Metadata Data-driven

Orchestrate an end-to-end ETL pipeline using Amazon S3, AWS Glue, and Amazon Redshift Serverless with Amazon MWAA

AWS Big Data

APRIL 25, 2024

This approach allows the team to process the raw data extracted from Account A in Account B, which is dedicated to data handling tasks. This keeps the raw and processed data securely separated across multiple accounts, if required, for enhanced data governance and security.

Metadata

Metadata Data Processing Management Testing

Seamless integration of data lake and data warehouse using Amazon Redshift Spectrum and Amazon DataZone

AWS Big Data

AUGUST 15, 2024

This streamlined architecture approach offers several advantages: Single source of truth – The Central IT team acts as the custodian of the combined and curated data from all business units, thereby providing a unified and consistent dataset. Similarly, individual business units produce their own domain-specific data.

Data Lake

Data Lake Data Warehouse Data Governance Publishing

How Novo Nordisk built distributed data governance and control at scale

AWS Big Data

APRIL 28, 2023

The first post of this series describes the overall architecture and how Novo Nordisk built a decentralized data mesh architecture, including Amazon Athena as the data query engine. The third post will show how end-users can consume data from their tool of choice, without compromising data governance.

Data Governance

Data Governance Management Data-driven Analytics

Themes and Conferences per Pacoid, Episode 8

Domino Data Lab

APRIL 3, 2019

Paco Nathan’s latest column dives into data governance. This month’s article features updates from one of the early data conferences of the year, Strata Data Conference, which was held just last week in San Francisco. In particular, here’s my Strata SF talk “Overview of Data Governance” presented in article form.

Machine Learning

Machine Learning Data Governance Metadata Data Science

10 Years Later: Who’s the GOAT of Data Catalogs?

Alation

DECEMBER 15, 2022

December 2012: Alation forms and goes to work creating the first enterprise data catalog. Later, in its inaugural report on data catalogs, Forrester Research recognizes that “Alation started the MLDC trend.” August 2017: Alation debuts as a leader in the Gartner MQ for Metadata Management Solutions.

Metadata

Metadata Data Governance Data Quality Marketing

Amazon DataZone announces custom blueprints for AWS services

AWS Big Data

JUNE 26, 2024

Administrators can customize Amazon DataZone to use existing AWS resources, enabling Amazon DataZone portal users to have federated access to those AWS services to catalog, share, and subscribe to data, thereby establishing data governance across the platform.

Data Lake

Data Lake Data Warehouse Unstructured Data Data Governance

Design a data mesh on AWS that reflects the envisioned organization

AWS Big Data

JANUARY 22, 2024

Discussions with users showed they were happier to have faster access to data in a simpler way, a more structured data organization, and a clear mapping of who the producer is. A lot of progress has been made to advance their data-driven culture (data literacy, data sharing, and collaboration across business units).

Data-driven

Data-driven Advertising Metadata Data Architecture

Themes and Conferences per Pacoid, Episode 12

Domino Data Lab

AUGUST 8, 2019

In this episode I’ll cover themes from Sci Foo and important takeaways that data science teams should be tracking. First and foremost: there’s substantial overlap between what the scientific community is working toward for scholarly infrastructure and some of the current needs of data governance in industry.

Data Science

Data Science Machine Learning Data Governance Statistics

Why We Started the Data Intelligence Project

Alation

JULY 7, 2022

In 2013 I joined American Family Insurance as a metadata analyst. I had always been fascinated by how people find, organize, and access information, so a metadata management role after school was a natural choice. The use cases for metadata are boundless, offering opportunities for innovation in every sector.

Metadata

Metadata Data-driven Insurance Statistics

Enrich your AWS Glue Data Catalog with generative AI metadata using Amazon Bedrock

AWS Big Data

NOVEMBER 15, 2024

Metadata can play a very important role in using data assets to make data-driven decisions. Generating metadata for your data assets is often a time-consuming and manual task. First, we explore the option of in-context learning, where the LLM generates the requested metadata without documentation.

Metadata

Metadata Modeling Data-driven Machine Learning

Data Science, Past & Future

Domino Data Lab

JULY 22, 2019

Data science’s emergence as an interdisciplinary field – from industry, not academia. Why data governance, in the context of machine learning, is no longer a “dry topic” and how the WSJ’s “global reckoning on data governance” is potentially connected to “premiums on leveraging data science teams for novel business cases”.

Data Science

Data Science Machine Learning Data Governance Modeling

Connect, share, and query where your data sits using Amazon SageMaker Unified Studio

AWS Big Data

MARCH 21, 2025

To strike a fine balance of democratizing data and AI access while maintaining strict compliance and regulatory standards, Amazon SageMaker Data and AI Governance is built into SageMaker Unified Studio. The table metadata is managed by Data Catalog. Data analysts discover the data and subscribe to the data.

Data Warehouse

Data Warehouse Metadata Publishing Sales
