They don’t have the resources they need to clean up data quality problems. The building blocks of data governance are often lacking within organizations. These include the basics, such as metadata creation and management, data provenance, data lineage, and other essentials. And that’s just the beginning.
Initially, the data inventories of different services were siloed within isolated environments, making data discovery and sharing across services manual and time-consuming for all teams involved. Implementing robust data governance is challenging.
It addresses many of the shortcomings of traditional data lakes by providing features such as ACID transactions, schema evolution, row-level updates and deletes, and time travel. In this blog post, we’ll discuss how the metadata layer of Apache Iceberg can be used to make data lakes more efficient.
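As a rough sketch of what that metadata layer enables, here is a minimal PySpark example. It assumes a Spark session already configured with the Iceberg runtime and a catalog named `demo` holding a table `db.events`; those names and the timestamp are illustrative, not details from the excerpt.

```python
# Minimal sketch of Iceberg row-level operations, time travel, and metadata
# tables. Assumes the Iceberg runtime jar and a catalog named "demo" are
# already configured on the Spark session (assumptions for illustration).
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("iceberg-demo").getOrCreate()

# Row-level deletes are plain SQL against an Iceberg table.
spark.sql("DELETE FROM demo.db.events WHERE event_type = 'test'")

# Time travel: query the table as of an earlier point in time.
spark.sql(
    "SELECT * FROM demo.db.events TIMESTAMP AS OF '2024-01-01 00:00:00'"
).show()

# The metadata layer is itself queryable: list snapshots and data files.
spark.sql("SELECT snapshot_id, committed_at FROM demo.db.events.snapshots").show()
spark.sql("SELECT file_path, record_count FROM demo.db.events.files").show()
```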
Whether it’s controlling for common risk factors—bias in model development, missing or poorly conditioned data, the tendency of models to degrade in production—or instantiating formal processes to promote data governance, adopters will have their work cut out for them as they work to establish reliable AI production lines.
Prashant Parikh, erwin’s Senior Vice President of Software Engineering, talks about erwin’s vision to automate every aspect of the data governance journey to increase speed to insights. Although AI and ML are massive fields with tremendous value, erwin’s approach to data governance automation is much broader.
Generally available on May 24, Alation’s Open Data Quality Initiative for the modern data stack gives customers the freedom to choose the data quality vendor that’s best for them, with the added confidence that those tools will integrate seamlessly with Alation’s Data Catalog and Data Governance application.
You also need solutions that let you understand what data you have and who can access it. About a third of the respondents in the survey indicated they are interested in data governance systems and data catalogs. Metadata and artifacts needed for audits. Marquez (WeWork) and Databook (Uber). Source: O'Reilly.
AWS Lake Formation and the AWS Glue Data Catalog form an integral part of a data governance solution for data lakes built on Amazon Simple Storage Service (Amazon S3), with multiple AWS analytics services integrating with them. DataZone automatically manages the permissions of your shared data in the DataZone projects.
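As a hedged illustration of how such permissions can be managed programmatically, here is a minimal boto3 sketch; the role ARN, database name, and table name are placeholders, not values from the excerpt.

```python
# Sketch: granting table-level permissions with Lake Formation via boto3.
# The account ID, role, database, and table names below are placeholders.
import boto3

lf = boto3.client("lakeformation", region_name="us-east-1")

lf.grant_permissions(
    Principal={
        "DataLakePrincipalIdentifier": "arn:aws:iam::123456789012:role/AnalystRole"
    },
    Resource={
        "Table": {
            "DatabaseName": "sales_db",
            "Name": "orders",
        }
    },
    # Read-only access: query the table and view its schema in the catalog.
    Permissions=["SELECT", "DESCRIBE"],
)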
S3 Tables integration with the AWS Glue Data Catalog is in preview, allowing you to stream, query, and visualize data, including Amazon S3 Metadata tables, using AWS analytics services such as Amazon Data Firehose, Amazon Athena, Amazon Redshift, Amazon EMR, and Amazon QuickSight. With AWS Glue 5.0,
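For instance, a query against a Glue-cataloged table could be submitted to Athena from Python roughly like this; the database, table, and results bucket are illustrative placeholders.

```python
# Sketch: running an Athena query against a Glue-cataloged table with boto3.
# Database, table, and output bucket names are placeholders.
import boto3

athena = boto3.client("athena", region_name="us-east-1")

resp = athena.start_query_execution(
    QueryString="SELECT event_type, COUNT(*) FROM events GROUP BY event_type",
    QueryExecutionContext={"Database": "analytics_db"},
    ResultConfiguration={"OutputLocation": "s3://my-athena-results/"},
)
# Athena runs asynchronously: poll get_query_execution with this ID
# until the query reaches a terminal state, then fetch results.
print(resp["QueryExecutionId"])
```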
Metadata management performs a critical role within the modern data management stack. It helps break down data silos and empowers data and analytics teams to better understand the context and quality of data. This, in turn, builds trust in the data and in the decisions that follow. Improve data discovery.
The CEO also makes decisions based on performance and growth statistics. An understanding of the data’s origins and history helps answer questions about the origin of data in Key Performance Indicator (KPI) reports, including: How are the report tables and columns defined in the metadata? Who are the data owners?
Application data architect: The application data architect designs and implements data models for specific software applications. Information/data governance architect: These individuals establish and enforce data governance policies and procedures. Are data architects in demand?
Metadata enrichment is about scaling the onboarding of new data into a governed data landscape by taking data and applying the appropriate business terms, data classes and quality assessments so it can be discovered, governed and utilized effectively. Scalability and elasticity.
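As a toy illustration of the idea, and not any particular platform's implementation, a rule-based enrichment pass might assign data classes to columns by matching sampled values against patterns:

```python
# Illustrative sketch: assign a data class to a column by testing sampled
# values against regex rules, standing in for the automated classification
# a governance platform would perform at scale. Rules here are simplistic.
import re

DATA_CLASS_RULES = {
    "email": re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$"),
    "us_phone": re.compile(r"^\(?\d{3}\)?[-. ]?\d{3}[-. ]?\d{4}$"),
}

def classify_column(sample_values, threshold=0.8):
    """Return the first data class matching at least `threshold` of samples."""
    for data_class, pattern in DATA_CLASS_RULES.items():
        hits = sum(1 for v in sample_values if pattern.match(str(v)))
        if sample_values and hits / len(sample_values) >= threshold:
            return data_class
    return "unclassified"

print(classify_column(["a@b.com", "c@d.org", "e@f.net"]))  # -> "email"
print(classify_column(["(555) 123-4567", "555-987-6543"]))  # -> "us_phone"
```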
Whether you deal in customer contact information, website traffic statistics, sales data, or some other type of valuable information, you’ll need to put a framework of policies in place to manage your data seamlessly. Let’s take a closer look at what data governance is — and the top five mistakes to avoid when implementing it.
Data in customers’ data lakes is used to fulfil a multitude of use cases, from real-time fraud detection for financial services companies to inventory and real-time marketing campaigns for retailers and flight and hotel room availability for the hospitality industry.
This person (or group of individuals) ensures that the theory behind data quality is communicated to the development team. 2 – Data profiling. Data profiling is an essential process in the DQM lifecycle. Data monitoring and visualization: To be able to assess the quality of the data, it is necessary to monitor it closely.
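A minimal profiling pass in pandas, offered as an illustrative sketch rather than any vendor's DQM tooling, might compute per-column null rates, distinct counts, and basic aggregates that feed a monitoring dashboard; the input file name is a placeholder:

```python
# Sketch of a basic data profiling pass with pandas. "customers.csv" is a
# placeholder input; any tabular dataset would do.
import pandas as pd

df = pd.read_csv("customers.csv")

# One row per column: data type, share of missing values, distinct values.
profile = pd.DataFrame({
    "dtype": df.dtypes.astype(str),
    "null_rate": df.isna().mean(),
    "distinct": df.nunique(),
})
print(profile)

# Standard aggregates (count, mean, min/max, top values) for monitoring.
print(df.describe(include="all"))
```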
Work out your organization’s list of needs, sort them in order of priority, and use that as the basis for evaluating data catalog candidates. Manually updating the catalog every time a data asset changes is a Sisyphean task, and you’d require an army of Sisyphuses to even attempt it.
SDX enhancements for improved platform and data governance, including the following notable features: Atlas/Kafka integration provides metadata collection for Kafka producers/consumers so that consumers can manage, govern, and monitor Kafka metadata and metadata lineage in the Atlas UI.
Understanding that the future of banking is data-driven and cloud-based, Bank of the West embraced cloud computing and its benefits, like remote capabilities, integrated processes, and flexible systems. The platform centralizes data, data management, and governance, and builds custom controls for data ingestion into the system.
– Visualizing your data landscape: By slicing and dicing the data landscape in different ways, what connections, relationships, and outliers can be found?
– Analyzing the data: Using statistical methods, what insights can be gained by summarizing the data? What hidden trends can be identified?
Organizations have spent a lot of time and money trying to harmonize data across diverse platforms, including cleansing, uploading metadata, converting code, defining business glossaries, tracking data transformations and so on. If you want more control over and more value from all your data, join us for a demo of erwin MM.
Data observability makes upstream data quality checks possible. Data Governance. Ensuring data quality is critical for data governance initiatives. IBM’s holistic approach to Data Quality.
To help companies avoid that pitfall, IBM has recently announced the acquisition of Databand.ai, a leading provider of data observability solutions. The data observability difference: it starts at the data source, collecting data pipeline metadata across key solutions in the modern data stack like Airflow, dbt, Databricks and many more.
On the contrary, data profiling today describes an automated process, where a data user can “point and click” to return key results on a given asset, like aggregate functions, top patterns, outliers, inferred data types, and more. In summary, data profiling is a critical component of a comprehensive data governance strategy.
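For example, two of those results, inferred data types and IQR-based outliers, can be computed by hand in pandas; the column name and values here are purely illustrative:

```python
# Sketch: inferred data types and IQR-based outlier detection in pandas.
# The DataFrame below is toy data for illustration.
import pandas as pd

df = pd.DataFrame({"amount": [10, 12, 11, 13, 500, 9, 14]})

# pandas' best-guess nullable data types for each column.
inferred = df.convert_dtypes()
print(inferred.dtypes)

# Flag values outside the standard 1.5 * IQR fences as outliers.
q1, q3 = df["amount"].quantile([0.25, 0.75])
iqr = q3 - q1
outliers = df[(df["amount"] < q1 - 1.5 * iqr) | (df["amount"] > q3 + 1.5 * iqr)]
print(outliers)  # flags the 500
```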
By definition, a data intelligence platform must serve a wide variety of user types and use cases – empowering them to collaborate in one shared space. The problem Data Intelligence Platforms solve. Why is a data intelligence platform needed in the first place? Get the new IDC MarketScape for Data Catalogs to learn more.
In this episode I’ll cover themes from Sci Foo and important takeaways that data science teams should be tracking. First and foremost: there’s substantial overlap between what the scientific community is working toward for scholarly infrastructure and some of the current needs of data governance in industry. “We did it again.”
In part one of this series, I discussed how data management challenges have evolved and the role data governance and security play in meeting them, with an eye to cloud migration and drift over time. All machine learning uses “algorithms,” many of which are no different from those used by statisticians and data scientists.
High variance in a model may indicate that the model works well on training data but is inadequate for real-world industry use cases. Limited data scope and non-representative answers: When data sources are restrictive, homogeneous or contain mistaken duplicates, statistical errors like sampling bias can skew all results.
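One hedged way to surface that kind of variance is to compare training accuracy against cross-validated accuracy: a large gap suggests the model memorizes its training data. The dataset and model below are illustrative choices, not from the excerpt.

```python
# Sketch: detecting high variance (overfitting) by comparing training
# accuracy with 5-fold cross-validated accuracy. An unpruned decision tree
# is used deliberately because it tends to memorize training data.
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)
model = DecisionTreeClassifier(random_state=0)

cv_scores = cross_val_score(model, X, y, cv=5)  # held-out performance
model.fit(X, y)
train_score = model.score(X, y)                 # in-sample performance

print(f"train accuracy: {train_score:.3f}")     # typically ~1.0 here
print(f"cv accuracy:    {cv_scores.mean():.3f} ± {cv_scores.std():.3f}")
```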
In 2013 I joined American Family Insurance as a metadata analyst. I had always been fascinated by how people find, organize, and access information, so a metadata management role after school was a natural choice. The use cases for metadata are boundless, offering opportunities for innovation in every sector.
The head of sales needs the most up-to-date statistics on the state of the business, and they need it now — well, actually, yesterday. These recommendations organize the data about your data — what the industry for many years has called metadata. Creating a set of custom reports would take too much time.
As a reminder, here’s Gartner’s definition of data fabric: “A design concept that serves as an integrated layer (fabric) of data and connecting processes.” In this blog, we will focus on the “integrated layer” part of this definition by examining each of the key layers of a comprehensive data fabric in more detail.
With such sensitive information at risk, the federal government passed the Health Insurance Portability and Accountability Act (HIPAA), which sets standards for data governance in healthcare and keeps medical information safe.
As shown above, the data fabric provides the data services from the source data through to the delivery of data products, aligning well with the first and second elements of the modern data platform architecture. In June 2022, Barr Moses of Monte Carlo expanded on her initial article defining data observability.
W. Edwards Deming, the father of statistical quality control, said: “If you can’t describe what you are doing as a process, you don’t know what you’re doing.” In the world of IT, applied to the dichotomy of software and data, Deming’s quote covers the software part of that pair.
We found anecdotal data suggesting that a) CDOs with a business, more than a technical, background tend to be more effective or successful; b) CDOs most often came from a business background; and c) those that were successful had a good chance at becoming CEO or some other CXO (but not really CIO).
But we are seeing increasing data suggesting that broad and bland data literacy programs, for example certifying all of a firm’s employees in statistics, do not actually lead to the desired change. New data suggests that pinpoint or targeted efforts are likely to be more effective. We do have good examples and bad examples.
Add to that the fact that the service providers are typically scrutinized at a highly detailed level by government regulators—so much so that in some countries, the government is the sole service provider. Healthcare and Data Governance. All of the challenges described above, among others, are data problems.
Acquiring data is often difficult, especially in regulated industries. Once relevant data has been obtained, understanding what is valuable and what is simply noise requires statistical and scientific rigor. “Garbage in, garbage out” holds true for AI, so good AI PMs must concern themselves with data health.
data science’s emergence as an interdisciplinary field – from industry, not academia. why data governance, in the context of machine learning, is no longer a “dry topic” and how the WSJ’s “global reckoning on data governance” is potentially connected to “premiums on leveraging data science teams for novel business cases”.
Data governance - who's counting? The role of data governance. This large gap between reported figures raises tough questions on the reliability of COVID-19 tracking data. In dealing with situations like pandemic data, how important are aspects of data governance such as standardised definitions?
Another foundational purpose of a data catalog is to streamline, organize and process the thousands, if not millions, of an organization’s data assets to help consumers/users search for specific datasets and understand metadata, ownership, data lineage and usage.
The open data lakehouse is quickly becoming the standard architecture for unified multifunction analytics on large volumes of data. It combines the flexibility and scalability of data lake storage with the data analytics, data governance, and data management functionality of the data warehouse.