
Becoming a machine learning company means investing in foundational technologies

O'Reilly on Data

Consider deep learning, a specific form of machine learning that resurfaced in 2011–2012 after record-setting models in speech recognition and computer vision. Audits require metadata and artifacts. The technologies alluded to above (data governance, data lineage, model governance) will all be useful for managing these risks.


Accelerate HiveQL with Oozie to Spark SQL migration on Amazon EMR

AWS Big Data

Instead, we can use automation to speed up the migration and reduce heavy lifting, costs, and risks. We split the solution into two primary components: generating Spark job metadata and running the SQL on Amazon EMR.

Generate Spark SQL metadata
Our batch job consists of Hive steps scheduled to run sequentially.
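As a rough illustration of the first component, a minimal sketch of turning an ordered list of Hive steps into Spark job metadata might look like the following. The function name, the tuple layout, and the S3 paths are all hypothetical, not the AWS solution's actual format:

```python
import json

def build_spark_sql_metadata(hive_steps):
    """Convert an ordered list of (step_name, hiveql_path) pairs
    into a metadata document a Spark SQL runner could consume.
    Structure and field names are illustrative only."""
    return {
        "jobs": [
            {"order": i, "name": name, "sql_file": path, "engine": "spark-sql"}
            for i, (name, path) in enumerate(hive_steps, start=1)
        ]
    }

metadata = build_spark_sql_metadata([
    ("stage_orders", "s3://example-bucket/hql/stage_orders.hql"),
    ("agg_daily", "s3://example-bucket/hql/agg_daily.hql"),
])
print(json.dumps(metadata, indent=2))
```

The metadata preserves the original sequential ordering of the Hive steps so the Spark runner can execute them in the same order.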



Seamless integration of data lake and data warehouse using Amazon Redshift Spectrum and Amazon DataZone

AWS Big Data

Eliminating dependency on business units – Redshift Spectrum uses a metadata layer to directly query the data residing in S3 data lakes, eliminating the need for data copying or relying on individual business units to initiate the copy jobs. There are no duplicate data products created by business units or the Central IT team.


How BMO improved data security with Amazon Redshift and AWS Lake Formation

AWS Big Data

One of the bank’s key challenges, driven by strict cybersecurity requirements, is implementing field-level encryption for personally identifiable information (PII), Payment Card Industry (PCI) data, and data classified as high privacy risk (HPR). Only users with the required permissions are allowed to access data in clear text.
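To make the idea of field-level protection concrete, here is a toy sketch of deterministic field tokenization, an illustration only, not BMO's implementation (which uses Lake Formation and managed keys). The key and field names are hypothetical; a real deployment would use a KMS-managed key and proper encryption rather than truncated HMACs:

```python
import hmac
import hashlib

SECRET_KEY = b"demo-key"  # hypothetical; use a KMS-managed key in practice

def tokenize_field(value: str) -> str:
    """Deterministically tokenize a sensitive field so records can still
    be joined on it without exposing the clear-text value."""
    return hmac.new(SECRET_KEY, value.encode(), hashlib.sha256).hexdigest()[:16]

record = {"name": "Jane Doe", "ssn": "123-45-6789"}
# Protect only the field classified as sensitive; leave the rest untouched.
protected = {k: (tokenize_field(v) if k == "ssn" else v) for k, v in record.items()}
```

Because tokenization is deterministic, the same input always yields the same token, which is what allows joins across datasets while clear text stays restricted to permitted users.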


Themes and Conferences per Pacoid, Episode 8

Domino Data Lab

That’s a lot of priorities, especially when you group together closely related items such as data lineage and metadata management, which rank nearby. Also, while surveying the literature, two key drivers stood out: risk management as the thin edge of the wedge, and allowing metadata repositories to share and exchange metadata.


Amazon DataZone now integrates with AWS Glue Data Quality and external data quality solutions

AWS Big Data

As data is refreshed and updated, upstream processes can introduce changes that put the intended quality at risk. By selecting the corresponding asset, you can understand its content through the readme, glossary terms, and technical and business metadata.
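A data-quality rule of the kind these integrations evaluate can be sketched generically; the following completeness check is an illustrative stand-in, not the AWS Glue Data Quality API or its DQDL rule language:

```python
def evaluate_completeness(rows, column, threshold=0.95):
    """Flag a column whose non-null ratio falls below `threshold`.
    A toy data-quality rule; names and structure are illustrative."""
    non_null = sum(1 for r in rows if r.get(column) is not None)
    ratio = non_null / len(rows) if rows else 0.0
    return {"column": column, "ratio": ratio, "passed": ratio >= threshold}

rows = [{"id": 1, "email": "a@example.com"}, {"id": 2, "email": None}]
result = evaluate_completeness(rows, "email", threshold=0.9)
```

Running such checks each time data is refreshed is what surfaces upstream changes before consumers depend on degraded data.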


Design a data mesh on AWS that reflects the envisioned organization

AWS Big Data

Data as a product
Treating data as a product entails three key components: the data itself, the metadata, and the associated code and infrastructure. For orchestration, they use the AWS Cloud Development Kit (AWS CDK) for infrastructure as code (IaC) and AWS Glue Data Catalogs for metadata management.
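The three components of a data product can be sketched as a simple descriptor; the class and field names below are hypothetical, not a DataZone or Glue schema:

```python
from dataclasses import dataclass, field

@dataclass
class DataProduct:
    """Illustrative descriptor bundling a data product's three components:
    the data itself, its metadata, and the associated code/infrastructure."""
    name: str
    data_location: str                             # where the data lives, e.g. an S3 prefix
    metadata: dict = field(default_factory=dict)   # catalog and business metadata
    code_repo: str = ""                            # IaC and pipeline code for the product

orders = DataProduct(
    name="orders",
    data_location="s3://example-bucket/orders/",
    metadata={"owner": "sales-domain", "glue_database": "sales"},
    code_repo="https://example.com/repos/orders-pipeline",
)
```

Keeping all three references in one descriptor is what lets a domain team publish, version, and hand off a data product as a single unit.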