Data is becoming more valuable and more important to organizations. At the same time, organizations have become more disciplined about the data they rely on, ensuring it is robust, accurate, and properly governed.
That means your cloud data assets must be available to the right people for the right purposes in order to maximize their security, quality, and value. Why You Need Cloud Data Governance: regulatory compliance is also a major driver of data governance (e.g., GDPR, CCPA, HIPAA, SOX, PCI DSS).
Unlocking the true value of data often gets impeded by siloed information. Traditional data management—wherein each business unit ingests raw data into separate data lakes or warehouses—hinders visibility and cross-functional analysis. Amazon DataZone natively supports data sharing for Amazon Redshift data assets.
Data landscape at EUROGATE and current data governance challenges: The EUROGATE Group is a conglomerate of container terminals and service providers, offering container handling, intermodal transport, maintenance and repair, and seaworthy packaging services. The aim is to eliminate centralized bottlenecks and complex data pipelines.
Beyond breaking down silos, modern data architectures need interfaces that make it easy for users to consume data with tools fit for their jobs. Data must be able to move freely to and from data warehouses, data lakes, and data marts.
Data is your generative AI differentiator, and a successful generative AI implementation depends on a robust data strategy incorporating a comprehensive data governance approach. Data governance is a critical building block across all these approaches, and we see two emerging areas of focus.
In this post, we delve into the key aspects of using Amazon EMR for modern data management, covering topics such as data governance, data mesh deployment, and streamlined data discovery. Organizations have multiple Hive data warehouses across EMR clusters, where the metadata gets generated.
This would be a straightforward task were it not for the fact that, in the digital era, there has been an explosion of data – collected and stored everywhere – much of it poorly governed, ill-understood, and irrelevant. Further, data management activities don’t end once the AI model has been developed.
Talend is a data integration and management software company that offers applications for cloud computing, big data integration, application integration, data quality and master data management.
Reading Time: 6 minutes. Data governance as a concept and practice has been around for as long as data management itself. It is, however, gaining prominence and interest in recent years due to the increasing volume of data that needs to be managed.
In the era of big data, data lakes have emerged as a cornerstone for storing vast amounts of raw data in its native format. They support structured, semi-structured, and unstructured data, offering a flexible and scalable environment for data ingestion from multiple sources.
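To make the point concrete, here is a minimal PySpark sketch (the bucket and paths are hypothetical, not from the excerpt) reading all three kinds of data from one lake:

```python
# Minimal sketch: reading structured, semi-structured, and unstructured
# data from one S3-based data lake. Bucket and paths are hypothetical.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("lake-ingest").getOrCreate()

structured = spark.read.parquet("s3://your-lake/tables/orders/")      # structured
semi = spark.read.json("s3://your-lake/events/clickstream/")          # semi-structured
unstructured = spark.read.text("s3://your-lake/raw/support-emails/")  # unstructured text

print(structured.count(), semi.count(), unstructured.count())
```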
Data silos are a perennial data management problem for enterprises, with almost three-quarters (73%) of participants in ISG Research’s Data Governance Benchmark Research citing disparate data sources and systems as a data governance challenge.
Under that focus, Informatica's conference emphasized capabilities across six areas (all strong areas for Informatica): data integration, data management, data quality & governance, Master Data Management (MDM), data cataloging, and data security.
The original proof of concept was to have one data repository ingesting data from 11 sources, including flat files and data stored via APIs on premises and in the cloud, Pruitt says. “There are a lot of variables that determine what should go into the data lake and what will probably stay on premises,” Pruitt says.
This post is co-authored by Vijay Gopalakrishnan, Director of Product, Salesforce Data Cloud. In today’s data-driven business landscape, organizations collect a wealth of data across various touch points and unify it in a central data warehouse or a data lake to deliver business insights.
GenAI requires high-quality data. Ensure that data is cleansed, consistent, and centrally stored, ideally in a data lake. Data preparation, including anonymizing, labeling, and normalizing data across sources, is key.
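As an illustration of those preparation steps, a minimal pandas sketch; the column names, salt, and spend tiers are assumptions invented for the example:

```python
# Illustrative sketch of the preparation steps named above; column names,
# the hashing salt, and the label rule are assumptions for the example.
import hashlib
import pandas as pd

df = pd.read_csv("customers.csv")  # hypothetical source

# Anonymize: replace direct identifiers with a salted hash.
SALT = "replace-with-a-secret"
df["customer_id"] = df["email"].map(
    lambda e: hashlib.sha256((SALT + e).encode()).hexdigest())
df = df.drop(columns=["email", "full_name"])

# Normalize: bring free-form country values onto one vocabulary.
df["country"] = df["country"].str.strip().str.upper().replace(
    {"U.S.": "US", "USA": "US", "UNITED STATES": "US"})

# Label: tag records for downstream use, e.g. by spend tier.
df["tier"] = pd.cut(df["lifetime_spend"], bins=[0, 100, 1000, float("inf")],
                    labels=["low", "mid", "high"])

# Write the prepared table locally; it can then be loaded to the lake.
df.to_parquet("customers_prepared.parquet")
```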
Analytics remained one of the key focus areas this year, with significant updates and innovations aimed at helping businesses harness their data more efficiently and accelerate insights. From enhancing data lakes to empowering AI-driven analytics, AWS unveiled new tools and services that are set to shape the future of data and analytics.
The primary modernization approach is data warehouse/ETL automation, which helps promote broad usage of the data warehouse but can only partially improve efficiency in data management processes. However, an automation approach alone is of limited usefulness when data management processes are inefficient.
Today, we are pleased to announce new AWS Glue connectors for Azure Blob Storage and Azure Data Lake Storage that allow you to move data bi-directionally between Azure Blob Storage, Azure Data Lake Storage, and Amazon Simple Storage Service (Amazon S3). For example, a Glue job can read directly from a Blob Storage URI, as sketched below.
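The excerpt’s truncated snippet appears to be a Spark CSV read against the Blob Storage URI; a plausible reconstruction, assuming a CSV source and adding an illustrative write back to S3:

```python
# Reconstruction of the truncated snippet: read a CSV from Azure Blob
# Storage, then land it in Amazon S3. The "csv" format and the output
# path are assumptions; the wasbs:// URI comes from the original excerpt.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("azure-to-s3").getOrCreate()

df = (spark.read.format("csv")
      .option("header", "true")
      .load("wasbs://yourblob@youraccountname.blob.core.windows.net/loadingtest-input/100mb"))

df.write.mode("overwrite").parquet("s3://your-bucket/from-azure/")  # hypothetical target
```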
Even after identification, it’s cumbersome to implement redaction, masking, or encryption of sensitive data at scale. In this post, we provide an automated solution to detect PII data in Amazon Redshift using AWS Glue. For our solution, we use Amazon Redshift to store the data.
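The post’s solution is built on AWS Glue; as a deliberately simplified stand-in (not the post’s implementation), a regex-based scan of a sampled Redshift table through the Redshift Data API might look like this, with all identifiers hypothetical:

```python
# Simplified stand-in for PII detection (NOT the Glue-based solution the
# post describes): sample a Redshift table through the Redshift Data API
# and flag values matching common PII patterns. All names are hypothetical.
import re
import time
import boto3

PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "US_SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}

client = boto3.client("redshift-data")
run = client.execute_statement(
    WorkgroupName="your-workgroup",  # hypothetical serverless workgroup
    Database="dev",
    Sql="SELECT * FROM customer_raw LIMIT 1000",
)

# Poll until the statement finishes, then fetch rows.
status = client.describe_statement(Id=run["Id"])
while status["Status"] not in ("FINISHED", "FAILED", "ABORTED"):
    time.sleep(1)
    status = client.describe_statement(Id=run["Id"])
if status["Status"] != "FINISHED":
    raise RuntimeError(status.get("Error", "query did not finish"))

result = client.get_statement_result(Id=run["Id"])
columns = [c["name"] for c in result["ColumnMetadata"]]
for record in result["Records"]:
    for col, field in zip(columns, record):
        value = field.get("stringValue", "")
        for label, pattern in PATTERNS.items():
            if pattern.search(value):
                print(f"possible {label} in column {col}")
```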
This form of architecture can handle data in all forms—structured, semi-structured, unstructured—blending capabilities from data warehouses and data lakes into data lakehouses.
Selling the value of data transformation: Iyengar and his team are 18 months into a three- to five-year journey that started by building out the data layer — corralling data sources such as ERP, CRM, and legacy databases into data warehouses for structured data and data lakes for unstructured data.
As organizations increasingly rely on data stored across various platforms, such as Snowflake, Amazon Simple Storage Service (Amazon S3), and various software as a service (SaaS) applications, the challenge of bringing these disparate data sources together has never been more pressing. For more information on AWS Glue, visit AWS Glue.
Which type(s) of storage consolidation you use depends on the data you generate and collect. One option is a data lake—on-premises or in the cloud—that stores unprocessed data in any type of format, structured or unstructured, and can be queried in aggregate. Set up unified data governance rules and processes.
AWS has invested in a zero-ETL (extract, transform, and load) future so that builders can focus more on creating value from data, instead of having to spend time preparing data for analysis. This means you no longer have to create an external schema in Amazon Redshift to use the data lake tables cataloged in the Data Catalog.
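As a hedged sketch of what that looks like in practice, assuming the Data Catalog is automounted in Redshift under the awsdatacatalog database (the workgroup, database, and table names are invented):

```python
# Sketch: querying a Glue Data Catalog table from Redshift without first
# creating an external schema, via the automounted "awsdatacatalog"
# database. Identifiers are hypothetical.
import boto3

client = boto3.client("redshift-data")
client.execute_statement(
    WorkgroupName="your-workgroup",
    Database="dev",
    Sql="""
        SELECT count(*)
        FROM awsdatacatalog.sales_db.orders  -- lake table from the Data Catalog
    """,
)
```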
In today’s data-driven world, seamless integration and transformation of data across diverse sources into actionable insights is paramount. With AWS Glue, you can discover and connect to hundreds of diverse data sources and manage your data in a centralized data catalog.
In this post, we discuss how you can use purpose-built AWS services to create an end-to-end data strategy for C360 to unify and govern customer data that address these challenges. The AWS modern data architecture shows a way to build a purpose-built, secure, and scalable data platform in the cloud.
The data fabric architectural approach can simplify data access in an organization and facilitate self-service data consumption at scale. The first capability of a data fabric is a semantic knowledge data catalog; what are the other five core capabilities of a data fabric?
However, this concept has evolved in line with the increasing demands of mature and sophisticated data-driven organisations, and with the increased use and sophistication of cloud computing services. Organisations store and process the data, typically in a data warehouse, where the data is modelled and a schema is applied.
Many customers need an ACID (atomic, consistent, isolated, durable) transactional data lake that can log change data capture (CDC) from operational data sources. There is also demand for merging real-time data into batch data. The Delta Lake framework provides these two capabilities, as sketched below.
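A minimal sketch of those two capabilities with the Delta Lake Python API, reconstructing the excerpt’s truncated CSV read; the paths, columns, and CDC op codes are assumptions:

```python
# Hypothetical sketch: merging CDC records into a Delta Lake table on S3.
# Table paths, column names, and op codes are illustrative assumptions.
from delta.tables import DeltaTable
from pyspark.sql import SparkSession

spark = (SparkSession.builder
         .appName("cdc-merge")
         .config("spark.sql.extensions", "io.delta.sql.DeltaSparkSessionExtension")
         .config("spark.sql.catalog.spark_catalog",
                 "org.apache.spark.sql.delta.catalog.DeltaCatalog")
         .getOrCreate())

# Batch of CDC changes landed as CSV (assumed columns: id, name, op).
changes = (spark.read
           .option("header", True)
           .csv("s3://your-bucket/cdc/latest/"))  # placeholder path

target = DeltaTable.forPath(spark, "s3://your-bucket/delta/customers/")

# Apply inserts, updates, and deletes atomically (ACID) via MERGE.
(target.alias("t")
 .merge(changes.alias("c"), "t.id = c.id")
 .whenMatchedDelete(condition="c.op = 'D'")
 .whenMatchedUpdateAll(condition="c.op = 'U'")
 .whenNotMatchedInsertAll(condition="c.op = 'I'")
 .execute())
```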
Organizations have spent a lot of time and money trying to harmonize data across diverse platforms, including cleansing, uploading metadata, converting code, defining business glossaries, tracking data transformations and so on. But the attempts to standardize data across the entire enterprise haven’t produced the desired results.
Reading Time: 2 minutes. The data lakehouse attempts to combine the best parts of the data warehouse with the best parts of data lakes while avoiding all of the problems inherent in both. However, the data lakehouse is not the last word in data.
With the first two layers, the business is the driver with IT in a support role, but with the data governance and architecture layer, IT and the business are side by side, working through complex decisions about governance and architecture together. These aren’t typically business discussions; it’s largely IT.
Thoughtworks says data mesh is key to moving beyond a monolithic data lake. Spoiler alert: data fabric and data mesh are independent design concepts that are, in fact, quite complementary.
With built-in features like time travel, schema evolution, and streamlined data discovery, Iceberg empowers data teams to enhance data lake management while upholding data integrity. Available for cloud and now also for the data center. And that’s also what makes Cloudera so unique.
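For instance, a hedged PySpark sketch of Iceberg time travel (the catalog wiring, table name, and snapshot id are assumptions):

```python
# Hedged sketch of Iceberg time travel from Spark; catalog configuration
# and the snapshot id are illustrative assumptions.
from pyspark.sql import SparkSession

spark = (SparkSession.builder
         .appName("iceberg-time-travel")
         .config("spark.sql.catalog.lake", "org.apache.iceberg.spark.SparkCatalog")
         .config("spark.sql.catalog.lake.type", "hadoop")
         .config("spark.sql.catalog.lake.warehouse", "s3://your-bucket/warehouse/")
         .getOrCreate())

# Current table state.
spark.table("lake.db.events").show()

# Time travel: read the table as of an earlier snapshot id
# (snapshot ids are listed in the table's snapshot history metadata).
old = (spark.read
       .option("snapshot-id", 1234567890123456789)  # hypothetical id
       .format("iceberg")
       .load("lake.db.events"))
old.show()
```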
In this post, I don’t want to debate the meanings and origins of different terms; rather, I’d like to highlight a technology weapon that you should have in your data management arsenal. We currently refer to this technology as data virtualization.
Data Pipeline Use Cases: here are just a few examples of the goals you can achieve with a robust data pipeline. Data Prep for Visualization: data pipelines can facilitate easier data visualization by gathering and transforming the necessary data into a usable state.
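A minimal, self-contained sketch of such a visualization-prep pipeline in pandas; the file names and columns are invented for the example:

```python
# Illustrative sketch only: a minimal pipeline that gathers and reshapes
# data for visualization. File names and columns are assumptions
# (orders.csv: store_id, order_date, revenue; regions.csv: store_id, region).
import pandas as pd

# Gather: load raw records from two hypothetical sources.
orders = pd.read_csv("orders.csv", parse_dates=["order_date"])
regions = pd.read_csv("regions.csv")

# Transform: join, clean, and aggregate into a chart-ready shape.
df = (orders.merge(regions, on="store_id", how="left")
            .dropna(subset=["region"])
            .assign(month=lambda d: d["order_date"].dt.to_period("M")))

monthly = df.groupby(["month", "region"], as_index=False)["revenue"].sum()

# The result is a tidy table a BI tool or plotting library can consume directly.
print(monthly.head())
```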
Data quality for account and customer data – Altron wanted to enable data quality and data governance best practices. Goals – Lay the foundation for a data platform that can be used in the future by internal and external stakeholders. A set of QuickSight dashboards to be consumed via browser and mobile.
Accounting for the complexities of the AI lifecycle: Unfortunately, typical data storage and data governance tools fall short in the AI arena when it comes to helping an organization perform the tasks that underlie efficient and responsible AI lifecycle management. And that makes sense.
In times of potentially troublesome change, the apparent paradox and inner poetry of these. The post The Data Warehouse is Dead, Long Live the Data Warehouse, Part I appeared first on Data Virtualization blog - Data Integration and Modern Data Management Articles, Analysis and Information.
Next-generation big data platforms and long-running batch jobs operated by a central team of data engineers have often led to data lake swamps. In other words, data architecture is a foundational element of your business strategy for achieving higher data quality.
AWS Glue is a serverless data integration service that makes it simple to discover, prepare, and combine data for analytics, machine learning, and application development. Conclusion: In this post, we showed how to use AWS Glue and the new connector for ingesting data from Google Cloud Storage to Amazon S3.
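A hedged outline of what such a Glue job might look like; the connection type, connection name, and paths are assumptions rather than the post’s actual configuration:

```python
# Hedged sketch of a Glue job copying data from Google Cloud Storage to
# Amazon S3. The connection type, connection name, and paths are
# assumptions; the real configuration is described in the post.
import sys

from awsglue.context import GlueContext
from awsglue.job import Job
from awsglue.utils import getResolvedOptions
from pyspark.context import SparkContext

args = getResolvedOptions(sys.argv, ["JOB_NAME"])
glue_context = GlueContext(SparkContext.getOrCreate())
job = Job(glue_context)
job.init(args["JOB_NAME"], args)

# Read from GCS through the connector; exact options come from the
# connector's documentation.
source = glue_context.create_dynamic_frame.from_options(
    connection_type="custom.spark",              # assumed connector type
    connection_options={
        "connectionName": "gcs-connection",      # hypothetical Glue connection
        "path": "gs://your-gcs-bucket/input/",   # hypothetical source path
    },
)

# Write the same records to S3 as Parquet.
glue_context.write_dynamic_frame.from_options(
    frame=source,
    connection_type="s3",
    connection_options={"path": "s3://your-s3-bucket/output/"},
    format="parquet",
)
job.commit()
```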
Lessons about data modeling, modernization, and automation include the following. Focus on fundamentals: companies place the highest priority on data quality, ease of use, analytics performance, and data governance. The lakehouse, data fabric, and data mesh have 8-12% usage each.