With organizations seeking to become more data-driven in their business decisions, IT leaders must devise data strategies geared toward creating value from data no matter where, or in what form, it resides. Unstructured data resources can be extremely valuable for gaining business insights and solving problems.
Data is your generative AI differentiator, and a successful generative AI implementation depends on a robust data strategy incorporating a comprehensive data governance approach. Data governance is a critical building block across all these approaches, and we see two emerging areas of focus.
I was recently asked to identify key modern data architecture trends. Data architectures have changed significantly to accommodate larger volumes of data as well as new types of data such as streaming and unstructured data. Here are some of the trends I see continuing to impact data architectures.
Since the deluge of big data over a decade ago, many organizations have learned to build applications to process and analyze petabytes of data. Data lakes have served as a central repository to store structured and unstructured data at any scale and in various formats.
A modern data architecture enables companies to ingest virtually any type of data through automated pipelines into a data lake, which provides highly durable and cost-effective object storage at petabyte or exabyte scale.
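A minimal PySpark sketch of such an ingestion step might look like the following; the bucket names, paths, and the event_date column are assumptions for illustration, not details from the excerpt above.

```python
from pyspark.sql import SparkSession

# Illustrative only: bucket names, paths, and the event_date
# column are placeholders, not details from the excerpt above.
spark = SparkSession.builder.appName("lake-ingest").getOrCreate()

# Ingest semi-structured JSON events from a landing zone...
events = spark.read.json("s3://example-landing-zone/events/")

# ...and land them in the lake as partitioned, columnar Parquet on
# durable, cost-effective object storage.
(events.write
       .mode("append")
       .partitionBy("event_date")
       .parquet("s3://example-data-lake/events/"))
```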
Analytics remained one of the key focus areas this year, with significant updates and innovations aimed at helping businesses harness their data more efficiently and accelerate insights. From enhancing data lakes to empowering AI-driven analytics, AWS unveiled new tools and services that are set to shape the future of data and analytics.
In the era of big data, data lakes have emerged as a cornerstone for storing vast amounts of raw data in its native format. They support structured, semi-structured, and unstructured data, offering a flexible and scalable environment for data ingestion from multiple sources.
The original proof of concept was to have one data repository ingesting data from 11 sources, including flat files and data stored via APIs on premises and in the cloud, Pruitt says. “There are a lot of variables that determine what should go into the data lake and what will probably stay on premises,” Pruitt says.
New feature: Custom AWS service blueprints. Previously, Amazon DataZone provided default blueprints that created the AWS resources required for data lake, data warehouse, and machine learning use cases. You can build projects and subscribe to both unstructured and structured data assets within the Amazon DataZone portal.
The Basel, Switzerland-based company, which operates in more than 100 countries, has petabytes of data, including highly structured customer data, data about treatments and lab requests, operational data, and a massive, growing volume of unstructured data, particularly imaging data.
Without meeting GxP compliance, the Merck KGaA team could not run the enterprise data lake needed to store, curate, or process the data required to inform business decisions. It established a data governance framework within its enterprise data lake. Driving innovation with secure and governed data.
Today, we are pleased to announce new AWS Glue connectors for Azure Blob Storage and Azure Data Lake Storage that allow you to move data bi-directionally between Azure Blob Storage, Azure Data Lake Storage, and Amazon Simple Storage Service (Amazon S3). For example: spark.read.format("csv").option("header", "true").load("wasbs://yourblob@youraccountname.blob.core.windows.net/loadingtest-input/100mb")
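A fuller sketch of that round trip, under the assumption that the connector is attached to the job and reusing the same placeholder account, container, and bucket names; how the storage key is supplied varies by environment.

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("azure-to-s3").getOrCreate()

# Supply the Azure storage key; depending on your environment this
# may belong in the Hadoop configuration or a secrets manager instead.
spark.conf.set(
    "fs.azure.account.key.youraccountname.blob.core.windows.net",
    "<storage-account-key>")

# Read the CSV test file from Azure Blob Storage...
df = (spark.read.format("csv")
      .option("header", "true")
      .load("wasbs://yourblob@youraccountname.blob.core.windows.net/"
            "loadingtest-input/100mb"))

# ...and write it to Amazon S3 (an assumed output bucket).
df.write.mode("overwrite").parquet("s3://example-output-bucket/loadingtest-output/")
```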
Solutions data architect: These individuals design and implement data solutions for specific business needs, including data warehouses, data marts, and data lakes. Application data architect: The application data architect designs and implements data models for specific software applications.
The outline of the call went as follows: I was talking to a central state agency that was organizing a data governance initiative (in their words) across three other state agencies. All four agencies had reported an independent but identical experience with data governance in the past. An expensive consulting engagement.
Today, transactional data is the largest segment, which includes streaming and data flows. Extracting value from data: one of the biggest challenges presented by massive volumes of disparate unstructured data is extracting usable information and insights.
Data governance is traditionally applied to structured data assets that are most often found in databases and information systems. The ability to connect straight to the source allows knowledge workers to work natively in spreadsheets, pulling data directly from true data sources like the data warehouse or data lake.
A data lakehouse is an emerging data management architecture that converges data warehouse and data lake capabilities, driven by the need to improve efficiency and obtain critical insights faster. Let’s start with why data lakehouses are becoming increasingly important.
For example, one company let all its data scientists access and make changes to their data tables for report generation, which caused inconsistency and cost the company significantly. The best way to avoid poor data quality is to have a strict data governance system in place.
Mark: While most discussions of modern data platforms focus on comparing the key components, it is important to understand how they all fit together. The collection of source data shown on your left is composed of both structured and unstructured data from the organization’s internal and external sources.
Building an optimal data system: as data grows at an extraordinary rate, data proliferation across your data stores, data warehouse, and data lakes can become a challenge. This performance innovation allows Nasdaq to have a multi-use data lake between teams.
By adopting a custom developed application based on the Cloudera ecosystem, Carrefour has combined the legacy systems into one platform which provides access to customer data in a single data lake. In doing so, Bank of the West has modernized and centralized its Big Data platform in just one year.
We’ve seen that there is a demand to design applications that enable data to be portable across cloud environments and give you the ability to derive insights from one or more data sources. With this connector, you can bring data from Google Cloud Storage to Amazon S3.
The first generation of data architectures represented by enterprise data warehouse and business intelligence platforms were characterized by thousands of ETL jobs, tables, and reports that only a small group of specialized data engineers understood, resulting in an under-realized positive impact on the business.
For example, data catalogs have evolved to deliver governance capabilities like managing data quality and data privacy and compliance. It uses metadata and data management tools to organize all data assets within your organization. This is especially helpful when handling massive amounts of big data.
The abundant growth of data, maturation of machine learning algorithms, and future regulatory compliance demands from the European Union’s General Data Protection Regulation (GDPR) will shift the landscape for creating a single source of the truth for customer data.
To fully realize data’s value, organizations in the travel industry need to dismantle data silos so that they can securely and efficiently leverage analytics across their organizations. What is big data in the travel and tourism industry? What are common data challenges for the travel industry?
Collect, filter, and categorize data: the first is a series of processes (collecting, filtering, and categorizing data) that may take several months for KM or RAG models. Structured data is relatively easy, but the unstructured data, while much more difficult to categorize, is the most valuable.
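As a rough illustration of that triage step, here is a minimal sketch; the directory layout and extension lists are assumptions, not anything specified in the excerpt.

```python
from pathlib import Path

# Assumed corpus location and illustrative extension lists; adjust
# both to whatever your actual sources contain.
STRUCTURED = {".csv", ".parquet", ".json"}
UNSTRUCTURED = {".pdf", ".docx", ".txt", ".html"}

buckets = {"structured": [], "unstructured": [], "unknown": []}
for path in Path("corpus").rglob("*"):
    if not path.is_file():
        continue
    ext = path.suffix.lower()
    if ext in STRUCTURED:
        buckets["structured"].append(path)
    elif ext in UNSTRUCTURED:
        buckets["unstructured"].append(path)
    else:
        buckets["unknown"].append(path)  # route to manual review

# Unstructured files still need chunking and labeling before a RAG
# index can use them, which is where most of the effort goes.
```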
Amazon Redshift now makes it easier for you to run queries in AWS data lakes by automatically mounting the AWS Glue Data Catalog. You no longer have to create an external schema in Amazon Redshift to use the data lake tables cataloged in the Data Catalog.
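A hedged example of what that looks like from a client, using the redshift_connector Python driver; the cluster endpoint, credentials, and table names are placeholders, and awsdatacatalog is the database name under which Redshift exposes the auto-mounted Glue Data Catalog.

```python
import redshift_connector

# Placeholder endpoint and credentials; replace with your own.
conn = redshift_connector.connect(
    host="example-cluster.abc123.us-east-1.redshift.amazonaws.com",
    database="dev",
    user="awsuser",
    password="<password>",
)
cursor = conn.cursor()

# Query a lake table cataloged in AWS Glue directly, without first
# creating an external schema (sales_db.orders is hypothetical).
cursor.execute("SELECT * FROM awsdatacatalog.sales_db.orders LIMIT 10")
for row in cursor.fetchall():
    print(row)
```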
Those decentralization efforts appeared under different monikers through time, e.g., data marts versus data warehousing implementations (a popular architectural debate in the era of structured data), then enterprise-wide data lakes versus smaller, typically BU-specific, “data ponds”.
Data modernization is the process of transferring data to modern cloud-based databases from outdated or siloed legacy databases, including structured and unstructured data. In that sense, data modernization is synonymous with cloud migration. What Is the Role of the Cloud in Data Modernization?
Over time, the worlds of data lakes and data warehouses collided. Databricks introduced the concept of a data lakehouse, adding Databricks SQL as well as open table formats. Databricks was also rated Exemplary in our Data Intelligence, Data Integration and Data Governance Buyers Guides.
Data democratization instead refers to the simplification of all processes related to data, from storage architecture to data management to data security. It also requires an organization-wide data governance approach, from adopting new types of employee training to creating new policies for data storage.
If we revisit our durable goods industry example and prioritize data quality through aggregation in a multi-tier architecture and cloud data platform, we establish the prerequisite for building data quality and data trust. Agentic AI is here to stay and will gain tremendous momentum in 2024.
Cloud-native data lakes and warehouses simplify analytics by integrating structured and unstructured data. Enhanced interoperability between tools enables seamless data sharing and collaborative decision-making across teams.
In short, it takes data—and a lot of it. As it stands, many large organizations find themselves relying on a mix of solutions, platforms, and architectures to handle the volume of structured and unstructureddata that has been created as their operations have expanded.
Figure 1: Enterprise Data Catalogs interact with AI in two ways. These regulations require organizations to document and control both traditional and generative AI models, whether they build them or incorporate them into their own applications, thus driving demand for data catalogs that support compliance.