Data Architecture, Data Lake and Unstructured Data

Five Modern Data Architecture Trends

David Menninger's Analyst Perspectives

MARCH 30, 2020

I was recently asked to identify key modern data architecture trends. Data architectures have changed significantly to accommodate larger volumes of data as well as new types of data such as streaming and unstructured data.

Data Architecture

Data Architecture Unstructured Data Data Lake Data Governance

Unstructured data management and governance using AWS AI/ML and analytics services

AWS Big Data

OCTOBER 25, 2023

Unstructured data is information that doesn’t conform to a predefined schema or isn’t organized according to a preset data model. Unstructured information may have a little or a lot of structure but in ways that are unexpected or inconsistent. Text, images, audio, and videos are common examples of unstructured data.

Unstructured Data

Unstructured Data Metadata Management Analytics

Webinars

Data Talks, CFOs Listen: Why Analytics Are Key To Better Spend Management

Mastering Apache Airflow® 3.0: What’s New (and What’s Next) for Data Orchestration

Trending Sources

Migrate an existing data lake to a transactional data lake using Apache Iceberg

AWS Big Data

OCTOBER 3, 2023

A data lake is a centralized repository that you can use to store all your structured and unstructured data at any scale. You can store your data as-is, without having to structure it first, and then run different types of analytics for better business insights.

Data Lake

Data Lake Metadata Snapshot Recreation/Entertainment

Run Apache XTable in AWS Lambda for background conversion of open table formats

AWS Big Data

NOVEMBER 26, 2024

This post was co-written with Dipankar Mazumdar, Staff Data Engineering Advocate with AWS Partner OneHouse. Data architecture has evolved significantly to handle growing data volumes and diverse workloads. In practice, open table formats (OTFs) are used in a broad range of analytical workloads, from business intelligence to machine learning.

Metadata

Metadata Data Lake Snapshot Data Warehouse

Amazon Web Services named a Leader in the 2024 Gartner Magic Quadrant for Data Integration Tools

AWS Big Data

FEBRUARY 26, 2025

The Gartner Magic Quadrant evaluates 20 data integration tool vendors based on two axes: Ability to Execute and Completeness of Vision. Discover, prepare, and integrate all your data at any scale: AWS Glue is a fully managed, serverless data integration service that simplifies data preparation and transformation across diverse data sources.

Data Integration

Data Integration Data Lake Data Warehouse Unstructured Data

Build a serverless transactional data lake with Apache Iceberg, Amazon EMR Serverless, and Amazon Athena

AWS Big Data

MARCH 10, 2023

Since the deluge of big data over a decade ago, many organizations have learned to build applications to process and analyze petabytes of data. Data lakes have served as a central repository to store structured and unstructured data at any scale and in various formats.

Data Lake

Data Lake Sales Data Warehouse Snapshot

Synchronize data lakes with CDC-based UPSERT using open table format, AWS Glue, and Amazon MSK

AWS Big Data

JULY 31, 2024

In the current industry landscape, data lakes have become a cornerstone of modern data architecture, serving as repositories for vast amounts of structured and unstructured data. Maintaining data consistency and integrity across distributed data lakes is crucial for decision-making and analytics.

Data Lake

Data Lake Marketing Data Processing Management

Choosing an open table format for your transactional data lake on AWS

AWS Big Data

JUNE 9, 2023

A modern data architecture enables companies to ingest virtually any type of data through automated pipelines into a data lake, which provides highly durable and cost-effective object storage at petabyte or exabyte scale.

Data Lake

Data Lake Metadata Statistics Optimization

Orca Security’s journey to a petabyte-scale data lake with Apache Iceberg and AWS Analytics

AWS Big Data

JULY 20, 2023

With data becoming the driving force behind many industries today, having a modern data architecture is pivotal for organizations to be successful. In this post, we describe Orca’s journey building a transactional data lake using Amazon Simple Storage Service (Amazon S3), Apache Iceberg, and AWS Analytics.

Data Lake

Data Lake Analytics Snapshot Data Quality

Carhartt turns to data under new CIO

CIO Business Intelligence

NOVEMBER 25, 2022

As part of that transformation, Agusti has plans to integrate a data lake into the company’s data architecture and expects two AI proofs of concept (POCs) to be ready to move into production within the quarter. Today, we backflush our data lake through our data warehouse. We’re still in that journey.”

Data Lake

Data Lake Data Warehouse Unstructured Data Data Architecture

Building a Beautiful Data Lakehouse

CIO Business Intelligence

MARCH 9, 2022

As a result, users can easily find what they need, and organizations avoid the operational and cost burdens of storing unneeded or duplicate data copies. Newer data lakes are highly scalable and can ingest structured and semi-structured data along with unstructured data like text, images, video, and audio.

Data Lake

Data Lake Unstructured Data Data Warehouse Big Data

Top analytics announcements of AWS re:Invent 2024

AWS Big Data

FEBRUARY 26, 2025

Analytics remained one of the key focus areas this year, with significant updates and innovations aimed at helping businesses harness their data more efficiently and accelerate insights. From enhancing data lakes to empowering AI-driven analytics, AWS unveiled new tools and services that are set to shape the future of data and analytics.

Analytics

Analytics Data Lake Metadata Data Warehouse

Data’s dark secret: Why poor quality cripples AI and growth

CIO Business Intelligence

APRIL 8, 2025

We also examine how centralized, hybrid and decentralized data architectures support scalable, trustworthy ecosystems. As data-centric AI, automated metadata management and privacy-aware data sharing mature, the opportunity to embed data quality into the enterprise’s core has never been more significant.

Data Quality

Data Quality Data-driven Key Performance Indicator Metadata

What is a data architect? Skills, salaries, and how to become a data framework master

CIO Business Intelligence

OCTOBER 13, 2023

Data architecture is a complex and varied field and different organizations and industries have unique needs when it comes to their data architects. Solutions data architect: These individuals design and implement data solutions for specific business needs, including data warehouses, data marts, and data lakes.

Data Architecture

Data Architecture Data Warehouse Statistics Visualization

Petabyte-scale log analytics with Amazon S3, Amazon OpenSearch Service, and Amazon OpenSearch Ingestion

AWS Big Data

MARCH 7, 2024

At the same time, they need to optimize operational costs to unlock the value of this data for timely insights and do so with a consistent performance. With this massive data growth, data proliferation across your data stores, data warehouse, and data lakes can become equally challenging.

Data Lake

Data Lake Analytics Dashboards Metrics

Data architecture strategy for data quality

IBM Big Data Hub

JANUARY 5, 2023

Several factors determine the quality of your enterprise data: accuracy, completeness, and consistency, to name a few. But there’s another factor of data quality that doesn’t get the recognition it deserves: your data architecture. How the right data architecture improves data quality.

Data Architecture

Data Architecture Data Quality Strategy Data Lake

Databricks’ new data lakehouse aims at media, entertainment sector

CIO Business Intelligence

APRIL 25, 2022

The other 10% represents the effort of initial deployment, data-loading, configuration and the setup of administrative tasks and analysis that is specific to the customer, Henschen said. The joint solution with Labelbox is targeted toward media companies and is expected to help firms derive more value out of unstructured data.

Recreation/Entertainment

Recreation/Entertainment Data Lake Data Warehouse Unstructured Data

Data democratization: How data architecture can drive business decisions and AI initiatives

IBM Big Data Hub

AUGUST 4, 2023

Today, the way businesses use data is much more fluid; data literate employees use data across hundreds of apps, analyze data for better decision-making, and access data from numerous locations. Then, it applies these insights to automate and orchestrate the data lifecycle.

Data Architecture

Data Architecture Data Lake Machine Learning Data Governance

Habib Bank manages data at scale with Cloudera Data Platform

Cloudera

NOVEMBER 17, 2022

The Solution: CDP Private Cloud brings a next-generation hybrid architecture with cloud-native benefits to HBL’s data platform. HBL started their data journey in 2019, when a data lake initiative was started to consolidate complex data sources and enable the bank to use a single version of truth for decision making.

Management

Management Data Lake Consulting Unstructured Data

Chose Both: Data Fabric and Data Lakehouse

Cloudera

SEPTEMBER 12, 2022

And second, for the data that is used, 80% is semi- or unstructured. Combining and analyzing both structured and unstructured data is a whole new challenge to come to grips with, let alone doing so across different infrastructures. These answers must be reliable and delivered quickly. Better together.

Unstructured Data

Unstructured Data Data Architecture Data Lake Snapshot

Real estate CIOs drive deals with data

CIO Business Intelligence

JULY 26, 2023

“The only thing we have on premise, I believe, is a data server with a bunch of unstructured data on it for our legal team,” says Grady Ligon, who was named Re/Max’s first CIO in October 2022.

Data Lake

Data Lake Digital Transformation Machine Learning Data Architecture

Exploring real-time streaming for generative AI Applications

AWS Big Data

MARCH 25, 2024

This data store provides your organization with the holistic customer records view that is needed for operational efficiency of RAG-based generative AI applications. For building such a data store, an unstructured data store would be best. This is typically unstructured data and is updated in a non-incremental fashion.

Data Lake

Data Lake Unstructured Data Management Snapshot

Educating ChatGPT on Data Lakehouse

Cloudera

MARCH 17, 2023

As the use of ChatGPT becomes more prevalent, I frequently encounter customers and data users citing ChatGPT’s responses in their discussions. I love the enthusiasm surrounding ChatGPT and the eagerness to learn about modern data architectures such as data lakehouses, data meshes, and data fabrics.

Unstructured Data

Unstructured Data Data Lake Data Warehouse Machine Learning

Belcorp reimagines R&D with AI

CIO Business Intelligence

JUNE 28, 2023

The R&D laboratories produced large volumes of unstructured data, which were stored in various formats, making it difficult to access and trace. The initial stage involved establishing the data architecture, which provided the ability to handle the data more effectively and systematically. “We

Digital Transformation

Digital Transformation Cost-Benefit Informatics Data mining

Demystifying Modern Data Platforms

Cloudera

SEPTEMBER 15, 2022

Mark: While most discussions of modern data platforms focus on comparing the key components, it is important to understand how they all fit together. The high-level architecture shown below forms the backdrop for the exploration. Luke: Let’s talk about some of the fundamentals of modern data architecture.

Data Lake

Data Lake Data Architecture Data-driven Data Warehouse

Data science vs data analytics: Unpacking the differences

IBM Big Data Hub

SEPTEMBER 19, 2023

Data science is an area of expertise that combines many disciplines such as mathematics, computer science, software engineering and statistics. It focuses on data collection and management of large-scale structured and unstructured data for various academic and business applications.

Data Science

Data Science Data Analytics Prescriptive Analytics Analytics

A comparative assessment of digital transformation in Italy

CIO Business Intelligence

APRIL 24, 2024

In fact, AMA collects a huge amount of structured and unstructured data from bins, collection vehicles, facilities, and user reports, and until now, this data has remained disconnected, managed by disparate systems and interfaces, through Excel spreadsheets.

Digital Transformation

Digital Transformation Business Intelligence Unstructured Data Data Lake

Get maximum value out of your cloud data warehouse with Amazon Redshift

AWS Big Data

APRIL 19, 2023

Building an optimal data system As data grows at an extraordinary rate, data proliferation across your data stores, data warehouse, and data lakes can become a challenge. This performance innovation allows Nasdaq to have a multi-use data lake between teams.

Data Warehouse

Data Warehouse Data Lake Unstructured Data Optimization

How Cloudera Data Flow Enables Successful Data Mesh Architectures

Cloudera

OCTOBER 7, 2021

Those decentralization efforts appeared under different monikers through time, e.g., data marts versus data warehousing implementations (a popular architectural debate in the era of structured data) then enterprise-wide data lakes versus smaller, typically BU-Specific, “data ponds”.

Metadata

Metadata Cost-Benefit Enterprise Interactive

How Data Management and Big Data Analytics Speed Up Business Growth

BizAcuity

APRIL 14, 2022

There is a wide range of problems that organizations face when working with big data. Challenges associated with Data Management and Optimizing Big Data. Unscalable data architecture. Scalable data architecture is not restricted to high storage space. Unstructured Data Management.

Big Data

Big Data Data Analytics Management Analytics

The New Normal for FP&A: Data Analytics

Jedox

OCTOBER 22, 2020

Gartner defines “dark data” as the data organizations collect, process, and store during regular business activities, but don’t use any further. Gartner also estimates 80% of all data is “dark”, while 93% of unstructured data is “dark.”

Data Analytics

Data Analytics Analytics Unstructured Data Data mining

Architectural patterns for real-time analytics using Amazon Kinesis Data Streams, part 1

AWS Big Data

JANUARY 8, 2024

Kinesis Data Streams has native integrations with other AWS services such as AWS Glue and Amazon EventBridge to build real-time streaming applications on AWS. Refer to Amazon Kinesis Data Streams integrations for additional details. It provides the ability to collect data from tens of thousands of data sources and ingest in real time.

Analytics

Analytics IoT Data-driven Snapshot

5 misconceptions about cloud data warehouses

IBM Big Data Hub

FEBRUARY 2, 2023

This approach has several benefits, such as streamlined migration of data from on-premises to the cloud, reduced query tuning requirements and continuity in SRE tooling, automations, and personnel. This enabled data-driven analytics at scale across the organization.

Data Warehouse

Data Warehouse Cost-Benefit Unstructured Data Data Architecture

The year’s top 10 enterprise AI trends — so far

CIO Business Intelligence

SEPTEMBER 21, 2023

Enterprises still aren’t extracting enough value from unstructured data hidden away in documents, though, says Nick Kramer, VP for applied solutions at management consultancy SSA & Company. Data warehouses then evolved into data lakes, and then data fabrics and other enterprise-wide data architectures.

Enterprise

Enterprise Consulting Modeling Cost-Benefit

What Is Data Modernization? 5 Benefits Worth Knowing

Alation

APRIL 19, 2022

Data modernization is the process of transferring data to modern cloud-based databases from outdated or siloed legacy databases, including structured and unstructured data. In that sense, data modernization is synonymous with cloud migration. Only then can you extract insights across fragmented data architecture.

Cost-Benefit

Cost-Benefit Data Governance Manufacturing Data Architecture

Your Data Architecture Holds the Key to Unlocking AI’s Full Potential

CIO Business Intelligence

APRIL 4, 2023

In order to move AI forward, we need to first build and fortify the foundational layer: data architecture. This architecture is important because, to reap the full benefits of AI, it must be built to scale across an enterprise versus individual AI applications. Constructing the right data architecture cannot be bypassed.

Data Architecture

Data Architecture Data Lake Data Warehouse Cost-Benefit

Unlocking Trino’s Full Potential With Simba Drivers for BI & ETL

Jet Global

OCTOBER 1, 2024

Trino allows users to run ad hoc queries across massive datasets, making real-time decision-making a reality without needing extensive data transformations. This is particularly valuable for teams that require instant answers from their data. Data Lake Analytics: Trino doesn’t just stop at databases.

Dashboards

Dashboards Data Lake Reporting Cost-Benefit

Melting the ice — How Natural Intelligence simplified a data lake migration to Apache Iceberg

AWS Big Data

APRIL 28, 2025

Many organizations turn to data lakes for the flexibility and scale needed to manage large volumes of structured and unstructured data. Recently, NI embarked on a journey to transition their legacy data lake from Apache Hive to Apache Iceberg. NI’s leading brands, Top10.com

Data Lake

Data Lake Metadata Cost-Benefit Snapshot

Tapping into the benefits of an open data lakehouse for enterprise AI

CIO Business Intelligence

NOVEMBER 27, 2024

In short, it takes data—and a lot of it. As it stands, many large organizations find themselves relying on a mix of solutions, platforms, and architectures to handle the volume of structured and unstructured data that has been created as their operations have expanded.

Enterprise

Enterprise Unstructured Data Data Lake Data Warehouse
