Data Lake, Strategy and Unstructured Data

8 tips for unleashing the power of unstructured data

CIO Business Intelligence

NOVEMBER 28, 2023

Making the most of enterprise data is a top concern for IT leaders today. With organizations seeking to become more data-driven in their business decisions, IT leaders must devise data strategies geared toward creating value from data no matter where, or in what form, it resides.

Unstructured Data

Unstructured Data Data-driven Visualization Data Quality

Unstructured data management and governance using AWS AI/ML and analytics services

AWS Big Data

OCTOBER 25, 2023

Unstructured data is information that doesn’t conform to a predefined schema or isn’t organized according to a preset data model. Unstructured information may have a little or a lot of structure but in ways that are unexpected or inconsistent. Text, images, audio, and videos are common examples of unstructured data.

Unstructured Data

Unstructured Data Metadata Management Analytics

Migrate an existing data lake to a transactional data lake using Apache Iceberg

AWS Big Data

OCTOBER 3, 2023

A data lake is a centralized repository that you can use to store all your structured and unstructured data at any scale. You can store your data as-is, without having to first structure it, and run different types of analytics for better business insights.

Data Lake

Data Lake Metadata Snapshot Recreation/Entertainment


The success of GenAI models lies in your data management strategy

CIO Business Intelligence

OCTOBER 9, 2024

The data preparation process should take place alongside a long-term strategy built around GenAI use cases, such as content creation, digital assistants, and code generation. Known as data engineering, this involves setting up a data lake or lakehouse whose data is integrated with GenAI models.

Strategy

Strategy Modeling Management Data Lake

Use Apache Iceberg in a data lake to support incremental data processing

AWS Big Data

MARCH 2, 2023

Iceberg has become very popular for its support for ACID transactions in data lakes and features like schema and partition evolution, time travel, and rollback. AWS Glue 3.0 and later supports the Apache Iceberg framework for data lakes.

Data Lake

Data Lake Data Processing Metadata Snapshot

Enrich your serverless data lake with Amazon Bedrock

AWS Big Data

SEPTEMBER 26, 2024

Organizations are collecting and storing vast amounts of structured and unstructured data like reports, whitepapers, and research documents. By consolidating this information, analysts can discover and integrate data from across the organization, creating valuable data products based on a unified dataset.

Data Lake

Data Lake Cost-Benefit Unstructured Data Modeling

Complexity Drives Costs: A Look Inside BYOD and Azure Data Lakes

Jet Global

NOVEMBER 5, 2020

Option 3: Azure Data Lakes. This leads us to Microsoft’s apparent long-term strategy for D365 F&SCM reporting: Azure Data Lakes. Azure Data Lakes are highly complex and designed with a different fundamental purpose in mind than financial and operational reporting. Azure Data Lakes are complicated.

Data Lake

Data Lake OLAP Data Warehouse Unstructured Data

Outdated business apps can cloud your AI vision

CIO Business Intelligence

FEBRUARY 20, 2025

Decades-old apps designed to retain a limited amount of data due to storage costs at the time are also unlikely to integrate easily with AI tools, says Brian Klingbeil, chief strategy officer at managed services provider Ensono. The aim is to create integration pipelines that seamlessly connect different systems and data sources.

Insurance

Insurance Cost-Benefit Unstructured Data Data Lake

Build a serverless transactional data lake with Apache Iceberg, Amazon EMR Serverless, and Amazon Athena

AWS Big Data

MARCH 10, 2023

Since the deluge of big data over a decade ago, many organizations have learned to build applications to process and analyze petabytes of data. Data lakes have served as a central repository to store structured and unstructured data at any scale and in various formats.

Data Lake

Data Lake Sales Data Warehouse Snapshot

Choosing an open table format for your transactional data lake on AWS

AWS Big Data

JUNE 9, 2023

A modern data architecture enables companies to ingest virtually any type of data through automated pipelines into a data lake, which provides highly durable and cost-effective object storage at petabyte or exabyte scale.

Data Lake

Data Lake Metadata Statistics Optimization

Navigating Data Entities, BYOD, and Data Lakes in Microsoft Dynamics

Jet Global

SEPTEMBER 4, 2020

There is an established body of practice around creating, managing, and accessing OLAP data (known as “cubes”). Data Lakes. There has been a lot of talk over the past year or two in the D365F&SCM world about “data lakes.” There are virtually no rules about what such data looks like. It is unstructured.

Data Lake

Data Lake OLAP Data Warehouse Unstructured Data

The Increasing Importance of Open Table Formats

David Menninger's Analyst Perspectives

OCTOBER 31, 2024

I previously wrote about the importance of open table formats to the evolution of data lakes into data lakehouses. The concept of the data lake was initially proposed as a single environment where data could be combined from multiple sources to be stored and processed to enable analysis by multiple users for multiple purposes.

Data Lake

Data Lake Unstructured Data Data Warehouse Software

Building a Beautiful Data Lakehouse

CIO Business Intelligence

MARCH 9, 2022

As a result, users can easily find what they need, and organizations avoid the operational and cost burdens of storing unneeded or duplicate data copies. Newer data lakes are highly scalable and can ingest structured and semi-structured data along with unstructured data like text, images, video, and audio.

Data Lake

Data Lake Unstructured Data Data Warehouse Big Data

Orca Security’s journey to a petabyte-scale data lake with Apache Iceberg and AWS Analytics

AWS Big Data

JULY 20, 2023

With data becoming the driving force behind many industries today, having a modern data architecture is pivotal for organizations to be successful. In this post, we describe Orca’s journey building a transactional data lake using Amazon Simple Storage Service (Amazon S3), Apache Iceberg, and AWS Analytics.

Data Lake

Data Lake Analytics Snapshot Data Quality

Five Strategies to Accelerate Data Product Development

Cloudera

JULY 26, 2021

With this first article of the two-part series on data product strategies, I am presenting some of the emerging themes in data product development and how they inform the prerequisites and foundational capabilities of an Enterprise data platform that would serve as the backbone for developing successful data product strategies.

Strategy

Strategy Data Science Unstructured Data Marketing

Data governance in the age of generative AI

AWS Big Data

FEBRUARY 29, 2024

Data is your generative AI differentiator, and a successful generative AI implementation depends on a robust data strategy incorporating a comprehensive data governance approach. Data governance is a critical building block across all these approaches, and we see two emerging areas of focus.

Data Governance

Data Governance Unstructured Data Metadata Data Lake

Data’s dark secret: Why poor quality cripples AI and growth

CIO Business Intelligence

APRIL 8, 2025

I aim to outline pragmatic strategies to elevate data quality into an enterprise-wide capability. However, even the most sophisticated models and platforms can be undone by a single point of failure: poor data quality. This challenge remains deceptively overlooked despite its profound impact on strategy and execution.

Data Quality

Data Quality Data-driven Key Performance Indicator Metadata

Carhartt turns to data under new CIO

CIO Business Intelligence

NOVEMBER 25, 2022

As part of that transformation, Agusti has plans to integrate a data lake into the company’s data architecture and expects two AI proofs of concept (POCs) to be ready to move into production within the quarter. Like many CIOs, Carhartt’s top digital leader is aware that data is the key to making advanced technologies work.

Data Lake

Data Lake Data Warehouse Unstructured Data Data Architecture

The rise of the data lakehouse: A new era of data value

CIO Business Intelligence

AUGUST 18, 2022

Previously, Walgreens was attempting to perform that task with its data lake but faced two significant obstacles: cost and time. Those challenges are well-known to many organizations as they have sought to obtain analytical knowledge from their vast amounts of data. Lakehouses redeem the failures of some data lakes.

Data Lake

Data Lake Data Warehouse Unstructured Data Business Intelligence

Top analytics announcements of AWS re:Invent 2024

AWS Big Data

FEBRUARY 26, 2025

Analytics remained one of the key focus areas this year, with significant updates and innovations aimed at helping businesses harness their data more efficiently and accelerate insights. From enhancing data lakes to empowering AI-driven analytics, AWS unveiled new tools and services that are set to shape the future of data and analytics.

Analytics

Analytics Data Lake Metadata Data Warehouse

Monitoring Apache Iceberg metadata layer using AWS Lambda, AWS Glue, and AWS CloudWatch

AWS Big Data

JULY 29, 2024

In the era of big data, data lakes have emerged as a cornerstone for storing vast amounts of raw data in its native format. They support structured, semi-structured, and unstructured data, offering a flexible and scalable environment for data ingestion from multiple sources.

Metadata

Metadata Snapshot Data Lake Metrics

Rocket Mortgage lays foundation for generative AI success

CIO Business Intelligence

MARCH 29, 2024

That’s why Rocket Mortgage has been a vigorous implementor of machine learning and AI technologies, and why CIO Brian Woodring emphasizes a “human in the loop” AI strategy that will not be pinned down to any one generative AI model. “It’s a powerful strategy.” So too is keeping your options open.

Data Lake

Data Lake Machine Learning Data Warehouse Unstructured Data

What is Dark Data, Why Does it Matter, and Why Are Humans Still Needed?

Timo Elliott

JANUARY 3, 2022

It’s stored in corporate data warehouses, data lakes, and a myriad of other locations – and while some of it is put to good use, it’s estimated that around 73% of this data remains unexplored. In this way, you can turn dark data into insights and help drive business improvements. Learn More.

IT

IT Unstructured Data Data Quality Machine Learning

Acquisitions on the Horizon in BI and Data Analytics Industry?

Sisense

MAY 28, 2019

For the leaders that acquire or merge with other emerging leaders, this type of convergence is a strategy to accelerate growth, expand capabilities, and deliver on product roadmaps years ahead of schedule. Two orthogonal approaches to data analytics have developed in this decade of BI: 1. The truth always rises to the surface.

Data Analytics

Data Analytics Data Lake Analytics Unstructured Data

Data transformation takes flight at Atlanta’s Hartsfield-Jackson airport

CIO Business Intelligence

AUGUST 9, 2024

The original proof of concept was to have one data repository ingesting data from 11 sources, including flat files and data stored via APIs on premises and in the cloud, Pruitt says. “There are a lot of variables that determine what should go into the data lake and what will probably stay on premise,” Pruitt says.

Data Transformation

Data Transformation Machine Learning Data Lake Dashboards

Petabyte-scale log analytics with Amazon S3, Amazon OpenSearch Service, and Amazon OpenSearch Ingestion

AWS Big Data

MARCH 7, 2024

At the same time, they need to optimize operational costs to unlock the value of this data for timely insights and do so with a consistent performance. With this massive data growth, data proliferation across your data stores, data warehouse, and data lakes can become equally challenging.

Data Lake

Data Lake Analytics Dashboards Metrics

Straumann Group is transforming dentistry with data, AI

CIO Business Intelligence

FEBRUARY 16, 2023

The Basel, Switzerland-based company, which operates in more than 100 countries, has petabytes of data, including highly structured customer data, data about treatments and lab requests, operational data, and a massive, growing volume of unstructured data, particularly imaging data.

Unstructured Data

Unstructured Data Data Lake Prescriptive Analytics Data Warehouse

Databricks’ new data lakehouse aims at media, entertainment sector

CIO Business Intelligence

APRIL 25, 2022

The data lakehouse is a relatively new data architecture concept, first championed by Cloudera, which offers both storage and analytics capabilities as part of the same solution, in contrast to the concepts for data lake and data warehouse which, respectively, store data in native format, and structured data, often in SQL format.

Recreation/Entertainment

Recreation/Entertainment Data Lake Data Warehouse Unstructured Data

Why Modernizing the First Mile of the Data Pipeline Can Accelerate all Analytics

Cloudera

AUGUST 13, 2021

Every enterprise is trying to collect and analyze data to get better insights into its business. Whether it is consuming log files, sensor metrics, or other unstructured data, most enterprises manage and deliver data to the data lake and leverage various applications like ETL tools, search engines, and databases for analysis.

Analytics

Analytics Data Lake Unstructured Data Data Strategy

FINRA CIO Steve Randich pushes the public cloud forward

CIO Business Intelligence

FEBRUARY 10, 2023

Deploying new data types for machine learning: Mai-Lan Tomsen-Bukovec, vice president of foundational data services at AWS, sees the cloud giant’s enterprise customers deploying more unstructured data, as well as wider varieties of data sets, to inform the accuracy and training of ML models of late.

Unstructured Data

Unstructured Data Data Lake Machine Learning Enterprise

Amazon DataZone announces custom blueprints for AWS services

AWS Big Data

JUNE 26, 2024

New feature: Custom AWS service blueprints. Previously, Amazon DataZone provided default blueprints that created AWS resources required for data lake, data warehouse, and machine learning use cases. You can build projects and subscribe to both unstructured and structured data assets within the Amazon DataZone portal.

Data Lake

Data Lake Data Warehouse Unstructured Data Data Governance

7 key Microsoft Azure analytics services (plus one extra)

CIO Business Intelligence

JUNE 29, 2022

The recent announcement of the Microsoft Intelligent Data Platform makes that more obvious, though analytics is only one part of that new brand. Here we take a look at Microsoft Azure’s essential analytics services, what they are used for, and how they come together to make a comprehensive stack for your analytics strategy in the cloud.

Data Lake

Data Lake Analytics Data Warehouse Machine Learning

Backtesting index rebalancing arbitrage with Amazon EMR and Apache Iceberg

AWS Big Data

JULY 3, 2023

Backtesting is a process used in quantitative finance to evaluate trading strategies using historical data. This helps traders determine the potential profitability of a strategy and identify any risks associated with it, enabling them to optimize it for better performance.

Snapshot

Snapshot Data Lake Testing Strategy

2020 Data Impact Award Winner Spotlight: Merck KGaA

Cloudera

DECEMBER 11, 2020

Many businesses now need to achieve free and open data access in order to derive value and improve efficiencies as they navigate the ‘new norm’, whether that has involved working from a home office or the garden shed. Toolsets and strategies have had to shift to ensure controlled access to data.

Data Lake

Data Lake Cost-Benefit Unstructured Data Data Governance

Build a decentralized semantic search engine on heterogeneous data stores using autonomous agents

AWS Big Data

MAY 28, 2024

Large language models (LLMs) such as Anthropic Claude and Amazon Titan have the potential to drive automation across various business processes by processing both structured and unstructured data. Redshift Serverless is a fully functional data warehouse holding data tables maintained in real time.

Unstructured Data

Unstructured Data Data Warehouse Structured Data Testing

What is a data architect? Skills, salaries, and how to become a data framework master

CIO Business Intelligence

OCTOBER 13, 2023

The data architect also “provides a standard common business vocabulary, expresses strategic requirements, outlines high-level integrated designs to meet those requirements, and aligns with enterprise strategy and related business architecture,” according to DAMA International’s Data Management Body of Knowledge.

Data Architecture

Data Architecture Data Warehouse Statistics Visualization

Data architecture strategy for data quality

IBM Big Data Hub

JANUARY 5, 2023

The first generation of data architectures represented by enterprise data warehouse and business intelligence platforms were characterized by thousands of ETL jobs, tables, and reports that only a small group of specialized data engineers understood, resulting in an under-realized positive impact on the business.

Data Architecture

Data Architecture Data Quality Strategy Data Lake

The Data Journey: From Raw Data to Insights

Sisense

JULY 22, 2020

The growing amount and increasingly varied sources of data that every organization generates make digital transformation a daunting prospect. At Sisense, we’re dedicated to making this complex task simple, putting power in the hands of the builders of business data and strategy, and providing insights for everyone.

Slice and Dice

Slice and Dice Digital Transformation Data Warehouse Data Lake

Shutterstock capitalizes on the cloud’s cutting edge

CIO Business Intelligence

MARCH 6, 2023

Advancements in analytics and AI as well as support for unstructured data in centralized data lakes are key benefits of doing business in the cloud, and Shutterstock is capitalizing on its cloud foundation, creating new revenue streams and business models using the cloud and data lakes as key components of its innovation platform.

Data Lake

Data Lake Cost-Benefit Recreation/Entertainment Unstructured Data

A comparative assessment of digital transformation in Italy

CIO Business Intelligence

APRIL 24, 2024

In fact, AMA collects a huge amount of structured and unstructured data from bins, collection vehicles, facilities, and user reports, and until now, this data has remained disconnected, managed by disparate systems and interfaces, through Excel spreadsheets.

Digital Transformation

Digital Transformation Business Intelligence Unstructured Data Data Lake

Data science vs data analytics: Unpacking the differences

IBM Big Data Hub

SEPTEMBER 19, 2023

Data science is an area of expertise that combines many disciplines such as mathematics, computer science, software engineering and statistics. It focuses on data collection and management of large-scale structured and unstructured data for various academic and business applications.

Data Science

Data Science Data Analytics Prescriptive Analytics Analytics

TDC Digital leverages IBM Cloud for transparent billing and improved customer satisfaction

IBM Big Data Hub

MAY 19, 2023

According to the research, organizations are adopting cloud ERP models to identify the best alignment with their strategy, business development, workloads and security requirements. With high-speed file transfer, integrated services and cross-region offerings, IBM Cloud Object Storage allows you to leverage your data securely.

Unstructured Data

Unstructured Data Data Processing Manufacturing Data Lake

Data Visualization and Visual Analytics: Seeing the World of Data

Sisense

JUNE 30, 2020

Prescriptive analytics takes things a stage further: In addition to helping organizations understand causes, it helps them learn from what’s happened and shape tactics and strategies that can improve their current performance and their profitability. A simple example would be the analysis of marketing campaigns.

Visualization

Visualization Analytics Dashboards Data-driven

Belcorp reimagines R&D with AI

CIO Business Intelligence

JUNE 28, 2023

The R&D laboratories produced large volumes of unstructured data, which were stored in various formats, making it difficult to access and trace. Follow a value-focused strategy. “These stages significantly influence the iterative process of conceptualizing and rolling out a new product,” Gopalan says.

Digital Transformation

Digital Transformation Cost-Benefit Informatics Data mining
