Data Lake, Data Strategy and IT - Data Leaders Brief

Modernize your legacy databases with AWS data lakes, Part 2: Build a data lake using AWS DMS data on Apache Iceberg

AWS Big Data

OCTOBER 30, 2024

This is part two of a three-part series where we show how to build a data lake on AWS using a modern data architecture. This post shows how to load data from a legacy database (SQL Server) into a transactional data lake (Apache Iceberg) using AWS Glue. To start the job, choose Run.

Data Lake

Data Lake, Data Processing, Optimization, Machine Learning

The data flywheel: A better way to think about your data strategy

CIO Business Intelligence

OCTOBER 25, 2022

Data & Analytics is delivering on its promise. Every day, it helps countless organizations do everything from measuring their ESG impact to creating new streams of revenue. Consequently, companies without strong data cultures or concrete plans to build one are feeling the pressure. We discourage that thinking.

Data Strategy

Data Strategy, Strategy, Data Lake, Data-driven

How BMW streamlined data access using AWS Lake Formation fine-grained access control

AWS Big Data

OCTOBER 29, 2024

With over 10 PB of data across 1,500 data assets, 1,000 data use cases, and more than 9,000 users, the BMW CDH has become a resounding success since BMW decided to build it in a strategic collaboration with Amazon Web Services (AWS) in 2020. This led to inefficiencies in data governance and access control.

Data Lake

Data Lake, Sales, Metadata, Machine Learning

Build a serverless transactional data lake with Apache Iceberg, Amazon EMR Serverless, and Amazon Athena

AWS Big Data

MARCH 10, 2023

Since the deluge of big data over a decade ago, many organizations have learned to build applications to process and analyze petabytes of data. Data lakes have served as a central repository to store structured and unstructured data at any scale and in various formats.

Data Lake

Data Lake, Sales, Data Warehouse, Snapshot

How Cloudinary transformed their petabyte scale streaming data lake with Apache Iceberg and AWS Analytics

AWS Big Data

JUNE 10, 2024

A modern data strategy redefines and enables sharing data across the enterprise and allows for both reading and writing of a singular instance of the data using an open table format. Why Cloudinary chose Apache Iceberg: Apache Iceberg is a high-performance table format for huge analytic workloads.

Data Lake

Data Lake, Metadata, Snapshot, Analytics

Differences Between Data Lake and Data Warehouses

TDAN

SEPTEMBER 14, 2021

Data lake is a newer IT term created for a new category of data store. But just what is a data lake? According to IBM, “a data lake is a storage repository that holds an enormous amount of raw or refined data in native format until it is accessed.” That makes sense. I think the […].

Data Lake

Data Lake, Data Warehouse, IT, Data Strategy

Steps taken to build Sevita’s first enterprise data platform

CIO Business Intelligence

OCTOBER 23, 2024

You’re building an enterprise data platform for the first time in Sevita’s history. Our legacy architecture consisted of multiple standalone, on-prem data marts intended to integrate transactional data from roughly 30 electronic health record systems to deliver a reporting capability. What’s driving this investment?

Enterprise

Enterprise, Dashboards, KPI, Data Lake

Simplify operational data processing in data lakes using AWS Glue and Apache Hudi

AWS Big Data

SEPTEMBER 13, 2023

A modern data architecture is an evolutionary architecture pattern designed to integrate a data lake, data warehouse, and purpose-built stores with a unified governance model. The company wanted the ability to continue processing operational data in the secondary Region in the rare event of primary Region failure.

Data Lake

Data Lake, Data Processing, Metadata, Snapshot

Deriving Value from Data Lakes with AI

Sisense

DECEMBER 23, 2019

Data is growing at a phenomenal rate and that’s not going to stop anytime soon. AI and ML are the only ways to derive value from massive data lakes, cloud-native data warehouses, and other huge stores of information. Once your data is prepared for analysis, the next question is: how else can AI help you?

Data Lake

Data Lake, Machine Learning, Data Warehouse, Data Science

Enable business users to analyze large datasets in your data lake with Amazon QuickSight

AWS Big Data

JUNE 23, 2023

Events and many other security data types are stored in Imperva’s Threat Research Multi-Region data lake. Imperva harnesses data to improve their business outcomes. As part of their solution, they are using Amazon QuickSight to unlock insights from their data.

Data Lake

Data Lake, Dashboards, Cost-Benefit, Data Warehouse

Build a transactional data lake using Apache Iceberg, AWS Glue, and cross-account data shares using AWS Lake Formation and Amazon Athena

AWS Big Data

APRIL 24, 2023

Building a data lake on Amazon Simple Storage Service (Amazon S3) provides numerous benefits for an organization. However, many use cases, like performing change data capture (CDC) from an upstream relational database to an Amazon S3-based data lake, require handling data at a record level.

Data Lake

Data Lake, Data Governance, Machine Learning, Cost-Benefit

The rise of the data lakehouse: A new era of data value

CIO Business Intelligence

AUGUST 18, 2022

Previously, Walgreens was attempting to perform that task with its data lake but faced two significant obstacles: cost and time. Those challenges are well-known to many organizations as they have sought to obtain analytical knowledge from their vast amounts of data. Lakehouses redeem the failures of some data lakes.

Data Lake

Data Lake, Data Warehouse, Unstructured Data, Business Intelligence

Create an end-to-end data strategy for Customer 360 on AWS

AWS Big Data

MARCH 26, 2024

A Gartner Marketing survey found only 14% of organizations have successfully implemented a C360 solution, due to lack of consensus on what a 360-degree view means, challenges with data quality, and lack of cross-functional governance structure for customer data. You need to process this to make it ready for analysis.

Data Strategy

Data Strategy, Strategy, Data Warehouse, Prescriptive Analytics

Use open table format libraries on AWS Glue 5.0 for Apache Spark

AWS Big Data

DECEMBER 4, 2024

Open table formats are emerging in the rapidly evolving domain of big data management, fundamentally altering the landscape of data storage and analysis. By providing a standardized framework for data representation, open table formats break down data silos, enhance data quality, and accelerate analytics at scale.

Snapshot

Snapshot, Metadata, Data Lake, Optimization

Data Swamp, Data Lake, Data Lakehouse: What to Know

Alation

OCTOBER 21, 2021

Data Swamp vs Data Lake. When you imagine a lake, it’s likely an idyllic image of a tree-ringed body of reflective water amid singing birds and dabbling ducks. I’ll take the lake, thank you very much. Data is the raw material for the modern business apparatus. Benefits of a Data Lake.

Data Lake

Data Lake, Metadata, Data Warehouse, Data Governance

Data Champions: Balancing IT and Business Needs

Cloudera

SEPTEMBER 10, 2020

This is because the majority of IT departments find it near impossible to just ‘ramp up’ data use, and even more difficult to do so at scale. Data Champions find the common ground that successfully meets the requirements of both business AND IT. How do you balance the business and IT needs around data access in your organization?

IT

IT, Business Objectives, Digital Transformation, Data-driven

Expand data access through Apache Iceberg using Delta Lake UniForm on AWS

AWS Big Data

NOVEMBER 14, 2024

This interoperability is crucial for enabling seamless data access, reducing data silos, and fostering a more flexible and efficient data ecosystem. Delta Lake UniForm is an open table format extension designed to provide a universal data representation that can be efficiently read by different processing engines.

Metadata

Metadata, Data Warehouse, Big Data, Data Lake

Detect, mask, and redact PII data using AWS Glue before loading into Amazon OpenSearch Service

AWS Big Data

JANUARY 12, 2024

In this post, we walk through a high-level architecture and a specific use case that demonstrates how you can continue to scale your organization’s data platform without needing to spend large amounts of development time to address data privacy concerns. The data will be consumed by downstream analytical processes.

Data Lake

Data Lake, Cost-Benefit, Visualization, Structured Data

It’s not your data. It’s how you use it. Unlock the power of data & build foundations of a data driven organisation

CIO Business Intelligence

MAY 24, 2022

Australian research and advisory firm Adapt identifies an organisation’s ability to execute a data-driven strategy as one of 12 core competencies, identified from 30,000 conversations spanning three years with leading IT and businesses. analyse the data, using business intelligence, visualisation or data science tools.

Data-driven

Data-driven, Data Lake, Data Warehouse, Machine Learning

Top analytics announcements of AWS re:Invent 2024

AWS Big Data

FEBRUARY 26, 2025

Analytics remained one of the key focus areas this year, with significant updates and innovations aimed at helping businesses harness their data more efficiently and accelerate insights. From enhancing data lakes to empowering AI-driven analytics, AWS unveiled new tools and services that are set to shape the future of data and analytics.

Analytics

Analytics, Data Lake, Metadata, Data Warehouse

Harness Zero Copy data sharing from Salesforce Data Cloud to Amazon Redshift for Unified Analytics – Part 2

AWS Big Data

SEPTEMBER 12, 2024

Some of the important ones for Zero Copy data sharing include: data sharing is supported for all provisioned RA3 instance types (ra3.16xlarge, ra3.4xlarge, and ra3.xlplus); for cross-account and cross-Region data sharing, both the producer and consumer clusters and serverless namespaces must be encrypted.

Data Lake

Data Lake, Analytics, Data-driven, Data Strategy

What you don’t know about data management could kill your business

CIO Business Intelligence

NOVEMBER 28, 2023

IT leaders take note: At your likely current trajectory, your organization is the Titanic and its data is the iceberg. To avoid the inevitable, CIOs must get serious about data management. Data, of course, has been all the rage the past decade, having been declared the “new oil” of the digital economy.

Management

Management, Data Architecture, Data Lake, Data Strategy

Why Modernizing the First Mile of the Data Pipeline Can Accelerate all Analytics

Cloudera

AUGUST 13, 2021

Every enterprise is trying to collect and analyze data to get better insights into their business. Whether it is consuming log files, sensor metrics, or other unstructured data, most enterprises manage and deliver data to the data lake and leverage various applications like ETL tools, search engines, and databases for analysis.

Analytics

Analytics, Data Lake, Unstructured Data, Enterprise

SoftBank Selects Cloudera Data Platform to Leverage Customer Intelligence While Ensuring Data Security

Cloudera

MAY 9, 2023

Supporting Data Access to Achieve Data-Driven Innovation: Due to the spread of COVID-19, demand for digital services has increased at SoftBank. Cloudera Data Platform (CDP) will enable SoftBank to increase resources flexibly as needed and adjust resources to meet business needs.

Data Lake

Data Lake, IoT, Data Governance, Data-driven

Creating Data Value With a Decentralized Data Strategy

CIO Business Intelligence

APRIL 6, 2022

For decades organizations chased the Holy Grail of a centralized data warehouse/lake strategy to support business intelligence and advanced analytics. Thinking about that intelligence as having millions of loosely connected decision points at the edge requires a different strategy, and you can’t micromanage it.

Data Strategy

Data Strategy, Strategy, Internet of Things, Data Warehouse

AWS Lake Formation 2022 year in review

AWS Big Data

JANUARY 31, 2023

Data governance is the collection of policies, processes, and systems that organizations use to ensure the quality and appropriate handling of their data throughout its lifecycle for the purpose of generating business value. In November 2022, Lake Formation introduced version 3 of its cross-account sharing feature.

Data Lake

Data Lake, Data Governance, Data Architecture, Machine Learning

How Data Analytics Tools Eliminate Business Owner Headaches

Smart Data Collective

AUGUST 7, 2019

Big data has the power to transform any small business. One study found that 77% of small businesses don’t even have a big data strategy. If your company lacks a big data strategy, then you need to start developing one today. The task of analyzing data is no simple feat. IT log data management tool.

Data Analytics

Data Analytics, Analytics, Big Data, Advertising

Harness Zero Copy data sharing from Salesforce Data Cloud to Amazon Redshift for Unified Analytics – Part 1

AWS Big Data

AUGUST 27, 2024

This unified view helps your sales, service, and marketing teams build personalized customer experiences, invoke data-driven actions and workflows, and safely drive AI across all Salesforce applications. Instead, you simply connect and use the data in place, unlocking its value immediately with on demand access to the most recent data.

Data Lake

Data Lake, Analytics, Data-driven, Management

Build an Amazon Redshift data warehouse using an Amazon DynamoDB single-table design

AWS Big Data

JUNE 21, 2023

This approach comes with a heavy computational cost in terms of processing and distributing the data across multiple tables while ensuring the system is ACID-compliant at all times, which can negatively impact performance and scalability. These types of queries are suited for a data warehouse. This is called index overloading.

Data Warehouse

Data Warehouse, Data Lake, OLAP, Cost-Benefit

Data platform: a boost for customer experience and AI projects

CIO Business Intelligence

JUNE 17, 2024

For Grendele, the 100% cloud-based data platform is in fact the foundation of the digital transformation program: “It guarantees that we can use data with the necessary update frequency and speed, unlike what would happen with a data warehouse,” the IT Director emphasizes.

Data Lake

Data Lake, Data Warehouse, Data Strategy, Strategy

How Etihad taps data science to optimise airline operations

CIO Business Intelligence

MARCH 9, 2022

Despite the worldwide chaos, UAE national airline Etihad has managed to generate productivity gains and cost savings from insights using data science. Etihad began its data science journey with the Cloudera Data Platform and moved its data to the cloud to set up a data lake. A change was needed. Talal Mufti.

Data Science

Data Science, Data Lake, Cost-Benefit, Digital Transformation

Straumann Group is transforming dentistry with data, AI

CIO Business Intelligence

FEBRUARY 16, 2023

The company’s orthodontics business, for instance, makes heavy use of image processing to the point that unstructured data is growing at a pace of roughly 20% to 25% per month. For example, imaging data can be used to show patients how an aligner will change their appearance over time. The offensive side?

Unstructured Data

Unstructured Data, Data Lake, Prescriptive Analytics, Data Warehouse

Data governance in the age of generative AI

AWS Big Data

FEBRUARY 29, 2024

Data is your generative AI differentiator, and a successful generative AI implementation depends on a robust data strategy incorporating a comprehensive data governance approach. As part of the transformation, the objects need to be treated to ensure data privacy (for example, PII redaction).

Data Governance

Data Governance, Unstructured Data, Metadata, Data Lake

Building a vision for real-time artificial intelligence

CIO Business Intelligence

APRIL 12, 2023

He had been trying to gather new data insights but was frustrated at how long it was taking. Data is a key component when it comes to making accurate and timely recommendations and decisions in real time, particularly when organizations try to implement real-time artificial intelligence. Sound familiar? It isn’t easy.

Machine Learning

Machine Learning, Cost-Benefit, Data-driven, Strategy

CIO Ryan Snyder on the benefits of interpreting data as a layer cake

CIO Business Intelligence

AUGUST 2, 2023

A data and analytics capability cannot emerge from an IT or business strategy alone. With both technology and business organization deeply involved in the what, why, and how of data, companies need to create cross-functional data teams to get the most out of it. That strategy is doomed to fail. What are the layers?

Manufacturing

Manufacturing, Data Architecture, Data Strategy, Strategy

Differentiate generative AI applications with your data using AWS analytics and managed databases

AWS Big Data

SEPTEMBER 12, 2024

However, if you use generative AI with your domain-specific data, it can provide a valuable perspective for your business and enable you to build differentiated generative AI applications and products that will stand out from others. In essence, you have to enrich the generative AI models with your differentiated data.

Management

Management, Analytics, Data Lake, Interactive

Data Architecture and Strategy in the AI Era

Cloudera

MARCH 28, 2024

At a time when AI is exploding in popularity and finding its way into nearly every facet of business operations, data has arguably never been more valuable. In fact, two thirds of respondents agreed that data lakehouses were crucial to reducing pipeline complexity.

Data Architecture

Data Architecture, Strategy, Data Lake, Data-driven

Achieve your AI goals with an open data lakehouse approach

IBM Big Data Hub

OCTOBER 4, 2023

Artificial intelligence (AI) is now at the forefront of how enterprises work with data to help reinvent operations, improve customer experiences, and maintain a competitive advantage. It’s no longer a nice-to-have, but an integral part of a successful data strategy.

Data Lake

Data Lake, Metadata, Data Warehouse, Cost-Benefit

Addressing the Elephant in the Room – Welcome to Today’s Cloudera

Cloudera

JUNE 13, 2024

There were thousands of attendees at the event – lining up for book signings and meetings with recruiters to fill the endless job openings for developers experienced with MapReduce and managing Big Data. This was the gold rush of the 21st century, except the gold was data. That is the key to our open data lakehouse architecture.

Big Data

Big Data, Machine Learning, Contextual Data, Data Lake

Why Game Studios Should Exploit Visual Analytics | BizAcuity

BizAcuity

SEPTEMBER 5, 2022

Inability to get player-level data from the operators. It does not make sense for most casino suppliers to opt for integrated data solutions like data warehouses or data lakes, which are expensive to build and maintain. They do not have a single view of their data which affects them. The Data Strategy.

Visualization

Visualization, Analytics, Data Warehouse, Data Lake

Unstructured data management and governance using AWS AI/ML and analytics services

AWS Big Data

OCTOBER 25, 2023

Why it’s challenging to process and manage unstructured data: Unstructured data makes up a large proportion of the data in the enterprise that can’t be stored in a traditional relational database management system (RDBMS). Understanding the data, categorizing it, storing it, and extracting insights from it can be challenging.

Unstructured Data

Unstructured Data, Metadata, Management, Analytics

Why Can’t we Advance Healthcare and Life Sciences this Fast all the time?

Cloudera

APRIL 4, 2022

While challenges exist in data interoperability, privacy controls, ongoing compliance initiatives, etc., the industry has proven speed is possible despite these obstacles. The use of data lakes and automation is helping facilitate data sharing and collaboration across the healthcare ecosystem.

Data Lake

Data Lake, Digital Transformation, Manufacturing, Sales

What is a data architect? Skills, salaries, and how to become a data framework master

CIO Business Intelligence

OCTOBER 13, 2023

Data architecture is a complex and varied field, and different organizations and industries have unique needs when it comes to their data architects. Solutions data architect: These individuals design and implement data solutions for specific business needs, including data warehouses, data marts, and data lakes.

Data Architecture

Data Architecture, Data Warehouse, Statistics, Visualization

Advance Your Data-first Business With a Robust ISV Ecosystem

CIO Business Intelligence

JULY 18, 2022

To fully capitalize on data-first modernization, organizations need secure access to data spread across the IT landscape. Data is in constant flux, due to exponential growth, varied formats and structure, and the velocity at which it is being generated. An ISV ecosystem at work.

Cost-Benefit

Cost-Benefit, Data Lake, Data Warehouse, Enterprise
