Book, Data Lake and Data Warehouse

Expand data access through Apache Iceberg using Delta Lake UniForm on AWS

AWS Big Data

NOVEMBER 14, 2024

This post explores how to start using Delta Lake UniForm on Amazon Web Services (AWS). You can learn how to query native Delta Lake tables through UniForm from other data warehouses or engines, such as Amazon Redshift, as an example of expanding data access to more engines. This takes around 2 minutes.

Metadata

Metadata Data Warehouse Big Data Data Lake

2021 Gift Giving Guide for Data Nerds

DataKitchen

DECEMBER 7, 2021

Back by popular demand, we’ve updated our data nerd Gift Giving Guide to cap off 2021. We’ve kept some classics and added some new titles that are sure to put a smile on your data nerd’s face. Here are eight highly recommendable books to help you find that special gift. How did we get here?

Data-driven

Data-driven Data Governance Big Data Data Science

What is data architecture? A framework to manage data

CIO Business Intelligence

DECEMBER 20, 2024

Beyond breaking down silos, modern data architectures need to provide interfaces that make it easy for users to consume data using tools fit for their jobs. Data must be able to freely move to and from data warehouses, data lakes, and data marts, and interfaces must make it easy for users to consume that data.

Data Architecture

Data Architecture Management Consulting Internet of Things

Simplify operational data processing in data lakes using AWS Glue and Apache Hudi

AWS Big Data

SEPTEMBER 13, 2023

A modern data architecture is an evolutionary architecture pattern designed to integrate a data lake, data warehouse, and purpose-built stores with a unified governance model. Of those tables, some are larger (such as in terms of record volume) than others, and some are updated more frequently than others.

Data Lake

Data Lake Data Processing Metadata Snapshot

Introducing generative AI upgrades for Apache Spark in AWS Glue (preview)

AWS Big Data

NOVEMBER 22, 2024

job reads a dataset, updated daily in an S3 bucket under different partitions, containing new book reviews from an online marketplace and runs SparkSQL to gather insights into the user votes for the book reviews. Understanding the upgrade process through an example: We now show a production Glue 2.0 using the Spark Upgrade feature.

Cost-Benefit

Cost-Benefit Data-driven Software Testing

Top analytics announcements of AWS re:Invent 2024

AWS Big Data

FEBRUARY 26, 2025

Analytics remained one of the key focus areas this year, with significant updates and innovations aimed at helping businesses harness their data more efficiently and accelerate insights. From enhancing data lakes to empowering AI-driven analytics, AWS unveiled new tools and services that are set to shape the future of data and analytics.

Analytics

Analytics Data Lake Metadata Data Warehouse

MLOps and DevOps: Why Data Makes It Different

O'Reilly on Data

OCTOBER 19, 2021

Adapted from the book Effective Data Science Infrastructure. Data is at the core of any ML project, so data infrastructure is a foundational concern. ML use cases rarely dictate the master data management solution, so the ML stack needs to integrate with existing data warehouses.

IT

IT Testing Experimentation Software

O’Reilly Releases First Chapters of a New Book about Logical Data Management

Data Virtualization

JANUARY 21, 2025

However, companies are still struggling to manage data effectively, to implement GenAI applications that deliver proven business value.

Management

Management Data Integration Technology Data Warehouse

How smava makes loans transparent and affordable using Amazon Redshift Serverless

AWS Big Data

DECEMBER 21, 2023

To speed up the self-service analytics and foster innovation based on data, a solution was needed to provide ways to allow any team to create data products on their own in a decentralized manner. To create and manage the data products, smava uses Amazon Redshift, a cloud data warehouse.

Data Lake

Data Lake Data Warehouse Data-driven B2B

Your guide to AWS Analytics at AWS re:Invent 2023

AWS Big Data

NOVEMBER 13, 2023

For those in the data world, this post provides a curated guide for all analytics sessions that you can use to quickly schedule and build your itinerary. Book your spot early for the sessions you do not want to miss. 11:30 AM – 12:30 PM (PDT) Caesars Forum ANT318 | Accelerate innovation with end-to-end serverless data architecture.

Analytics

Analytics Data Lake Data Warehouse Data-driven

Data Modeling 201 for the cloud: designing databases for data warehouses

erwin

JUNE 7, 2022

Designing databases for data warehouses or data marts is intrinsically much different than designing for traditional OLTP systems. Accordingly, data modelers must embrace some new tricks when designing data warehouses and data marts. Figure 1: Pricing for a 4 TB data warehouse in AWS.

Data Warehouse

Data Warehouse Modeling Sales Data Lake

What you don’t know about data management could kill your business

CIO Business Intelligence

NOVEMBER 28, 2023

The knock-on impact of this lack of analyst coverage is a paucity of data about monies being spent on data management. In reality, MDM (master data management) means Major Data Mess at most large firms, the end result of 20-plus years of throwing data into data warehouses and data lakes without a comprehensive data strategy.

Management

Management Data Architecture Data Lake Data Strategy

Exploring real-time streaming for generative AI Applications

AWS Big Data

MARCH 25, 2024

This dynamic integration of streaming data enables generative AI applications to respond promptly to changing conditions, improving their adaptability and overall performance in various tasks. To better understand this, imagine a chatbot that helps travelers book their travel.

Data Lake

Data Lake Unstructured Data Management Snapshot

What’s cooking with Amazon Redshift at AWS re:Invent 2023

AWS Big Data

NOVEMBER 15, 2023

Connect with experts, meet with book authors on data warehousing and analytics (at the Meet the Authors event on November 29 and 30, 3:00 PM – 4:00 PM), win prizes, and learn all about the latest innovations from our AWS Analytics services.

Data Lake

Data Lake Data Warehouse B2B Deep Learning

Q&A with Greg Rahn – The changing Data Warehouse market

Cloudera

DECEMBER 12, 2018

I was a student system administrator for the campus computing group and at that time they were migrating the campus phone book to a new tool, new to me, known as Oracle. After having rebuilt their data warehouse, I decided to take a little bit more of a pointed role, and I joined Oracle as a database performance engineer.

Data Warehouse

Data Warehouse Marketing Big Data Data-driven

Wonderla Holidays goes digital to enhance business and customer fun

CIO Business Intelligence

OCTOBER 18, 2022

One pulse sends 150 bytes of data. So, each band can send out 500KB to 750KB of data. To handle the huge volume of data thus generated, the company is in the process of deploying a data lake, data warehouse, and real-time analytical tools in a hybrid model.

Data Lake

Data Lake Data Warehouse Cost-Benefit Digital Transformation

Using Synapse Services with Dynamics? These Tools Make it Easier

Jet Global

MAY 27, 2022

How Synapse works with Data Lakes and Warehouses. Synapse services, data lakes, and data warehouses are often discussed together. Here’s how they correlate: Data lake: An information repository that can be stored in a variety of different ways, typically in a raw format like SQL.

Data Lake

Data Lake IT Recreation/Entertainment Data Warehouse

Build a decentralized semantic search engine on heterogeneous data stores using autonomous agents

AWS Big Data

MAY 28, 2024

The details of each step are as follows: Populate the Amazon Redshift Serverless data warehouse with company stock information stored in Amazon Simple Storage Service (Amazon S3). Redshift Serverless is a fully functional data warehouse holding data tables maintained in real time.

Unstructured Data

Unstructured Data Data Warehouse Structured Data Testing

Using other CDP services with Cloudera Operational Database

Cloudera

FEBRUARY 16, 2021

Cloudera Operational Database enables developers to quickly build future-proof applications that are architected to handle data evolution. Many business applications such as flight booking and mobile banking rely on a database that can scale and serve data at low latency. Cloudera Data Warehouse to perform ETL operations.

Machine Learning

Machine Learning Data Lake Enterprise Data Warehouse

Data Visualization and Visual Analytics: Seeing the World of Data

Sisense

JUNE 30, 2020

The data drawn from power visualizations comes from a variety of sources: Structured data, in the form of relational databases such as Excel, or unstructured data, deriving from text, video, audio, photos, the internet and smart devices. Her debut novel, The Book of Jeremiah, was published in 2019.

Visualization

Visualization Analytics Dashboards Data-driven

How AWS helped Altron Group accelerate their vision for optimized customer engagement

AWS Big Data

JULY 13, 2023

Generating business outcomes: In 4 days, the Altron SI team left the Immersion Day workshop with the following: A data pipeline ingesting data from 21 sources (SQL tables and files) and combining them into three mastered and harmonized views that are cataloged for Altron’s B2B accounts.

Optimization

Optimization B2B Data Quality Sales

Architectural patterns for real-time analytics using Amazon Kinesis Data Streams, part 1

AWS Big Data

JANUARY 8, 2024

In his spare time, Raghavarao enjoys spending time with his family, reading books, and watching movies. Hang Zuo is a Senior Product Manager on the Amazon Kinesis Data Streams team at Amazon Web Services.

Analytics

Analytics IoT Data-driven Snapshot

How foundation models and data stores unlock the business potential of generative AI

IBM Big Data Hub

AUGUST 1, 2023

models are trained on IBM’s curated, enterprise-focused data lake. Fortunately, data stores serve as secure data repositories and enable foundation models to scale in terms of both their size and their training data. Foundation models focused on enterprise value IBM’s watsonx.ai All watsonx.ai

Modeling

Modeling Cost-Benefit Machine Learning Data Lake

Five benefits of a data catalog

IBM Big Data Hub

DECEMBER 16, 2022

You have a specific book in mind, but you have no idea where to find it. You enter the title of the book into the computer and the library’s digital inventory system tells you the exact section and aisle where the book is located. It uses metadata and data management tools to organize all data assets within your organization.

Metadata

Metadata Data Quality Data-driven Data Governance

Scale knowledge management use cases with generative AI

IBM Big Data Hub

JULY 27, 2023

Powering a knowledge management system with a data lakehouse: Organizations need a data lakehouse to target data challenges that come with deploying an AI-powered knowledge management system. It provides the combination of data lake flexibility and data warehouse performance to help to scale AI.

Management

Management Enterprise Modeling Data Quality

This Structure has Novel Features which are of Considerable Business Interest

Peter James Thomas

APRIL 3, 2020

In fact it is the crucial final link between an organisation’s data and the people who need to use it. In many ways how people experience data capabilities will be determined by this final link. When the sadly common refrain of “we built state-of-the-art data capabilities, why is no one using them?”

Dashboards

Dashboards Reporting Sales Data Lake

Why Data Culture Made Me Pack a Space Suit and Head to Orlando

Alation

FEBRUARY 13, 2020

In his book titled “The Fourth Industrial Revolution,” Klaus Schwab describes the age as, “characterized by a much more ubiquitous and mobile internet, by smaller and more powerful sensors that have become cheaper, and by artificial intelligence and machine learning.”

Machine Learning

Machine Learning Data-driven Big Data Technology

Unlock The Power of Your Data With These 19 Big Data & Data Analytics Books

datapine

AUGUST 29, 2022

With that in mind, we have prepared a list of the top 19 definitive data analytics and big data books, along with magazines and authentic readers’ reviews upvoted by the Goodreads community. Essential Big Data And Data Analytics Insights. Discover The Best Data Analytics And Big Data Books Of All Time.

Big Data

Big Data Data Analytics Analytics Data mining

What is a Data Pipeline?

Jet Global

MAY 9, 2024

The key components of a data pipeline are typically: Data Sources: The origin of the data, such as a relational database, data warehouse, data lake, file, API, or other data store. This can include tasks such as data ingestion, cleansing, filtering, aggregation, or standardization.

Data Lake

Data Lake Data Warehouse Business Intelligence Machine Learning

Enhance Trino Performance With Simba’s Powerful Connectivity

Jet Global

JANUARY 30, 2025

Its distributed architecture empowers organizations to query massive datasets across databases, data lakes, and cloud platforms with speed and reliability. Optimizing connections to your data sources is equally important, as it directly impacts the speed and efficiency of data access.

Data Lake

Data Lake Data-driven Optimization Enterprise

Remodel Your Oracle Cloud Data with a Data Lakehouse

Jet Global

NOVEMBER 21, 2023

To have any hope of generating value from growing data sets, enterprise organizations must turn to the latest technology. You’ve heard of data warehouses, and probably data lakes, but now, the data lakehouse is emerging as the new corporate buzzword. To address this, the data lakehouse was born.

Data Lake

Data Lake Data Warehouse Reporting Enterprise

What is Data Mapping?

Jet Global

FEBRUARY 23, 2024

This includes cleaning, aggregating, enriching, and restructuring data to fit the desired format. Load : Once data transformation is complete, the transformed data is loaded into the target system, such as a data warehouse, database, or another application.

Data Warehouse

Data Warehouse Reporting Data Transformation Visualization

Do the Benefits of Cloud Outweigh the Costs?

Jet Global

SEPTEMBER 19, 2023

Data Access What insights can we derive from our cloud ERP? What are the best practices for analyzing cloud ERP data? Data Management How do we create a data warehouse or data lake in the cloud using our cloud ERP? How do I access the legacy data from my previous ERP?

Cost-Benefit

Cost-Benefit Data Warehouse Reporting Enterprise

Oracle Cloud Migration FAQs Answered by Angles

Jet Global

SEPTEMBER 30, 2022

What are the best practices for analyzing cloud ERP data? Data Management. How do we create a data warehouse or data lake in the cloud using our cloud ERP? How do I access the legacy data from my previous ERP? How can we rapidly build BI reports on cloud ERP data without any help from IT?

Reporting

Reporting Data Warehouse Operational Reporting Enterprise

The Right Tool to Support Your Microsoft Dynamics Migration

Jet Global

JUNE 13, 2022

When migrating to the cloud, there are a variety of different approaches you can take to maintain your data strategy. Those options include: Data lake or Azure Data Lake Services (ADLS) is Microsoft’s new data solution, which provides unstructured data analytics through AI. Different Approaches to Migration.

Reporting

Reporting Data Lake Sales Operational Reporting

Unlocking Trino’s Full Potential With Simba Drivers for BI & ETL

Jet Global

OCTOBER 1, 2024

Trino allows users to run ad hoc queries across massive datasets, making real-time decision-making a reality without needing extensive data transformations. This is particularly valuable for teams that require instant answers from their data. Data Lake Analytics: Trino doesn’t just stop at databases.

Dashboards

Dashboards Data Lake Reporting Cost-Benefit

Overcome These 4 Common D365 F&SCM Challenges with Jet Reports

Jet Global

APRIL 26, 2022

For companies that operate multiple corporate entities, the most common approach is to create distinct companies within D365 F&SCM, each with its own set of books. Jet Reports now offers high performance connectivity with options to connect to Synapse/Azure Data Lakes, BYOD, SQL or your Cubes and Tabular models.

Reporting

Reporting Finance Cost-Benefit Forecasting

Access your existing data and resources through Amazon SageMaker Unified Studio, Part 1: AWS Glue Data Catalog and Amazon Redshift

AWS Big Data

APRIL 28, 2025

Use existing AWS Glue tables. This section has the following prerequisites: A data lake administrator user, created by following Create a data lake administrator. For detailed instructions, see Revoking permission using the Lake Formation console. Noritaka Sekiyama is a Principal Big Data Architect on the AWS Glue team.

Metadata

Metadata Data Lake Big Data Publishing

Modernize your legacy databases with AWS data lakes, Part 3: Build a data lake processing layer

AWS Big Data

OCTOBER 30, 2024

This is the final part of a three-part series where we show how to build a data lake on AWS using a modern data architecture. This post shows how to process data with Amazon Redshift Spectrum and create the gold (consumption) layer. The following diagram illustrates the different layers of the data lake.

Data Lake

Data Lake Machine Learning Data Architecture Data-driven

SAP BPC Alternatives: Which One is Right for You?

Jet Global

MARCH 27, 2025

data lakes & warehouses like Cloudera, Google BigQuery, etc., Scalability: Your source systems, data volumes, and calculation complexities change as your business evolves. This includes databases like Microsoft SQL Server, IBM DB2, etc., ERP & accounting systems like Microsoft Dynamics 365, Sage, QuickBooks, etc.,

Finance

Finance Reporting Cost-Benefit Forecasting
