A data lake is a centralized repository that you can use to store all your structured and unstructured data at any scale. You can store your data as-is, without having to first structure it, and then run different types of analytics for better business insights.
While there is a lot of discussion about the merits of data warehouses, not enough of it centers on data lakes. We have talked about enterprise data warehouses in the past, so let’s contrast them with data lakes. Both data warehouses and data lakes are used to store big data.
Many organizations operate data lakes spanning multiple cloud data stores. In these cases, you may want an integrated query layer that seamlessly runs analytical queries across these diverse cloud stores and streamlines your data analytics processes.
Iceberg has become very popular for its support for ACID transactions in data lakes and for features like schema and partition evolution, time travel, and rollback. AWS Glue 3.0 and later supports the Apache Iceberg framework for data lakes.
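To make those features concrete, here is a minimal sketch of time travel and snapshot rollback from PySpark, assuming a Spark session with the Iceberg SQL extensions enabled; the catalog name glue_catalog, the table db.orders, the timestamp, and the snapshot ID are hypothetical placeholders.

    # Query the table as it existed at an earlier point in time.
    spark.sql("""
        SELECT * FROM glue_catalog.db.orders
        TIMESTAMP AS OF '2024-01-01 00:00:00'
    """).show()

    # Roll the table back to a known-good snapshot by ID.
    spark.sql("CALL glue_catalog.system.rollback_to_snapshot('db.orders', 123456789)")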
In this series of blog posts, we have been exploring the problem of augmenting a table using information contained in a data lake (a large collection of datasets).
Data lakes have been gaining popularity for storing vast amounts of data from diverse sources in a scalable and cost-effective way. As the number of data consumers grows, data lake administrators often need to implement fine-grained access controls for different user profiles.
Over the years, organizations have invested in creating purpose-built, cloud-based data lakes that are siloed from one another. A major challenge is enabling cross-organization discovery of and access to data across these multiple data lakes, each built on a different technology stack.
In the current industry landscape, data lakes have become a cornerstone of modern data architecture, serving as repositories for vast amounts of structured and unstructured data. Maintaining data consistency and integrity across distributed data lakes is crucial for decision-making and analytics.
Amazon Redshift enables you to directly access data stored in Amazon Simple Storage Service (Amazon S3) using SQL queries and to join data across your data warehouse and data lake. With Amazon Redshift, you can query the data in your S3 data lake from your Redshift data warehouse using a central AWS Glue metastore.
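A minimal sketch of that pattern in Python, assuming a cluster with Redshift Spectrum and a Glue database already in place; the host, credentials, IAM role, and table names are all hypothetical placeholders.

    import redshift_connector

    conn = redshift_connector.connect(
        host="my-cluster.abc123.us-east-1.redshift.amazonaws.com",
        database="dev",
        user="awsuser",
        password="...",  # placeholder credential
    )
    cur = conn.cursor()

    # Map a Glue database into Redshift as an external schema (run once).
    cur.execute("""
        CREATE EXTERNAL SCHEMA IF NOT EXISTS lake
        FROM DATA CATALOG DATABASE 'sales_lake'
        IAM_ROLE 'arn:aws:iam::111122223333:role/RedshiftSpectrumRole'
    """)

    # Join a local warehouse table with a data lake table in one query.
    cur.execute("""
        SELECT c.customer_id, SUM(o.amount)
        FROM public.customers c
        JOIN lake.orders o ON o.customer_id = c.customer_id
        GROUP BY c.customer_id
    """)
    print(cur.fetchall())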
In this blog post, we dive into different data aspects and how Cloudinary addresses the two concerns of vendor lock-in and cost-efficient data analytics by using Apache Iceberg, Amazon Simple Storage Service (Amazon S3), Amazon Athena, Amazon EMR, and AWS Glue.
Observability in DataOps refers to the ability to monitor and understand the performance and behavior of data-related systems and processes, and to use that information to improve the quality and speed of data-driven decision-making.
A modern data architecture is an evolutionary architecture pattern designed to integrate a data lake, a data warehouse, and purpose-built stores under a unified governance model. The company wanted the ability to continue processing operational data in the secondary Region in the rare event of a primary Region failure.
When you build your transactional data lake using Apache Iceberg to solve your functional use cases, you also need to focus on operational use cases for your S3 data lake to optimize the availability of the production environment. Note the configuration parameters s3.write.tags.write-tag-name and s3.delete.tags.delete-tag-name.
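A minimal sketch of setting those S3FileIO tag properties on an Iceberg catalog in PySpark; the catalog name, warehouse path, and tag values are hypothetical placeholders, and s3.delete-enabled is shown only as a common companion setting.

    from pyspark.sql import SparkSession

    spark = (
        SparkSession.builder
        .config("spark.sql.catalog.glue_catalog", "org.apache.iceberg.spark.SparkCatalog")
        .config("spark.sql.catalog.glue_catalog.warehouse", "s3://my-bucket/warehouse/")
        .config("spark.sql.catalog.glue_catalog.io-impl", "org.apache.iceberg.aws.s3.S3FileIO")
        # Tag objects that Iceberg writes...
        .config("spark.sql.catalog.glue_catalog.s3.write.tags.write-tag-name", "created")
        # ...and objects it would delete, so S3 lifecycle rules can expire
        # them later instead of Iceberg removing them immediately.
        .config("spark.sql.catalog.glue_catalog.s3.delete.tags.delete-tag-name", "deleted")
        .config("spark.sql.catalog.glue_catalog.s3.delete-enabled", "false")
        .getOrCreate()
    )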
With data becoming the driving force behind many industries today, having a modern data architecture is pivotal for organizations to be successful. In this post, we describe Orca’s journey building a transactional data lake using Amazon Simple Storage Service (Amazon S3), Apache Iceberg, and AWS analytics services.
Cloud computing has made it much easier to integrate data sets, but that’s only the beginning. Creating a data lake has become much easier, but that’s only ten percent of the job of delivering analytics to users. It often takes months to progress from a data lake to the final delivery of insights.
First generation – expensive, proprietary enterprise data warehouse and business intelligence platforms maintained by a specialized team drowning in technical debt. Second generation – gigantic, complex data lake maintained by a specialized team drowning in technical debt.
Data-driven organizations treat data as an asset and use it across different lines of business (LOBs) to drive timely insights and better business decisions. This leads to data spread across many data warehouse and data lake instances in separate AWS accounts, organized under a modern data architecture.
As organizations across the globe modernize their data platforms with data lakes on Amazon Simple Storage Service (Amazon S3), handling slowly changing dimensions (SCDs) in data lakes can be challenging.
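A minimal sketch of one common approach, a Type 2 SCD merge on an Iceberg table in Spark SQL, assuming an existing SparkSession spark and a staged_updates view; all table and column names are hypothetical placeholders.

    # Pass 1: close out current rows whose attributes changed, and insert
    # rows for keys never seen before.
    spark.sql("""
        MERGE INTO glue_catalog.db.dim_customer t
        USING staged_updates s
        ON t.customer_id = s.customer_id AND t.is_current = true
        WHEN MATCHED AND t.address <> s.address THEN
          UPDATE SET is_current = false, end_date = current_date()
        WHEN NOT MATCHED THEN
          INSERT (customer_id, address, is_current, start_date, end_date)
          VALUES (s.customer_id, s.address, true, current_date(), NULL)
    """)
    # Pass 2 (not shown): insert the new version row for each changed key,
    # since a single MERGE cannot both close and re-insert the same key.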
The complexity and cost of SIEM solutions and the number of resources that security consumes can easily swallow a large portion of an enterprise’s budget, causing many organizations to fall behind in the security data race. Security data lakes can reduce organizations’ reliance on SIEM solutions.
This book is not available until January 2022, but considering all the hype around the data mesh, we expect it to be a best seller. In the book, author Zhamak Dehghani reveals that, despite the time, money, and effort poured into them, data warehouses and data lakes fail when applied at the scale and speed of today’s organizations.
Building a data lake on Amazon Simple Storage Service (Amazon S3) provides numerous benefits for an organization. However, many use cases, like performing change data capture (CDC) from an upstream relational database to an Amazon S3-based data lake, require handling data at a record level.
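A minimal sketch of record-level CDC apply with Spark SQL MERGE INTO on an Iceberg table, assuming an existing SparkSession spark and a cdc_events view that carries an op column ('I'/'U'/'D') from the upstream database; all names are hypothetical placeholders.

    spark.sql("""
        MERGE INTO glue_catalog.db.orders t
        USING cdc_events s
        ON t.order_id = s.order_id
        WHEN MATCHED AND s.op = 'D' THEN DELETE
        WHEN MATCHED THEN UPDATE SET *
        WHEN NOT MATCHED AND s.op <> 'D' THEN INSERT *
    """)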
This blog post is co-written with Ori Nakar from Imperva. Imperva harnesses data to improve their business outcomes. Events and many other security data types are stored in Imperva’s Threat Research Multi-Region data lake, which holds a few dozen different datasets at the scale of petabytes.
In today’s world, customers manage vast amounts of data in their Amazon Simple Storage Service (Amazon S3) data lakes, which requires convoluted data pipelines to continuously understand changes in the data layout and make them available to consuming systems.
A Drug Launch Case Study in the Amazing Efficiency of a Data Team Using DataOps: How a Small Team Powered the Multi-Billion-Dollar Acquisition of a Pharma Startup. When launching a groundbreaking pharmaceutical product, the stakes and the rewards couldn’t be higher. It is necessary to have more than a data lake and a database.
Figure 3 shows an example processing architecture with data flowing in from internal and external sources. Each data source is updated on its own schedule, for example daily, weekly, or monthly. The data scientists and analysts have what they need to build analytics for the user. The new Recipes run, and BOOM!
The rise of distributed data architectures like Data Mesh will combine with DataOps automation to give rise to Hub-Spoke architectures that deftly blend the benefits of centralization and decentralization. For example, a Hub-Spoke architecture could integrate data from a multitude of sources into a data lake.
Near-real-time data analytics on operational data is becoming a common need. Due to the exponential growth of data volume, it has become common practice to replace read replicas with data lakes for better scalability and performance. For more information, see Changing the default settings for your data lake.
In the era of big data, data lakes have emerged as a cornerstone for storing vast amounts of raw data in its native format. They support structured, semi-structured, and unstructured data, offering a flexible and scalable environment for data ingestion from multiple sources.
Instead of having a giant, unwieldy data lake, the data mesh breaks up the data and workflow assets into controllable and composable domains with inherent interdependencies. Domains are built from raw data and/or the output of other domains.
In the ever-evolving landscape of data management, two key concepts have emerged as essential components for organizations seeking to harness the power of their data: data marts and data lakes. Understanding the distinctions […]
Grant access to User1 in Lake Formation: Sign in to the Lake Formation console, choose Data lake permissions in the navigation pane, and grant access to the user group on the database oktank_tipblog_temp and table customer.

    # Preview the CSV input, then rewrite it as Parquet.
    spark.read.csv("s3://tip-blog-s3-s3ag/input/*").show()
    spark.read.csv("s3://tip-blog-s3-s3ag/input/*").write.mode("overwrite").parquet("s3://tip-blog-s3-s3ag/input/")
This is a guest blog post co-authored with Atul Khare and Bhupender Panwar from Salesforce. The normalized Parquet logs are stored in an Amazon Simple Storage Service (Amazon S3) data lake and cataloged into Hive Metastore (HMS) on an Amazon Relational Database Service (Amazon RDS) instance based on S3 event notifications.
DataOps automation replaces the non-value-add work performed by the data team and the outside dollars spent on consultants with an automated framework that executes efficiently and at a high level of quality. Focusing on the processes that operate on data enables the team to automate workflows and build a factory that produces insights.
Today, Amazon Redshift is used by customers across all industries for a variety of use cases, including data warehouse migration and modernization, near-real-time analytics, self-service analytics, data lake analytics, machine learning (ML), and data monetization.
“Data mesh” is a new data analytics paradigm proposed by Zhamak Dehghani, one that is designed to move organizations from monolithic architectures such as the data warehouse and the data lake to more decentralized architectures.
Today’s data lakes are expanding across lines of business operating in diverse landscapes and using various engines to process and analyze data. Traditionally, SQL views have been used to define and share filtered data sets that meet the requirements of these lines of business for easier consumption.
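For instance, a minimal sketch of defining such a filtered view with Spark SQL, assuming an existing SparkSession spark and a Glue-backed metastore; the database, view, and column names are hypothetical placeholders.

    spark.sql("""
        CREATE OR REPLACE VIEW db.emea_orders AS
        SELECT order_id, customer_id, amount
        FROM db.orders
        WHERE region = 'EMEA'
    """)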
CDF-PC is a cloud-native universal data distribution service powered by Apache NiFi on Kubernetes, allowing developers to connect to any data source anywhere, with any structure, process it, and deliver it to any destination. This blog aims to answer two questions: What is a universal data distribution service?
For example, DataOps can be used to automate data integration. Previously, the consulting team had been using a patchwork of ETL to consolidate data from disparate sources into a data lake.
The innovations are part of Microsoft’s reimagining of the pillars of Microsoft Fabric with a focus on providing AI-powered tools for data projects and delivering an open and AI-ready data lake.
Apache Hudi is an open table format that brings database and data warehouse capabilities to data lakes. Apache Hudi helps data engineers manage complex challenges, such as managing continuously evolving datasets with transactions while maintaining query performance.
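A minimal sketch of a Hudi upsert from PySpark, assuming an existing DataFrame df of change records; the table name, key fields, and S3 path are hypothetical placeholders.

    hudi_options = {
        "hoodie.table.name": "orders",
        # Record key and precombine field drive upsert deduplication.
        "hoodie.datasource.write.recordkey.field": "order_id",
        "hoodie.datasource.write.precombine.field": "updated_at",
        "hoodie.datasource.write.operation": "upsert",
    }
    (df.write.format("hudi")
        .options(**hudi_options)
        .mode("append")
        .save("s3://my-bucket/hudi/orders/"))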
With Amazon EMR 6.15, we launched AWS Lake Formation based fine-grained access controls (FGAC) on open table formats (OTFs), including Apache Hudi, Apache Iceberg, and Delta Lake. Many large enterprise companies seek to use their transactional data lake to gain insights and improve decision-making.
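A minimal sketch of granting such fine-grained (column-level) permissions with boto3’s Lake Formation client; the principal ARN, database, table, and column names are hypothetical placeholders.

    import boto3

    lf = boto3.client("lakeformation")
    lf.grant_permissions(
        Principal={"DataLakePrincipalIdentifier": "arn:aws:iam::111122223333:role/AnalystRole"},
        Resource={
            "TableWithColumns": {
                "DatabaseName": "sales_lake",
                "Name": "orders",
                # Only these columns become visible to the principal.
                "ColumnNames": ["order_id", "amount"],
            }
        },
        Permissions=["SELECT"],
    )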
The two things we are most excited about are: First, DataOps is distinct from all data analytics tools. As founders, we sat in a room eight years ago (when all the rage was Hadoop, data prep, and data lakes) and debated: will there ever be an ‘ops’ layer that sits next to all the current data tools?
Cloudera customers run some of the biggest data lakes on earth. These lakes power mission-critical, large-scale data analytics, business intelligence (BI), and machine learning use cases, including enterprise data warehouses.