Data Lake and Reference - Data Leaders Brief

Data Lake

Reference

How to Implement Data Engineering in Practice?

Analytics Vidhya

DECEMBER 1, 2021

Image Source: GitHub Table of Contents What is Data Engineering? Components of Data Engineering Object Storage Object Storage MinIO Install Object Storage MinIO Data Lake with Buckets Demo Data Lake Management Conclusion References What is Data Engineering?

How to Implement Data Engineering in Practice?

Unleash deeper insights with Amazon Redshift data sharing for data lake tables

Webinars

Trending Sources

Multicloud data lake analytics with Amazon Athena

Webinars

Recap of Amazon Redshift key product announcements in 2024

Run Apache XTable in AWS Lambda for background conversion of open table formats

Load data incrementally from transactional data lakes to data warehouses

Seamless integration of data lake and data warehouse using Amazon Redshift Spectrum and Amazon DataZone

Query your Iceberg tables in data lake using Amazon Redshift (Preview)

Accelerate Amazon Redshift Data Lake queries with AWS Glue Data Catalog Column Statistics

Enrich your serverless data lake with Amazon Bedrock

Use Apache Iceberg in your data lake with Amazon S3, AWS Glue, and Snowflake

Use Apache Iceberg in a data lake to support incremental data processing

Amazon SageMaker Lakehouse now supports attribute-based access control

Choosing an open table format for your transactional data lake on AWS

Simplify operational data processing in data lakes using AWS Glue and Apache Hudi

Design a data mesh pattern for Amazon EMR-based data lakes using AWS Lake Formation with Hive metastore federation

Migrate Delta tables from Azure Data Lake Storage to Amazon S3 using AWS Glue

Synchronize data lakes with CDC-based UPSERT using open table format, AWS Glue, and Amazon MSK

The next generation of Amazon SageMaker: The center for all your data, analytics, and AI

Automate replication of relational sources into a transactional data lake with Apache Iceberg and AWS Glue

Simplify data integration with AWS Glue and zero-ETL to Amazon SageMaker Lakehouse

Complexity Drives Costs: A Look Inside BYOD and Azure Data Lakes

Build a real-time GDPR-aligned Apache Iceberg data lake

Implement slowly changing dimensions in a data lake using AWS Glue and Delta

Use open table format libraries on AWS Glue 5.0 for Apache Spark

Improve operational efficiencies of Apache Iceberg tables built on Amazon S3 data lakes

Building end-to-end data lineage for one-time and complex queries using Amazon Athena, Amazon Redshift, Amazon Neptune and dbt

Perform upserts in a data lake using Amazon Athena and Apache Iceberg

Empower your Jira data in a data lake with Amazon AppFlow and AWS Glue

Expand data access through Apache Iceberg using Delta Lake UniForm on AWS

Build a data lake with Apache Flink on Amazon EMR

Understanding Apache Iceberg on AWS with the new technical guide

Implementing a Pharma Data Mesh using DataOps

Introducing native support for Apache Hudi, Delta Lake, and Apache Iceberg on AWS Glue for Apache Spark, Part 1: Getting Started

Implement tag-based access control for your data lake and Amazon Redshift data sharing with AWS Lake Formation

Enriching metadata for accurate text-to-SQL generation for Amazon Athena

Write queries faster with Amazon Q generative SQL for Amazon Redshift

Create an Apache Hudi-based near-real-time transactional data lake using AWS DMS, Amazon Kinesis, AWS Glue streaming ETL, and data visualization using Amazon QuickSight

Streamline AI-driven analytics with governance: Integrating Tableau with Amazon DataZone

Achieve data resilience using Amazon OpenSearch Service disaster recovery with snapshot and restore

Simplify data ingestion from Amazon S3 to Amazon Redshift using auto-copy

Reference guide to build inventory management and forecasting solutions on AWS

Accelerate SQL code migration from Google BigQuery to Amazon Redshift using BladeBridge

Simplify data lake access control for your enterprise users with trusted identity propagation in AWS IAM Identity Center, AWS Lake Formation, and Amazon S3 Access Grants

Stay Connected