
MongoDB Enhances Developer Data Platform

David Menninger's Analyst Perspectives

These include architectural optimizations to reduce memory usage and query times, with more efficient batch processing to deliver better throughput, faster bulk writes and accelerated concurrent writes during data replication. The release also extends MongoDB's Queryable Encryption capability, which was introduced in 2023.

Data Lake 130

Accelerate Amazon Redshift Data Lake queries with AWS Glue Data Catalog Column Statistics

AWS Big Data

Amazon Redshift enables you to efficiently query and retrieve structured and semi-structured data from open format files in an Amazon S3 data lake without having to load the data into Amazon Redshift tables. Amazon Redshift extends SQL capabilities to your data lake, enabling you to run analytical queries.

Data Lake 116

AWS Lake Formation 2023 year in review

AWS Big Data

AWS Lake Formation and the AWS Glue Data Catalog form an integral part of a data governance solution for data lakes built on Amazon Simple Storage Service (Amazon S3), with multiple AWS analytics services integrating with them. In 2023, we released several updates to AWS Glue crawlers.

Data Lake 119

Use Apache Iceberg in a data lake to support incremental data processing

AWS Big Data

Iceberg has become very popular for its support for ACID transactions in data lakes and features like schema and partition evolution, time travel, and rollback. AWS Glue 3.0 and later supports the Apache Iceberg framework for data lakes. The following diagram illustrates the solution architecture.

Data Lake 137

Build a high-performance quant research platform with Apache Iceberg

AWS Big Data

Iceberg offers distinct advantages through its metadata layer over Parquet, such as improved data management, performance optimization, and integration with various query engines. Unlike direct Amazon S3 access, Iceberg supports these operations on petabyte-scale data lakes without requiring complex custom code.

Metadata 111

Build a serverless transactional data lake with Apache Iceberg, Amazon EMR Serverless, and Amazon Athena

AWS Big Data

Since the deluge of big data over a decade ago, many organizations have learned to build applications to process and analyze petabytes of data. Data lakes have served as a central repository to store structured and unstructured data at any scale and in various formats.

Data Lake 122

Amazon Redshift announcements at AWS re:Invent 2023 to enable analytics on all your data

AWS Big Data

These announcements drive forward the AWS Zero-ETL vision to unify all your data, enabling you to better maximize the value of your data with comprehensive analytics and ML capabilities, and innovate faster with secure data collaboration within and across organizations.