2024, Data Lake and Data Transformation

Introducing simplified interaction with the Airflow REST API in Amazon MWAA

AWS Big Data

OCTOBER 23, 2024

▶ Pre task execution logs [2024-10-21, 16:56:12 UTC] {subprocess.py:63} INFO - Tmp dir root location: /tmp [2024-10-21, 16:56:12 UTC] {subprocess.py:75}

Interactive Testing Data-driven Data Lake

Enriching metadata for accurate text-to-SQL generation for Amazon Athena

AWS Big Data

OCTOBER 14, 2024

Enterprise data is brought into data lakes and data warehouses to carry out analytical, reporting, and data science use cases using AWS analytical services like Amazon Athena, Amazon Redshift, Amazon EMR, and so on. Then, invoke the model.

Metadata Data Lake Modeling Data Warehouse

Amazon Q data integration adds DataFrame support and in-prompt context-aware job creation

AWS Big Data

DECEMBER 20, 2024

Amazon Q data integration, introduced in January 2024, allows you to use natural language to author extract, transform, load (ETL) jobs and operations in the AWS Glue-specific data abstraction DynamicFrame. In this post, we discuss how Amazon Q data integration transforms ETL workflow development.

Data Integration Visualization Data Processing Big Data

Amazon Redshift data ingestion options

AWS Big Data

SEPTEMBER 5, 2024

Amazon Redshift, a warehousing service, offers a variety of options for ingesting data from diverse sources into its high-performance, scalable environment. If storing operational data in a data warehouse is a requirement, synchronization of tables between operational data stores and Amazon Redshift tables is supported.

IoT Data Warehouse Cost-Benefit Reporting

Optimize data layout by bucketing with Amazon Athena and AWS Glue to accelerate downstream queries

AWS Big Data

APRIL 25, 2024

In the era of data, organizations are increasingly using data lakes to store and analyze vast amounts of structured and unstructured data. Data lakes provide a centralized repository for data from various sources, enabling organizations to unlock valuable insights and drive data-driven decision-making.

Optimization Data Lake Cost-Benefit Reporting

CIO 100 Award winners drive business results with IT

CIO Business Intelligence

AUGUST 7, 2024

The following 10 award-winning projects showcase the impressive power of IT in the enterprise today and the ingenuity of modern CIOs and their teams, serving as representatives for the cohort of 2024 honorees. The end result, completed in early 2024 and now fully operational, is the data center EMR mirrored in cloud infrastructure.

IT Insurance Cost-Benefit Testing

Melting the ice — How Natural Intelligence simplified a data lake migration to Apache Iceberg

AWS Big Data

APRIL 28, 2025

Many organizations turn to data lakes for the flexibility and scale needed to manage large volumes of structured and unstructured data. Recently, Natural Intelligence (NI) embarked on a journey to transition its legacy data lake from Apache Hive to Apache Iceberg. NI's leading brands, Top10.com

Data Lake Metadata Cost-Benefit Snapshot

Stream real-time data into Apache Iceberg tables in Amazon S3 using Amazon Data Firehose

AWS Big Data

NOVEMBER 6, 2024

Second, because traditional data warehousing approaches are unable to keep up with the volume, velocity, and variety of data, engineering teams are building data lakes and adopting open data formats such as Parquet and Apache Iceberg to store their data. For Source, select Direct PUT.

Metadata Data Lake Management Internet of Things

Streamline AWS WAF log analysis with Apache Iceberg and Amazon Data Firehose

AWS Big Data

FEBRUARY 18, 2025

To optimize their security operations, organizations are adopting modern approaches that combine real-time monitoring with scalable data analytics. They are using data lake architectures and Apache Iceberg to efficiently process large volumes of security data while minimizing operational overhead. worker_type G.1X

Snapshot Optimization Data Lake Metadata
