Analytics, Data Transformation and Workshop

Reference guide to build inventory management and forecasting solutions on AWS

AWS Big Data

APRIL 11, 2023

In this post, we discuss how to streamline inventory management forecasting systems with AWS managed analytics, AI/ML, and database services. Data transformation is essential in inventory management and forecasting solutions, both for analyzing sales and inventory data and for ML-based forecasting.

Forecasting

Forecasting Management IoT Data-driven

Enriching metadata for accurate text-to-SQL generation for Amazon Athena

AWS Big Data

OCTOBER 14, 2024

Enterprise data is brought into data lakes and data warehouses to carry out analytical, reporting, and data science use cases using AWS analytical services like Amazon Athena, Amazon Redshift, Amazon EMR, and so on. About the author: Naidu Rongali is a Big Data and ML engineer at Amazon.

Metadata

Metadata Data Lake Modeling Data Warehouse

Webinars

Automation, Evolved: Your New Playbook for Smarter Knowledge Work

Data Talks, CFOs Listen: Why Analytics Are Key To Better Spend Management

Mastering Apache Airflow® 3.0: What’s New (and What’s Next) for Data Orchestration

MORE WEBINARS

Trending Sources

Orca Security’s journey to a petabyte-scale data lake with Apache Iceberg and AWS Analytics

AWS Big Data

JULY 20, 2023

One key component that plays a central role in modern data architectures is the data lake, which allows organizations to store and analyze large amounts of data in a cost-effective manner and run advanced analytics and machine learning (ML) at scale. To overcome these issues, Orca decided to build a data lake.

Data Lake

Data Lake Analytics Snapshot Data Quality

Improve observability across Amazon MWAA tasks

AWS Big Data

FEBRUARY 6, 2023

To run the scripts, refer to the Amazon MWAA analytics workshop. ... format(S3_BUCKET_NAME), 's3://{}/data/aggregated/green'.format(S3_BUCKET_NAME), ... To learn more and get hands-on experience, start with the Amazon MWAA analytics workshop and then use the scripts in the GitHub repo to gain more observability of your DAG run.
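The code fragment in this excerpt appears to build S3 paths by formatting a bucket name into a template string. A minimal sketch of that pattern follows; the bucket name and the helper `aggregated_path` are hypothetical and do not come from the original post.

```python
# Hypothetical bucket name for illustration; the real value is not shown in the excerpt.
S3_BUCKET_NAME = "example-bucket"


def aggregated_path(bucket: str, dataset: str) -> str:
    """Return the S3 URI of a dataset stored under data/aggregated/."""
    return "s3://{}/data/aggregated/{}".format(bucket, dataset)


# Same template as in the excerpt, with the dataset name passed as an argument.
green_path = aggregated_path(S3_BUCKET_NAME, "green")
print(green_path)
```

Keeping the template in one helper means a change to the folder layout touches a single line rather than every task argument.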

Management

Management Interactive Publishing Metadata

Orchestrate Amazon EMR Serverless jobs with AWS Step functions

AWS Big Data

OCTOBER 12, 2023

Amazon EMR Serverless provides a serverless runtime environment that simplifies the operation of analytics applications that use the latest open source frameworks, such as Apache Spark and Apache Hive. Karthik Prabhakar is a Senior Big Data Solutions Architect for Amazon EMR at AWS.

Big Data

Big Data Data-driven Management Visualization

Perform upserts in a data lake using Amazon Athena and Apache Iceberg

AWS Big Data

APRIL 27, 2023

Amazon Athena supports the MERGE command on Apache Iceberg tables, which allows you to perform inserts, updates, and deletes in your data lake at scale using familiar SQL statements that are compliant with ACID (Atomic, Consistent, Isolated, Durable). The first task performs an initial copy of the full data into an S3 folder.
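As a rough illustration of the upsert described above, the sketch below assembles a MERGE INTO statement as a string. The table names, column names, and the `op` delete-marker column are hypothetical, and the helper `build_merge_sql` is not part of the original post; it only shows the general shape of such a statement.

```python
from typing import Optional, Sequence


def build_merge_sql(target: str, source: str, key: str,
                    columns: Sequence[str],
                    delete_flag: Optional[str] = None) -> str:
    """Build a MERGE INTO statement that upserts rows from source into target."""
    all_columns = [key, *columns]
    set_clause = ", ".join(f"{c} = s.{c}" for c in columns)
    insert_cols = ", ".join(all_columns)
    insert_vals = ", ".join(f"s.{c}" for c in all_columns)

    lines = [
        f"MERGE INTO {target} t",
        f"USING {source} s",
        f"ON t.{key} = s.{key}",
    ]
    not_matched = "WHEN NOT MATCHED"
    if delete_flag is not None:
        # Assumption: the source marks deleted rows with 'D' in this column.
        lines.append(f"WHEN MATCHED AND s.{delete_flag} = 'D' THEN DELETE")
        # Do not insert rows that are already flagged as deleted.
        not_matched += f" AND s.{delete_flag} <> 'D'"
    lines.append(f"WHEN MATCHED THEN UPDATE SET {set_clause}")
    lines.append(f"{not_matched} THEN INSERT ({insert_cols}) VALUES ({insert_vals})")
    return "\n".join(lines)


# Hypothetical table and column names.
sql = build_merge_sql("sales_iceberg", "sales_updates", "order_id",
                      ["status", "amount"], delete_flag="op")
print(sql)
```

The generated text could then be submitted to Athena through whichever client is already in use. Because identifiers are interpolated directly, this sketch assumes they come from trusted configuration rather than user input.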

Data Lake

Data Lake Snapshot Optimization Data Transformation

Amazon EMR on EKS widens the performance gap: Run Apache Spark workloads 5.37 times faster and at 4.3 times lower cost

AWS Big Data

APRIL 12, 2023

To learn more and get started with EMR on EKS, try out the EMR on EKS Workshop and visit the EMR on EKS Best Practices Guide page. About the Authors: Melody Yang is a Senior Big Data Solution Architect for Amazon EMR at AWS. Her areas of interest are open-source frameworks and automation, data engineering, and DataOps.

Testing

Testing Big Data Metadata Optimization

Use Snowflake with Amazon MWAA to orchestrate data pipelines

AWS Big Data

OCTOBER 31, 2023

The solution provides an end-to-end automated workflow that includes data ingestion, transformation, analytics, and consumption. The data used for transformation and analysis is based on the publicly available New York Citi Bike dataset. Bosco Albuquerque is a Sr.

Data Processing

Data Processing Management Publishing Visualization

The Modern Data Stack Explained: What The Future Holds

Alation

JANUARY 17, 2023

This shift addresses a growing demand for data access, which the modern data stack enables with cloud-based services and integration. There has also been a paradigm shift toward agile analytics and flexible options, where data assets can be moved around more quickly and easily, and not locked into a single vendor.

Data Warehouse

Data Warehouse Cost-Benefit Data Science Data Transformation

Build a data lake with Apache Flink on Amazon EMR

AWS Big Data

JANUARY 27, 2023

The AWS Glue Data Catalog provides a uniform repository where disparate systems can store and find metadata to keep track of data in data silos. Apache Flink is a widely used data processing engine for scalable streaming ETL, analytics, and event-driven applications. Transformed data can be stored in Amazon S3.

Data Lake

Data Lake Metadata Business Analysis Data-driven

Streamline AWS WAF log analysis with Apache Iceberg and Amazon Data Firehose

AWS Big Data

FEBRUARY 18, 2025

To optimize their security operations, organizations are adopting modern approaches that combine real-time monitoring with scalable data analytics. They are using data lake architectures and Apache Iceberg to efficiently process large volumes of security data while minimizing operational overhead.

Snapshot

Snapshot Optimization Data Lake Metadata

Stream real-time data into Apache Iceberg tables in Amazon S3 using Amazon Data Firehose

AWS Big Data

NOVEMBER 6, 2024

Iceberg brings the reliability and simplicity of SQL tables to Amazon Simple Storage Service (Amazon S3) data lakes. In Transform records, select Turn on data transformation. To learn more about using Amazon Data Firehose with Apache Iceberg, see the Firehose Developer Guide or try the Immersion day workshop.

Metadata

Metadata Data Lake Management Internet of Things
