Remove Analytics Remove Data Transformation Remove Workshop
article thumbnail

Reference guide to build inventory management and forecasting solutions on AWS

AWS Big Data

In this post, we discuss how to streamline inventory management forecasting systems with AWS managed analytics, AI/ML, and database services. Data transformation Data transformation is essential in inventory management and forecasting solutions for both data analysis around sales and inventory, as well as ML for forecasting.

article thumbnail

Enriching metadata for accurate text-to-SQL generation for Amazon Athena

AWS Big Data

Enterprise data is brought into data lakes and data warehouses to carry out analytical, reporting, and data science use cases using AWS analytical services like Amazon Athena , Amazon Redshift , Amazon EMR , and so on. About the author Naidu Rongal i is a Big Data and ML engineer at Amazon.

Metadata 105
Insiders

Sign Up for our Newsletter

This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.

article thumbnail

Orca Security’s journey to a petabyte-scale data lake with Apache Iceberg and AWS Analytics

AWS Big Data

One key component that plays a central role in modern data architectures is the data lake, which allows organizations to store and analyze large amounts of data in a cost-effective manner and run advanced analytics and machine learning (ML) at scale. To overcome these issues, Orca decided to build a data lake.

article thumbnail

Improve observability across Amazon MWAA tasks

AWS Big Data

To run the scripts, refer to the Amazon MWAA analytics workshop. format(S3_BUCKET_NAME), 's3://{}/data/aggregated/green'.format(S3_BUCKET_NAME), To learn more and get hands-on experience, start with the Amazon MWAA analytics workshop and then use the scripts in the GitHub repo to gain more observability of your DAG run.

article thumbnail

Orchestrate Amazon EMR Serverless jobs with AWS Step functions

AWS Big Data

Amazon EMR Serverless provides a serverless runtime environment that simplifies the operation of analytics applications that use the latest open source frameworks, such as Apache Spark and Apache Hive. Karthik Prabhakar is a Senior Big Data Solutions Architect for Amazon EMR at AWS.

Big Data 134
article thumbnail

Perform upserts in a data lake using Amazon Athena and Apache Iceberg

AWS Big Data

Amazon Athena supports the MERGE command on Apache Iceberg tables, which allows you to perform inserts, updates, and deletes in your data lake at scale using familiar SQL statements that are compliant with ACID (Atomic, Consistent, Isolated, Durable). The first task performs an initial copy of the full data into an S3 folder.

article thumbnail

Amazon EMR on EKS widens the performance gap: Run Apache Spark workloads 5.37 times faster and at 4.3 times lower cost

AWS Big Data

To learn more and get started with EMR on EKS, try out the EMR on EKS Workshop and visit the EMR on EKS Best Practices Guide page. About the Authors Melody Yang is a Senior Big Data Solution Architect for Amazon EMR at AWS. Her areas of interests are open-source frameworks and automation, data engineering and DataOps.

Testing 103