Data Analytics, Data Transformation and Events

From data lakes to insights: dbt adapter for Amazon Athena now supported in dbt Cloud

AWS Big Data

NOVEMBER 22, 2024

At AWS, we are committed to empowering organizations with tools that streamline data analytics and transformation processes. This integration enables data teams to efficiently transform and manage data using Athena with dbt Cloud’s robust features, enhancing the overall data workflow experience.

Data Lake

Data Lake Data Warehouse Cost-Benefit Data Transformation

SQL Streambuilder Data Transformations

Cloudera

FEBRUARY 21, 2023

SQL Stream Builder (SSB) is a versatile platform for data analytics using SQL as a part of Cloudera Streaming Analytics, built on top of Apache Flink. It enables users to easily write, run, and manage real-time continuous SQL queries on stream data and a smooth user experience. What is a data transformation?

Data Transformation

Data Transformation Data Processing Data Collection Publishing

Amazon Q data integration adds DataFrame support and in-prompt context-aware job creation

AWS Big Data

DECEMBER 20, 2024

Your generated jobs can use a variety of data transformations, including filters, projections, unions, joins, and aggregations, giving you the flexibility to handle complex data processing requirements. In this post, we discuss how Amazon Q data integration transforms ETL workflow development.

Data Integration

Data Integration Visualization Data Processing Big Data

Webinars

Data Talks, CFOs Listen: Why Analytics Are Key To Better Spend Management

Mastering Apache Airflow® 3.0: What’s New (and What’s Next) for Data Orchestration

MORE WEBINARS

Unlocking near real-time analytics with petabytes of transaction data using Amazon Aurora Zero-ETL integration with Amazon Redshift and dbt Cloud

AWS Big Data

NOVEMBER 27, 2024

Together with price-performance, Amazon Redshift offers capabilities such as serverless architecture, machine learning integration within your data warehouse and secure data sharing across the organization. dbt Cloud is a hosted service that helps data teams productionize dbt deployments. or a later version) database.

Data Warehouse

Data Warehouse Analytics Testing Sales

Migrate from Amazon Kinesis Data Analytics for SQL Applications to Amazon Kinesis Data Analytics Studio

AWS Big Data

JUNE 29, 2023

Amazon Kinesis Data Analytics makes it easy to transform and analyze streaming data in real time. In this post, we discuss why AWS recommends moving from Kinesis Data Analytics for SQL Applications to Amazon Kinesis Data Analytics for Apache Flink to take advantage of Apache Flink’s advanced streaming capabilities.

Data Analytics

Data Analytics Analytics IoT Data Lake

How EUROGATE established a data mesh architecture using Amazon DataZone

AWS Big Data

JANUARY 15, 2025

In addition to real-time analytics and visualization, the data needs to be shared for long-term data analytics and machine learning applications. This approach supports both the immediate needs of visualization tools such as Tableau and the long-term demands of digital twin and IoT data analytics.

IoT

IoT Machine Learning Metadata Data-driven

Accelerate your data workflows with Amazon Redshift Data API persistent sessions

AWS Big Data

NOVEMBER 22, 2024

In this post, we’ll walk through an example ETL process that uses session reuse to efficiently create, populate, and query temporary staging tables across the full data transformation workflow—all within the same persistent Amazon Redshift database session. Building event-driven applications with Amazon EventBridge and Lambda.

Data Warehouse

Data Warehouse Recreation/Entertainment Cost-Benefit Data-driven

Biggest Trends in Data Visualization Taking Shape in 2022

Smart Data Collective

OCTOBER 13, 2021

There are countless examples of big data transforming many different industries. There is no disputing the fact that the collection and analysis of massive amounts of unstructured data has been a huge breakthrough. If we talk about Big Data, data visualization is crucial to more successfully drive high-level decision making.

Visualization

Visualization Cost-Benefit Big Data Prescriptive Analytics

Modernize a legacy real-time analytics application with Amazon Managed Service for Apache Flink

AWS Big Data

OCTOBER 11, 2023

The lift and shift migration approach is limited in its ability to transform businesses because it relies on outdated, legacy technologies and architectures that limit flexibility and slow down productivity. This may require frequent truncation in certain tables to retain only the latest stream of events.

Management

Management Metadata Analytics Dashboards

Use AWS Glue to streamline SFTP data processing

AWS Big Data

AUGUST 13, 2024

In this blog post, we explore how to use the SFTP Connector for AWS Glue from the AWS Marketplace to efficiently process data from Secure File Transfer Protocol (SFTP) servers into Amazon Simple Storage Service (Amazon S3) , further empowering your data analytics and insights. The following diagram illustrates this architecture.

Data Processing

Data Processing Visualization Data Lake Data Processing

Gain insights from historical location data using Amazon Location Service and AWS analytics services

AWS Big Data

MARCH 13, 2024

Data analytics – Business analysts gather operational insights from multiple data sources, including the location data collected from the vehicles. Using EventBridge integration, filtered positional updates are published to an EventBridge event bus. The ingestion approach is not in scope of this post.

Analytics

Analytics IoT Metadata Internet of Things

How ANZ Institutional Division built a federated data platform to enable their domain teams to build data products to support business outcomes

AWS Big Data

DECEMBER 4, 2024

The key to scaling the Institutional Division operating model are the following considerations: Data as a product approach – Use techniques like event storming and domain-driven design to capture business events and their meanings.

Metadata

Metadata Data Governance Data Quality Data-driven

Ingest data from Google Analytics 4 and Google Sheets to Amazon Redshift using Amazon AppFlow

AWS Big Data

JANUARY 6, 2025

Amazon AppFlow is a fully managed integration service that you can use to securely transfer data from software as a service (SaaS) applications, such as Google BigQuery, Salesforce, SAP, HubSpot, and ServiceNow, to Amazon Web Services (AWS) services such as Amazon Simple Storage Service (Amazon S3) and Amazon Redshift, in just a few clicks.

Analytics

Analytics Data Warehouse Big Data Metrics

The Best Data Management Tools For Small Businesses

Smart Data Collective

APRIL 29, 2020

The extraction of raw data, transforming to a suitable format for business needs, and loading into a data warehouse. Data transformation. This process helps to transform raw data into clean data that can be analysed and aggregated. Data analytics and visualisation.

Management

Management Data Warehouse Digital Transformation Dashboards

Deliver decompressed Amazon CloudWatch Logs to Amazon S3 and Splunk using Amazon Data Firehose

AWS Big Data

APRIL 2, 2024

You can use Amazon Data Firehose to aggregate and deliver log events from your applications and services captured in Amazon CloudWatch Logs to your Amazon Simple Storage Service (Amazon S3) bucket and Splunk destinations, for use cases such as data analytics, security analysis, application troubleshooting etc.

Metadata

Metadata Marketing Analytics Data Transformation

What is business analytics? Using data to improve business outcomes

CIO Business Intelligence

JULY 5, 2022

Business analytics can help you improve operational efficiency, better understand your customers, project future outcomes, glean insights to aid in decision-making, measure performance, drive growth, discover hidden trends, generate leads, and scale your business in the right direction, according to digital skills training company Simplilearn.

Business Analytics

Business Analytics Prescriptive Analytics Data mining Diagnostic Analytics

Reference guide to build inventory management and forecasting solutions on AWS

AWS Big Data

APRIL 11, 2023

Data ingestion and storage Retail businesses have event-driven data that requires action from downstream processes. It’s critical for an inventory management application to handle the data ingestion and storage for changing demands. The volume and velocity of data can change in the retail industry each day.

Forecasting

Forecasting Management IoT Data-driven

Harnessing Streaming Data: Insights at the Speed of Life

Sisense

OCTOBER 15, 2020

We live in a world of data: There’s more of it than ever before, in a ceaselessly expanding array of forms and locations. Dealing with Data is your window into the ways data teams are tackling the challenges of this new world to help their companies and their customers thrive. billion market by 2025.

Dashboards

Dashboards IoT Optimization Internet of Things

What is DataOps? Collaborative, cross-functional analytics

CIO Business Intelligence

DECEMBER 22, 2022

Research firm Gartner further describes the methodology as one focused on “improving the communication, integration, and automation of data flows between data managers and data consumers across an organization.” The approach values continuous delivery of analytic insights with the primary goal of satisfying the customer.

Analytics

Analytics Machine Learning Data mining Data Quality

Set up alerts and orchestrate data quality rules with AWS Glue Data Quality

AWS Big Data

JUNE 6, 2023

Furthermore, it allows for necessary actions to be taken, such as rectifying errors in the data source, refining data transformation processes, and updating data quality rules. An EventBridge rule receives an event notification from the AWS Glue Data Quality evaluations including the results.

Data Quality

Data Quality Metrics Data-driven Visualization

Monitor data pipelines in a serverless data lake

AWS Big Data

AUGUST 9, 2023

The advent of rapid adoption of serverless data lake architectures—with ever-growing datasets that need to be ingested from a variety of sources, followed by complex data transformation and machine learning (ML) pipelines—can present a challenge. These event changes are also routed to the same SNS topic.

Data Lake

Data Lake Metrics Cost-Benefit Testing

7 key Microsoft Azure analytics services (plus one extra)

CIO Business Intelligence

JUNE 29, 2022

And as businesses contend with increasingly large amounts of data, the cloud is fast becoming the logical place where analytics work gets done. For many enterprises, Microsoft Azure has become a central hub for analytics. Azure Data Factory. Azure Data Explorer. Azure Synapse Analytics. Azure Databricks.

Data Lake

Data Lake Analytics Data Warehouse Machine Learning

An AI Chat Bot Wrote This Blog Post …

DataKitchen

DECEMBER 9, 2022

ChatGPT> DataOps, or data operations, is a set of practices and technologies that organizations use to improve the speed, quality, and reliability of their data analytics processes. Overall, DataOps is an essential component of modern data-driven organizations. Query> DataOps. Query> Write an essay on DataOps.

Machine Learning

Machine Learning Data-driven Optimization Data Analytics

How healthcare organizations can analyze and create insights using price transparency data

AWS Big Data

OCTOBER 11, 2023

The data in the machine-readable files can provide valuable insights to understand the true cost of healthcare services and compare prices and quality across hospitals. The availability of machine-readable files opens up new possibilities for data analytics, allowing organizations to analyze large amounts of pricing data.

Visualization

Visualization Dashboards Data-driven Gap analysis

Create a modern data platform using the Data Build Tool (dbt) in the AWS Cloud

AWS Big Data

NOVEMBER 9, 2023

It does this by helping teams handle the T in ETL (extract, transform, and load) processes. It allows users to write data transformation code, run it, and test the output, all within the framework it provides. Data pipeline dbt, an open-source tool, can be installed in the AWS environment and set up to work with Amazon MWAA.

Data Warehouse

Data Warehouse Testing Data Quality Reporting

Simplify Metrics on Apache Druid With Rill Data and Cloudera

Cloudera

JULY 21, 2022

Cloudera users can securely connect Rill to a source of event stream data, such as Cloudera DataFlow , model data into Rill’s cloud-based Druid service, and share live operational dashboards within minutes via Rill’s interactive metrics dashboard or any connected BI solution. Cloudera Data Warehouse). Apache Hive.

Metrics

Metrics Slice and Dice Data Warehouse Dashboards

How GamesKraft uses Amazon Redshift data sharing to support growing analytics workloads

AWS Big Data

NOVEMBER 13, 2023

The upstream data pipeline is a robust system that integrates various data sources, including Amazon Kinesis and Amazon Managed Streaming for Apache Kafka (Amazon MSK) for handling clickstream events, Amazon Relational Database Service (Amazon RDS) for delta transactions, and Amazon DynamoDB for delta game-related information.

Data Warehouse

Data Warehouse Analytics Data Lake Data Science

Building Better Data Models to Unlock Next-Level Intelligence

Sisense

MAY 11, 2021

You can’t talk about data analytics without talking about data modeling. These two functions are nearly inseparable as we move further into a world of analytics that blends sources of varying volume, variety, veracity, and velocity. When it comes to data modeling, function determines form.

Modeling

Modeling Big Data IoT Data Warehouse

Empower your Jira data in a data lake with Amazon AppFlow and AWS Glue

AWS Big Data

AUGUST 1, 2023

You can do this by updating the CloudFormation stack with a flag that includes the CDC and data transformation steps. This will enable both the CDC steps and the data transformation steps for the Jira data. The DataBrew job performs data transformation and filtering tasks. Choose Update.

Data Lake

Data Lake Data Transformation Data-driven Cost-Benefit

How smava makes loans transparent and affordable using Amazon Redshift Serverless

AWS Big Data

DECEMBER 21, 2023

The data products from the Business Vault and Data Mart stages are now available for consumers. smava decided to use Tableau for business intelligence, data visualization, and further analytics. The data transformations are managed with dbt to simplify the workflow governance and team collaboration.

Data Lake

Data Lake Data Warehouse Data-driven B2B

How CFM built a well-governed and scalable data-engineering platform using Amazon EMR for financial features generation

AWS Big Data

SEPTEMBER 13, 2024

Another advantage was improved security, particularly the ability to contain the potential impact area in the event of credential leaks or account compromises. Joel has led data transformation projects on fraud analytics, claims automation, and Master Data Management.

Interactive

Interactive Strategy Cost-Benefit Data Governance

Use Snowflake with Amazon MWAA to orchestrate data pipelines

AWS Big Data

OCTOBER 31, 2023

Customers rely on data from different sources such as mobile applications, clickstream events from websites, historical data, and more to deduce meaningful patterns to optimize their products, services, and processes. With over 20 years of experience in storage and data analytics, he also holds a PhD from Stanford University.

Data Processing

Data Processing Management Publishing Visualization

How Alation’s Data Team Uses the Modern Data Stack to Power Insights

Alation

OCTOBER 27, 2022

Few actors in the modern data stack have inspired the enthusiasm and fervent support as dbt. This data transformation tool enables data analysts and engineers to transform, test and document data in the cloud data warehouse. Bindu Chandramohan, Lead, Data Analytics, Alation : Thanks, Jason!

Metrics

Metrics Dashboards Sales Reporting

How Tricentis unlocks insights across the software development lifecycle at speed and scale using Amazon Redshift

AWS Big Data

MARCH 3, 2023

Additionally, the scale is significant because the multi-tenant data sources provide a continuous stream of testing activity, and our users require quick data refreshes as well as historical context for up to a decade due to compliance and regulatory demands. Finally, data integrity is of paramount importance.

Software

Software Data Lake Testing Cost-Benefit

Migrate your existing SQL-based ETL workload to an AWS serverless ETL infrastructure using AWS Glue

AWS Big Data

JULY 31, 2023

You can examine various events from the stack creation process on the Events tab. This concludes creating data sources on the AWS Glue job canvas. Next, we add transformations by combining data from these different tables. Under Transforms , choose SQL Query. Choose Submit.

Sales

Sales Data Warehouse Visualization Testing

What is a Data Pipeline?

Jet Global

MAY 9, 2024

A data pipeline is a series of processes that move raw data from one or more sources to one or more destinations, often transforming and processing the data along the way. This can include tasks such as data ingestion, cleansing, filtering, aggregation, or standardization.

Data Lake

Data Lake Data Warehouse Business Intelligence Machine Learning

Unlocking Trino’s Full Potential With Simba Drivers for BI & ETL

Jet Global

OCTOBER 1, 2024

Trino allows users to run ad hoc queries across massive datasets, making real-time decision-making a reality without needing extensive data transformations. This is particularly valuable for teams that require instant answers from their data. Data Lake Analytics: Trino doesn’t just stop at databases.

Dashboards

Dashboards Data Lake Reporting Cost-Benefit

Enhancing Your BI Experience With Apache Iceberg

Jet Global

JULY 16, 2024

By providing a consistent and stable backend, Apache Iceberg ensures that data remains immutable and query performance is optimized, thus enabling businesses to trust and rely on their BI tools for critical insights. It provides a stable schema, supports complex data transformations, and ensures atomic operations.

Dashboards

Dashboards Data-driven Reporting Business Intelligence

What Is Embedded Analytics?

Jet Global

MAY 1, 2023

Third-party data might include industry benchmarks, data feeds (such as weather and social media), and/or anonymized customer data. Four Approaches to Data Analytics The world of data analytics is constantly and quickly changing. Data Transformation and Enrichment Data can be enriched for analysis.

Analytics

Analytics Cost-Benefit Visualization Dashboards

3 Ways Logi Symphony Leverages AI for Actionable Insights

Jet Global

APRIL 24, 2024

On top of the advanced embedded analytics capabilities it provides, Logi Symphony provides users advanced analytics with the help of their unique knowledge and datasets. Unlike competitors who lock you into their pre-built AI solutions, Logi AI empowers you with the freedom to choose. Privacy Policy.

Business Intelligence

Business Intelligence Dashboards Data-driven Reporting

Stream real-time data into Apache Iceberg tables in Amazon S3 using Amazon Data Firehose

AWS Big Data

NOVEMBER 6, 2024

This example uses Direct PUT as the source, but the same steps can be applied for other Firehose sources such as Kinesis Data Streams, and Amazon MSK. For the Firehose stream name , enter firehose-iceberg-events-1. Create a Firehose stream: Go to the Amazon Data Firehose console. Choose Create Firehose stream.

Metadata

Metadata Data Lake Management Internet of Things

Streamline AWS WAF log analysis with Apache Iceberg and Amazon Data Firehose

AWS Big Data

FEBRUARY 18, 2025

To optimize their security operations, organizations are adopting modern approaches that combine real-time monitoring with scalable data analytics. They are using data lake architectures and Apache Iceberg to efficiently process large volumes of security data while minimizing operational overhead.

Snapshot

Snapshot Optimization Data Lake Metadata

Building and operating data pipelines at scale using CI/CD, Amazon MWAA and Apache Spark on Amazon EMR by Wipro

AWS Big Data

FEBRUARY 25, 2025

If data mapping has been enabled within the data processing job, then the structured data is prepared based on the given schema. This output is passed to next phase where data transformations and business validations can be applied. After this step, data is loaded to specified target.

Data Processing

Data Processing Machine Learning Data-driven Cost-Benefit

Ingest telemetry messages in near real time with Amazon API Gateway, Amazon Data Firehose, and Amazon Location Service

AWS Big Data

NOVEMBER 14, 2024

Key services in the solution include Amazon API Gateway , Amazon Data Firehose , and Amazon Location Service. The challenge In the event of a disaster e.g. water flood, there is usually a lack of terrestrial data connectivity that prevents monitoring stations from taking actionable measures in real time.

Data Lake

Data Lake Metadata Testing Data-driven

From data lakes to insights: dbt adapter for Amazon Athena now supported in dbt Cloud

SQL Streambuilder Data Transformations

Webinars

Trending Sources

Amazon Q data integration adds DataFrame support and in-prompt context-aware job creation

Webinars

Unlocking near real-time analytics with petabytes of transaction data using Amazon Aurora Zero-ETL integration with Amazon Redshift and dbt Cloud

Migrate from Amazon Kinesis Data Analytics for SQL Applications to Amazon Kinesis Data Analytics Studio

How EUROGATE established a data mesh architecture using Amazon DataZone

Accelerate your data workflows with Amazon Redshift Data API persistent sessions

Biggest Trends in Data Visualization Taking Shape in 2022

Modernize a legacy real-time analytics application with Amazon Managed Service for Apache Flink

Use AWS Glue to streamline SFTP data processing

Gain insights from historical location data using Amazon Location Service and AWS analytics services

How ANZ Institutional Division built a federated data platform to enable their domain teams to build data products to support business outcomes

Ingest data from Google Analytics 4 and Google Sheets to Amazon Redshift using Amazon AppFlow

The Best Data Management Tools For Small Businesses

Deliver decompressed Amazon CloudWatch Logs to Amazon S3 and Splunk using Amazon Data Firehose

What is business analytics? Using data to improve business outcomes

Reference guide to build inventory management and forecasting solutions on AWS

Harnessing Streaming Data: Insights at the Speed of Life

What is DataOps? Collaborative, cross-functional analytics

Set up alerts and orchestrate data quality rules with AWS Glue Data Quality

Monitor data pipelines in a serverless data lake

7 key Microsoft Azure analytics services (plus one extra)

An AI Chat Bot Wrote This Blog Post …

How healthcare organizations can analyze and create insights using price transparency data

Create a modern data platform using the Data Build Tool (dbt) in the AWS Cloud

Simplify Metrics on Apache Druid With Rill Data and Cloudera

How GamesKraft uses Amazon Redshift data sharing to support growing analytics workloads

Building Better Data Models to Unlock Next-Level Intelligence

Empower your Jira data in a data lake with Amazon AppFlow and AWS Glue

How smava makes loans transparent and affordable using Amazon Redshift Serverless

How CFM built a well-governed and scalable data-engineering platform using Amazon EMR for financial features generation

Use Snowflake with Amazon MWAA to orchestrate data pipelines

How Alation’s Data Team Uses the Modern Data Stack to Power Insights

How Tricentis unlocks insights across the software development lifecycle at speed and scale using Amazon Redshift

Migrate your existing SQL-based ETL workload to an AWS serverless ETL infrastructure using AWS Glue

What is a Data Pipeline?

Unlocking Trino’s Full Potential With Simba Drivers for BI & ETL

Enhancing Your BI Experience With Apache Iceberg

What Is Embedded Analytics?

3 Ways Logi Symphony Leverages AI for Actionable Insights

Stream real-time data into Apache Iceberg tables in Amazon S3 using Amazon Data Firehose

Streamline AWS WAF log analysis with Apache Iceberg and Amazon Data Firehose

Building and operating data pipelines at scale using CI/CD, Amazon MWAA and Apache Spark on Amazon EMR by Wipro

Ingest telemetry messages in near real time with Amazon API Gateway, Amazon Data Firehose, and Amazon Location Service

Stay Connected