Data Transformation, Data Warehouse and Events

From data lakes to insights: dbt adapter for Amazon Athena now supported in dbt Cloud

AWS Big Data

NOVEMBER 22, 2024

The need for streamlined data transformations As organizations increasingly adopt cloud-based data lakes and warehouses, the demand for efficient data transformation tools has grown. Using Athena and the dbt adapter, you can transform raw data in Amazon S3 into well-structured tables suitable for analytics.

Data Lake

Data Lake Data Warehouse Cost-Benefit Data Transformation

SAP Datasphere Powers Business at the Speed of Data

Rocket-Powered Data Science

MARCH 20, 2023

In fact, by putting a single label like AI on all the steps of a data-driven business process, we have effectively not only blurred the process, but we have also blurred the particular characteristics that make each step separately distinct, uniquely critical, and ultimately dependent on specialized, specific technologies at each step.

Data Warehouse

Data Warehouse Metadata Digital Transformation Machine Learning

Unlocking near real-time analytics with petabytes of transaction data using Amazon Aurora Zero-ETL integration with Amazon Redshift and dbt Cloud

AWS Big Data

NOVEMBER 27, 2024

While customers can perform some basic analysis within their operational or transactional databases, many still need to build custom data pipelines that use batch or streaming jobs to extract, transform, and load (ETL) data into their data warehouse for more comprehensive analysis. or a later version) database.

Data Warehouse

Data Warehouse Analytics Testing Sales

Webinars

Data Talks, CFOs Listen: Why Analytics Are Key To Better Spend Management

Mastering Apache Airflow® 3.0: What’s New (and What’s Next) for Data Orchestration

MORE WEBINARS

Amazon Q data integration adds DataFrame support and in-prompt context-aware job creation

AWS Big Data

DECEMBER 20, 2024

Your generated jobs can use a variety of data transformations, including filters, projections, unions, joins, and aggregations, giving you the flexibility to handle complex data processing requirements. Next, the merged data is filtered to include only a specific geographic region.

Data Integration

Data Integration Visualization Data Processing Big Data

Accelerate your data workflows with Amazon Redshift Data API persistent sessions

AWS Big Data

NOVEMBER 22, 2024

Amazon Redshift is a fast, scalable, secure, and fully managed cloud data warehouse that you can use to analyze your data at scale. Building event-driven applications with Amazon EventBridge and Lambda. Scheduling SQL scripts to simplify data load, unload, and refresh of materialized views.

Data Warehouse

Data Warehouse Recreation/Entertainment Cost-Benefit Data-driven

Database vs. Data Warehouse: What’s the Difference?

Jet Global

MAY 28, 2019

Whether the reporting is being done by an end user, a data science team, or an AI algorithm, the future of your business depends on your ability to use data to drive better quality for your customers at a lower cost. So, when it comes to collecting, storing, and analyzing data, what is the right choice for your enterprise?

Data Warehouse

Data Warehouse Reporting Business Intelligence Sales

Ingest data from Google Analytics 4 and Google Sheets to Amazon Redshift using Amazon AppFlow

AWS Big Data

JANUARY 6, 2025

Amazon AppFlow is a fully managed integration service that you can use to securely transfer data from software as a service (SaaS) applications, such as Google BigQuery, Salesforce, SAP, HubSpot, and ServiceNow, to Amazon Web Services (AWS) services such as Amazon Simple Storage Service (Amazon S3) and Amazon Redshift, in just a few clicks.

Analytics

Analytics Data Warehouse Big Data Metrics

How Open Universities Australia modernized their data platform and significantly reduced their ETL costs with AWS Cloud Development Kit and AWS Step Functions

AWS Big Data

JANUARY 30, 2025

Diagram 1: Overall architecture of the solution, using AWS Step Functions, Amazon Redshift and Amazon S3 The following AWS services were used to shape our new ETL architecture: Amazon Redshift A fully managed, petabyte-scale data warehouse service in the cloud. The following Diagram 2 shows this workflow.

Data Warehouse

Data Warehouse Data Architecture Machine Learning Data Transformation

How EUROGATE established a data mesh architecture using Amazon DataZone

AWS Big Data

JANUARY 15, 2025

AWS Database Migration Service (AWS DMS) is used to securely transfer the relevant data to a central Amazon Redshift cluster. The data in the central data warehouse in Amazon Redshift is then processed for analytical needs and the metadata is shared to the consumers through Amazon DataZone.

IoT

IoT Machine Learning Metadata Data-driven

Birst automates the creation of data warehouses in Snowflake

Birst BI

FEBRUARY 25, 2020

Managing large-scale data warehouse systems has been known to be very administrative, costly, and lead to analytic silos. The good news is that Snowflake, the cloud data platform, lowers costs and administrative overhead. When did you begin a technology partnership with Snowflake and why?

Data Warehouse

Data Warehouse Cost-Benefit Data Architecture Enterprise

Amazon Redshift data ingestion options

AWS Big Data

SEPTEMBER 5, 2024

Federated queries allow querying data across Amazon RDS for MySQL and PostgreSQL data sources without the need for extract, transform, and load (ETL) pipelines. If storing operational data in a data warehouse is a requirement, synchronization of tables between operational data stores and Amazon Redshift tables is supported.

IoT

IoT Data Warehouse Cost-Benefit Reporting

7 key Microsoft Azure analytics services (plus one extra)

CIO Business Intelligence

JUNE 29, 2022

The recent announcement of the Microsoft Intelligent Data Platform makes that more obvious, though analytics is only one part of that new brand. Azure Data Factory. Azure Data Lake Analytics. Data warehouses are designed for questions you already know you want to ask about your data, again and again.

Data Lake

Data Lake Analytics Data Warehouse Machine Learning

Ensuring Data Transformation Quality with dbt Core

Wayne Yaddow

MARCH 14, 2025

How dbt Core aids data teams test, validate, and monitor complex data transformations and conversions Photo by NASA on Unsplash Introduction dbt Core, an open-source framework for developing, testing, and documenting SQL-based data transformations, has become a must-have tool for modern data teams as the complexity of data pipelines grows.

Data Transformation

Data Transformation Testing Unstructured Data Data Quality

Biggest Trends in Data Visualization Taking Shape in 2022

Smart Data Collective

OCTOBER 13, 2021

There are countless examples of big data transforming many different industries. There is no disputing the fact that the collection and analysis of massive amounts of unstructured data has been a huge breakthrough. How does Data Virtualization complement Data Warehousing and SOA Architectures?

Visualization

Visualization Cost-Benefit Big Data Prescriptive Analytics

Use AWS Glue to streamline SFTP data processing

AWS Big Data

AUGUST 13, 2024

You will load the event data from the SFTP site, join it to the venue data stored on Amazon S3, apply transformations, and store the data in Amazon S3. The event and venue files are from the TICKIT dataset. For Node parents , select Rename Venue data and Rename Event data.

Data Processing

Data Processing Visualization Data Lake Data Processing

The Best Data Management Tools For Small Businesses

Smart Data Collective

APRIL 29, 2020

The extraction of raw data, transforming to a suitable format for business needs, and loading into a data warehouse. Data transformation. This process helps to transform raw data into clean data that can be analysed and aggregated. Data analytics and visualisation.

Management

Management Data Warehouse Digital Transformation Dashboards

Stored procedure enhancements in Amazon Redshift

AWS Big Data

SEPTEMBER 6, 2023

Amazon Redshift is a fully managed, petabyte-scale data warehouse service in the cloud. With Amazon Redshift, you can analyze all your data to derive holistic insights about your business and your customers. You can also schedule stored procedures to automate data processing on Amazon Redshift. Satesh Sonti is a Sr.

Data Warehouse

Data Warehouse Insurance Statistics Software

Create a modern data platform using the Data Build Tool (dbt) in the AWS Cloud

AWS Big Data

NOVEMBER 9, 2023

In this post, we delve into a case study for a retail use case, exploring how the Data Build Tool (dbt) was used effectively within an AWS environment to build a high-performing, efficient, and modern data platform. It does this by helping teams handle the T in ETL (extract, transform, and load) processes.

Data Warehouse

Data Warehouse Testing Data Quality Reporting

How ANZ Institutional Division built a federated data platform to enable their domain teams to build data products to support business outcomes

AWS Big Data

DECEMBER 4, 2024

These nodes can implement analytical platforms like data lake houses, data warehouses, or data marts, all united by producing data products. Education and enablement – Conduct learning interventions to upskill teams on understanding and using the data as a product approach.

Metadata

Metadata Data Governance Data Quality Data-driven

How GamesKraft uses Amazon Redshift data sharing to support growing analytics workloads

AWS Big Data

NOVEMBER 13, 2023

Amazon Redshift is a fully managed data warehousing service that offers both provisioned and serverless options, making it more efficient to run and scale analytics without having to manage your data warehouse. These upstream data sources constitute the data producer components.

Data Warehouse

Data Warehouse Analytics Data Lake Data Science

How Chime Financial uses AWS to build a serverless stream analytics platform and defeat fraudsters

AWS Big Data

SEPTEMBER 19, 2023

Problem statement In order to keep up with the rapid movement of fraudsters, our decision platform must continuously monitor user events and respond in real-time. However, our legacy data warehouse-based solution was not equipped for this challenge. Amazon DynamoDB is another data source for our Streaming 2.0

Analytics

Analytics Risk Big Data Machine Learning

How smava makes loans transparent and affordable using Amazon Redshift Serverless

AWS Big Data

DECEMBER 21, 2023

To speed up the self-service analytics and foster innovation based on data, a solution was needed to provide ways to allow any team to create data products on their own in a decentralized manner. To create and manage the data products, smava uses Amazon Redshift , a cloud data warehouse.

Data Lake

Data Lake Data Warehouse Data-driven B2B

Simplify Metrics on Apache Druid With Rill Data and Cloudera

Cloudera

JULY 21, 2022

Cloudera users can securely connect Rill to a source of event stream data, such as Cloudera DataFlow , model data into Rill’s cloud-based Druid service, and share live operational dashboards within minutes via Rill’s interactive metrics dashboard or any connected BI solution. Cloudera Data Warehouse). Apache Hive.

Metrics

Metrics Slice and Dice Data Warehouse Dashboards

Simplify data transfer: Google BigQuery to Amazon S3 using Amazon AppFlow

AWS Big Data

OCTOBER 5, 2023

Amazon AppFlow , a fully managed data integration service, has been at the forefront of streamlining data transfer between AWS services, software as a service (SaaS) applications, and now Google BigQuery. Architecture Let’s review the architecture to transfer data from Google BigQuery to Amazon S3 using Amazon AppFlow.

Data Warehouse

Data Warehouse Machine Learning Data Integration Data-driven

What is DataOps? Collaborative, cross-functional analytics

CIO Business Intelligence

DECEMBER 22, 2022

DataOps in practice To make the most of DataOps, enterprises must evolve their data management strategies to deal with data at scale and in response to real-world events as they happen, according to Dunning and Friedman. They also note DataOps fits well with microservices architectures.

Analytics

Analytics Machine Learning Data mining Software

The Ultimate Guide to Modern Data Quality Management (DQM) For An Effective Data Quality Control Driven by The Right Metrics

datapine

SEPTEMBER 29, 2022

With quality data at their disposal, organizations can form data warehouses for the purposes of examining trends and establishing future-facing strategies. Industry-wide, the positive ROI on quality data is well understood. Here, it all comes down to the data transformation error rate.

Data Quality

Data Quality Metrics Data-driven Management

Data’s dark secret: Why poor quality cripples AI and growth

CIO Business Intelligence

APRIL 8, 2025

As data volumes and use cases scale especially with AI and real-time analytics trust must be an architectural principle, not an afterthought. Comparison of modern data architectures : Architecture Definition Strengths Weaknesses Best used when Data warehouse Centralized, structured and curated data repository.

Data Quality

Data Quality Data-driven Key Performance Indicator Metadata

Build Hybrid Data Pipelines and Enable Universal Connectivity With CDF-PC Inbound Connections

Cloudera

JUNE 17, 2022

In the second blog of the Universal Data Distribution blog series , we explored how Cloudera DataFlow for the Public Cloud (CDF-PC) can help you implement use cases like data lakehouse and data warehouse ingest, cybersecurity, and log optimization, as well as IoT and streaming data collection.

Cost-Benefit

Cost-Benefit IoT Data Warehouse Manufacturing

Automate alerting and reporting for AWS Glue job resource usage

AWS Big Data

MAY 25, 2023

Data transformation plays a pivotal role in providing the necessary data insights for businesses in any organization, small and large. To gain these insights, customers often perform ETL (extract, transform, and load) jobs from their source systems and output an enriched dataset.

Reporting

Reporting Metrics Optimization Data Lake

How Infomedia built a serverless data pipeline with change data capture using AWS Glue and Apache Hudi

AWS Big Data

MARCH 15, 2023

Performance and scalability of both the data pipeline and API endpoint were key success criteria. The data pipeline needed to have sufficient performance to allow for fast turnaround in the event that data issues needed to be corrected.

Cost-Benefit

Cost-Benefit Data Processing Optimization Data-driven

Perform upserts in a data lake using Amazon Athena and Apache Iceberg

AWS Big Data

APRIL 27, 2023

With these features, you can now build data pipelines completely in standard SQL that are serverless, more simple to build, and able to operate at scale. Typically, data transformation processes are used to perform this operation, and a final consistent view is stored in an S3 bucket or folder.

Data Lake

Data Lake Snapshot Optimization Data Transformation

Empower your Jira data in a data lake with Amazon AppFlow and AWS Glue

AWS Big Data

AUGUST 1, 2023

You can do this by updating the CloudFormation stack with a flag that includes the CDC and data transformation steps. This will enable both the CDC steps and the data transformation steps for the Jira data. The DataBrew job performs data transformation and filtering tasks. Choose Update.

Data Lake

Data Lake Data Transformation Data-driven Cost-Benefit

How to use foundation models and trusted governance to manage AI workflow risk

IBM Big Data Hub

OCTOBER 16, 2023

Curated foundation models, such as those created by IBM or Microsoft, help enterprises scale and accelerate the use and impact of the most advanced AI capabilities using trusted data. In addition to natural language, models are trained on various modalities, such as code, time-series, tabular, geospatial and IT events data.

Risk

Risk Modeling Management Metadata

Building Better Data Models to Unlock Next-Level Intelligence

Sisense

MAY 11, 2021

When it comes to data modeling, function determines form. Let’s say you want to subject a dataset to some form of anomaly detection; your model might take the form of a singular event stream that can be read by an anomaly detection service. This design philosophy was adapted from our friends at Fishtown Analytics.).

Modeling

Modeling Big Data IoT Data Warehouse

How Tricentis unlocks insights across the software development lifecycle at speed and scale using Amazon Redshift

AWS Big Data

MARCH 3, 2023

Additionally, the scale is significant because the multi-tenant data sources provide a continuous stream of testing activity, and our users require quick data refreshes as well as historical context for up to a decade due to compliance and regulatory demands. Finally, data integrity is of paramount importance.

Software

Software Data Lake Testing Cost-Benefit

How Alation’s Data Team Uses the Modern Data Stack to Power Insights

Alation

OCTOBER 27, 2022

Few actors in the modern data stack have inspired the enthusiasm and fervent support as dbt. This data transformation tool enables data analysts and engineers to transform, test and document data in the cloud data warehouse. But what does this mean from a practitioner perspective?

Metrics

Metrics Dashboards Sales Reporting

Migrate your existing SQL-based ETL workload to an AWS serverless ETL infrastructure using AWS Glue

AWS Big Data

JULY 31, 2023

This solution decouples the ETL and analytics workloads from our transactional data source Amazon Aurora, and uses Amazon Redshift as the data warehouse solution to build a data mart. We use Amazon Redshift as the data warehouse to implement the data mart solution. Under Transforms , choose SQL Query.

Sales

Sales Data Warehouse Visualization Testing

Enforce fine-grained access control on Open Table Formats via Amazon EMR integrated with AWS Lake Formation

AWS Big Data

JANUARY 17, 2024

His expertise lies in assisting customers with migrating intricate enterprise systems and databases to AWS, constructing enterprise data warehousing and data lake platforms. He is deeply passionate about applying ML/DL and big data techniques to solve real-world problems. Aditya Shah is a Software Development Engineer at AWS.

Data Lake

Data Lake Snapshot Big Data Data-driven

Migrate from Amazon Kinesis Data Analytics for SQL Applications to Amazon Kinesis Data Analytics Studio

AWS Big Data

JUNE 29, 2023

Kinesis Data Analytics for Apache Flink In our example, we perform the following actions on the streaming data: Connect to an Amazon Kinesis Data Streams data stream. View the stream data. Transform and enrich the data. Manipulate the data with Python.

Data Analytics

Data Analytics Analytics IoT Data Lake

What is a Data Pipeline?

Jet Global

MAY 9, 2024

The key components of a data pipeline are typically: Data Sources : The origin of the data, such as a relational database , data warehouse, data lake , file, API, or other data store. This can include tasks such as data ingestion, cleansing, filtering, aggregation, or standardization.

Data Lake

Data Lake Data Warehouse Business Intelligence Machine Learning

What is Data Mapping?

Jet Global

FEBRUARY 23, 2024

This field guide to data mapping will explore how data mapping connects volumes of data for enhanced decision-making. Why Data Mapping is Important Data mapping is a critical element of any data management initiative, such as data integration, data migration, data transformation, data warehousing, or automation.

Data Warehouse

Data Warehouse Reporting Data Transformation Visualization

Beyond the Dashboard: Transforming Data Into Business Impact Using Qlik + Vizlib

Jet Global

MAY 21, 2025

Unlocking the full potential of your data is about more than just visualizing it. True data transformation comes from applying insights to make impactful business decisions. I understand that I can withdraw my consent at any time. Privacy Policy.

Dashboards

Dashboards Visualization Reporting Interactive

Discover Efficient Data Extraction Through Replication With Angles Enterprise for Oracle

Jet Global

NOVEMBER 7, 2023

The answer depends on your specific business needs and the nature of the data you are working with. Both methods have advantages and disadvantages: Replication involves periodically copying data from a source system to a data warehouse or reporting database. Empower your team to add new data sources on the fly.

Enterprise

Enterprise Data Warehouse Operational Reporting Reporting

Data Prep for AI: Get Your Oracle House in Order

Jet Global

MAY 6, 2024

These tools excel at data integration, consolidating information from various financial systems (ERP, CRM, legacy) into a central hub. This eliminates data fragmentation, a major obstacle for AI. Additionally, they provide robust data transformation capabilities.

Finance

Finance Reporting Data Transformation Data-driven

From data lakes to insights: dbt adapter for Amazon Athena now supported in dbt Cloud

SAP Datasphere Powers Business at the Speed of Data

Webinars

Trending Sources

Unlocking near real-time analytics with petabytes of transaction data using Amazon Aurora Zero-ETL integration with Amazon Redshift and dbt Cloud

Webinars

Amazon Q data integration adds DataFrame support and in-prompt context-aware job creation

Accelerate your data workflows with Amazon Redshift Data API persistent sessions

Database vs. Data Warehouse: What’s the Difference?

Ingest data from Google Analytics 4 and Google Sheets to Amazon Redshift using Amazon AppFlow

How Open Universities Australia modernized their data platform and significantly reduced their ETL costs with AWS Cloud Development Kit and AWS Step Functions

How EUROGATE established a data mesh architecture using Amazon DataZone

Birst automates the creation of data warehouses in Snowflake

Amazon Redshift data ingestion options

7 key Microsoft Azure analytics services (plus one extra)

Ensuring Data Transformation Quality with dbt Core

Biggest Trends in Data Visualization Taking Shape in 2022

Use AWS Glue to streamline SFTP data processing

The Best Data Management Tools For Small Businesses

Stored procedure enhancements in Amazon Redshift

Create a modern data platform using the Data Build Tool (dbt) in the AWS Cloud

How ANZ Institutional Division built a federated data platform to enable their domain teams to build data products to support business outcomes

How GamesKraft uses Amazon Redshift data sharing to support growing analytics workloads

How Chime Financial uses AWS to build a serverless stream analytics platform and defeat fraudsters

How smava makes loans transparent and affordable using Amazon Redshift Serverless

Simplify Metrics on Apache Druid With Rill Data and Cloudera

Simplify data transfer: Google BigQuery to Amazon S3 using Amazon AppFlow

What is DataOps? Collaborative, cross-functional analytics

The Ultimate Guide to Modern Data Quality Management (DQM) For An Effective Data Quality Control Driven by The Right Metrics

Data’s dark secret: Why poor quality cripples AI and growth

Build Hybrid Data Pipelines and Enable Universal Connectivity With CDF-PC Inbound Connections

Automate alerting and reporting for AWS Glue job resource usage

How Infomedia built a serverless data pipeline with change data capture using AWS Glue and Apache Hudi

Perform upserts in a data lake using Amazon Athena and Apache Iceberg

Empower your Jira data in a data lake with Amazon AppFlow and AWS Glue

How to use foundation models and trusted governance to manage AI workflow risk

Building Better Data Models to Unlock Next-Level Intelligence

How Tricentis unlocks insights across the software development lifecycle at speed and scale using Amazon Redshift

How Alation’s Data Team Uses the Modern Data Stack to Power Insights

Migrate your existing SQL-based ETL workload to an AWS serverless ETL infrastructure using AWS Glue

Enforce fine-grained access control on Open Table Formats via Amazon EMR integrated with AWS Lake Formation

Migrate from Amazon Kinesis Data Analytics for SQL Applications to Amazon Kinesis Data Analytics Studio

What is a Data Pipeline?

What is Data Mapping?

Beyond the Dashboard: Transforming Data Into Business Impact Using Qlik + Vizlib

Discover Efficient Data Extraction Through Replication With Angles Enterprise for Oracle

Data Prep for AI: Get Your Oracle House in Order

Stay Connected