This integration enables data teams to efficiently transform and manage data using Athena with dbt Cloud’s robust features, enhancing the overall data workflow experience. You can extract insights from your data without the complexity of managing infrastructure.
With the new stadium on the horizon, the team needed to update existing IT systems and manual business and IT processes to handle the massive volumes of new data that would soon be at their fingertips. “Some of our systems were old. They want that information,” she says.
We also examine how centralized, hybrid and decentralized data architectures support scalable, trustworthy ecosystems. As data-centric AI, automated metadata management and privacy-aware data sharing mature, the opportunity to embed data quality into the enterprise's core has never been more significant.
Where all data – structured, semi-structured, and unstructured – is sourced, unified, and exploited in automated processes, AI tools and by highly skilled, but over-stretched, employees. Legacy data management is holding back manufacturing transformation. Until now, however, this vision has remained out of reach.
Their terminal operations rely heavily on seamless data flows and the management of vast volumes of data. With the addition of these technologies alongside existing systems like terminal operating systems (TOS) and SAP, the number of data producers has grown substantially.
In this post, we show you how to establish the data ingestion pipeline between Google Analytics 4, Google Sheets, and an Amazon Redshift Serverless workgroup. With Amazon AppFlow, you can run data flows at nearly any scale and at the frequency you choose: on a schedule, in response to a business event, or on demand.
Within seconds of transactional data being written into Amazon Aurora (a fully managed modern relational database service offering performance and high availability at scale), the data is seamlessly made available in Amazon Redshift for analytics and machine learning.
While enabling organization-wide efficiency, the team also applied these principles to the data architecture, making sure that CLEA itself operates frugally. After evaluating various tools, we built a serverless data transformation pipeline using Amazon Athena and dbt.
Amazon OpenSearch Ingestion is a fully managed serverless pipeline that allows you to ingest, filter, transform, enrich, and route data to an Amazon OpenSearch Service domain or Amazon OpenSearch Serverless collection. He is deeply passionate about data architecture and helps customers build analytics solutions at scale on AWS.
Diagram 1: Overall architecture of the solution, using AWS Step Functions, Amazon Redshift, and Amazon S3. The following AWS services were used to shape our new ETL architecture: Amazon Redshift, a fully managed, petabyte-scale data warehouse service in the cloud that includes the ability to run Python scripts.
It also helps him democratize credit union data so it can be used to improve customer service, automate the maintenance of such data by making various types of data easier to find, and provide chains of custody and audit controls to help meet regulatory needs.
It does this by helping teams handle the T in ETL (extract, transform, and load) processes. It allows users to write data transformation code, run it, and test the output, all within the framework it provides. This separation further simplifies data management and enhances the system’s overall performance.
With complex data architectures and systems within so many organizations, tracking data in motion and data at rest is daunting to say the least. Harvesting the data through automation removes ambiguity and speeds up time to market. The result: improved customer and employee satisfaction.
In working with thousands of customers deploying Spark applications, we saw significant challenges with managing Spark as well as automating, delivering, and optimizing secure data pipelines. We wanted to develop a service tailored to the data engineering practitioner built on top of a true enterprise hybrid data service platform.
These tools empower analysts and data scientists to easily collaborate on the same data, with their choice of tools and analytic engines. No more lock-in, unnecessary data transformations, or data movement across tools and clouds just to extract insights out of the data.
Given the importance of sharing information among diverse disciplines in the era of digital transformation, this concept is arguably as important as ever. The aim is to normalize and aggregate data that originates in various pockets of the enterprise, and eventually make it available to analysts across the organization.
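The normalize-then-aggregate step can be made concrete with a small sketch. The two source shapes and all field names below are invented for illustration; real pipelines would pull from actual systems of record.

```python
# Hedged sketch: records from two hypothetical source systems (a CRM and
# a billing system) are normalized into one shape, then aggregated so
# analysts see a single revenue-per-customer view.
from collections import defaultdict

crm_rows = [{"cust": "A", "rev": "120.5"}, {"cust": "B", "rev": "80"}]
billing_rows = [{"customer_id": "A", "amount_cents": 5000}]

def normalize(row):
    if "cust" in row:  # CRM shape: string revenue, short key names
        return {"customer": row["cust"], "revenue": float(row["rev"])}
    # Billing shape: integer cents, long key names
    return {"customer": row["customer_id"], "revenue": row["amount_cents"] / 100}

totals = defaultdict(float)
for row in crm_rows + billing_rows:
    rec = normalize(row)
    totals[rec["customer"]] += rec["revenue"]

print(dict(totals))  # → {'A': 170.5, 'B': 80.0}
```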
Large-scale data warehouse migration to the cloud is a complex and challenging endeavor that many organizations undertake to modernize their data infrastructure, enhance data management capabilities, and unlock new business opportunities.
Overview of the BMW Cloud Data Hub At the BMW Group, Cloud Data Hub (CDH) is the central platform for managing company-wide data and data solutions. The difference lies in when and where data transformation takes place. In ETL, data is transformed before it’s loaded into the data warehouse.
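The ordering difference between ETL and ELT can be shown in a few lines. This is an illustrative sketch only: "load" here is appending to an in-memory list standing in for a warehouse table, and the point is solely where the transform runs.

```python
# ETL vs ELT: same transform, different position relative to the load.

def transform(row):
    return {**row, "name": row["name"].strip().title()}

def etl(rows, warehouse):
    # ETL: transform BEFORE loading into the warehouse.
    warehouse.extend(transform(r) for r in rows)

def elt(rows, warehouse):
    # ELT: load raw rows first, transform later inside the warehouse.
    warehouse.extend(rows)
    warehouse[:] = [transform(r) for r in warehouse]

source = [{"name": "  ada lovelace "}]
w1, w2 = [], []
etl(source, w1)
elt(source, w2)
assert w1 == w2 == [{"name": "Ada Lovelace"}]
```

Both paths end in the same state; in practice the trade-off is where compute happens (an external engine for ETL, the warehouse itself for ELT) and whether raw data is retained.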
The Amazon Redshift integration for Apache Spark, combined with AWS Glue or Amazon EMR, performs transformations before loading data into Amazon Redshift. Finally, data can be loaded into Amazon Redshift with popular ETL tools like Informatica, Matillion, and dbt Labs.
To speed up the self-service analytics and foster innovation based on data, a solution was needed to provide ways to allow any team to create data products on their own in a decentralized manner. To create and manage the data products, smava uses Amazon Redshift , a cloud data warehouse.
Data mesh is a new approach to data management. Companies across industries are using a data mesh to decentralize data management to improve data agility and get value from data. A modern data architecture is critical in order to become a data-driven organization.
Data transforms businesses. That’s where the data lifecycle comes into play. Managing data and its flow, from the edge to the cloud, is one of the most important tasks in the process of gaining data intelligence. The company needed a modern data architecture to manage the growing traffic effectively.
Uncomfortable truth incoming: Most people in your organization don’t think about the quality of their data from intake to production of insights. However, as a data team member, you know how important data integrity (and a whole host of other aspects of data management) is.
Amazon Redshift is a fully managed data warehousing service that offers both provisioned and serverless options, making it more efficient to run and scale analytics without having to manage your data warehouse. These upstream data sources constitute the data producer components.
With data becoming the driving force behind many industries today, having a modern data architecture is pivotal for organizations to be successful. Moreover, running advanced analytics and ML on disparate data sources proved challenging. To overcome these issues, Orca decided to build a data lake.
In this post, we dive deep into the tool, walking through all steps from log ingestion, transformation, visualization, and architecture design to calculate TCO. The tool provides a YARN log collector that connects to the Hadoop Resource Manager to collect YARN logs. George Zhao is a Senior Data Architect at AWS ProServe.
Amazon Redshift is a popular cloud data warehouse, offering a fully managed cloud-based service that seamlessly integrates with an organization’s Amazon Simple Storage Service (Amazon S3) data lake, real-time streams, machine learning (ML) workflows, transactional workflows, and much more—all while providing up to 7.9x
In our last blog, we delved into the seven most prevalent data challenges that can be addressed with effective data governance. Today we will share our approach to developing a data governance program to drive data transformation and fuel a data-driven culture. Don’t try to do everything at once!
Managing large-scale data warehouse systems has long been administratively burdensome and costly, and has led to analytic silos. The good news is that Snowflake, the cloud data platform, lowers costs and administrative overhead. What value does this joint solution provide?
This promotes data autonomy and enables decision-making about data domains without centralized gatekeepers. It also breaks down the code and data monolith and distributes it across the domain teams, which results in better management and scalability. However, data mesh is not about introducing new technologies.
Where they have, I have normally found the people holding these roles to be better informed about data matters than their peers. The same goes in general for Marketing Managers. It may well be that one thing that a CDO needs to get going is a data transformation programme. It may be to introduce or expand Data Governance.
This was, without question, a significant departure from traditional analytic environments, which often meant vendor lock-in and the inability to work with data at scale. Another unexpected challenge was the introduction of Spark as a processing framework for big data. What can you do next?
The data mesh framework In the dynamic landscape of data management, the search for agility, scalability, and efficiency has led organizations to explore new, innovative approaches. One such innovation gaining traction is the data mesh framework. This empowers individual teams to own and manage their data.
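The ownership idea above can be sketched as a toy model: each domain team publishes and serves its own data products rather than routing everything through a central team. The domain and product names below are hypothetical.

```python
# Toy sketch of data mesh ownership: domain teams own their data
# products; consumers address the owning domain, not a central gatekeeper.

class DomainTeam:
    def __init__(self, name):
        self.name = name
        self._products = {}  # data products owned and served by this team

    def publish(self, product, rows):
        self._products[product] = rows

    def serve(self, product):
        return self._products[product]

mesh = {t.name: t for t in (DomainTeam("payments"), DomainTeam("marketing"))}
mesh["payments"].publish("settled_txns", [{"txn": 1, "eur": 9.99}])

# A consumer asks the owning domain directly for its product:
print(mesh["payments"].serve("settled_txns"))  # → [{'txn': 1, 'eur': 9.99}]
```

In a real mesh the "serve" step would be a governed, discoverable interface (catalog entry, API, or shared table) rather than a method call, but the ownership boundary is the same.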
In 2024, business intelligence (BI) software has undergone significant advancements, revolutionizing data management and decision-making processes. In essence, the core capabilities of the best BI tools revolve around four essential functions: data integration, data transformation, data visualization, and reporting.
They also don’t have features for enterprise data management such as schema language, data validation capabilities, interoperable serialization formats, or a proper modeling language. RDF is used extensively for data publishing and data interchange and is based on W3C and other industry standards.
We could give many answers, but they all centre on the same root cause: most data leaders focus on flashy technology and symptomatic fixes instead of approaching data transformation in a way that addresses the root causes of data problems and leads to tangible results and business success. It doesn’t have to be this way.
It enables compute such as EMR instances and storage such as Amazon Simple Storage Service (Amazon S3) data lakes to scale. For various Hadoop jobs, customers have bespoke deployment options of fully managed Amazon EMR, Amazon EMR on Amazon EKS, and EMR Serverless.
You can then apply transformations and store data in Delta format for managing inserts, updates, and deletes. Data ingestion – Steps 1 and 2 use AWS DMS, which connects to the source database and moves full and incremental data (CDC) to Amazon S3 in Parquet format. Let’s refer to this S3 bucket as the raw layer.
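What "managing inserts, updates, and deletes" means when applying change-data-capture records to a target table can be shown with a small sketch. The op codes and record shape below are simplified assumptions, not the exact format AWS DMS writes to Parquet.

```python
# Hedged sketch of applying CDC events to a target table. The table is a
# dict keyed by primary key; each event carries a simplified op code:
# "I" insert, "U" update, "D" delete.

def apply_cdc(table, events):
    for e in events:
        op, key = e["op"], e["id"]
        if op == "D":
            table.pop(key, None)  # delete the row if present
        else:                     # insert or update: upsert the row
            table[key] = {k: v for k, v in e.items() if k != "op"}
    return table

table = {1: {"id": 1, "qty": 5}}
events = [
    {"op": "U", "id": 1, "qty": 7},  # update existing row
    {"op": "I", "id": 2, "qty": 3},  # insert new row
    {"op": "D", "id": 2},            # then delete it again
]
print(apply_cdc(table, events))  # → {1: {'id': 1, 'qty': 7}}
```

Delta format performs essentially this merge at scale, with transactionality, so the raw-layer Parquet files from step 2 can be folded into a queryable table.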
Data lakehouse: A mostly new platform. They defined it as: “A data lakehouse is a new, open data management architecture that combines the flexibility, cost-efficiency, and scale of data lakes with the data management and ACID transactions of data warehouses, enabling business intelligence (BI) and machine learning (ML) on all data.”
The company also used the opportunity to reimagine its data pipeline and architecture. A key architectural decision that Showpad took during this time was to create a portable data layer by decoupling the data transformation from visualization, ML, or ad hoc querying tools and centralizing its business logic.
Learn in 12 minutes: What makes a strong use case for data virtualisation How to come up with a solid Proof of Concept How to prepare your organisation for data virtualisation You’ll have read all about data virtualisation and you’ve.
Amazon OpenSearch Ingestion is a fully managed serverless pipeline that allows you to ingest, filter, transform, enrich, and route data to an Amazon OpenSearch Service domain or Amazon OpenSearch Serverless collection. With this new feature, you can build pipelines in minutes without writing complex configurations manually.
Its AI/ML-driven predictive analysis enhanced proactive threat hunting and phishing investigations as well as automated case management for swift threat identification. However, managing API implementations across multiple applications, custom solutions, and diverse development teams presents challenges.
Now, Delta managers can get a full understanding of their data for compliance purposes. Additionally, with write-back capabilities, they can clear discrepancies and input data. This also includes IT departments that develop and manage applications used by internal stakeholders and partners.