The need for streamlined data transformations
As organizations increasingly adopt cloud-based data lakes and warehouses, the demand for efficient data transformation tools has grown. This enables you to extract insights from your data without the complexity of managing infrastructure.
She decided to bring Resultant in to assist, starting with the firm’s strategic data assessment (SDA) framework, which evaluates a client’s data challenges in terms of people and processes, data models and structures, data architecture and platforms, visual analytics and reporting, and advanced analytics.
We’ve set out to demystify the jargon surrounding data architecture to enable every team to understand how it impacts their objectives. Not sure what Hadoop actually is? A little fuzzy on what the difference is between cloud and on-prem storage?
The data transformation imperative
What Denso and other industry leaders realise is that for IT-OT convergence to be realised, and the benefits of AI unlocked, data transformation is vital. The company can also unify its knowledge base and promote search and information use that better meets its needs.
The DataOps Engineering skillset includes hybrid and cloud platforms, orchestration, data architecture, data integration, data transformation, CI/CD, real-time messaging, and containers. The role of the DataOps Engineer goes by several different titles and is sometimes covered by IT, dev, or analyst functions.
Need for a data mesh architecture
Because entities in the EUROGATE group generate vast amounts of data from various sources (across departments, locations, and technologies), the traditional centralized data architecture struggles to keep up with the demands for real-time insights, agility, and scalability.
Amazon OpenSearch Ingestion is a fully managed serverless pipeline that allows you to ingest, filter, transform, enrich, and route data to an Amazon OpenSearch Service domain or Amazon OpenSearch Serverless collection. He is deeply passionate about Data Architecture and helps customers build analytics solutions at scale on AWS.
Together with price-performance, Amazon Redshift offers capabilities such as serverless architecture, machine learning integration within your data warehouse and secure data sharing across the organization. dbt Cloud is a hosted service that helps data teams productionize dbt deployments.
Pattern 1: Data transformation, load, and unload
Several of our data pipelines included significant data transformation steps, which were primarily performed through SQL statements executed by Amazon Redshift. The following Diagram 2 shows this workflow.
With Amazon AppFlow, you can run data flows at nearly any scale and at the frequency you choose: on a schedule, in response to a business event, or on demand. You can configure data transformation capabilities such as filtering and validation to generate rich, ready-to-use data as part of the flow itself, without additional steps.
With complex data architectures and systems within so many organizations, tracking data in motion and data at rest is daunting to say the least. Harvesting the data through automation removes ambiguity and shortens processing time to market.
These tools empower analysts and data scientists to easily collaborate on the same data, with their choice of tools and analytic engines. No more lock-in, unnecessary data transformations, or data movement across tools and clouds just to extract insights out of the data.
In this post, we delve into a case study for a retail use case, exploring how the Data Build Tool (dbt) was used effectively within an AWS environment to build a high-performing, efficient, and modern data platform. It does this by helping teams handle the T in ETL (extract, transform, and load) processes.
However, you might face significant challenges when planning for a large-scale data warehouse migration. The following diagram illustrates a scalable migration pattern for an extract, transform, and load (ETL) scenario. The success criteria are the key performance indicators (KPIs) for each component of the data workflow.
The difference lies in when and where data transformation takes place. In ETL, data is transformed before it’s loaded into the data warehouse. In ELT, raw data is loaded into the data warehouse first, then it’s transformed directly within the warehouse.
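The ETL/ELT distinction can be sketched in a few lines of plain Python. This is an illustrative sketch only, with an in-memory list standing in for the warehouse; the record shape and `transform` rule are hypothetical, not from any of the articles above.

```python
# ETL vs. ELT: the same transformation, applied at different points.
# A plain list stands in for the warehouse; fields are made up.

def transform(row):
    # Normalize a raw record before analysis.
    return {"user": row["user"].strip().lower(), "amount": round(row["amount"], 2)}

def etl(raw_rows, warehouse):
    # ETL: transform first, then load only the cleaned rows.
    warehouse.extend(transform(r) for r in raw_rows)

def elt(raw_rows, warehouse):
    # ELT: load raw rows as-is; transform later, inside the "warehouse".
    warehouse.extend(raw_rows)
    warehouse[:] = [transform(r) for r in warehouse]

raw = [{"user": "  Alice ", "amount": 10.5}]
w1, w2 = [], []
etl(raw, w1)
elt(raw, w2)
print(w1 == w2)  # both paths end with the same cleaned data
```

The practical difference is not the final result but where the compute happens and whether the raw data is retained, which is why ELT pairs naturally with cheap warehouse storage.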
We are excited to offer in Tech Preview this born-in-the-cloud table format that will help future-proof data architectures at many of our public cloud customers. This enabled new use cases with customers that were using a mix of Spark and Hive to perform data transformations. Modernizing pipelines.
Independent data products often only have value if you can connect them, join them, and correlate them to create a higher order data product that creates additional insights. A modern dataarchitecture is critical in order to become a data-driven organization.
If storing operational data in a data warehouse is a requirement, synchronization of tables between operational data stores and Amazon Redshift tables is supported. In scenarios where data transformation is required, you can use Redshift stored procedures to modify data in Redshift tables.
The former allows us to control the data before it is generated, and the latter allows us to identify if there is an issue with our data that would impact its availability, completeness, or accuracy. Process-driven data integrity: Getting data generation right. Cleaning up data that doesn’t meet data integrity standards.
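The detection side of data integrity described above amounts to running rules against records and flagging violations. A minimal sketch, assuming hypothetical field names and rules (none of these come from the article):

```python
# Detection-side data-integrity checks: field names and rules are
# invented examples of completeness and accuracy constraints.

def check_record(rec):
    issues = []
    if not rec.get("order_id"):
        issues.append("missing order_id")   # completeness check
    if rec.get("quantity", 0) < 0:
        issues.append("negative quantity")  # accuracy check
    return issues

records = [
    {"order_id": "A1", "quantity": 3},
    {"order_id": "", "quantity": -2},
]
report = {r.get("order_id") or "<none>": check_record(r) for r in records}
print(report)
```

Process-driven controls would instead prevent the second record from being produced at all, which is why the two approaches are complementary rather than interchangeable.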
Data transforms businesses. That’s where the data lifecycle comes into play. Managing data and its flow, from the edge to the cloud, is one of the most important tasks in the process of gaining data intelligence. The company needed a modern data architecture to manage the growing traffic effectively.
He mainly works with enterprise customers to help data lake migration and modernization, and provides guidance and technical assistance on big data projects such as Hadoop, Spark, data warehousing, real-time data processing, and large-scale machine learning. George Zhao is a Senior Data Architect at AWS ProServe.
With data becoming the driving force behind many industries today, having a modern data architecture is pivotal for organizations to be successful. This ensures that the data is suitable for training purposes. These robust capabilities ensure that data within the data lake remains accurate, consistent, and reliable.
Key considerations
Gameskraft embraces a modern data architecture, with the data lake residing in Amazon S3. To grant seamless access to the data lake, we use the innovative capabilities of Redshift Spectrum, which is a bridge between the data warehouse (Amazon Redshift) and data lake (Amazon S3).
Data Vault 2.0 allows for the following:
- Agile data warehouse development
- Parallel data ingestion
- A scalable approach to handle multiple data sources, even on the same entity
- A high level of automation
- Historization
- Full lineage support
However, Data Vault 2.0
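The parallel-ingestion property listed above rests on a simple mechanism: Data Vault 2.0 hubs identify each business entity by a deterministic hash of its business key, so independent loaders derive the same key without coordinating. A minimal sketch (the normalization rule and key values are illustrative, not a full Data Vault implementation):

```python
import hashlib

# Data Vault 2.0-style hub hash key: normalize the business key parts,
# join with a delimiter, and hash. Deterministic, so any number of
# sources loading in parallel converge on the same hub row.

def hub_hash_key(*business_key_parts):
    normalized = "||".join(p.strip().upper() for p in business_key_parts)
    return hashlib.md5(normalized.encode("utf-8")).hexdigest()

# Two sources describing the same customer produce the same key.
k1 = hub_hash_key("cust-42 ")
k2 = hub_hash_key("CUST-42")
print(k1 == k2)  # True: same entity, same hub key
```

The choice of hash function and normalization rules must be fixed once per warehouse, since changing either rekeys every hub.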
Much as Einstein may have said that everything should be as simple as possible, but no simpler [3], Data Maturity should be as great as necessary, but no greater; over-engineering has been the downfall of many a Data Transformation Programme.
In our last blog, we delved into the seven most prevalent data challenges that can be addressed with effective data governance. Today we will share our approach to developing a data governance program to drive data transformation and fuel a data-driven culture. Don’t try to do everything at once!
It accelerates data projects with data quality and lineage and contextualizes through ontologies , taxonomies, and vocabularies, making integrations easier. RDF is used extensively for data publishing and data interchange and is based on W3C and other industry standards. LPGs are rudimentary knowledge graphs.
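RDF, mentioned above, models data as subject–predicate–object triples. Real systems use RDF libraries and SPARQL; this stdlib-only sketch with made-up IRIs just shows the shape of the model and why pattern queries over it are natural:

```python
# The RDF data model in miniature: facts as (subject, predicate, object)
# triples. The "ex:" IRIs are invented for illustration.

triples = {
    ("ex:Gothenburg", "rdf:type", "ex:City"),
    ("ex:Gothenburg", "ex:locatedIn", "ex:Sweden"),
    ("ex:Sweden", "rdf:type", "ex:Country"),
}

def match(s=None, p=None, o=None):
    # Basic triple-pattern query: None acts as a wildcard,
    # much like a variable in a SPARQL pattern.
    return [(ts, tp, to) for (ts, tp, to) in triples
            if s in (None, ts) and p in (None, tp) and o in (None, to)]

typed = match(p="rdf:type")  # everything with a declared type
print(len(typed))
```

Because every fact has the same three-part shape, data from different sources can be merged by simply taking the union of their triple sets, which is what makes RDF suited to data interchange.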
It may well be that one thing that a CDO needs to get going is a data transformation programme. This may purely be focused on cultural aspects of how an organisation records, shares and otherwise uses data. It may be to build a new (or a first) Data Architecture. It may be to introduce or expand Data Governance.
We could give many answers, but they all centre on the same root cause: most data leaders focus on flashy technology and symptomatic fixes instead of approaching data transformation in a way that addresses the root causes of data problems and leads to tangible results and business success. It doesn’t have to be this way.
This adds an additional ETL step, making the data even more stale. Data lakehouse was created to solve these problems. The data warehouse storage layer is removed from lakehouse architectures. Instead, continuous datatransformation is performed within the BLOB storage. Data mesh: A mostly new culture.
Data ingestion – Steps 1 and 2 use AWS DMS, which connects to the source database and moves full and incremental data (CDC) to Amazon S3 in Parquet format. Data transformation – Steps 3 and 4 represent an EMR Serverless Spark application (Amazon EMR 6.9). Monjumi Sarma is a Data Lab Solutions Architect at AWS.
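The full-load-plus-CDC pattern fed by AWS DMS reduces to a simple merge: take the initial snapshot, then apply inserts, updates, and deletes in order. A conceptual sketch with a dict standing in for the target table (record shapes and operation codes here are invented for illustration, not the actual DMS record format):

```python
# Full load + change data capture (CDC), conceptually: a snapshot
# followed by an ordered stream of change events. Shapes are invented.

full_load = {1: {"name": "alice"}, 2: {"name": "bob"}}

cdc_events = [
    {"op": "U", "id": 2, "row": {"name": "bobby"}},  # update
    {"op": "I", "id": 3, "row": {"name": "carol"}},  # insert
    {"op": "D", "id": 1, "row": None},               # delete
]

def apply_cdc(table, events):
    for e in events:
        if e["op"] == "D":
            table.pop(e["id"], None)
        else:
            # Inserts and updates both upsert the latest row image.
            table[e["id"]] = e["row"]
    return table

current = apply_cdc(dict(full_load), cdc_events)
print(sorted(current))  # [2, 3]
```

In the actual pipeline this merge happens at query or compaction time over Parquet files in S3, but the ordering requirement is the same: events must be applied in commit order or the final state is wrong.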
Everyone’s talking about data. Data is the key to unlocking insight: the secret sauce that will help you get predictive, the fuel for business intelligence. The transformative potential in AI? It relies on data. The good news is that data has never […].
Learn in 12 minutes:
- What makes a strong use case for data virtualisation
- How to come up with a solid Proof of Concept
- How to prepare your organisation for data virtualisation
You’ll have read all about data virtualisation and you’ve.
BHP is a global resources company headquartered in Melbourne, Australia. It is among the world’s top producers of major commodities, including iron ore, metallurgical coal, and copper, and has substantial interests in oil and gas. BHP has operations and offices.
At Paytronix, which manages customer loyalty, online ordering, and other systems for its customers, director of data science Jesse Marshall wanted to reduce the custom coding of data transformations (the conversion, cleaning, and structuring of data into a form usable for analytics and reports).
Overview of solution As a data-driven company, smava relies on the AWS Cloud to power their analytics use cases. smava ingests data from various external and internal data sources into a landing stage on the data lake based on Amazon Simple Storage Service (Amazon S3).
The company started its New Analytics Era initiative by migrating its data from outdated SQL servers to a modern AWS data lake. It then built a cutting-edge cloud-based analytics platform, designed with an innovative data architecture. It also crafted multiple machine learning and AI models to tackle business challenges.
The data mesh framework In the dynamic landscape of data management, the search for agility, scalability, and efficiency has led organizations to explore new, innovative approaches. One such innovation gaining traction is the data mesh framework. This empowers individual teams to own and manage their data.
Given the importance of sharing information among diverse disciplines in the era of digital transformation, this concept is arguably as important as ever. The aim is to normalize, aggregate, and eventually make available to analysts across the organization data that originates in various pockets of the enterprise.
For many organizations, a centralized data platform will fall short as it gives data teams much less autonomy over managing increasingly diverse and voluminous datasets. A centralized data engineering team focuses on building a governed self-service infrastructure, while domain teams use the services to build full-stack data products.
Furthermore, these tools boast customization options, allowing users to tailor data sources to address areas critical to their business success, thereby generating actionable insights and customizable reports.
Best BI Tools for Data Analysts
Key Features: Extensive library of pre-built connectors for diverse data sources.
Customers such as Crossmark , DJO Global and others use Birst with Snowflake to deliver the ultimate modern dataarchitecture. Data never leaves Snowflake with Birst’s ability to support the reporting and self-service needs of both centralized IT and decentralized LOB teams.
This was, without question, a significant departure from traditional analytic environments, which often meant vendor lock-in and the inability to work with data at scale. Another unexpected challenge was the introduction of Spark as a processing framework for big data.