Amazon Redshift is a fast, scalable, and fully managed cloud data warehouse that lets you run complex SQL analytics workloads on structured and semi-structured data. Data ingestion – Pentaho was used to ingest data sourced from multiple data publishers into the data store.
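As a rough illustration of the kind of workload described above, here is a minimal Python sketch using the Redshift Data API; the cluster, database, and table names are hypothetical, and the query navigates a semi-structured SUPER column with dot notation.

import boto3

# Minimal sketch: run SQL over semi-structured data via the Redshift Data API.
# ClusterIdentifier, Database, DbUser, and the events table are all hypothetical.
client = boto3.client("redshift-data")
resp = client.execute_statement(
    ClusterIdentifier="my-cluster",
    Database="dev",
    DbUser="awsuser",
    Sql="SELECT e.payload.order_id, e.payload.total FROM events e LIMIT 10;",
)
print(resp["Id"])  # poll describe_statement, then get_statement_result, for rows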
The Gartner Magic Quadrant evaluates 20 data integration tool vendors along two axes: Ability to Execute and Completeness of Vision. Discover, prepare, and integrate all your data at any scale: AWS Glue is a fully managed, serverless data integration service that simplifies data preparation and transformation across diverse data sources.
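To make the Glue teaser concrete, here is a minimal sketch of a serverless Glue ETL script; the catalog database, table, and S3 path are assumptions, not taken from the post.

import sys
from pyspark.context import SparkContext
from awsglue.context import GlueContext
from awsglue.transforms import DropNullFields
from awsglue.utils import getResolvedOptions

# Minimal sketch: read a catalog table, drop null fields, write Parquet to S3.
# Database, table, and bucket names are hypothetical.
args = getResolvedOptions(sys.argv, ["JOB_NAME"])
glue_context = GlueContext(SparkContext.getOrCreate())
frame = glue_context.create_dynamic_frame.from_catalog(
    database="sales_db", table_name="raw_orders"
)
cleaned = DropNullFields.apply(frame=frame)
glue_context.write_dynamic_frame.from_options(
    frame=cleaned,
    connection_type="s3",
    connection_options={"path": "s3://my-bucket/clean/orders/"},
    format="parquet",
)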
Amazon SageMaker Lakehouse, now generally available, unifies all your data across Amazon Simple Storage Service (Amazon S3) data lakes and Amazon Redshift data warehouses, helping you build powerful analytics and AI/ML applications on a single copy of data. Having confidence in your data is key.
This post describes how HPE Aruba automated their supply chain management pipeline and re-architected and deployed their data solution by adopting a modern data architecture on AWS. The following diagram illustrates the solution architecture.
This article was published as part of the Data Science Blogathon. Introduction: Most of you will know the different approaches for building a data and analytics platform, and you may already have worked on systems that use traditional warehouses or Hadoop-based data lakes. Selecting one among […].
Need for a data mesh architecture: Because entities in the EUROGATE group generate vast amounts of data from various sources – across departments, locations, and technologies – the traditional centralized data architecture struggles to keep up with the demands for real-time insights, agility, and scalability.
We also examine how centralized, hybrid, and decentralized data architectures support scalable, trustworthy ecosystems. As data-centric AI, automated metadata management, and privacy-aware data sharing mature, the opportunity to embed data quality into the enterprise's core has never been more significant.
Amazon Redshift is a fully managed, petabyte-scale data warehouse service in the cloud that delivers powerful and secure insights on all your data with the best price-performance. With Amazon Redshift, you can analyze your data to derive holistic insights about your business and your customers.
Each of these trends claims to be a complete model for a data architecture that solves the "everything everywhere all at once" problem. Data teams are confused about whether they should jump on the bandwagon of just one of these trends or pick a combination. First, we describe how data mesh and data fabric could be related.
The AaaS model accelerates data-driven decision-making through advanced analytics, enabling organizations to swiftly adapt to changing market trends and make informed strategic choices. Amazon Redshift offers better price-performance than other cloud data warehouses. Data processing jobs enrich the data in Amazon Redshift.
Large-scale data warehouse migration to the cloud is a complex and challenging endeavor that many organizations undertake to modernize their data infrastructure, enhance data management capabilities, and unlock new business opportunities. This makes sure the new data platform can meet current and future business goals.
Diagram 1: Overall architecture of the solution, using AWS Step Functions, Amazon Redshift, and Amazon S3. The following AWS services were used to shape our new ETL architecture: Amazon Redshift – a fully managed, petabyte-scale data warehouse service in the cloud. Diagram 4 shows this workflow.
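A minimal sketch of how such a Step Functions ETL workflow might be kicked off from Python; the state machine ARN and input payload are hypothetical.

import json
import boto3

# Minimal sketch: start the ETL state machine that orchestrates Redshift and S3 steps.
# The ARN and input fields are hypothetical.
sfn = boto3.client("stepfunctions")
execution = sfn.start_execution(
    stateMachineArn="arn:aws:states:us-east-1:123456789012:stateMachine:etl-pipeline",
    input=json.dumps({"load_date": "2024-01-01", "target_table": "fact_sales"}),
)
print(execution["executionArn"])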
While traditional extract, transform, and load (ETL) processes have long been a staple of data integration thanks to their flexibility, for common use cases such as replication and ingestion they often prove time-consuming, complex, and less adaptable to the fast-changing demands of modern data architectures.
Amazon Redshift is a fast, scalable, secure, and fully managed cloud data warehouse that makes it simple and cost-effective to analyze all your data using standard SQL and your existing ETL (extract, transform, and load), business intelligence (BI), and reporting tools.
One of the key challenges in modern big data management is facilitating efficient data sharing and access control across multiple EMR clusters. Organizations have multiple Hive data warehouses across EMR clusters, where the metadata gets generated. Test access using SageMaker Studio in the consumer account.
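For a sense of how cross-account table access can be granted, here is a minimal Lake Formation sketch; the account ID, database, and table names are hypothetical, and the post's own setup may differ.

import boto3

# Minimal sketch: grant a consumer account SELECT on a catalog table via Lake Formation.
# The principal, database, and table names are hypothetical.
lf = boto3.client("lakeformation")
lf.grant_permissions(
    Principal={"DataLakePrincipalIdentifier": "111122223333"},
    Resource={"Table": {"DatabaseName": "hive_sales_db", "Name": "orders"}},
    Permissions=["SELECT"],
)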
Amazon SageMaker Lakehouse provides an open data architecture that reduces data silos and unifies data across Amazon Simple Storage Service (Amazon S3) data lakes, Redshift data warehouses, and third-party and federated data sources.
A sea of complexity: For years, data ecosystems have grown more complex due to discrete (and not necessarily strategic) data-platform decisions aimed at addressing new projects, use cases, or initiatives. Layering technology on the overall data architecture introduces more complexity.
They enable transactions on top of data lakes and can simplify data storage, management, ingestion, and processing. These transactional data lakes combine features from both the data lake and the data warehouse. Data can be organized into three different zones, as shown in the following figure.
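As a minimal sketch of a transactional table in one such zone, here is an Athena DDL statement submitted from Python that creates an Iceberg table; the bucket, database, and column names are assumptions.

import boto3

# Minimal sketch: create an Iceberg table in the raw zone through Athena.
# Database, bucket, and output location are hypothetical.
athena = boto3.client("athena")
athena.start_query_execution(
    QueryString=(
        "CREATE TABLE raw_zone.orders ("
        " order_id string, total double, order_ts timestamp)"
        " LOCATION 's3://my-lake/raw/orders/'"
        " TBLPROPERTIES ('table_type' = 'ICEBERG')"
    ),
    QueryExecutionContext={"Database": "raw_zone"},
    ResultConfiguration={"OutputLocation": "s3://my-lake/athena-results/"},
)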
An integrated solution provides single sign-on access to data sources and data warehouses. The integrated augmented analytics approach includes simple tenant management to deploy with a shared data model for single-tenant mode or an isolated data model for multi-tenant mode, and software-as-a-service (SaaS) applications.
Satori integrates natively with both Amazon Redshift provisioned clusters and Amazon Redshift Serverless for easy setup of your Amazon Redshift data warehouse in the secure Satori portal. In Part 2, we will explore how to set up self-service access with Satori to data stored in Amazon Redshift.
In this post, we are excited to summarize the features that the AWS Glue Data Catalog, AWS Glue crawler, and Lake Formation teams delivered in 2022. Whether you are a data platform builder, data engineer, data scientist, or any technology leader interested in data lake solutions, this post is for you.
Data fabric and data mesh are emerging data management concepts meant to address the organizational change and complexity of understanding, governing, and working with enterprise data in a hybrid multicloud ecosystem. The good news is that the two data architecture concepts are complementary.
Solution: To address the challenge, ATPCO sought inspiration from modern data mesh architecture. In Amazon DataZone, data owners can publish their data and its business catalog (metadata) to ATPCO's DataZone domain. Data consumers can then search for relevant data assets using these human-friendly metadata terms.
In today's world of complex data architectures and emerging technologies, databases can sometimes be undervalued and unrecognized. Back in the 1960s and 70s, vast amounts of data were stored in the world's new mainframe computers – many of them IBM System/360 machines – and had become a problem. They were expensive.
The Analytics specialty practice of AWS Professional Services (AWS ProServe) helps customers across the globe with modern data architecture implementations on the AWS Cloud. Ravi helps customers with enterprise data strategy initiatives across insurance, airlines, pharmaceutical, and financial services industries.
There are many benefits to an embedded BI approach, including: World-Class Data Architecture – provides access to a wealth of data sources and data warehouses, and accommodates business application architecture in single-tenant or multi-tenant mode.
Workloads that don't expressly require the many-to-many data sharing that the publish/subscribe model solves for might be better served by a universal data distribution tool like NiFi for real-time needs, or by an open table format like Iceberg where making data accessible in near real time is acceptable.
They defined it as: "A data lakehouse is a new, open data management architecture that combines the flexibility, cost-efficiency, and scale of data lakes with the data management and ACID transactions of data warehouses, enabling business intelligence (BI) and machine learning (ML) on all data."
Flexible Deployment – via public or private cloud, or enterprise on-premises hardware.
While cleaning up our archive recently, I found an old article published in 1976 about data dictionary/directory systems (DD/DS). Nowadays, we no longer use the term DD/DS, but "data catalog" or simply "metadata system". It was written by L.
What Is a Data Mesh and How Does It Work? Figure 1 shows the overall idea of a data mesh with its major components. Think of data mesh as an operational mode for organizations with a domain-driven, decentralized data architecture.
Another related term, "data pipeline" (at No. […]). Data engineering is not a new thing, however. Since 1977, for example, the Institute of Electrical and Electronics Engineers (IEEE) has published the Data Engineering Bulletin, a quarterly journal that focuses on engineering data for use with database systems [2].
Cost Savings: By streamlining data access and reducing the need for multiple systems, Simba cuts down on maintenance and integration costs, allowing you to focus resources where they matter most. Ready to Transform Your Data Strategy? Now is the time to integrate Trino and Apache Iceberg into your data ecosystem using Simba drivers.
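A minimal sketch of what querying Trino-managed Iceberg tables through an ODBC driver can look like from Python; the DSN name, catalog, schema, and table are assumptions, and your driver configuration will differ.

import pyodbc

# Minimal sketch: query an Iceberg table through Trino over ODBC.
# The DSN and the iceberg.sales.orders table are hypothetical.
conn = pyodbc.connect("DSN=SimbaTrino", autocommit=True)
cur = conn.cursor()
cur.execute("SELECT order_id, total FROM iceberg.sales.orders LIMIT 10")
for row in cur.fetchall():
    print(row)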
Technology teams often jump into SAP data systems expecting immediate, quantifiable ROI; visions of cost savings and efficiency gains dance in their minds. However, this optimism often overlooks the reality of the situation: complex data architecture, mountains of manual tasks, and hidden inefficiencies in processing.
Make sure your data environment is good to go, meaning the solutions you think about should mesh with your current data architecture. Plan how you will deliver and iterate these within your application; they must be flexible enough to meet the changing demands of users.
The role of traditional BI platforms is to collect data from various business systems. These sit on top of data warehouses that are strictly governed by IT departments. Data Environment: First off, the solutions you consider should be compatible with your current data architecture.
While enabling organization-wide efficiency, the team also applied these principles to the data architecture, making sure that CLEA itself operates frugally. After evaluating various tools, we built a serverless data transformation pipeline using Amazon Athena and dbt. However, our initial data architecture led to challenges.
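A minimal sketch of how such a dbt-on-Athena transformation might be triggered programmatically; the model name is hypothetical, and dbt itself compiles the model to SQL and runs it against Athena.

import subprocess

# Minimal sketch: run one dbt model from Python; the model name is hypothetical.
subprocess.run(["dbt", "run", "--select", "supply_metrics"], check=True)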
When the user interacts with resources within SageMaker Unified Studio, it generates IAM session credentials based on the user's effective profile in the specific project context; users can then use tools such as Amazon Athena or Amazon Redshift to query the relevant data. SageMaker Unified Studio supports Lake Formation hybrid mode.
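A minimal sketch of the credential flow described above: assume a project-scoped role to obtain short-lived session credentials, then query with Athena. The role ARN, table, and output bucket are assumptions.

import boto3

# Minimal sketch: obtain scoped session credentials, then query data with Athena.
# Role ARN, table, and output location are hypothetical.
sts = boto3.client("sts")
creds = sts.assume_role(
    RoleArn="arn:aws:iam::123456789012:role/project-analyst",
    RoleSessionName="studio-session",
)["Credentials"]
session = boto3.Session(
    aws_access_key_id=creds["AccessKeyId"],
    aws_secret_access_key=creds["SecretAccessKey"],
    aws_session_token=creds["SessionToken"],
)
athena = session.client("athena")
resp = athena.start_query_execution(
    QueryString="SELECT count(*) FROM project_db.shipments",
    ResultConfiguration={"OutputLocation": "s3://project-results/"},
)
print(resp["QueryExecutionId"])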