This is part two of a three-part series where we show how to build a data lake on AWS using a modern data architecture. This post shows how to load data from a legacy database (SQL Server) into a transactional data lake (Apache Iceberg) using AWS Glue. To start the job, choose Run.
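The excerpt ends with a truncated fragment of the job's Spark configuration (format(dbname)).config("spark.sql.catalog.glue_catalog.catalog-impl", ...), which points at a SparkSession wired to an Iceberg catalog backed by the AWS Glue Data Catalog. A minimal sketch of that configuration, assuming PySpark inside an AWS Glue job; the bucket, database, and application names are hypothetical:

from pyspark.sql import SparkSession

dbname = "raw_db"  # hypothetical source database name

# Register an Iceberg catalog named glue_catalog, backed by the AWS
# Glue Data Catalog, with table data stored on S3.
spark = (
    SparkSession.builder.appName("sqlserver-to-iceberg-{}".format(dbname))
    .config("spark.sql.catalog.glue_catalog",
            "org.apache.iceberg.spark.SparkCatalog")
    .config("spark.sql.catalog.glue_catalog.catalog-impl",
            "org.apache.iceberg.aws.glue.GlueCatalog")
    .config("spark.sql.catalog.glue_catalog.warehouse",
            "s3://example-bucket/warehouse/")
    .config("spark.sql.catalog.glue_catalog.io-impl",
            "org.apache.iceberg.aws.s3.S3FileIO")
    .config("spark.sql.extensions",
            "org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions")
    .getOrCreate()
)

The catalog name (glue_catalog) matches the key in the truncated fragment; tables then resolve as glue_catalog.<database>.<table>.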
Data has continued to grow both in scale and in importance through this period, and today telecommunications companies increasingly see data architecture as an independent organizational challenge, not merely an item on an IT checklist. Why telcos should consider a modern data architecture. The challenges.
A Gartner Marketing survey found that only 14% of organizations have successfully implemented a C360 solution, due to a lack of consensus on what a 360-degree view means, challenges with data quality, and the absence of a cross-functional governance structure for customer data.
However, embedding ESG into an enterprise data strategy doesn't have to start as a C-suite directive. Developers, data architects, and data engineers can initiate change at the grassroots level, from integrating sustainability metrics into data models to ensuring ESG data integrity and fostering collaboration with sustainability teams.
The telecommunications industry continues to develop hybrid data architectures to support data workload virtualization and cloud migration. Telco organizations are planning to move towards hybrid multi-cloud to manage data better and support their workforces in the near future. 2. AI capability drives data monetization.
The world now runs on Big Data. Defined as information sets too large for traditional statistical analysis, Big Data represents a host of insights that businesses can apply towards better practices. But what exactly are the opportunities present in Big Data? In manufacturing, this means opportunity.
Transactional table formats enable transactions on top of data lakes and can simplify data storage, management, ingestion, and processing. These transactional data lakes combine features from both the data lake and the data warehouse. Data can be organized into three different zones, as in the sketch below.
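A minimal sketch of how such zones might map onto S3 prefixes and Iceberg tables, reusing the Spark session configured earlier. The bronze/silver/gold naming is an assumption (the series elsewhere calls the consumption layer gold), and all bucket, database, and table names are hypothetical:

# Hypothetical zone layout for a transactional data lake on S3.
ZONES = {
    "bronze": "s3://example-lake/bronze/",  # raw ingested data
    "silver": "s3://example-lake/silver/",  # cleaned, conformed data
    "gold":   "s3://example-lake/gold/",    # consumption-ready datasets
}

# Promote a dataset from bronze to silver with a simple cleanup step,
# writing the result as an Iceberg table in the Glue catalog.
raw = spark.read.parquet(ZONES["bronze"] + "customers/")
cleaned = raw.dropDuplicates(["customer_id"]).na.drop(subset=["customer_id"])
cleaned.writeTo("glue_catalog.silver_db.customers").createOrReplace()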
Transformation styles like TETL (transform, extract, transform, load) and SQL pushdown also synergize well with a remote engine runtime to capitalize on source/target resources and limit data movement, further reducing costs. With a multicloud data strategy, organizations need to optimize for data gravity and data locality.
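To make the data-movement point concrete, a hedged PySpark sketch of SQL pushdown: the aggregation is expressed as a JDBC subquery so it executes inside the source engine, and only the summarized rows cross the network. The connection string, credentials, and table names are hypothetical, and the SQL Server JDBC driver is assumed to be on the classpath:

# Pushdown: the GROUP BY runs in the source database, not in Spark.
pushdown_query = """
    (SELECT region, SUM(amount) AS total_sales
     FROM sales
     GROUP BY region) AS regional_sales
"""

summary = (
    spark.read.format("jdbc")
    .option("url", "jdbc:sqlserver://example-host:1433;databaseName=sales_db")
    .option("dbtable", pushdown_query)
    .option("user", "etl_user")
    .option("password", "...")  # fetch from a secrets manager in practice
    .load()
)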
With an extensive career in the financial and tech industries, she specializes in data management and has been involved in initiatives ranging from reporting to data architecture. She currently serves as the Global Head of Cyber Data Management at Zurich Group.
Success criteria alignment by all stakeholders (producers, consumers, operators, auditors) is key to a successful transition to a new modern data architecture on Amazon Redshift. The success criteria are the key performance indicators (KPIs) for each component of the data workflow.
Earlier this month we hosted the second annual Data-Centric Architecture Forum (#DCAF2020) in Fort Collins, CO. Last year (2019) we hosted the first Data-Centric Architecture conference, where the focus was on getting a sketch of a reference architecture. What follows is a trip report.
IaaS provides a platform for compute, data storage, and networking capabilities. IaaS is mainly used for developing software (testing and development, batch processing), hosting web applications, and data analysis. It also lets teams try and test platforms in accordance with data strategy and governance. No pun intended.
The gold standard in data modeling solutions for more than 30 years continues to evolve with its latest release, highlighted by: PostgreSQL 16.x. Migration and modernization: it enables seamless transitions between legacy systems and modern platforms, ensuring your data architecture evolves without disruption.
Four-layered data lake and data warehouse architecture: the architecture comprises four layers, including the analytical layer, which houses purpose-built fact and dimension datasets hosted in Amazon Redshift.
These inputs reinforced the need for a unified data strategy across the FinOps teams. We decided to build a scalable data management product based on the best practices of modern data architecture. Data source locations hosted by the producer are created within the producer's AWS Glue Data Catalog, along the lines of the sketch below.
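A hedged boto3 sketch of registering a producer-hosted data source location in the AWS Glue Data Catalog; the database name, S3 location, columns, and region are hypothetical:

import boto3

glue = boto3.client("glue", region_name="us-east-1")

# Create a database, then a table whose location points at the
# producer-hosted S3 prefix.
glue.create_database(DatabaseInput={"Name": "finops_producer_db"})
glue.create_table(
    DatabaseName="finops_producer_db",
    TableInput={
        "Name": "cost_usage",
        "TableType": "EXTERNAL_TABLE",
        "StorageDescriptor": {
            "Columns": [
                {"Name": "account_id", "Type": "string"},
                {"Name": "usage_date", "Type": "date"},
                {"Name": "cost", "Type": "double"},
            ],
            "Location": "s3://example-producer-bucket/cost-usage/",
            "InputFormat": "org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat",
            "OutputFormat": "org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat",
            "SerdeInfo": {
                "SerializationLibrary": "org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe"
            },
        },
    },
)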
HEMA has a bespoke enterprise architecture, built around the concept of services. Each service is hosted in a dedicated AWS account and is built and maintained by a product owner and a development team. Tommaso is the Head of Data & Cloud Platforms at HEMA.
The Cloudera Data Platform (CDP) represents a paradigm shift in modern data architecture by addressing all existing and future analytical needs. It runs on any public cloud (AWS, Google, or Azure) and thus allows a use case to execute wherever it is most cost-effective to do so. In particular, SDX enables clients to:
This is the final part of a three-part series where we show how to build a data lake on AWS using a modern data architecture. This post shows how to process data with Amazon Redshift Spectrum and create the gold (consumption) layer. His focus areas are MLOps, feature stores, data lakes, model hosting, and generative AI.
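A hedged sketch of what building the gold layer with Redshift Spectrum can look like, issuing SQL through the boto3 Redshift Data API; the IAM role, workgroup, schema, and table names are hypothetical:

import boto3

rsd = boto3.client("redshift-data", region_name="us-east-1")

# The external schema lets Spectrum read the silver zone in place via
# the Glue Data Catalog; the CTAS materializes a gold table in Redshift.
create_schema = """
CREATE EXTERNAL SCHEMA IF NOT EXISTS silver
FROM DATA CATALOG DATABASE 'silver_db'
IAM_ROLE 'arn:aws:iam::123456789012:role/example-spectrum-role'
"""

create_gold = """
CREATE TABLE gold.daily_sales AS
SELECT sale_date, region, SUM(amount) AS total_amount
FROM silver.sales
GROUP BY sale_date, region
"""

resp = rsd.batch_execute_statement(
    WorkgroupName="example-workgroup",  # or ClusterIdentifier for a provisioned cluster
    Database="dev",
    Sqls=[create_schema, create_gold],
)
print(resp["Id"])  # poll with describe_statement until the batch finishes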