Types of data debt include dark data, duplicate records, and data that hasn't been integrated with master data sources. Using the company's data in LLMs, AI agents, or other generative AI models creates more risk. Playing catch-up with AI models may not be that easy.
Digital transformation started creating a digital presence of everything we do in our lives, and artificial intelligence (AI) and machine learning (ML) advancements in the past decade dramatically altered the data landscape. That's free money given to cloud providers and creates significant issues in end-to-end value generation.
This is part two of a three-part series where we show how to build a data lake on AWS using a modern data architecture. This post shows how to load data from a legacy database (SQL Server) into a transactional data lake (Apache Iceberg) using AWS Glue. To start the job, choose Run. The job's Spark session registers the Iceberg catalog through settings such as spark.sql.catalog.glue_catalog.catalog-impl.
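The original post's code excerpt is truncated mid-expression, so below is a minimal sketch of what that Iceberg-on-Glue Spark configuration typically looks like; the warehouse path and application name are placeholders, not values from the original post.

    from pyspark.sql import SparkSession

    # Placeholder S3 warehouse path; substitute your own bucket.
    warehouse_path = "s3://my-data-lake-bucket/iceberg-warehouse/"

    spark = (
        SparkSession.builder.appName("sqlserver-to-iceberg")
        # Register an Iceberg catalog named "glue_catalog" backed by the AWS Glue Data Catalog.
        .config("spark.sql.catalog.glue_catalog", "org.apache.iceberg.spark.SparkCatalog")
        .config("spark.sql.catalog.glue_catalog.catalog-impl", "org.apache.iceberg.aws.glue.GlueCatalog")
        .config("spark.sql.catalog.glue_catalog.warehouse", warehouse_path)
        .config("spark.sql.catalog.glue_catalog.io-impl", "org.apache.iceberg.aws.s3.S3FileIO")
        # Enable Iceberg SQL extensions (MERGE INTO, time travel, stored procedures).
        .config("spark.sql.extensions", "org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions")
        .getOrCreate()
    )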
In 2022, data organizations will institute robust automated processes around their AI systems to make them more accountable to stakeholders. Model developers will test for AI bias as part of their pre-deployment testing. Continuous testing, monitoring and observability will prevent biased models from deploying or continuing to operate.
Together with price-performance, Amazon Redshift offers capabilities such as serverless architecture, machine learning integration within your data warehouse, and secure data sharing across the organization. dbt Cloud is a hosted service that helps data teams productionize dbt deployments. Schema – dbt_zetl.
Need for a data mesh architecture
Because entities in the EUROGATE group generate vast amounts of data from various sources, across departments, locations, and technologies, the traditional centralized data architecture struggles to keep up with the demands for real-time insights, agility, and scalability.
I did some research because I wanted to create a basic framework on the intersection between large language models (LLMs) and data management. But there are also a host of other issues (and cautions) to take into consideration. An LLM is, by its very design, a language model. The technology is very new and not well understood.
The AI Forecast: Data and AI in the Cloud Era , sponsored by Cloudera, aims to take an objective look at the impact of AI on business, industry, and the world at large. Therefore, the next 10%, which are small language models, are going to come into play. But 85% accuracy in the supply chain means you have no manufacturing operations.
Each of these trends claims to be a complete model for data architectures that solve the “everything everywhere all at once” problem. Data teams are confused as to whether they should get on the bandwagon of just one of these trends or pick a combination. It is also agnostic to where the different domains are hosted.
No industry generates as much actionable data as the finance industry, and as AI enters the mainstream, user behaviour and corporate production and service models will all need to quickly adapt. “Resilient infrastructure is the key to delivering on the promise of real-time transformation of data into decisions,” Mr. Cao said.
Analytics as a service (AaaS) is a business model that uses the cloud to deliver analytic capabilities on a subscription basis. This model provides organizations with a cost-effective, scalable, and flexible solution for building analytics, with better price-performance than other cloud data warehouses.
SAP announced today a host of new AI copilot and AI governance features for SAP Datasphere and SAP Analytics Cloud (SAC). “We have cataloging inside Datasphere: it allows you to catalog and manage metadata for all the SAP data assets we’re seeing,” said JG Chirapurath, chief marketing and solutions officer for SAP.
Everything from geothermal data centers to more efficient graphics processing units (GPUs) can help. But AI users must also get over the urge to use the biggest, baddest AI models to solve every problem if they truly want to fight climate change. “Is it necessary for a model that can also write a sonnet to write code for us?”
For consumer access, a centralized catalog is necessary where producers can publish their data assets. Cross-producer data access – Consumers may need to access data from multiple producers within the same catalog environment. The producer account will host the EMR cluster and S3 buckets, along with a VPC with the CIDR 10.0.0.0/16.
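As a rough sketch of the producer-account networking described above, the following boto3 call creates a VPC with that CIDR block; the region and tag name are assumptions, not details from the original post.

    import boto3

    # Assumed region; the excerpt does not specify one.
    ec2 = boto3.client("ec2", region_name="us-east-1")

    # Create the producer account's VPC with the CIDR named in the excerpt.
    vpc = ec2.create_vpc(CidrBlock="10.0.0.0/16")
    vpc_id = vpc["Vpc"]["VpcId"]

    # Tag it with a hypothetical name so it is easy to find in the console.
    ec2.create_tags(Resources=[vpc_id], Tags=[{"Key": "Name", "Value": "producer-data-lake-vpc"}])
    print(f"Created VPC {vpc_id}")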
Continue to conquer data chaos and build your data landscape on a sturdy and standardized foundation with erwin® Data Modeler 14.0. The gold standard in data modeling solutions for more than 30 years continues to evolve with its latest release, highlighted by support for PostgreSQL 16.x.
The technological linchpin of its digital transformation has been its Enterprise Data Architecture & Governance platform. It hosts over 150 big data analytics sandboxes across the region with over 200 users utilizing the sandbox for data discovery.
However, getting into the more difficult types of implementations — the fine-tuned models, vector databases to provide context and up-to-date information to the AI systems, and APIs to integrate gen AI into workflows — is where problems might crop up. That’s fine, but language models are great for language. They need stability.
The telecommunications industry continues to develop hybrid data architectures to support data workload virtualization and cloud migration. Telco organizations are planning to move towards hybrid multi-cloud to manage data better and support their workforces in the near future. 2. AI capability drives data monetization.
We’ve found it helpful to think in terms of three archetypes: Takers use a chat interface or an API to quickly access a commodity service via a publicly available model. In shaper use cases, CIOs need to integrate existing gen AI models with internal data and systems to work together seamlessly and generate customized results.
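To make the taker archetype concrete, here is a minimal sketch of calling a publicly hosted model over plain HTTP; the endpoint, header, and payload shape are hypothetical, since the excerpt names no specific provider.

    import requests

    # Hypothetical endpoint and key; a real provider's URL and request schema will differ.
    API_URL = "https://api.example-llm-provider.com/v1/chat"
    API_KEY = "YOUR_API_KEY"

    def ask_model(prompt: str) -> str:
        """Send one prompt to the hosted model and return its text reply."""
        response = requests.post(
            API_URL,
            headers={"Authorization": f"Bearer {API_KEY}"},
            json={"prompt": prompt, "max_tokens": 256},
            timeout=30,
        )
        response.raise_for_status()
        # "output" is part of the hypothetical response schema.
        return response.json()["output"]

    print(ask_model("Summarize last quarter's churn drivers in two sentences."))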
Modernizing a utility’s data architecture. “These capabilities allow us to reduce business risk as we move off of our monolithic, on-premises environments and provide cloud resiliency and scale,” the CIO says, noting National Grid also has a major data center consolidation under way as it moves more data to the cloud.
While navigating so many simultaneous data-dependent transformations, they must balance the need to level up their data management practices—accelerating the rate at which they ingest, manage, prepare, and analyze data—with that of governing this data.
Modern, real-time businesses require accelerated cycles of innovation that are expensive and difficult to maintain with legacy data platforms. The hybrid cloud’s premise—two data architectures fused together—gives companies options to leverage those solutions and to address decision-making criteria on a case-by-case basis.
One Data Platform
The ODP architecture is based on the AWS Well-Architected Framework Analytics Lens and follows the pattern of having raw, standardized, conformed, and enriched layers, as described in Modern data architecture. See the following admin user code: admin_secret_kms_key_options = KmsKeyOptions(…)
However, embedding ESG into an enterprise data strategy doesn’t have to start as a C-suite directive. Developers, data architects, and data engineers can initiate change at the grassroots level, from integrating sustainability metrics into data models to ensuring ESG data integrity and fostering collaboration with sustainability teams.
The Cloudera Data Platform (CDP) represents a paradigm shift in modern data architecture by addressing all existing and future analytical needs. More information about Cloudera Data Platform can be found at [link]. In particular, SDX enables clients to:
If the organization had any experience with machine learning, it was concentrated in some team that was tucked away in a dark corner somewhere that maybe had years of experience building out some niche use case like a fraud model at a credit card company or churn models at a phone company. From a recent Cloudera roundtable event.
Tracking data changes and rollback
Build your transactional data lake on AWS
You can build your modern data architecture with a scalable data lake that integrates seamlessly with an Amazon Redshift powered cloud warehouse. Data can be organized into three different zones, as shown in the following figure.
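The change tracking and rollback mentioned above correspond to Iceberg's snapshot features. A brief sketch follows, reusing the glue_catalog naming from the Glue excerpt earlier; the database, table, and snapshot ID are placeholders.

    # Assumes a SparkSession ("spark") already configured with an Iceberg catalog
    # named glue_catalog; db.orders and the snapshot ID below are placeholders.

    # Inspect the table's snapshot history to see every tracked change.
    spark.sql(
        "SELECT snapshot_id, committed_at, operation FROM glue_catalog.db.orders.snapshots"
    ).show()

    # Time travel: read the table as of an older snapshot.
    old_df = (
        spark.read.format("iceberg")
        .option("snapshot-id", 1234567890123456789)
        .load("glue_catalog.db.orders")
    )

    # Roll the live table back to that snapshot with Iceberg's stored procedure.
    spark.sql(
        "CALL glue_catalog.system.rollback_to_snapshot('db.orders', 1234567890123456789)"
    )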
Organizations require reliable data for robust AI models and accurate insights, yet the current technology landscape presents unparalleled data quality challenges. The explosion of data volume across formats and locations, together with the pressure to scale AI, makes this a daunting task for those responsible for deploying it.
HEMA has a bespoke enterprise architecture, built around the concept of services. Each service is hosted in a dedicated AWS account and is built and maintained by a product owner and a development team, as illustrated in the following figure. Tommaso is the Head of Data & Cloud Platforms at HEMA.
Four-layered data lake and data warehouse architecture – The architecture comprises four layers, including the analytical layer, which houses purpose-built facts and dimension datasets that are hosted in Amazon Redshift. This enables data-driven decision-making across the organization.
The need for a decentralized data mesh architecture stems from the challenges organizations faced when implementing more centralized data management architectures – challenges that can be attributed to both technology (e.g., the difficulty of achieving a cross-organizational governance model).
Components of a Data Mesh.
Cost and resource efficiency – This is an area where Acast observed a reduction in data duplication, and therefore cost reduction (in some accounts, eliminating duplicated copies of data entirely), by reading data across accounts while enabling scaling. All other teams can be data producers or data consumers.
Overview of solution As a data-driven company, smava relies on the AWS Cloud to power their analytics use cases. smava ingests data from various external and internal data sources into a landing stage on the data lake based on Amazon Simple Storage Service (Amazon S3).
In addition, the business started moving customers onto a new commercial model, and therefore new projects would need to provision a new cluster, which meant that they needed improved performance, scalability, and availability. This model caused operational maintenance overhead and wasn’t easily expandable.
Success criteria alignment by all stakeholders (producers, consumers, operators, auditors) is key for a successful transition to a new Amazon Redshift modern data architecture. The success criteria are the key performance indicators (KPIs) for each component of the data workflow. Performance – Review cluster performance metrics.
The currently available choices include: The Amazon Redshift COPY command can load data from Amazon Simple Storage Service (Amazon S3), Amazon EMR, Amazon DynamoDB, or remote hosts over SSH. This native feature of Amazon Redshift uses massively parallel processing (MPP) to load objects directly from data sources into Redshift tables.
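A minimal sketch of issuing such a COPY from Amazon S3 through the redshift_connector Python driver follows; the cluster endpoint, credentials, table, bucket path, and IAM role are all placeholders.

    import redshift_connector

    # Placeholder connection details; substitute your cluster's endpoint and credentials.
    conn = redshift_connector.connect(
        host="my-cluster.abc123.us-east-1.redshift.amazonaws.com",
        database="dev",
        user="awsuser",
        password="REPLACE_ME",
    )

    cur = conn.cursor()
    # COPY fans the S3 objects out across slices (MPP); the role ARN is a placeholder.
    cur.execute(
        """
        COPY public.sales
        FROM 's3://my-bucket/sales/'
        IAM_ROLE 'arn:aws:iam::123456789012:role/MyRedshiftCopyRole'
        FORMAT AS CSV
        IGNOREHEADER 1;
        """
    )
    conn.commit()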
From AI models that power retail customer decision engines to utility meter analysis that disables underperforming gas turbines, these finalists demonstrate how machine learning and analytics have become mission-critical to organizations around the world. Data-driven strategies are driving change across organizations.
With data becoming the driving force behind many industries today, having a modern data architecture is pivotal for organizations to be successful. This data is sent to Apache Kafka, which is hosted on Amazon Managed Streaming for Apache Kafka (Amazon MSK).
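As a sketch of the producer side of that pipeline, here is a minimal kafka-python snippet sending JSON events to a topic; the MSK bootstrap broker and topic name are placeholders.

    import json
    from kafka import KafkaProducer

    # Placeholder MSK bootstrap broker; fetch the real list from the MSK console or API.
    producer = KafkaProducer(
        bootstrap_servers=["b-1.mymsk.example.amazonaws.com:9092"],
        value_serializer=lambda v: json.dumps(v).encode("utf-8"),
    )

    # Send one sample event to a hypothetical topic.
    producer.send("clickstream-events", {"user_id": 42, "action": "page_view"})
    producer.flush()  # Block until buffered records are delivered.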
These inputs reinforced the need for a unified data strategy across the FinOps teams. We decided to build a scalable data management product based on the best practices of modern data architecture. Our source system and domain teams were mapped as data producers, and they would have ownership of the datasets.
Content and data management solutions based on knowledge graphs are becoming increasingly important across enterprises. (from Q&A with Tim Berners-Lee) Finally, Sumit highlighted the importance of knowledge graphs to advance semantic data architecture models that allow unified data access and empower flexible data integration.
Is it clean data? These are the hard questions, and the Power Platform sidesteps them to an extent with the Common Data Model. However, if you are outside of that, then you need to understand that data and technology are two different things. What needs to be done to get it into good shape?
With the emergence of new creative AI algorithms like large language models (LLMs) from OpenAI’s ChatGPT, Google’s Bard, Meta’s LLaMa, and Bloomberg’s BloombergGPT—awareness, interest, and adoption of AI use cases across industries is at an all-time high. The reality of LLMs and other “narrow” AI technologies is that none of them is turnkey.
Lifecycle policies provide a mechanism to balance the cost of storing data and meeting retention requirements. Historic data analysis – Data stored in Amazon S3 can be queried to satisfy one-time audit or analysis tasks. Eventually, this data could be used to train ML models to support better anomaly detection.
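A minimal boto3 sketch of such a lifecycle policy follows, transitioning older objects to a colder storage class and expiring them after a retention window; the bucket name, prefix, and day counts are assumptions.

    import boto3

    s3 = boto3.client("s3")

    # Hypothetical bucket and prefix; tune the windows to your retention rules.
    s3.put_bucket_lifecycle_configuration(
        Bucket="my-audit-logs-bucket",
        LifecycleConfiguration={
            "Rules": [
                {
                    "ID": "archive-then-expire",
                    "Status": "Enabled",
                    "Filter": {"Prefix": "logs/"},
                    # Move objects to Glacier after 90 days to cut storage cost.
                    "Transitions": [{"Days": 90, "StorageClass": "GLACIER"}],
                    # Delete them after roughly seven years (assumed retention period).
                    "Expiration": {"Days": 2555},
                }
            ]
        },
    )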