Data Architecture, Metadata and Visualization

How EUROGATE established a data mesh architecture using Amazon DataZone

AWS Big Data

JANUARY 15, 2025

Need for a data mesh architecture: Because entities in the EUROGATE group generate vast amounts of data from various sources (across departments, locations, and technologies), the traditional centralized data architecture struggles to keep up with the demands for real-time insights, agility, and scalability.

IoT

IoT Machine Learning Metadata Data-driven

5 Ways Data Modeling Is Critical to Data Governance

erwin

JANUARY 9, 2020

While it’s always been the best way to understand complex data sources and automate design standards and integrity rules, the role of data modeling continues to expand as the fulcrum of collaboration between data generators, stewards and consumers. So here’s why data modeling is so critical to data governance.

Data Governance

Data Governance Modeling Metadata Unstructured Data

What is a data architect? Skills, salaries, and how to become a data framework master

CIO Business Intelligence

OCTOBER 13, 2023

Data architect role: Data architects are senior visionaries who translate business requirements into technology requirements and define data standards and principles, often in support of data or digital transformations. Data architect vs. data engineer: The data architect and data engineer roles are closely related.

Data Architecture

Data Architecture Data Warehouse Statistics Visualization

Use Apache Iceberg in your data lake with Amazon S3, AWS Glue, and Snowflake

AWS Big Data

APRIL 3, 2024

They understand that a one-size-fits-all approach no longer works, and recognize the value in adopting scalable, flexible tools and open data formats to support interoperability in a modern data architecture to accelerate the delivery of new solutions.

Data Lake

Data Lake Snapshot Metadata Data Architecture

The next generation of Amazon SageMaker: The center for all your data, analytics, and AI

AWS Big Data

DECEMBER 4, 2024

Collaborate and build faster using familiar AWS tools for model development, generative AI, data processing, and SQL analytics with Amazon Q Developer, the most capable generative AI assistant for software development, helping you along the way. Having confidence in your data is key.

Data Analytics

Data Analytics Analytics Data Lake Data Quality

The Top Three Entangled Trends in Data Architectures: Data Mesh, Data Fabric, and Hybrid Architectures

Cloudera

SEPTEMBER 29, 2022

Each of these trends claims to be a complete data architecture model that solves the “everything everywhere all at once” problem. Data teams are confused as to whether they should get on the bandwagon of just one of these trends or pick a combination. First, we describe how data mesh and data fabric could be related.

Data Architecture

Data Architecture Data Warehouse Metadata Sales

Top analytics announcements of AWS re:Invent 2024

AWS Big Data

FEBRUARY 26, 2025

Amazon SageMaker Unified Studio brings together functionality and tools from the range of standalone studios, query editors, and visual tools available today in Amazon EMR, AWS Glue, Amazon Redshift, Amazon Bedrock, and the existing Amazon SageMaker Studio. With AWS Glue 5.0,

Analytics

Analytics Data Lake Metadata Data Warehouse

The Future of Data Lineage and the Role of Metadata

Alation

AUGUST 18, 2022

The complex challenge here is to have the lineage be intelligently updated as the data landscape and processing dynamically bubbles and changes daily across an enterprise. Active metadata will play a critical role in automating such updates as they arise. This approach ensures lineage is easy to visualize. Why Focus on Lineage?

Metadata

Metadata Visualization Statistics Data Architecture

Simplify data integration with AWS Glue and zero-ETL to Amazon SageMaker Lakehouse

AWS Big Data

DECEMBER 4, 2024

While traditional extract, transform, and load (ETL) processes have long been a staple of data integration due to their flexibility, for common use cases such as replication and ingestion, they often prove time-consuming, complex, and less adaptable to the fast-changing demands of modern data architectures.

Data Integration

Data Integration Data Lake Statistics Data-driven

What is data governance? Best practices for managing data assets

CIO Business Intelligence

MARCH 24, 2023

The program must introduce and support standardization of enterprise data. Programs must support proactive and reactive change management activities for reference data values and the structure/use of master data and metadata.

Data Governance

Data Governance Management Metadata Data Quality

Modern Data Modeling: The Foundation of Enterprise Data Management and Data Governance

erwin

MAY 13, 2020

Metadata management is the key to managing and governing your data and drawing intelligence from it. Beyond harvesting and cataloging metadata, it also must be visualized to break down the complexity of how data is organized and what data relationships there are so that meaning is explicit to all stakeholders in the data value chain.

Data Governance

Data Governance Enterprise Modeling Management

Unstructured data management and governance using AWS AI/ML and analytics services

AWS Big Data

OCTOBER 25, 2023

But most important of all, the assumed dormant value in the unstructured data is a question mark, which can only be answered after these sophisticated techniques have been applied. Therefore, there is a need to be able to analyze and extract value from the data economically and flexibly. The solution integrates data in three tiers.

Unstructured Data

Unstructured Data Metadata Management Analytics

How Cargotec uses metadata replication to enable cross-account data sharing

AWS Big Data

JUNE 7, 2023

Cargotec captures terabytes of IoT telemetry data from their machinery operated by numerous customers across the globe. This data needs to be ingested into a data lake, transformed, and made available for analytics, machine learning (ML), and visualization. The target accounts read data from the source account S3 buckets.

Metadata

Metadata Data Lake Machine Learning Big Data

Extracting key insights from Amazon S3 access logs with AWS Glue for Ray

AWS Big Data

SEPTEMBER 7, 2023

AWS Glue Data Catalog stores information as metadata tables, where each table specifies a single data store. The AWS Glue crawler writes metadata to the Data Catalog by classifying the data to determine the format, schema, and associated properties of the data. Disable the scheduled AWS Glue job run.

Metadata

Metadata Dashboards Metrics Visualization

5 Reasons to Use Apache Iceberg on Cloudera Data Platform (CDP)

Cloudera

MARCH 23, 2022

In fact, we recently announced the integration with our cloud ecosystem bringing the benefits of Iceberg to enterprises as they make their journey to the public cloud, and as they adopt more converged architectures like the Lakehouse. 1: Multi-function analytics. 2: Open formats. 3: Open Performance.

Metadata

Metadata Data Architecture Machine Learning Cost-Benefit

SAP Datasphere review: turning data from a technical problem to a business data product.

Jen Stirrup

MARCH 29, 2023

SAP helps to solve this search problem by offering ways to simplify business data with a solid data foundation that powers SAP Datasphere. It fits neatly with the renewed interest in data architecture, particularly data fabric architecture. They fail to get a grip on their data.

Data Warehouse

Data Warehouse Metadata Data Integration Business Intelligence

Augmented data management: Data fabric versus data mesh

IBM Big Data Hub

APRIL 27, 2022

Data fabric and data mesh are emerging data management concepts that are meant to address the organizational change and complexities of understanding, governing and working with enterprise data in a hybrid multicloud ecosystem. The good news is that both data architecture concepts are complementary.

Management

Management Metadata Data Architecture Data Lake

Petabyte-scale log analytics with Amazon S3, Amazon OpenSearch Service, and Amazon OpenSearch Ingestion

AWS Big Data

MARCH 7, 2024

At the same time, they need to optimize operational costs to unlock the value of this data for timely insights and do so with a consistent performance. With this massive data growth, data proliferation across your data stores, data warehouse, and data lakes can become equally challenging.

Data Lake

Data Lake Analytics Dashboards Metrics

BMW Cloud Efficiency Analytics powered by Amazon QuickSight and Amazon Athena

AWS Big Data

NOVEMBER 15, 2023

It seamlessly consolidates data from various data sources within AWS, including AWS Cost Explorer (and forecasting with Cost Explorer), AWS Trusted Advisor, and AWS Compute Optimizer. They can use their own toolsets or rely on provided blueprints to ingest the data from source systems.

Analytics

Analytics Dashboards Metadata Data Warehouse

BI Data Lineage Solutions: Your Trusted Guide For Success

Octopai

JULY 9, 2020

Here are some scenarios in which companies found real benefits from automated data lineage solutions: Data Lineage Enables Complex Data Processing Operations. By adopting automated data lineage and automated metadata tagging, companies have the opportunity to increase their data processing speed.

Insurance

Insurance Risk Management Machine Learning Metadata

Amazon DataZone now integrates with AWS Glue Data Quality and external data quality solutions

AWS Big Data

APRIL 3, 2024

In the first part of this post, we walk through the integration between AWS Glue Data Quality and Amazon DataZone. We discuss how to visualize data quality scores in Amazon DataZone, enable AWS Glue Data Quality when creating a new Amazon DataZone data source, and enable data quality for an existing data asset.

Data Quality

Data Quality Visualization Metadata Metrics

Create an end-to-end data strategy for Customer 360 on AWS

AWS Big Data

MARCH 26, 2024

Profile aggregation – When you’ve uniquely identified a customer, you can build applications in Managed Service for Apache Flink to consolidate all their metadata, from name to interaction history. Then, you transform this data into a concise format. Data exploration: Data exploration helps unearth inconsistencies, outliers, or errors.

Data Strategy

Data Strategy Strategy Data Warehouse Prescriptive Analytics

How ATPCO enables governed self-service data access to accelerate innovation with Amazon DataZone

AWS Big Data

JULY 25, 2024

ATPCO is the industry leader in providing pricing and merchandising content for airlines, global distribution systems (GDSs), online travel agencies (OTAs), and other sales channels for consumers to visually understand differences between various offers. Consume data assets as part of analyzing data to generate insights.

Data Lake

Data Lake Metadata Sales Publishing

Amazon Redshift announcements at AWS re:Invent 2023 to enable analytics on all your data

AWS Big Data

NOVEMBER 29, 2023

Amazon Q generative SQL capability: Query Editor, an out-of-the-box web-based SQL experience in Amazon Redshift, is a popular tool for data exploration, visual analysis, and data collaboration. Here are a couple of highlights from this week; for the full list, see below.

Data Warehouse

Data Warehouse Analytics Data Lake Machine Learning

Architectural patterns for real-time analytics using Amazon Kinesis Data Streams, part 1

AWS Big Data

JANUARY 8, 2024

Kinesis Data Streams has native integrations with other AWS services such as AWS Glue and Amazon EventBridge to build real-time streaming applications on AWS. Refer to Amazon Kinesis Data Streams integrations for additional details. The raw data can be streamed to Amazon S3 for archiving.

Analytics

Analytics IoT Data-driven Snapshot

How Huron built an Amazon QuickSight Asset Catalogue with AWS CDK Based Deployment Pipeline

AWS Big Data

APRIL 26, 2023

Having an accurate and up-to-date inventory of all technical assets helps an organization ensure it can keep track of all its resources with metadata information such as their assigned owners, last updated date, used by whom, how frequently and more. This is a guest blog post co-written with Corey Johnson from Huron.

Metadata

Metadata Dashboards Visualization Consulting

How Finance is Leveraging Automated Data Lineage for Regulations Compliance

Octopai

APRIL 8, 2020

While there are many factors that led to this event, one critical dynamic was the inadequacy of the data architectures supporting banks and their risk management systems. It required banks to maintain data architecture supporting risk aggregation at all times. Automated Data Lineage Tools Provide Regulatory Solutions.

Finance

Finance Cost-Benefit Metadata Data Architecture

How smava makes loans transparent and affordable using Amazon Redshift Serverless

AWS Big Data

DECEMBER 21, 2023

Overview of solution: As a data-driven company, smava relies on the AWS Cloud to power their analytics use cases. smava ingests data from various external and internal data sources into a landing stage on the data lake based on Amazon Simple Storage Service (Amazon S3). This is the Data Mart stage.

Data Lake

Data Lake Data Warehouse Data-driven B2B

How Zurich Insurance Group built a log management solution on AWS

AWS Big Data

JULY 16, 2024

Priority 2 logs, such as operating system security logs, firewall, identity provider (IdP), email metadata, and AWS CloudTrail, are ingested into Amazon OpenSearch Service to enable the following capabilities. Develop log and trace analytics solutions with interactive queries and visualize results with high adaptability and speed.

Insurance

Insurance Management Cost-Benefit Optimization

The Cloud Connection: How Governance Supports Security

Alation

APRIL 14, 2022

In today’s AI/ML-driven world of data analytics, explainability needs a repository just as much as those doing the explaining need access to metadata, e.g., information about the data being used. The Cloud Data Migration Challenge. A useful feature for exposing patterns in the data. Visual Profiling.

Metadata

Metadata Data Governance Data-driven Modeling

Perform data parity at scale for data modernization programs using AWS Glue Data Quality

AWS Big Data

OCTOBER 9, 2024

As part of this step, you use AWS Glue Data Quality to compare data between PostgreSQL and Amazon S3 to confirm the data is valid. Complete the following steps to create an AWS Glue job using the AWS Glue visual editor to compare data between PostgreSQL and Amazon S3: Set the source as the PostgreSQL table sample_data.

Data Quality

Data Quality Data Lake Data Warehouse Metrics

Erwin Data Intelligence: A Data Partner’s Perspective

erwin

FEBRUARY 28, 2024

While the essence of success in data governance is people and not technology, having the right tools at your fingertips is crucial. Technology is an enabler, and for data governance this is essentially having an excellent metadata management tool. Next to data governance, data architecture is really embedded in our DNA.

Metadata

Metadata Data Governance Data Quality Technology

2023 Predictions: Data Trends That Will Dominate Business Agenda in APAC

Cloudera

JANUARY 5, 2023

We foresee organizations pivoting focus beyond the algorithm to things like business-ready predictive dashboards, visualizations, and applications that simplify the use of AI systems to reach conclusions. These features provide businesses with a common metadata, security, and governance model across all their data.

Cost-Benefit

Cost-Benefit Business Objectives Machine Learning Data Architecture

Data integrity vs. data quality: Is there a difference?

IBM Big Data Hub

JULY 13, 2023

Geocoding: Geocoding is the process of adding location metadata to an organization’s datasets. By tagging data with geographical coordinates to track where it originated from, where it has been and where it resides, an organization can ensure national and global geographic data standards are being met.

Data Quality

Data Quality Data Integration Metadata Cost-Benefit

Exploring real-time streaming for generative AI Applications

AWS Big Data

MARCH 25, 2024

Streaming jobs constantly ingest new data to synchronize across systems and can perform enrichment, transformations, joins, and aggregations across windows of time more efficiently. OpenSearch Service offers visualization capabilities powered by OpenSearch Dashboards and Kibana (1.5

Data Lake

Data Lake Unstructured Data Management Snapshot

Build incremental data pipelines to load transactional data changes using AWS DMS, Delta 2.0, and Amazon EMR Serverless

AWS Big Data

MARCH 3, 2023

With fast and fine-grained scaling in EMR Serverless, if a pipeline runs daily and needs to process 1 GB of data one day and 100 GB of data another day, EMR Serverless automatically scales to handle that load. Monjumi Sarma is a Data Lab Solutions Architect at AWS.

Data Lake

Data Lake Dashboards Metrics Metadata

Getting Above the Silos: The Rise of the Logical Data Fabric

Data Virtualization

APRIL 2, 2020

With so much valuable data potentially available, it can be frustrating for organizations to discover that they can’t easily work with it because it’s stuck in disconnected silos. Limited data access is a problem when organizations need timely, complete views.

IT

IT Data Architecture Data Strategy Metadata

GraphDB Empowers Scientific Projects to Fight COVID-19 and Publish Knowledge Graphs

Ontotext

APRIL 15, 2020

GraphDB’s Visual Graph can be used to explore the data as demonstrated below. As this type of data is very dynamic, the flexibility of knowledge graphs and their capacity to seamlessly integrate data from disparate sources provides researchers with valuable live insights into the COVID-19 pandemic and its consequences.

Publishing

Publishing Metadata Data mining Data Architecture

What Is Embedded Analytics?

Jet Global

MAY 1, 2023

This is in contrast to traditional BI, which extracts insight from data outside of the app. We rely on increasingly mobile technology to comb through massive amounts of data and solve high-value problems. Plus, there is an expectation that tools be visually appealing to boot. Their dashboards were visually stunning.

Analytics

Analytics Cost-Benefit Visualization Dashboards

Introducing the HubSpot connector for AWS Glue

AWS Big Data

DECEMBER 2, 2024

AWS Glue also supports the ability to apply complex data transformations, enabling efficient data integration and preparation to meet your needs. Schema and other metadata will be registered in the AWS Glue Data Catalog, a centralized metadata repository for all your data assets.

Data Lake

Data Lake Testing Data Integration Metadata

How BMW Group built a serverless terabyte-scale data transformation architecture with dbt and Amazon Athena

AWS Big Data

APRIL 29, 2025

While enabling organization-wide efficiency, the team also applied these principles to the data architecture, making sure that CLEA itself operates frugally. After evaluating various tools, we built a serverless data transformation pipeline using Amazon Athena and dbt. However, our initial data architecture led to challenges.

Data Transformation

Data Transformation Cost-Benefit Testing Data Lake

Access your existing data and resources through Amazon SageMaker Unified Studio, Part 1: AWS Glue Data Catalog and Amazon Redshift

AWS Big Data

APRIL 28, 2025

Create a federated connection for Amazon Redshift: Complete the following steps to create a federated catalog in the Data Catalog to query the data using various preferred analytics tools such as Athena, visual ETL in SageMaker Unified Studio, Amazon EMR, and more: On the SageMaker Unified Studio console, choose your project.

Metadata

Metadata Data Lake Big Data Publishing

A Summary Of Gartner’s Recent Innovation Insight Into Data Observability

DataKitchen

AUGUST 8, 2023

Data Observability leverages five critical technologies to create a data awareness AI engine: data profiling, active metadata analysis, machine learning, data monitoring, and data lineage. Like an apartment blueprint, data lineage provides a written document that is only marginally useful during a crisis.

Data Quality

Data Quality Testing Snapshot Reporting

“You Complete Me,” said Data Lineage to DataOps Observability.

DataKitchen

JANUARY 23, 2023

To capture a more complete picture of the data’s journey, it is important to have a DataOps Observability system in place. Data lineage is static and often lags by weeks or months. Data lineage is often considered static because it is typically based on snapshots of data and metadata taken at a specific time.

Testing

Testing Data Governance Data Quality Data-driven
