For container terminal operators, data-driven decision-making and efficient data sharing are vital to optimizing operations and boosting supply chain efficiency. Together, these capabilities enable terminal operators to enhance efficiency and competitiveness in an industry that is increasingly data-driven.
The landscape of big data management has been transformed by the rising popularity of open table formats such as Apache Iceberg, Apache Hudi, and Linux Foundation Delta Lake. These formats, designed to address the limitations of traditional data storage systems, have become essential in modern data architectures.
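To make the idea concrete, here is a minimal sketch (not drawn from any of the posts above) of the row-level, transactional changes open table formats allow over plain files, using Apache Iceberg with PySpark. The catalog name, warehouse path, and table are illustrative assumptions, and the snippet assumes the iceberg-spark-runtime package is on Spark's classpath.

```python
# A minimal sketch of ACID-style, row-level changes on files in a data lake.
# Assumes Spark 3.x with iceberg-spark-runtime on the classpath; the catalog
# name, warehouse path, and table below are illustrative, not from the post.
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .appName("iceberg-sketch")
    .config("spark.sql.extensions",
            "org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions")
    .config("spark.sql.catalog.demo", "org.apache.iceberg.spark.SparkCatalog")
    .config("spark.sql.catalog.demo.type", "hadoop")
    .config("spark.sql.catalog.demo.warehouse", "/tmp/iceberg-warehouse")
    .getOrCreate()
)

# Create a table and apply a row-level UPDATE -- something plain Parquet
# files in object storage cannot do transactionally.
spark.sql("CREATE TABLE IF NOT EXISTS demo.db.events (id BIGINT, status STRING) USING iceberg")
spark.sql("INSERT INTO demo.db.events VALUES (1, 'new'), (2, 'new')")
spark.sql("UPDATE demo.db.events SET status = 'processed' WHERE id = 1")
spark.sql("SELECT * FROM demo.db.events ORDER BY id").show()
```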
Businesses are constantly evolving, and data leaders are challenged every day to meet new requirements. Customers are using AWS and Snowflake to develop purpose-built data architectures that provide the performance required for modern analytics and artificial intelligence (AI) use cases.
The data mesh design pattern breaks giant, monolithic enterprise data architectures into subsystems or domains, each managed by a dedicated team. DataOps helps the data mesh deliver greater business agility by enabling decentralized domains to work in concert. But first, let’s define the data mesh design pattern.
With the growing emphasis on data, organizations are constantly seeking more efficient and agile ways to integrate their data, especially from a wide variety of applications. In addition, as organizations rely on an increasingly diverse array of digital systems, data fragmentation has become a significant challenge.
Enterprises and organizations across the globe want to harness the power of data to make better decisions by putting data at the center of every decision-making process. However, throughout history, data services have held dominion over their customers’ data.
Over the years, organizations have invested in creating purpose-built, cloud-based data lakes that are siloed from one another. A major challenge is enabling cross-organization discovery and access to data across these multiple data lakes, each built on different technology stacks.
Data is the foundation of innovation, agility, and competitive advantage in today’s digital economy. As technology and business leaders, your strategic initiatives, from AI-powered decision-making to predictive insights and personalized experiences, are all fueled by data. Data quality is no longer a back-office concern.
Amazon Redshift enables you to efficiently query and retrieve structured and semi-structured data from open-format files in an Amazon S3 data lake without having to load the data into Amazon Redshift tables. Amazon Redshift extends SQL capabilities to your data lake, enabling you to run analytical queries.
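As a rough illustration of that capability, a query over open-format files in S3 might be issued from Python through an external schema (Redshift Spectrum). The cluster endpoint, credentials, and the "datalake.sales" external table below are assumptions; the external schema is presumed to already map to files in S3.

```python
# A hedged sketch of querying open-format files in S3 through a Redshift
# external schema (Redshift Spectrum). Endpoint, credentials, and the
# "datalake.sales" external table are assumptions for illustration.
import redshift_connector

conn = redshift_connector.connect(
    host="my-cluster.abc123.us-east-1.redshift.amazonaws.com",  # assumed endpoint
    database="dev",
    user="awsuser",
    password="...",  # use Secrets Manager or IAM auth in practice
)
cursor = conn.cursor()

# The external table maps directly to files in S3; nothing is loaded
# into Redshift-managed storage.
cursor.execute("SELECT region, COUNT(*) AS orders FROM datalake.sales GROUP BY region")
for row in cursor.fetchall():
    print(row)

cursor.close()
conn.close()
```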
The Analytics specialty practice of AWS Professional Services (AWS ProServe) helps customers across the globe with modern data architecture implementations on the AWS Cloud. In this post, we discuss a common use case in relation to operational data processing and the solution we built using Apache Hudi and AWS Glue.
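For flavor, the kind of Hudi upsert of operational records such a Glue job might run could look roughly like the sketch below. The table name, record key, precombine field, and S3 path are illustrative, and the job is assumed to run with Hudi support enabled (for example, Glue's --datalake-formats hudi job parameter).

```python
# A rough sketch of a Hudi upsert for operational data. Table name, keys,
# and S3 path are illustrative assumptions, not the solution from the post.
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .appName("hudi-upsert-sketch")
    .config("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
    .getOrCreate()
)

# Two versions of the same record; the precombine field decides which wins.
incoming = spark.createDataFrame(
    [(1, "2024-01-01 00:00:00", "created"),
     (1, "2024-01-02 00:00:00", "updated")],
    ["order_id", "updated_at", "status"],
)

hudi_options = {
    "hoodie.table.name": "orders",
    "hoodie.datasource.write.recordkey.field": "order_id",
    "hoodie.datasource.write.precombine.field": "updated_at",
    "hoodie.datasource.write.operation": "upsert",
}

# Record-level upsert into the lake: the later row for order_id=1
# supersedes the earlier one.
(incoming.write.format("hudi")
    .options(**hudi_options)
    .mode("append")
    .save("s3://my-bucket/lake/orders/"))  # illustrative path
```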
Analytics remained one of the key focus areas this year, with significant updates and innovations aimed at helping businesses harness their data more efficiently and accelerate insights. From enhancing data lakes to empowering AI-driven analytics, AWS unveiled new tools and services that are set to shape the future of data and analytics.
With data becoming the driving force behind many industries today, having a modern data architecture is pivotal for organizations to be successful. In this post, we describe Orca’s journey building a transactional data lake using Amazon Simple Storage Service (Amazon S3), Apache Iceberg, and AWS Analytics.
Data is the fuel that drives government, enables transparency, and powers citizen services. Sharing data should be easy, but when agencies don’t share data or applications, they don’t have a unified view of people. Legacy data sharing involves proliferating copies of data, creating data management and security challenges.
Data-driven organizations treat data as an asset and use it across different lines of business (LOBs) to drive timely insights and better business decisions. This leads to having data across many instances of data warehouses and data lakes using a modern data architecture in separate AWS accounts.
Organizations often need to manage a high volume of data that is growing at an extraordinary rate. At the same time, they need to optimize operational costs to unlock the value of this data for timely insights and do so with a consistent performance. We think of this concept as inside-out data movement. Example Corp.
In today’s data-driven world, companies across industries recognize the immense value of data in making decisions, driving innovation, and building new products to serve their customers. ATPCO’s reach is impressive, with its fare data covering over 89% of global flight schedules.
In 2013, Amazon Web Services revolutionized the data warehousing industry by launching Amazon Redshift, the first fully managed, petabyte-scale, enterprise-grade cloud data warehouse. Amazon Redshift made it simple and cost-effective to efficiently analyze large volumes of data using existing business intelligence tools.
Poor data quality is one of the top barriers faced by organizations aspiring to be more data-driven. Ill-timed business decisions, misinformed business processes, missed revenue opportunities, failed business initiatives, and complex data systems can all stem from data quality issues.
We are excited to announce the general availability of Apache Iceberg in Cloudera Data Platform (CDP). This integration empowers analysts and data scientists to easily collaborate on the same data, with their choice of tools and analytic engines. Why integrate Apache Iceberg with Cloudera Data Platform?
This is a guest post co-written by Alex Naumov, Principal Data Architect at smava. smava believes in and takes advantage of data-driven decisions in order to become the market leader.
Cloudera customers run some of the biggest data lakes on earth. These lakes power mission-critical, large-scale data analytics, business intelligence (BI), and machine learning use cases, including enterprise data warehouses. On data warehouses and data lakes. Iterations of the lakehouse.
When global technology company Lenovo started using data analytics, it identified a new market niche for its gaming laptops and powered remote diagnostics so customers got the most from their servers and other devices.
Data governance is the collection of policies, processes, and systems that organizations use to ensure the quality and appropriate handling of their data throughout its lifecycle for the purpose of generating business value.
Data democratization, much like the term digital transformation five years ago, has become a popular buzzword throughout organizations, from IT departments to the C-suite. It’s often described as a way to simply increase data access, but the transition is about far more than that. What is data democratization?
Amazon Redshift powers data-driven decisions for tens of thousands of customers every day with a fully managed, AI-powered cloud data warehouse, delivering the best price-performance for your analytics workloads. Learn more about the AWS zero-ETL future with newly launched AWS database integrations with Amazon Redshift.
Today, customers are embarking on data modernization programs by migrating on-premises data warehouses and data lakes to the AWS Cloud to take advantage of the scale and advanced analytical capabilities of the cloud. Data parity can help build confidence and trust with business users on the quality of migrated data.
In the final part of this three-part series, we’ll explore how data mesh bolsters performance and helps organizations and data teams work more effectively. Usually, organizations will combine different domain topologies, depending on the trade-offs, and choose to focus on specific aspects of data mesh.
Data fabric and data mesh are emerging data management concepts that are meant to address the organizational change and complexities of understanding, governing, and working with enterprise data in a hybrid multicloud ecosystem. The good news is that both data architecture concepts are complementary.
Data has become an invaluable asset for businesses, offering critical insights to drive strategic decision-making and operational optimization. Today, data powers every part of the organization, from the customer-favorite online cake customization feature to democratizing data to drive business insight.
Cloudera Contributor: Mark Ramsey, PhD ~ Globally Recognized Chief Data Officer. July brings summer vacations, holiday gatherings, and for the first time in two years, the return of the Massachusetts Institute of Technology (MIT) Chief Data Officer symposium as an in-person event. Luke: What is a modern data platform?
We’re living in the age of real-time data and insights, driven by low-latency data streaming applications. The volume of time-sensitive data produced is increasing rapidly, with different formats of data being introduced across new businesses and customer use cases.
Acast uses AWS Cloud services to build data-driven products and scale engineering best practices. To ensure a sustainable data platform amid growth and profitability phases, its tech teams adopted a decentralized data mesh architecture. The solution Acast implemented is a data mesh, architected on AWS.
FinAuto is uniquely positioned to look across FinOps and provide solutions that help satisfy multiple use cases with accurate, consistent, and governed delivery of data and related services. These datasets can then be used to power front-end systems, ML pipelines, and data engineering teams.
FMs are multimodal; they work with different data types such as text, video, audio, and images. Large language models (LLMs) are a type of FM and are pre-trained on vast amounts of text data and typically have application uses such as text generation, intelligent chatbots, or summarization.
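As a hedged sketch of the text-generation use case, invoking an LLM hosted on Amazon Bedrock from Python might look like the snippet below. The region and model ID are illustrative assumptions, and any hosted LLM API would demonstrate the same pattern.

```python
# A hedged sketch of text generation with an LLM via Amazon Bedrock's
# Converse API. Region and model ID are illustrative assumptions.
import boto3

client = boto3.client("bedrock-runtime", region_name="us-east-1")

response = client.converse(
    modelId="anthropic.claude-3-haiku-20240307-v1:0",  # assumed model ID
    messages=[{
        "role": "user",
        "content": [{"text": "Summarize the benefits of open table formats in two sentences."}],
    }],
)

# The assistant's reply comes back as a list of content blocks.
print(response["output"]["message"]["content"][0]["text"])
```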
This view is used to identify patterns and trends in customer behavior, which can inform data-driven decisions to improve business outcomes. In this post, we discuss how you can use purpose-built AWS services to create an end-to-end data strategy for C360 that unifies and governs customer data and addresses these challenges.
The data ecosystem today is crowded with dazzling buzzwords, all fighting for investment dollars. A survey in 2021 found that a data company was being funded every 45 minutes. Data ecosystems have become jungles, and in spite of all the technology, data teams are struggling to create a modern data experience.
Data platform architecture has an interesting history. A read-optimized platform that can integrate data from multiple applications emerged. In another decade, the internet and mobile started to generate data of unforeseen volume, variety, and velocity. It required a different data platform solution. Guess what?
The foundation for ESG reporting, of course, is data. “Always the gatekeepers of much of the data necessary for ESG reporting, CIOs are finding that companies are even more dependent on them,” says Nancy Mentesana, ESG executive director at Labrador US, a global communications firm focused on corporate disclosure documents.
Generating actionable insights across growing data volumes and disconnected data silos is becoming increasingly challenging for organizations. Working across data islands leads to siloed thinking and the inability to implement critical business initiatives such as Customer, Product, or Asset 360. Data Fabric: Who and What?
Data mesh is still in its infancy, and data personas and organizations are craving clarity and specificity. It is critical to be aware of the “why” and “what” and fully understand the role that knowledge graphs play when considering adopting a data mesh strategy. The debate on what constitutes a data mesh rages on.
Alation launched the Data Intelligence Project in the summer of 2021 to train the next generation of data leaders. With Alation, students learn the critical skills they need to curate, govern, and discover data assets in the data-driven enterprises of today. Two data-driven careers.
Unburdening IT from infrastructure management has driven an amazing transformation; today, mission-critical applications run across 80 regions in the world, using thousands of services on over 475 instance types. Major shifts around how people use technology and data in the cloud are only just beginning. “Can we move that data?”
Moving data to the cloud can bring immense operational benefits. However, the sheer volume and complexity of today’s enterprise data can cause downstream headaches for data users. Semantics, context, and how data is tracked and used mean even more as you stretch to reach post-migration goals. Data pipeline orchestration.