Data Warehouse, Metadata and Strategy

Enriching metadata for accurate text-to-SQL generation for Amazon Athena

AWS Big Data

OCTOBER 14, 2024

Enterprise data is brought into data lakes and data warehouses to carry out analytical, reporting, and data science use cases using AWS analytical services like Amazon Athena, Amazon Redshift, Amazon EMR, and so on. Subsequently, we’ll explore strategies for overcoming these challenges.

Metadata

Metadata Data Lake Modeling Data Warehouse

Expand data access through Apache Iceberg using Delta Lake UniForm on AWS

AWS Big Data

NOVEMBER 14, 2024

You can learn how to query Delta Lake native tables through UniForm from different data warehouses or engines such as Amazon Redshift as an example of expanding data access to more engines. Both Delta Lake and Iceberg metadata files reference the same data files.

Metadata

Metadata Data Warehouse Big Data Data Lake

Recap of Amazon Redshift key product announcements in 2024

AWS Big Data

DECEMBER 17, 2024

Amazon Redshift, launched in 2013, has undergone significant evolution since its inception, allowing customers to expand the horizons of data warehousing and SQL analytics. Industry-leading price-performance: Amazon Redshift offers up to three times better price-performance than alternative cloud data warehouses.

Data Lake

Data Lake Data Warehouse Data-driven Optimization

7 Benefits of Metadata Management

erwin

FEBRUARY 19, 2021

Metadata management is key to wringing all the value possible from data assets. However, most organizations don’t use all the data at their disposal to reach deeper conclusions about how to drive revenue, achieve regulatory compliance or accomplish other strategic objectives. What Is Metadata? Harvest data.

Metadata

Metadata Management Data Quality Cost-Benefit

7 enterprise data strategy trends

CIO Business Intelligence

NOVEMBER 22, 2022

Every enterprise needs a data strategy that clearly defines the technologies, processes, people, and rules needed to safely and securely manage its information assets and practices. Here’s a quick rundown of seven major trends that will likely reshape your organization’s current data strategy in the days and months ahead.

Data Strategy

Data Strategy Strategy Enterprise Consulting

How ANZ Institutional Division built a federated data platform to enable their domain teams to build data products to support business outcomes

AWS Big Data

DECEMBER 4, 2024

This post explores how the shift to a data product mindset is being implemented, the challenges faced, and the early wins that are shaping the future of data management in the Institutional Division. The following diagram illustrates the building blocks of the Institutional Data & AI Platform.

Metadata

Metadata Data Governance Data Quality Data-driven

Four Use Cases Proving the Benefits of Metadata-Driven Automation

erwin

FEBRUARY 7, 2019

Organizations cannot hope to make the most of a data-driven strategy without at least some degree of metadata-driven automation. The volume and variety of data has snowballed, and so has its velocity. That’s time needed and better used for data analysis. Metadata-Driven Automation in the BFSI Industry.

Metadata

Metadata Insurance Data-driven Cost-Benefit

What Is a Metadata Management Tool?

Octopai

DECEMBER 12, 2021

What enables you to use all those gigabytes and terabytes of data you’ve collected? Metadata is the pertinent, practical details about data assets: what they are, what to use them for, what to use them with. Without metadata, data is just a heap of numbers and letters collecting dust. Where does metadata come from?

Metadata

Metadata Management Data Quality Data Governance

Metadata-Driven Data Warehouses are Ideal

TDAN

APRIL 6, 2021

A metadata-driven data warehouse (MDW) offers a modern approach that is designed to make EDW development much more simplified and faster. It makes use of metadata (data about your data) as its foundation and combines data modeling and ETL functionalities to build data warehouses.

Data Warehouse

Data Warehouse Metadata Data-driven Modeling

How Metadata Makes Data Meaningful

erwin

DECEMBER 12, 2019

Metadata is an important part of data governance, and as a result, most nascent data governance programs are rife with project plans for assessing and documenting metadata. But in many scenarios, it seems that the underlying driver of metadata collection projects is that it’s just something you do for data governance.

Metadata

Metadata Data Governance Digital Transformation Data Quality

Data’s dark secret: Why poor quality cripples AI and growth

CIO Business Intelligence

APRIL 8, 2025

I aim to outline pragmatic strategies to elevate data quality into an enterprise-wide capability. We also examine how centralized, hybrid and decentralized data architectures support scalable, trustworthy ecosystems. This challenge remains deceptively overlooked despite its profound impact on strategy and execution.

Data Quality

Data Quality Data-driven Key Performance Indicator Metadata

Constructing A Digital Transformation Strategy: Putting the Data in Digital Transformation

erwin

JULY 17, 2019

Having a clearly defined digital transformation strategy is an essential best practice for successful digital transformation. But what makes a viable digital transformation strategy? Constructing A Digital Transformation Strategy: Data Enablement. With automation, data quality is systemically assured.

Digital Transformation

Digital Transformation Strategy Metadata Data-driven

What is Oracle’s generative AI strategy?

CIO Business Intelligence

JULY 6, 2023

While Microsoft, AWS, Google Cloud, and IBM have already released their generative AI offerings, rival Oracle has so far been largely quiet about its own strategy. The metadata from these applications will help the language model identify trends and understand patterns across a particular SaaS offering, Batta added.

Strategy

Strategy Metadata Enterprise Modeling

Top analytics announcements of AWS re:Invent 2024

AWS Big Data

FEBRUARY 26, 2025

Amazon SageMaker Lakehouse provides an open data architecture that reduces data silos and unifies data across Amazon Simple Storage Service (Amazon S3) data lakes, Redshift data warehouses, and third-party and federated data sources.

Analytics

Analytics Data Lake Metadata Data Warehouse

Cloud Data Warehouse Migration 101: Expert Tips

Alation

JULY 28, 2022

It’s costly and time-consuming to manage on-premises data warehouses — and modern cloud data architectures can deliver business agility and innovation. However, CIOs declare that agility, innovation, security, adopting new capabilities, and time to value — never cost — are the top drivers for cloud data warehousing.

Data Warehouse

Data Warehouse Cost-Benefit Data-driven Data Governance

Create an end-to-end data strategy for Customer 360 on AWS

AWS Big Data

MARCH 26, 2024

A Gartner Marketing survey found only 14% of organizations have successfully implemented a C360 solution, due to lack of consensus on what a 360-degree view means, challenges with data quality, and lack of cross-functional governance structure for customer data. Then, you transform this data into a concise format.

Data Strategy

Data Strategy Strategy Data Warehouse Prescriptive Analytics

Do I Need a Data Catalog?

erwin

JUNE 26, 2020

If you’re serious about a data-driven strategy, you’re going to need a data catalog. Organizations need a data catalog because it enables them to create a seamless way for employees to access and consume data and business assets in an organized manner. Three Types of Metadata in a Data Catalog.

Metadata

Metadata Cost-Benefit Measurement Data-driven

A Cost-Effective Data Warehouse Solution in CDP Public Cloud – Part1

Cloudera

FEBRUARY 9, 2021

Today’s customers have a growing need for faster end-to-end data ingestion to meet the expected speed of insights and overall business demand. This ‘need for speed’ drives a rethink on building a more modern data warehouse solution, one that balances speed with platform cost management, performance, and reliability.

Data Warehouse

Data Warehouse Cost-Benefit Metadata Management

How Cloudinary transformed their petabyte scale streaming data lake with Apache Iceberg and AWS Analytics

AWS Big Data

JUNE 10, 2024

A modern data strategy redefines and enables sharing data across the enterprise and allows for both reading and writing of a singular instance of the data using an open table format. When evolving such a partition definition, the data in the table prior to the change is unaffected, as is its metadata.

Data Lake

Data Lake Metadata Snapshot Analytics

When is data too clean to be useful for enterprise AI?

CIO Business Intelligence

NOVEMBER 27, 2024

Once the province of the data warehouse team, data management has increasingly become a C-suite priority, with data quality seen as key for both customer experience and business performance. But along with siloed data and compliance concerns, poor data quality is holding back enterprise AI projects.

Enterprise

Enterprise Data Quality Structured Data Modeling

Dark Data: How to Find It and What to Do with It

Timo Elliott

JANUARY 6, 2022

The data you’ve collected and saved over the years isn’t free. If storage costs are escalating in a particular area, you may have found a good source of dark data. Analyze your metadata. If you’ve yet to implement data governance, this is another great reason to get moving quickly. Data sense-making.

IT

IT Metadata Data-driven Data Governance

A Few Proven Suggestions for Handling Large Data Sets

Smart Data Collective

SEPTEMBER 26, 2021

Data is processed to generate information, which can be later used for creating better business strategies and increasing the company’s competitive edge. It’s obvious that you’ll want to use big data, but it’s not so obvious how you’re going to work with it. Preserve information: Keep your raw data raw.

Metadata

Metadata Visualization Unstructured Data Data mining

What is a data architect? Skills, salaries, and how to become a data framework master

CIO Business Intelligence

OCTOBER 13, 2023

The data architect also “provides a standard common business vocabulary, expresses strategic requirements, outlines high-level integrated designs to meet those requirements, and aligns with enterprise strategy and related business architecture,” according to DAMA International’s Data Management Body of Knowledge.

Data Architecture

Data Architecture Data Warehouse Statistics Visualization

How to Build a Performant Data Warehouse in Redshift

Sisense

SEPTEMBER 3, 2019

This blog is intended to give an overview of the considerations you’ll want to make as you build your Redshift data warehouse to ensure you are getting the optimal performance. This results in fewer joins between the metric data in fact tables and the dimensions. So let’s dive in! OLTP vs OLAP.

Data Warehouse

Data Warehouse OLAP Statistics Cost-Benefit

Data governance in the age of generative AI

AWS Big Data

FEBRUARY 29, 2024

Data is your generative AI differentiator, and a successful generative AI implementation depends on a robust data strategy incorporating a comprehensive data governance approach. Data discoverability Unlike structured data, which is managed in well-defined rows and columns, unstructured data is stored as objects.

Data Governance

Data Governance Unstructured Data Metadata Data Lake

SAP Datasphere review: turning data from a technical problem to a business data product.

Jen Stirrup

MARCH 29, 2023

What does a sound, intelligent data foundation give you? It can give business leaders a business-oriented data strategy to help drive better business decisions and ROI. It can also increase productivity by enabling business teams to find the data they need when they need it.

Data Warehouse

Data Warehouse Metadata Data Integration Business Intelligence

Amazon Redshift announcements at AWS re:Invent 2023 to enable analytics on all your data

AWS Big Data

NOVEMBER 29, 2023

In 2013, Amazon Web Services revolutionized the data warehousing industry by launching Amazon Redshift, the first fully-managed, petabyte-scale, enterprise-grade cloud data warehouse. Amazon Redshift made it simple and cost-effective to efficiently analyze large volumes of data using existing business intelligence tools.

Data Warehouse

Data Warehouse Analytics Data Lake Machine Learning

Query your Iceberg tables in data lake using Amazon Redshift (Preview)

AWS Big Data

AUGUST 31, 2023

Amazon Redshift is a fast, fully managed petabyte-scale cloud data warehouse that makes it simple and cost-effective to analyze all your data using standard SQL and your existing business intelligence (BI) tools. Amazon Redshift also supports querying nested data with complex data types such as struct, array, and map.

Data Lake

Data Lake Data Warehouse Metadata Data Architecture

Building a Beautiful Data Lakehouse

CIO Business Intelligence

MARCH 9, 2022

But the data repository options that have been around for a while tend to fall short in their ability to serve as the foundation for big data analytics powered by AI. Traditional data warehouses, for example, support datasets from multiple sources but require a consistent data structure.

Data Lake

Data Lake Unstructured Data Data Warehouse Big Data

The Increasing Importance of Open Table Formats

David Menninger's Analyst Perspectives

OCTOBER 31, 2024

It was not until the addition of open table formats— specifically Apache Hudi, Apache Iceberg and Delta Lake—that data lakes truly became capable of supporting multiple business intelligence (BI) projects as well as data science and even operational applications and, in doing so, began to evolve into data lakehouses.

Data Lake

Data Lake Unstructured Data Data Warehouse Software

SAP rounds out data warehouse cloud functionality, renamed Datasphere

CIO Business Intelligence

MARCH 8, 2023

SAP’s Data Warehouse Cloud is evolving, gaining new features and a new name, Datasphere, as the company addresses continued diversification of the enterprise data. Khan said SAP is going beyond the usual capabilities of a data fabric by preserving the business context of the data it carries.

Data Warehouse

Data Warehouse Metadata Enterprise Machine Learning

Use Apache Iceberg in a data lake to support incremental data processing

AWS Big Data

MARCH 2, 2023

Apache Iceberg is an open table format for very large analytic datasets, which captures metadata information on the state of datasets as they evolve and change over time. Iceberg has become very popular for its support for ACID transactions in data lakes and features like schema and partition evolution, time travel, and rollback.

Data Lake

Data Lake Data Processing Metadata Snapshot

AWS re:Invent 2023 Amazon Redshift Sessions Recap

AWS Big Data

DECEMBER 18, 2023

Amazon Redshift powers data-driven decisions for tens of thousands of customers every day with a fully managed, AI-powered cloud data warehouse, delivering the best price-performance for your analytics workloads.

Data Warehouse

Data Warehouse Machine Learning Data-driven Data Lake

CIOs are (still) closer than ever to their dream data lakehouse

CIO Business Intelligence

OCTOBER 15, 2024

“I do think the acquisition has been a bit of a distraction, but that’s probably true anytime that kind of money starts moving around,” David Nalley, director of open-source strategy and marketing at Amazon Web Services, told me. There’s been a ton of innovation lately around the Iceberg REST catalog because the data turf war is over.

Metadata

Metadata Data Processing Uncertainty Data Warehouse

A hybrid approach in healthcare data warehousing with Amazon Redshift

AWS Big Data

FEBRUARY 21, 2023

Data warehouses play a vital role in healthcare decision-making and serve as a repository of historical data. A healthcare data warehouse can be a single source of truth for clinical quality control systems. What is a dimensional data model? What is a data vault?

Data Warehouse

Data Warehouse Data Lake Cost-Benefit Metadata

Build a multi-Region and highly resilient modern data architecture using AWS Glue and AWS Lake Formation

AWS Big Data

JANUARY 24, 2023

This solution only replicates metadata in the Data Catalog, not the actual underlying data. To have a redundant data lake using Lake Formation and AWS Glue in an additional Region, we recommend replicating the Amazon S3-based storage using S3 replication, S3 sync, aws-s3-copy-sync-using-batch or S3 Batch replication process.

Data Architecture

Data Architecture Metadata Data Lake Snapshot

Governing data in relational databases using Amazon DataZone

AWS Big Data

MAY 7, 2024

Amazon DataZone allows you to simply and securely govern end-to-end data assets stored in your Amazon Redshift data warehouses or data lakes cataloged with the AWS Glue data catalog. The producer also needs to manage and publish the data asset so it’s discoverable throughout the organization.

Metadata

Metadata Data Lake Data Processing Data-driven

Biggest Trends in Data Visualization Taking Shape in 2022

Smart Data Collective

OCTOBER 13, 2021

Some solutions provide read and write access to any type of source and information, advanced integration, security capabilities and metadata management that help achieve virtual and high-performance Data Services in real-time, cache or batch mode. How does Data Virtualization complement Data Warehousing and SOA Architectures?

Visualization

Visualization Cost-Benefit Big Data Prescriptive Analytics

Unlock data across organizational boundaries using Amazon DataZone – now generally available

AWS Big Data

OCTOBER 4, 2023

An Amazon DataZone domain contains an associated business data catalog for search and discovery, a set of metadata definitions to decorate the data assets that are used for discovery purposes, and data projects with integrated analytics and ML tools for users and groups to consume and publish data assets.

Metadata

Metadata Data Lake Publishing Data Governance

Get Your Analytics Insights Instantly – Without Abandoning Central IT

Cloudera

JANUARY 21, 2021

While cloud-native, point-solution data warehouse services may serve your immediate business needs, there are dangers to the corporation as a whole when you do your own IT this way. Cloudera Data Warehouse (CDW) is here to save the day! CDW is an integrated data warehouse service within Cloudera Data Platform (CDP).

Data Warehouse

Data Warehouse Data Lake IT Analytics

Dive deep into security management: The Data on EKS Platform

AWS Big Data

APRIL 29, 2024

We show how Ranger integrates with Hadoop components like Apache Hive, Spark, Trino, Yarn, and HDFS, providing secure and efficient data management in a cloud environment. Join us as we navigate these advanced security strategies in the context of Kubernetes and cloud computing.

Management

Management Big Data Data Warehouse Metadata

Build a serverless transactional data lake with Apache Iceberg, Amazon EMR Serverless, and Amazon Athena

AWS Big Data

MARCH 10, 2023

They enable transactions on top of data lakes and can simplify data storage, management, ingestion, and processing. These transactional data lakes combine features from both the data lake and the data warehouse. One important aspect of a successful data strategy for any organization is data governance.

Data Lake

Data Lake Sales Data Warehouse Snapshot

What is a business intelligence analyst? A key role for data-driven decisions

CIO Business Intelligence

OCTOBER 26, 2023

The role is becoming increasingly important as organizations move to capitalize on the volumes of data they collect through business intelligence strategies. It’s a role that combines hard skills such as programming, data modeling, and statistics with soft skills such as communication, analytical thinking, and problem-solving.

Business Intelligence

Business Intelligence Data-driven Statistics Data Warehouse
