2012, Data Warehouse and Metadata

2012

Data Warehouse

Metadata

Seamless integration of data lake and data warehouse using Amazon Redshift Spectrum and Amazon DataZone

AWS Big Data

AUGUST 15, 2024

Unifying these necessitates additional data processing, requiring each business unit to provision and maintain a separate data warehouse. This burdens business units focused solely on consuming the curated data for analysis and not concerned with data management tasks, cleansing, or comprehensive data processing.

Data Lake

Data Lake Data Warehouse Data Governance Publishing

Expand data access through Apache Iceberg using Delta Lake UniForm on AWS

AWS Big Data

NOVEMBER 14, 2024

You can learn how to query Delta Lake native tables through UniForm from different data warehouses or engines such as Amazon Redshift as an example of expanding data access to more engines. Both Delta Lake and Iceberg metadata files reference the same data files.

Metadata

Metadata Data Warehouse Big Data Data Lake

Join 42,000+

Insiders

This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.

Webinars

Prepare Now: 2025s Must-Know Trends For Product And Data Leaders

Marketing Operations in 2025: A New Framework for Success

MORE WEBINARS

Trending Sources

Design a data mesh pattern for Amazon EMR-based data lakes using AWS Lake Formation with Hive metastore federation

AWS Big Data

JUNE 10, 2024

Organizations have multiple Hive data warehouses across EMR clusters, where the metadata gets generated. To address this challenge, organizations can deploy a data mesh using AWS Lake Formation that connects the multiple EMR clusters. An entity can act both as a producer of data assets and as a consumer of data assets.

Data Lake

Data Lake Metadata Data Warehouse Data Processing

Webinars

Prepare Now: 2025s Must-Know Trends For Product And Data Leaders

Marketing Operations in 2025: A New Framework for Success

MORE WEBINARS

Use Apache Iceberg in your data lake with Amazon S3, AWS Glue, and Snowflake

AWS Big Data

APRIL 3, 2024

Data engineers use Apache Iceberg because it’s fast, efficient, and reliable at any scale and keeps records of how datasets change over time. Apache Iceberg offers integrations with popular data processing frameworks such as Apache Spark, Apache Flink, Apache Hive, Presto, and more.

Data Lake

Data Lake Snapshot Metadata Data Architecture

Orchestrate an end-to-end ETL pipeline using Amazon S3, AWS Glue, and Amazon Redshift Serverless with Amazon MWAA

AWS Big Data

APRIL 25, 2024

As the queries finish running, an UNLOAD operation is invoked from the Redshift data warehouse to the S3 bucket in Account A. The pipeline then starts running stored procedures and SQL commands on Redshift Serverless.

Metadata

Metadata Data Processing Management Testing

Accelerate HiveQL with Oozie to Spark SQL migration on Amazon EMR

AWS Big Data

APRIL 19, 2023

Many customers run big data workloads such as extract, transform, and load (ETL) on Apache Hive to create a data warehouse on Hadoop. We split the solution into two primary components: generating Spark job metadata and running the SQL on Amazon EMR. The script generates a metadata JSON file for each step.

Metadata

Metadata Testing Data Lake Consulting

10 Years Later: Who’s the GOAT of Data Catalogs?

Alation

DECEMBER 15, 2022

December 2012: Alation forms and goes to work creating the first enterprise data catalog. Later, in its inaugural report on data catalogs, Forrester Research recognizes that “Alation started the MLDC trend.”. August 2017: Alation debuts as a leader in the Gartner MQ for Metadata Management Solutions.

Metadata

Metadata Data Governance Data Quality Marketing

Amazon DataZone announces custom blueprints for AWS services

AWS Big Data

JUNE 26, 2024

New feature: Custom AWS service blueprints Previously, Amazon DataZone provided default blueprints that created AWS resources required for data lake, data warehouse, and machine learning use cases. You can build projects and subscribe to both unstructured and structured data assets within the Amazon DataZone portal.

Data Lake

Data Lake Data Warehouse Unstructured Data Data Governance

Best practices to implement near-real-time analytics using Amazon Redshift Streaming Ingestion with Amazon MSK

AWS Big Data

MARCH 11, 2024

Amazon Redshift is a fully managed, scalable cloud data warehouse that accelerates your time to insights with fast, straightforward, and secure analytics at scale. Tens of thousands of customers rely on Amazon Redshift to analyze exabytes of data and run complex analytical queries, making it the most widely used cloud data warehouse.

Analytics

Analytics Data Warehouse Optimization Metrics

Single sign-on with Amazon Redshift Serverless with Okta using Amazon Redshift Query Editor v2 and third-party SQL clients

AWS Big Data

MAY 4, 2023

Amazon Redshift Serverless makes it easy to run and scale analytics in seconds without the need to set up and manage data warehouse clusters. Customers use their preferred SQL clients to analyze their data in Redshift Serverless. An Redshift Serverless data warehouse. If you don’t have one, you can sign up for one.

Finance

Finance Data Warehouse Sales Metadata

Petabyte-scale log analytics with Amazon S3, Amazon OpenSearch Service, and Amazon OpenSearch Ingestion

AWS Big Data

MARCH 7, 2024

At the same time, they need to optimize operational costs to unlock the value of this data for timely insights and do so with a consistent performance. With this massive data growth, data proliferation across your data stores, data warehouse, and data lakes can become equally challenging.

Data Lake

Data Lake Analytics Dashboards Metrics

Integrate Okta with Amazon Redshift Query Editor V2 using AWS IAM Identity Center for seamless Single Sign-On

AWS Big Data

NOVEMBER 30, 2023

This integration simplifies the authentication and authorization process for Amazon Redshift users using Query Editor V2 or Amazon Quicksight , making it easier for them to securely access your data warehouse. Note: Your organization’s IdC instance must be in the same region as the Amazon Redshift data warehouse you’re connecting to.

Data Warehouse

Data Warehouse Finance Sales Management

How BMO improved data security with Amazon Redshift and AWS Lake Formation

AWS Big Data

MARCH 1, 2024

One of the bank’s key challenges related to strict cybersecurity requirements is to implement field level encryption for personally identifiable information (PII), Payment Card Industry (PCI), and data that is classified as high privacy risk (HPR). Only users with required permissions are allowed to access data in clear text.

Data Lake

Data Lake Data Warehouse Risk Management

Set up cross-account AWS Glue Data Catalog access using AWS Lake Formation and AWS IAM Identity Center with Amazon Redshift and Amazon QuickSight

AWS Big Data

AUGUST 5, 2024

These business units have varying landscapes, where a data lake is managed by Amazon Simple Storage Service (Amazon S3) and analytics workloads are run on Amazon Redshift , a fast, scalable, and fully managed cloud data warehouse that allows you to process and run your complex SQL analytics workloads on structured and semi-structured data.

Data Lake

Data Lake Finance Sales Management

Convergent Evolution

Peter James Thomas

AUGUST 18, 2018

That was the Science, here comes the Technology… A Brief Hydrology of Data Lakes. Even back then, these were used for activities such as Analytics , Dashboards , Statistical Modelling , Data Mining and Advanced Visualisation. This required additional investments in metadata. This is the essence of Convergent Evolution.

Data Lake

Data Lake Data Warehouse Data mining Statistics

How SumUp made digital analytics more accessible using AWS Glue

AWS Big Data

JUNE 6, 2023

Founded in 2012, SumUp is the financial partner for more than 4 million small merchants in over 35 markets worldwide, helping them start, run and grow their business. Unless, of course, the rest of their data also resides in the Google Cloud. This is a guest blog post by Mira Daniels and Sean Whitfield from SumUp.

Analytics

Analytics Data Lake Testing Optimization

Themes and Conferences per Pacoid, Episode 8

Domino Data Lab

APRIL 3, 2019

Data governance shows up as the fourth-most-popular kind of solution that enterprise teams were adopting or evaluating during 2019. That’s a lot of priorities – especially when you group together closely related items such as data lineage and metadata management which rank nearby. We keep feeding the monster data.

Data Governance

Data Governance Machine Learning Metadata Data Science

Data Science, Past & Future

Domino Data Lab

JULY 22, 2019

The data governance, however, is still pretty much over on the data warehouse. Toward the end of the 2000s is when you first started getting teams and industry, as Josh Willis was showing really brilliantly last night, you first started getting some teams identified as “data science” teams.

Data Science

Data Science Machine Learning Data Governance Modeling

Data Leaders Brief

Seamless integration of data lake and data warehouse using Amazon Redshift Spectrum and Amazon DataZone

Expand data access through Apache Iceberg using Delta Lake UniForm on AWS

Webinars

Trending Sources

Design a data mesh pattern for Amazon EMR-based data lakes using AWS Lake Formation with Hive metastore federation

Webinars

Use Apache Iceberg in your data lake with Amazon S3, AWS Glue, and Snowflake

Orchestrate an end-to-end ETL pipeline using Amazon S3, AWS Glue, and Amazon Redshift Serverless with Amazon MWAA

Accelerate HiveQL with Oozie to Spark SQL migration on Amazon EMR

10 Years Later: Who’s the GOAT of Data Catalogs?

Amazon DataZone announces custom blueprints for AWS services

Best practices to implement near-real-time analytics using Amazon Redshift Streaming Ingestion with Amazon MSK

Single sign-on with Amazon Redshift Serverless with Okta using Amazon Redshift Query Editor v2 and third-party SQL clients

Petabyte-scale log analytics with Amazon S3, Amazon OpenSearch Service, and Amazon OpenSearch Ingestion

Integrate Okta with Amazon Redshift Query Editor V2 using AWS IAM Identity Center for seamless Single Sign-On

How BMO improved data security with Amazon Redshift and AWS Lake Formation

Set up cross-account AWS Glue Data Catalog access using AWS Lake Formation and AWS IAM Identity Center with Amazon Redshift and Amazon QuickSight

Convergent Evolution

How SumUp made digital analytics more accessible using AWS Glue

Themes and Conferences per Pacoid, Episode 8

Data Science, Past & Future

Stay Connected