Data Warehouse, Reference and Testing

Unlocking near real-time analytics with petabytes of transaction data using Amazon Aurora Zero-ETL integration with Amazon Redshift and dbt Cloud

AWS Big Data

NOVEMBER 27, 2024

While customers can perform some basic analysis within their operational or transactional databases, many still need to build custom data pipelines that use batch or streaming jobs to extract, transform, and load (ETL) data into their data warehouse for more comprehensive analysis.

Data Warehouse

Data Warehouse Analytics Testing Modeling

Recap of Amazon Redshift key product announcements in 2024

AWS Big Data

DECEMBER 17, 2024

Amazon Redshift, launched in 2013, has undergone significant evolution since its inception, allowing customers to expand the horizons of data warehousing and SQL analytics. Industry-leading price-performance: Amazon Redshift offers up to three times better price-performance than alternative cloud data warehouses.

Data Lake

Data Lake Data Warehouse Data-driven Optimization

Write queries faster with Amazon Q generative SQL for Amazon Redshift

AWS Big Data

NOVEMBER 7, 2024

Amazon Redshift is a fully managed, AI-powered cloud data warehouse that delivers the best price-performance for your analytics workloads at any scale. Refer to Easy analytics and cost-optimization with Amazon Redshift Serverless to get started. To test this, let’s ask Amazon Q to “delete data from web_sales table.”

Metadata

Metadata Sales Data Warehouse Optimization


MLOps and DevOps: Why Data Makes It Different

O'Reilly on Data

OCTOBER 19, 2021

We need robust versioning for data, models, code, and preferably even the internal state of applications—think Git on steroids to answer inevitable questions: What changed? The applications must be integrated to the surrounding business systems so ideas can be tested and validated in the real world in a controlled manner.

IT

IT Testing Experimentation Software

Unlock the power of optimization in Amazon Redshift Serverless

AWS Big Data

MARCH 10, 2025

Although traditional scaling primarily responds to query queue times, the new AI-driven scaling and optimization feature offers a more sophisticated approach by considering multiple factors including query complexity and data volume. The following screenshots show the elapsed time breakdown for each test.

Optimization

Optimization Data Warehouse Data-driven Testing

Run Apache XTable in AWS Lambda for background conversion of open table formats

AWS Big Data

NOVEMBER 26, 2024

This post was co-written with Dipankar Mazumdar, Staff Data Engineering Advocate with AWS Partner OneHouse. Data architecture has evolved significantly to handle growing data volumes and diverse workloads. For more examples and references to other posts, refer to the following GitHub repository. create_hudi_s3.py

Metadata

Metadata Data Lake Snapshot Data Warehouse

Migrate a petabyte-scale data warehouse from Actian Vectorwise to Amazon Redshift

AWS Big Data

MAY 30, 2024

Amazon Redshift is a fast, scalable, and fully managed cloud data warehouse that allows you to process and run your complex SQL analytics workloads on structured and semi-structured data. Solution overview: Amazon Redshift is an industry-leading cloud data warehouse.

Data Warehouse

Data Warehouse Data Lake Cost-Benefit Structured Data

Implementing a Pharma Data Mesh using DataOps

DataKitchen

AUGUST 19, 2021

Each data source is updated on its own schedule, for example, daily, weekly or monthly. The DataKitchen Platform ingests data into a data lake and runs Recipes to create a data warehouse leveraged by users and self-service data analysts. The third set of domains are cached data sets (e.g.,

Data Warehouse

Data Warehouse Data Lake Manufacturing Testing

Simplify your query performance diagnostics in Amazon Redshift with Query profiler

AWS Big Data

OCTOBER 23, 2024

Amazon Redshift is a fast, scalable, secure, and fully managed cloud data warehouse that lets you analyze your data at scale. Amazon Redshift Serverless lets you access and analyze data without the usual configurations of a provisioned data warehouse. For more information, refer to Amazon Redshift clusters.

Data Warehouse

Data Warehouse Metrics Broadcasting Dashboards

The Ultimate Guide to Data Warehouse Automation and Tools

Jet Global

APRIL 19, 2021

This puts tremendous stress on the teams managing data warehouses, and they struggle to keep up with the demand for increasingly advanced analytic requests. To gather and clean data from all internal systems and gain the business insights needed to make smarter decisions, businesses need to invest in data warehouse automation.

Data Warehouse

Data Warehouse Cost-Benefit OLAP Business Intelligence

Evaluating sample Amazon Redshift data sharing architecture using Redshift Test Drive and advanced SQL analysis

AWS Big Data

SEPTEMBER 10, 2024

With the launch of Amazon Redshift Serverless and the various provisioned instance deployment options, customers are looking for tools that help them determine the optimal data warehouse configuration to support their Amazon Redshift workloads. The following image shows the process flow.

Testing

Testing Snapshot Data Warehouse Metrics

How To Succeed As a DataOps Engineer

DataKitchen

NOVEMBER 20, 2021

Many organizations take weeks to procure and prep data sets. A DataOps Engineer can make test data available on demand. If we can provide shortcuts to members of the data team, we can help improve their productivity. We have data profiling tools that we run to compare versions of datasets.
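As a rough illustration of what comparing versions of a dataset with a profiling step could look like, here is a minimal Python sketch. It is not DataKitchen's tooling: the `profile` and `diff_profiles` helpers, the column names, and the sample rows are all hypothetical and chosen only for this example.

```python
def profile(rows, columns):
    """Return the row count plus per-column null and distinct counts."""
    stats = {"row_count": len(rows)}
    for col in columns:
        values = [row.get(col) for row in rows]
        non_null = [v for v in values if v is not None]
        stats[col] = {
            "nulls": len(values) - len(non_null),
            "distinct": len(set(non_null)),
        }
    return stats


def diff_profiles(old, new):
    """Map each metric that changed to its (old, new) pair of values."""
    changes = {}
    for key in old.keys() | new.keys():
        if old.get(key) != new.get(key):
            changes[key] = (old.get(key), new.get(key))
    return changes


# Hypothetical sample data: two versions of the same small table.
v1 = [{"id": 1, "region": "EU"}, {"id": 2, "region": "US"}]
v2 = [
    {"id": 1, "region": "EU"},
    {"id": 2, "region": None},  # a null appeared in the new version
    {"id": 3, "region": "EU"},
]

changes = diff_profiles(profile(v1, ["id", "region"]), profile(v2, ["id", "region"]))
for metric, (before, after) in sorted(changes.items()):
    print(f"{metric}: {before} -> {after}")
```

Comparing summary statistics rather than raw rows keeps the check cheap, so a sketch like this could run on every refresh and flag unexpected nulls or row-count shifts before analysts see them.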

Testing

Testing Machine Learning Data Warehouse Analytics

Accelerate your data warehouse migration to Amazon Redshift – Part 7

AWS Big Data

OCTOBER 17, 2023

With Amazon Redshift, you can use standard SQL to query data across your data warehouse, operational data stores, and data lake. Migrating a data warehouse can be complex. You have to migrate terabytes or petabytes of data from your legacy system while not disrupting your production workload.

Data Warehouse

Data Warehouse Data Processing Data Lake Management

Find the best Amazon Redshift configuration for your workload using Redshift Test Drive

AWS Big Data

JULY 27, 2023

Amazon Redshift is a widely used, fully managed, petabyte-scale cloud data warehouse. Tens of thousands of customers use Amazon Redshift to process exabytes of data every day to power their analytics workloads. Amazon Redshift RA3 with managed storage is the newest instance type for Provisioned clusters.

Testing

Testing Data Warehouse Data Processing Snapshot

Introduction To The Basic Business Intelligence Concepts

datapine

MAY 9, 2019

Business intelligence concepts refer to the usage of digital computing technologies in the form of data warehouses, analytics and visualization with the aim of identifying and analyzing essential business-based data to generate new, actionable corporate insights. The data warehouse. 1) The raw data.

Business Intelligence

Business Intelligence Dashboards Data Warehouse Visualization

Amazon Q data integration adds DataFrame support and in-prompt context-aware job creation

AWS Big Data

DECEMBER 20, 2024

You can now generate data integration jobs for various data sources and destinations, including Amazon Simple Storage Service (Amazon S3) data lakes with popular file formats like CSV, JSON, and Parquet, as well as modern table formats such as Apache Hudi, Delta, and Apache Iceberg.

Data Integration

Data Integration Visualization Data Processing Data Lake

Automate deployment of an Amazon QuickSight analysis connecting to an Amazon Redshift data warehouse with an AWS CloudFormation template

AWS Big Data

FEBRUARY 16, 2023

Amazon Redshift is the most widely used data warehouse in the cloud, best suited for analyzing exabytes of data and running complex analytical queries. Amazon QuickSight is a fast business analytics service to build visualizations, perform ad hoc analysis, and quickly get business insights from your data.

Data Warehouse

Data Warehouse Sales Visualization Data Processing

Centralize near-real-time governance through alerts on Amazon Redshift data warehouses for sensitive queries

AWS Big Data

JUNE 29, 2023

Amazon Redshift is a fully managed, petabyte-scale data warehouse service in the cloud that delivers powerful and secure insights on all your data with the best price-performance. With Amazon Redshift, you can analyze your data to derive holistic insights about your business and your customers.

Data Warehouse

Data Warehouse Dashboards Testing Visualization

Liberty Mutual CIO Monica Caldas on developing a digital-savvy workforce

CIO Business Intelligence

NOVEMBER 7, 2024

If we understand the data better and derive better insights, it enables us to offer better products and services at greater speed. We have modernized most of our data warehouses, we have put in new tools and capabilities, and that’s great, because now we’re at this next inflection point of technology with gen AI.

Insurance

Insurance Experimentation Testing Technology

Enriching metadata for accurate text-to-SQL generation for Amazon Athena

AWS Big Data

OCTOBER 14, 2024

Enterprise data is brought into data lakes and data warehouses to carry out analytical, reporting, and data science use cases using AWS analytical services like Amazon Athena, Amazon Redshift, Amazon EMR, and so on. A question arises on what level of details we need to include in the table metadata.

Metadata

Metadata Data Lake Modeling Data Warehouse

Implement data warehousing solution using dbt on Amazon Redshift

AWS Big Data

NOVEMBER 17, 2023

dbt (DataBuildTool) offers this mechanism by introducing a well-structured framework for data analysis, transformation and orchestration. It also applies general software engineering principles like integrating with git repositories, setting up DRYer code, adding functional test cases, and including external libraries.

Snapshot

Snapshot Data Processing Testing Data Warehouse

Federate to Amazon Redshift Query Editor v2 with Microsoft Entra ID

AWS Big Data

DECEMBER 10, 2024

Amazon Redshift is a fast, petabyte-scale, cloud data warehouse that tens of thousands of customers rely on to power their analytics workloads. With its massively parallel processing (MPP) architecture and columnar data storage, Amazon Redshift delivers high price-performance for complex analytical queries against large datasets.

Sales

Sales Metadata Enterprise Testing

Bridging the Gap: How ‘Data in Place’ and ‘Data in Use’ Define Complete Data Observability

DataKitchen

SEPTEMBER 21, 2023

What is Data in Place? Data in Place refers to the organized structuring and storage of data within a specific storage medium, be it a database, bucket store, files, or other storage platforms. This focus on evaluation and testing should be relentless and critical in the ‘last mile’ of the Data Journey.

Testing

Testing Data Quality Predictive Modeling Metrics

What is a data architect? Skills, salaries, and how to become a data framework master

CIO Business Intelligence

OCTOBER 13, 2023

Data architect Armando Vázquez identifies eight common types of data architects: Enterprise data architect: These data architects oversee an organization’s overall data architecture, defining data architecture strategy and designing and implementing architectures.

Data Architecture

Data Architecture Data Warehouse Statistics Visualization

Amazon Redshift: Lower price, higher performance

AWS Big Data

OCTOBER 26, 2023

times better price-performance than other cloud data warehouses on real-world workloads using advanced techniques like concurrency scaling to support hundreds of concurrent users, enhanced string encoding for faster query performance, and Amazon Redshift Serverless performance enhancements. Amazon Redshift delivers up to 4.9

Data Warehouse

Data Warehouse Cost-Benefit Dashboards Optimization

How Eightfold AI implemented metadata security in a multi-tenant data analytics environment with Amazon Redshift

AWS Big Data

NOVEMBER 29, 2023

As part of the Talent Intelligence Platform Eightfold also exposes a data hub where each customer can access their Amazon Redshift-based data warehouse and perform ad hoc queries as well as schedule queries for reporting and data export. Many customers have implemented Amazon Redshift to support multi-tenant applications.

Metadata

Metadata Data Warehouse Analytics Data Analytics

Unlock scalability, cost-efficiency, and faster insights with large-scale data migration to Amazon Redshift

AWS Big Data

AUGUST 1, 2024

Large-scale data warehouse migration to the cloud is a complex and challenging endeavor that many organizations undertake to modernize their data infrastructure, enhance data management capabilities, and unlock new business opportunities. This makes sure the new data platform can meet current and future business goals.

Data Warehouse

Data Warehouse KPI Optimization Cost-Benefit

What is Data Pipeline? A Detailed Explanation

Smart Data Collective

OCTOBER 17, 2022

A point of data entry in a given pipeline. Examples of an origin include storage systems like data lakes, data warehouses and data sources that include IoT devices, transaction processing applications, APIs or social media. The final point to which the data has to be eventually transferred is a destination.
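The origin, processing, and destination roles described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `origin` generator, the two processing steps, and the in-memory list standing in for a warehouse are hypothetical and do not come from the article.

```python
from typing import Callable, Iterable, Iterator, List

Record = dict


def origin() -> Iterator[Record]:
    """Hypothetical origin: stands in for a data lake, API, or IoT feed."""
    yield {"device": "sensor-1", "temp_c": 21.5}
    yield {"device": "sensor-2", "temp_c": None}  # incomplete reading
    yield {"device": "sensor-3", "temp_c": 19.0}


def drop_incomplete(records: Iterable[Record]) -> Iterator[Record]:
    """Processing step: discard records without a temperature."""
    for record in records:
        if record["temp_c"] is not None:
            yield record


def add_fahrenheit(records: Iterable[Record]) -> Iterator[Record]:
    """Processing step: derive a Fahrenheit column from Celsius."""
    for record in records:
        yield {**record, "temp_f": record["temp_c"] * 9 / 5 + 32}


def run_pipeline(
    source: Iterable[Record],
    steps: List[Callable[[Iterable[Record]], Iterator[Record]]],
    destination: List[Record],
) -> List[Record]:
    """Chain each step over the source, then load the result into the destination."""
    stream = source
    for step in steps:
        stream = step(stream)
    destination.extend(stream)
    return destination


# The list plays the role of the destination (for example, a warehouse table).
warehouse = run_pipeline(origin(), [drop_incomplete, add_fahrenheit], [])
for row in warehouse:
    print(row)
```

Because each step is a generator, records flow from origin to destination one at a time, which mirrors how a streaming pipeline avoids holding the full dataset in memory.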

Data Warehouse

Data Warehouse Data Lake Visualization Big Data

Generic orchestration framework for data warehousing workloads using Amazon Redshift RSQL

AWS Big Data

APRIL 3, 2023

Tens of thousands of customers run business-critical workloads on Amazon Redshift, AWS’s fast, petabyte-scale cloud data warehouse delivering the best price-performance. With Amazon Redshift, you can query data across your data warehouse, operational data stores, and data lake using standard SQL.

Data Warehouse

Data Warehouse Testing Data Lake Data-driven

Apply fine-grained access and transformation on the SUPER data type in Amazon Redshift

AWS Big Data

JUNE 19, 2024

Amazon Redshift is a fast, scalable, secure, and fully managed cloud data warehouse that makes it simple and cost-effective to analyze all your data using standard SQL and your existing ETL (extract, transform, and load), business intelligence (BI), and reporting tools.

Data Warehouse

Data Warehouse Testing Sales Structured Data

How OLX Group migrated to Amazon Redshift RA3 for simpler, faster, and more cost-effective analytics

AWS Big Data

FEBRUARY 13, 2023

We live in a data-producing world, and as companies want to become data driven, there is the need to analyze more and more data. These analyses are often done using data warehouses. Status quo before migration: Here at OLX Group, Amazon Redshift has been our choice for data warehouse for over 5 years.

Snapshot

Snapshot Data Warehouse Analytics Testing

Implement disaster recovery with Amazon Redshift

AWS Big Data

JUNE 27, 2024

Amazon Redshift is a fully managed, petabyte-scale data warehouse service in the cloud. You can start with just a few hundred gigabytes of data and scale to a petabyte or more. This enables you to use your data to acquire new insights for your business and customers. For additional details, refer to Automated snapshots.

Snapshot

Snapshot Data Warehouse Data Processing Strategy

Has the Data Warehouse Had Its Day?

BI-Survey

JANUARY 15, 2023

Statements from countless interviews with our customers reveal that the data warehouse is seen as a “black box” by many and understood by few business users. Therefore, it is not clear why the costly and apparently flexibility-inhibiting data warehouse is needed at all. The limiting factor is rather the data landscape.

Data Warehouse

Data Warehouse IT Data Architecture Measurement

Power enterprise-grade Data Vaults with Amazon Redshift – Part 1

AWS Big Data

NOVEMBER 16, 2023

Amazon Redshift is a popular cloud data warehouse, offering a fully managed cloud-based service that seamlessly integrates with an organization’s Amazon Simple Storage Service (Amazon S3) data lake, real-time streams, machine learning (ML) workflows, transactional workflows, and much more—all while providing up to 7.9x

Enterprise

Enterprise Data Warehouse Data Lake Optimization

Perform data parity at scale for data modernization programs using AWS Glue Data Quality

AWS Big Data

OCTOBER 9, 2024

Today, customers are embarking on data modernization programs by migrating on-premises data warehouses and data lakes to the AWS Cloud to take advantage of the scale and advanced analytical capabilities of the cloud. In the following sections, we showcase how to configure an AWS Glue Data Quality job for comparison.

Data Quality

Data Quality Data Lake Data Warehouse Metrics

Build a secure data visualization application using the Amazon Redshift Data API with AWS IAM Identity Center

AWS Big Data

MARCH 6, 2025

Tens of thousands of customers use Amazon Redshift for modern data analytics at scale, delivering up to three times better price-performance and seven times better throughput than other cloud data warehouses. Refer to IAM Identity Center identity source tutorials for the IdP setup. IAM Identity Center enabled.

Visualization

Visualization Sales Data Warehouse Management

The Ultimate Guide to Modern Data Quality Management (DQM) For An Effective Data Quality Control Driven by The Right Metrics

datapine

SEPTEMBER 29, 2022

Reporting being part of an effective DQM, we will also go through some data quality metrics examples you can use to assess your efforts in the matter. But first, let’s define what data quality actually is. What is the definition of data quality? Why Do You Need Data Quality Management?

Data Quality

Data Quality Metrics Data-driven Management

Successfully conduct a proof of concept in Amazon Redshift

AWS Big Data

MARCH 27, 2024

Amazon Redshift is a fast, scalable, and fully managed cloud data warehouse that allows you to process and run your complex SQL analytics workloads on structured and semi-structured data. Complete the implementation tasks such as data ingestion and performance testing.

Testing

Testing Data Warehouse Metrics Cost-Benefit

Enrich your customer data with geospatial insights using Amazon Redshift, AWS Data Exchange, and Amazon QuickSight

AWS Big Data

MARCH 18, 2024

Load generic address data to Amazon Redshift: Amazon Redshift is a fully managed, petabyte-scale data warehouse service in the cloud. Redshift Serverless makes it straightforward to run analytics workloads of any size without having to manage data warehouse infrastructure.

Data Warehouse

Data Warehouse Visualization Snapshot Data-driven

Migrate from Google BigQuery to Amazon Redshift using AWS Glue and Custom Auto Loader Framework

AWS Big Data

JUNE 2, 2023

Amazon Redshift is a widely used, fully managed, petabyte-scale cloud data warehouse. Tens of thousands of customers use Amazon Redshift to process exabytes of data every day to power their analytic workloads. For more information, refer to Migrate Google BigQuery to Amazon Redshift using AWS Schema Conversion tool (SCT).

Metadata

Metadata Data Warehouse Big Data Analytics

Access Amazon Redshift data from Salesforce Data Cloud with Zero Copy Data Federation

AWS Big Data

JUNE 25, 2024

This post is co-authored by Vijay Gopalakrishnan, Director of Product, Salesforce Data Cloud. In today’s data-driven business landscape, organizations collect a wealth of data across various touch points and unify it in a central data warehouse or a data lake to deliver business insights.

Data Lake

Data Lake Cost-Benefit Data-driven Data Warehouse

Migrate Microsoft Azure Synapse Analytics to Amazon Redshift using AWS SCT

AWS Big Data

OCTOBER 18, 2023

Amazon Redshift is a fast, fully managed, petabyte-scale data warehouse that provides the flexibility to use provisioned or serverless compute for your analytical workloads. You can get faster insights without spending valuable time managing your data warehouse. Fault tolerance is built in.

Analytics

Analytics Data Warehouse Dashboards Testing

Amazon Redshift announcements at AWS re:Invent 2023 to enable analytics on all your data

AWS Big Data

NOVEMBER 29, 2023

In 2013, Amazon Web Services revolutionized the data warehousing industry by launching Amazon Redshift, the first fully-managed, petabyte-scale, enterprise-grade cloud data warehouse. Amazon Redshift made it simple and cost-effective to efficiently analyze large volumes of data using existing business intelligence tools.

Data Warehouse

Data Warehouse Analytics Data Lake Machine Learning

Enhance data security and governance for Amazon Redshift Spectrum with VPC endpoints

AWS Big Data

FEBRUARY 16, 2024

Many customers are extending their data warehouse capabilities to their data lake with Amazon Redshift. They are looking to further enhance their security posture where they can enforce access policies on their data lakes based on Amazon Simple Storage Service (Amazon S3).

Data Lake

Data Lake Data Warehouse Testing Business Objectives
