Data Processing, Data Warehouse and Testing

Unlocking near real-time analytics with petabytes of transaction data using Amazon Aurora Zero-ETL integration with Amazon Redshift and dbt Cloud

AWS Big Data

NOVEMBER 27, 2024

While customers can perform some basic analysis within their operational or transactional databases, many still need to build custom data pipelines that use batch or streaming jobs to extract, transform, and load (ETL) data into their data warehouse for more comprehensive analysis.

Data Warehouse

Data Warehouse Analytics Testing Sales

Oracle Wants to Be the Database for AI

David Menninger's Analyst Perspectives

MAY 15, 2025

Oracle recently hosted its annual Database Analyst Summit, sharing the vision and strategy for its data platform. While much of the event was under non-disclosure as product plans and launch schedules are finalized, it still served as a useful recap of the broad portfolio of data platform capabilities that Oracle has to offer.

Data Lake

Data Lake Data Warehouse Machine Learning Software

The DataOps Vendor Landscape, 2021

DataKitchen

APRIL 13, 2021

Testing and Data Observability. We have also included vendors for the specific use cases of ModelOps, MLOps, DataGovOps, and DataSecOps, which apply DataOps principles to machine learning, AI, data governance, and data security operations. Genie: distributed big data orchestration service by Netflix.

Testing

Testing Machine Learning Consulting Data Science

Accelerate Offloading to Cloudera Data Warehouse (CDW) with Procedural SQL Support

Cloudera

JULY 16, 2021

Did you know Cloudera customers, such as SMG and Geisinger, offloaded their legacy DW environment to Cloudera Data Warehouse (CDW) to take advantage of CDW’s modern architecture and best-in-class performance? The Data Warehouse on Cloudera Data Platform provides easy-to-use self-service and advanced analytics use cases at scale.

Data Warehouse

Data Warehouse Data Processing Management Testing

Migrate a petabyte-scale data warehouse from Actian Vectorwise to Amazon Redshift

AWS Big Data

MAY 30, 2024

Amazon Redshift is a fast, scalable, and fully managed cloud data warehouse that allows you to process and run your complex SQL analytics workloads on structured and semi-structured data. The system had an integration with legacy backend services that were all hosted on premises. The downside here is over-provisioning.

Data Warehouse

Data Warehouse Data Lake Cost-Benefit Structured Data

The future of data: A 5-pillar approach to modern data management

CIO Business Intelligence

DECEMBER 11, 2024

Manish Limaye. Pillar #1: Data platform. The data platform pillar comprises tools, frameworks, and processing and hosting technologies that enable an organization to process large volumes of data, both in batch and streaming modes. The choice of vendors should align with the broader cloud or on-premises strategy.

Management

Management Data Governance Data Science Reporting

5 Advantages of Using a Redshift Data Warehouse

Sisense

MARCH 19, 2019

To extract the maximum value from your data, it needs to be accessible, well-sorted, and easy to manipulate and store. Amazon’s Redshift data warehouse tools offer such a blend of features, but even so, it’s important to understand what it brings to the table before making a decision to integrate the system.

Data Warehouse

Data Warehouse Cost-Benefit Business Intelligence Data Processing

Find the best Amazon Redshift configuration for your workload using Redshift Test Drive

AWS Big Data

JULY 27, 2023

Amazon Redshift is a widely used, fully managed, petabyte-scale cloud data warehouse. Tens of thousands of customers use Amazon Redshift to process exabytes of data every day to power their analytics workloads. Amazon Redshift RA3 with managed storage is the newest instance type for Provisioned clusters.

Testing

Testing Data Warehouse Data Processing Snapshot

Amazon Q data integration adds DataFrame support and in-prompt context-aware job creation

AWS Big Data

DECEMBER 20, 2024

You can now generate data integration jobs for various data sources and destinations, including Amazon Simple Storage Service (Amazon S3) data lakes with popular file formats like CSV, JSON, and Parquet, as well as modern table formats such as Apache Hudi, Delta, and Apache Iceberg.

Data Integration

Data Integration Visualization Data Processing Big Data

Evaluating sample Amazon Redshift data sharing architecture using Redshift Test Drive and advanced SQL analysis

AWS Big Data

SEPTEMBER 10, 2024

With the launch of Amazon Redshift Serverless and the various provisioned instance deployment options, customers are looking for tools that help them determine the optimal data warehouse configuration to support their Amazon Redshift workloads. The following image shows the process flow.

Testing

Testing Snapshot Data Warehouse Metrics

Accelerate your data warehouse migration to Amazon Redshift – Part 7

AWS Big Data

OCTOBER 17, 2023

With Amazon Redshift, you can use standard SQL to query data across your data warehouse, operational data stores, and data lake. Migrating a data warehouse can be complex. You have to migrate terabytes or petabytes of data from your legacy system while not disrupting your production workload.

Data Warehouse

Data Warehouse Data Processing Data Lake Management

Introduction To The Basic Business Intelligence Concepts

datapine

MAY 9, 2019

Business intelligence concepts refer to the usage of digital computing technologies in the form of data warehouses, analytics and visualization with the aim of identifying and analyzing essential business-based data to generate new, actionable corporate insights. The data warehouse. 1) The raw data.

Business Intelligence

Business Intelligence Dashboards Data Warehouse Visualization

Automate deployment of an Amazon QuickSight analysis connecting to an Amazon Redshift data warehouse with an AWS CloudFormation template

AWS Big Data

FEBRUARY 16, 2023

Amazon Redshift is the most widely used data warehouse in the cloud, best suited for analyzing exabytes of data and running complex analytical queries. Amazon QuickSight is a fast business analytics service to build visualizations, perform ad hoc analysis, and quickly get business insights from your data.

Data Warehouse

Data Warehouse Sales Visualization Data Processing

Federate to Amazon Redshift Query Editor v2 with Microsoft Entra ID

AWS Big Data

DECEMBER 10, 2024

Amazon Redshift is a fast, petabyte-scale, cloud data warehouse that tens of thousands of customers rely on to power their analytics workloads. With its massively parallel processing (MPP) architecture and columnar data storage, Amazon Redshift delivers high price-performance for complex analytical queries against large datasets.

Sales

Sales Metadata Enterprise Testing

Build a secure data visualization application using the Amazon Redshift Data API with AWS IAM Identity Center

AWS Big Data

MARCH 6, 2025

Tens of thousands of customers use Amazon Redshift for modern data analytics at scale, delivering up to three times better price-performance and seven times better throughput than other cloud data warehouses. The application has been tested successfully with versions v3.12.8 ... Create an OIDC IdP on the IAM console.

Visualization

Visualization Sales Data Warehouse Management

Implement data warehousing solution using dbt on Amazon Redshift

AWS Big Data

NOVEMBER 17, 2023

dbt (DataBuildTool) offers this mechanism by introducing a well-structured framework for data analysis, transformation and orchestration. It also applies general software engineering principles like integrating with git repositories, setting up DRYer code, adding functional test cases, and including external libraries.

Snapshot

Snapshot Data Processing Testing Data Warehouse

What Is Ad Hoc Reporting? Your Guide To Definition, Meaning, Examples & Benefits

datapine

JULY 1, 2020

Moreover, a host of ad hoc analysis or reporting platforms boast integrated online data visualization tools to help enhance the data exploration process. Retail: Ad hoc data analysis proves particularly effective in loss prevention in the retail sector. Ad hoc analysis has served to revolutionize the healthcare sector.

Reporting

Reporting Dashboards Cost-Benefit Visualization

Unlock scalability, cost-efficiency, and faster insights with large-scale data migration to Amazon Redshift

AWS Big Data

AUGUST 1, 2024

Large-scale data warehouse migration to the cloud is a complex and challenging endeavor that many organizations undertake to modernize their data infrastructure, enhance data management capabilities, and unlock new business opportunities. This makes sure the new data platform can meet current and future business goals.

Data Warehouse

Data Warehouse KPI Optimization Cost-Benefit

Design a data mesh pattern for Amazon EMR-based data lakes using AWS Lake Formation with Hive metastore federation

AWS Big Data

JUNE 10, 2024

One of the key challenges in modern big data management is facilitating efficient data sharing and access control across multiple EMR clusters. Organizations have multiple Hive data warehouses across EMR clusters, where the metadata gets generated. Test access using Athena queries in the consumer account.

Data Lake

Data Lake Metadata Data Warehouse Data Processing

Implement disaster recovery with Amazon Redshift

AWS Big Data

JUNE 27, 2024

Amazon Redshift is a fully managed, petabyte-scale data warehouse service in the cloud. You can start with just a few hundred gigabytes of data and scale to a petabyte or more. This enables you to use your data to acquire new insights for your business and customers. Document the entire disaster recovery process.

Snapshot

Snapshot Data Warehouse Data Processing Strategy

Migrate Microsoft Azure Synapse Analytics to Amazon Redshift using AWS SCT

AWS Big Data

OCTOBER 18, 2023

Amazon Redshift is a fast, fully managed, petabyte-scale data warehouse that provides the flexibility to use provisioned or serverless compute for your analytical workloads. You can get faster insights without spending valuable time managing your data warehouse. Fault tolerance is built in. Choose Create workgroup.

Analytics

Analytics Data Warehouse Dashboards Testing

How smava makes loans transparent and affordable using Amazon Redshift Serverless

AWS Big Data

DECEMBER 21, 2023

To speed up the self-service analytics and foster innovation based on data, a solution was needed to provide ways to allow any team to create data products on their own in a decentralized manner. To create and manage the data products, smava uses Amazon Redshift, a cloud data warehouse.

Data Lake

Data Lake Data Warehouse Data-driven B2B

Drinking our own champagne – Cloudera upgrades to CDP Private Cloud

Cloudera

APRIL 21, 2021

We did add some additional capacity to make parts of the testing and validation process easier, but many clusters can upgrade with no additional hardware. Part of the reason we run a single multi-tenant cluster is to make it possible to join data from different departments and get a full picture of our business. Life on CDP.

Testing

Testing Data Processing Interactive Data Warehouse

Power enterprise-grade Data Vaults with Amazon Redshift – Part 2

AWS Big Data

NOVEMBER 16, 2023

Amazon Redshift is a popular cloud data warehouse, offering a fully managed cloud-based service that seamlessly integrates with an organization’s Amazon Simple Storage Service (Amazon S3) data lake, real-time streams, machine learning (ML) workflows, transactional workflows, and much more—all while providing up to 7.9x

Enterprise

Enterprise Data Warehouse Snapshot Cost-Benefit

Top 15 data management platforms

CIO Business Intelligence

JUNE 9, 2022

All this data arrives by the terabyte, and a data management platform can help marketers make sense of it all. Marketing-focused or not, DMPs excel at negotiating with a wide array of databases, data lakes, or data warehouses, ingesting their streams of data and then cleaning, sorting, and unifying the information therein.

Management

Management Advertising Data Lake Sales

Governing data in relational databases using Amazon DataZone

AWS Big Data

MAY 7, 2024

It also makes it easier for engineers, data scientists, product managers, analysts, and business users to access data throughout an organization to discover, use, and collaborate to derive data-driven insights. The architecture illustrates how the solution works in a multi-account environment, which is a common scenario.

Metadata

Metadata Data Lake Data Processing Data-driven

How Gilead used Amazon Redshift to quickly and cost-effectively load third-party medical claims data

AWS Big Data

NOVEMBER 8, 2023

Because Gilead is expanding into biologics and large molecule therapies, and has an ambitious goal of launching 10 innovative therapies by 2030, there is heavy emphasis on using data with AI and machine learning (ML) to accelerate the drug discovery pipeline. This data volume is expected to increase monthly and is fully refreshed each month.

Data Lake

Data Lake Data Warehouse Cost-Benefit Optimization

Break data silos and stream your CDC data with Amazon Redshift streaming and Amazon MSK

AWS Big Data

DECEMBER 13, 2023

A CDC-based approach captures the data changes and makes them available in data warehouses for further analytics in real-time. usually a data warehouse) needs to reflect those changes in near real-time. This post showcases how to use streaming ingestion to bring data to Amazon Redshift.

Data Warehouse

Data Warehouse Snapshot Data Processing Internet of Things

Integrate Tableau and Okta with Amazon Redshift using AWS IAM Identity Center

AWS Big Data

JUNE 3, 2024

Amazon Redshift is a fast, scalable cloud data warehouse built to serve workloads at any scale. This integration positions Amazon Redshift as an IAM Identity Center-managed application, enabling you to use database role-based access control on your data warehouse for enhanced security. Choose OAuth Config File.

Data Warehouse

Data Warehouse Reporting Testing Publishing

BusinessObjects in the Cloud – No Big Rush and No Big Deal

Paul Blogs on BI

SEPTEMBER 8, 2021

Well firstly, if the main data warehouses, repositories, or application databases that BusinessObjects accesses are on premise, it makes no sense to move BusinessObjects to the cloud until you move its data sources to the cloud. You also have the option of hosting with a third party.

Data Warehouse

Data Warehouse Data Processing Data Lake Testing

From Excel to AI: How Liberty Dental revolutionized care management

CIO Business Intelligence

OCTOBER 17, 2024

The data factor: I joined Liberty Dental about two and a half years ago, and the first big opportunity I saw was data, which was all over the place. We had a kind of small data warehouse on-prem. We created our data model in a way that satisfied the requirements of what we had a vision of.

Management

Management Insurance ROI Cost-Benefit

Resolve private DNS hostnames for Amazon MSK Connect

AWS Big Data

OCTOBER 20, 2023

The connectors were only able to reference hostnames in the connector configuration or plugin that are publicly resolvable and couldn’t resolve private hostnames defined in either a private hosted zone or use DNS servers in another customer network. Many customers ensure that their internal DNS applications are not publicly resolvable.

Data Processing

Data Processing Snapshot Data Warehouse Management

Important Considerations When Migrating to a Data Lake

Smart Data Collective

MARCH 30, 2022

If you don’t understand the concept, you might want to check out our previous article on the difference between data lakes and data warehouses. Migrate data, workloads, and applications. Migrate data, workloads, and applications using the preferred pattern. We propose that you test cases in small steps.

Data Lake

Data Lake Cost-Benefit Data Warehouse Big Data

Orchestrate an end-to-end ETL pipeline using Amazon S3, AWS Glue, and Amazon Redshift Serverless with Amazon MWAA

AWS Big Data

APRIL 25, 2024

As the queries finish running, an UNLOAD operation is invoked from the Redshift data warehouse to the S3 bucket in Account A. Cross-account access has been set up between S3 buckets in Account A and resources in Account B to be able to load and unload data. Test the connection, then save your settings.

Metadata

Metadata Data Processing Management Testing

Run Apache Hive workloads using Spark SQL with Amazon EMR on EKS

AWS Big Data

OCTOBER 18, 2023

Apache Hive is a distributed, fault-tolerant data warehouse system that enables analytics at a massive scale. Spark SQL is an Apache Spark module for structured data processing. Navigate to the side menu Virtual clusters, then select the HiveDemo cluster. You can see an entry for the SparkSQL test job.

Big Data

Big Data Data Processing Interactive Testing

South Africa’s King Price Insurance moves to cloud as business grows

CIO Business Intelligence

MARCH 16, 2022

This phase includes the migration of our data warehouse and business intelligence capabilities, using Synapse and PowerBI respectively. It is important to differentiate between a cloud hosting strategy or solution and building a true cloud solution — which is the future state we all desire. Who did you involve and why?

Insurance

Insurance Cost-Benefit Data Processing Strategy

The Ultimate Guide to Modern Data Quality Management (DQM) For An Effective Data Quality Control Driven by The Right Metrics

datapine

SEPTEMBER 29, 2022

With quality data at their disposal, organizations can form data warehouses for the purposes of examining trends and establishing future-facing strategies. Industry-wide, the positive ROI on quality data is well understood. Your Chance: Want to test a professional analytics software? date, month, and year).

Data Quality

Data Quality Metrics Data-driven Management

Setting up and Getting Started with Cloudera’s New SQL AI Assistant

Cloudera

JANUARY 19, 2024

Supported AI models and services: The SQL AI Assistant is not bundled with a specific LLM; instead it supports various LLMs and hosting services. The model can run locally, be hosted on CML infra or in the infrastructure of a trusted service provider. Log in to the Cloudera Data Warehouse service as DWAdmin.

Data Warehouse

Data Warehouse Data Processing Optimization Modeling

Migration Supporting Real-Time Analytics for Customer Experience Management

Cloudera

AUGUST 31, 2020

Given the prohibitive cost of scaling it, in addition to the new business focus on data science and the need to leverage public cloud services to support future growth and capability roadmap, SMG decided to migrate from the legacy data warehouse to Cloudera’s solution using Hive LLAP. The case for a new Data Warehouse?

Management

Management Slice and Dice Data Warehouse Analytics

Top 15 data management platforms available today

CIO Business Intelligence

SEPTEMBER 22, 2023

All this data arrives by the terabyte, and a data management platform can help marketers make sense of it all. DMPs excel at negotiating with a wide array of databases, data lakes, or data warehouses, ingesting their streams of data and then cleaning, sorting, and unifying the information therein.

Management

Management Advertising Data Lake Sales

Attribute Amazon EMR on EC2 costs to your end-users

AWS Big Data

AUGUST 27, 2024

His background is in data warehouse/data lake – architecture, development and administration. He has been in the data and analytics field for over 14 years. Ramesh Raghupathy is a Senior Data Architect with WWCO ProServe at AWS. He specializes in building and modernising analytical solutions.

Metrics

Metrics Dashboards Data Lake Optimization

Integrate Tableau and Microsoft Entra ID with Amazon Redshift using AWS IAM Identity Center

AWS Big Data

SEPTEMBER 3, 2024

Amazon Redshift and Tableau empower data analysis. Amazon Redshift is a cloud data warehouse that processes complex queries at scale and with speed. Tableau’s extensive capabilities and enterprise connectivity help analysts efficiently prepare, explore, and share data insights company-wide. Choose OAuth Config File.

Reporting

Reporting Publishing Data Warehouse Management

Real-time streaming data top picks you cannot miss at AWS re:Invent 2023

AWS Big Data

NOVEMBER 8, 2023

Fast-track streaming ETL with AWS streaming data services: Learn how to build streaming data pipelines across data lakes and data warehouses. Learn best practices for performance, scale, and cost control in Amazon Kinesis Data Streams, Amazon MSK, Amazon Redshift streaming ingestion, and AWS Glue streaming.

Data-driven

Data-driven Machine Learning Data Lake Cost-Benefit

Top 20 most-asked questions about Amazon RDS for Db2 answered
