This means you can refine your ETL jobs through natural follow-up questions, starting with a basic data pipeline and progressively adding transformations, filters, and business logic through conversation. The DataFrame code generation now extends beyond AWS Glue DynamicFrame to support a broader range of data processing scenarios.
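As a rough illustration of the pattern described above, the sketch below reads a Glue Data Catalog table as a DynamicFrame, converts it to a Spark DataFrame, and then layers on a filter and a derived column of the kind a follow-up prompt might add. The database, table, and column names are placeholders, not from the article.

```python
# Hypothetical sketch: DynamicFrame -> DataFrame, then progressive refinement.
from awsglue.context import GlueContext
from pyspark.context import SparkContext
from pyspark.sql import functions as F

glue_context = GlueContext(SparkContext.getOrCreate())

# Read the source table from the Glue Data Catalog as a DynamicFrame
dyf = glue_context.create_dynamic_frame.from_catalog(
    database="sales_db",       # placeholder database
    table_name="orders_raw"    # placeholder table
)

# Convert to a Spark DataFrame to use the broader DataFrame API
df = dyf.toDF()

# Refinements of the kind added through conversation: a filter and a derived column
refined = (
    df.filter(F.col("order_status") == "COMPLETED")
      .withColumn("order_year", F.year("order_date"))
)
```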
Selecting the strategies and tools for validating data transformations and data conversions in your data pipelines. Introduction: Data transformations and data conversions are crucial to ensure that raw data is organized, processed, and ready for useful analysis.
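A minimal sketch of one such validation strategy, assuming pandas frames and hypothetical column names: reconcile row counts, key completeness, and an aggregate total between the source data and its transformed output.

```python
# Illustrative reconciliation checks between source and transformed data (not from the article).
import pandas as pd

def validate_transformation(source: pd.DataFrame, target: pd.DataFrame, amount_col: str = "amount") -> dict:
    """Return simple pass/fail checks comparing source and transformed frames."""
    return {
        "row_count_matches": len(source) == len(target),
        "no_null_keys": target["id"].notna().all(),
        # Totals should agree within a small tolerance after type conversions
        "amount_total_matches": abs(source[amount_col].sum() - target[amount_col].sum()) < 1e-6,
    }

# Example usage with tiny in-memory frames
src = pd.DataFrame({"id": [1, 2], "amount": [10.0, 20.0]})
tgt = pd.DataFrame({"id": [1, 2], "amount": [10.0, 20.0]})
assert all(validate_transformation(src, tgt).values())
```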
Large-scale data warehouse migration to the cloud is a complex and challenging endeavor that many organizations undertake to modernize their data infrastructure, enhance data management capabilities, and unlock new business opportunities. A well-planned migration makes sure the new data platform can meet current and future business goals.
With quality data at their disposal, organizations can build data warehouses to examine trends and establish future-facing strategies. Industry-wide, the positive ROI on quality data is well understood. Business/Data Analyst: The business analyst is all about the “meat and potatoes” of the business.
Data operations (or data production) is a series of pipeline procedures that take raw data, move it through a series of processing and transformation steps, and output finished products in the form of dashboards, predictions, data warehouses, or whatever the business requires. Their product is the data.
In the beginning, CDP ran only on AWS with a set of services that supported a handful of use cases and workload types: CDP Data Warehouse: a Kubernetes-based service that allows business analysts to deploy data warehouses with secure, self-service access to enterprise data. That Was Then.
Azure Synapse Analytics Pipelines: Azure Synapse Analytics (formerly SQL Data Warehouse) provides data exploration, data preparation, data management, and enterprise data warehousing capabilities. It does the job.
Dafiti’s data infrastructure relies heavily on ETL and ELT processes, with approximately 2,500 unique processes run daily. Amazon Redshift at Dafiti: Amazon Redshift is a fully managed data warehouse service, and was adopted by Dafiti in 2017. Do you want to know more about what we’re doing in the data area at Dafiti?
To speed up self-service analytics and foster innovation based on data, a solution was needed to allow any team to create data products on their own in a decentralized manner. To create and manage the data products, smava uses Amazon Redshift, a cloud data warehouse.
Once released, consumers use datasets from different providers for analysis, machine learning (ML) workloads, and visualization. Each CDH dataset has three processing layers: source (raw data), prepared (transformed data in Parquet), and semantic (combined datasets).
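The sketch below illustrates the three-layer idea in miniature with pandas: raw CSV in the source layer, cleaned and typed Parquet in the prepared layer, and a joined dataset in the semantic layer. Paths and column names are hypothetical; the actual platform's layout and tooling will differ.

```python
# Illustrative three-layer flow (source -> prepared -> semantic); names are placeholders.
import pandas as pd

# Source layer: raw data as delivered by the provider
raw = pd.read_csv("source/events_raw.csv")

# Prepared layer: cleaned, typed data stored as Parquet
prepared = raw.dropna(subset=["event_id"]).astype({"event_id": "int64"})
prepared.to_parquet("prepared/events.parquet", index=False)

# Semantic layer: combined with another dataset for consumption
customers = pd.read_parquet("prepared/customers.parquet")
semantic = prepared.merge(customers, on="customer_id", how="left")
semantic.to_parquet("semantic/events_enriched.parquet", index=False)
```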
The general availability covers Iceberg running within some of the key data services in CDP, including Cloudera Data Warehouse (CDW), Cloudera Data Engineering (CDE), and Cloudera Machine Learning (CML). Cloudera Data Engineering (Spark 3) with Airflow enabled. Cloudera Machine Learning.
The modern data stack is a combination of various software tools used to collect, process, and store data on a well-integrated cloud-based data platform. It is known to have benefits in handling data due to its robustness, speed, and scalability. A typical modern data stack consists of the following: A data warehouse.
Amazon Redshift is a fully managed, petabyte-scale data warehouse service in the cloud. Amazon Redshift enables you to run complex SQL analytics at scale, with high performance, on terabytes to petabytes of structured and unstructured data, and make the insights widely available through popular business intelligence (BI) and analytics tools.
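As a hedged sketch of running such a SQL analytics query from Python, the snippet below connects to a Redshift cluster over the standard PostgreSQL protocol with psycopg2. The endpoint, credentials, and table are placeholders; the Redshift Data API or the redshift_connector driver could be used instead.

```python
# Hypothetical query against a Redshift cluster; connection details are placeholders.
import psycopg2

conn = psycopg2.connect(
    host="my-cluster.abc123.us-east-1.redshift.amazonaws.com",  # placeholder endpoint
    port=5439,
    dbname="analytics",
    user="analyst",
    password="********",
)

with conn, conn.cursor() as cur:
    cur.execute(
        """
        SELECT order_year, SUM(revenue) AS total_revenue
        FROM sales
        GROUP BY order_year
        ORDER BY order_year;
        """
    )
    for year, total in cur.fetchall():
        print(year, total)
```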
In this post, we discuss why AWS recommends moving from Kinesis Data Analytics for SQL Applications to Amazon Kinesis Data Analytics for Apache Flink to take advantage of Apache Flink’s advanced streaming capabilities. Notebooks are provisioned quickly and provide a way for you to instantly view and analyze your streaming data.
By supporting open-source frameworks and tools for code-based, automated and visual data science capabilities — all in a secure, trusted studio environment — we’re already seeing excitement from companies ready to use both foundation models and machine learning to accomplish key tasks.
Tricentis is the global leader in continuous testing for DevOps, cloud, and enterprise applications. Speed changes everything, and continuous testing across the entire CI/CD lifecycle is the key. Tricentis instills that confidence by providing software tools that enable Agile Continuous Testing (ACT) at scale.
Few actors in the modern data stack have inspired as much enthusiasm and fervent support as dbt. This data transformation tool enables data analysts and engineers to transform, test and document data in the cloud data warehouse. But what does this mean from a practitioner perspective?
This solution decouples the ETL and analytics workloads from our transactional data source, Amazon Aurora, and uses Amazon Redshift as the data warehouse to build a data mart. Navigate to the Visual tab.
If, after rigorous analysis, you have determined that you have evolved to a stage where you need a data warehouse, then you are out of luck with Yahoo! If you can show ROI on a DW, it would be a good use of your money to go with Omniture Discover, WebTrends Data Mart, or Coremetrics Explore. Five Reasons And Awesome Testing Ideas.
This is in contrast to traditional BI, which extracts insight from data outside of the app. We rely on increasingly mobile technology to comb through massive amounts of data and solve high-value problems. Plus, there is an expectation that tools be visually appealing to boot. Their dashboards were visually stunning.
Why Data Mapping is Important: Data mapping is a critical element of any data management initiative, such as data integration, data migration, data transformation, data warehousing, or automation. Data mapping is important for several reasons.
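A simple illustration of the idea, with hypothetical field names: declare how source fields map onto a target schema before a migration or integration run, so every downstream step uses one authoritative mapping.

```python
# Minimal data mapping sketch (field names are placeholders, not from the article).
FIELD_MAPPING = {
    "cust_nm": "customer_name",   # rename
    "dob": "date_of_birth",       # rename
    "zip": "postal_code",         # rename; format validation would follow
}

def map_record(source_record: dict) -> dict:
    """Apply the field mapping, dropping unmapped source fields."""
    return {target: source_record[source]
            for source, target in FIELD_MAPPING.items()
            if source in source_record}

print(map_record({"cust_nm": "Ada", "dob": "1990-01-01", "zip": "10115", "legacy_flag": 1}))
# {'customer_name': 'Ada', 'date_of_birth': '1990-01-01', 'postal_code': '10115'}
```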
By providing a consistent and stable backend, Apache Iceberg ensures that data remains immutable and query performance is optimized, thus enabling businesses to trust and rely on their BI tools for critical insights. It provides a stable schema, supports complex data transformations, and ensures atomic operations.
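A hedged sketch of querying an Iceberg table from PySpark follows. The catalog name, warehouse path, and table are assumptions, and the Iceberg Spark runtime jar must be available on the classpath; the point is only that BI-style queries read a consistent table snapshot managed by the table format.

```python
# Hypothetical Iceberg-on-Spark setup; catalog, warehouse path, and table are placeholders.
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .appName("iceberg-bi-example")
    .config("spark.sql.catalog.demo", "org.apache.iceberg.spark.SparkCatalog")
    .config("spark.sql.catalog.demo.type", "hadoop")
    .config("spark.sql.catalog.demo.warehouse", "s3://my-bucket/warehouse")  # placeholder
    .getOrCreate()
)

# Schema evolution and atomic commits are handled by the table format, so BI
# queries see a consistent snapshot of the data.
spark.sql("SELECT region, SUM(sales) FROM demo.db.orders GROUP BY region").show()
```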
Conduct data quality tests on anonymized data in compliance with data policies. Conduct data quality tests to quickly identify and address data quality issues, maintaining high-quality data at all times. The challenge: data quality tests require performing 1,300 tests on 10 TB of data monthly.
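The sketch below shows, with hypothetical column names, the shape such automated checks can take against an anonymized dataset: duplicate-key detection, confirmation that personal fields were masked, and a basic range check.

```python
# Illustrative data quality checks on anonymized data (column names are placeholders).
import pandas as pd

def run_quality_checks(df: pd.DataFrame) -> dict:
    """Run a few representative checks and return pass/fail results."""
    return {
        "no_duplicate_ids": not df["user_id"].duplicated().any(),
        "email_anonymized": df["email"].str.endswith("@example.invalid").all(),
        "amounts_non_negative": (df["amount"] >= 0).all(),
    }

anonymized = pd.DataFrame({
    "user_id": [1, 2, 3],
    "email": ["a@example.invalid", "b@example.invalid", "c@example.invalid"],
    "amount": [12.5, 0.0, 7.8],
})
results = run_quality_checks(anonymized)
failed = [name for name, passed in results.items() if not passed]
print("FAILED:" if failed else "All checks passed", failed)
```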
While efficiency is a priority, data quality and security remain non-negotiable. Developing and maintaining data transformation pipelines are among the first tasks to be targeted for automation. However, caution is advised since accuracy, timeliness, and other aspects of data quality depend on the quality of data pipelines.
While enabling organization-wide efficiency, the team also applied these principles to the data architecture, making sure that CLEA itself operates frugally. After evaluating various tools, we built a serverless data transformation pipeline using Amazon Athena and dbt. However, our initial data architecture led to challenges.
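As a hedged sketch of the serverless pattern mentioned above, the snippet below submits a transformation query to Amazon Athena with boto3 and polls for completion. The database, SQL, and S3 output location are placeholders; in the setup described, dbt would generate and orchestrate the actual SQL models.

```python
# Hypothetical Athena CTAS submission; database, query, and bucket are placeholders.
import time
import boto3

athena = boto3.client("athena", region_name="us-east-1")

execution = athena.start_query_execution(
    QueryString=(
        "CREATE TABLE analytics.daily_costs AS "
        "SELECT usage_date, SUM(cost) AS cost FROM raw.cost_lines GROUP BY usage_date"
    ),
    QueryExecutionContext={"Database": "analytics"},
    ResultConfiguration={"OutputLocation": "s3://my-athena-results/"},  # placeholder bucket
)

# Poll until the query finishes
query_id = execution["QueryExecutionId"]
while True:
    state = athena.get_query_execution(QueryExecutionId=query_id)["QueryExecution"]["Status"]["State"]
    if state in ("SUCCEEDED", "FAILED", "CANCELLED"):
        print("Query finished with state:", state)
        break
    time.sleep(2)
```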