Data Transformation, Data Warehouse and Management

SAP Datasphere Powers Business at the Speed of Data

Rocket-Powered Data Science

MARCH 20, 2023

In fact, by putting a single label like AI on all the steps of a data-driven business process, we have effectively not only blurred the process, but we have also blurred the particular characteristics that make each step separately distinct, uniquely critical, and ultimately dependent on specialized, specific technologies at each step.

Data Warehouse

Data Warehouse Metadata Digital Transformation Machine Learning

From data lakes to insights: dbt adapter for Amazon Athena now supported in dbt Cloud

AWS Big Data

NOVEMBER 22, 2024

This integration enables data teams to efficiently transform and manage data using Athena with dbt Cloud’s robust features, enhancing the overall data workflow experience. This enables you to extract insights from your data without the complexity of managing infrastructure.

Data Lake

Data Lake Data Warehouse Cost-Benefit Data Transformation

Unlocking near real-time analytics with petabytes of transaction data using Amazon Aurora Zero-ETL integration with Amazon Redshift and dbt Cloud

AWS Big Data

NOVEMBER 27, 2024

While customers can perform some basic analysis within their operational or transactional databases, many still need to build custom data pipelines that use batch or streaming jobs to extract, transform, and load (ETL) data into their data warehouse for more comprehensive analysis. Create dbt models in dbt Cloud.

Data Warehouse

Data Warehouse Analytics Testing Sales

The Ultimate Guide to Modern Data Quality Management (DQM) For An Effective Data Quality Control Driven by The Right Metrics

datapine

SEPTEMBER 29, 2022

1) What Is Data Quality Management? 4) Data Quality Best Practices. 5) How Do You Measure Data Quality? 6) Data Quality Metrics Examples. 7) Data Quality Control: Use Case. 8) The Consequences Of Bad Data Quality. 9) 3 Sources Of Low-Quality Data. 10) Data Quality Solutions: Key Attributes.

Data Quality

Data Quality Metrics Data-driven Management

Accelerate your data workflows with Amazon Redshift Data API persistent sessions

AWS Big Data

NOVEMBER 22, 2024

Amazon Redshift is a fast, scalable, secure, and fully managed cloud data warehouse that you can use to analyze your data at scale. Reusing database sessions simplifies the connection management logic in your API implementation, reducing the complexity of the code and making it more straightforward to maintain and scale.
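
A minimal sketch of what session reuse could look like from Python. It assumes a client shaped like boto3's `redshift-data` client, with `execute_statement` accepting `SessionKeepAliveSeconds` and `SessionId` and `describe_statement` reporting a `Status`. The helper name `run_in_session`, its parameters, and the polling interval are hypothetical and not taken from the article.

```python
import time


def run_in_session(client, statements, workgroup, database,
                   keep_alive_seconds=60, poll_seconds=0.5):
    """Run SQL statements one after another in a single Data API session.

    `client` is assumed to behave like boto3.client("redshift-data").
    Returns the session id and the list of statement ids.
    """
    session_id = None
    statement_ids = []
    for sql in statements:
        if session_id is None:
            # First call opens the session and asks the service to keep it alive.
            resp = client.execute_statement(
                WorkgroupName=workgroup,
                Database=database,
                Sql=sql,
                SessionKeepAliveSeconds=keep_alive_seconds,
            )
            session_id = resp["SessionId"]
        else:
            # Later calls only reference the session, so temporary tables
            # and session settings from earlier statements stay visible.
            resp = client.execute_statement(Sql=sql, SessionId=session_id)

        # The Data API is asynchronous: wait for each statement to finish
        # before submitting the next one to the same session.
        while True:
            status = client.describe_statement(Id=resp["Id"])["Status"]
            if status == "FINISHED":
                break
            if status in ("FAILED", "ABORTED"):
                raise RuntimeError(f"statement {resp['Id']} ended as {status}")
            time.sleep(poll_seconds)
        statement_ids.append(resp["Id"])
    return session_id, statement_ids
```

With a real client this might be called as `run_in_session(boto3.client("redshift-data"), ["CREATE TEMP TABLE t (x INT)", "SELECT * FROM t"], workgroup="my-workgroup", database="dev")`, where the workgroup and database names are placeholders.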

Data Warehouse

Data Warehouse Recreation/Entertainment Cost-Benefit Data-driven

Database vs. Data Warehouse: What’s the Difference?

Jet Global

MAY 28, 2019

Whether the reporting is being done by an end user, a data science team, or an AI algorithm, the future of your business depends on your ability to use data to drive better quality for your customers at a lower cost. So, when it comes to collecting, storing, and analyzing data, what is the right choice for your enterprise?

Data Warehouse

Data Warehouse Reporting Business Intelligence Sales

Ingest data from Google Analytics 4 and Google Sheets to Amazon Redshift using Amazon AppFlow

AWS Big Data

JANUARY 6, 2025

In this post, we show you how to establish the data ingestion pipeline between Google Analytics 4, Google Sheets, and an Amazon Redshift Serverless workgroup. With Amazon AppFlow, you can run data flows at nearly any scale and at the frequency you choose: on a schedule, in response to a business event, or on demand.

Analytics

Analytics Data Warehouse Big Data Metrics

MLOps and DevOps: Why Data Makes It Different

O'Reilly on Data

OCTOBER 19, 2021

Since software engineers manage to build ordinary software without experiencing as much pain as their counterparts in the ML department, it raises the question: should we just start treating ML projects as software engineering projects as usual, maybe educating ML practitioners about the existing best practices? Orchestration. Versioning.

IT

IT Testing Experimentation Software

The Best Data Management Tools For Small Businesses

Smart Data Collective

APRIL 29, 2020

As the world gradually becomes more dependent on data, services, tools, and infrastructure are all the more important for businesses in every sector. Data management has become a fundamental business concern, especially for businesses that are going through a digital transformation. What is data management?

Management

Management Data Warehouse Digital Transformation Dashboards

Amazon Q data integration adds DataFrame support and in-prompt context-aware job creation

AWS Big Data

DECEMBER 20, 2024

This means you can refine your ETL jobs through natural follow-up questions, starting with a basic data pipeline and progressively adding transformations, filters, and business logic through conversation. The DataFrame code generation now extends beyond AWS Glue DynamicFrame to support a broader range of data processing scenarios.

Data Integration

Data Integration Visualization Data Processing Data Lake

Build and manage your modern data stack using dbt and AWS Glue through dbt-glue, the new “trusted” dbt adapter

AWS Big Data

NOVEMBER 29, 2023

dbt is an open source, SQL-first templating engine that allows you to write repeatable and extensible data transforms in Python and SQL. dbt is predominantly used by data warehouse customers (such as those on Amazon Redshift) who are looking to keep their data transform logic separate from storage and engine.

Data Lake

Data Lake Management Metrics Data Warehouse

How EUROGATE established a data mesh architecture using Amazon DataZone

AWS Big Data

JANUARY 15, 2025

Their terminal operations rely heavily on seamless data flows and the management of vast volumes of data. With the addition of these technologies alongside existing systems like terminal operating systems (TOS) and SAP, the number of data producers has grown substantially.

IoT

IoT Machine Learning Metadata Data-driven

Data’s dark secret: Why poor quality cripples AI and growth

CIO Business Intelligence

APRIL 8, 2025

We also examine how centralized, hybrid and decentralized data architectures support scalable, trustworthy ecosystems. As data-centric AI, automated metadata management and privacy-aware data sharing mature, the opportunity to embed data quality into the enterprise's core has never been more significant.

Data Quality

Data Quality Data-driven Key Performance Indicator Metadata

Birst automates the creation of data warehouses in Snowflake

Birst BI

FEBRUARY 25, 2020

Managing large-scale data warehouse systems is known to be administratively heavy and costly, and to lead to analytic silos. The good news is that Snowflake, the cloud data platform, lowers costs and administrative overhead. What gaps does the joint solution address in the market? What value does this joint solution provide?

Data Warehouse

Data Warehouse Cost-Benefit Data Architecture Enterprise

How Open Universities Australia modernized their data platform and significantly reduced their ETL costs with AWS Cloud Development Kit and AWS Step Functions

AWS Big Data

JANUARY 30, 2025

Diagram 1: Overall architecture of the solution, using AWS Step Functions, Amazon Redshift, and Amazon S3. The following AWS services were used to shape our new ETL architecture: Amazon Redshift, a fully managed, petabyte-scale data warehouse service in the cloud. [...] includes the ability to run Python scripts.

Data Warehouse

Data Warehouse Data Architecture Machine Learning Data Transformation

How ANZ Institutional Division built a federated data platform to enable their domain teams to build data products to support business outcomes

AWS Big Data

DECEMBER 4, 2024

Recognizing this paradigm shift, ANZ Institutional Division has embarked on a transformative journey to redefine its approach to managing data, using it, and extracting significant business value from data insights. This enables global discoverability and collaboration without centralizing ownership or operations.

Metadata

Metadata Data Governance Data Quality Data-driven

7 key Microsoft Azure analytics services (plus one extra)

CIO Business Intelligence

JUNE 29, 2022

Analytics is the means for discovering those insights, and doing it well requires the right tools for ingesting and preparing data, enriching and tagging it, building and sharing reports, and managing and protecting your data and insights. Azure Data Factory. Azure Data Lake Analytics. Azure Synapse Analytics.

Data Lake

Data Lake Analytics Data Warehouse Machine Learning

Available Now! Automated Testing for Data Transformations

Wayne Yaddow

FEBRUARY 18, 2025

Selecting the strategies and tools for validating data transformations and data conversions in your data pipelines. Data transformations and data conversions are crucial to ensure that raw data is organized, processed, and ready for useful analysis.

Testing

Testing Data Transformation Data-driven Data Quality

Ensuring Data Transformation Quality with dbt Core

Wayne Yaddow

MARCH 14, 2025

How dbt Core helps data teams test, validate, and monitor complex data transformations and conversions. dbt Core, an open-source framework for developing, testing, and documenting SQL-based data transformations, has become a must-have tool for modern data teams as the complexity of data pipelines grows.

Data Transformation

Data Transformation Testing Unstructured Data Data Quality

Breaking down data silos for digital success

CIO Business Intelligence

NOVEMBER 7, 2023

Given the importance of sharing information among diverse disciplines in the era of digital transformation, this concept is arguably as important as ever. The aim is to normalize, aggregate, and eventually make available to analysts across the organization data that originates in various pockets of the enterprise.

Data Warehouse

Data Warehouse Digital Transformation Data-driven Reporting

Enriching metadata for accurate text-to-SQL generation for Amazon Athena

AWS Big Data

OCTOBER 14, 2024

Enterprise data is brought into data lakes and data warehouses to carry out analytical, reporting, and data science use cases using AWS analytical services like Amazon Athena, Amazon Redshift, Amazon EMR, and so on. Choose Manage model access. Change the AWS Region to US West (Oregon). Choose Stacks.

Metadata

Metadata Data Lake Modeling Data Warehouse

Create a modern data platform using the Data Build Tool (dbt) in the AWS Cloud

AWS Big Data

NOVEMBER 9, 2023

It does this by helping teams handle the T in ETL (extract, transform, and load) processes. It allows users to write data transformation code, run it, and test the output, all within the framework it provides. dbt emerged as the perfect choice for this transformation within their existing AWS environment.
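
The write, run, and test loop described above can be illustrated without dbt itself. The sketch below is not dbt code: it uses Python's built-in `sqlite3` module to materialize one SQL SELECT as a table and then runs two checks modeled on dbt's `not_null` and `unique` generic tests. The table names, column names, and sample rows are invented for illustration.

```python
import sqlite3


def build_and_test():
    """Materialize a transformation, then test its output with SQL checks."""
    conn = sqlite3.connect(":memory:")

    # Hypothetical raw data that an earlier extract/load step would provide.
    conn.executescript("""
        CREATE TABLE raw_orders (order_id INTEGER, customer TEXT, amount_cents INTEGER);
        INSERT INTO raw_orders VALUES (1, 'ada', 1250), (2, 'bob', 300), (3, 'ada', 4999);
    """)

    # The "model": a SELECT statement materialized as a table (the T in ETL).
    conn.execute("""
        CREATE TABLE customer_totals AS
        SELECT customer, SUM(amount_cents) / 100.0 AS total_amount
        FROM raw_orders
        GROUP BY customer
    """)

    # Each test is a query that counts offending rows; 0 means the test passes.
    tests = {
        "not_null_customer":
            "SELECT COUNT(*) FROM customer_totals WHERE customer IS NULL",
        "unique_customer":
            "SELECT COUNT(*) FROM (SELECT customer FROM customer_totals "
            "GROUP BY customer HAVING COUNT(*) > 1)",
    }
    failures = {name: conn.execute(sql).fetchone()[0] for name, sql in tests.items()}

    rows = conn.execute(
        "SELECT customer, total_amount FROM customer_totals ORDER BY customer"
    ).fetchall()
    conn.close()
    return rows, failures


if __name__ == "__main__":
    rows, failures = build_and_test()
    print(rows)
    print(failures)
```

Expressing each test as a query that counts failing rows keeps the checks in the same language as the transformation, which is the idea behind running tests inside the transformation framework.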

Data Warehouse

Data Warehouse Testing Data Quality Reporting

Unlock scalability, cost-efficiency, and faster insights with large-scale data migration to Amazon Redshift

AWS Big Data

AUGUST 1, 2024

Large-scale data warehouse migration to the cloud is a complex and challenging endeavor that many organizations undertake to modernize their data infrastructure, enhance data management capabilities, and unlock new business opportunities.

Data Warehouse

Data Warehouse KPI Optimization Cost-Benefit

Amazon Redshift data ingestion options

AWS Big Data

SEPTEMBER 5, 2024

The Amazon Redshift integration for Apache Spark combined with AWS Glue or Amazon EMR performs transformations before loading data into Amazon Redshift. Finally, data can be loaded into Amazon Redshift with popular ETL tools like Informatica, Matillion, and dbt Labs. AWS Glue 4.0

IoT

IoT Data Warehouse Cost-Benefit Reporting

Happy Birthday, CDP Public Cloud

Cloudera

OCTOBER 13, 2020

In the beginning, CDP ran only on AWS with a set of services that supported a handful of use cases and workload types: CDP Data Warehouse: a Kubernetes-based service that allows business analysts to deploy data warehouses with secure, self-service access to enterprise data. That Was Then. New Services.

Data Warehouse

Data Warehouse Machine Learning Visualization Data Lake

The Ten Standard Tools To Develop Data Pipelines In Microsoft Azure

DataKitchen

JULY 27, 2023

Azure Databricks Delta Live Tables: These provide a more straightforward way to build and manage Data Pipelines for the latest, high-quality data in Delta Lake. It provides data prep, management, and enterprise data warehousing tools. It has a data pipeline tool, as well. It does the job.

Machine Learning

Machine Learning Cost-Benefit Data Transformation Testing

Introducing Amazon Q data integration in AWS Glue

AWS Big Data

APRIL 30, 2024

Amazon Q Developer can now generate complex data integration jobs with multiple sources, destinations, and data transformations. Generated jobs can use a variety of data transformations, including filter, project, union, join, and custom user-supplied SQL. Matt Su is a Senior Product Manager on the AWS Glue team.
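
To make the named transformations concrete, here is a small plain-Python sketch of filter, project, union, and join applied to lists of dictionaries. It is not AWS Glue code and not output from Amazon Q Developer; the datasets, field names, and helper functions are all made up for illustration.

```python
# Hypothetical source datasets.
orders_eu = [
    {"order_id": 1, "customer_id": 10, "amount": 25.0},
    {"order_id": 2, "customer_id": 11, "amount": 5.0},
]
orders_us = [
    {"order_id": 3, "customer_id": 10, "amount": 40.0},
]
customers = [
    {"customer_id": 10, "name": "Ada"},
    {"customer_id": 11, "name": "Bob"},
]


def union(*tables):
    """Concatenate rows from several sources that share a schema."""
    return [row for table in tables for row in table]


def filter_rows(rows, predicate):
    """Keep only the rows for which the predicate is true."""
    return [row for row in rows if predicate(row)]


def project(rows, columns):
    """Keep only the named columns of each row."""
    return [{col: row[col] for col in columns} for row in rows]


def join(left, right, key):
    """Inner join on a single key; assumes the key is unique in `right`."""
    index = {row[key]: row for row in right}
    return [{**l, **index[l[key]]} for l in left if l[key] in index]


def run_job():
    all_orders = union(orders_eu, orders_us)
    large_orders = filter_rows(all_orders, lambda r: r["amount"] >= 10.0)
    enriched = join(large_orders, customers, "customer_id")
    return project(enriched, ["order_id", "name", "amount"])


if __name__ == "__main__":
    for row in run_job():
        print(row)
```

A data integration job with multiple sources and one destination typically chains steps like these; the order used here (union, filter, join, project) is only one reasonable arrangement.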

Data Integration

Data Integration Data Lake Data Warehouse Software

How GamesKraft uses Amazon Redshift data sharing to support growing analytics workloads

AWS Big Data

NOVEMBER 13, 2023

Amazon Redshift is a fully managed data warehousing service that offers both provisioned and serverless options, making it more efficient to run and scale analytics without having to manage your data warehouse. These upstream data sources constitute the data producer components.

Data Warehouse

Data Warehouse Analytics Data Lake Data Science

Straumann Group is transforming dentistry with data, AI

CIO Business Intelligence

FEBRUARY 16, 2023

“My vision is that I can give the keys to my businesses to manage their data and run their data on their own, as opposed to the Data & Tech team being at the center and helping them out,” says Iyengar, director of Data & Tech at Straumann Group North America.

Unstructured Data

Unstructured Data Data Lake Prescriptive Analytics Data Warehouse

Migrate Amazon Redshift from DC2 to RA3 to accommodate increasing data volumes and analytics demands

AWS Big Data

AUGUST 9, 2024

This trend is no exception for Dafiti, an ecommerce company that recognizes the importance of using data to drive strategic decision-making processes. Amazon Redshift is widely used for Dafiti’s data analytics, supporting approximately 100,000 daily queries from over 400 users across three countries. We started with 115 dc2.large

Data Lake

Data Lake Analytics Data Warehouse Data-driven

Biggest Trends in Data Visualization Taking Shape in 2022

Smart Data Collective

OCTOBER 13, 2021

There are countless examples of big data transforming many different industries. There is no disputing the fact that the collection and analysis of massive amounts of unstructured data has been a huge breakthrough. How does Data Virtualization manage data quality requirements?

Visualization

Visualization Cost-Benefit Big Data Prescriptive Analytics

What is DataOps? Collaborative, cross-functional analytics

CIO Business Intelligence

DECEMBER 22, 2022

Research firm Gartner further describes the methodology as one focused on “improving the communication, integration, and automation of data flows between data managers and data consumers across an organization.” Data scientists may also be included as key members of DataOps teams, according to Dunning. “I

Analytics

Analytics Machine Learning Data mining Software

The disruptive potential of open data lakehouse architectures and IBM watsonx.data

IBM Big Data Hub

JUNE 15, 2023

It is comprised of commodity cloud object storage, open data and open table formats, and high-performance open-source query engines. To help organizations scale AI workloads, we recently announced IBM watsonx.data, a data store built on an open data lakehouse architecture and part of the watsonx AI and data platform.

Data Warehouse

Data Warehouse Data Lake Optimization Data-driven

Stored procedure enhancements in Amazon Redshift

AWS Big Data

SEPTEMBER 6, 2023

Amazon Redshift is a fully managed, petabyte-scale data warehouse service in the cloud. With Amazon Redshift, you can analyze all your data to derive holistic insights about your business and your customers. You can also schedule stored procedures to automate data processing on Amazon Redshift.

Data Warehouse

Data Warehouse Insurance Statistics Software

How to use foundation models and trusted governance to manage AI workflow risk

IBM Big Data Hub

OCTOBER 16, 2023

AI governance refers to the practice of directing, managing and monitoring an organization’s AI activities. It includes processes that trace and document the origin of data, models and associated metadata and pipelines for audits. It can be used with both on-premises and multi-cloud environments.

Risk

Risk Modeling Management Metadata

How smava makes loans transparent and affordable using Amazon Redshift Serverless

AWS Big Data

DECEMBER 21, 2023

To speed up self-service analytics and foster innovation based on data, a solution was needed to allow any team to create data products on their own in a decentralized manner. To create and manage the data products, smava uses Amazon Redshift, a cloud data warehouse.

Data Lake

Data Lake Data Warehouse Data-driven B2B

BMW Cloud Efficiency Analytics powered by Amazon QuickSight and Amazon Athena

AWS Big Data

NOVEMBER 15, 2023

Overview of the BMW Cloud Data Hub: At the BMW Group, Cloud Data Hub (CDH) is the central platform for managing company-wide data and data solutions. The difference lies in when and where data transformation takes place. In ETL, data is transformed before it’s loaded into the data warehouse.
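
The point about when and where transformation happens can be shown side by side. This sketch assumes the contrast being drawn is with ELT, where raw data is loaded first and transformed inside the warehouse. It uses an in-memory SQLite database as a stand-in for a warehouse; the table names and sample rows are hypothetical.

```python
import sqlite3

# Hypothetical raw source rows: untrimmed names and costs stored as text.
SOURCE_ROWS = [
    ("2023-11-01", " Munich ", "12.50"),
    ("2023-11-02", "leipzig", "7.25"),
]


def clean(row):
    """Transformation applied in application code (the ETL path)."""
    day, plant, cost = row
    return (day, plant.strip().title(), float(cost))


def etl(conn):
    # ETL: transform first, then load only the cleaned rows.
    conn.execute("CREATE TABLE costs_etl (day TEXT, plant TEXT, cost REAL)")
    conn.executemany(
        "INSERT INTO costs_etl VALUES (?, ?, ?)",
        [clean(row) for row in SOURCE_ROWS],
    )


def elt(conn):
    # ELT: load the raw rows as-is, then transform with SQL in the warehouse.
    conn.execute("CREATE TABLE costs_raw (day TEXT, plant TEXT, cost TEXT)")
    conn.executemany("INSERT INTO costs_raw VALUES (?, ?, ?)", SOURCE_ROWS)
    conn.execute("""
        CREATE TABLE costs_elt AS
        SELECT day,
               UPPER(SUBSTR(TRIM(plant), 1, 1)) || LOWER(SUBSTR(TRIM(plant), 2)) AS plant,
               CAST(cost AS REAL) AS cost
        FROM costs_raw
    """)


def run_both():
    conn = sqlite3.connect(":memory:")
    etl(conn)
    elt(conn)
    etl_rows = conn.execute("SELECT * FROM costs_etl ORDER BY day").fetchall()
    elt_rows = conn.execute("SELECT * FROM costs_elt ORDER BY day").fetchall()
    conn.close()
    return etl_rows, elt_rows


if __name__ == "__main__":
    etl_rows, elt_rows = run_both()
    print(etl_rows)
    print(elt_rows)
```

Both paths end with the same cleaned table. The difference is that the ELT path also keeps `costs_raw` in the warehouse, so the transformation can be rerun or changed later without extracting the source data again.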

Dashboards

Dashboards Analytics Metadata Data Warehouse

Power enterprise-grade Data Vaults with Amazon Redshift – Part 1

AWS Big Data

NOVEMBER 16, 2023

Amazon Redshift is a popular cloud data warehouse, offering a fully managed cloud-based service that seamlessly integrates with an organization’s Amazon Simple Storage Service (Amazon S3) data lake, real-time streams, machine learning (ML) workflows, transactional workflows, and much more—all while providing up to 7.9x

Enterprise

Enterprise Data Warehouse Data Lake Optimization

Lay the groundwork now for advanced analytics and AI

CIO Business Intelligence

AUGUST 3, 2023

It also helps him democratize credit union data so it can be used to improve customer service, automate the maintenance of such data by making various types of data easier to find, and provide chains of custody and audit controls to help meet regulatory needs.

Analytics

Analytics Data Lake Metadata Cost-Benefit

Simplify Metrics on Apache Druid With Rill Data and Cloudera

Cloudera

JULY 21, 2022

Cloudera users can securely connect Rill to a source of event stream data, such as Cloudera DataFlow, model data into Rill’s cloud-based Druid service, and share live operational dashboards within minutes via Rill’s interactive metrics dashboard or any connected BI solution. Data is made queryable in real time. Apache Hive.

Metrics

Metrics Slice and Dice Data Warehouse Dashboards

Cloudera’s Open Data Lakehouse Supercharged with dbt Core™

Cloudera

OCTOBER 7, 2022

We’re excited to announce the general availability of the open source adapters for dbt for all the engines in CDP: Apache Hive, Apache Impala, and Apache Spark, with added support for Apache Livy and Cloudera Data Engineering. The Open Data Lakehouse. Cloudera builds dbt adapters for all engines in the open data lakehouse.

Data Warehouse

Data Warehouse Data Transformation Machine Learning Data Lake

Sirius About Snowflake Demo: How to Create a Reporting Dashboard

CDW Research Hub

OCTOBER 20, 2020

In our latest demo, we highlight how we’re piloting a modern analytic solution using Snowflake’s scalable cloud data warehouse in combination with Matillion and ThoughtSpot, through Snowflake’s Partner Connect service offering. Manageability and use for non-technical users, democratizing data enterprisewide. Contact us.

Dashboards

Dashboards Reporting Data Warehouse Structured Data

Simplify data transfer: Google BigQuery to Amazon S3 using Amazon AppFlow

AWS Big Data

OCTOBER 5, 2023

In today’s data-driven world, the ability to effortlessly move and analyze data across diverse platforms is essential. Amazon AppFlow, a fully managed data integration service, has been at the forefront of streamlining data transfer between AWS services, software as a service (SaaS) applications, and now Google BigQuery.

Data Warehouse

Data Warehouse Machine Learning Data Integration Data-driven
