Data Transformation, Data Warehouse and IT

MLOps and DevOps: Why Data Makes It Different

O'Reilly on Data

OCTOBER 19, 2021

Why: Data Makes It Different. In contrast, a defining feature of ML-powered applications is that they are directly exposed to a large amount of messy, real-world data which is too complex to be understood and modeled by hand. However, the concept is quite abstract. Can’t we just fold it into existing DevOps best practices?

IT

IT Testing Experimentation Software

SAP Datasphere Powers Business at the Speed of Data

Rocket-Powered Data Science

MARCH 20, 2023

This is where we dispel an old “big data” notion (heard a decade ago) that was expressed like this: “we need our data to run at the speed of business.” Instead, what we really need is for our business to run at the speed of data. This is where SAP Datasphere (the next generation of SAP Data Warehouse Cloud) comes in.

Data Warehouse

Data Warehouse Metadata Digital Transformation Machine Learning

From data lakes to insights: dbt adapter for Amazon Athena now supported in dbt Cloud

AWS Big Data

NOVEMBER 22, 2024

The need for streamlined data transformations As organizations increasingly adopt cloud-based data lakes and warehouses, the demand for efficient data transformation tools has grown. Using Athena and the dbt adapter, you can transform raw data in Amazon S3 into well-structured tables suitable for analytics.

Data Lake

Data Lake Data Warehouse Cost-Benefit Data Transformation

Webinars

Going Beyond Chatbots: Connecting AI to Your Tools, Systems, & Data

Automation, Evolved: Your New Playbook for Smarter Knowledge Work

MORE WEBINARS

Unlocking near real-time analytics with petabytes of transaction data using Amazon Aurora Zero-ETL integration with Amazon Redshift and dbt Cloud

AWS Big Data

NOVEMBER 27, 2024

While customers can perform some basic analysis within their operational or transactional databases, many still need to build custom data pipelines that use batch or streaming jobs to extract, transform, and load (ETL) data into their data warehouse for more comprehensive analysis.

Data Warehouse

Data Warehouse Analytics Testing Modeling

Database vs. Data Warehouse: What’s the Difference?

Jet Global

MAY 28, 2019

The success of any business into the next year and beyond will depend entirely on the volume, accuracy, and reportability of the data they collect—and how well the business can analyze, extract insight from, and take action on that data. All About That (Data)Base. Enter the Warehouse.

Data Warehouse

Data Warehouse Reporting Business Intelligence Sales

Accelerate your data workflows with Amazon Redshift Data API persistent sessions

AWS Big Data

NOVEMBER 22, 2024

Amazon Redshift is a fast, scalable, secure, and fully managed cloud data warehouse that you can use to analyze your data at scale. Redshift Data API provides a secure HTTP endpoint and integration with AWS SDKs. In the next step, copy data from Amazon Simple Storage Service (Amazon S3) to the temporary table.

Data Warehouse

Data Warehouse Recreation/Entertainment Cost-Benefit Data-driven

Ingest data from Google Analytics 4 and Google Sheets to Amazon Redshift using Amazon AppFlow

AWS Big Data

JANUARY 6, 2025

With Amazon AppFlow, you can run data flows at nearly any scale and at the frequency you chooseon a schedule, in response to a business event, or on demand. You can configure data transformation capabilities such as filtering and validation to generate rich, ready-to-use data as part of the flow itself, without additional steps.

Analytics

Analytics Data Warehouse Big Data Metrics

Amazon Q data integration adds DataFrame support and in-prompt context-aware job creation

AWS Big Data

DECEMBER 20, 2024

Your generated jobs can use a variety of data transformations, including filters, projections, unions, joins, and aggregations, giving you the flexibility to handle complex data processing requirements. The following video provides a full demonstration of the experience with AWS Glue Studio.

Data Integration

Data Integration Visualization Data Processing Big Data

Data’s dark secret: Why poor quality cripples AI and growth

CIO Business Intelligence

APRIL 8, 2025

These issues dont just hinder next-gen analytics and AI; they erode trust, delay transformation and diminish business value. Data quality is no longer a back-office concern. In this article, I am drawing from firsthand experience working with CIOs, CDOs, CTOs and transformation leaders across industries.

Data Quality

Data Quality Data-driven Key Performance Indicator Metadata

How Open Universities Australia modernized their data platform and significantly reduced their ETL costs with AWS Cloud Development Kit and AWS Step Functions

AWS Big Data

JANUARY 30, 2025

Diagram 1: Overall architecture of the solution, using AWS Step Functions, Amazon Redshift and Amazon S3 The following AWS services were used to shape our new ETL architecture: Amazon Redshift A fully managed, petabyte-scale data warehouse service in the cloud. Our infrastructure was defined as code using the AWS CDK.

Data Warehouse

Data Warehouse Data Architecture Machine Learning Data Transformation

Birst automates the creation of data warehouses in Snowflake

Birst BI

FEBRUARY 25, 2020

Managing large-scale data warehouse systems has been known to be very administrative, costly, and lead to analytic silos. The good news is that Snowflake, the cloud data platform, lowers costs and administrative overhead. The result is a lower total cost of ownership and trusted data and analytics.

Data Warehouse

Data Warehouse Cost-Benefit Data Architecture Enterprise

How EUROGATE established a data mesh architecture using Amazon DataZone

AWS Big Data

JANUARY 15, 2025

Their terminal operations rely heavily on seamless data flows and the management of vast volumes of data. Recently, EUROGATE has developed a digital twin for its container terminal Hamburg (CTH), generating millions of data points every second from Internet of Things (IoT)devices attached to its container handling equipment (CHE).

IoT

IoT Machine Learning Metadata Data-driven

Accelerate analytics on Amazon OpenSearch Service with AWS Glue through its native connector

AWS Big Data

DECEMBER 21, 2023

As the volume and complexity of analytics workloads continue to grow, customers are looking for more efficient and cost-effective ways to ingest and analyse data. Using a native AWS Glue connector increases agility, simplifies data movement, and improves data quality. Choose Save to save your job, and choose Run to run the job.

Analytics

Analytics IT Data Lake Visualization

7 key Microsoft Azure analytics services (plus one extra)

CIO Business Intelligence

JUNE 29, 2022

Insights hidden in your data are essential for optimizing business operations, finetuning your customer experience, and developing new products — or new lines of business, like predictive maintenance. And as businesses contend with increasingly large amounts of data, the cloud is fast becoming the logical place where analytics work gets done.

Data Lake

Data Lake Analytics Data Warehouse Machine Learning

Ensuring Data Transformation Results with Great Expectations

Wayne Yaddow

MARCH 12, 2025

The framework ensures that your data transformations comply with rigorous specifications from the moment they are created through every iteration of your pipeline. Great Expectations can enable a wide range of data transformations and conversion operations.

Data Transformation

Data Transformation Data Quality Testing Data Warehouse

Ensuring Data Transformation Quality with dbt Core

Wayne Yaddow

MARCH 14, 2025

How dbt Core aids data teams test, validate, and monitor complex data transformations and conversions Photo by NASA on Unsplash Introduction dbt Core, an open-source framework for developing, testing, and documenting SQL-based data transformations, has become a must-have tool for modern data teams as the complexity of data pipelines grows.

Data Transformation

Data Transformation Testing Unstructured Data Data Quality

Breaking down data silos for digital success

CIO Business Intelligence

NOVEMBER 7, 2023

For years, IT and business leaders have been talking about breaking down the data silos that exist within their organizations. Given the importance of sharing information among diverse disciplines in the era of digital transformation, this concept is arguably as important as ever.

Data Warehouse

Data Warehouse Digital Transformation Data-driven Reporting

Enriching metadata for accurate text-to-SQL generation for Amazon Athena

AWS Big Data

OCTOBER 14, 2024

Enterprise data is brought into data lakes and data warehouses to carry out analytical, reporting, and data science use cases using AWS analytical services like Amazon Athena , Amazon Redshift , Amazon EMR , and so on. Can it also help write SQL queries? The answer is yes. Choose Notebook instances.

Metadata

Metadata Data Lake Modeling Data Warehouse

The Best Data Management Tools For Small Businesses

Smart Data Collective

APRIL 29, 2020

What is data management? Data management can be defined in many ways. The extraction of raw data, transforming to a suitable format for business needs, and loading into a data warehouse. Data transformation. Data analytics and visualisation. Custom applications can also be integrated.

Management

Management Data Warehouse Digital Transformation Dashboards

The Ultimate Guide to Modern Data Quality Management (DQM) For An Effective Data Quality Control Driven by The Right Metrics

datapine

SEPTEMBER 29, 2022

What Is Data Quality Management (DQM)? Data quality management is a set of practices that aim at maintaining a high quality of information. It goes all the way from the acquisition of data and the implementation of advanced data processes, to an effective distribution of data.

Data Quality

Data Quality Metrics Data-driven Management

Create a modern data platform using the Data Build Tool (dbt) in the AWS Cloud

AWS Big Data

NOVEMBER 9, 2023

Building a data platform involves various approaches, each with its unique blend of complexities and solutions. In this post, we delve into a case study for a retail use case, exploring how the Data Build Tool (dbt) was used effectively within an AWS environment to build a high-performing, efficient, and modern data platform.

Data Warehouse

Data Warehouse Testing Data Quality Reporting

Unlock scalability, cost-efficiency, and faster insights with large-scale data migration to Amazon Redshift

AWS Big Data

AUGUST 1, 2024

Large-scale data warehouse migration to the cloud is a complex and challenging endeavor that many organizations undertake to modernize their data infrastructure, enhance data management capabilities, and unlock new business opportunities. This makes sure the new data platform can meet current and future business goals.

Data Warehouse

Data Warehouse KPI Optimization Cost-Benefit

Amazon Redshift data ingestion options

AWS Big Data

SEPTEMBER 5, 2024

Amazon Redshift , a warehousing service, offers a variety of options for ingesting data from diverse sources into its high-performance, scalable environment. It uses massively parallel processing (MPP) architecture in Amazon Redshift to read and load large amounts of data in parallel from files or data from supported data sources.

IoT

IoT Data Warehouse Cost-Benefit Reporting

Happy Birthday, CDP Public Cloud

Cloudera

OCTOBER 13, 2020

In the beginning, CDP ran only on AWS with a set of services that supported a handful of use cases and workload types: CDP Data Warehouse: a kubernetes-based service that allows business analysts to deploy data warehouses with secure, self-service access to enterprise data. That Was Then. New Services.

Data Warehouse

Data Warehouse Machine Learning Visualization Data Lake

Build and manage your modern data stack using dbt and AWS Glue through dbt-glue, the new “trusted” dbt adapter

AWS Big Data

NOVEMBER 29, 2023

dbt is an open source, SQL-first templating engine that allows you to write repeatable and extensible data transforms in Python and SQL. dbt is predominantly used by data warehouses (such as Amazon Redshift ) customers who are looking to keep their data transform logic separate from storage and engine.

Data Lake

Data Lake Management Metrics Data Warehouse

The Ten Standard Tools To Develop Data Pipelines In Microsoft Azure

DataKitchen

JULY 27, 2023

While working in Azure with our customers, we have noticed several standard Azure tools people use to develop data pipelines and ETL or ELT processes. We counted ten ‘standard’ ways to transform and set up batch data pipelines in Microsoft Azure. You can use it for big data analytics and machine learning workloads.

Machine Learning

Machine Learning Cost-Benefit Data Transformation Testing

Straumann Group is transforming dentistry with data, AI

CIO Business Intelligence

FEBRUARY 16, 2023

The company’s orthodontics business, for instance, makes heavy use of image processing to the point that unstructured data is growing at a pace of roughly 20% to 25% per month. For example, imaging data can be used to show patients how an aligner will change their appearance over time. “It

Unstructured Data

Unstructured Data Data Lake Prescriptive Analytics Data Warehouse

Stored procedure enhancements in Amazon Redshift

AWS Big Data

SEPTEMBER 6, 2023

Amazon Redshift is a fully managed, petabyte-scale data warehouse service in the cloud. With Amazon Redshift, you can analyze all your data to derive holistic insights about your business and your customers. You can also schedule stored procedures to automate data processing on Amazon Redshift.

Data Warehouse

Data Warehouse Insurance Statistics Software

Biggest Trends in Data Visualization Taking Shape in 2022

Smart Data Collective

OCTOBER 13, 2021

There are countless examples of big data transforming many different industries. There is no disputing the fact that the collection and analysis of massive amounts of unstructured data has been a huge breakthrough. We would like to talk about data visualization and its role in the big data movement.

Visualization

Visualization Cost-Benefit Big Data Prescriptive Analytics

Migrate Amazon Redshift from DC2 to RA3 to accommodate increasing data volumes and analytics demands

AWS Big Data

AUGUST 9, 2024

Dafiti’s data infrastructure relies heavily on ETL and ELT processes, with approximately 2,500 unique processes run daily. Amazon Redshift at Dafiti Amazon Redshift is a fully managed data warehouse service, and was adopted by Dafiti in 2017. We started with 115 dc2.large We removed the DC2 cluster and completed the migration.

Data Lake

Data Lake Analytics Data Warehouse Data-driven

Introducing Amazon Q data integration in AWS Glue

AWS Big Data

APRIL 30, 2024

This reduces the time and effort you need to learn, build, and run data integration jobs using AWS Glue data integration engines. For example, you can ask Amazon Q Developer to generate a complete extract, transform, and load (ETL) script or code snippet for individual ETL operations.

Data Integration

Data Integration Data Lake Data Warehouse Software

How GamesKraft uses Amazon Redshift data sharing to support growing analytics workloads

AWS Big Data

NOVEMBER 13, 2023

Amazon Redshift is a fully managed data warehousing service that offers both provisioned and serverless options, making it more efficient to run and scale analytics without having to manage your data warehouse. These upstream data sources constitute the data producer components.

Data Warehouse

Data Warehouse Analytics Data Lake Data Science

The disruptive potential of open data lakehouse architectures and IBM watsonx.data

IBM Big Data Hub

JUNE 15, 2023

The proliferation of data silos also inhibits the unification and enrichment of data which is essential to unlocking the new insights. Moreover, increased regulatory requirements make it harder for enterprises to democratize data access and scale the adoption of analytics and artificial intelligence (AI).

Data Warehouse

Data Warehouse Data Lake Optimization Data-driven

Power enterprise-grade Data Vaults with Amazon Redshift – Part 1

AWS Big Data

NOVEMBER 16, 2023

Amazon Redshift is a popular cloud data warehouse, offering a fully managed cloud-based service that seamlessly integrates with an organization’s Amazon Simple Storage Service (Amazon S3) data lake, real-time streams, machine learning (ML) workflows, transactional workflows, and much more—all while providing up to 7.9x

Enterprise

Enterprise Data Warehouse Data Lake Optimization

Cloudera’s Open Data Lakehouse Supercharged with dbt Core(tm)

Cloudera

OCTOBER 7, 2022

We’re excited to announce the general availability of the open source adapters for dbt for all the engines in CDP — Apache Hive , Apache Impala , and Apache Spark, with added support for Apache Livy and Cloudera Data Engineering. The Open Data Lakehouse . Cloudera builds dbt adaptors for all engines in the open data lakehouse.

Data Warehouse

Data Warehouse Data Transformation Machine Learning Data Lake

BMW Cloud Efficiency Analytics powered by Amazon QuickSight and Amazon Athena

AWS Big Data

NOVEMBER 15, 2023

It seamlessly consolidates data from various data sources within AWS, including AWS Cost Explorer (and forecasting with Cost Explorer ), AWS Trusted Advisor , and AWS Compute Optimizer. Overview of the BMW Cloud Data Hub At the BMW Group, Cloud Data Hub (CDH) is the central platform for managing company-wide data and data solutions.

Dashboards

Dashboards Analytics Metadata Data Warehouse

How ANZ Institutional Division built a federated data platform to enable their domain teams to build data products to support business outcomes

AWS Big Data

DECEMBER 4, 2024

In today’s rapidly evolving financial landscape, data is the bedrock of innovation, enhancing customer and employee experiences and securing a competitive edge. Like many large financial institutions, ANZ Institutional Division operated with siloed data practices and centralized data management teams.

Metadata

Metadata Data Governance Data Quality Data-driven

How smava makes loans transparent and affordable using Amazon Redshift Serverless

AWS Big Data

DECEMBER 21, 2023

The data volume is in double-digit TBs with steady growth as business and data sources evolve. smava’s Data Platform team faced the challenge to deliver data to stakeholders with different SLAs, while maintaining the flexibility to scale up and down while staying cost-efficient.

Data Lake

Data Lake Data Warehouse Data-driven B2B

Lay the groundwork now for advanced analytics and AI

CIO Business Intelligence

AUGUST 3, 2023

When global technology company Lenovo started utilizing data analytics, they helped identify a new market niche for its gaming laptops, and powered remote diagnostics so their customers got the most from their servers and other devices.

Analytics

Analytics Data Lake Metadata Cost-Benefit

Sirius About Snowflake Demo: How to Create a Reporting Dashboard

CDW Research Hub

OCTOBER 20, 2020

Snowflake is a modern cloud data platform that boasts instant elasticity, secure data sharing and per-second pricing across multiple clouds. Its ability to natively load and use SQL to query semi-structured and structured data within a single system simplifies your data engineering. Have questions? Contact us.

Dashboards

Dashboards Reporting Data Warehouse Structured Data

What is DataOps? Collaborative, cross-functional analytics

CIO Business Intelligence

DECEMBER 22, 2022

DataOps (data operations) is an agile, process-oriented methodology for developing and delivering analytics. It brings together DevOps teams with data engineers and data scientists to provide the tools, processes, and organizational structures to support the data-focused enterprise. What is DataOps?

Analytics

Analytics Machine Learning Data mining Software

Apply fine-grained access and transformation on the SUPER data type in Amazon Redshift

AWS Big Data

JUNE 19, 2024

Amazon Redshift is a fast, scalable, secure, and fully managed cloud data warehouse that makes it simple and cost-effective to analyze all your data using standard SQL and your existing ETL (extract, transform, and load), business intelligence (BI), and reporting tools. All columns should masked for them.

Data Warehouse

Data Warehouse Testing Sales Structured Data

Simplify Metrics on Apache Druid With Rill Data and Cloudera

Cloudera

JULY 21, 2022

Cloudera users can securely connect Rill to a source of event stream data, such as Cloudera DataFlow , model data into Rill’s cloud-based Druid service, and share live operational dashboards within minutes via Rill’s interactive metrics dashboard or any connected BI solution. Data is made queryable in real time. Apache Hive.

Metrics

Metrics Slice and Dice Data Warehouse Dashboards

Why Enterprise Data Lineage is Critical for the Success of Your Modern Data Stack

Octopai

NOVEMBER 13, 2022

The modern data stack is a data management system built out of cloud-based data systems. A given modern data stack will usually include components for data ingestion from your data sources, data transformation, data storage, data analysis and reporting.

Enterprise

Enterprise Data Warehouse Reporting Metadata

MLOps and DevOps: Why Data Makes It Different

SAP Datasphere Powers Business at the Speed of Data

Webinars

Trending Sources

From data lakes to insights: dbt adapter for Amazon Athena now supported in dbt Cloud

Webinars

Unlocking near real-time analytics with petabytes of transaction data using Amazon Aurora Zero-ETL integration with Amazon Redshift and dbt Cloud

Database vs. Data Warehouse: What’s the Difference?

Accelerate your data workflows with Amazon Redshift Data API persistent sessions

Ingest data from Google Analytics 4 and Google Sheets to Amazon Redshift using Amazon AppFlow

Amazon Q data integration adds DataFrame support and in-prompt context-aware job creation

Data’s dark secret: Why poor quality cripples AI and growth

How Open Universities Australia modernized their data platform and significantly reduced their ETL costs with AWS Cloud Development Kit and AWS Step Functions

Birst automates the creation of data warehouses in Snowflake

How EUROGATE established a data mesh architecture using Amazon DataZone

Accelerate analytics on Amazon OpenSearch Service with AWS Glue through its native connector

7 key Microsoft Azure analytics services (plus one extra)

Ensuring Data Transformation Results with Great Expectations

Ensuring Data Transformation Quality with dbt Core

Breaking down data silos for digital success

Enriching metadata for accurate text-to-SQL generation for Amazon Athena

The Best Data Management Tools For Small Businesses

The Ultimate Guide to Modern Data Quality Management (DQM) For An Effective Data Quality Control Driven by The Right Metrics

Create a modern data platform using the Data Build Tool (dbt) in the AWS Cloud

Unlock scalability, cost-efficiency, and faster insights with large-scale data migration to Amazon Redshift

Amazon Redshift data ingestion options

Happy Birthday, CDP Public Cloud

Build and manage your modern data stack using dbt and AWS Glue through dbt-glue, the new “trusted” dbt adapter

The Ten Standard Tools To Develop Data Pipelines In Microsoft Azure

Straumann Group is transforming dentistry with data, AI

Stored procedure enhancements in Amazon Redshift

Biggest Trends in Data Visualization Taking Shape in 2022

Migrate Amazon Redshift from DC2 to RA3 to accommodate increasing data volumes and analytics demands

Introducing Amazon Q data integration in AWS Glue

How GamesKraft uses Amazon Redshift data sharing to support growing analytics workloads

The disruptive potential of open data lakehouse architectures and IBM watsonx.data

Power enterprise-grade Data Vaults with Amazon Redshift – Part 1

Cloudera’s Open Data Lakehouse Supercharged with dbt Core(tm)

BMW Cloud Efficiency Analytics powered by Amazon QuickSight and Amazon Athena

How ANZ Institutional Division built a federated data platform to enable their domain teams to build data products to support business outcomes

How smava makes loans transparent and affordable using Amazon Redshift Serverless

Lay the groundwork now for advanced analytics and AI

Sirius About Snowflake Demo: How to Create a Reporting Dashboard

What is DataOps? Collaborative, cross-functional analytics

Apply fine-grained access and transformation on the SUPER data type in Amazon Redshift

Simplify Metrics on Apache Druid With Rill Data and Cloudera

Why Enterprise Data Lineage is Critical for the Success of Your Modern Data Stack

Stay Connected