In fact, by putting a single label like AI on every step of a data-driven business process, we blur not only the process itself but also the characteristics that make each step distinct, uniquely critical, and dependent on its own specialized technologies.
While customers can perform some basic analysis within their operational or transactional databases, many still need to build custom data pipelines that use batch or streaming jobs to extract, transform, and load (ETL) data into their data warehouse for more comprehensive analysis.
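As a concrete illustration of such a pipeline (not from the article), here is a minimal batch ETL sketch in Python; the connection strings, tables, and columns are hypothetical, and production Redshift loads would typically use COPY from Amazon S3 rather than row inserts.

```python
# A minimal batch ETL sketch: extract from a transactional database,
# transform in Python, load into the warehouse. All names hypothetical;
# production Redshift loads would normally use COPY from Amazon S3.
import pandas as pd
from sqlalchemy import create_engine

source = create_engine("postgresql://user:pass@source-db:5432/app")      # hypothetical
target = create_engine("postgresql://user:pass@warehouse-host:5439/dw")  # hypothetical

# Extract: pull the last day of orders from the transactional database.
orders = pd.read_sql(
    "SELECT order_id, quantity, unit_price, currency FROM orders "
    "WHERE created_at >= CURRENT_DATE - 1",
    source,
)

# Transform: derive a revenue column and normalize currency codes.
orders["revenue"] = orders["quantity"] * orders["unit_price"]
orders["currency"] = orders["currency"].str.upper()

# Load: append the batch into a warehouse fact table.
orders.to_sql("fact_orders", target, schema="analytics", if_exists="append", index=False)
```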
Amazon Redshift is a fast, scalable, secure, and fully managed cloud data warehouse that you can use to analyze your data at scale. She has helped many customers build large-scale data warehouse solutions in the cloud and on premises. She is passionate about data analytics and data science.
What does a modern technology stack for streamlined ML processes look like? Why does data make it different? Data is at the core of any ML project, so data infrastructure is a foundational concern. Can’t we just fold it into existing DevOps best practices? How can you start applying the stack in practice today?
Your generated jobs can use a variety of data transformations, including filters, projections, unions, joins, and aggregations, giving you the flexibility to handle complex data processing requirements. Stuti Deshpande is a Big Data Specialist Solutions Architect at AWS.
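These transformation types map directly onto the PySpark API that AWS Glue jobs are built on. The following minimal sketch, not taken from the article, shows each named operation; the S3 paths, table schemas, and column names are hypothetical.

```python
# A minimal PySpark sketch of the transformation types listed above:
# filter, projection, union, join, and aggregation. Paths and column
# names are hypothetical.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("transform-sketch").getOrCreate()

orders = spark.read.parquet("s3://my-bucket/orders/")       # hypothetical path
returns = spark.read.parquet("s3://my-bucket/returns/")
customers = spark.read.parquet("s3://my-bucket/customers/")

result = (
    orders.filter(F.col("status") == "shipped")                              # filter
          .select("order_id", "customer_id", "amount")                       # projection
          .unionByName(returns.select("order_id", "customer_id", "amount"))  # union
          .join(customers, "customer_id")                                    # join
          .groupBy("region")                                                 # aggregation
          .agg(F.sum("amount").alias("total_amount"))
)
result.write.mode("overwrite").parquet("s3://my-bucket/output/")
```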
With the addition of these technologies alongside existing systems like terminal operating systems (TOS) and SAP, the number of data producers has grown substantially. However, much of this data remains siloed, and making it accessible for other purposes and departments remains complex. She can be reached via LinkedIn.
Data is the foundation of innovation, agility, and competitive advantage in today’s digital economy. As technology and business leaders, your strategic initiatives, from AI-powered decision-making to predictive insights and personalized experiences, are all fueled by data. Data quality is no longer a back-office concern.
Managing large-scale data warehouse systems is known to be administratively heavy and costly, and to lead to analytic silos. The good news is that Snowflake, the cloud data platform, lowers costs and administrative overhead. When did you begin a technology partnership with Snowflake, and why?
Selecting the strategies and tools for validating data transformations and data conversions in your data pipelines. Data transformations and data conversions are crucial to ensure that raw data is organized, processed, and ready for useful analysis.
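Two of the most common validation strategies are row-count reconciliation between input and output and column-level completeness and domain checks. Here is a minimal sketch of both in Python, assuming pandas DataFrames and hypothetical column names (`order_id`, `amount`):

```python
# A minimal sketch of two common validation strategies for a data
# transformation step: row-count reconciliation and column-level
# completeness/domain checks. Column names are hypothetical.
import pandas as pd

def validate_transformation(source_df: pd.DataFrame, target_df: pd.DataFrame) -> list[str]:
    errors = []

    # Reconciliation: the transform should not silently drop rows.
    if len(target_df) != len(source_df):
        errors.append(f"row count mismatch: {len(source_df)} in, {len(target_df)} out")

    # Completeness: key columns must not contain nulls after conversion.
    for col in ("order_id", "amount"):
        nulls = int(target_df[col].isna().sum())
        if nulls:
            errors.append(f"{col}: {nulls} null values after transformation")

    # Domain check: converted amounts should be non-negative.
    if (target_df["amount"] < 0).any():
        errors.append("amount: negative values after conversion")

    return errors

# Usage: fail the pipeline loudly if any check fails.
# problems = validate_transformation(raw, transformed)
# assert not problems, problems
```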
The recent announcement of the Microsoft Intelligent Data Platform makes that more obvious, though analytics is only one part of that new brand. Azure Data Factory. Azure Data Lake Analytics. Data warehouses are designed for questions you already know you want to ask about your data, again and again.
Diagram 1: Overall architecture of the solution, using AWS Step Functions, Amazon Redshift, and Amazon S3. The following AWS services were used to shape our new ETL architecture: Amazon Redshift, a fully managed, petabyte-scale data warehouse service in the cloud. Diagram 2 shows this workflow.
Given the importance of sharing information among diverse disciplines in the era of digital transformation, this concept is arguably as important as ever. The aim is to normalize and aggregate data that originates in various pockets of the enterprise, and eventually make it available to analysts across the organization.
How dbt Core helps data teams test, validate, and monitor complex data transformations and conversions. dbt Core, an open-source framework for developing, testing, and documenting SQL-based data transformations, has become a must-have tool for modern data teams as the complexity of data pipelines grows.
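For teams that want to run those checks from a pipeline rather than the command line, dbt Core (1.5 and later) exposes a programmatic entry point. A minimal sketch, assuming a dbt project and profiles.yml are already configured in the working directory:

```python
# A minimal sketch of running dbt Core tests programmatically, using
# the dbtRunner API available in dbt-core 1.5+; assumes a configured
# dbt project and profile in the working directory.
from dbt.cli.main import dbtRunner

runner = dbtRunner()
# Equivalent to `dbt test` on the command line: executes schema tests
# such as not_null and unique against the transformed models.
result = runner.invoke(["test"])
if not result.success:
    raise SystemExit("dbt tests failed; blocking downstream loads")
```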
Large-scale data warehouse migration to the cloud is a complex and challenging endeavor that many organizations undertake to modernize their data infrastructure, enhance data management capabilities, and unlock new business opportunities. This ensures the new data platform can meet current and future business goals.
Today most of a company’s operations and strategic decisions heavily rely on data, so the importance of data quality is higher than ever. And indeed, low-quality data is the leading cause of failure for advanced data and technology initiatives, to the tune of $9.7 Here, it all comes down to the data transformation error rate.
Enterprise data is brought into data lakes and data warehouses to carry out analytical, reporting, and data science use cases using AWS analytical services like Amazon Athena, Amazon Redshift, Amazon EMR, and so on. Can it also help write SQL queries? The answer is yes.
The company’s orthodontics business, for instance, makes heavy use of image processing to the point that unstructured data is growing at a pace of roughly 20% to 25% per month. Advances in imaging technology present Straumann Group with the opportunity to provide its customers with new capabilities to offer their clients.
Azure Synapse Analytics Pipelines: Azure Synapse Analytics (formerly SQL Data Warehouse) provides data exploration, data preparation, data management, and enterprise data warehousing capabilities. It does the job.
In this post, we delve into a case study for a retail use case, exploring how the Data Build Tool (dbt) was used effectively within an AWS environment to build a high-performing, efficient, and modern data platform. It does this by helping teams handle the T in ETL (extract, transform, and load) processes.
There are countless examples of big data transforming many different industries. There is no disputing the fact that the collection and analysis of massive amounts of unstructured data has been a huge breakthrough. This is something that you can learn more about in just about any technology blog.
It comprises commodity cloud object storage, open data and open table formats, and high-performance open-source query engines. To help organizations scale AI workloads, we recently announced IBM watsonx.data, a data store built on an open data lakehouse architecture and part of the watsonx AI and data platform.
By treating data as a product, the bank is positioned not only to overcome current challenges, but to unlock new opportunities for growth, customer service, and competitive advantage. These nodes can implement analytical platforms like data lakehouses, data warehouses, or data marts, all united by producing data products.
Amazon Q Developer can now generate complex data integration jobs with multiple sources, destinations, and data transformations. Generated jobs can use a variety of data transformations, including filter, project, union, join, and custom user-supplied SQL.
A given modern data stack will usually include components for data ingestion from your data sources, data transformation, data storage, and data analysis and reporting. In the ideal modern data stack, it should be easy to connect these components and no big deal to switch one component out for another.
When global technology company Lenovo started using data analytics, it identified a new market niche for its gaming laptops and powered remote diagnostics so its customers got the most from their servers and other devices. “Each of the acquired companies had multiple data sets with different primary keys,” says Hepworth.
The difference lies in when and where data transformation takes place. In ETL, data is transformed before it’s loaded into the data warehouse. In ELT, raw data is loaded into the data warehouse first, then transformed directly within the warehouse.
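To make the contrast concrete, here is a runnable toy example, not from the original article, that uses an in-memory SQLite database to stand in for the warehouse; the table shapes are hypothetical:

```python
# A runnable toy contrast of ETL vs ELT using an in-memory SQLite
# database to stand in for the warehouse; schemas are hypothetical.
import sqlite3

source_rows = [(1, 2, 9.99), (2, 1, 24.50)]  # (order_id, quantity, unit_price)
dw = sqlite3.connect(":memory:")

# ETL: transform in the pipeline first, then load only the finished shape.
dw.execute("CREATE TABLE fact_orders_etl (order_id INT, revenue REAL)")
dw.executemany(
    "INSERT INTO fact_orders_etl VALUES (?, ?)",
    [(oid, qty * price) for oid, qty, price in source_rows],  # transform outside the warehouse
)

# ELT: load raw rows first, then transform inside the warehouse with SQL.
dw.execute("CREATE TABLE raw_orders (order_id INT, quantity INT, unit_price REAL)")
dw.executemany("INSERT INTO raw_orders VALUES (?, ?, ?)", source_rows)
dw.execute("""
    CREATE TABLE fact_orders_elt AS
    SELECT order_id, quantity * unit_price AS revenue FROM raw_orders
""")
```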
Amazon Redshift is a popular cloud data warehouse, offering a fully managed cloud-based service that seamlessly integrates with an organization’s Amazon Simple Storage Service (Amazon S3) data lake, real-time streams, machine learning (ML) workflows, transactional workflows, and much more, all while providing up to 7.9x
Cloudera users can securely connect Rill to a source of event stream data, such as Cloudera DataFlow, model data into Rill’s cloud-based Druid service, and share live operational dashboards within minutes via Rill’s interactive metrics dashboard or any connected BI solution.
Amazon Redshift is a fast, scalable, secure, and fully managed cloud data warehouse that makes it simple and cost-effective to analyze all your data using standard SQL and your existing ETL (extract, transform, and load), business intelligence (BI), and reporting tools. All columns should be masked for them.
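Redshift implements this kind of requirement with dynamic data masking policies. The sketch below, which is not from the article, submits hypothetical masking DDL from Python using the redshift_connector driver; check the exact policy syntax against the Redshift documentation before use.

```python
# A hedged sketch of Redshift dynamic data masking driven from Python
# via redshift_connector; the cluster endpoint, table, column, and role
# names are all hypothetical.
import redshift_connector

conn = redshift_connector.connect(
    host="my-cluster.abc123.us-east-1.redshift.amazonaws.com",  # hypothetical
    database="dev",
    user="admin",
    password="...",
)
cur = conn.cursor()

# Define a policy that replaces the column value with a fixed mask.
cur.execute("""
    CREATE MASKING POLICY mask_all
    WITH (col VARCHAR(256))
    USING ('****'::VARCHAR(256))
""")

# Attach it so members of the analyst role see only masked values.
cur.execute("""
    ATTACH MASKING POLICY mask_all
    ON customers(email)
    TO ROLE analyst
""")
conn.commit()
```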
In legacy analytical systems such as enterprise data warehouses, the scalability challenges of a system were primarily associated with computational scalability, i.e., the ability of a data platform to handle larger volumes of data in an agile and cost-efficient way.
We will create a Glue Studio job, add events and venue data from the SFTP server, carry out data transformations, and load the transformed data to Amazon S3. Sean is passionate about helping businesses harness the power of data to drive innovation and growth. Select Visual ETL in the central pane.
As the volume and complexity of analytics workloads continue to grow, customers are looking for more efficient and cost-effective ways to ingest and analyze data. This enables organizations to streamline data integration and analytics with OpenSearch Service. Select the secret you created, and on the Actions menu, choose Delete.
Data operations (or data production) is a series of pipeline procedures that take raw data, move it through a series of processing and transformation steps, and output finished products in the form of dashboards, predictions, data warehouses, or whatever the business requires. Their product is the data.
The modern data stack is a combination of various software tools used to collect, process, and store data on a well-integrated cloud-based data platform. It is valued for its robustness, speed, and scalability in handling data. A typical modern data stack consists of the following: A data warehouse.
SafetyCulture is a global technology company that puts the power of continuous improvement into everyone’s hands. Amazon Redshift is a fully managed data warehouse service that tens of thousands of customers use to manage analytics at scale. Refer to Getting started data sharing using the console for setup steps.
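The console walkthrough referenced above has a SQL equivalent. Below is a hedged sketch of the producer-side statements, submitted through the Redshift Data API with boto3; the cluster, schema, share name, and consumer namespace UUID are all hypothetical.

```python
# A hedged sketch of Redshift data sharing set up via SQL through the
# Redshift Data API; all identifiers are hypothetical, and the consumer
# namespace UUID must come from your own account.
import boto3

client = boto3.client("redshift-data")

producer_statements = [
    "CREATE DATASHARE sales_share",
    "ALTER DATASHARE sales_share ADD SCHEMA analytics",
    "ALTER DATASHARE sales_share ADD ALL TABLES IN SCHEMA analytics",
    # Grant access to the consumer cluster's namespace (hypothetical UUID).
    "GRANT USAGE ON DATASHARE sales_share TO NAMESPACE "
    "'11111111-2222-3333-4444-555555555555'",
]
for sql in producer_statements:
    client.execute_statement(
        ClusterIdentifier="producer-cluster",  # hypothetical
        Database="dev",
        DbUser="admin",
        Sql=sql,
    )
```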
Chime is a financial technology company founded on the premise that basic banking services should be helpful, easy, and free. However, our legacy data warehouse-based solution was not equipped for this challenge. Khandu Shinde is a Staff Engineer focused on Big Data Platforms and Solutions for Chime.
As more businesses use AI systems and the technology continues to mature and change, improper use could expose a company to significant financial, operational, regulatory and reputational risks. But how trustworthy is that training data? Artificial intelligence (AI) adoption is still in its early stages.
They defined it as: “A data lakehouse is a new, open data management architecture that combines the flexibility, cost-efficiency, and scale of data lakes with the data management and ACID transactions of data warehouses, enabling business intelligence (BI) and machine learning (ML) on all data.”
It gained rapid popularity given its support for data transformations, streaming, and SQL. But it never co-existed amicably within existing data lake environments. Fast-forward almost 15 years, and reality has clearly set in on the trade-offs and compromises this technology entailed.
Data architects and data modelers who specialize in areas such as schema design, identifying query access patterns, and building and maintaining data warehouses. The problem requires use of one or two foundational data structures and details some sort of analysis that we’d like performed on a dataset.
Here at Sisense, we think about this flow in five linear layers: Raw: this is our data in its raw form within a data warehouse. We follow an ELT (Extract, Load, Transform) practice, as opposed to ETL, in which we opt to transform the data in the warehouse in the stages that follow.
As Cussatt put it, “data transformation isn’t about the IT, but about enabling the mission to be able to serve the veterans.” We have multi-tiered governance to ensure we’re investing the dollars in appropriate technologies and making sure we’re getting the best value. Digital transformation can be an overwhelming undertaking.
In the technology industry, even the most incremental trends often get framed in next-generation terminology. DataOps sprang up to connect data sources to data consumers. The data warehouse and analytical data stores moved to the cloud and disaggregated into the data mesh. Tools became stacks.
The API retrieves data at runtime from an Amazon Aurora PostgreSQL-Compatible Edition database for end-user consumption. To populate the database, the Infomedia team developed a data pipeline using Amazon Simple Storage Service (Amazon S3) for data storage, AWS Glue for data transformations, and Apache Hudi for CDC and record-level updates.
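As an illustration of the Hudi piece of such a pipeline, here is a hedged PySpark sketch of a record-level upsert, not taken from the article; it assumes the Hudi Spark bundle is on the classpath, and the paths, table name, and key fields are hypothetical.

```python
# A hedged sketch of a Hudi upsert for record-level CDC updates; assumes
# the Hudi Spark bundle is available, and all names are hypothetical.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("hudi-cdc").getOrCreate()
changes = spark.read.parquet("s3://my-bucket/cdc/changes/")  # hypothetical CDC batch

(changes.write.format("hudi")
    .option("hoodie.table.name", "vehicle_data")
    .option("hoodie.datasource.write.recordkey.field", "record_id")    # dedupe key
    .option("hoodie.datasource.write.precombine.field", "updated_at")  # latest wins
    .option("hoodie.datasource.write.operation", "upsert")
    .mode("append")
    .save("s3://my-bucket/hudi/vehicle_data/"))
```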