Data Processing, Data Warehouse and Machine Learning

Oracle Wants to Be the Database for AI

David Menninger's Analyst Perspectives

MAY 15, 2025

Oracle recently hosted its annual Database Analyst Summit, sharing the vision and strategy for its data platform. While much of the event was under non-disclosure as product plans and launch schedules are finalized, it still served as a useful recap of the broad portfolio of data platform capabilities that Oracle has to offer.

Data Lake, Data Warehouse, Machine Learning, Software

The DataOps Vendor Landscape, 2021

DataKitchen

APRIL 13, 2021

We have also included vendors for the specific use cases of ModelOps, MLOps, DataGovOps and DataSecOps, which apply DataOps principles to machine learning, AI, data governance, and data security operations. Dagster / ElementL: A data orchestrator for machine learning, analytics, and ETL.

Testing, Machine Learning, Consulting, Data Science

Unlocking near real-time analytics with petabytes of transaction data using Amazon Aurora Zero-ETL integration with Amazon Redshift and dbt Cloud

AWS Big Data

NOVEMBER 27, 2024

While customers can perform some basic analysis within their operational or transactional databases, many still need to build custom data pipelines that use batch or streaming jobs to extract, transform, and load (ETL) data into their data warehouse for more comprehensive analysis.

Data Warehouse, Analytics, Testing, Sales

How EUROGATE established a data mesh architecture using Amazon DataZone

AWS Big Data

JANUARY 15, 2025

The following requirements were essential to the decision to adopt a modern data mesh architecture. Domain-oriented ownership and data-as-a-product: EUROGATE aims to enable scalable and straightforward data sharing across organizational boundaries, and to eliminate centralized bottlenecks and complex data pipelines.

IoT, Machine Learning, Metadata, Data-driven

The future of data: A 5-pillar approach to modern data management

CIO Business Intelligence

DECEMBER 11, 2024

It was not alive because the business knowledge required to turn data into value was confined to individuals’ minds, Excel sheets or lost in analog signals. We are now deciphering rules from patterns in data, embedding business knowledge into ML models, and soon, AI agents will leverage this data to make decisions on behalf of companies.

Management, Data Governance, Data Science, Reporting

Migrate a petabyte-scale data warehouse from Actian Vectorwise to Amazon Redshift

AWS Big Data

MAY 30, 2024

Amazon Redshift is a fast, scalable, and fully managed cloud data warehouse that allows you to process and run your complex SQL analytics workloads on structured and semi-structured data. The system had an integration with legacy backend services that were all hosted on premises.

Data Warehouse, Data Lake, Cost-Benefit, Structured Data

How Will The Cloud Impact Data Warehousing Technologies?

Smart Data Collective

APRIL 8, 2020

The data warehousing market dates back to the 1970s, when computer scientist Bill Inmon first coined the term ‘data warehouse’. Built on on-premises servers, the early data warehouses were designed to operate at just gigabyte scale. Cloud-based solutions are the future of the data warehousing market.

Technology, Data Warehouse, Big Data, Machine Learning

Take Your SQL Skills To The Next Level With These Popular SQL Books

datapine

SEPTEMBER 27, 2022

With a MySQL dashboard builder, for example, you can connect all the data with a few clicks. A host of notable brands and retailers with colossal inventories and multiple site pages use SQL to enhance their site’s structure, functionality, and MySQL reporting processes. It is a must-read for understanding data warehouse design.

Business Intelligence, Data Warehouse, Data Processing, Data mining

Scaling RISE with SAP data and AWS Glue

AWS Big Data

NOVEMBER 29, 2024

Customers often want to augment and enrich SAP source data with other non-SAP source data. Such analytic use cases can be enabled by building a data warehouse or data lake. Customers can now use the AWS Glue SAP OData connector to extract data from SAP.

Visualization, Data Processing, Data-driven, Cost-Benefit

5 misconceptions about cloud data warehouses

IBM Big Data Hub

FEBRUARY 2, 2023

In today’s world, data warehouses are a critical component of any organization’s technology ecosystem. They provide the backbone for a range of use cases such as business intelligence (BI) reporting, dashboarding, and machine-learning (ML)-based predictive analytics, that enable faster decision making and insights.

Data Warehouse, Cost-Benefit, Unstructured Data, Data Architecture

Deciphering The Seldom Discussed Differences Between Data Mining and Data Science

Smart Data Collective

NOVEMBER 18, 2020

Where to Use Data Science? Data science is used in different areas of our lives and can help companies deal with the following situations: using predictive analytics to prevent fraud; using machine learning to streamline marketing practices; and using data analytics to create more effective actuarial processes.

Data mining, Data Science, Informatics, Statistics

Build a secure data visualization application using the Amazon Redshift Data API with AWS IAM Identity Center

AWS Big Data

MARCH 6, 2025

Tens of thousands of customers use Amazon Redshift for modern data analytics at scale, delivering up to three times better price-performance and seven times better throughput than other cloud data warehouses. He has over 19 years of experience architecting, building, leading, and maintaining big data platforms.

Visualization, Sales, Data Warehouse, Management

Data science vs. machine learning: What’s the difference?

IBM Big Data Hub

JULY 6, 2023

While data science and machine learning are related, they are very different fields. In a nutshell, data science brings structure to big data while machine learning focuses on learning from the data itself. What is data science? What is machine learning?

Machine Learning, Data Science, Statistics, Deep Learning

Design a data mesh pattern for Amazon EMR-based data lakes using AWS Lake Formation with Hive metastore federation

AWS Big Data

JUNE 10, 2024

One of the key challenges in modern big data management is facilitating efficient data sharing and access control across multiple EMR clusters. Organizations have multiple Hive data warehouses across EMR clusters, where the metadata gets generated. The producer account will host the EMR cluster and S3 buckets.

Data Lake, Metadata, Data Warehouse, Data Processing

Migrate Microsoft Azure Synapse Analytics to Amazon Redshift using AWS SCT

AWS Big Data

OCTOBER 18, 2023

Amazon Redshift is a fast, fully managed, petabyte-scale data warehouse that provides the flexibility to use provisioned or serverless compute for your analytical workloads. Modern analytics is much wider than SQL-based data warehousing. You can isolate workloads using data sharing, while using the same underlying datasets.

Analytics, Data Warehouse, Dashboards, Testing

Preparing the foundations for Generative AI

CIO Business Intelligence

FEBRUARY 20, 2024

All data is held in a lake-centric hub, and protected by a strong, universal security model, with data loss prevention and protection for sensitive data, and features for auditing and forensic investigation already built-in.

Cost-Benefit, Data Lake, Data Warehouse, Data Processing

From Excel to AI: How Liberty Dental revolutionized care management

CIO Business Intelligence

OCTOBER 17, 2024

So, we aggregated all this data, applied some machine learning algorithms on top of it and then fed it into large language models (LLMs) and now use generative AI (genAI), which gives us an output of these care plans. We had a kind of small data warehouse on-prem. But the biggest point is data governance.

Management, Insurance, ROI, Cost-Benefit

Empowering data-driven excellence: How the Bluestone Data Platform embraced data mesh for success

AWS Big Data

FEBRUARY 27, 2024

Four-layered data lake and data warehouse architecture – The architecture comprises four layers, including the analytical layer, which houses purpose-built facts and dimension datasets that are hosted in Amazon Redshift. This enables data-driven decision-making across the organization.

Data-driven, Data Lake, Data Quality, Data Governance

The Multifaceted Value Proposition of the Cloudera Data Platform

Cloudera

FEBRUARY 22, 2021

Providing a comprehensive set of diverse analytical frameworks for different use cases across the data lifecycle (data streaming, data engineering, data warehousing, operational database and machine learning) while at the same time seamlessly integrating data content via the Shared Data Experience (SDX), a layer that separates compute and storage.

Cost-Benefit, Data Warehouse, Data Processing, Data Governance

Unlock scalability, cost-efficiency, and faster insights with large-scale data migration to Amazon Redshift

AWS Big Data

AUGUST 1, 2024

Large-scale data warehouse migration to the cloud is a complex and challenging endeavor that many organizations undertake to modernize their data infrastructure, enhance data management capabilities, and unlock new business opportunities. This makes sure the new data platform can meet current and future business goals.

Data Warehouse, KPI, Optimization, Cost-Benefit

How Gilead used Amazon Redshift to quickly and cost-effectively load third-party medical claims data

AWS Big Data

NOVEMBER 8, 2023

Because Gilead is expanding into biologics and large molecule therapies, and has an ambitious goal of launching 10 innovative therapies by 2030, there is heavy emphasis on using data with AI and machine learning (ML) to accelerate the drug discovery pipeline. You pay only for the compute resources and storage that you use.

Data Lake, Data Warehouse, Cost-Benefit, Optimization

What is business intelligence? Transforming data into business insights

CIO Business Intelligence

JANUARY 20, 2023

Improved employee satisfaction: Providing business users access to data without having to contact analysts or IT can reduce friction, increase productivity, and facilitate faster results. Whereas BI studies historical data to guide business decision-making, business analytics is about looking forward.

Business Intelligence, Dashboards, Data mining, OLAP

Amazon Redshift data ingestion options

AWS Big Data

SEPTEMBER 5, 2024

The currently available choices include: The Amazon Redshift COPY command can load data from Amazon Simple Storage Service (Amazon S3), Amazon EMR, Amazon DynamoDB, or remote hosts over SSH. This native feature of Amazon Redshift uses massively parallel processing (MPP) to load objects directly from data sources into Redshift tables.

IoT, Data Warehouse, Cost-Benefit, Reporting

Power enterprise-grade Data Vaults with Amazon Redshift – Part 2

AWS Big Data

NOVEMBER 16, 2023

Amazon Redshift is a popular cloud data warehouse, offering a fully managed cloud-based service that seamlessly integrates with an organization’s Amazon Simple Storage Service (Amazon S3) data lake, real-time streams, machine learning (ML) workflows, transactional workflows, and much more—all while providing up to 7.9x

Enterprise, Data Warehouse, Snapshot, Cost-Benefit

Building and Evaluating GenAI Knowledge Management Systems using Ollama, Trulens and Cloudera

Cloudera

MAY 23, 2024

In modern enterprises, the exponential growth of data means organizational knowledge is distributed across multiple formats, ranging from structured data stores such as data warehouses to multi-format data stores like data lakes. Langchain) and LLM evaluations (e.g.

Management, Metrics, Data Processing, Machine Learning

Use AWS Glue to streamline SFTP data processing

AWS Big Data

AUGUST 13, 2024

In today’s data-driven world, seamless integration and transformation of data across diverse sources into actionable insights is paramount. With AWS Glue, you can discover and connect to hundreds of diverse data sources and manage your data in a centralized data catalog. Choose Store a new secret.

Data Processing, Visualization, Data Lake

How ANZ Institutional Division built a federated data platform to enable their domain teams to build data products to support business outcomes

AWS Big Data

DECEMBER 4, 2024

These nodes can implement analytical platforms like data lakehouses, data warehouses, or data marts, all united by producing data products. By treating the data as a product, the outcome is a reusable asset that outlives a project and meets the needs of the enterprise consumer.

Metadata, Data Governance, Data Quality, Data-driven

The disruptive potential of open data lakehouse architectures and IBM watsonx.data

IBM Big Data Hub

JUNE 15, 2023

It comprises commodity cloud object storage, open data and open table formats, and high-performance open-source query engines. To help organizations scale AI workloads, we recently announced IBM watsonx.data, a data store built on an open data lakehouse architecture and part of the watsonx AI and data platform.

Data Warehouse, Data Lake, Optimization, Data-driven

Bringing More AI to Snowflake, the Data Cloud

DataRobot Blog

FEBRUARY 28, 2023

Integrating different systems, data sources, and technologies within an ecosystem can be difficult and time-consuming, leading to inefficiencies, data silos, broken machine learning models, and locked ROI. Exploratory Data Analysis: After we connect to Snowflake, we can start our ML experiment.

Data Processing, Experimentation, Machine Learning, Data Warehouse

96 Percent of Businesses Can’t Be Wrong: How Hybrid Cloud Came to Dominate the Data Sector

Cloudera

JANUARY 26, 2022

Network operating systems let computers communicate with each other; and data storage grew—a 5MB hard drive was considered limitless in 1983 (when compared to a magnetic drum with memory capacity of 10 kB from the 1960s). The amount of data being collected grew, and the first data warehouses were developed.

Data Processing, IoT, Data Warehouse, Cost-Benefit

A Guide To Starting A Career In Business Intelligence & The BI Skills You Need

datapine

MARCH 31, 2022

On the flip side, if you enjoy diving deep into the technical side of things, with the right mix of skills for business intelligence you can work on a host of incredibly interesting problems that will keep you in flow for hours on end. This could involve anything from learning SQL to buying some textbooks on data warehouses.

Business Intelligence, Statistics, Visualization, Data-driven

The New Cloudera

Cloudera

JANUARY 3, 2019

Our pre-merger customer bases have very little overlap, giving us a considerable enterprise installed base whose demand for IoT, analytics, data warehousing, and machine learning continues to grow. It’s clear today that the data warehouse industry is undergoing a major transformation. We intend to win.

Machine Learning, IoT, Data Warehouse, Enterprise

Perform secure database write-backs with Amazon QuickSight

AWS Big Data

MAY 10, 2023

A write-back is the ability to update a data mart, data warehouse, or any other database backend from within BI dashboards and analyze the updated data in near-real time within the dashboard itself. AnyCompany currently uses Amazon Redshift as their enterprise data warehouse platform and QuickSight as their BI solution.

Dashboards, Data Warehouse, Visualization, Data Processing

Get Your Analytics Insights Instantly – Without Abandoning Central IT

Cloudera

JANUARY 21, 2021

While cloud-native, point-solution data warehouse services may serve your immediate business needs, there are dangers to the corporation as a whole when you do your own IT this way. Cloudera Data Warehouse (CDW) is here to save the day! CDW is an integrated data warehouse service within Cloudera Data Platform (CDP).

Data Warehouse, Data Lake, IT, Analytics

Governing data in relational databases using Amazon DataZone

AWS Big Data

MAY 7, 2024

It also makes it easier for engineers, data scientists, product managers, analysts, and business users to access data throughout an organization to discover, use, and collaborate to derive data-driven insights. The architecture illustrates how the solution works in a multi-account environment, which is a common scenario.

Metadata, Data Lake, Data Processing, Data-driven

Addressing the Three Scalability Challenges in Modern Data Platforms

Cloudera

NOVEMBER 22, 2021

In legacy analytical systems such as enterprise data warehouses, the scalability challenges of a system were primarily associated with computational scalability, i.e., the ability of a data platform to handle larger volumes of data in an agile and cost-efficient way. Introduction. public, private, hybrid cloud)?

Data Processing, Data Warehouse, Enterprise, Visualization

Build a serverless transactional data lake with Apache Iceberg, Amazon EMR Serverless, and Amazon Athena

AWS Big Data

MARCH 10, 2023

However, as data processing at scale solutions grow, organizations need to build more and more features on top of their data lakes. Additionally, the task of maintaining and managing files in the data lake can be tedious and sometimes complex. Data can be organized into three different zones, as shown in the following figure.

Data Lake, Sales, Data Warehouse, Snapshot

Real-time streaming data top picks you cannot miss at AWS re:Invent 2023

AWS Big Data

NOVEMBER 8, 2023

In today’s data-driven landscape, the quality of data is the foundation upon which the success of organizations and innovations stands. High-quality data is not just about accuracy; it’s also about timeliness. You’ll learn the “why” behind the solution and see it come to life—complete with the inevitable errors.

Data-driven, Machine Learning, Data Lake, Cost-Benefit

CDP Private Cloud is a Game-changer for Partners

Cloudera

SEPTEMBER 2, 2020

In short, CDP Private Cloud is a game-changer for Cloudera partners as it provides opportunities to help their customers modernize their data platform by breaking up monolithic architectures without leaving their data centers! Over a third of these enterprises are actively executing on a strategy to move to hybrid IT.

Cost-Benefit, Data Warehouse, Data Lake, Machine Learning

South Africa’s King Price Insurance moves to cloud as business grows

CIO Business Intelligence

MARCH 16, 2022

We’ve also started experimenting with specific cloud services that focus on artificial intelligence (AI) and machine learning – from traditional optimisation models published as bespoke intelligent services to proprietary intelligent services in the form of chatbots, text, voice and image processing. Who did you involve and why?

Insurance, Cost-Benefit, Data Processing, Strategy

Drinking our own champagne – Cloudera upgrades to CDP Private Cloud

Cloudera

APRIL 21, 2021

We took a pre-upgrade downtime in production to accomplish some of the prerequisite tasks, like database and operating system upgrades on our master hosts. That downtime also allowed us to test the disaster recovery environment that our 24×7 users would interact with during the production upgrade.

Testing, Data Processing, Interactive, Data Warehouse

Generative AI for the Enterprise

Cloudera

MAY 31, 2023

An example would be asking about the price of CDW (Cloudera Data Warehouse). Because the language model doesn’t have access to the enterprise price list and standard discount rates, the answer will probably provide the typical rates for a collision damage waiver (also abbreviated as CDW); the answer will be factual but out of context.

Enterprise, Data Processing, Machine Learning, Experimentation

Themes and Conferences per Pacoid, Episode 11

Domino Data Lab

JULY 2, 2019

Doesn’t this seem like a worthy goal for machine learning: to make the machines learn to work more effectively? pointed out in “The Case for Learned Index Structures” (see video) the internal smarts (B-trees, etc.) of relational databases represent early forms of machine learning. With me so far?

Metadata, Data Science, Machine Learning, Data-driven

The value of CDP Public Cloud over legacy Hadoop-on-IaaS implementations

Cloudera

MAY 18, 2021

CDP Public Cloud leverages the elastic nature of the cloud hosting model to align spend on Cloudera subscription (measured in Cloudera Consumption Units or CCUs) with actual usage of the platform. Machine Learning Prototypes. Experience configuration / use case deployment: At the data lifecycle experience level (e.g.,

Cost-Benefit, Data-driven, Machine Learning, Data Warehouse
