Their terminal operations rely heavily on seamless data flows and the management of vast volumes of data. Recently, EUROGATE developed a digital twin for its Container Terminal Hamburg (CTH), generating millions of data points every second from Internet of Things (IoT) devices attached to its container handling equipment (CHE).
Beyond breaking down silos, modern data architectures need to provide interfaces that make it easy for users to consume data with tools fit for their jobs. Data must be able to move freely to and from data warehouses, data lakes, and data marts, and interfaces must make that data easy to consume.
We often see requests from customers who began their data journey by building data lakes on Microsoft Azure and now want to extend access to that data to AWS services. In such scenarios, data engineers face challenges in connecting to and extracting data from storage containers on Microsoft Azure.
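To make the scenario concrete, here is a minimal sketch of one way to bridge the two clouds: reading an object from Azure Blob Storage and landing it in Amazon S3. It assumes the azure-storage-blob and boto3 packages are installed, and every name (connection string, container, blob path, bucket) is a hypothetical placeholder, not a real resource.

```python
# Copy one object from an Azure Blob Storage container to an S3 bucket.
import boto3
from azure.storage.blob import BlobServiceClient

# Connect to the Azure storage account that backs the existing data lake
azure_client = BlobServiceClient.from_connection_string("<AZURE_STORAGE_CONNECTION_STRING>")
blob_client = azure_client.get_blob_client(
    container="raw-data", blob="events/2024/01/events.parquet"
)

# Download the blob into memory (stream to disk instead for large objects)
payload = blob_client.download_blob().readall()

# Write the same object into S3 so AWS services can consume it
s3 = boto3.client("s3")
s3.put_object(Bucket="my-datalake-bucket", Key="events/2024/01/events.parquet", Body=payload)
```

For large datasets, a managed transfer service or a distributed engine would replace this per-object loop; the sketch only shows the connectivity shape.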
Among all the hot analytics initiatives to choose from (big data, IoT, NLP, data storytelling, cognitive BI, GDPR), plain old reporting is considered the most important strategic initiative. But seriously, reporting?
Third, some services require you to set up and manage the compute resources used for federated connectivity, and capabilities like connection testing and data preview aren't available in all services. To address these challenges, we launched Amazon SageMaker Lakehouse unified data connectivity.
When Cargill started putting IoT sensors into shrimp ponds, then-CIO Justin Kershaw realized that the $130 billion agricultural business was becoming a digital business. To help determine where IT should stop and IoT product engineering should start, Kershaw did not call the CIOs of other food and agricultural businesses to compare notes.
"The original proof of concept was to have one data repository ingesting data from 11 sources, including flat files and data stored via APIs on premises and in the cloud," Pruitt says. "There are a lot of variables that determine what should go into the data lake and what will probably stay on premises," he says.
Origin: A point of data entry in a given pipeline. Examples of an origin include storage systems like data lakes and data warehouses, and data sources such as IoT devices, transaction processing applications, APIs, or social media. Destination: the endpoint to which the pipeline delivers data.
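The origin/destination shape described here can be sketched as a small data structure. The class and function names below are invented for illustration; they are not from any particular pipeline framework.

```python
# Illustrative origin -> transforms -> destination shape of a data pipeline.
from dataclasses import dataclass
from typing import Callable, Iterable

@dataclass
class Pipeline:
    origin: Callable[[], Iterable[dict]]       # where records enter (lake, API, IoT feed)
    transforms: list[Callable[[dict], dict]]   # per-record processing steps
    destination: Callable[[dict], None]        # where records are delivered

    def run(self) -> None:
        for record in self.origin():
            for step in self.transforms:
                record = step(record)
            self.destination(record)

# Example wiring: an in-memory "origin", one normalization step, print as "destination"
pipeline = Pipeline(
    origin=lambda: [{"device_id": "A1", "temp_c": "21.5"}],
    transforms=[lambda r: {**r, "temp_c": float(r["temp_c"])}],
    destination=print,
)
pipeline.run()
```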
The company has already undertaken pilot projects in Egypt, India, Japan, and the US that use Azure IoT Hub and IoT Edge to help manufacturing technicians analyze insights to create improvements in the production of baby care and paper products. It also involves large amounts of data and near real-time processing.
This typically requires a data warehouse for analytics needs that can ingest and handle real-time data at huge volumes. Snowflake is a cloud-native platform that eliminates the need for separate data warehouses, data lakes, and data marts, allowing secure data sharing across the organization.
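As a hedged sketch of that ingestion pattern, the snippet below loads staged files into a Snowflake table with the official snowflake-connector-python package. The account identifier, credentials, warehouse, stage, and table names are all placeholders I've invented, not real objects.

```python
# Load files already landed in a named stage into a Snowflake table.
import snowflake.connector

conn = snowflake.connector.connect(
    account="<ACCOUNT_IDENTIFIER>",
    user="<USER>",
    password="<PASSWORD>",
    warehouse="ANALYTICS_WH",
    database="RAW",
    schema="EVENTS",
)
try:
    cur = conn.cursor()
    # COPY INTO ingests staged files in bulk; one result row per file
    cur.execute(
        "COPY INTO raw_events FROM @events_stage "
        "FILE_FORMAT = (TYPE = CSV SKIP_HEADER = 1)"
    )
    print(cur.fetchall())
finally:
    conn.close()
```

For continuous rather than batch ingestion, Snowpipe serves the same COPY semantics on an event-driven trigger.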
One of the most promising technology areas in this merger, one that already had high growth potential and is poised for even more, is the data-in-motion platform Hortonworks DataFlow (HDF). CDF, as an end-to-end streaming data platform, emerges as a clear solution for managing data from the edge all the way to the enterprise.
Customers have been using data warehousing solutions to perform their traditional analytics tasks. Recently, data lakes have gained a lot of traction as the foundation for analytical solutions, because they come with benefits such as scalability, fault tolerance, and support for structured, semi-structured, and unstructured datasets.
Let's go through the ten Azure data pipeline tools. Azure Data Factory: This cloud-based data integration service allows you to create data-driven workflows for orchestrating and automating data movement and transformation. Azure Blob Storage serves as the data lake that stores raw data.
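Since Blob Storage plays the raw data lake role here, a minimal sketch of landing a file there may help; it uses the azure-storage-blob package, and the connection string, container, file, and path names are placeholders of my own.

```python
# Land a raw extract in Azure Blob Storage under a date-partitioned path.
from azure.storage.blob import BlobServiceClient

service = BlobServiceClient.from_connection_string("<AZURE_STORAGE_CONNECTION_STRING>")
container = service.get_container_client("raw")

# A date-partitioned prefix lets downstream Data Factory activities
# pick up new files by path rather than scanning the whole container
with open("daily_extract.csv", "rb") as fh:
    container.upload_blob(
        name="sales/2024/01/15/daily_extract.csv", data=fh, overwrite=True
    )
```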
In my last post, I wrote about the new data integration requirements. In this post, I wanted to share a few points made recently in a TDWI Institute interview with SnapLogic founder and CEO Gaurav Dhillon, when he was asked: what are some of the most interesting trends you're seeing in the BI, analytics, and data warehousing space?
Loading complex multi-point datasets into a dimensional model, identifying issues, and validating the data integrity of the aggregated and merged data points are the biggest challenges that clinical quality management systems face. Although data lakes resemble data vaults, a data vault provides more features of a data warehouse.
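A hedged sketch of the kind of integrity validation described above, using pandas; the fact/dimension frames and column names are invented for illustration, not from any real clinical system.

```python
# Two common checks when merging into a dimensional model:
# referential integrity and aggregate reconciliation.
import pandas as pd

facts = pd.DataFrame({"encounter_id": [1, 2, 3], "patient_key": [10, 11, 99]})
dim_patient = pd.DataFrame({"patient_key": [10, 11, 12]})

# Referential integrity: every fact row must join to a dimension row
orphans = facts[~facts["patient_key"].isin(dim_patient["patient_key"])]
if not orphans.empty:
    print(f"Integrity violation: {len(orphans)} fact rows lack a patient dimension row")

# Reconciliation: compare merged row counts against the source
merged = facts.merge(dim_patient, on="patient_key", how="inner")
print(f"{len(merged)} of {len(facts)} fact rows passed validation")
```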
In this blog, I will demonstrate the value of Cloudera DataFlow (CDF), the edge-to-cloud streaming data platform available on the Cloudera Data Platform (CDP), as a data integration and democratization fabric. The post How Cloudera Data Flow Enables Successful Data Mesh Architectures appeared first on Cloudera Blog.
Apache Kafka is an open-source distributed event streaming platform used by thousands of companies for high-performance data pipelines, streaming analytics, data integration, and mission-critical applications. It can ingest event data (e.g., from Internet of Things [IoT] devices, system telemetry, or clickstream data) from a busy website or application.
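A minimal producer sketch for the clickstream case, using the kafka-python package; the broker address, topic name, and event fields are placeholders I've assumed for illustration.

```python
# Publish one clickstream event to a Kafka topic as JSON.
import json
from kafka import KafkaProducer

producer = KafkaProducer(
    bootstrap_servers="localhost:9092",
    value_serializer=lambda v: json.dumps(v).encode("utf-8"),
)

# Each page view becomes one event on the clickstream topic
producer.send("clickstream", {"user_id": "u-42", "page": "/checkout", "ts": 1700000000})
producer.flush()  # block until the broker acknowledges the batch
```

A consumer group on the other side would read the same topic for streaming analytics or to feed the data lake.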
Cargotec captures terabytes of IoT telemetry data from their machinery operated by numerous customers across the globe. This data needs to be ingested into a data lake, transformed, and made available for analytics, machine learning (ML), and visualization.
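One plausible shape for that first ingestion hop is writing telemetry into an S3-based data lake with partitioned keys, as sketched below with boto3; the bucket name, machine ID, and reading fields are hypothetical.

```python
# Land one telemetry reading in an S3 data lake under a partitioned key.
import json
import boto3

s3 = boto3.client("s3")
reading = {"machine_id": "crane-007", "ts": "2024-01-15T08:00:00Z", "load_kg": 18500}

# Hive-style key layout (machine_id=.../dt=...) lets downstream ETL and ML
# jobs prune reads by machine and date instead of scanning everything
key = f"telemetry/machine_id={reading['machine_id']}/dt=2024-01-15/080000.json"
s3.put_object(Bucket="cargo-telemetry-lake", Key=key, Body=json.dumps(reading))
```

At real telemetry volumes a streaming buffer (e.g., Kafka or Kinesis) would batch records before the S3 write; the sketch shows only the landing format.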
Organizations across the world are increasingly relying on streaming data, and there is a growing need for real-time data analytics given the rising velocity and volume of data being collected.
The post Go Fast and Far Using Data Virtualization appeared first on Data Virtualization blog - Data Integration and Modern Data Management Articles, Analysis and Information. Technologies change constantly within organizations, and having a flexible architecture is key.
The post The Energy Utilities Series: Challenges and Opportunities of Decarbonization (Post 2 of 6) appeared first on Data Management Blog - Data Integration and Modern Data Management Articles, Analysis and Information. Decarbonization is the process of transitioning from.
However, more than 99 percent of respondents said they would migrate data to the cloud over the next two years. The Internet of Things (IoT) is a huge contributor to this growing volume of data; iotaComm estimates there are 35 billion IoT devices worldwide and that in 2025 all IoT devices combined will generate 79.4
The key components of a data pipeline are typically: Data Sources: the origin of the data, such as a relational database, data warehouse, data lake, file, API, or other data store. Processing can include tasks such as data ingestion, cleansing, filtering, aggregation, or standardization, as sketched below.
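A compact sketch of those processing tasks as pandas steps; the frame, column names, and values are invented stand-ins for a real source.

```python
# Ingestion -> cleansing -> filtering -> aggregation on a toy dataset.
import pandas as pd

# Stand-in for ingestion from one of the sources above
df = pd.DataFrame({
    "region": ["EU", "EU", "US"],
    "status": ["completed", "pending", "completed"],
    "amount": [100.0, None, 250.0],
})

df["amount"] = df["amount"].fillna(0.0)          # cleansing: fill missing values
df = df[df["status"] == "completed"]             # filtering: keep valid rows only
summary = df.groupby("region")["amount"].sum()   # aggregation: totals per region
print(summary)
```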
If you reflect for a moment, the last major technology inflection points were probably things like mobility, IoT, development operations, and the cloud, to name but a few. Edge-compute data distribution that connects broad, deep PLM ecosystems. Agentic AI is here to stay and will gain tremendous momentum in 2024.