2022, Data Integration and Data Lake

2022

Data Integration

Data Lake

Load data incrementally from transactional data lakes to data warehouses

AWS Big Data

OCTOBER 19, 2023

Data lakes and data warehouses are two of the most important data storage and management technologies in a modern data architecture. Data lakes store all of an organization’s data, regardless of its format or structure.

Data Lake

Data Lake Data Warehouse Visualization Snapshot

Use Apache Iceberg in your data lake with Amazon S3, AWS Glue, and Snowflake

AWS Big Data

APRIL 3, 2024

licensed, 100% open-source data table format that helps simplify data processing on large datasets stored in data lakes. Data engineers use Apache Iceberg because it’s fast, efficient, and reliable at any scale and keeps records of how datasets change over time.

Data Lake

Data Lake Snapshot Metadata Data Architecture

Join 42,000+

Insiders

This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.

Webinars

Agent Tooling: Connecting AI to Your Tools, Systems & Data

Automation, Evolved: Your New Playbook for Smarter Knowledge Work

Data Talks, CFOs Listen: Why Analytics Are Key To Better Spend Management

Mastering Apache Airflow® 3.0: What’s New (and What’s Next) for Data Orchestration

MORE WEBINARS

Trending Sources

Introducing native support for Apache Hudi, Delta Lake, and Apache Iceberg on AWS Glue for Apache Spark, Part 2: AWS Glue Studio Visual Editor

AWS Big Data

MARCH 20, 2023

In the first post of this series , we described how AWS Glue for Apache Spark works with Apache Hudi, Linux Foundation Delta Lake, and Apache Iceberg datasets tables using the native support of those data lake formats. For S3 URL , enter s3://noaa-ghcn-pds/csv/by_year/2022.csv. The data source is configured.

Visualization

Visualization Data Lake Snapshot Big Data

Webinars

Agent Tooling: Connecting AI to Your Tools, Systems & Data

Automation, Evolved: Your New Playbook for Smarter Knowledge Work

Data Talks, CFOs Listen: Why Analytics Are Key To Better Spend Management

Mastering Apache Airflow® 3.0: What’s New (and What’s Next) for Data Orchestration

MORE WEBINARS

What is Data Pipeline? A Detailed Explanation

Smart Data Collective

OCTOBER 17, 2022

A point of data entry in a given pipeline. Examples of an origin include storage systems like data lakes, data warehouses and data sources that include IoT devices, transaction processing applications, APIs or social media. The final point to which the data has to be eventually transferred is a destination.

Data Warehouse

Data Warehouse Data Lake Visualization Big Data

Building Best-in-Class Enterprise Analytics

Speaker: Anthony Roach, Director of Product Management at Tableau Software, and Jeremiah Morrow, Partner Solution Marketing Director at Dremio

Tableau works with Strategic Partners like Dremio to build data integrations that bring the two technologies together, creating a seamless and efficient customer experience. A self-service platform for data exploration and visualization that broadens access to analytic insights. A seamless and efficient customer experience.

Analytics

Join a streaming data source with CDC data for real-time serverless data analytics using AWS Glue, AWS DMS, and Amazon DynamoDB

AWS Big Data

MAY 30, 2023

Customers have been using data warehousing solutions to perform their traditional analytics tasks. Recently, data lakes have gained lot of traction to become the foundation for analytical solutions, because they come with benefits such as scalability, fault tolerance, and support for structured, semi-structured, and unstructured datasets.

Data Lake

Data Lake Data Analytics Analytics Data Processing

Build and manage your modern data stack using dbt and AWS Glue through dbt-glue, the new “trusted” dbt adapter

AWS Big Data

NOVEMBER 29, 2023

We have seen a strong customer demand to expand its scope to cloud-based data lakes because data lakes are increasingly the enterprise solution for large-scale data initiatives due to their power and capabilities. Let’s say that this company is located in Europe and the data product must comply with the GDPR.

Data Lake

Data Lake Management Metrics Data Warehouse

P&G turns to AI to create digital manufacturing of the future

CIO Business Intelligence

OCTOBER 1, 2022

In summer 2022, P&G sealed a multiyear partnership with Microsoft to transform P&G’s digital manufacturing platform. It requires taking data from equipment sensors, applying advanced analytics to derive descriptive and predictive insights, and automating corrective actions.

Manufacturing

Manufacturing Digital Transformation IoT Internet of Things

With a zero-ETL approach, AWS is helping builders realize near-real-time analytics

AWS Big Data

JUNE 28, 2023

Another example of AWS’s investment in zero-ETL is providing the ability to query a variety of data sources without having to worry about data movement. Data analysts and data engineers can use familiar SQL commands to join data across several data sources for quick analysis, and store the results in Amazon S3 for subsequent use.

Analytics

Analytics Data Warehouse Data Lake Data-driven

Augmented data management: Data fabric versus data mesh

IBM Big Data Hub

APRIL 27, 2022

The data fabric architectural approach can simplify data access in an organization and facilitate self-service data consumption at scale. Read: The first capability of a data fabric is a semantic knowledge data catalog, but what are the other 5 core capabilities of a data fabric? 3 March 2022.

Management

Management Metadata Data Architecture Data Lake

Dive deep into AWS Glue 4.0 for Apache Spark

AWS Big Data

MAY 18, 2023

It’s even harder when your organization is dealing with silos that impede data access across different data stores. Seamless data integration is a key requirement in a modern data architecture to break down data silos. AWS Glue released version 4.0 runtime ( 3.5 AWS Glue released version 4.0 runtime ( 3.5

Testing

Testing Data Lake Cost-Benefit Data Integration

Create an end-to-end data strategy for Customer 360 on AWS

AWS Big Data

MARCH 26, 2024

This view is used to identify patterns and trends in customer behavior, which can inform data-driven decisions to improve business outcomes. In 2022, AWS commissioned a study conducted by the American Productivity and Quality Center (APQC) to quantify the Business Value of Customer 360.

Data Strategy

Data Strategy Strategy Data Warehouse Prescriptive Analytics

Data Mesh vs. Data Fabric: A Love Story

Alation

JANUARY 13, 2022

Thoughtworks says data mesh is key to moving beyond a monolithic data lake. Spoiler alert: data fabric and data mesh are independent design concepts that are, in fact, quite complementary. Thoughtworks says data mesh is key to moving beyond a monolithic data lake 2. Gartner on Data Fabric.

Data Lake

Data Lake Metadata Data-driven Data Governance

How AWS helped Altron Group accelerate their vision for optimized customer engagement

AWS Big Data

JULY 13, 2023

Data-Driven Everything engagement Altron has provided information technology services since 1965 across South Africa, the Middle East, and Australia. Foundations for a data lake with data governance controls and data quality checks. A set of QuickSight dashboards to be consumed via browser and mobile.

Optimization

Optimization B2B Data Quality Sales

5 financial planning software capabilities that drive business value

Jedox

JANUARY 13, 2023

The answer is simple: most organizations use financial planning solutions that lack the adaptability, integration, and simplicity that high-performing teams need to collaborate effectively. Data integration. 1 Gartner, Critical Capabilities for Financial Planning Software, December 2022.

Software

Software Finance Forecasting Data Lake

Modeling, Modernization and Automation

BI-Survey

APRIL 27, 2023

Eckerson Group wrote this report in collaboration with BARC by studying the results of a global survey of 238 data & analytics practitioners and leaders. BARC conducted the survey in December 2022 and January 2023, drawing respondents from companies of various sizes and across various industries.

Modeling

Modeling Data Warehouse Data Quality Business Driver

How data stores and governance impact your AI initiatives

IBM Big Data Hub

OCTOBER 12, 2023

To optimize data analytics and AI workloads, organizations need a data store built on an open data lakehouse architecture. This type of architecture combines the performance and usability of a data warehouse with the flexibility and scalability of a data lake. Learn more about IBM watsonx 1.

Cost-Benefit

Cost-Benefit Metadata Data Governance Modeling

Exploring new ETL and ELT capabilities for Amazon Redshift from the AWS Glue Studio visual editor

AWS Big Data

APRIL 20, 2023

In a modern data architecture, unified analytics enable you to access the data you need, whether it’s stored in a data lake or a data warehouse. AWS Glue provides an extensible architecture that enables users with different data processing use cases, and works well with Amazon Redshift.

Visualization

Visualization Data Warehouse Big Data Data Lake

Process price transparency data using AWS Glue

AWS Big Data

MAY 4, 2023

Phase 1 implementation of this regulation, which went into effect on July 1, 2022, requires that payors publish machine-readable files publicly for each plan that they offer. Krishna Maddileti is a Senior Solutions Architect on the AWS Data Lab team. We also show how to query and derive insights using Amazon Athena.

Insurance

Insurance Publishing Cost-Benefit Data Lake

Fabrics, Meshes & Stacks, oh my! Q&A with Sanjeev Mohan

Alation

AUGUST 11, 2022

Today, the brightest minds in our industry are targeting the massive proliferation of data volumes and the accompanying but hard-to-find value locked within all that data. I recently had the opportunity to connect with Mohan at Snowflake Summit 2022 in Las Vegas. Data fabric is a technology architecture.

Metadata

Metadata Data Warehouse Data Quality Data Lake

Redefining enterprise transformation in the age of intelligent ecosystems

CIO Business Intelligence

JANUARY 16, 2025

The mega-vendor era By 2020, the basis of competition for what are now referred to as mega-vendors was interoperability, automation and intra-ecosystem participation and unlocking access to data to drive business capabilities, value and manage risk. edge compute data distribution that connect broad, deep PLM eco-systems.

Enterprise

Enterprise Digital Transformation Scorecard Interactive

Amazon Redshift announces history mode for zero-ETL integrations to simplify historical data tracking and analysis

AWS Big Data

FEBRUARY 18, 2025

We at AWS recognized the need for a more streamlined approach to data integration, particularly between operational databases and the cloud data warehouses. The journey of zero-ETL began in late 2022 when we introduced the feature for Aurora MySQL with Amazon Redshift. Delete the Redshift cluster. Delete the EC2 instance.

Data Warehouse

Data Warehouse Optimization Data Lake Marketing

Data Leaders Brief

Load data incrementally from transactional data lakes to data warehouses

Use Apache Iceberg in your data lake with Amazon S3, AWS Glue, and Snowflake

Webinars

Trending Sources

Introducing native support for Apache Hudi, Delta Lake, and Apache Iceberg on AWS Glue for Apache Spark, Part 2: AWS Glue Studio Visual Editor

Webinars

What is Data Pipeline? A Detailed Explanation

Building Best-in-Class Enterprise Analytics

Join a streaming data source with CDC data for real-time serverless data analytics using AWS Glue, AWS DMS, and Amazon DynamoDB

Build and manage your modern data stack using dbt and AWS Glue through dbt-glue, the new “trusted” dbt adapter

P&G turns to AI to create digital manufacturing of the future

With a zero-ETL approach, AWS is helping builders realize near-real-time analytics

Augmented data management: Data fabric versus data mesh

Dive deep into AWS Glue 4.0 for Apache Spark

Create an end-to-end data strategy for Customer 360 on AWS

Data Mesh vs. Data Fabric: A Love Story

How AWS helped Altron Group accelerate their vision for optimized customer engagement

5 financial planning software capabilities that drive business value

Modeling, Modernization and Automation

How data stores and governance impact your AI initiatives

Exploring new ETL and ELT capabilities for Amazon Redshift from the AWS Glue Studio visual editor

Process price transparency data using AWS Glue

Fabrics, Meshes & Stacks, oh my! Q&A with Sanjeev Mohan

Redefining enterprise transformation in the age of intelligent ecosystems

Amazon Redshift announces history mode for zero-ETL integrations to simplify historical data tracking and analysis

Stay Connected