2022, Data Lake and Data Warehouse

Load data incrementally from transactional data lakes to data warehouses

AWS Big Data

OCTOBER 19, 2023

Data lakes and data warehouses are two of the most important data storage and management technologies in a modern data architecture. Data lakes store all of an organization’s data, regardless of its format or structure.

Data Lake

Data Lake Data Warehouse Visualization Snapshot

Capital One Offers Cost Controls for Cloud Data Warehouses

David Menninger's Analyst Perspectives

NOVEMBER 7, 2024

The adoption of cloud environments for analytic workloads has been a key feature of the data platforms sector in recent years. For two-thirds (66%) of participants in ISG’s Data Lake Dynamic Insights Research, the primary data platform used for analytics is cloud based.

Data Warehouse

Data Warehouse Cost-Benefit Data Lake Software

Simplify data ingestion from Amazon S3 to Amazon Redshift using auto-copy

AWS Big Data

OCTOBER 30, 2024

Amazon Redshift is a fast, scalable, secure, and fully managed cloud data warehouse that makes it simple and cost-effective to analyze your data using standard SQL and your existing business intelligence (BI) tools. Data ingestion is the process of getting data to Amazon Redshift. Do not overwrite existing files.

Data Warehouse

Data Warehouse Sales Data Lake Recreation/Entertainment

Webinars

Agent Tooling: Connecting AI to Your Tools, Systems & Data

Automation, Evolved: Your New Playbook for Smarter Knowledge Work

Data Talks, CFOs Listen: Why Analytics Are Key To Better Spend Management

Mastering Apache Airflow® 3.0: What’s New (and What’s Next) for Data Orchestration

MORE WEBINARS

Use Apache Iceberg in your data lake with Amazon S3, AWS Glue, and Snowflake

AWS Big Data

APRIL 3, 2024

licensed, 100% open-source data table format that helps simplify data processing on large datasets stored in data lakes. Data engineers use Apache Iceberg because it’s fast, efficient, and reliable at any scale and keeps records of how datasets change over time.

Data Lake

Data Lake Snapshot Metadata Data Architecture

Use Apache Iceberg in a data lake to support incremental data processing

AWS Big Data

MARCH 2, 2023

Iceberg has become very popular for its support for ACID transactions in data lakes and features like schema and partition evolution, time travel, and rollback. In early 2022, AWS announced general availability of Athena ACID transactions, powered by Apache Iceberg. and later supports the Apache Iceberg framework for data lakes.

Data Lake

Data Lake Data Processing Metadata Snapshot

Build an Amazon Redshift data warehouse using an Amazon DynamoDB single-table design

AWS Big Data

JUNE 21, 2023

These types of queries are suited for a data warehouse. The goal of a data warehouse is to enable businesses to analyze their data fast; this is important because it means they are able to gain valuable insights in a timely manner. Amazon Redshift is fully managed, scalable, cloud data warehouse.

Data Warehouse

Data Warehouse Data Lake OLAP Cost-Benefit

Australia’s IT leadership moves 2022

CIO Business Intelligence

JULY 24, 2022

Here’s an update on the moves of Australian IT leaders starting from June 2022. Kuret replaces Scott Wall who after four years with BankVic joined Bank Australia as chief transformation officer in June 2022. CIO Australia consistently tracks the moves of IT leaders. Simon Herbert departs from NSW Customer Service. IT Leadership

IT

IT Data Lake Data Warehouse Digital Transformation

AWS Lake Formation 2022 year in review

AWS Big Data

JANUARY 31, 2023

In this post, we are excited to summarize the features that the AWS Glue Data Catalog, AWS Glue crawler, and Lake Formation teams delivered in 2022. Whether you are a data platform builder, data engineer, data scientist, or any technology leader interested in data lake solutions, this post is for you.

Data Lake

Data Lake Data Governance Data Architecture Machine Learning

Data Science & Analytics Industry Main Developments in 2021 and Key Trends for 2022

KDnuggets

DECEMBER 14, 2021

We have solicited insights from experts at industry-leading companies, asking: "What were the main AI, Data Science, Machine Learning Developments in 2021 and what key trends do you expect in 2022?" Read their opinions here.

Data Science

Data Science Machine Learning Analytics Data Lake

Choosing an open table format for your transactional data lake on AWS

AWS Big Data

JUNE 9, 2023

A modern data architecture enables companies to ingest virtually any type of data through automated pipelines into a data lake, which provides highly durable and cost-effective object storage at petabyte or exabyte scale.

Data Lake

Data Lake Metadata Statistics Optimization

Perform upserts in a data lake using Amazon Athena and Apache Iceberg

AWS Big Data

APRIL 27, 2023

Amazon Athena supports the MERGE command on Apache Iceberg tables, which allows you to perform inserts, updates, and deletes in your data lake at scale using familiar SQL statements that are compliant with ACID (Atomic, Consistent, Isolated, Durable). Create a table to point to the CDC data. Upload 20220922-184314489.csv

Data Lake

Data Lake Snapshot Optimization Data Transformation

Enable business users to analyze large datasets in your data lake with Amazon QuickSight

AWS Big Data

JUNE 23, 2023

Events and many other security data types are stored in Imperva’s Threat Research Multi-Region data lake. Imperva harnesses data to improve their business outcomes. As part of their solution, they are using Amazon QuickSight to unlock insights from their data.

Data Lake

Data Lake Dashboards Cost-Benefit Data Warehouse

The rise of the data lakehouse: A new era of data value

CIO Business Intelligence

AUGUST 18, 2022

Previously, Walgreens was attempting to perform that task with its data lake but faced two significant obstacles: cost and time. Those challenges are well-known to many organizations as they have sought to obtain analytical knowledge from their vast amounts of data. Lakehouses redeem the failures of some data lakes.

Data Lake

Data Lake Data Warehouse Unstructured Data Business Intelligence

Implement tag-based access control for your data lake and Amazon Redshift data sharing with AWS Lake Formation

AWS Big Data

JULY 21, 2023

This leads to having data across many instances of data warehouses and data lakes using a modern data architecture in separate AWS accounts. We recently announced the integration of Amazon Redshift data sharing with AWS Lake Formation.

Data Lake

Data Lake Data Warehouse Marketing Management

Build a real-time GDPR-aligned Apache Iceberg data lake

AWS Big Data

FEBRUARY 24, 2023

Data lakes are a popular choice for today’s organizations to store their data around their business activities. As a best practice of a data lake design, data should be immutable once stored. A data lake built on AWS uses Amazon Simple Storage Service (Amazon S3) as its primary storage environment.

Data Lake

Data Lake Metadata Testing Data Warehouse

2021 Gift Giving Guide for Data Nerds

DataKitchen

DECEMBER 7, 2021

Data Mesh: Delivering Data-Driven Value at Scale , by Zhamak Dehghani. This book is not available until January 2022, but considering all the hype around the data mesh, we expect it to be a best seller.

Data-driven

Data-driven Data Governance Big Data Data Science

What I Learned At Gartner Data & Analytics 2022

Timo Elliott

MAY 27, 2022

For the longest time, in order to do any analytics, we had to take the data to the technology — rip it out of the business applications and move it to a data warehouse or a data lake, or a data lakehouse. And there’s been a big change in technology that is supporting all this.

Data Analytics

Data Analytics Analytics Recreation/Entertainment Data Lake

Open Data Lakehouse powered by Iceberg for all your Data Warehouse needs

Cloudera

APRIL 3, 2023

In this blog, we will share with you in detail how Cloudera integrates core compute engines including Apache Hive and Apache Impala in Cloudera Data Warehouse with Iceberg. We will publish follow up blogs for other data services. It allows us to independently upgrade the Virtual Warehouses and Database Catalogs.

Data Warehouse

Data Warehouse Snapshot Metadata Cost-Benefit

The Future of the Data Lakehouse – Open

CIO Business Intelligence

JUNE 23, 2022

Cloudera customers run some of the biggest data lakes on earth. These lakes power mission critical large scale data analytics, business intelligence (BI), and machine learning use cases, including enterprise data warehouses. On data warehouses and data lakes.

Data Lake

Data Lake Data Warehouse Machine Learning Data-driven

What is Data Pipeline? A Detailed Explanation

Smart Data Collective

OCTOBER 17, 2022

A point of data entry in a given pipeline. Examples of an origin include storage systems like data lakes, data warehouses and data sources that include IoT devices, transaction processing applications, APIs or social media. The final point to which the data has to be eventually transferred is a destination.

Data Warehouse

Data Warehouse Data Lake Visualization Big Data

The Future of the Data Lakehouse – Open

Cloudera

JUNE 18, 2022

Cloudera customers run some of the biggest data lakes on earth. These lakes power mission critical large scale data analytics, business intelligence (BI), and machine learning use cases, including enterprise data warehouses. On data warehouses and data lakes.

Data Lake

Data Lake Data Warehouse Machine Learning Data-driven

6 Exciting Announcements Made at Snowflake Summit 2022

CDW Research Hub

JULY 22, 2022

The Sirius Data & Analytics Consulting team recently attended Snowflake Summit 2022 in Las Vegas; the first time the annual conference has been held in person since 2019. Whether it was due to being in a room full of data enthusiasts or the magic of Las Vegas, the energy matched the larger attendance and venue.

Data Lake

Data Lake Data Governance Machine Learning Consulting

Get maximum value out of your cloud data warehouse with Amazon Redshift

AWS Big Data

APRIL 19, 2023

Every day, customers are challenged with how to manage their growing data volumes and operational costs to unlock the value of data for timely insights and innovation, while maintaining consistent performance. As data workloads grow, costs to scale and manage data usage with the right governance typically increase as well.

Data Warehouse

Data Warehouse Data Lake Unstructured Data Optimization

Cloudera Named a Leader in the 2022 Gartner® Magic Quadrant™ for Cloud Database Management Systems (DBMS)

Cloudera

DECEMBER 16, 2022

We are pleased to announce that Cloudera has been named a Leader in the 2022 Gartner ® Magic Quadrant for Cloud Database Management Systems. Cloudera has long had the capabilities of a data lakehouse, if not the label. Get an introduction to the latest version of Cloudera’s Data Platform. and/or its affiliates in the U.S.

Management

Management Metadata Machine Learning Data Lake

Migrate Amazon Redshift from DC2 to RA3 to accommodate increasing data volumes and analytics demands

AWS Big Data

AUGUST 9, 2024

These processes retrieve data from around 90 different data sources, resulting in updating roughly 2,000 tables in the data warehouse and 3,000 external tables in Parquet format, accessed through Amazon Redshift Spectrum and a data lake on Amazon Simple Storage Service (Amazon S3). We started with 115 dc2.large

Data Lake

Data Lake Analytics Data Warehouse Data-driven

Build and manage your modern data stack using dbt and AWS Glue through dbt-glue, the new “trusted” dbt adapter

AWS Big Data

NOVEMBER 29, 2023

dbt is an open source, SQL-first templating engine that allows you to write repeatable and extensible data transforms in Python and SQL. dbt is predominantly used by data warehouses (such as Amazon Redshift ) customers who are looking to keep their data transform logic separate from storage and engine.

Data Lake

Data Lake Management Metrics Data Warehouse

How smava makes loans transparent and affordable using Amazon Redshift Serverless

AWS Big Data

DECEMBER 21, 2023

To speed up the self-service analytics and foster innovation based on data, a solution was needed to provide ways to allow any team to create data products on their own in a decentralized manner. To create and manage the data products, smava uses Amazon Redshift , a cloud data warehouse.

Data Lake

Data Lake Data Warehouse Data-driven B2B

Regeneron turns to IT to accelerate drug discovery

CIO Business Intelligence

NOVEMBER 4, 2022

MetaBio, which received a 2022 CIO 100 Award , provides a single source for datasets in a unified format, enabling researchers to quickly extract information about various therapeutic functions without having to worry about how to prepare or find the data. Much of Regeneron’s data, of course, is confidential.

Data Lake

Data Lake IT Experimentation Data-driven

What is a data architect? Skills, salaries, and how to become a data framework master

CIO Business Intelligence

OCTOBER 13, 2023

Data architect Armando Vázquez identifies eight common types of data architects: Enterprise data architect: These data architects oversee an organization’s overall data architecture, defining data architecture strategy and designing and implementing architectures. Are data architects in demand?

Data Architecture

Data Architecture Data Warehouse Statistics Visualization

Has the Data Warehouse Had Its Day?

BI-Survey

JANUARY 15, 2023

Statements from countless interviews with our customers reveal that the data warehouse is seen as a “black box” by many and understood by few business users. Therefore, it is not clear why the costly and apparently flexibility-inhibiting data warehouse is needed at all. The limiting factor is rather the data landscape.

Data Warehouse

Data Warehouse IT Data Architecture Measurement

Achieve your AI goals with an open data lakehouse approach

IBM Big Data Hub

OCTOBER 4, 2023

Why does AI need an open data lakehouse architecture? from 2022 to 2026. Another IDC study showed that while 2/3 of respondents reported using AI-driven data analytics, most reported that less than half of the data under management is available for this type of analytics.

Data Lake

Data Lake Metadata Data Warehouse Cost-Benefit

Announcing data filtering for Amazon Aurora MySQL zero-ETL integration with Amazon Redshift

AWS Big Data

MARCH 20, 2024

To run analytics on your operational data, you might build a solution that is a combination of a database, a data warehouse, and an extract, transform, and load (ETL) pipeline. ETL is the process data engineers use to combine data from different sources.

Data Warehouse

Data Warehouse Business Driver Data-driven Data Lake

Supercharge Your Data Lakehouse with Apache Iceberg in Cloudera Data Platform

Cloudera

JUNE 30, 2022

In February 2022, we introduced Apache Iceberg as a technical preview within CDP. Over the past decade, Cloudera has enabled multi-function analytics on data lakes through the introduction of the Hive table format and Hive ACID. The data lakehouse is not new to Cloudera or our customers.

Data Lake

Data Lake Data Warehouse Data Architecture Metadata

With a zero-ETL approach, AWS is helping builders realize near-real-time analytics

AWS Big Data

JUNE 28, 2023

Another example of AWS’s investment in zero-ETL is providing the ability to query a variety of data sources without having to worry about data movement. Data analysts and data engineers can use familiar SQL commands to join data across several data sources for quick analysis, and store the results in Amazon S3 for subsequent use.

Analytics

Analytics Data Warehouse Data Lake Data-driven

Join a streaming data source with CDC data for real-time serverless data analytics using AWS Glue, AWS DMS, and Amazon DynamoDB

AWS Big Data

MAY 30, 2023

Customers have been using data warehousing solutions to perform their traditional analytics tasks. Recently, data lakes have gained lot of traction to become the foundation for analytical solutions, because they come with benefits such as scalability, fault tolerance, and support for structured, semi-structured, and unstructured datasets.

Data Lake

Data Lake Data Analytics Analytics Data Processing

Get started with Amazon DynamoDB zero-ETL integration with Amazon Redshift

AWS Big Data

OCTOBER 17, 2024

You can then run enhanced analysis on this DynamoDB data with the rich capabilities of Amazon Redshift, such as high-performance SQL, built-in machine learning (ML) and Spark integrations, materialized views (MV) with automatic and incremental refresh, data sharing, and the ability to join data across multiple data stores and data lakes.

Metrics

Metrics Dashboards Data Warehouse Statistics

Why Business Intelligence is Top of Mind for CFOs for 2022

Jet Global

DECEMBER 3, 2021

The term “ business intelligence ” (BI) has been in common use for several decades now, referring initially to the OLAP systems that drew largely upon pre-processed information stored in data warehouses. Discover Meaning Amid All That Data. Analytics are a hot topic for 2022, and for good reason. The Future Is Now.

Business Intelligence

Business Intelligence OLAP Sales Data Warehouse

Accelerate HiveQL with Oozie to Spark SQL migration on Amazon EMR

AWS Big Data

APRIL 19, 2023

Many customers run big data workloads such as extract, transform, and load (ETL) on Apache Hive to create a data warehouse on Hadoop. sql_parameters DATE=2022-07-04::HOUR=00 Any additional or dynamic parameters expected by the SQL files. He is passionate about big data and data analytics.

Metadata

Metadata Data Lake Testing Consulting

Demystifying Modern Data Platforms

Cloudera

SEPTEMBER 15, 2022

July brings summer vacations, holiday gatherings, and for the first time in two years, the return of the Massachusetts Institute of Technology (MIT) Chief Data Officer symposium as an in-person event. A key area of focus for the symposium this year was the design and deployment of modern data platforms. What is a data fabric?

Data Lake

Data Lake Data Architecture Data-driven Data Warehouse

Does Cost Reduction Play a Role in Digital Transformation?

Cloudera

OCTOBER 6, 2022

For example, we have some customers using their data platform originally established for compliance initiatives to drive new use cases. These data lakes house much of the data needed to also support other use cases. We see this in the area of operational databases and legacy data warehouses.

Digital Transformation

Digital Transformation Cost-Benefit Data Lake Machine Learning

Better, faster decisions: Why businesses thrive on real-time data

CIO Business Intelligence

SEPTEMBER 8, 2022

Most organizations understand the profound impact that data is having on modern business. In Foundry’s 2022 Data & Analytics Study , 88% of IT decision-makers agree that data collection and analysis have the potential to fundamentally change their business models over the next three years.

Cost-Benefit

Cost-Benefit Internet of Things Data-driven Data Lake

Create an end-to-end data strategy for Customer 360 on AWS

AWS Big Data

MARCH 26, 2024

This view is used to identify patterns and trends in customer behavior, which can inform data-driven decisions to improve business outcomes. In 2022, AWS commissioned a study conducted by the American Productivity and Quality Center (APQC) to quantify the Business Value of Customer 360.

Data Strategy

Data Strategy Strategy Data Warehouse Prescriptive Analytics

Wonderla Holidays goes digital to enhance business and customer fun

CIO Business Intelligence

OCTOBER 18, 2022

One pulse sends 150 bytes of data. So, each band can send out 500KB to 750KB of data. To handle the huge volume of data thus generated, the company is in the process of deploying a data lake, data warehouse, and real-time analytical tools in a hybrid model. Digital Transformation, RFID

Data Lake

Data Lake Data Warehouse Cost-Benefit Digital Transformation

Augmented data management: Data fabric versus data mesh

IBM Big Data Hub

APRIL 27, 2022

With data ownership decentralization, data owners can create data products for their respective domains, meaning data consumers, both data scientist and business users, can use a combination of these data products for data analytics and data science. 3 March 2022. 11 May 2021. .

Management

Management Metadata Data Architecture Data Lake

Load data incrementally from transactional data lakes to data warehouses

Capital One Offers Cost Controls for Cloud Data Warehouses

Webinars

Trending Sources

Simplify data ingestion from Amazon S3 to Amazon Redshift using auto-copy

Webinars

Use Apache Iceberg in your data lake with Amazon S3, AWS Glue, and Snowflake

Use Apache Iceberg in a data lake to support incremental data processing

Build an Amazon Redshift data warehouse using an Amazon DynamoDB single-table design

Australia’s IT leadership moves 2022

AWS Lake Formation 2022 year in review

Data Science & Analytics Industry Main Developments in 2021 and Key Trends for 2022

Choosing an open table format for your transactional data lake on AWS

Perform upserts in a data lake using Amazon Athena and Apache Iceberg

Enable business users to analyze large datasets in your data lake with Amazon QuickSight

The rise of the data lakehouse: A new era of data value

Implement tag-based access control for your data lake and Amazon Redshift data sharing with AWS Lake Formation

Build a real-time GDPR-aligned Apache Iceberg data lake

2021 Gift Giving Guide for Data Nerds

What I Learned At Gartner Data & Analytics 2022

Open Data Lakehouse powered by Iceberg for all your Data Warehouse needs

The Future of the Data Lakehouse – Open

What is Data Pipeline? A Detailed Explanation

The Future of the Data Lakehouse – Open

6 Exciting Announcements Made at Snowflake Summit 2022

Get maximum value out of your cloud data warehouse with Amazon Redshift

Cloudera Named a Leader in the 2022 Gartner® Magic Quadrant™ for Cloud Database Management Systems (DBMS)

Migrate Amazon Redshift from DC2 to RA3 to accommodate increasing data volumes and analytics demands

Build and manage your modern data stack using dbt and AWS Glue through dbt-glue, the new “trusted” dbt adapter

How smava makes loans transparent and affordable using Amazon Redshift Serverless

Regeneron turns to IT to accelerate drug discovery

What is a data architect? Skills, salaries, and how to become a data framework master

Has the Data Warehouse Had Its Day?

Achieve your AI goals with an open data lakehouse approach

Announcing data filtering for Amazon Aurora MySQL zero-ETL integration with Amazon Redshift

Supercharge Your Data Lakehouse with Apache Iceberg in Cloudera Data Platform

With a zero-ETL approach, AWS is helping builders realize near-real-time analytics

Join a streaming data source with CDC data for real-time serverless data analytics using AWS Glue, AWS DMS, and Amazon DynamoDB

Get started with Amazon DynamoDB zero-ETL integration with Amazon Redshift

Why Business Intelligence is Top of Mind for CFOs for 2022

Accelerate HiveQL with Oozie to Spark SQL migration on Amazon EMR

Demystifying Modern Data Platforms

Does Cost Reduction Play a Role in Digital Transformation?

Better, faster decisions: Why businesses thrive on real-time data

Create an end-to-end data strategy for Customer 360 on AWS

Wonderla Holidays goes digital to enhance business and customer fun

Augmented data management: Data fabric versus data mesh

Stay Connected