Data Architecture, Data Lake and Visualization

Load data incrementally from transactional data lakes to data warehouses

AWS Big Data

OCTOBER 19, 2023

Data lakes and data warehouses are two of the most important data storage and management technologies in a modern data architecture. Data lakes store all of an organization’s data, regardless of its format or structure.

Data Lake

Data Lake Data Warehouse Visualization Snapshot

Use Apache Iceberg in your data lake with Amazon S3, AWS Glue, and Snowflake

AWS Big Data

APRIL 3, 2024

They understand that a one-size-fits-all approach no longer works, and recognize the value in adopting scalable, flexible tools and open data formats to support interoperability in a modern data architecture to accelerate the delivery of new solutions.

Data Lake

Data Lake Snapshot Metadata Data Architecture

The next generation of Amazon SageMaker: The center for all your data, analytics, and AI

AWS Big Data

DECEMBER 4, 2024

Collaborate and build faster using familiar AWS tools for model development, generative AI, data processing, and SQL analytics with Amazon Q Developer , the most capable generative AI assistant for software development, helping you along the way. And move with confidence and trust with built-in governance to address enterprise security needs.

Data Analytics

Data Analytics Analytics Data Lake Data Quality

Webinars

Automation, Evolved: Your New Playbook For Smarter Knowledge Work

MORE WEBINARS

Modern Data Architecture for Telecommunications

Cloudera

SEPTEMBER 6, 2022

Data has continued to grow both in scale and in importance through this period, and today telecommunications companies are increasingly seeing data architecture as an independent organizational challenge, not merely an item on an IT checklist. Previously, there were three types of data structures in telco: .

Data Architecture

Data Architecture Cost-Benefit Digital Transformation Business Driver

How EUROGATE established a data mesh architecture using Amazon DataZone

AWS Big Data

JANUARY 15, 2025

Need for a data mesh architecture Because entities in the EUROGATE group generate vast amounts of data from various sourcesacross departments, locations, and technologiesthe traditional centralized data architecture struggles to keep up with the demands for real-time insights, agility, and scalability.

IoT

IoT Machine Learning Metadata Data-driven

Simplify data integration with AWS Glue and zero-ETL to Amazon SageMaker Lakehouse

AWS Big Data

DECEMBER 4, 2024

While traditional extract, transform, and load (ETL) processes have long been a staple of data integration due to its flexibility, for common use cases such as replication and ingestion, they often prove time-consuming, complex, and less adaptable to the fast-changing demands of modern data architectures.

Data Integration

Data Integration Data Lake Statistics Data-driven

Automate replication of relational sources into a transactional data lake with Apache Iceberg and AWS Glue

AWS Big Data

FEBRUARY 14, 2023

Organizations have chosen to build data lakes on top of Amazon Simple Storage Service (Amazon S3) for many years. A data lake is the most popular choice for organizations to store all their organizational data generated by different teams, across business domains, from all different formats, and even over history.

Data Lake

Data Lake Statistics Data Architecture Finance

What is a data architect? Skills, salaries, and how to become a data framework master

CIO Business Intelligence

OCTOBER 13, 2023

Data architect role Data architects are senior visionaries who translate business requirements into technology requirements and define data standards and principles, often in support of data or digital transformations. Data architect vs. data engineer The data architect and data engineer roles are closely related.

Data Architecture

Data Architecture Data Warehouse Statistics Visualization

Petabyte-scale log analytics with Amazon S3, Amazon OpenSearch Service, and Amazon OpenSearch Ingestion

AWS Big Data

MARCH 7, 2024

At the same time, they need to optimize operational costs to unlock the value of this data for timely insights and do so with a consistent performance. With this massive data growth, data proliferation across your data stores, data warehouse, and data lakes can become equally challenging.

Data Lake

Data Lake Analytics Dashboards Metrics

DataOps For Business Analytics Teams

DataKitchen

JANUARY 3, 2022

Data scientists derive insights from data while business analysts work closely with and tend to the data needs of business units. Business analysts sometimes perform data science, but usually, they integrate and visualize data and create reports and dashboards from data supplied by other groups.

Business Analytics

Business Analytics Analytics Testing Dashboards

Build a transactional data lake using Apache Iceberg, AWS Glue, and cross-account data shares using AWS Lake Formation and Amazon Athena

AWS Big Data

APRIL 24, 2023

Building a data lake on Amazon Simple Storage Service (Amazon S3) provides numerous benefits for an organization. However, many use cases, like performing change data capture (CDC) from an upstream relational database to an Amazon S3-based data lake, require handling data at a record level.

Data Lake

Data Lake Data Governance Machine Learning Cost-Benefit

Top analytics announcements of AWS re:Invent 2024

AWS Big Data

FEBRUARY 26, 2025

Analytics remained one of the key focus areas this year, with significant updates and innovations aimed at helping businesses harness their data more efficiently and accelerate insights. From enhancing data lakes to empowering AI-driven analytics, AWS unveiled new tools and services that are set to shape the future of data and analytics.

Analytics

Analytics Data Lake Metadata Data Warehouse

Detect, mask, and redact PII data using AWS Glue before loading into Amazon OpenSearch Service

AWS Big Data

JANUARY 12, 2024

Ingestion: Data lake batch, micro-batch, and streaming Many organizations land their source data into their data lake in various ways, including batch, micro-batch, and streaming jobs. Amazon AppFlow can be used to transfer data from different SaaS applications to a data lake.

Data Lake

Data Lake Cost-Benefit Visualization Structured Data

Texas Rangers data transformation modernizes stadium operations

CIO Business Intelligence

OCTOBER 18, 2022

Noel had already established a relationship with consulting firm Resultant through a smaller data visualization project. Resultant recommended a new, on-prem data infrastructure, complete with data lakes to provide stake holders with a better way to manage data reliability, accuracy, and timeliness.

Data Transformation

Data Transformation Consulting Data Lake Reporting

Visualize data quality scores and metrics generated by AWS Glue Data Quality

AWS Big Data

JUNE 6, 2023

It provides insights and metrics related to the performance and effectiveness of data quality processes. In this post, we highlight the seamless integration of Amazon Athena and Amazon QuickSight , which enables the visualization of operational metrics for AWS Glue Data Quality rule evaluation in an efficient and effective manner.

Data Quality

Data Quality Metrics Visualization Dashboards

Exploring new ETL and ELT capabilities for Amazon Redshift from the AWS Glue Studio visual editor

AWS Big Data

APRIL 20, 2023

In a modern data architecture, unified analytics enable you to access the data you need, whether it’s stored in a data lake or a data warehouse. Select the Visual with a blank canvas , because we’re authoring a job from scratch, then choose Create.

Visualization

Visualization Data Warehouse Big Data Data Lake

Why the Data Journey Manifesto?

DataKitchen

JUNE 12, 2023

We had been talking about “Agile Analytic Operations,” “DevOps for Data Teams,” and “Lean Manufacturing For Data,” but the concept was hard to get across and communicate. I spent much time de-categorizing DataOps: we are not discussing ETL, Data Lake, or Data Science.

Testing

Testing Dashboards Data Lake Data Science

Enhance data security and governance for Amazon Redshift Spectrum with VPC endpoints

AWS Big Data

FEBRUARY 16, 2024

Many customers are extending their data warehouse capabilities to their data lake with Amazon Redshift. They are looking to further enhance their security posture where they can enforce access policies on their data lakes based on Amazon Simple Storage Service (Amazon S3). Choose Create endpoint.

Data Lake

Data Lake Data Warehouse Testing Business Objectives

Amazon Redshift announcements at AWS re:Invent 2023 to enable analytics on all your data

AWS Big Data

NOVEMBER 29, 2023

Zero-ETL integration also enables you to load and analyze data from multiple operational database clusters in a new or existing Amazon Redshift instance to derive holistic insights across many applications. Use one click to access your data lake tables using auto-mounted AWS Glue data catalogs on Amazon Redshift for a simplified experience.

Data Warehouse

Data Warehouse Analytics Data Lake Machine Learning

Perform data parity at scale for data modernization programs using AWS Glue Data Quality

AWS Big Data

OCTOBER 9, 2024

Today, customers are embarking on data modernization programs by migrating on-premises data warehouses and data lakes to the AWS Cloud to take advantage of the scale and advanced analytical capabilities of the cloud. Compare ongoing data that is replicated from the source on-premises database to the target S3 data lake.

Data Quality

Data Quality Data Lake Data Warehouse Metrics

CIO Ryan Snyder on the benefits of interpreting data as a layer cake

CIO Business Intelligence

AUGUST 2, 2023

So Thermo Fisher Scientific CIO Ryan Snyder and his colleagues have built a data layer cake based on a cascading series of discussions that allow IT and business partners to act as one team. Martha Heller: What are the business drivers behind the data architecture ecosystem you’re building at Thermo Fisher Scientific?

Manufacturing

Manufacturing Data Architecture Data Strategy Strategy

How smava makes loans transparent and affordable using Amazon Redshift Serverless

AWS Big Data

DECEMBER 21, 2023

To bring their customers the best deals and user experience, smava follows the modern data architecture principles with a data lake as a scalable, durable data store and purpose-built data stores for analytical processing and data consumption. This is the Data Mart stage.

Data Lake

Data Lake Data Warehouse Data-driven B2B

How Getir unleashed data democratization using a data mesh architecture with Amazon Redshift

AWS Big Data

OCTOBER 23, 2024

Amazon Redshift enables data warehousing by seamlessly integrating with other data stores and services in the modern data organization through features such as Zero-ETL , data sharing , streaming ingestion , data lake integration , and Redshift ML.

Data Warehouse

Data Warehouse Cost-Benefit Data Lake Data-driven

How ATPCO enables governed self-service data access to accelerate innovation with Amazon DataZone

AWS Big Data

JULY 25, 2024

ATPCO is the industry leader in providing pricing and merchandising content for airlines, global distribution systems (GDSs), online travel agencies (OTAs), and other sales channels for consumers to visually understand differences between various offers. This slowed down their pace of innovation because it added time to the analytics journey.

Data Lake

Data Lake Metadata Sales Publishing

Augmented data management: Data fabric versus data mesh

IBM Big Data Hub

APRIL 27, 2022

Data fabric and data mesh are emerging data management concepts that are meant to address the organizational change and complexities of understanding, governing and working with enterprise data in a hybrid multicloud ecosystem. The good news is that both data architecture concepts are complimentary.

Management

Management Metadata Data Architecture Data Lake

Estimating Scope 1 Carbon Footprint with Amazon Athena

AWS Big Data

AUGUST 2, 2023

The data architecture diagram below shows an example of how you could use AWS services to calculate and visualize an organization’s estimated carbon footprint. Customers have the flexibility to choose the services in each stage of the data pipeline based on their use case. usage_therms", "gasutilization"."usage_scf"

Data Lake

Data Lake Measurement Visualization Data Architecture

Belcorp reimagines R&D with AI

CIO Business Intelligence

JUNE 28, 2023

The initial stage involved establishing the data architecture, which provided the ability to handle the data more effectively and systematically. “We This allowed us to derive insights more easily.”

Digital Transformation

Digital Transformation Cost-Benefit Informatics Data mining

Data science vs data analytics: Unpacking the differences

IBM Big Data Hub

SEPTEMBER 19, 2023

Overview: Data science vs data analytics Think of data science as the overarching umbrella that covers a wide range of tasks performed to find patterns in large datasets, structure data for use, train machine learning models and develop artificial intelligence (AI) applications.

Data Science

Data Science Data Analytics Prescriptive Analytics Analytics

A comparative assessment of digital transformation in Italy

CIO Business Intelligence

APRIL 24, 2024

In fact, AMA collects a huge amount of structured and unstructured data from bins, collection vehicles, facilities, and user reports, and until now, this data has remained disconnected, managed by disparate systems and interfaces, through Excel spreadsheets.

Digital Transformation

Digital Transformation Business Intelligence Unstructured Data Data Lake

5 Reasons to Use Apache Iceberg on Cloudera Data Platform (CDP)

Cloudera

MARCH 23, 2022

In fact, we recently announced the integration with our cloud ecosystem bringing the benefits of Iceberg to enterprises as they make their journey to the public cloud, and as they adopt more converged architectures like the Lakehouse. 1: Multi-function analytics . 2: Open formats. Flexible and open file formats.

Metadata

Metadata Data Architecture Machine Learning Cost-Benefit

Architectural patterns for real-time analytics using Amazon Kinesis Data Streams, part 1

AWS Big Data

JANUARY 8, 2024

Kinesis Data Streams has native integrations with other AWS services such as AWS Glue and Amazon EventBridge to build real-time streaming applications on AWS. Refer to Amazon Kinesis Data Streams integrations for additional details. The raw data can be streamed to Amazon S3 for archiving.

Analytics

Analytics IoT Data-driven Snapshot

Unlock scalability, cost-efficiency, and faster insights with large-scale data migration to Amazon Redshift

AWS Big Data

AUGUST 1, 2024

Success criteria alignment by all stakeholders (producers, consumers, operators, auditors) is key for successful transition to a new Amazon Redshift modern data architecture. The success criteria are the key performance indicators (KPIs) for each component of the data workflow.

Data Warehouse

Data Warehouse KPI Optimization Cost-Benefit

Unstructured data management and governance using AWS AI/ML and analytics services

AWS Big Data

OCTOBER 25, 2023

The following is a high-level architecture of the solution we can build to process the unstructured data, assuming the input data is being ingested to the raw input object store. The steps of the workflow are as follows: Integrated AI services extract data from the unstructured data.

Unstructured Data

Unstructured Data Metadata Management Analytics

The New Normal for FP&A: Data Analytics

Jedox

OCTOBER 22, 2020

In working with clients, these are some of the most common “pain points” I routinely address: Difficulty in extracting data out of legacy systems. Limited real-time analytics and visuals. Inability to get data quickly. Data accuracy concerns. More time spent accessing data vs. making data-driven decisions.

Data Analytics

Data Analytics Analytics Unstructured Data Data mining

Deep dive into the AWS ProServe Hadoop Migration Delivery Kit TCO tool

AWS Big Data

FEBRUARY 6, 2023

In this post, we dive deep into the tool, walking through all steps from log ingestion, transformation, visualization, and architecture design to calculate TCO. With QuickSight, you can visualize YARN log data and conduct analysis against the datasets generated by pre-built dashboard templates and a widget.

Dashboards

Dashboards Optimization Data Lake Cost-Benefit

Exploring real-time streaming for generative AI Applications

AWS Big Data

MARCH 25, 2024

Both engines provide native ingestion support from Kinesis Data Streams and Amazon MSK via a separate streaming pipeline to a data lake or data warehouse for analysis. OpenSearch Service offers visualization capabilities powered by OpenSearch Dashboards and Kibana (1.5

Data Lake

Data Lake Unstructured Data Management Snapshot

Create an end-to-end data strategy for Customer 360 on AWS

AWS Big Data

MARCH 26, 2024

Strategize based on how your teams explore data, run analyses, wrangle data for downstream requirements, and visualize data at different levels. The AWS modern data architecture shows a way to build a purpose-built, secure, and scalable data platform in the cloud.

Data Strategy

Data Strategy Strategy Data Warehouse Prescriptive Analytics

Achieving Trusted AI in Manufacturing

Cloudera

JANUARY 30, 2024

Here are some of the key use cases: Predictive maintenance: With time series data (sensor data) coming from the equipment, historical maintenance logs, and other contextual data, you can predict how the equipment will behave and when the equipment or a component will fail. Eliminate data silos.

Manufacturing

Manufacturing Contextual Data IoT Internet of Things

Build incremental data pipelines to load transactional data changes using AWS DMS, Delta 2.0, and Amazon EMR Serverless

AWS Big Data

MARCH 3, 2023

Building data lakes from continuously changing transactional data of databases and keeping data lakes up to date is a complex task and can be an operational challenge. You can then apply transformations and store data in Delta format for managing inserts, updates, and deletes.

Data Lake

Data Lake Dashboards Metrics Metadata

Three Trends for Modernizing Analytics and Data Warehousing in 2019

Cloudera

DECEMBER 19, 2018

The most common big data use case is data warehouse optimization. Big data architecture is used to augment different applications, operating alongside or in a discrete fashion with a data warehouse. A big data implementation may even replace a data warehouse entirely with a data lake.

Data Warehouse

Data Warehouse Analytics Big Data Data Architecture

Amazon Redshift data ingestion options

AWS Big Data

SEPTEMBER 5, 2024

Amazon Redshift , a warehousing service, offers a variety of options for ingesting data from diverse sources into its high-performance, scalable environment. provides a visual ETL tool for authoring jobs to read from and write to Amazon Redshift, using the Redshift Spark connector for connectivity. AWS Glue 4.0 Sudipta Bagchi is a Sr.

IoT

IoT Data Warehouse Cost-Benefit Reporting

Flexible and secure Data-as-a-Service delivered today

Birst BI

OCTOBER 24, 2019

DaaS is a core component of modern data architecture. It provides a governed standard for accessing existing data objects and pipelines for sharing new data objects within an organization. Because it hides the underlying complexities of connecting to and preparing data sources, DaaS helps expand usage of available data.

Data Warehouse

Data Warehouse Recreation/Entertainment Data Lake Data Architecture

How Cargotec uses metadata replication to enable cross-account data sharing

AWS Big Data

JUNE 7, 2023

Cargotec captures terabytes of IoT telemetry data from their machinery operated by numerous customers across the globe. This data needs to be ingested into a data lake, transformed, and made available for analytics, machine learning (ML), and visualization. In his spare time, he loves to play cricket and badminton.

Metadata

Metadata Data Lake Machine Learning Big Data

This Structure has Novel Features which are of Considerable Business Interest

Peter James Thomas

APRIL 3, 2020

Busy Executives and Managers have their information needs best served via visual exhibits that are focussed on their areas of priority and highlight things that are of specific concern to them. Without paying attention to this, your shiny warehouse or data lake will be a technological curiosity, not an indispensable business tool.

Dashboards

Dashboards Reporting Sales Data Lake

Load data incrementally from transactional data lakes to data warehouses

Use Apache Iceberg in your data lake with Amazon S3, AWS Glue, and Snowflake

Webinars

Trending Sources

The next generation of Amazon SageMaker: The center for all your data, analytics, and AI

Webinars

Modern Data Architecture for Telecommunications

How EUROGATE established a data mesh architecture using Amazon DataZone

Simplify data integration with AWS Glue and zero-ETL to Amazon SageMaker Lakehouse

Automate replication of relational sources into a transactional data lake with Apache Iceberg and AWS Glue

What is a data architect? Skills, salaries, and how to become a data framework master

Petabyte-scale log analytics with Amazon S3, Amazon OpenSearch Service, and Amazon OpenSearch Ingestion

DataOps For Business Analytics Teams

Build a transactional data lake using Apache Iceberg, AWS Glue, and cross-account data shares using AWS Lake Formation and Amazon Athena

Top analytics announcements of AWS re:Invent 2024

Detect, mask, and redact PII data using AWS Glue before loading into Amazon OpenSearch Service

Texas Rangers data transformation modernizes stadium operations

Visualize data quality scores and metrics generated by AWS Glue Data Quality

Exploring new ETL and ELT capabilities for Amazon Redshift from the AWS Glue Studio visual editor

Why the Data Journey Manifesto?

Enhance data security and governance for Amazon Redshift Spectrum with VPC endpoints

Amazon Redshift announcements at AWS re:Invent 2023 to enable analytics on all your data

Perform data parity at scale for data modernization programs using AWS Glue Data Quality

CIO Ryan Snyder on the benefits of interpreting data as a layer cake

How smava makes loans transparent and affordable using Amazon Redshift Serverless

How Getir unleashed data democratization using a data mesh architecture with Amazon Redshift

How ATPCO enables governed self-service data access to accelerate innovation with Amazon DataZone

Augmented data management: Data fabric versus data mesh

Estimating Scope 1 Carbon Footprint with Amazon Athena

Belcorp reimagines R&D with AI

Data science vs data analytics: Unpacking the differences

A comparative assessment of digital transformation in Italy

5 Reasons to Use Apache Iceberg on Cloudera Data Platform (CDP)

Architectural patterns for real-time analytics using Amazon Kinesis Data Streams, part 1

Unlock scalability, cost-efficiency, and faster insights with large-scale data migration to Amazon Redshift

Unstructured data management and governance using AWS AI/ML and analytics services

The New Normal for FP&A: Data Analytics

Deep dive into the AWS ProServe Hadoop Migration Delivery Kit TCO tool

Exploring real-time streaming for generative AI Applications

Create an end-to-end data strategy for Customer 360 on AWS

Achieving Trusted AI in Manufacturing

Build incremental data pipelines to load transactional data changes using AWS DMS, Delta 2.0, and Amazon EMR Serverless

Three Trends for Modernizing Analytics and Data Warehousing in 2019

Amazon Redshift data ingestion options

Flexible and secure Data-as-a-Service delivered today

How Cargotec uses metadata replication to enable cross-account data sharing

This Structure has Novel Features which are of Considerable Business Interest

Stay Connected