Data Architecture, Data Lake and Statistics

Data Architecture

Data Lake

Statistics

Accelerate Amazon Redshift Data Lake queries with AWS Glue Data Catalog Column Statistics

AWS Big Data

OCTOBER 1, 2024

Amazon Redshift enables you to efficiently query and retrieve structured and semi-structured data from open format files in Amazon S3 data lake without having to load the data into Amazon Redshift tables. Amazon Redshift extends SQL capabilities to your data lake, enabling you to run analytical queries.

Data Lake

Data Lake Statistics Broadcasting Optimization

Simplify data integration with AWS Glue and zero-ETL to Amazon SageMaker Lakehouse

AWS Big Data

DECEMBER 4, 2024

While traditional extract, transform, and load (ETL) processes have long been a staple of data integration due to its flexibility, for common use cases such as replication and ingestion, they often prove time-consuming, complex, and less adaptable to the fast-changing demands of modern data architectures.

Data Integration

Data Integration Data Lake Statistics Data-driven

Join 42,000+

Insiders

This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.

Webinars

Automation, Evolved: Your New Playbook For Smarter Knowledge Work

MORE WEBINARS

Trending Sources

Choosing an open table format for your transactional data lake on AWS

AWS Big Data

JUNE 9, 2023

A modern data architecture enables companies to ingest virtually any type of data through automated pipelines into a data lake, which provides highly durable and cost-effective object storage at petabyte or exabyte scale.

Data Lake

Data Lake Metadata Statistics Optimization

Webinars

Automation, Evolved: Your New Playbook For Smarter Knowledge Work

MORE WEBINARS

Use Apache Iceberg in your data lake with Amazon S3, AWS Glue, and Snowflake

AWS Big Data

APRIL 3, 2024

They understand that a one-size-fits-all approach no longer works, and recognize the value in adopting scalable, flexible tools and open data formats to support interoperability in a modern data architecture to accelerate the delivery of new solutions.

Data Lake

Data Lake Snapshot Metadata Data Architecture

Query your Iceberg tables in data lake using Amazon Redshift (Preview)

AWS Big Data

AUGUST 31, 2023

Amazon Redshift enables you to directly access data stored in Amazon Simple Storage Service (Amazon S3) using SQL queries and join data across your data warehouse and data lake. With Amazon Redshift, you can query the data in your S3 data lake using a central AWS Glue metastore from your Redshift data warehouse.

Data Lake

Data Lake Data Warehouse Metadata Data Architecture

Simplify operational data processing in data lakes using AWS Glue and Apache Hudi

AWS Big Data

SEPTEMBER 13, 2023

The Analytics specialty practice of AWS Professional Services (AWS ProServe) helps customers across the globe with modern data architecture implementations on the AWS Cloud. Of those tables, some are larger (such as in terms of record volume) than others, and some are updated more frequently than others.

Data Lake

Data Lake Data Processing Metadata Snapshot

Automate replication of relational sources into a transactional data lake with Apache Iceberg and AWS Glue

AWS Big Data

FEBRUARY 14, 2023

Organizations have chosen to build data lakes on top of Amazon Simple Storage Service (Amazon S3) for many years. A data lake is the most popular choice for organizations to store all their organizational data generated by different teams, across business domains, from all different formats, and even over history.

Data Lake

Data Lake Statistics Data Architecture Finance

What is a data architect? Skills, salaries, and how to become a data framework master

CIO Business Intelligence

OCTOBER 13, 2023

Data architecture is a complex and varied field and different organizations and industries have unique needs when it comes to their data architects. Solutions data architect: These individuals design and implement data solutions for specific business needs, including data warehouses, data marts, and data lakes.

Data Architecture

Data Architecture Data Warehouse Statistics Visualization

Top analytics announcements of AWS re:Invent 2024

AWS Big Data

FEBRUARY 26, 2025

Analytics remained one of the key focus areas this year, with significant updates and innovations aimed at helping businesses harness their data more efficiently and accelerate insights. From enhancing data lakes to empowering AI-driven analytics, AWS unveiled new tools and services that are set to shape the future of data and analytics.

Analytics

Analytics Data Lake Metadata Data Warehouse

A Day in the Life of a DataOps Engineer

DataKitchen

OCTOBER 11, 2021

First, you must understand the existing challenges of the data team, including the data architecture and end-to-end toolchain. Figure 2: Example data pipeline with DataOps automation. In this project, I automated data extraction from SFTP, the public websites, and the email attachments.

Testing

Testing Metadata Dashboards Statistics

Amazon Redshift announcements at AWS re:Invent 2023 to enable analytics on all your data

AWS Big Data

NOVEMBER 29, 2023

Zero-ETL integration also enables you to load and analyze data from multiple operational database clusters in a new or existing Amazon Redshift instance to derive holistic insights across many applications. Use one click to access your data lake tables using auto-mounted AWS Glue data catalogs on Amazon Redshift for a simplified experience.

Data Warehouse

Data Warehouse Analytics Data Lake Machine Learning

Real estate CIOs drive deals with data

CIO Business Intelligence

JULY 26, 2023

The CIO delights in detailing the work of Re/Max’s technology team, which is building the pipelines and cloud-native applications to deliver agents in the field the most refined and insightful data from more than 500 MLS listing serivces in the US and Canada as quickly as possible.

Data Lake

Data Lake Digital Transformation Machine Learning Data Architecture

Demystifying Modern Data Platforms

Cloudera

SEPTEMBER 15, 2022

Mark: The first element in the process is the link between the source data and the entry point into the data platform. At Ramsey International (RI), we refer to that layer in the architecture as the foundation, but others call it a staging area, raw zone, or even a source data lake. What is a data fabric?

Data Lake

Data Lake Data Architecture Data-driven Data Warehouse

Belcorp reimagines R&D with AI

CIO Business Intelligence

JUNE 28, 2023

The initial stage involved establishing the data architecture, which provided the ability to handle the data more effectively and systematically. “We Working with non-typical data presents us with a reality where encountering challenges is part of our daily operations.”

Digital Transformation

Digital Transformation Cost-Benefit Informatics Data mining

Data science vs data analytics: Unpacking the differences

IBM Big Data Hub

SEPTEMBER 19, 2023

Though you may encounter the terms “data science” and “data analytics” being used interchangeably in conversations or online, they refer to two distinctly different concepts. Meanwhile, data analytics is the act of examining datasets to extract value and find answers to specific questions.

Data Science

Data Science Data Analytics Prescriptive Analytics Analytics

HEMA accelerates their data governance journey with Amazon DataZone

AWS Big Data

DECEMBER 19, 2024

Delta tables technical metadata is stored in the Data Catalog, which is a native source for creating assets in the Amazon DataZone business catalog. Access control is enforced using AWS Lake Formation , which manages fine-grained access control and data sharing on data lake data.

Data Governance

Data Governance Publishing Data-driven Metadata

You Can’t Hit What You Can’t See

Cloudera

DECEMBER 1, 2022

Full-stack observability is a critical requirement for effective modern data platforms to deliver the agile, flexible, and cost-effective environment organizations are looking for. RI is a global leader in the design and deployment of large-scale, production-level modern data platforms for the world’s largest enterprises.

Data Quality

Data Quality Metrics Data Lake Statistics

Unlock scalability, cost-efficiency, and faster insights with large-scale data migration to Amazon Redshift

AWS Big Data

AUGUST 1, 2024

Success criteria alignment by all stakeholders (producers, consumers, operators, auditors) is key for successful transition to a new Amazon Redshift modern data architecture. The success criteria are the key performance indicators (KPIs) for each component of the data workflow.

Data Warehouse

Data Warehouse KPI Optimization Cost-Benefit

Deep dive into the AWS ProServe Hadoop Migration Delivery Kit TCO tool

AWS Big Data

FEBRUARY 6, 2023

Based on the statistics of individual and aggregated application runs per queue and per user, you can determine the existing workload distribution by user. He also understands how to apply technologies to solve big data problems and build a well-designed data architecture. You can find peak and off-peak hours in a day.

Dashboards

Dashboards Optimization Data Lake Cost-Benefit

A Retrospective of 2018’s Articles

Peter James Thomas

APRIL 9, 2019

These are as follows: General Data Articles. Data Visualisation. Statistics & Data Science. Analytics & Big Data. Data Visualisation. Statistics & Data Science. Data Science Challenges – It’s Deja Vu all over again! Analytics & Big Data. CDO perspectives.

Data-driven

Data-driven Statistics Data Science Big Data

Visualize data quality scores and metrics generated by AWS Glue Data Quality

AWS Big Data

JUNE 6, 2023

The purpose of this step is to understand our data quality statistics at the table level as well as at the ruleset level. Use the queries in this section to analyze your data quality metrics and create an Athena view to use to build a QuickSight dashboard in the next step.

Data Quality

Data Quality Metrics Visualization Dashboards

Introducing the AWS ProServe Hadoop Migration Delivery Kit TCO tool

AWS Big Data

FEBRUARY 6, 2023

Refactoring coupled compute and storage to a decoupling architecture is a modern data solution. It enables compute such as EMR instances and storage such as Amazon Simple Storage Service (Amazon S3) data lakes to scale. George Zhao is a Senior Data Architect at AWS ProServe.

Cost-Benefit

Cost-Benefit Data Lake Dashboards Big Data

Why We Started the Data Intelligence Project

Alation

JULY 7, 2022

In the 2010s, the growing scope of the data landscape gave rise to a new profession: the data scientist. This new role, combined with the creation of data lakes and the increasing use of cloud services, created new employment opportunities in data analytics, data architecture, and data management.

Metadata

Metadata Data-driven Insurance Statistics

Convergent Evolution

Peter James Thomas

AUGUST 18, 2018

That was the Science, here comes the Technology… A Brief Hydrology of Data Lakes. Even back then, these were used for activities such as Analytics , Dashboards , Statistical Modelling , Data Mining and Advanced Visualisation. Of course some architectures featured both paradigms as well.

Data Lake

Data Lake Data Warehouse Data mining Statistics

Your Data Architecture Holds the Key to Unlocking AI’s Full Potential

CIO Business Intelligence

APRIL 4, 2023

In order to move AI forward, we need to first build and fortify the foundational layer: data architecture. This architecture is important because, to reap the full benefits of AI, it must be built to scale across an enterprise versus individual AI applications. Constructing the right data architecture cannot be bypassed.

Data Architecture

Data Architecture Data Lake Data Warehouse Cost-Benefit

Data Leaders Brief

Accelerate Amazon Redshift Data Lake queries with AWS Glue Data Catalog Column Statistics

Simplify data integration with AWS Glue and zero-ETL to Amazon SageMaker Lakehouse

Webinars

Trending Sources

Choosing an open table format for your transactional data lake on AWS

Webinars

Use Apache Iceberg in your data lake with Amazon S3, AWS Glue, and Snowflake

Query your Iceberg tables in data lake using Amazon Redshift (Preview)

Simplify operational data processing in data lakes using AWS Glue and Apache Hudi

Automate replication of relational sources into a transactional data lake with Apache Iceberg and AWS Glue

What is a data architect? Skills, salaries, and how to become a data framework master

Top analytics announcements of AWS re:Invent 2024

A Day in the Life of a DataOps Engineer

Amazon Redshift announcements at AWS re:Invent 2023 to enable analytics on all your data

Real estate CIOs drive deals with data

Demystifying Modern Data Platforms

Belcorp reimagines R&D with AI

Data science vs data analytics: Unpacking the differences

HEMA accelerates their data governance journey with Amazon DataZone

You Can’t Hit What You Can’t See

Unlock scalability, cost-efficiency, and faster insights with large-scale data migration to Amazon Redshift

Deep dive into the AWS ProServe Hadoop Migration Delivery Kit TCO tool

A Retrospective of 2018’s Articles

Visualize data quality scores and metrics generated by AWS Glue Data Quality

Introducing the AWS ProServe Hadoop Migration Delivery Kit TCO tool

Why We Started the Data Intelligence Project

Convergent Evolution

Your Data Architecture Holds the Key to Unlocking AI’s Full Potential

Stay Connected