Data architecture definition: Data architecture describes the structure of an organization's logical and physical data assets and data management resources, according to The Open Group Architecture Framework (TOGAF). An organization's data architecture is the purview of data architects.
This is part two of a three-part series in which we show how to build a data lake on AWS using a modern data architecture. This post shows how to load data from a legacy database (SQL Server) into a transactional data lake (Apache Iceberg) using AWS Glue.
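As a rough illustration of that load pattern (not the post's actual code), the sketch below reads a SQL Server table over JDBC with Spark and appends it to an Iceberg table registered in the AWS Glue Data Catalog. The connection URL, credentials, bucket, and table names are all placeholders, and the SQL Server JDBC driver is assumed to be on the job's classpath.

```python
# Minimal sketch, assuming a Glue/Spark job with Iceberg and the SQL Server
# JDBC driver available. All names below are placeholders.
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .config("spark.sql.catalog.glue_catalog", "org.apache.iceberg.spark.SparkCatalog")
    .config("spark.sql.catalog.glue_catalog.catalog-impl", "org.apache.iceberg.aws.glue.GlueCatalog")
    .config("spark.sql.catalog.glue_catalog.warehouse", "s3://example-bucket/warehouse/")  # placeholder bucket
    .config("spark.sql.extensions", "org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions")
    .getOrCreate()
)

# Read the legacy table over JDBC (connection details are placeholders).
orders = (
    spark.read.format("jdbc")
    .option("url", "jdbc:sqlserver://legacy-host:1433;databaseName=sales")
    .option("dbtable", "dbo.orders")
    .option("user", "example_user")
    .option("password", "example_password")
    .load()
)

# Create or replace an Iceberg table in the Glue Data Catalog with this data.
orders.writeTo("glue_catalog.bronze.orders").createOrReplace()
```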
The landscape of big data management has been transformed by the rising popularity of open table formats such as Apache Iceberg, Apache Hudi, and Linux Foundation Delta Lake. These formats, designed to address the limitations of traditional data storage systems, have become essential in modern data architectures.
The Analytics specialty practice of AWS Professional Services (AWS ProServe) helps customers across the globe with modern data architecture implementations on the AWS Cloud. Of those tables, some are larger than others (in record volume, for example), and some are updated more frequently than others.
Analytics remained one of the key focus areas this year, with significant updates and innovations aimed at helping businesses harness their data more efficiently and accelerate insights. From enhancing data lakes to empowering AI-driven analytics, AWS unveiled new tools and services that are set to shape the future of data and analytics.
But at the other end of the attention spectrum is data management, which all too frequently is perceived as being boring, tedious, the work of clerks and admins, and ridiculously expensive. Still, to truly create lasting value with data, organizations must develop data management mastery.
For those in the data world, this post provides a curated guide to all analytics sessions that you can use to quickly schedule and build your itinerary. Book your spot early for the sessions you do not want to miss. 11:30 AM – 12:30 PM (PDT), Caesars Forum, ANT318 | Accelerate innovation with end-to-end serverless data architecture.
To bring their customers the best deals and user experience, smava follows the modern data architecture principles with a data lake as a scalable, durable data store and purpose-built data stores for analytical processing and data consumption.
There were thousands of attendees at the event – lining up for book signings and meetings with recruiters to fill the endless job openings for developers experienced with MapReduce and managing Big Data. This was the gold rush of the 21st century, except the gold was data.
This dynamic integration of streaming data enables generative AI applications to respond promptly to changing conditions, improving their adaptability and overall performance in various tasks. To better understand this, imagine a chatbot that helps travelers book their travel.
The following is a high-level architecture of the solution we can build to process the unstructured data, assuming the input data is ingested into the raw input object store. The steps of the workflow are as follows: integrated AI services extract data from the unstructured input.
Kinesis Data Streams has native integrations with other AWS services such as AWS Glue and Amazon EventBridge to build real-time streaming applications on AWS. Refer to Amazon Kinesis Data Streams integrations for additional details.
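For context, a minimal producer sketch with boto3 is shown below; the region, stream name, and event payload are assumptions, not taken from the post.

```python
# Hypothetical producer: putting a JSON event onto a Kinesis data stream with boto3.
import json
import boto3

kinesis = boto3.client("kinesis", region_name="us-east-1")  # placeholder region

event = {"order_id": "1001", "status": "shipped"}  # placeholder payload

kinesis.put_record(
    StreamName="example-orders-stream",      # placeholder stream name
    Data=json.dumps(event).encode("utf-8"),  # Kinesis expects bytes
    PartitionKey=event["order_id"],          # controls shard assignment
)
```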
This article offers a framework for building momentum in the early stages of a Data Programme: a review of some of the problems that can beset Data Lakes, together with some ideas about how to fix them, from Dan Woods (Forbes), Paul Barth (Podium Data), and Dave Wells (Eckerson Group).
The way that this consistency of figures is achieved is by all elements of the Structured Reporting Framework drawing their data from the same data repositories. Without paying attention to this, your shiny warehouse or data lake will be a technological curiosity, not an indispensable business tool.
Trino allows users to run ad hoc queries across massive datasets, making real-time decision-making a reality without needing extensive data transformations. This is particularly valuable for teams that require instant answers from their data. Data Lake Analytics: Trino doesn’t just stop at databases.
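To make the ad hoc querying concrete, here is an illustrative query issued through the Trino Python client; the host, catalog, schema, and table names are placeholders.

```python
# Sketch of an ad hoc aggregation run directly over data lake tables via Trino,
# with no pre-built ETL step. All connection details are hypothetical.
from trino.dbapi import connect

conn = connect(
    host="trino.example.com",  # placeholder coordinator host
    port=8080,
    user="analyst",
    catalog="hive",
    schema="sales",
)
cur = conn.cursor()

cur.execute("""
    SELECT region, count(*) AS orders
    FROM orders
    GROUP BY region
    ORDER BY orders DESC
    LIMIT 10
""")
for region, orders in cur.fetchall():
    print(region, orders)
```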
FGAC enables you to granularly control access to your data lake resources at the table, column, and row levels through Lake Formation permissions. This level of control is essential for organizations that need to comply with data governance and security regulations, or that deal with sensitive data.
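As a hedged sketch of what column-level control can look like, the example below grants SELECT on a subset of columns using the Lake Formation API via boto3; the role ARN, database, table, and column names are hypothetical.

```python
# Illustrative column-level grant with AWS Lake Formation (boto3).
# All identifiers are placeholders, not from the post.
import boto3

lf = boto3.client("lakeformation")

lf.grant_permissions(
    Principal={"DataLakePrincipalIdentifier": "arn:aws:iam::123456789012:role/AnalystRole"},
    Resource={
        "TableWithColumns": {
            "DatabaseName": "sales",
            "Name": "orders",
            "ColumnNames": ["order_id", "order_date", "region"],  # expose only non-sensitive columns
        }
    },
    Permissions=["SELECT"],
)
```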
Use existing AWS Glue tables. This section has the following prerequisite: a data lake administrator user, created by following Create a data lake administrator. For detailed instructions, see Revoking permissions using the Lake Formation console.
This is the final part of a three-part series where we show how to build a data lake on AWS using a modern data architecture. This post shows how to process data with Amazon Redshift Spectrum and create the gold (consumption) layer. The following diagram illustrates the different layers of the data lake.
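A rough sketch of that step, assuming a Spectrum external schema already points at the silver layer of the data lake, might run a gold-layer CTAS through the Redshift Data API; all identifiers below are placeholders, not the post's actual names.

```python
# Hypothetical gold-layer build: run SQL against Redshift (and Spectrum) via the
# Redshift Data API (boto3 "redshift-data"). Cluster, database, user, schemas,
# and tables are all placeholders.
import boto3

rsd = boto3.client("redshift-data")

rsd.execute_statement(
    ClusterIdentifier="example-cluster",  # placeholder cluster
    Database="analytics",
    DbUser="admin",
    Sql="""
        CREATE TABLE gold.daily_sales AS
        SELECT sale_date, sum(amount) AS total_amount
        FROM spectrum_silver.sales      -- external (Spectrum) schema over the data lake
        GROUP BY sale_date;
    """,
)
```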