Data Analytics, Data Architecture and Data Processing


Modernize your legacy databases with AWS data lakes, Part 2: Build a data lake using AWS DMS data on Apache Iceberg

AWS Big Data

OCTOBER 30, 2024

This is part two of a three-part series where we show how to build a data lake on AWS using a modern data architecture. This post shows how to load data from a legacy database (SQL Server) into a transactional data lake (Apache Iceberg) using AWS Glue.

Data Lake, Data Processing, Optimization, Machine Learning

Eight Top DataOps Trends for 2022

DataKitchen

NOVEMBER 29, 2021

Data Gets Meshier. 2022 will bring further momentum behind modular enterprise architectures like data mesh. The data mesh addresses the problems characteristic of large, complex, monolithic data architectures by dividing the system into discrete domains managed by smaller, cross-functional teams.

Testing, Data Lake, Data Architecture, Manufacturing

How EUROGATE established a data mesh architecture using Amazon DataZone

AWS Big Data

JANUARY 15, 2025

Need for a data mesh architecture: Because entities in the EUROGATE group generate vast amounts of data from various sources (across departments, locations, and technologies), the traditional centralized data architecture struggles to keep up with the demands for real-time insights, agility, and scalability.

IoT, Machine Learning, Metadata, Data-driven

Deciphering the Pros & Cons of Real-Time Data Streaming

Smart Data Collective

SEPTEMBER 15, 2021

The data architecture assimilates and processes sizable volumes of streaming data from different data sources. It ingests data as soon as it is generated. Real-time data streaming enables an organization to act in the moment, which eventually enables it to prosper.

IoT, Business Objectives, Manufacturing, Software

Unlocking near real-time analytics with petabytes of transaction data using Amazon Aurora Zero-ETL integration with Amazon Redshift and dbt Cloud

AWS Big Data

NOVEMBER 27, 2024

Together with price-performance, Amazon Redshift offers capabilities such as serverless architecture, machine learning integration within your data warehouse, and secure data sharing across the organization. dbt Cloud is a hosted service that helps data teams productionize dbt deployments.

Data Warehouse, Analytics, Testing, Modeling

Announcing the 2020 Data Impact Award Winners

Cloudera

NOVEMBER 18, 2020

The technological linchpin of its digital transformation has been its Enterprise Data Architecture & Governance platform. It hosts over 150 big data analytics sandboxes across the region with over 200 users utilizing the sandbox for data discovery. times more effective than traditional mass marketing.

Internet Publishing and Broadcasting, Data-driven, Broadcasting, Digital Transformation

Power analytics as a service capabilities using Amazon Redshift

AWS Big Data

APRIL 17, 2024

It offers features like data sharing, Amazon Redshift ML, Amazon Redshift Spectrum, and Amazon Redshift Serverless, which simplify application building and make it effortless for AaaS companies to embed rich data analytics capabilities. times better price-performance than other cloud data warehouses.

Data Warehouse, Analytics, Cost-Benefit, Data Processing

Build SAML identity federation for Amazon OpenSearch Service domains within a VPC

AWS Big Data

FEBRUARY 7, 2024

Create an Amazon Route 53 public hosted zone such as mydomain.com to be used for routing internet traffic to your domain. For instructions, refer to Creating a public hosted zone. Request an AWS Certificate Manager (ACM) public certificate for the hosted zone. hosted_zone_id – The Route 53 public hosted zone ID.

Dashboards, Data Processing, Metadata, Consulting

Public or On-Prem? Telco giants are optimizing the network with the Hybrid Cloud

Cloudera

OCTOBER 19, 2022

The telecommunications industry continues to develop hybrid data architectures to support data workload virtualization and cloud migration. Telco organizations are planning to move towards hybrid multi-cloud to manage data better and support their workforces in the near future. 2- AI capability drives data monetization.

Optimization, Data Architecture, Data Governance, B2B

Design a data mesh pattern for Amazon EMR-based data lakes using AWS Lake Formation with Hive metastore federation

AWS Big Data

JUNE 10, 2024

The producer account will host the EMR cluster and S3 buckets. The catalog account will host Lake Formation and AWS Glue. The consumer account will host EMR Serverless, Athena, and SageMaker notebooks. By using Data Catalog metadata federation, organizations can construct a sophisticated data architecture.

Data Lake, Metadata, Data Warehouse, Data Processing

How Swisscom automated Amazon Redshift as part of their One Data Platform solution using AWS CDK – Part 1

AWS Big Data

JUNE 12, 2024

Swisscom’s Data, Analytics, and AI division is building a One Data Platform (ODP) solution that will enable every Swisscom employee, process, and product to benefit from the massive value of Swisscom’s data. The following high-level architecture diagram shows ODP with different layers of the modern data architecture.

Data Architecture, Cost-Benefit, Data-driven, Experimentation

Understanding Digital Interactions in Real-Time

CIO Business Intelligence

JUNE 29, 2022

But this glittering prize might cause some organizations to overlook something significantly more important: constructing the kind of event-driven data architecture that supports robust real-time analytics. An event-based, real-time data architecture is precisely how businesses today create the experiences that consumers expect.

Interactive, Data-driven, Data Architecture, Software

Build a secure data visualization application using the Amazon Redshift Data API with AWS IAM Identity Center

AWS Big Data

MARCH 6, 2025

In today's data-driven world, securely accessing, visualizing, and analyzing data is essential for making informed business decisions. The Amazon Redshift Data API simplifies access to your Amazon Redshift data warehouse by removing the need to manage database drivers, connections, network configurations, data buffering, and more.

Visualization, Sales, Data Warehouse, Management

4 paths to sustainable AI

CIO Business Intelligence

JANUARY 31, 2024

The size of the data sets is limited by business concerns. Use renewable energy: Hosting AI operations at a data center that uses renewable power is a straightforward path to reduce carbon emissions, but it’s not without tradeoffs. Data analytics lead Diego Cáceres urges caution about when to use AI.

Cost-Benefit, Modeling, Testing, IoT

Habib Bank manages data at scale with Cloudera Data Platform

Cloudera

NOVEMBER 17, 2022

The bank will be able to secure, manage, and analyse huge volumes of structured and unstructured data, with the analytic tool of their choice. “Cloudera’s CDP is the only solution that can address the system, hosting, integration and security, enabling us to deploy quickly and easily with minimal impact to operations.”

Management, Data Lake, Consulting, Unstructured Data

Design a data mesh on AWS that reflects the envisioned organization

AWS Big Data

JANUARY 22, 2024

Cost and resource efficiency – This is an area where Acast observed a reduction in data duplication, and therefore cost reduction (in some accounts, removing the copy of data 100%), by reading data across accounts while enabling scaling. Srikant Das is an Acceleration Lab Solutions Architect at Amazon Web Services.

Data-driven, Advertising, Metadata, Data Architecture

Migrate Microsoft Azure Synapse Analytics to Amazon Redshift using AWS SCT

AWS Big Data

OCTOBER 18, 2023

Many customers migrate their data warehousing workloads to Amazon Redshift and benefit from the rich capabilities it offers, such as the following: Amazon Redshift seamlessly integrates with broader data, analytics, and AI or machine learning (ML) services on AWS, enabling you to choose the right tool for the right job.

Analytics, Data Warehouse, Dashboards, Testing

Meet the newest Data Superheros: The Sixth Annual Data Impact Awards Finalists Are…

Cloudera

AUGUST 28, 2018

These thought leaders in data management and analytics represent all areas of the industry from executives and industry analysts to professors and media experts. However, this year, it is evident that the pace of acceleration to modern data architectures has intensified.” – Cornelia Levy-Bencheton

Machine Learning, Digital Transformation, Consulting, IoT

Unlock scalability, cost-efficiency, and faster insights with large-scale data migration to Amazon Redshift

AWS Big Data

AUGUST 1, 2024

Alignment on success criteria by all stakeholders (producers, consumers, operators, auditors) is key for a successful transition to a new Amazon Redshift modern data architecture. The success criteria are the key performance indicators (KPIs) for each component of the data workflow.

Data Warehouse, KPI, Optimization, Cost-Benefit

Migrate a petabyte-scale data warehouse from Actian Vectorwise to Amazon Redshift

AWS Big Data

MAY 30, 2024

It also helps you securely access your data in operational databases, data lakes, or third-party datasets with minimal movement or copying of data. Tens of thousands of customers use Amazon Redshift to process large amounts of data, modernize their data analytics workloads, and provide insights for their business users.

Data Warehouse, Data Lake, Cost-Benefit, Structured Data

Orca Security’s journey to a petabyte-scale data lake with Apache Iceberg and AWS Analytics

AWS Big Data

JULY 20, 2023

With data becoming the driving force behind many industries today, having a modern data architecture is pivotal for organizations to be successful. This data is sent to Apache Kafka, which is hosted on Amazon Managed Streaming for Apache Kafka (Amazon MSK).

Data Lake, Analytics, Snapshot, Data Quality

VeloxCon 2024: Innovation in data management

IBM Big Data Hub

APRIL 29, 2024

VeloxCon 2024, the premier developer conference that is dedicated to the Velox open-source project, brought together industry leaders, engineers, and enthusiasts to explore the latest advancements and collaborative efforts shaping the future of data management.

Management, Optimization, Data Processing, Metrics

Create an end-to-end data strategy for Customer 360 on AWS

AWS Big Data

MARCH 26, 2024

Strategize based on how your teams explore data, run analyses, wrangle data for downstream requirements, and visualize data at different levels. Plan on how you can enable your teams to use ML to move from descriptive to prescriptive analytics.

Data Strategy, Strategy, Data Warehouse, Prescriptive Analytics

Build incremental data pipelines to load transactional data changes using AWS DMS, Delta 2.0, and Amazon EMR Serverless

AWS Big Data

MARCH 3, 2023

You can then apply transformations and store data in Delta format for managing inserts, updates, and deletes. Amazon EMR Serverless is a serverless option in Amazon EMR that makes it easy for data analysts and engineers to run open-source big data analytics frameworks without configuring, managing, and scaling clusters or servers.

Data Lake, Dashboards, Metrics, Metadata

How Cargotec uses metadata replication to enable cross-account data sharing

AWS Big Data

JUNE 7, 2023

Metadata exporter: This section provides details on the AWS Glue job that exports the AWS Glue Data Catalog into an S3 location. The source code for the application is hosted on the AWS Glue GitHub. He advises clients on architecting and adopting Data Architectures that best serve their Data Analytics and Machine Learning needs.

Metadata, Data Lake, Machine Learning, Big Data

The Gartner 2022 Leadership Vision for Data and Analytics Leaders Questions and Answers

Andrew White

JANUARY 9, 2022

On Thursday, January 6th, I hosted Gartner’s 2022 Leadership Vision for Data and Analytics webinar. To drive a successful Data Analytics strategy, do you think it is a multidisciplinary activity and, if so, what additional roles would you expect to see involved? We write about data and analytics.

Analytics, Measurement, Data-driven, Modeling

The future of data: A 5-pillar approach to modern data management

CIO Business Intelligence

DECEMBER 11, 2024

Manish Limaye. Pillar #1: Data platform. The data platform pillar comprises tools, frameworks, and processing and hosting technologies that enable an organization to process large volumes of data, both in batch and streaming modes.

Management, Data Governance, Data Science, Reporting

How Amazon Finance Automation built a data mesh to support distributed data ownership and centralize governance

AWS Big Data

JULY 14, 2023

These inputs reinforced the need for a unified data strategy across the FinOps teams. We decided to build a scalable data management product that is based on the best practices of modern data architecture. Our source system and domain teams were mapped as data producers, and they would have ownership of the datasets.

Finance, Metadata, Big Data, Recreation/Entertainment

Empowering data-driven excellence: How the Bluestone Data Platform embraced data mesh for success

AWS Big Data

FEBRUARY 27, 2024

Four-layered data lake and data warehouse architecture – The architecture comprises four layers, including the analytical layer, which houses purpose-built facts and dimension datasets that are hosted in Amazon Redshift.

Data-driven, Data Lake, Data Quality, Data Governance

How smava makes loans transparent and affordable using Amazon Redshift Serverless

AWS Big Data

DECEMBER 21, 2023

Overview of solution: As a data-driven company, smava relies on the AWS Cloud to power their analytics use cases. smava ingests data from various external and internal data sources into a landing stage on the data lake based on Amazon Simple Storage Service (Amazon S3).

Data Lake, Data Warehouse, Data-driven, B2B

What Is Embedded Analytics?

Jet Global

MAY 1, 2023

Third-party data might include industry benchmarks, data feeds (such as weather and social media), and/or anonymized customer data. Four Approaches to Data Analytics: The world of data analytics is constantly and quickly changing. The application thus becomes a vital information hub.

Analytics, Cost-Benefit, Visualization, Dashboards

Modernize your legacy databases with AWS data lakes, Part 3: Build a data lake processing layer

AWS Big Data

OCTOBER 30, 2024

This is the final part of a three-part series where we show how to build a data lake on AWS using a modern data architecture. This post shows how to process data with Amazon Redshift Spectrum and create the gold (consumption) layer. His focus areas are MLOps, feature stores, data lakes, model hosting, and generative AI.

Data Lake, Machine Learning, Data Architecture, Data-driven
