Data Analytics, Data Transformation and Metadata

How EUROGATE established a data mesh architecture using Amazon DataZone

AWS Big Data

JANUARY 15, 2025

In addition to real-time analytics and visualization, the data needs to be shared for long-term data analytics and machine learning applications. From here, the metadata is published to Amazon DataZone by using AWS Glue Data Catalog. This process is shown in the following figure.

IoT, Machine Learning, Metadata, Data-driven

How ANZ Institutional Division built a federated data platform to enable their domain teams to build data products to support business outcomes

AWS Big Data

DECEMBER 4, 2024

Institutional Data & AI Platform architecture: The Institutional Division has implemented a self-service data platform to enable the domain teams to build and manage data products autonomously. The following diagram illustrates the building blocks of the Institutional Data & AI Platform.

Metadata, Data Governance, Data Quality, Data-driven

Modernize your ETL platform with AWS Glue Studio: A case study from BMS

AWS Big Data

DECEMBER 13, 2023

In addition to using native managed AWS services that BMS didn’t need to worry about upgrading, BMS was looking to offer an ETL service to non-technical business users that could visually compose data transformation workflows and seamlessly run them on the AWS Glue Apache Spark-based serverless data integration engine.

Metadata, Data Lake, Visualization, Data Quality

Modernize a legacy real-time analytics application with Amazon Managed Service for Apache Flink

AWS Big Data

OCTOBER 11, 2023

The lift and shift migration approach is limited in its ability to transform businesses because it relies on outdated, legacy technologies and architectures that limit flexibility and slow down productivity. It shows a call center streaming data source that sends the latest call center feed every 15 seconds.

Management, Metadata, Analytics, Dashboards

How to Build a Successful Metadata Management Framework

Alation

JUNE 28, 2022

This is where metadata, or the data about data, comes into play. Having a data catalog is the cornerstone of your data governance strategy, but what supports your data catalog? Your metadata management framework provides the underlying structure that makes your data accessible and manageable.

Metadata, Management, Data Governance, Machine Learning

Biggest Trends in Data Visualization Taking Shape in 2022

Smart Data Collective

OCTOBER 13, 2021

There are countless examples of big data transforming many different industries. There is no disputing that the collection and analysis of massive amounts of unstructured data has been a huge breakthrough. When it comes to big data, data visualization is crucial for driving high-level decision making more successfully.

Visualization, Cost-Benefit, Big Data, Prescriptive Analytics

Lay the groundwork now for advanced analytics and AI

CIO Business Intelligence

AUGUST 3, 2023

When global technology company Lenovo started using data analytics, the analytics helped identify a new market niche for its gaming laptops and powered remote diagnostics so its customers got the most from their servers and other devices. “Without those templates, it’s hard to add such information after the fact.”

Analytics, Data Lake, Metadata, Cost-Benefit

Deliver decompressed Amazon CloudWatch Logs to Amazon S3 and Splunk using Amazon Data Firehose

AWS Big Data

APRIL 2, 2024

You can use Amazon Data Firehose to aggregate and deliver log events from your applications and services captured in Amazon CloudWatch Logs to your Amazon Simple Storage Service (Amazon S3) bucket and Splunk destinations, for use cases such as data analytics, security analysis, and application troubleshooting.

Metadata, Marketing, Analytics, Data Transformation
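
To make the "decompressed" part of the entry above concrete: CloudWatch Logs hands subscription data to downstream consumers as gzip-compressed JSON, which arrives base64-encoded inside each record. The following is a minimal sketch of what unpacking one such record involves, not the article's implementation. The function name and the sample values are hypothetical, and the sample record is built locally rather than taken from a real log group.

```python
import base64
import gzip
import json


def decode_cloudwatch_record(data_b64: str) -> list[str]:
    """Return the log messages carried by one base64-encoded, gzip-compressed record."""
    payload = json.loads(gzip.decompress(base64.b64decode(data_b64)))
    # Control messages carry no log events, so they are skipped.
    if payload.get("messageType") != "DATA_MESSAGE":
        return []
    return [event["message"] for event in payload["logEvents"]]


# Build a sample record locally (hypothetical values) to exercise the function.
sample = {
    "messageType": "DATA_MESSAGE",
    "logGroup": "/example/app",
    "logStream": "stream-1",
    "logEvents": [
        {"id": "1", "timestamp": 1712016000000, "message": "started"},
        {"id": "2", "timestamp": 1712016001000, "message": "finished"},
    ],
}
encoded = base64.b64encode(gzip.compress(json.dumps(sample).encode("utf-8"))).decode("ascii")
messages = decode_cloudwatch_record(encoded)
```

When the built-in decompression feature described in the entry is enabled, the delivery stream takes care of this step, so a sketch like this is mainly useful for understanding the record shape or for testing locally.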

How HR&A uses Amazon Redshift spatial analytics on Amazon Redshift Serverless to measure digital equity in states across the US

AWS Big Data

DECEMBER 5, 2023

A combination of Amazon Redshift Spectrum and COPY commands is used to ingest the survey data stored as CSV files. For the files with unknown structures, AWS Glue crawlers are used to extract metadata and create table definitions in the Data Catalog. She helps customers architect data analytics solutions at scale on AWS.

Measurement, Dashboards, Analytics, Data Warehouse

How healthcare organizations can analyze and create insights using price transparency data

AWS Big Data

OCTOBER 11, 2023

The data in the machine-readable files can provide valuable insights to understand the true cost of healthcare services and compare prices and quality across hospitals. The availability of machine-readable files opens up new possibilities for data analytics, allowing organizations to analyze large amounts of pricing data.

Visualization, Dashboards, Data-driven, Gap analysis

From Disparate Data to Visualized Knowledge Part I: Moving from Spreadsheets to an RDF Database

Ontotext

NOVEMBER 18, 2021

Picture this – you start with the perfect use case for your data analytics product. And all of them are asking hard questions: “Can you integrate my data, with my particular format?”, “How well can you scale?”, “How many visualizations do you offer?”. Nowadays, data analytics doesn’t exist on its own.

Visualization, Reporting, Metadata, Enterprise

Gain insights from historical location data using Amazon Location Service and AWS analytics services

AWS Big Data

MARCH 13, 2024

Data analytics – Business analysts gather operational insights from multiple data sources, including the location data collected from the vehicles. You can also use the data transformation feature of Data Firehose to invoke a Lambda function to perform data transformation in batches.

Analytics, IoT, Metadata, Internet of Things

How smava makes loans transparent and affordable using Amazon Redshift Serverless

AWS Big Data

DECEMBER 21, 2023

To ingest the data, smava uses a set of popular third-party customer data platforms complemented by custom scripts. After the data lands in Amazon S3, smava uses the AWS Glue Data Catalog and crawlers to automatically catalog the available data, capture the metadata, and provide an interface that allows querying all data assets.

Data Lake, Data Warehouse, Data-driven, B2B

Empower your Jira data in a data lake with Amazon AppFlow and AWS Glue

AWS Big Data

AUGUST 1, 2023

For GlueDatabaseName, enter a unique name for the Data Catalog database to hold the Jira data table metadata (the default is jiralake). This mode will scan all data and disable the change data capture (CDC) features of the stack. The DataBrew job performs data transformation and filtering tasks.

Data Lake, Data Transformation, Data-driven, Cost-Benefit

Orca Security’s journey to a petabyte-scale data lake with Apache Iceberg and AWS Analytics

AWS Big Data

JULY 20, 2023

Specifically, the system uses Amazon SageMaker Processing jobs to process the data stored in the data lake, employing the AWS SDK for Pandas (previously known as AWS Wrangler) for various data transformation operations, including cleaning, normalization, and feature engineering.

Data Lake, Analytics, Snapshot, Data Quality

Run Apache Hive workloads using Spark SQL with Amazon EMR on EKS

AWS Big Data

OCTOBER 18, 2023

FINRA centralizes all its data in Amazon Simple Storage Service (Amazon S3) with a remote Hive metastore on Amazon Relational Database Service (Amazon RDS) to manage their metadata information. The support to run Spark SQL through the StartJobRun API in EMR on EKS has further enabled FINRA’s innovation in data analytics.

Big Data, Data Processing, Interactive, Testing

Why The Public Sector Needs Data Governance

Alation

NOVEMBER 22, 2022

Before you implement a data governance framework, you need to know the data you already have. This means you need to inventory data (know all information resources and relevant metadata) and classify data (organize structured and unstructured data into relevant categories). Reuse metadata productively.

Data Governance, Metadata, Data-driven, Unstructured Data

Tackling AI’s data challenges with IBM databases on AWS

IBM Big Data Hub

MARCH 14, 2024

The solution: IBM databases on AWS. To solve these challenges, IBM’s portfolio of SaaS database solutions on Amazon Web Services (AWS) enables enterprises to scale applications, analytics and AI across the hybrid cloud landscape.

Cost-Benefit, Metadata, Optimization, Management

Data Mesh 101: How Data Mesh Helps Organizations Be Data-Driven and Achieve Velocity

Ontotext

FEBRUARY 12, 2024

For many organizations, a centralized data platform will fall short as it gives data teams much less autonomy over managing increasingly diverse and voluminous datasets. In an era where data analytics means competitive differentiation, it’s critical for decision-makers to have access to the data they need, when they need it.

Data-driven, Data Lake, Data Quality, Business Objectives

Build incremental data pipelines to load transactional data changes using AWS DMS, Delta 2.0, and Amazon EMR Serverless

AWS Big Data

MARCH 3, 2023

You can then apply transformations and store data in Delta format for managing inserts, updates, and deletes. Amazon EMR Serverless is a serverless option in Amazon EMR that makes it easy for data analysts and engineers to run open-source big data analytics frameworks without configuring, managing, and scaling clusters or servers.

Data Lake, Dashboards, Metrics, Metadata
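
The entry above mentions managing inserts, updates, and deletes when loading transactional changes. As a rough illustration of that idea only, the sketch below applies a batch of change records to an in-memory table using plain Python. It is not Delta Lake's API and not the article's pipeline: the function name, the `id` key column, and the sample rows are hypothetical, and the `Op` values (`I`, `U`, `D`) follow the operation column that AWS DMS writes for change data.

```python
def apply_changes(table: dict[int, dict], changes: list[dict]) -> dict[int, dict]:
    """Apply ordered change records to a table keyed by primary key; return a new table."""
    merged = dict(table)  # copy so the input table is left untouched
    for change in changes:
        key = change["id"]
        if change["Op"] == "D":
            # Delete: drop the row if it exists.
            merged.pop(key, None)
        else:
            # Insert ("I") or update ("U"): upsert the row without the Op marker.
            merged[key] = {k: v for k, v in change.items() if k != "Op"}
    return merged


current = {1: {"id": 1, "name": "Ann"}, 2: {"id": 2, "name": "Bob"}}
changes = [
    {"Op": "U", "id": 1, "name": "Anna"},
    {"Op": "D", "id": 2, "name": "Bob"},
    {"Op": "I", "id": 3, "name": "Cy"},
]
result = apply_changes(current, changes)
```

A table format such as Delta performs the equivalent merge transactionally on files in storage; the dictionary here only stands in for that behavior so the insert, update, and delete semantics are easy to follow.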

What Is Embedded Analytics?

Jet Global

MAY 1, 2023

Third-party data might include industry benchmarks, data feeds (such as weather and social media), and/or anonymized customer data. Four Approaches to Data Analytics: The world of data analytics is constantly and quickly changing. Data Transformation and Enrichment: Data can be enriched for analysis.

Analytics, Cost-Benefit, Visualization, Dashboards

Hybrid big data analytics with Amazon EMR on AWS Outposts

AWS Big Data

JANUARY 29, 2025

times more performant than Apache Spark 3.5.1), and ease of Amazon EMR with the control and proximity of your data center, empowering enterprises to meet stringent regulatory and operational requirements while unlocking new data processing possibilities. This method is ideal for recurring tasks or large-scale data transformations.

Big Data, Data Analytics, Analytics, Interactive

Stream real-time data into Apache Iceberg tables in Amazon S3 using Amazon Data Firehose

AWS Big Data

NOVEMBER 6, 2024

To learn more about how to process Firehose records using Lambda, see Transform source data in Amazon Data Firehose. After executing your Lambda function, Firehose looks for routing information and operations in the metadata fields (in the following format) provided by your Lambda function. b64decode(record['data']).decode('utf-8')

Metadata, Data Lake, Management, Internet of Things
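
The entry above describes a Lambda function that decodes each Firehose record and returns routing information in metadata fields. A minimal sketch of such a handler follows, under stated assumptions: the `recordId`, `result`, and base64 `data` fields are the standard Firehose transformation response, while the `otfMetadata` key names, the database and table names, and the `processed` flag are assumptions for illustration and should be checked against the Firehose documentation before use.

```python
import base64
import json


def lambda_handler(event, context):
    """Decode each incoming record, apply a small change, and return it with routing hints."""
    output = []
    for record in event["records"]:
        # Firehose delivers the payload base64-encoded.
        payload = json.loads(base64.b64decode(record["data"]).decode("utf-8"))
        payload["processed"] = True  # hypothetical transformation

        output.append({
            "recordId": record["recordId"],
            "result": "Ok",
            "data": base64.b64encode(json.dumps(payload).encode("utf-8")).decode("utf-8"),
            # Assumed routing metadata; names and values are placeholders.
            "metadata": {
                "otfMetadata": {
                    "destinationDatabaseName": "example_db",
                    "destinationTableName": "example_table",
                    "operation": "insert",
                }
            },
        })
    return {"records": output}


# Local invocation with a hand-built event (no AWS resources involved).
event = {"records": [{"recordId": "r1", "data": base64.b64encode(b'{"device": "a"}').decode("utf-8")}]}
response = lambda_handler(event, None)
```

Every input record must be returned with its original `recordId`, otherwise the delivery stream cannot match responses to inputs.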

Streamline AWS WAF log analysis with Apache Iceberg and Amazon Data Firehose

AWS Big Data

FEBRUARY 18, 2025

To optimize their security operations, organizations are adopting modern approaches that combine real-time monitoring with scalable data analytics. They are using data lake architectures and Apache Iceberg to efficiently process large volumes of security data while minimizing operational overhead.

Snapshot, Optimization, Data Lake, Metadata

“You Complete Me,” said Data Lineage to DataOps Observability.

DataKitchen

JANUARY 23, 2023

To capture a more complete picture of the data’s journey, it is important to have a DataOps Observability system in place. Data lineage is static and often lags by weeks or months. Data lineage is often considered static because it is typically based on snapshots of data and metadata taken at a specific time.

Testing, Data Quality, Data Governance, Data-driven

Ingest telemetry messages in near real time with Amazon API Gateway, Amazon Data Firehose, and Amazon Location Service

AWS Big Data

NOVEMBER 14, 2024

We use the built-in features of Data Firehose, including AWS Lambda for necessary data transformation and Amazon Simple Notification Service (Amazon SNS) for near real-time alerts. AWS Glue – The AWS Glue Data Catalog is your persistent technical metadata store in the AWS Cloud.

Data Lake, Metadata, Testing, Data-driven
