Big Data, Metadata and Visualization

Expanding data analysis and visualization options: Amazon DataZone now integrates with Tableau, Power BI, and more

AWS Big Data

OCTOBER 30, 2024

Ali Tore, Senior Vice President of Advanced Analytics at Salesforce, highlighting the value of this integration, says, “We’re excited to partner with Amazon to bring Tableau’s powerful data exploration and AI-driven analytics capabilities to customers managing data across organizational boundaries with Amazon DataZone.”

Visualization

Visualization Data Lake Testing Data Governance

SAP Datasphere Powers Business at the Speed of Data

Rocket-Powered Data Science

MARCH 20, 2023

Content includes reports, documents, articles, presentations, visualizations, video, and audio representations of the insights and knowledge that have been extracted from data. We could further refine our opening statement to say that our business users are too often in a state of being data-rich, but insights-poor, and content-hungry.

Data Warehouse

Data Warehouse Metadata Digital Transformation Machine Learning

Biggest Trends in Data Visualization Taking Shape in 2022

Smart Data Collective

OCTOBER 13, 2021

There are countless examples of big data transforming many different industries. It can be used for something as visual as reducing traffic jams, to personalizing products and services, to improving the experience in multiplayer video games. We would like to talk about data visualization and its role in the big data movement.

Visualization

Visualization Cost-Benefit Big Data Prescriptive Analytics

Monitoring Apache Iceberg metadata layer using AWS Lambda, AWS Glue, and AWS CloudWatch

AWS Big Data

JULY 29, 2024

In the era of big data, data lakes have emerged as a cornerstone for storing vast amounts of raw data in its native format. They support structured, semi-structured, and unstructured data, offering a flexible and scalable environment for data ingestion from multiple sources.

Metadata

Metadata Snapshot Data Lake Metrics

How EUROGATE established a data mesh architecture using Amazon DataZone

AWS Big Data

JANUARY 15, 2025

In addition to real-time analytics and visualization, the data needs to be shared for long-term data analytics and machine learning applications. From here, the metadata is published to Amazon DataZone by using AWS Glue Data Catalog. This process is shown in the following figure.

IoT

IoT Machine Learning Metadata Data-driven

Best Practices for Metadata Management

Alation

JULY 19, 2021

What Is Metadata? Metadata is information about data. A clothing catalog and a dictionary are both examples of metadata repositories. Indeed, a popular online catalog, like Amazon, offers rich metadata around products to guide shoppers: ratings, reviews, and product details are all examples of metadata.

Metadata

Metadata Management Data Governance Machine Learning

CRM’s Have a Big Data Technical Debt Problem: Here’s How to Fix It

Smart Data Collective

JULY 27, 2021

Customer relationship management (CRM) platforms are very reliant on big data. As these platforms become more widely used, some of the data resources they depend on become more stretched. CRM providers need to find ways to address the technical debt problem they are facing through new big data initiatives.

Big Data

Big Data Snapshot IT Dashboards

Introducing a new unified data connection experience with Amazon SageMaker Lakehouse unified data connectivity

AWS Big Data

DECEMBER 16, 2024

This approach simplifies your data journey and helps you meet your security requirements. The SageMaker Lakehouse data connection testing capability boosts your confidence in established connections. You can navigate to the project’s Data page to visually verify the existence of the newly created table. Choose Save.

Visualization

Visualization Data Processing Testing Publishing

Manage Amazon OpenSearch Service Visualizations, Alerts, and More with GitHub and Jenkins

AWS Big Data

OCTOBER 24, 2024

OpenSearch Service stores different types of saved objects, such as dashboards, visualizations, alerts, security roles, index templates, and more, within the domain. As your user base and number of Amazon OpenSearch Service domains grow, tracking activities and changes to those saved objects becomes increasingly difficult.

Visualization

Visualization Management Data Processing Testing

The Missing Link in Enterprise Data Governance: Metadata

Octopai

JUNE 26, 2020

In order to figure out why the numbers in the two reports didn’t match, Steve needed to understand everything about the data that made up those reports – when the report was created, who created it, any changes made to it, which system it was created in, etc. Enterprise data governance. Metadata in data governance.

Metadata

Metadata Data Governance Enterprise Reporting

Amazon DataZone introduces OpenLineage-compatible data lineage visualization in preview

AWS Big Data

JULY 8, 2024

We are excited to announce the preview of API-driven, OpenLineage-compatible data lineage in Amazon DataZone to help you capture, store, and visualize lineage of data movement and transformations of data assets on Amazon DataZone. The lineage visualized includes activities inside the Amazon DataZone business data catalog.

Visualization

Visualization Metadata Publishing Sales
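
To make "OpenLineage-compatible" in the item above more concrete, here is a minimal sketch of what a lineage run event can look like when built as a plain Python dictionary. The field names follow the OpenLineage run event layout (eventType, eventTime, run, job, inputs, outputs, producer, schemaURL). The helper name `build_run_event`, the producer URL, the spec version in `schemaURL`, and the sample dataset names are all assumptions made for illustration; they are not taken from the announcement. Sending the event to Amazon DataZone is deliberately left out.

```python
import json
import uuid
from datetime import datetime, timezone


def build_run_event(job_namespace, job_name, inputs, outputs, event_type="COMPLETE"):
    """Build a minimal OpenLineage-style run event as a plain dict.

    `inputs` and `outputs` are lists of (namespace, name) tuples.
    """
    def dataset(namespace, name):
        return {"namespace": namespace, "name": name}

    return {
        "eventType": event_type,
        "eventTime": datetime.now(timezone.utc).isoformat(),
        "run": {"runId": str(uuid.uuid4())},
        "job": {"namespace": job_namespace, "name": job_name},
        "inputs": [dataset(ns, n) for ns, n in inputs],
        "outputs": [dataset(ns, n) for ns, n in outputs],
        # Hypothetical producer identifier; replace with your own tool's URI.
        "producer": "https://example.com/hypothetical-producer",
        # Assumed spec version; check the OpenLineage spec for the current one.
        "schemaURL": "https://openlineage.io/spec/2-0-2/OpenLineage.json#/$defs/RunEvent",
    }


if __name__ == "__main__":
    event = build_run_event(
        "example-namespace",
        "daily_sales_load",
        inputs=[("s3://example-bucket", "raw/sales")],
        outputs=[("s3://example-bucket", "curated/sales")],
    )
    print(json.dumps(event, indent=2))
```

The event is an ordinary JSON document, so it can be inspected or validated before it is handed to whichever lineage API is in use.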

Hadoop Data Mining Tools Can Enhance The Value Of Digital Assets

Smart Data Collective

AUGUST 25, 2020

Web developers utilized data to some capacity as well, but marketers rarely considered doing so. Big data has become critical to the evolution of digital marketing. Some of the benefits are detailed below: Optimizing metadata for greater reach and branding benefits. One of the most overlooked factors is metadata.

Data mining

Data mining Metadata Big Data ROI

Visualize Amazon DynamoDB insights in Amazon QuickSight using the Amazon Athena DynamoDB connector and AWS Glue

AWS Big Data

NOVEMBER 17, 2023

DynamoDB offers built-in security, continuous backups, automated multi-Region replication, in-memory caching, and data import and export tools. The scalability and flexible data schema of DynamoDB make it well-suited for a variety of use cases. Data stored in DynamoDB is the basis for valuable business intelligence (BI) insights.

Visualization

Visualization Metadata Testing Internet of Things

Integrate custom applications with AWS Lake Formation – Part 2

AWS Big Data

NOVEMBER 19, 2024

For the purposes of this post, we use a local machine based on macOS and Visual Studio Code as our integrated development environment (IDE), but you could use your preferred development environment and IDE. Row type – Enable this option to display only rows that have at least one cell with authorized data.

Data Processing

Data Processing Metadata Publishing Testing

How ANZ Institutional Division built a federated data platform to enable their domain teams to build data products to support business outcomes

AWS Big Data

DECEMBER 4, 2024

Institutional Data & AI Platform architecture: The Institutional Division has implemented a self-service data platform to enable the domain teams to build and manage data products autonomously. The following diagram illustrates the building blocks of the Institutional Data & AI Platform.

Metadata

Metadata Data Governance Data Quality Data-driven

Enhance monitoring and debugging for AWS Glue jobs using new job observability metrics, Part 3: Visualization and trend analysis using Amazon QuickSight

AWS Big Data

MARCH 29, 2024

QuickSight makes it straightforward for business users to visualize data in interactive dashboards and reports. An AWS Glue crawler scans data on the S3 bucket and populates table metadata on the AWS Glue Data Catalog. Looking at the Skewness Job per Job visualization, there was a spike on November 1, 2023.

Metrics

Metrics Visualization Dashboards Publishing

Introducing MongoDB Atlas metadata collection with AWS Glue crawlers

AWS Big Data

FEBRUARY 6, 2023

Review the MongoDB AWS Glue database and table: We can navigate to the AWS Glue Data Catalog to examine the tables that were created by the crawler. Choose the table to view the schema and other metadata. Note that the crawler captured nested data as a STRUCT and correctly listed the ARRAY fields. Choose Create job.

Metadata

Metadata Data Lake Machine Learning Big Data

Simplify data integration with AWS Glue and zero-ETL to Amazon SageMaker Lakehouse

AWS Big Data

DECEMBER 4, 2024

After you create a connection once, you can choose to use the same connection across various AWS Glue components including Glue ETL, Glue Visual ETL and zero-ETL. The following are the key components and steps in the integration process: Zero-ETL extracts and loads the data into Amazon S3, a highly scalable object storage service.

Data Integration

Data Integration Data Lake Statistics Data-driven

The next generation of Amazon SageMaker: The center for all your data, analytics, and AI

AWS Big Data

DECEMBER 4, 2024

SageMaker brings together widely adopted AWS ML and analytics capabilities—virtually all of the components you need for data exploration, preparation, and integration; petabyte-scale big data processing; fast SQL analytics; model development and training; governance; and generative AI development.

Data Analytics

Data Analytics Analytics Data Lake Data Quality

Amazon OpenSearch Service launches flow builder to empower rapid AI search innovation

AWS Big Data

MAY 2, 2025

Through a visual designer, you can configure custom AI search flows: a series of AI-driven data enrichments performed during ingestion and search. You can use the flow builder through APIs or a visual designer. The visual designer is recommended for helping you manage workflow projects. ... that can operate on text and images.

Machine Learning

Machine Learning Visualization Dashboards Metadata

A Few Proven Suggestions for Handling Large Data Sets

Smart Data Collective

SEPTEMBER 26, 2021

Working with massive structured and unstructured data sets can turn out to be complicated. It’s obvious that you’ll want to use big data, but it’s not so obvious how you’re going to work with it. So, let’s have a close look at some of the best strategies to work with large data sets. It’s a good idea to record metadata.

Metadata

Metadata Visualization Unstructured Data Data mining

Copy and mask PII between Amazon RDS databases using visual ETL jobs in AWS Glue Studio

AWS Big Data

AUGUST 26, 2024

You can use AWS Glue Studio to set up data replication and mask PII with no coding required. AWS Glue Studio visual editor provides a low-code graphic environment to build, run, and monitor extract, transform, and load (ETL) scripts. This helps you to discover and work with the data to build ETL jobs.

Visualization

Visualization Metadata Data Transformation Testing

Business Intelligence for Fairs, Congresses and Exhibitions

Smart Data Collective

APRIL 14, 2021

Advancement in big data technology has made the world of business even more competitive. The proper use of business intelligence and analytical data is what drives big brands in a competitive market. Business intelligence tools can include data warehousing, data visualizations, dashboards, and reporting.

Business Intelligence

Business Intelligence Dashboards Visualization Big Data

What is a data architect? Skills, salaries, and how to become a data framework master

CIO Business Intelligence

OCTOBER 13, 2023

Data architect role: Data architects are senior visionaries who translate business requirements into technology requirements and define data standards and principles, often in support of data or digital transformations. Data architect vs. data engineer: The data architect and data engineer roles are closely related.

Data Architecture

Data Architecture Data Warehouse Statistics Visualization

Data Insights for Everyone — The Semantic Layer to the Rescue

Rocket-Powered Data Science

SEPTEMBER 20, 2021

They realized that the search results would probably not provide an answer to my question, but the results would simply list websites that included my words on the page or in the metadata tags: “Texas”, “Cows”, “How”, etc.

Data Science

Data Science Forecasting Business Intelligence Sales

Top analytics announcements of AWS re:Invent 2024

AWS Big Data

FEBRUARY 26, 2025

The next generation of SageMaker also introduces new capabilities, including Amazon SageMaker Unified Studio (preview), Amazon SageMaker Lakehouse, and Amazon SageMaker Data and AI Governance. These metadata tables are stored in S3 Tables, the new S3 storage offering optimized for tabular data. With AWS Glue 5.0,

Analytics

Analytics Data Lake Metadata Data Warehouse

The Ultimate Guide to Modern Data Quality Management (DQM) For An Effective Data Quality Control Driven by The Right Metrics

datapine

SEPTEMBER 29, 2022

With quality data at their disposal, organizations can form data warehouses for the purposes of examining trends and establishing future-facing strategies. Industry-wide, the positive ROI on quality data is well understood. He/she assists the organization by providing clarity and insight into advanced data technology solutions.

Data Quality

Data Quality Metrics Data-driven Management

Use Apache Iceberg in your data lake with Amazon S3, AWS Glue, and Snowflake

AWS Big Data

APRIL 3, 2024

Data engineers use Apache Iceberg because it’s fast, efficient, and reliable at any scale and keeps records of how datasets change over time. Apache Iceberg offers integrations with popular data processing frameworks such as Apache Spark, Apache Flink, Apache Hive, Presto, and more.

Data Lake

Data Lake Snapshot Metadata Data Architecture

Recap of Amazon Redshift key product announcements in 2024

AWS Big Data

DECEMBER 17, 2024

We have enhanced data sharing performance with improved metadata handling, resulting in data sharing first query execution that is up to four times faster when the data sharing producer’s data is being updated.

Data Lake

Data Lake Data Warehouse Data-driven Optimization

Demystify data sharing and collaboration patterns on AWS: Choosing the right tool for the job

AWS Big Data

OCTOBER 21, 2024

Let’s briefly describe the capabilities of the AWS services we referred to above: AWS Glue is a fully managed, serverless, and scalable extract, transform, and load (ETL) service that simplifies the process of discovering, preparing, and loading data for analytics. Amazon Athena is used to query and explore the data.

Sales

Sales Data-driven Data Processing Key Performance Indicator

Unstructured data management and governance using AWS AI/ML and analytics services

AWS Big Data

OCTOBER 25, 2023

But most important of all, the assumed dormant value in the unstructured data is a question mark, which can only be answered after these sophisticated techniques have been applied. Therefore, there is a need to be able to analyze and extract value from the data economically and flexibly. The solution integrates data in three tiers.

Unstructured Data

Unstructured Data Metadata Management Analytics

What Is Data Modeling? Data Modeling Best Practices for Data-Driven Organizations

erwin

JANUARY 17, 2020

What is Data Modeling? Data modeling is a process that enables organizations to discover, design, visualize, standardize and deploy high-quality data assets through an intuitive, graphical interface. Data models provide visualization, create additional metadata and standardize data design across the enterprise.

Data-driven

Data-driven Modeling Metadata Data Governance

How to Do Data Modeling the Right Way

erwin

MAY 27, 2020

And it exists across these hybrid architectures in different formats: big and unstructured and traditional structured business data may physically sit in different places. What’s desperately needed is a way to understand the relationships and interconnections between so many entities in data sets in detail.

Modeling

Modeling Metadata Data Governance Visualization

Modernize your ETL platform with AWS Glue Studio: A case study from BMS

AWS Big Data

DECEMBER 13, 2023

In addition to using native managed AWS services that BMS didn’t need to worry about upgrading, BMS was looking to offer an ETL service to non-technical business users that could visually compose data transformation workflows and seamlessly run them on the AWS Glue Apache Spark-based serverless data integration engine.

Metadata

Metadata Data Lake Visualization Data Quality

Modernize a legacy real-time analytics application with Amazon Managed Service for Apache Flink

AWS Big Data

OCTOBER 11, 2023

It shows a call center streaming data source that sends the latest call center feed every 15 seconds. The second streaming data source provides metadata about the call center organization and agents, which is refreshed throughout the day. client("s3") S3_BUCKET = ' ' kinesis_client = boto3.client("kinesis")

Management

Management Metadata Analytics Dashboards

Federate to Amazon Redshift Query Editor v2 with Microsoft Entra ID

AWS Big Data

DECEMBER 10, 2024

To interact with and analyze data stored in Amazon Redshift, AWS provides the Amazon Redshift Query Editor V2, a web-based tool that allows you to explore, analyze, and share data using SQL. The Query Editor V2 offers a user-friendly interface for connecting to your Redshift clusters, executing queries, and visualizing results.

Sales

Sales Metadata Enterprise Testing

Extracting key insights from Amazon S3 access logs with AWS Glue for Ray

AWS Big Data

SEPTEMBER 7, 2023

AWS Glue Data Catalog stores information as metadata tables, where each table specifies a single data store. The AWS Glue crawler writes metadata to the Data Catalog by classifying the data to determine the format, schema, and associated properties of the data. Big Data Architect.

Metadata

Metadata Dashboards Metrics Visualization
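
The item above describes a crawler classifying data to determine its format, schema, and properties. As a rough illustration of that idea only, the sketch below infers a column-to-type mapping from a small CSV sample using the Python standard library. It is not how AWS Glue crawlers are implemented; the function names, the type labels (`bigint`, `double`, `string`), and the sample columns are assumptions chosen for the example.

```python
import csv
import io


def infer_type(values):
    """Return the narrowest type name that fits every non-empty value."""
    non_empty = [v for v in values if v != ""]
    if not non_empty:
        return "string"
    # Try the narrowest type first, then widen.
    for type_name, caster in (("bigint", int), ("double", float)):
        try:
            for v in non_empty:
                caster(v)
            return type_name
        except ValueError:
            continue
    return "string"


def infer_schema(csv_text):
    """Infer a {column: type} mapping from CSV text with a header row."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    columns = rows[0].keys() if rows else []
    return {col: infer_type([row[col] for row in rows]) for col in columns}


if __name__ == "__main__":
    # Hypothetical sample loosely shaped like access-log fields.
    sample = "bucket,bytes_sent,latency\nlogs,512,0.25\nlogs,1024,1.5\n"
    print(infer_schema(sample))
```

A catalog entry built this way is itself metadata: it describes the shape of the data without holding any of the rows.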

Implement data quality checks on Amazon Redshift data assets and integrate with Amazon DataZone

AWS Big Data

AUGUST 15, 2024

To address the issue of data quality, Amazon DataZone now integrates directly with AWS Glue Data Quality, allowing you to visualize data quality scores for AWS Glue Data Catalog assets directly within the Amazon DataZone web portal. Amazon DataZone natively supports data sharing for Amazon Redshift data assets.

Data Quality

Data Quality Visualization Metadata Key Performance Indicator

Deploy Amazon QuickSight dashboards to monitor AWS Glue ETL job metrics and set alarms

AWS Big Data

NOVEMBER 3, 2023

In this post, we explore how to combine AWS Glue usage information and metrics with centralized reporting and visualization using QuickSight. You have metrics available per job run within the AWS Glue console, but they don’t cover all available AWS Glue job metrics, and the visuals aren’t as interactive compared to the QuickSight dashboard.

Metrics

Metrics Dashboards Metadata Visualization

Top 10 Data Governance Trends for 2020: Data’s Real Value Comes Into Focus

erwin

JANUARY 3, 2020

Understanding the data governance trends for the year ahead will give business leaders and data professionals a competitive edge … Happy New Year! Regulatory compliance and data breaches have driven the data governance narrative during the past few years.

Data Governance

Data Governance Digital Transformation IoT Metadata

How Cargotec uses metadata replication to enable cross-account data sharing

AWS Big Data

JUNE 7, 2023

Cargotec captures terabytes of IoT telemetry data from their machinery operated by numerous customers across the globe. This data needs to be ingested into a data lake, transformed, and made available for analytics, machine learning (ML), and visualization. The target accounts read data from the source account S3 buckets.

Metadata

Metadata Data Lake Machine Learning Big Data

What is data governance? Best practices for managing data assets

CIO Business Intelligence

MARCH 24, 2023

The program must introduce and support standardization of enterprise data. Programs must support proactive and reactive change management activities for reference data values and the structure/use of master data and metadata.

Data Governance

Data Governance Management Metadata Data Quality

Top 10 Key Features of BI Tools in 2020

FineReport

FEBRUARY 5, 2020

Both the investment community and the IT circle are paying close attention to big data and business intelligence. Metadata management. Users can centrally manage metadata, including searching, extracting, processing, storing, sharing metadata, and publishing metadata externally. Analytics dashboards.

Metadata

Metadata Dashboards Informatics Visualization

AWS Lake Formation 2023 year in review

AWS Big Data

JANUARY 18, 2024

The publishing and subscription workflows of DataZone enhance collaboration between various roles in your organization and speed up the time to derive business insights from your data. You can enhance the technical metadata of the Data Catalog using AI-powered assistants into business metadata of DataZone, making it more easily discoverable.

Data Lake

Data Lake Metadata Data Governance Statistics
