Ali Tore, Senior Vice President of Advanced Analytics at Salesforce, highlighted the value of this integration: “We’re excited to partner with Amazon to bring Tableau’s powerful data exploration and AI-driven analytics capabilities to customers managing data across organizational boundaries with Amazon DataZone.”
Content includes reports, documents, articles, presentations, visualizations, and video and audio representations of the insights and knowledge that have been extracted from data. We could further refine our opening statement to say that our business users are too often data-rich, but insights-poor and content-hungry.
There are countless examples of big data transforming many different industries. It can be used for something as visible as reducing traffic jams, for personalizing products and services, or for improving the experience in multiplayer video games. We would like to talk about data visualization and its role in the big data movement.
In addition to real-time analytics and visualization, the data needs to be shared for long-term data analytics and machine learning applications. From here, the metadata is published to Amazon DataZone by using the AWS Glue Data Catalog. This process is shown in the following figure.
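As a rough sketch of that publishing step, table metadata can be registered in the AWS Glue Data Catalog with boto3, from where a catalog-backed service such as Amazon DataZone can surface it as an asset. The database, table, and S3 location below are illustrative assumptions, not names from this architecture:

```python
# Hypothetical sketch: register table metadata in the AWS Glue Data Catalog
# so a catalog-backed service (e.g., Amazon DataZone) can publish it.
import boto3

glue = boto3.client("glue", region_name="us-east-1")

glue.create_table(
    DatabaseName="telemetry_db",  # assumed catalog database
    TableInput={
        "Name": "device_readings",  # assumed table name
        "TableType": "EXTERNAL_TABLE",
        "StorageDescriptor": {
            "Columns": [
                {"Name": "device_id", "Type": "string"},
                {"Name": "reading", "Type": "double"},
                {"Name": "ts", "Type": "timestamp"},
            ],
            "Location": "s3://example-bucket/telemetry/device_readings/",
            "InputFormat": "org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat",
            "OutputFormat": "org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat",
            "SerdeInfo": {
                "SerializationLibrary": "org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe"
            },
        },
    },
)
```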
Institutional Data & AI Platform architecture
The Institutional Division has implemented a self-service data platform to enable the domain teams to build and manage data products autonomously. The following diagram illustrates the building blocks of the Institutional Data & AI Platform.
You can use AWS Glue Studio to set up data replication and mask PII with no coding required. The AWS Glue Studio visual editor provides a low-code graphical environment to build, run, and monitor extract, transform, and load (ETL) scripts. This helps you to discover and work with the data to build ETL jobs.
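To give a sense of what the visual editor produces, here is a minimal sketch of the kind of PySpark script Glue Studio generates behind the scenes; the sales_db database, orders_raw table, and column mappings are hypothetical:

```python
# Minimal sketch of a Glue Studio-style ETL script (names are assumptions).
from awsglue.context import GlueContext
from awsglue.transforms import ApplyMapping
from pyspark.context import SparkContext

glue_context = GlueContext(SparkContext.getOrCreate())

# Read the source table from the Glue Data Catalog.
source = glue_context.create_dynamic_frame.from_catalog(
    database="sales_db", table_name="orders_raw"
)

# Rename and retype columns, like the visual "Apply mapping" node.
mapped = ApplyMapping.apply(
    frame=source,
    mappings=[
        ("order_id", "string", "order_id", "string"),
        ("amount", "string", "amount", "double"),
    ],
)

# Write the result back to S3 as Parquet.
glue_context.write_dynamic_frame.from_options(
    frame=mapped,
    connection_type="s3",
    connection_options={"path": "s3://example-bucket/orders_clean/"},
    format="parquet",
)
```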
Look for the metadata. To perform accurate data lineage mapping, every process in the system that transforms or touches the data must be recorded. This metadata (read: data about your data) is key to tracking your data, whether lineage is captured by tagging or carried self-contained within the records themselves.
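A toy sketch of the self-contained variant, in which every transformation appends a lineage tag to the record it touches (all names and paths here are hypothetical):

```python
# Self-contained lineage sketch: each step stamps the record with metadata.
from datetime import datetime, timezone

def with_lineage(record: dict, step: str) -> dict:
    """Append a lineage entry describing the step that touched this record."""
    record.setdefault("_lineage", []).append(
        {"step": step, "at": datetime.now(timezone.utc).isoformat()}
    )
    return record

row = {"customer_id": "c-42", "amount": "19.99"}
row = with_lineage(row, "ingest:s3://example-bucket/raw/orders.csv")
row["amount"] = float(row["amount"])  # the transformation itself
row = with_lineage(row, "transform:cast_amount_to_float")
```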
In addition to using native, managed AWS services that BMS didn’t need to worry about upgrading, BMS was looking to offer an ETL service to non-technical business users who could visually compose data transformation workflows and seamlessly run them on the AWS Glue Apache Spark-based serverless data integration engine.
They assist the organization by providing clarity and insight into advanced data technology solutions. As quality issues are often highlighted with the use of dashboard software, the change manager plays an important role in the visualization of data quality. 2 – Data profiling.
With the ability to browse metadata, you can understand the structure and schema of the data source, identify relevant tables and fields, and discover useful data assets you may not be aware of. You can navigate to the project’s Data page to visually verify the existence of the newly created table. Choose Save.
And all of them are asking hard questions: “Can you integrate my data, with my particular format?”, “How well can you scale?”, “How many visualizations do you offer?” Nowadays, data analytics doesn’t exist on its own. You have to take care of data extraction, transformation, and loading, as well as visualization.
Traditionally, such a legacy call center analytics platform would be built on a relational database that stores data from streaming sources. Data transformations through stored procedures and the use of materialized views to curate datasets and generate insights are a known pattern with relational databases.
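As an illustration of that pattern, a curated dataset can be maintained as a materialized view and refreshed on a schedule. This sketch assumes a PostgreSQL-compatible database with a hypothetical raw_calls landing table, accessed through psycopg2:

```python
# Sketch of the materialized-view curation pattern (names are assumptions).
import psycopg2

conn = psycopg2.connect("dbname=callcenter user=analytics")
with conn, conn.cursor() as cur:
    # Curate per-agent daily stats over the streaming landing table.
    cur.execute("""
        CREATE MATERIALIZED VIEW IF NOT EXISTS agent_daily_stats AS
        SELECT agent_id,
               date_trunc('day', call_start) AS call_day,
               count(*)                      AS calls,
               avg(duration_seconds)         AS avg_duration
        FROM raw_calls
        GROUP BY agent_id, date_trunc('day', call_start)
    """)
    # Refresh on a schedule, e.g., from a stored procedure or cron job.
    cur.execute("REFRESH MATERIALIZED VIEW agent_daily_stats")
```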
This is where metadata, or the data about data, comes into play. Having a data catalog is the cornerstone of your data governance strategy, but what supports your data catalog? Your metadata management framework provides the underlying structure that makes your data accessible and manageable.
The goal is to examine five major methods of verifying and validating data transformations in data pipelines, with an eye toward high-quality data deployment. First, we look at how unit and integration tests uncover transformation errors at an early stage.
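As a minimal sketch of that first method, a unit test can pin down a transformation’s expected behavior before it ever touches production data; the normalize_amounts function here is a hypothetical stand-in:

```python
# Unit-test sketch for a data transformation (pytest; names are assumptions).
import pytest

def normalize_amounts(rows):
    """Transformation under test: cast string cents to float dollars."""
    return [{**r, "amount": float(r["amount"]) / 100} for r in rows]

def test_converts_cents_to_dollars():
    rows = [{"order_id": "o-1", "amount": "1999"}]
    assert normalize_amounts(rows) == [{"order_id": "o-1", "amount": 19.99}]

def test_rejects_non_numeric_input():
    with pytest.raises(ValueError):
        normalize_amounts([{"order_id": "o-2", "amount": "n/a"}])
```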
Together the technologies aim to help business users and “novice” data analysts explore their data and gain insights without having to resort to data experts. “This is really empowering everyone to be a data expert,” Maxon said. Shared Dimensions and Composable Data Sources.
Selecting the strategies and tools for validating data transformations and data conversions in your data pipelines.
Introduction
Data transformations and data conversions are crucial to ensure that raw data is organized, processed, and ready for useful analysis.
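One hedged sketch of such validation: after a conversion, compare row counts between source and target, check key columns for nulls, and verify target data types. The pandas DataFrames and column names are assumptions for illustration:

```python
# Post-conversion validation sketch (columns and frames are hypothetical).
import pandas as pd

def validate_conversion(source: pd.DataFrame, target: pd.DataFrame) -> list:
    errors = []
    if len(source) != len(target):
        errors.append(f"row count mismatch: {len(source)} vs {len(target)}")
    if target["order_id"].isnull().any():
        errors.append("null keys found in target.order_id")
    if target["amount"].dtype != "float64":
        errors.append(f"amount should be float64, got {target['amount'].dtype}")
    return errors

source = pd.DataFrame({"order_id": ["o-1"], "amount": ["19.99"]})
target = pd.DataFrame({"order_id": ["o-1"], "amount": [19.99]})
assert validate_conversion(source, target) == []
```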
These tools range from enterprise service bus (ESB) products and data integration tools to extract, transform, and load (ETL) tools, procedural code, application programming interfaces (APIs), file transfer protocol (FTP) processes, and even business intelligence (BI) reports that further aggregate and transform data.
It seamlessly consolidates data from various data sources within AWS, including AWS Cost Explorer (and forecasting with Cost Explorer), AWS Trusted Advisor, and AWS Compute Optimizer. They can use their own toolsets or rely on provided blueprints to ingest the data from source systems.
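As a sketch of pulling that consolidated spend programmatically, the Cost Explorer API can be queried with boto3; the time range and grouping below are illustrative:

```python
# Hedged sketch: monthly cost per service from AWS Cost Explorer.
import boto3

ce = boto3.client("ce", region_name="us-east-1")

response = ce.get_cost_and_usage(
    TimePeriod={"Start": "2024-01-01", "End": "2024-02-01"},  # example range
    Granularity="MONTHLY",
    Metrics=["UnblendedCost"],
    GroupBy=[{"Type": "DIMENSION", "Key": "SERVICE"}],
)

for group in response["ResultsByTime"][0]["Groups"]:
    service = group["Keys"][0]
    amount = group["Metrics"]["UnblendedCost"]["Amount"]
    print(f"{service}: ${float(amount):.2f}")
```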
The availability of machine-readable files opens up new possibilities for data analytics, allowing organizations to analyze large amounts of pricing data. Using machine learning (ML) and data visualization tools, these datasets can be transformed into actionable insights that can inform decision-making.
This dynamic tool, powered by AWS and CARTO, provided robust visualizations of which regions and populations were interacting with our survey, enabling us to zoom in quickly and address gaps in coverage.
Figure 1: Workflow illustrating data ingestion, transformation, and visualization using Redshift and CARTO.
By reverse-engineering, parsing, and converting scripts, Octopai seamlessly connects all data points within and across organizational systems. While open-source tools such as Apache Atlas, Open Metadata, Egeria, Spline, and OpenLineage offer valuable capabilities, they come with their own sets of pros and cons.
Developers need to onboard new data sources, chain multiple data transformation steps together, and explore data as it travels through the flow. A reimagined visual editor boosts developer productivity and enables self-service for developers.
With Octopai’s support and analysis of Azure Data Factory, enterprises can now view complete end-to-end data lineage from Azure Data Factory all the way through to reporting for the first time ever. About Octopai: Octopai was founded in 2015 by BI professionals who realized the need for dynamic solutions in a stagnant market.
In addition, more data is becoming available for processing and enrichment of existing and new use cases; for example, we have recently experienced rapid growth in data collection at the edge and an increase in the availability of frameworks for processing that data. As a result, alternative data integration technologies (e.g.,
You can also use the data transformation feature of Data Firehose to invoke a Lambda function to perform data transformation in batches. Athena is used to run geospatial queries on the location data stored in the S3 buckets. Choose Run.
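Firehose transformation Lambdas follow a fixed contract: records arrive base64-encoded in batches, and each must be returned with its recordId, a result status, and re-encoded data. A minimal sketch, with an uppercase step standing in for real transformation logic:

```python
# Minimal Data Firehose transformation Lambda (placeholder transform).
import base64

def lambda_handler(event, context):
    output = []
    for record in event["records"]:
        payload = base64.b64decode(record["data"]).decode("utf-8")
        transformed = payload.upper()  # placeholder for real logic
        output.append({
            "recordId": record["recordId"],
            "result": "Ok",  # or "Dropped" / "ProcessingFailed"
            "data": base64.b64encode(transformed.encode("utf-8")).decode("utf-8"),
        })
    return {"records": output}
```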
Once a draft has been created or opened, developers use the visual Designer to build their data flow logic and validate it using interactive test sessions. Attributes contain key metadata like the source directory of a file or the source topic of a Kafka message.
For GlueDatabaseName, enter a unique name for the Data Catalog database to hold the Jira data table metadata (the default is jiralake). This mode will scan all data and disable the change data capture (CDC) features of the stack. The DataBrew job performs data transformation and filtering tasks.
By supporting open-source frameworks and tools for code-based, automated and visual data science capabilities — all in a secure, trusted studio environment — we’re already seeing excitement from companies ready to use both foundation models and machine learning to accomplish key tasks.
To ingest the data, smava uses a set of popular third-party customer data platforms complemented by custom scripts. After the data lands in Amazon S3, smava uses the AWS Glue Data Catalog and crawlers to automatically catalog the available data, capture the metadata, and provide an interface that allows querying all data assets.
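A rough sketch of that cataloging step with boto3, creating and starting a crawler over a landing prefix; the crawler name, IAM role, and paths are assumptions rather than smava’s actual configuration:

```python
# Hypothetical sketch: catalog newly landed S3 data with a Glue crawler.
import boto3

glue = boto3.client("glue", region_name="eu-central-1")

glue.create_crawler(
    Name="customer-data-crawler",                           # assumed name
    Role="arn:aws:iam::123456789012:role/GlueCrawlerRole",  # assumed role
    DatabaseName="customer_lake",
    Targets={"S3Targets": [{"Path": "s3://example-bucket/landing/"}]},
)
glue.start_crawler(Name="customer-data-crawler")
```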
It’s for that reason that even as the first BCBS-239 implementation deadline came into effect a few years ago, McKinsey reported that one-third of Global Systemically Important Banks had focused on “documenting data lineage up to the level of provisioning data elements and including data transformation.”
Business Intelligence Tools: Business intelligence (BI) tools are used to visualize your data. You should pick those that allow for easy integration and can create beautiful data visualizations. These help data analysts surface key insights that support better data-backed decisions.
The following AWS services are used for data ingestion, processing, and load: Amazon AppFlow is a fully managed integration service that enables you to securely transfer data between SaaS applications like Salesforce, SAP, Marketo, Slack, and ServiceNow, and AWS services like Amazon S3 and Amazon Redshift, in just a few clicks.
Data ingestion – Steps 1 and 2 use AWS DMS, which connects to the source database and moves full and incremental data (CDC) to Amazon S3 in Parquet format; let’s refer to this S3 bucket as the raw layer. Data transformation – Steps 3 and 4 represent an EMR Serverless Spark application (Amazon EMR 6.9).
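A hedged sketch of kicking off such a transformation as an EMR Serverless job run with boto3; the application ID, execution role, and script path are placeholders, not values from this pipeline:

```python
# Illustrative sketch: submit a Spark job to an EMR Serverless application.
import boto3

emr = boto3.client("emr-serverless", region_name="us-east-1")

response = emr.start_job_run(
    applicationId="00example",  # placeholder application ID
    executionRoleArn="arn:aws:iam::123456789012:role/EMRServerlessRole",
    jobDriver={
        "sparkSubmit": {
            "entryPoint": "s3://example-bucket/scripts/raw_to_curated.py",
        }
    },
)
print(response["jobRunId"])
```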
Alternatively, you can use AWS Glue for Apache Spark, which provides built-in support for bucketing configurations during the data transformation process. AWS Glue allows you to define bucketing parameters, such as the number of buckets and the columns to bucket on, providing an optimized data layout for efficient querying with Athena.
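For illustration, the equivalent bucketed write in plain Spark (which AWS Glue for Apache Spark runs under the hood) might look like the following; the table, bucket count, and key column are assumptions:

```python
# Sketch: write a bucketed Parquet table for efficient Athena queries.
from pyspark.sql import SparkSession

spark = (SparkSession.builder
         .appName("bucketed-write")
         .enableHiveSupport()
         .getOrCreate())

df = spark.read.parquet("s3://example-bucket/orders_clean/")

(df.write
   .format("parquet")
   .bucketBy(16, "customer_id")  # 16 buckets keyed on customer_id (assumed)
   .sortBy("customer_id")
   .option("path", "s3://example-bucket/orders_bucketed/")
   .saveAsTable("sales_db.orders_bucketed"))
```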
This is in contrast to traditional BI, which extracts insight from data outside of the app. We rely on increasingly mobile technology to comb through massive amounts of data and solve high-value problems. Plus, there is an expectation that tools be visually appealing to boot. Their dashboards were visually stunning.
Why Data Mapping is Important
Data mapping is a critical element of any data management initiative, such as data integration, data migration, data transformation, data warehousing, or automation. Data mapping is important for several reasons.
The new Dashboard Layout allows locking a visual’s position, swapping visual positions, and adaptive layouts for mobile devices. Visual Enhancements: Application and development teams are moving beyond data visualization to data storytelling.
Data Lineage and Documentation
Jet Analytics simplifies the process of documenting data assets and tracking data lineage in Fabric. It offers a transparent and accurate view of how data flows through the system, ensuring robust compliance.
While efficiency is a priority, data quality and security remain non-negotiable. Developing and maintaining data transformation pipelines are among the first tasks to be targeted for automation. However, caution is advised since accuracy, timeliness, and other aspects of data quality depend on the quality of data pipelines.
AWS Glue establishes a secure connection to HubSpot using OAuth for authorization and TLS for data encryption in transit. AWS Glue also supports the ability to apply complex data transformations, enabling efficient data integration and preparation to meet your needs.
To capture a more complete picture of the data’s journey, it is important to have a DataOps Observability system in place. Data lineage is often considered static, lagging by weeks or months, because it is typically based on snapshots of data and metadata taken at a specific time.