Enterprise data is brought into data lakes and data warehouses to carry out analytical, reporting, and data science use cases using AWS analytical services like Amazon Athena, Amazon Redshift, Amazon EMR, and so on. Table metadata is fetched from AWS Glue. The generated Athena SQL query is run.
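As a rough illustration of that flow, the sketch below uses boto3 to pull table metadata from the Glue Data Catalog and run a query through Athena. The database, table, region, and results bucket names are hypothetical, and AWS credentials are assumed to be configured in the environment.

```python
import time

import boto3

glue = boto3.client("glue", region_name="us-east-1")
athena = boto3.client("athena", region_name="us-east-1")

# Fetch table metadata from the AWS Glue Data Catalog (hypothetical names).
table = glue.get_table(DatabaseName="sales_db", Name="orders")["Table"]
columns = [c["Name"] for c in table["StorageDescriptor"]["Columns"]]
print("Columns discovered from Glue:", columns)

# Run a query over the same table with Athena.
query = f'SELECT {", ".join(columns[:3])} FROM "sales_db"."orders" LIMIT 10'
execution = athena.start_query_execution(
    QueryString=query,
    ResultConfiguration={"OutputLocation": "s3://example-athena-results/"},
)
query_id = execution["QueryExecutionId"]

# Poll until the query finishes, then read the first page of results.
while True:
    status = athena.get_query_execution(QueryExecutionId=query_id)
    state = status["QueryExecution"]["Status"]["State"]
    if state in ("SUCCEEDED", "FAILED", "CANCELLED"):
        break
    time.sleep(1)

if state == "SUCCEEDED":
    rows = athena.get_query_results(QueryExecutionId=query_id)["ResultSet"]["Rows"]
    print(f"Fetched {len(rows) - 1} data rows")  # first row is the header
```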
A high hurdle many enterprises have yet to overcome is accessing mainframe data via the cloud. Giving the mobile workforce access to this data via the cloud allows them to be productive from anywhere, fosters collaboration, and improves overall strategic decision-making. Four key challenges prevent them from doing so.
In addition to using native managed AWS services that BMS didn’t need to worry about upgrading, BMS was looking to offer an ETL service to non-technical business users that could visually compose data transformation workflows and seamlessly run them on the AWS Glue Apache Spark-based serverless data integration engine.
As with all AWS services, Amazon Redshift is a customer-obsessed service that recognizes there isn’t a one-size-fits-all approach when it comes to data models, which is why Amazon Redshift supports multiple data models such as Star Schemas, Snowflake Schemas, and Data Vault 2.0.
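For readers unfamiliar with the Data Vault pattern, here is a minimal sketch of a hub and satellite pair created in Redshift through the redshift_connector driver; the cluster endpoint, credentials, and table and column names are hypothetical placeholders.

```python
import redshift_connector

# Placeholder connection details; replace with your own cluster and credentials.
conn = redshift_connector.connect(
    host="example-cluster.abc123.us-east-1.redshift.amazonaws.com",
    database="dev",
    user="awsuser",
    password="change-me",
)
cur = conn.cursor()

# Hub: one row per business key, plus load metadata.
cur.execute("""
    CREATE TABLE IF NOT EXISTS hub_customer (
        customer_hk   VARCHAR(32)  NOT NULL,   -- hash of the business key
        customer_id   VARCHAR(64)  NOT NULL,   -- business key
        load_dts      TIMESTAMP    NOT NULL,
        record_source VARCHAR(64)  NOT NULL
    )
""")

# Satellite: descriptive attributes that change over time, keyed to the hub.
cur.execute("""
    CREATE TABLE IF NOT EXISTS sat_customer_details (
        customer_hk   VARCHAR(32)  NOT NULL,
        load_dts      TIMESTAMP    NOT NULL,
        customer_name VARCHAR(256),
        country       VARCHAR(64),
        record_source VARCHAR(64)  NOT NULL
    )
""")
conn.commit()
```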
Today’s best-performing organizations embrace data for strategic decision-making. Because of the criticality of the data they deal with, we think that finance teams should lead the enterprise adoption of data and analytics solutions. They need trusted data to drive reliable reporting, decision-making, and risk reduction.
How dbt Core helps data teams test, validate, and monitor complex data transformations and conversions. dbt Core, an open-source framework for developing, testing, and documenting SQL-based data transformations, has become a must-have tool for modern data teams as the complexity of data pipelines grows.
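As a small illustration of that workflow, the sketch below drives dbt Core programmatically through dbtRunner (available in dbt-core 1.5 and later). It assumes it is executed from inside an existing dbt project with a configured profile, and "stg_orders" is a hypothetical model name.

```python
from dbt.cli.main import dbtRunner, dbtRunnerResult

dbt = dbtRunner()

# Build the staging model, then run its schema tests (not_null, unique, etc.).
for args in (["run", "--select", "stg_orders"],
             ["test", "--select", "stg_orders"]):
    res: dbtRunnerResult = dbt.invoke(args)
    if not res.success:
        raise RuntimeError(f"dbt {' '.join(args)} failed: {res.exception}")

    # Each result reports the node name and its status (pass/fail/error).
    for r in res.result:
        print(r.node.name, r.status)
```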
This is where metadata, or the data about data, comes into play. Having a data catalog is the cornerstone of your data governance strategy, but what supports your data catalog? Your metadata management framework provides the underlying structure that makes your data accessible and manageable.
No, its ultimate goal is to increase return on investment (ROI) for those business segments that depend upon data. With quality data at their disposal, organizations can form data warehouses for the purpose of examining trends and establishing future-facing strategies. 2 – Data profiling.
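A profiling pass can be as simple as summarizing null rates, distinct counts, and date ranges per column. Below is a minimal sketch with pandas, assuming a hypothetical orders.csv file with an order_date column.

```python
import pandas as pd

# Hypothetical input file; parse the date column so range checks work.
df = pd.read_csv("orders.csv", parse_dates=["order_date"])

# One profiling row per column: type, null rate, and distinct count.
profile = pd.DataFrame({
    "dtype": df.dtypes.astype(str),
    "null_rate": df.isna().mean(),
    "distinct_values": df.nunique(),
})
print(profile)

# Date columns deserve extra attention (ranges, obviously bad years, etc.).
print("order_date range:", df["order_date"].min(), "to", df["order_date"].max())
```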
Selecting the strategies and tools for validating data transformations and data conversions in your data pipelines. Data transformations and data conversions are crucial to ensure that raw data is organized, processed, and ready for useful analysis.
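One common validation strategy is to assert invariants the transformation must preserve, such as row counts, key sets, and value ranges. The sketch below shows the idea with pandas on small in-memory frames; the column names and the non-negative revenue rule are illustrative assumptions.

```python
import pandas as pd

# Toy source data and its transformed (currency-converted) counterpart.
source = pd.DataFrame({"order_id": [1, 2, 3], "revenue_usd": [10.0, 20.5, 7.25]})
transformed = pd.DataFrame({"order_id": [1, 2, 3], "revenue_eur": [9.2, 18.9, 6.7]})

checks = {
    # No rows should be dropped or duplicated by the transformation.
    "row_count_preserved": len(source) == len(transformed),
    # Every source key must survive the conversion.
    "keys_preserved": set(source["order_id"]) == set(transformed["order_id"]),
    # Converted amounts must stay non-negative.
    "no_negative_revenue": (transformed["revenue_eur"] >= 0).all(),
}

failed = [name for name, ok in checks.items() if not ok]
if failed:
    raise ValueError(f"Validation failed: {failed}")
print("All transformation checks passed")
```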
The entire generative AI pipeline hinges on the data pipelines that empower it, making it imperative to take the correct precautions. Four key components ensure reliable data ingestion. Data quality and governance: data quality means ensuring the security of data sources, maintaining holistic data, and providing clear metadata.
Many large organizations, in their desire to modernize with technology, have acquired several different systems with various data entry points and transformation rules for data as it moves into and across the organization. Who are the data owners? What are the transformation rules? Data Governance.
It’s paramount that organizations understand the benefits of automating end-to-end data lineage. Critically, it makes it easier to get a clear view of how information is created and flows into, across and outside an enterprise. The importance of end-to-end data lineage is widely understood and ignoring it is risky business.
The data transformation imperative: what Denso and other industry leaders realise is that for IT-OT convergence to be realised, and the benefits of AI unlocked, data transformation is vital. The company can also unify its knowledge base and promote search and information use that better meets its needs.
In fact, the LIBOR transition program marks one of the largest data transformation obstacles ever seen in financial services. Building an inventory of what will be affected is a huge undertaking across all of the data, reports, and structures that must be accounted for. Automated Data Lineage for Your LIBOR Project.
These tools empower analysts and data scientists to easily collaborate on the same data, with their choice of tools and analytic engines. No more lock-in, unnecessary data transformations, or data movement across tools and clouds just to extract insights out of the data.
We can kind of do it within one enterprise, agreeing on certain templates, but when we start going cross-enterprise, or when we start integrating legacy data, it will be a lot of work doing that by hand. OntoRefine is a data transformation tool that lets you unite plenty of data formats and get them into your triplestore.
With Octopai’s support and analysis of Azure Data Factory, enterprises can now view complete end-to-end data lineage from Azure Data Factory all the way through to reporting for the first time ever.
It includes processes that trace and document the origin of data, models and associated metadata and pipelines for audits. Curated foundation models, such as those created by IBM or Microsoft, help enterprises scale and accelerate the use and impact of the most advanced AI capabilities using trusted data.
However, when a data producer shares data products on a data mesh self-serve web portal, it’s neither intuitive nor easy for a data consumer to know which data products they can join to create new insights. This is especially true in a large enterprise with thousands of data products.
Due to this low complexity, the solution uses AWS serverless services to ingest the data, transform it, and make it available for analytics. The Data Catalog now contains references to the machine-readable data. Use the Data Catalog and transform the hospital price transparency data.
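The transform step could look roughly like the following AWS Glue Spark job sketch, which reads the crawled table from the Data Catalog and writes a cleaned Parquet copy to Amazon S3. The database, table, column mappings, and bucket path are hypothetical, and the awsglue libraries are only available inside the Glue job runtime.

```python
from awsglue.context import GlueContext
from awsglue.transforms import ApplyMapping
from pyspark.context import SparkContext

glue_context = GlueContext(SparkContext.getOrCreate())

# Read the machine-readable data via its Data Catalog reference.
raw = glue_context.create_dynamic_frame.from_catalog(
    database="pricing_db", table_name="hospital_prices_raw"
)

# Normalize column names and types as part of the transform step.
cleaned = ApplyMapping.apply(
    frame=raw,
    mappings=[
        ("hospital_name", "string", "hospital_name", "string"),
        ("code", "string", "billing_code", "string"),
        ("gross_charge", "string", "gross_charge", "double"),
    ],
)

# Write the curated output back to S3 as Parquet for analytics.
glue_context.write_dynamic_frame.from_options(
    frame=cleaned,
    connection_type="s3",
    connection_options={"path": "s3://example-bucket/hospital-prices/curated/"},
    format="parquet",
)
```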
Organizations have spent a lot of time and money trying to harmonize data across diverse platforms, including cleansing, uploading metadata, converting code, defining business glossaries, tracking data transformations and so on. And there’s control of that landscape to facilitate insight and collaboration and limit risk.
You can also use the data transformation feature of Data Firehose to invoke a Lambda function to perform data transformation in batches. Athena is used to run geospatial queries on the location data stored in the S3 buckets.
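A Firehose transformation Lambda follows a simple contract: it receives a batch of base64-encoded records and must return each record with its recordId, a result status, and the re-encoded data. The sketch below assumes a hypothetical JSON location payload with device_id, lat, and lon fields.

```python
import base64
import json


def lambda_handler(event, context):
    output = []
    for record in event["records"]:
        payload = json.loads(base64.b64decode(record["data"]))

        # Example transformation: round coordinates and drop unused fields.
        transformed = {
            "device_id": payload.get("device_id"),
            "lat": round(payload.get("lat", 0.0), 5),
            "lon": round(payload.get("lon", 0.0), 5),
        }

        output.append({
            "recordId": record["recordId"],
            "result": "Ok",  # or "Dropped" / "ProcessingFailed"
            "data": base64.b64encode(
                (json.dumps(transformed) + "\n").encode("utf-8")
            ).decode("utf-8"),
        })
    return {"records": output}
```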
For example, GPS, social media, and cell phone handoffs are modeled as graphs, while data catalogs, data lineage, and MDM tools leverage knowledge graphs for linking metadata with semantics. RDF is used extensively for data publishing and data interchange and is based on W3C and other industry standards.
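As a small taste of what RDF-based metadata looks like in practice, the sketch below uses the rdflib library to describe a catalog entry and a simple lineage link, then serializes it as Turtle; the namespace, dataset names, and the derivedFrom property are hypothetical.

```python
from rdflib import Graph, Literal, Namespace, URIRef
from rdflib.namespace import RDF, RDFS

# Hypothetical namespace for catalog entities.
EX = Namespace("http://example.org/catalog/")

g = Graph()
dataset = URIRef(EX["sales_orders"])

# Describe a catalog entry and link it to its upstream source (simple lineage).
g.add((dataset, RDF.type, EX.Dataset))
g.add((dataset, RDFS.label, Literal("Sales orders")))
g.add((dataset, EX.derivedFrom, EX["raw_orders"]))

# Serialize to Turtle for publishing or interchange.
print(g.serialize(format="turtle"))
```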
We chatted about industry trends, why decentralization has become a hot topic in the data world, and how metadata drives many data-centric use cases. But, through it all, Mohan says it’s critical to view everything through the same lens: gaining business value from data. Data fabric is a technology architecture.
By reverse-engineering, parsing, and converting scripts, Octopai seamlessly connects all data points within and across organizational systems. While open-source tools such as Apache Atlas, Open Metadata, Egeria, Spline, and OpenLineage offer valuable capabilities, they come with their own sets of pros and cons.
The solution: IBM databases on AWS. To solve these challenges, IBM’s portfolio of SaaS database solutions on Amazon Web Services (AWS) enables enterprises to scale applications, analytics and AI across the hybrid cloud landscape. This allows you to scale all analytics and AI workloads across the enterprise with trusted data.
Data platform architecture has an interesting history. Towards the turn of the millennium, enterprises started to realize that reporting and business intelligence workloads required a new solution rather than running on the transactional applications. A read-optimized platform that could integrate data from multiple applications emerged.
Cloudera has been providing enterprise support for Apache NiFi since 2015, helping hundreds of organizations take control of their data movement pipelines on premises and in the public cloud. Developers need to onboard new data sources, chain multiple data transformation steps together, and explore data as it travels through the flow.
IBM watsonx.ai is our enterprise-ready next-generation studio for AI builders, bringing together traditional machine learning (ML) and new generative AI capabilities powered by foundation models. With watsonx.ai, businesses can effectively train, validate, tune and deploy AI models with confidence and at scale across their enterprise.
Accenture calls it the Intelligent Data Foundation (IDF), and it’s used by dozens of enterprises with very complex data landscapes and analytic requirements. Simply put, IDF standardizes data engineering processes. They can better understand data transformations, checks, and normalization.
This not only protected the organization legally but also reinforced its commitment to high standards of data governance. The trend in the industry shows an increasing investment and emphasis on solving data lineage problems in complex enterprise contexts. This is where Octopai excels.
This allows you to simplify security and governance over transactional data lakes by providing table-, column-, and row-level access controls for your Apache Spark jobs. Many large enterprise companies seek to use their transactional data lake to gain insights and improve decision-making.
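With AWS Lake Formation, this kind of fine-grained access can be granted through the boto3 API; the sketch below grants column-level SELECT on a catalog table to an analyst role. The account ID, role ARN, database, table, and column names are hypothetical, and the table is assumed to already be registered in the Glue Data Catalog.

```python
import boto3

lf = boto3.client("lakeformation", region_name="us-east-1")

# Grant SELECT on a subset of columns (no PII) to a hypothetical analyst role.
lf.grant_permissions(
    Principal={"DataLakePrincipalIdentifier": "arn:aws:iam::123456789012:role/AnalystRole"},
    Resource={
        "TableWithColumns": {
            "DatabaseName": "sales_db",
            "Name": "orders",
            "ColumnNames": ["order_id", "order_date", "total"],
        }
    },
    Permissions=["SELECT"],
)
print("Granted column-level SELECT to AnalystRole")
```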
Data governance is a key use case of the modern data stack. Who Can Adopt the Modern Data Stack? The modern data stack is well-suited for companies with large amounts of data. These help data analysts visualize key insights that can help you make better data-backed decisions.
Instead, he suggests they put data governance in real-world scenarios to answer these questions: “What is the problem you believe data governance is the answer to?” or “How would you recognize having effective data governance in place?” The Benefits of erwin Data Intelligence. Where is it?
Commercial enterprises utilized consumer information for marketing without taking appropriate care to keep it private. Let’s take a look at several regulatory standards and explore automated data lineage’s role in smoothing and improving data compliance. Data lineage and financial risk data compliance.
We could further refine our opening statement to say that our business users are too often in a state of being data-rich, but insights-poor, and content-hungry. This is where we dispel an old “big data” notion (heard a decade ago) that was expressed like this: “we need our data to run at the speed of business.”
Under the federated mesh architecture, each divisional mesh functions as a node within the broader enterprise data mesh, maintaining a degree of autonomy in managing its data products. This model balances node- or domain-level autonomy with enterprise-level oversight, creating a scalable and consistent framework across ANZ.
This integration enables our customers to seamlessly explore data with AI in Tableau, build visualizations, and uncover insights hidden in their governed data, all while leveraging Amazon DataZone to catalog, discover, share, and govern data across AWS, on premises, and from third-party sources—enhancing both governance and decision-making.”
To build a data-driven business, it is important to democratize enterprise data assets in a data catalog. With a unified data catalog, you can quickly search datasets and figure out data schema, data format, and location. For metadata read/write, Flink has the catalog interface.
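With PyFlink, that catalog interface can be exercised in a few lines: registering a table writes its schema and connector options into the current catalog, and the list_* calls read that metadata back. The sketch below uses Flink's built-in datagen connector; the table name and columns are hypothetical.

```python
from pyflink.table import EnvironmentSettings, TableEnvironment

t_env = TableEnvironment.create(EnvironmentSettings.in_streaming_mode())

# Writing metadata: registering a table stores its schema and connector
# options in the current catalog.
t_env.execute_sql("""
    CREATE TABLE IF NOT EXISTS orders_stream (
        order_id BIGINT,
        amount   DOUBLE
    ) WITH (
        'connector' = 'datagen',
        'rows-per-second' = '1'
    )
""")

# Reading metadata: the same catalog answers discovery questions.
print("Catalogs :", t_env.list_catalogs())
print("Databases:", t_env.list_databases())
print("Tables   :", t_env.list_tables())
```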
The modern data stack is a data management system built out of cloud-based data systems. A given modern data stack will usually include components for data ingestion from your data sources, data transformation, data storage, data analysis and reporting.
And at First Commerce Bank, EVP and COO Gregory Garcia hopes to leverage unified, real-time data to monitor risks such as worsening vacancy rates that could make it harder for commercial property owners to pay their mortgages.
dbt is an open source, SQL-first templating engine that allows you to write repeatable and extensible data transforms in Python and SQL. dbt is predominantly used by data warehouse customers (such as Amazon Redshift) who are looking to keep their data transform logic separate from storage and engine.
In this blog, we’ll delve into the critical role of governance and data modeling tools in supporting a seamless data mesh implementation and explore how erwin tools can be used in that role. erwin also provides data governance, metadata management and data lineage software called erwin Data Intelligence by Quest.
Modak’s Nabu is a born-in-the-cloud, cloud-neutral, integrated data engineering platform designed to accelerate the journey of enterprises to the cloud. The platform converges data cataloging, data ingestion, data profiling, data tagging, data discovery, and data exploration into a unified platform, driven by metadata.