Data Enablement, Machine Learning and Metadata

Monitoring Apache Iceberg metadata layer using AWS Lambda, AWS Glue, and AWS CloudWatch

AWS Big Data

JULY 29, 2024

Data lakes provide a unified repository for organizations to store and use large volumes of data. This enables more informed decision-making and innovative insights through various analytics and machine learning applications.

Metadata

Metadata Snapshot Data Lake Metrics

The Power of Graph Databases, Linked Data, and Graph Algorithms

Rocket-Powered Data Science

MARCH 10, 2020

The book Graph Algorithms: Practical Examples in Apache Spark and Neo4j is aimed at broadening our knowledge and capabilities around these types of graph analyses, including algorithms, concepts, and practical machine learning applications of the algorithms.

Metadata

Metadata Machine Learning Prescriptive Analytics ROI

Amazon Redshift announcements at AWS re:Invent 2023 to enable analytics on all your data

AWS Big Data

NOVEMBER 29, 2023

This cloud service was a significant leap from the traditional data warehousing solutions, which were expensive, not elastic, and required significant expertise to tune and operate.

Data Warehouse

Data Warehouse Analytics Data Lake Machine Learning

Introducing watsonx: The future of AI for business

IBM Big Data Hub

MAY 9, 2023

After some impressive advances over the past decade, largely thanks to the techniques of Machine Learning (ML) and Deep Learning, the technology seems to have taken a sudden leap forward. It helps facilitate the entire data and AI lifecycle, from data preparation to model development, deployment and monitoring.

Data Warehouse

Data Warehouse Machine Learning Cost-Benefit Metadata

Minimizing Supply Chain Disruptions with Advanced Analytics

Cloudera

AUGUST 3, 2021

Advanced analytics and enterprise data empower companies to not only have a completely transparent view of movement of materials and products within their line of sight, but also leverage data from their suppliers to have a holistic view 2-3 tiers deep in the supply chain. Open source solutions reduce risk.

Analytics

Analytics Digital Transformation Forecasting Risk

Tableau further democratizes analytics with AI-fueled features

CIO Business Intelligence

APRIL 30, 2024

Tableau says a user working in hospitality could click “Draft with Einstein” for data about travel. The copilot would then use the data source’s metadata and field names to provide a detailed description of the data, enabling other analysts to more easily reference the insights.

Analytics

Analytics Metrics Visualization Dashboards

The Future of the Data Lakehouse – Open

CIO Business Intelligence

JUNE 23, 2022

Cloudera customers run some of the biggest data lakes on earth. These lakes power mission critical large scale data analytics, business intelligence (BI), and machine learning use cases, including enterprise data warehouses.

Data Lake

Data Lake Data Warehouse Machine Learning Data-driven

The Future of the Data Lakehouse – Open

Cloudera

JUNE 18, 2022

Cloudera customers run some of the biggest data lakes on earth. These lakes power mission critical large scale data analytics, business intelligence (BI), and machine learning use cases, including enterprise data warehouses.

Data Lake

Data Lake Data Warehouse Machine Learning Data-driven

The year of the data catalog

Alation

FEBRUARY 13, 2020

Key analyst firms like Forrester, Gartner, and 451 Research have cited “soaring demands from data catalogs”, pondered whether data catalogs are the “most important breakthrough in analytics to have emerged in the last decade,” and heralded the arrival of a brand new market: Machine Learning Data Catalogs.

Metadata

Metadata Machine Learning Data Governance Reporting

Join the Alation MLDC World Tour!

Alation

FEBRUARY 20, 2020

In a nod to AC/DC, a wink to Gartner’s research report, Data Catalogs Are the New Black in Data Management and Analytics, and inspiration from the inaugural Forrester Wave: Machine Learning Data Catalogs, we have temporarily set aside our Alation orange and have been rocking “black” for the Alation MLDC World Tour.

Machine Learning

Machine Learning Metadata Reporting Data-driven

Usability and Connecting Threads: How Data Fabric Makes Sense Out of Disparate Data

Ontotext

AUGUST 4, 2023

A data fabric utilizes an integrated data layer over existing, discoverable, and inferenced metadata assets to support the design, deployment, and utilization of data across enterprises, including hybrid and multi-cloud platforms. It also helps capture and connect data based on business or domains.

Metadata

Metadata Data-driven Data Architecture Data Quality

Data Catalogs: A Category of Their Own

Alation

FEBRUARY 20, 2020

While this requires technology – AI, machine learning, log parsing, natural language processing, metadata management – this technology must be surfaced in a form accessible to business users – the data catalog. The Forrester Wave: Machine Learning Data Catalogs, Q2 2018.

Machine Learning

Machine Learning Marketing Reporting Data-driven

Shutterstock capitalizes on the cloud’s cutting edge

CIO Business Intelligence

MARCH 6, 2023

The company, which customizes, sells, and licenses more than one billion images, videos, and music clips from its mammoth catalog stored on AWS and Snowflake to media and marketing companies or any customer requiring digital content, currently stores more than 60 petabytes of objects, assets, and descriptors across its distributed data store.

Data Lake

Data Lake Cost-Benefit Recreation/Entertainment Unstructured Data

Process and analyze highly nested and large XML files using AWS Glue and Amazon Athena

AWS Big Data

SEPTEMBER 29, 2023

With these techniques, you can enhance the processing speed and accessibility of your XML data, enabling you to derive valuable insights with ease. Process and transform XML data into a format (like Parquet) suitable for Athena using an AWS Glue extract, transform, and load (ETL) job.

Metadata

Metadata Visualization Data-driven Optimization

How Ruparupa gained updated insights with an Amazon S3 data lake, AWS Glue, Apache Hudi, and Amazon QuickSight

AWS Big Data

FEBRUARY 22, 2023

The AWS Glue job can transform the raw data in Amazon S3 to Parquet format, which is optimized for analytic queries. The AWS Glue Data Catalog stores the metadata, and Amazon Athena (a serverless query engine) is used to query data in Amazon S3.

Data Lake

Data Lake Dashboards Cost-Benefit Data Warehouse

Exploring real-time streaming for generative AI Applications

AWS Big Data

MARCH 25, 2024

Foundation models (FMs) are large machine learning (ML) models trained on a broad spectrum of unlabeled and generalized datasets. Streaming data facilitates the constant flow of diverse and up-to-date information, enhancing the models’ ability to adapt and generate more accurate, contextually relevant outputs.

Data Lake

Data Lake Unstructured Data Management Snapshot

5 Ways Data Engineers Can Support Data Governance

Alation

JANUARY 26, 2023

That’s why many organizations invest in technology to improve data processes, such as a machine learning data pipeline. However, data needs to be easily accessible, usable, and secure to be useful — yet the opposite is too often the case. Do they have a system to manage the metadata for given assets?

Data Governance

Data Governance Strategy Data Quality Data Collection

The Gartner 2021 Leadership Vision for Data & Analytics Leaders Webinar Q&A

Andrew White

JANUARY 11, 2021

The data suggests several things: The work of traditional analytics and BI continues towards democratization in the business unit directly; we call this domain analytics in our research, part of domain D&A. Many data science labs are set up as shared services. I didn’t mean to imply this. It might have been a slip of the tongue.

Data Analytics

Data Analytics Analytics Data-driven Finance

Hybrid big data analytics with Amazon EMR on AWS Outposts

AWS Big Data

JANUARY 29, 2025

Amazon EMR has long been the leading solution for processing big data in the cloud. Amazon EMR is the industry-leading big data solution for petabyte-scale data processing, interactive analytics, and machine learning using over 20 open source frameworks such as Apache Hadoop, Hive, and Apache Spark.

Big Data

Big Data Data Analytics Analytics Interactive

Data Leaders Brief
