Themes and Conferences per Pacoid, Episode 11

Domino Data Lab

In other words, using metadata about data science work to generate code. One of the longer-term trends that we’re seeing with Airflow, and so on, is to externalize graph-based metadata and leverage it beyond the lifecycle of a single SQL query, making our workflows smarter and more robust. BTW, videos for Rev2 are up: [link].

Metadata 105
Speed up queries with the cost-based optimizer in Amazon Athena

AWS Big Data

Starting today, the Athena SQL engine uses a cost-based optimizer (CBO), a new feature that uses table and column statistics stored in the AWS Glue Data Catalog as part of the table’s metadata. Testing on the TPC-DS benchmark showed an 11% improvement in overall query performance when using CBO compared to without it.
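As a rough illustration of what "cost-based" means here, the toy sketch below uses row-count statistics to decide which table should be the in-memory build side of a hash join. It is not Athena's implementation: `TableStats`, `choose_build_side`, and the table names and row counts are all hypothetical, and a real optimizer weighs many more statistics and plan shapes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TableStats:
    """Hypothetical per-table statistics, as a catalog might store them."""
    name: str
    row_count: int


def choose_build_side(left: TableStats, right: TableStats) -> TableStats:
    """Pick the table with fewer rows as the build side of a hash join."""
    return left if left.row_count <= right.row_count else right


# Made-up statistics for two made-up tables.
orders = TableStats("orders", 50_000_000)
regions = TableStats("regions", 25)
print(choose_build_side(orders, regions).name)
```

Without statistics, an engine would have to fall back on the order in which the tables appear in the query; with them, the same choice holds regardless of how the query was written.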

Introducing Cloudera DataFlow Designer: Self-service, No-Code Dataflow Design

Cloudera

Cloudera has been providing enterprise support for Apache NiFi since 2015, helping hundreds of organizations take control of their data movement pipelines on premises and in the public cloud. Once you have retrieved the data, NiFi stores it in a queue, which allows you to explore the content and metadata attributes of the events.

Testing 100
How HPE Aruba Supply Chain optimized cost and performance by migrating to an AWS modern data architecture

AWS Big Data

Hewlett-Packard acquired Aruba Networks in 2015, making it a wireless networking subsidiary with a wide range of next-generation network access solutions. Each file arrives as a pair with a tail metadata file in CSV format containing the size and name of the file. To achieve this, Aruba used Amazon S3 Event Notifications.
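One way to picture the file-pair check is the minimal sketch below, which compares a data file's name and byte size against its metadata CSV. The column names `name` and `size`, the function `matches_metadata`, and the sample file name are assumptions for illustration; the excerpt does not describe the actual layout or Aruba's validation logic.

```python
import csv
import io


def matches_metadata(data: bytes, data_name: str, metadata_csv: str) -> bool:
    """Return True if the file's name and byte size match its metadata row.

    Assumes a header row with `name` and `size` columns (hypothetical layout).
    """
    reader = csv.DictReader(io.StringIO(metadata_csv))
    for row in reader:
        if row["name"] == data_name:
            return int(row["size"]) == len(data)
    return False  # no metadata row for this file


# Made-up metadata content and payload for illustration.
meta = "name,size\norders_001.dat,11\n"
print(matches_metadata(b"hello world", "orders_001.dat", meta))
```

In an event-driven setup, a check like this could run whenever a notification reports that both halves of a pair have arrived, so incomplete or truncated uploads are caught before loading.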

The Very Group adopts a data catalog to better organize and leverage its online retail capabilities

CIO Business Intelligence

It launched its first online-only brand, Very, in 2009 and finally abandoned its printed catalogs to go all-in online in 2015. In a first test of the technology, he used Alation to catalog a subset of Very’s data held in an old Teradata database. The whole company rebranded as Very in 2020, the year Pimblett joined.

IT 98
Benchmark Results Position GraphDB As the Most Versatile Graph Database Engine

Ontotext

The engines must facilitate the advanced data integration and metadata management scenarios where an EKG is used for data fabrics or otherwise serves as a data hub between diverse data and content management systems. 8xlarge server (256GiB RAM, Intel Xeon Platinum 8375C) against a test driver configured with 4 read and 4 write threads.

Query AWS Glue Data Catalog views using Amazon Athena and Amazon Redshift

AWS Big Data

Glue Data Catalog views is a new feature of the AWS Glue Data Catalog that lets customers create a common view schema and a single metadata container. The container can hold view definitions in different SQL dialects, so the same view can be used across engines such as Amazon Redshift and Amazon Athena. Choose SELECT and DESCRIBE for View permissions.

Data Lake 120