Remove category apache-kafka
article thumbnail

The DataOps Vendor Landscape, 2021

DataKitchen

Read the complete blog below for a more detailed description of the vendors and their capabilities. Because it is such a new category, both overly narrow and overly broad definitions of DataOps abound. Apache Oozie — An open-source workflow scheduler system to manage Apache Hadoop jobs. DataOps is a hot topic in 2021.

Testing 300
article thumbnail

Stitch Fix seamless migration: Transitioning from self-managed Kafka to Amazon MSK

AWS Big Data

In our infrastructure, Apache Kafka has emerged as a powerful tool for managing event streams and facilitating real-time data processing. At Stitch Fix, we have used Kafka extensively as part of our data infrastructure to support various needs across the business for over six years.

Insiders

Sign Up for our Newsletter

This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.

article thumbnail

An Overview of Real Time Data Warehousing on Cloudera

Cloudera

As an example of this, in this post we look at Real Time Data Warehousing (RTDW), which is a category of use cases customers are building on Cloudera and which is becoming more and more common amongst our customers. Deep Dive into General Purpose RTDW , featuring Apache Kudu, Apache Impala, and Apache NiFi.

article thumbnail

NiFi as a Function in DataFlow Service

Cloudera

With the general availability of Cloudera DataFlow for the Public Cloud (CDF-PC) , our customers can now self-serve deployments of Apache NiFi data flows on Kubernetes clusters in a cost effective way providing auto scaling, resource isolation and monitoring with KPI-based alerting. Functions as a Service.

KPI 110
article thumbnail

Nexthink scales to trillions of events per day with Amazon MSK

AWS Big Data

In this post, Nexthink shares how Amazon Managed Streaming for Apache Kafka (Amazon MSK) empowered them to achieve massive scale in event processing. Furthermore, the absence of a streaming platform like Kafka created dependencies between teams through tight HTTP/gRPC coupling.

article thumbnail

5 Key Takeaways from #Current2023

Cloudera

Recently, Confluent hosted Current 2023 (formerly Kafka summit) in San Jose on Sept 26th and 27th. This blog is for anyone who was interested but unable to attend the conference, or anyone interested in a quick summary of what happened there. More of a Confluent conference now than a kafka conference. Flink is here to stay.

article thumbnail

Streaming Edge Data Collection and Global Data Distribution

Cloudera

In the first blog of the Universal Data Distribution blog series , we discussed the emerging need within enterprise organizations to take control of their data flows. In this second installment of the Universal Data Distribution blog series, we will discuss a few different data distribution use cases and deep dive into one of them. .