2022, Blog and Data Lake - Data Leaders Brief

Eight Top DataOps Trends for 2022

DataKitchen

NOVEMBER 29, 2021

Keep an eye on the eight top trends below that we believe will be significant in 2022. The data industry realizes that AI bias is simply a quality problem, and AI systems should be subject to the same level of process control as an automobile rolling off an assembly line. Data Gets Meshier. AI Accountability.

Testing

Testing Data Lake Data Architecture Manufacturing

Use Apache Iceberg in a data lake to support incremental data processing

AWS Big Data

MARCH 2, 2023

Iceberg has become very popular for its support for ACID transactions in data lakes and features like schema and partition evolution, time travel, and rollback. In early 2022, AWS announced general availability of Athena ACID transactions, powered by Apache Iceberg. ... and later supports the Apache Iceberg framework for data lakes.

Data Lake

Data Lake Data Processing Metadata Snapshot

Simplify data ingestion from Amazon S3 to Amazon Redshift using auto-copy

AWS Big Data

OCTOBER 30, 2024

The following are the recommended best practices when working with files using the auto-copy job: Use unique file names for each file in an auto-copy job (for example, 2022-10-15-batch-1.csv). Do not overwrite existing files. He was the CEO and co-founder of DataRow, which was acquired by Amazon in 2020.

Data Warehouse

Data Warehouse Sales Data Lake Recreation/Entertainment

2021 Gift Giving Guide for Data Nerds

DataKitchen

DECEMBER 7, 2021

Data Mesh: Delivering Data-Driven Value at Scale, by Zhamak Dehghani. This book is not available until January 2022, but considering all the hype around the data mesh, we expect it to be a best seller. ... with subject line ‘Data Nerd Gift Ideas’ and we’d be happy to put them in a follow-up blog post.

Data-driven

Data-driven Data Governance Big Data Data Science

Enable business users to analyze large datasets in your data lake with Amazon QuickSight

AWS Big Data

JUNE 23, 2023

This blog post is co-written with Ori Nakar from Imperva. Events and many other security data types are stored in Imperva’s Threat Research Multi-Region data lake. Imperva harnesses data to improve their business outcomes. Imperva’s data lake has a few dozen different datasets, in the scale of petabytes.

Data Lake

Data Lake Dashboards Cost-Benefit Data Warehouse

Implement tag-based access control for your data lake and Amazon Redshift data sharing with AWS Lake Formation

AWS Big Data

JULY 21, 2023

Data-driven organizations treat data as an asset and use it across different lines of business (LOBs) to drive timely insights and better business decisions. This leads to having data across many instances of data warehouses and data lakes using a modern data architecture in separate AWS accounts.

Data Lake

Data Lake Data Warehouse Marketing Management

Build a transactional data lake using Apache Iceberg, AWS Glue, and cross-account data shares using AWS Lake Formation and Amazon Athena

AWS Big Data

APRIL 24, 2023

Building a data lake on Amazon Simple Storage Service (Amazon S3) provides numerous benefits for an organization. However, many use cases, like performing change data capture (CDC) from an upstream relational database to an Amazon S3-based data lake, require handling data at a record level.

Data Lake

Data Lake Data Governance Machine Learning Cost-Benefit

Leadership in 2022: Focus on Empathy

Cloudera

FEBRUARY 18, 2022

Empathy stands out as a core skill that must be alive and nurtured within our teams if we are to achieve our desired outcomes in 2022 and beyond. This blog explores what empathy looks like in a business context, why it’s so important, and what we’re up to at Cloudera. At Cloudera we operate according to core values.

Data Lake

Data Lake Uncertainty Dashboards Optimization

6 Exciting Announcements Made at Snowflake Summit 2022

CDW Research Hub

JULY 22, 2022

The Sirius Data & Analytics Consulting team recently attended Snowflake Summit 2022 in Las Vegas; the first time the annual conference has been held in person since 2019. Whether it was due to being in a room full of data enthusiasts or the magic of Las Vegas, the energy matched the larger attendance and venue.

Data Lake

Data Lake Data Governance Machine Learning Consulting

Cloudera Named a Leader in the 2022 Gartner® Magic Quadrant™ for Cloud Database Management Systems (DBMS)

Cloudera

DECEMBER 16, 2022

We are pleased to announce that Cloudera has been named a Leader in the 2022 Gartner® Magic Quadrant™ for Cloud Database Management Systems. Cloudera has long had the capabilities of a data lakehouse, if not the label. Get an introduction to the latest version of Cloudera’s Data Platform.

Management

Management Metadata Machine Learning Data Lake

The Future of the Data Lakehouse – Open

Cloudera

JUNE 18, 2022

Cloudera customers run some of the biggest data lakes on earth. These lakes power mission critical large scale data analytics, business intelligence (BI), and machine learning use cases, including enterprise data warehouses. On data warehouses and data lakes.

Data Lake

Data Lake Data Warehouse Machine Learning Data-driven

Build and manage your modern data stack using dbt and AWS Glue through dbt-glue, the new “trusted” dbt adapter

AWS Big Data

NOVEMBER 29, 2023

We have seen a strong customer demand to expand its scope to cloud-based data lakes because data lakes are increasingly the enterprise solution for large-scale data initiatives due to their power and capabilities. Let’s say that this company is located in Europe and the data product must comply with the GDPR.

Data Lake

Data Lake Management Metrics Data Warehouse

Habib Bank manages data at scale with Cloudera Data Platform

Cloudera

NOVEMBER 17, 2022

The Solution: CDP Private Cloud brings a next-generation hybrid architecture with cloud-native benefits to HBL’s data platform. HBL started its data journey in 2019, when a data lake initiative was launched to consolidate complex data sources and enable the bank to use a single version of truth for decision making.

Management

Management Data Lake Consulting Unstructured Data

Achieve your AI goals with an open data lakehouse approach

IBM Big Data Hub

OCTOBER 4, 2023

Why does AI need an open data lakehouse architecture? from 2022 to 2026. Another IDC study showed that while 2/3 of respondents reported using AI-driven data analytics, most reported that less than half of the data under management is available for this type of analytics.

Data Lake

Data Lake Metadata Data Warehouse Cost-Benefit

Supercharge Your Data Lakehouse with Apache Iceberg in Cloudera Data Platform

Cloudera

JUNE 30, 2022

In February 2022, we introduced Apache Iceberg as a technical preview within CDP. Over the past decade, Cloudera has enabled multi-function analytics on data lakes through the introduction of the Hive table format and Hive ACID. We selected change data capture as our first use case on Iceberg.

Data Lake

Data Lake Data Warehouse Data Architecture Metadata

Does Cost Reduction Play a Role in Digital Transformation?

Cloudera

OCTOBER 6, 2022

CIO blog post: “Digital transformation is a foundational change in how an organization delivers value to its customers.” For example, we have some customers using their data platform originally established for compliance initiatives to drive new use cases. Strategies to maximize impact.

Digital Transformation

Digital Transformation Cost-Benefit Data Lake Machine Learning

OCBC Bank Accelerates Its Data Strategy with Cloudera

Cloudera

DECEMBER 14, 2022

The company recently migrated to Cloudera Data Platform (CDP) and CDP Machine Learning to power a number of solutions that have increased operational efficiency, enabled new revenue streams and improved risk management. OCBC also won a Cloudera Data Impact Award 2022 in the Transformation category for the project.

Data Strategy

Data Strategy Strategy IT Contextual Data

Turning Streams Into Data Products

Cloudera

JUNE 16, 2022

CSP was recently recognized as a leader in the 2022 GigaOm Radar for Streaming Data Platforms report. “Without context, streaming data is useless.” SSB enables users to configure data providers using out of the box connectors or their own connector to any data source. Not in the manufacturing space?

Data Lake

Data Lake Manufacturing Metadata Dashboards

Open Data Lakehouse powered by Iceberg for all your Data Warehouse needs

Cloudera

APRIL 3, 2023

In this blog, we will share with you in detail how Cloudera integrates core compute engines including Apache Hive and Apache Impala in Cloudera Data Warehouse with Iceberg. We will publish follow up blogs for other data services. Iceberg basics Iceberg is an open table format designed for large analytic workloads.

Data Warehouse

Data Warehouse Snapshot Metadata Cost-Benefit

Demystifying Modern Data Platforms

Cloudera

SEPTEMBER 15, 2022

July brings summer vacations, holiday gatherings, and for the first time in two years, the return of the Massachusetts Institute of Technology (MIT) Chief Data Officer symposium as an in-person event. A key area of focus for the symposium this year was the design and deployment of modern data platforms.

Data Lake

Data Lake Data Architecture Data-driven Data Warehouse

The Modern Data Lakehouse: An Architectural Innovation

Cloudera

SEPTEMBER 9, 2022

This is the promise of the modern data lakehouse architecture. analyst Sumit Pal, in “Exploring Lakehouse Architecture and Use Cases,” published January 11, 2022: “Data lakehouses integrate and unify the capabilities of data warehouses and data lakes, aiming to support AI, BI, ML, and data engineering on a single platform.”

Metadata

Metadata Machine Learning Unstructured Data Data Lake

Data Mesh vs. Data Fabric: A Love Story

Alation

JANUARY 13, 2022

Thoughtworks says data mesh is key to moving beyond a monolithic data lake. Spoiler alert: data fabric and data mesh are independent design concepts that are, in fact, quite complementary. Gartner on Data Fabric.

Data Lake

Data Lake Metadata Data-driven Data Governance

Dive deep into AWS Glue 4.0 for Apache Spark

AWS Big Data

MAY 18, 2023

You can discover and connect to over 70 diverse data sources, manage your data in a centralized data catalog, and create, run, and monitor data integration pipelines to load data into your data lakes and your data warehouses. AWS Glue released version 4.0 runtime ( 3.5

Testing

Testing Data Lake Cost-Benefit Data Integration

Augmented data management: Data fabric versus data mesh

IBM Big Data Hub

APRIL 27, 2022

With data ownership decentralization, data owners can create data products for their respective domains, meaning data consumers, both data scientists and business users, can use a combination of these data products for data analytics and data science.

Management

Management Metadata Data Architecture Data Lake

Build incremental data pipelines to load transactional data changes using AWS DMS, Delta 2.0, and Amazon EMR Serverless

AWS Big Data

MARCH 3, 2023

Building data lakes from continuously changing transactional data of databases and keeping data lakes up to date is a complex task and can be an operational challenge. You can then apply transformations and store data in Delta format for managing inserts, updates, and deletes. For Type, choose Spark.

Data Lake

Data Lake Dashboards Metrics Metadata

Why Invest Now? Three Investors Share the Story Behind Alation’s Series E

Alation

NOVEMBER 2, 2022

And that’s even in the midst of 2022, which has been a tumultuous year from a macro perspective. “We had not seen that in the broader intelligence & data governance market.” “[The lakehouse] helps businesses really harness the power of data and analytics and AI. And data governance is critical to driving adoption.”

Data Governance

Data Governance Marketing Finance Data Lake

The hidden history of Db2

IBM Big Data Hub

JULY 5, 2022

Db2’s decades of innovation and expertise running the most demanding transactional, analytical, and operational workloads have culminated today in the 2022 Gartner Peer Insights Customers’ Choice distinction for Cloud Database Management Systems. To learn more, visit IBM Db2 and our IBM data management page.

Data Lake

Data Lake Data Warehouse Publishing Structured Data

6 ways to drive Wi-Fi operational efficiencies

CIO Business Intelligence

APRIL 18, 2023

To help take control in these uncertain times, this blog outlines six strategies to modernize your Wi-Fi. AIOps can help identify areas for optimization using existing hardware by combing through a tsunami of data faster than any human ever could. Adopt AI to better leverage existing hardware investments.

IoT

IoT Internet of Things Data Lake Optimization

TDC Digital leverages IBM Cloud for transparent billing and improved customer satisfaction

IBM Big Data Hub

MAY 19, 2023

... billion in 2022 to USD 130.0 billion by 2027. With high-speed file transfer, integrated services and cross-region offerings, IBM Cloud Object Storage allows you to leverage your data securely.

Unstructured Data

Unstructured Data Data Processing Manufacturing Data Lake

Alation Earns 8 Top Rankings in BARC’s The Data Management Survey 23

Alation

OCTOBER 19, 2022

Today, they have issued The Data Management Survey 23, a report based on a survey of more than 1,200 data management end-users of 23 products (or groups of products). The survey was conducted from January to April 2022 and examined user feedback on product experience across 18 criteria.

Management

Management KPI Data Governance Reporting

Automate alerting and reporting for AWS Glue job resource usage

AWS Big Data

MAY 25, 2023

Many organizations today are using AWS Glue to build ETL pipelines that bring data from disparate sources and store the data in repositories like a data lake, database, or data warehouse for further consumption. In April 2022, Auto Scaling for AWS Glue was released for AWS Glue version 3.0 ...

Reporting

Reporting Metrics Optimization Data Lake

How data stores and governance impact your AI initiatives

IBM Big Data Hub

OCTOBER 12, 2023

To optimize data analytics and AI workloads, organizations need a data store built on an open data lakehouse architecture. This type of architecture combines the performance and usability of a data warehouse with the flexibility and scalability of a data lake. Learn more about IBM watsonx.

Cost-Benefit

Cost-Benefit Metadata Data Governance Optimization

Exploring new ETL and ELT capabilities for Amazon Redshift from the AWS Glue Studio visual editor

AWS Big Data

APRIL 20, 2023

In a modern data architecture, unified analytics enable you to access the data you need, whether it’s stored in a data lake or a data warehouse. AWS Glue provides an extensible architecture that enables users with different data processing use cases, and works well with Amazon Redshift.

Visualization

Visualization Data Warehouse Big Data Data Lake

How The Cloud Made ‘Data-Driven Culture’ Possible | Part 1

BizAcuity

MAY 10, 2022

The cloud market is well on track to reach the expected $495 billion mark by the end of 2022. How this transformation will impact businesses in the short and long run is the main discussion in this blog. In 2022, Amazon is still the single largest leader in the cloud market with over 30% market share. To be continued.

Data-driven

Data-driven IoT Unstructured Data Data Lake

Pillars of Knowledge, Best Practices for Data Governance

Cloudera

AUGUST 4, 2021

As a result, the biometrics market is estimated to be worth a staggering $49 billion by 2022 and huge investments are being made in the development of new algorithms and systems to improve biometric accuracy. SDX is designed to reduce risk and operational costs by delivering consistent data context across deployments.

Data Governance

Data Governance Metadata Data-driven Enterprise

The year’s top 10 enterprise AI trends — so far

CIO Business Intelligence

SEPTEMBER 21, 2023

“The world has flipped since 2022,” says David McCurdy, chief enterprise architect and CTO at Insight. To make all this possible, the data had to be collected, processed, and fed into the systems that needed it in a reliable, efficient, scalable, and secure way. Then gen AI came out.

Enterprise

Enterprise Consulting Modeling Cost-Benefit

Customer Data Culture: The Innovators Have Already Reinvented Themselves

Alation

FEBRUARY 13, 2020

The re-insurance product that they introduced was inspired by collaboration between geographically dispersed teams coming together through the Alation Data Catalog. With the introduction of a new data lake, MunichRe created a new way for actuaries and business experts to explore new product concepts and test new markets.

Digital Transformation

Digital Transformation Insurance Data-driven Machine Learning

Using Artificial Intelligence to Make Sense of IoT Data

BizAcuity

MARCH 1, 2019

At the backend, based on the data collected, data is stored in data lakes. Such data is collected from hundreds, thousands and millions of users. Then AI/ML algorithms are run on this collected data.

IoT

IoT Internet of Things Big Data Data-driven

What Is Alation Connected Sheets? Q&A with the Creators

Alation

NOVEMBER 28, 2022

You founded Kloudio to address the spreadsheet problem, and Alation acquired Kloudio in February of 2022. But refreshing this analysis with the latest data was impossible… unless you were proficient in SQL or Python. Read the overview blog: Alation Connected Sheets Brings Trust to Spreadsheets.

Metadata

Metadata Enterprise Cost-Benefit Data Quality

Fabrics, Meshes & Stacks, oh my! Q&A with Sanjeev Mohan

Alation

AUGUST 11, 2022

Today, the brightest minds in our industry are targeting the massive proliferation of data volumes and the accompanying but hard-to-find value locked within all that data. I recently had the opportunity to connect with Mohan at Snowflake Summit 2022 in Las Vegas.

Metadata

Metadata Data Warehouse Data Quality Data Lake

Data security: Why a proactive stance is best

IBM Big Data Hub

JULY 7, 2023

But with a proactive approach to data security, organizations can fight back against the seemingly endless waves of threats. IBM Security X-Force found the most common threat on organizations is extortion, which comprised more than a quarter (27%) of all cybersecurity threats in 2022.

Risk

Risk Data Governance Data Lake Data-driven

Unlock The Power of Your Data With These 19 Big Data & Data Analytics Books

datapine

AUGUST 29, 2022

Awarded the “best specialist business book” at the 2022 Business Book Awards, this publication guides readers in discovering how companies are harnessing the power of XR in areas such as retail, restaurants, manufacturing, and overall customer experience. The author, Anil Maheshwari, Ph.D.,

Big Data

Big Data Data Analytics Analytics Data mining
