Amazon DataZone now supports authentication through the Amazon Athena JDBC driver, allowing data users to seamlessly query their subscribed data lake assets via popular business intelligence (BI) and analytics tools like Tableau, Power BI, Excel, SQL Workbench, DBeaver, and more.
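To sketch what that query flow looks like from code rather than a BI tool, the snippet below uses the pyathena library as a stand-in for the JDBC driver; the staging bucket, region, database, and table names are hypothetical placeholders.

    # Minimal sketch, assuming pyathena as a stand-in for the Athena JDBC
    # driver that BI tools use. All names below are hypothetical.
    from pyathena import connect

    conn = connect(
        s3_staging_dir="s3://my-athena-results/",  # hypothetical results bucket
        region_name="us-east-1",
    )
    cursor = conn.cursor()
    cursor.execute('SELECT * FROM "my_datazone_db"."subscribed_asset" LIMIT 10')
    for row in cursor.fetchall():
        print(row)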
These improvements are available through the Amazon Q chat experience on the AWS Management Console, and the Amazon SageMaker Unified Studio (preview) visual ETL and notebook interfaces. The DataFrame code generation now extends beyond AWS Glue DynamicFrame to support a broader range of data processing scenarios.
Customers often want to augment and enrich SAP source data with other non-SAP source data. Such analytic use cases can be enabled by building a data warehouse or data lake. Customers can now use the AWS Glue SAP OData connector to extract data from SAP. Choose Visual ETL to create a job in the Visual Editor.
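For a script-based Glue job, a read from the connector might look roughly like the sketch below; the "SAPOData" connection type string, the option keys, and the connection and entity names are assumptions to verify against the AWS Glue documentation.

    # Minimal sketch of extracting SAP data in an AWS Glue job. The
    # connection type, option keys, and names are assumptions.
    from awsglue.context import GlueContext
    from pyspark.context import SparkContext

    glue_context = GlueContext(SparkContext.getOrCreate())

    sap_frame = glue_context.create_dynamic_frame.from_options(
        connection_type="SAPOData",
        connection_options={
            "connectionName": "sap-odata-connection",  # hypothetical
            "ENTITY_NAME": "/sap/opu/odata/sap/API_SALES_ORDER_SRV/A_SalesOrder",
        },
    )
    sap_frame.toDF().show(5)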
For many organizations, this centralized data store follows a data lake architecture. Although data lakes provide a centralized repository, making sense of this data and extracting valuable insights can be challenging.
In addition to real-time analytics and visualization, the data needs to be shared for long-term data analytics and machine learning applications. The applications are hosted in dedicated AWS accounts and require a BI dashboard and reporting services based on Tableau.
Data analytics on operational data at near-real time is becoming a common need. Due to the exponential growth of data volume, it has become common practice to replace read replicas with data lakes to get better scalability and performance. For more information, see Changing the default settings for your data lake.
On your project, in the navigation pane, choose Data. For Add data source, choose Add connection. For Host, enter the host name of your Aurora PostgreSQL database cluster. The accompanying code builds a JDBC URL from the connection properties and writes the DataFrame to the database; a reconstructed sketch follows.
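A reconstruction of the truncated snippet, assuming a PostgreSQL URL scheme; the target table name and the credential keys are hypothetical.

    # Reconstructed from the truncated excerpt: build a JDBC URL from the
    # connection properties and write the DataFrame to Aurora PostgreSQL.
    jdbc_url = "jdbc:postgresql://{0}:{1}/{2}".format(
        connection_properties["HOST"],
        connection_properties["PORT"],
        connection_properties["DATABASE"],
    )

    (df.write.format("jdbc")
        .option("url", jdbc_url)
        .option("dbtable", "public.target_table")               # hypothetical
        .option("user", connection_properties["USERNAME"])      # assumed key
        .option("password", connection_properties["PASSWORD"])  # assumed key
        .option("driver", "org.postgresql.Driver")
        .mode("append")
        .save())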
However, enterprises often encounter challenges with data silos, insufficient access controls, poor governance, and quality issues. Embracing data as a product is the key to addressing these challenges and fostering a data-driven culture. To incorporate this third-party data, AWS Data Exchange is the logical choice.
QuickSight makes it straightforward for business users to visualize data in interactive dashboards and reports. QuickSight periodically runs Amazon Athena queries to load query results into SPICE and then visualizes the latest metric data. The filtered Worker Utilization per Job visualization shows 0.5,
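If a SPICE refresh needs to be triggered programmatically rather than on a schedule, a minimal boto3 sketch might look like this; the account ID and dataset ID are hypothetical placeholders.

    # Minimal sketch: trigger a SPICE ingestion (refresh) for a QuickSight
    # dataset via boto3. All identifiers are hypothetical.
    import uuid
    import boto3

    quicksight = boto3.client("quicksight", region_name="us-east-1")

    response = quicksight.create_ingestion(
        AwsAccountId="123456789012",
        DataSetId="my-dataset-id",
        IngestionId=str(uuid.uuid4()),
    )
    print(response["IngestionStatus"])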
At the same time, they need to optimize operational costs to unlock the value of this data for timely insights and do so with a consistent performance. With this massive data growth, data proliferation across your data stores, data warehouse, and data lakes can become equally challenging.
In addition to using managed AWS services that BMS didn’t need to worry about upgrading, BMS wanted to offer an ETL service to non-technical business users that could visually compose data transformation workflows and seamlessly run them on the AWS Glue Apache Spark-based serverless data integration engine.
All this data arrives by the terabyte, and a data management platform can help marketers make sense of it all. Marketing-focused or not, DMPs excel at negotiating with a wide array of databases, data lakes, or data warehouses, ingesting their streams of data and then cleaning, sorting, and unifying the information therein.
With AWS Glue, you can discover and connect to hundreds of diverse data sources and manage your data in a centralized data catalog. It enables you to visually create, run, and monitor extract, transform, and load (ETL) pipelines to load data into your data lakes. Select Visual ETL in the central pane.
These processes are recurrent and require continuous evolution of reports, online data visualization, dashboards, and new functionality to adapt current processes and develop new ones. You need to determine whether you are going with an on-premises or cloud-hosted strategy. Construction Iterations.
Modern applications store massive amounts of data on Amazon Simple Storage Service (Amazon S3) data lakes, providing cost-effective and highly durable storage, and allowing you to run analytics and machine learning (ML) from your data lake to generate insights on your data.
This allows business analysts and decision-makers to gain valuable insights, visualize key metrics, and explore the data in depth, enabling informed decision-making and strategic planning for pricing and promotional strategies. Use Amazon Route 53 to create a private hosted zone that resolves the Snowflake endpoint within your VPC.
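A minimal boto3 sketch of that Route 53 step; the zone name, region, and VPC ID are hypothetical placeholders.

    # Minimal sketch: create a private hosted zone associated with a VPC
    # so the Snowflake endpoint resolves privately. Names are hypothetical.
    import uuid
    import boto3

    route53 = boto3.client("route53")

    route53.create_hosted_zone(
        Name="privatelink.snowflakecomputing.com",  # hypothetical zone name
        CallerReference=str(uuid.uuid4()),
        VPC={"VPCRegion": "us-east-1", "VPCId": "vpc-0abc123def4567890"},
        HostedZoneConfig={
            "Comment": "Resolve the Snowflake endpoint within the VPC",
            "PrivateZone": True,
        },
    )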
In addition, data pipelines include more and more stages, making it difficult for data engineers to compile, manage, and troubleshoot those analytical workloads. Those incremental costs stem from a variety of causes: increased data processing costs associated with legacy deployment types (e.g., CRM platforms).
To bring their customers the best deals and user experience, smava follows the modern data architecture principles with a data lake as a scalable, durable data store and purpose-built data stores for analytical processing and data consumption.
These nodes can implement analytical platforms like data lakehouses, data warehouses, or data marts, all united by producing data products. By treating the data as a product, the outcome is a reusable asset that outlives a project and meets the needs of the enterprise consumer.
It supports both data quality at rest and data quality in AWS Glue extract, transform, and load (ETL) pipelines. Data quality at rest focuses on validating the data stored in data lakes, databases, or data warehouses. It ensures that the data meets specific quality standards before it is consumed.
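As a rough sketch of the in-pipeline case, a Glue job can evaluate a DQDL ruleset against a DynamicFrame; the rules and column names here are hypothetical, and the transform's exact options should be checked against the AWS Glue documentation.

    # Minimal sketch of an in-pipeline data quality check in AWS Glue.
    # Column names and rules are hypothetical.
    from awsgluedq.transforms import EvaluateDataQuality

    ruleset = """
        Rules = [
            IsComplete "order_id",
            ColumnValues "quantity" > 0
        ]
    """

    checked = EvaluateDataQuality.apply(
        frame=orders_frame,  # an existing DynamicFrame in the job
        ruleset=ruleset,
        publishing_options={
            "dataQualityEvaluationContext": "orders_check",
            "enableDataQualityCloudWatchMetrics": True,
            "enableDataQualityResultsPublishing": True,
        },
    )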
Cloudera’s Data Warehouse service allows raw data to be stored in the cloud storage of your choice (S3, ADLSg2). It will be stored in your own namespace, and not force you to move data into someone else’s proprietary file formats or hosted storage. Proprietary file formats mean no one else is invited in! Separate compute.
The challenge is to do it right, and a crucial way to achieve it is with decisions based on data and analysis that drive measurable business results. This was the key learning from the Sisense event heralding the launch of Periscope Data in Tel Aviv, Israel — the beating heart of the startup nation. What VCs want from startups.
Many organizations are building data lakes to store and analyze large volumes of structured, semi-structured, and unstructured data. In addition, many teams are moving towards a data mesh architecture, which requires them to expose their data sets as easily consumable data products.
Security Lake automatically centralizes security data from cloud, on-premises, and custom sources into a purpose-built data lake stored in your account. With Security Lake, you can get a more complete understanding of your security data across your entire organization.
As quantitative data is always numeric, it’s relatively straightforward to put it in order, manage it, analyze it, visualize it, and do calculations with it. Spreadsheet software like Excel, Google Sheets, or traditional database management systems all mainly deal with quantitative data.
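A tiny pandas illustration of that point, with made-up values: numeric data sorts, aggregates, and computes directly.

    # Quantitative (numeric) data is straightforward to order, aggregate,
    # and calculate with. Values below are made up for the example.
    import pandas as pd

    sales = pd.DataFrame({"region": ["N", "S", "N", "S"],
                          "revenue": [1200.0, 950.5, 1100.0, 1300.25]})

    print(sales.sort_values("revenue"))               # ordering
    print(sales.groupby("region")["revenue"].mean())  # aggregation
    print(sales["revenue"].sum() * 1.08)              # a calculation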
Amazon Redshift is the most widely used data warehouse in the cloud, best suited for analyzing exabytes of data and running complex analytical queries. Amazon QuickSight is a fast business analytics service to build visualizations, perform ad hoc analysis, and quickly get business insights from your data. Create a visual.
Amazon Redshift, a warehousing service, offers a variety of options for ingesting data from diverse sources into its high-performance, scalable environment. This native feature of Amazon Redshift uses massively parallel processing (MPP) to load objects directly from data sources into Redshift tables.
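The native feature described is presumably the COPY command; a minimal sketch of issuing one through the Redshift Data API follows, with a hypothetical cluster, table, IAM role, and S3 path.

    # Minimal sketch: run a COPY statement via the Redshift Data API.
    # All identifiers and paths are hypothetical placeholders.
    import boto3

    rsd = boto3.client("redshift-data", region_name="us-east-1")

    rsd.execute_statement(
        ClusterIdentifier="my-cluster",
        Database="dev",
        DbUser="awsuser",
        Sql=(
            "COPY public.sales FROM 's3://my-bucket/sales/' "
            "IAM_ROLE 'arn:aws:iam::123456789012:role/RedshiftCopyRole' "
            "FORMAT AS PARQUET;"
        ),
    )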
Previously, there were three types of data structures in telco: entity data sets (i.e., marketing data lakes). It is an edge-to-AI suite of capabilities, including edge analytics, data staging, data quality control, data visualization tools, and machine learning.
Strategize based on how your teams explore data, run analyses, wrangle data for downstream requirements, and visualize data at different levels. The AWS modern data architecture shows a way to build a purpose-built, secure, and scalable data platform in the cloud.
Building data lakes from continuously changing transactional data of databases and keeping them up to date is a complex task and can be an operational challenge. You can then apply transformations and store data in Delta format for managing inserts, updates, and deletes.
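A minimal sketch of that upsert-and-delete pattern with the Delta Lake Spark API; the table path, join key, and operation-flag column are hypothetical.

    # Apply CDC-style inserts, updates, and deletes to a Delta table.
    # Path, join key ("id"), and flag column ("op") are hypothetical.
    from delta.tables import DeltaTable

    target = DeltaTable.forPath(spark, "s3://my-bucket/delta/orders")

    (target.alias("t")
        .merge(changes_df.alias("s"), "t.id = s.id")
        .whenMatchedDelete(condition="s.op = 'D'")  # deletes
        .whenMatchedUpdateAll()                     # updates
        .whenNotMatchedInsertAll()                  # inserts
        .execute())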
At Stitch Fix, we have been powered by data science since our founding and rely on many modern data lake and data processing technologies. In our infrastructure, Apache Kafka has emerged as a powerful tool for managing event streams and facilitating real-time data processing.
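A minimal sketch of consuming such an event stream in Python, assuming the kafka-python client; the topic, brokers, and group ID are hypothetical.

    # Consume events from a Kafka topic with kafka-python.
    # Topic, bootstrap servers, and group ID are hypothetical.
    import json
    from kafka import KafkaConsumer

    consumer = KafkaConsumer(
        "style-events",
        bootstrap_servers=["broker1:9092"],
        group_id="analytics-workers",
        value_deserializer=lambda v: json.loads(v.decode("utf-8")),
        auto_offset_reset="earliest",
    )

    for message in consumer:
        print(message.topic, message.offset, message.value)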
The data lakehouse is gaining in popularity because it enables a single platform for all your enterprise data with the flexibility to run any analytic and machine learning (ML) use case. Cloud data lakehouses provide significant scaling, agility, and cost advantages compared to cloud data lakes and cloud data warehouses.
Amazon Redshift is a popular cloud data warehouse, offering a fully managed cloud-based service that seamlessly integrates with an organization’s Amazon Simple Storage Service (Amazon S3) data lake, real-time streams, machine learning (ML) workflows, transactional workflows, and much more—all while providing up to 7.9x
Episode 4: Unlocking the Value of Enterprise AI with Data Engineering Capabilities. They discuss how the data engineering team is instrumental in easing collaboration between analysts, data scientists, and ML engineers to build enterprise AI solutions.
We can determine the following are needed: an open data format ingestion architecture processing the source dataset and refining the data in the S3 data lake. This requires a dedicated team of 3–7 members building a serverless data lake for all data sources.
Data lakes, while useful in helping you capture all of your data, are only the first step in extracting its value. The combination of Alation and Trifacta allows you to seamlessly complete this workflow and embrace self-service data along with your self-service analysis.
By supporting open-source frameworks and tools for code-based, automated, and visual data science capabilities — all in a secure, trusted studio environment — we’re already seeing excitement from companies ready to use both foundation models and machine learning to accomplish key tasks.
Cargotec captures terabytes of IoT telemetry data from their machinery operated by numerous customers across the globe. This data needs to be ingested into a data lake, transformed, and made available for analytics, machine learning (ML), and visualization. The job runs in the target account.
Fun fact: I co-founded an e-commerce company (realistically, a mail-order catalog hosted online) in December 1992 using one of those internetworking applications called Gopher, which was vaguely popular at the time. Somehow, the gravity of the data has a geological effect that forms data lakes. Upcoming Events.
2007: Amazon launches SimpleDB, a non-relational (NoSQL) database that allows businesses to cheaply process vast amounts of data with minimal effort. The platform is built on S3 and EC2 using a hosted Hadoop framework, an efficient big data management and storage solution that AWS quickly took advantage of.
Optionally, specify the Amazon S3 storage class for the data in Amazon Security Lake. For more information, refer to Lifecycle management in Security Lake. Review the details and create the data lake. When the pipeline is active, you should see the security data ingested into your Amazon OpenSearch Service domain.
This past week, I had the pleasure of hosting Data Governance for Dummies author Jonathan Reichental for a fireside chat, along with Denise Swanson, Data Governance lead at Alation. In this way, data governance is the business or process side. Attendance was high, as was the number of excellent questions.
On January 4th I had the pleasure of hosting a webinar. It was titled, The Gartner 2021 Leadership Vision for Data & Analytics Leaders. This was for the Chief Data Officer, or head of data and analytics. Coding skills – SQL, Python or application familiarity – ETL & visualization?