2012, Data Lake and Interactive - Data Leaders Brief

How Volkswagen streamlined access to data across multiple data lakes using Amazon DataZone – Part 1

AWS Big Data

JULY 18, 2024

Over the years, organizations have invested in creating purpose-built, cloud-based data lakes that are siloed from one another. A major challenge is enabling cross-organization discovery and access to data across these multiple data lakes, each built on different technology stacks.

Data Lake

Data Lake Publishing Metadata Data-driven

Introducing Amazon Q data integration in AWS Glue

AWS Big Data

APRIL 30, 2024

Amazon Q Developer can now generate complex data integration jobs with multiple sources, destinations, and data transformations. Configure an IAM role to interact with Amazon Q. His team works on distributed systems & new interfaces for data integration and efficiently managing data lakes on AWS.

Data Integration

Data Integration Data Lake Data Warehouse Software

Run Spark SQL on Amazon Athena Spark

AWS Big Data

OCTOBER 23, 2023

For interactive applications, Athena Spark allows you to spend less time waiting and be more productive, with application startup time in under a second. Running SQL on data lakes is fast, and Athena provides an optimized, Trino- and Presto-compatible API that includes a powerful optimizer.

Data Lake

Data Lake Visualization Optimization Interactive

Periscope Data Expands to Israel, Empowering Data Teams with Powerful Tools

Sisense

DECEMBER 11, 2019

And he demonstrated how the Periscope Data platform overcomes the challenges of huge data volumes that can’t be easily modeled by traditional BI. Citing Tinder as a major example, Kyle explained how it constantly uses data to enhance users’ interactions and calibrate the user experience. A true unicorn.

Data Lake

Data Lake Big Data Sales Data-driven

Simplify and speed up Apache Spark applications on Amazon Redshift data with Amazon Redshift integration for Apache Spark

AWS Big Data

APRIL 20, 2023

Customers use Amazon Redshift to run their business-critical analytics on petabytes of structured and semi-structured data. Apache Spark is a popular framework that you can use to build applications for use cases such as ETL (extract, transform, and load), interactive analytics, and machine learning (ML).

Data Lake

Data Lake Data Warehouse Sales Data-driven

Federate Amazon QuickSight access with open-source identity provider Keycloak

AWS Big Data

JUNE 13, 2023

Vamsi Bhadriraju is a Data Architect at AWS. He works closely with enterprise customers to build data lakes and analytical applications on the AWS Cloud. This policy grants the admin privileges in QuickSight to the federated user. Srikanth Baheti is a Specialized World Wide Principal Solutions Architect for Amazon QuickSight.

Metadata

Metadata Dashboards Business Intelligence Data Lake

Migrate workloads from AWS Data Pipeline

AWS Big Data

JULY 25, 2024

AWS Data Pipeline helps customers automate the movement and transformation of data. With Data Pipeline, customers can define data-driven workflows, so that tasks can be dependent on the successful completion of previous tasks. You can visually create, run, and monitor ETL pipelines to load data into your data lakes.

Visualization

Visualization Management Data Integration Testing

Accelerate HiveQL with Oozie to Spark SQL migration on Amazon EMR

AWS Big Data

APRIL 19, 2023

To configure AWS CLI interaction with AWS, refer to Quick setup. He is passionate about big data and data analytics. Sandeep Singh is a Lead Consultant at AWS ProServe, focused on analytics, data lake architecture, and implementation. Amol Guldagad is a Data Analytics Consultant based in India.

Metadata

Metadata Data Lake Testing Consulting

Generate security insights from Amazon Security Lake data using Amazon OpenSearch Ingestion

AWS Big Data

AUGUST 28, 2023

Optionally, specify the Amazon S3 storage class for the data in Amazon Security Lake. For more information, refer to Lifecycle management in Security Lake. Review the details and create the data lake. Choose Next. Additionally, the principal must have permission to pass the pipeline role to OpenSearch Ingestion.

Dashboards

Dashboards Visualization Metadata Management

Why We Started the Data Intelligence Project

Alation

JULY 7, 2022

To answer these questions we need to look at how data roles within the job market have evolved, and how academic programs have changed to meet new workforce demands. In the 2010s, the growing scope of the data landscape gave rise to a new profession: the data scientist.

Metadata

Metadata Data-driven Insurance Statistics

Q&A with Greg Rahn – The changing Data Warehouse market

Cloudera

DECEMBER 12, 2018

And so I actually transitioned out of that group and into the Big Data Appliance group at Oracle, but soon realized that if that was what I wanted to keep doing, this up and coming company called Cloudera might be a better place to do it since these new technologies weren’t just a hobby at Cloudera. As you mentioned, Qlik is in there.

Data Warehouse

Data Warehouse Marketing Big Data Data-driven

Author data integration jobs with an interactive data preparation experience with AWS Glue visual ETL

AWS Big Data

JULY 10, 2024

The AWS Glue Studio visual editor is a graphical interface that enables you to create, run, and monitor data integration jobs in AWS Glue. The new data preparation interface in AWS Glue Studio provides an intuitive, spreadsheet-style view for interactively working with tabular data. Choose the created IAM role.

Interactive

Interactive Data Integration Visualization Statistics

Integrate custom applications with AWS Lake Formation – Part 1

AWS Big Data

NOVEMBER 19, 2024

Lake Formation also makes it straightforward to share data internally across your organization and externally, which lets you create a data mesh or meet other data sharing needs with no data movement. For this post, we use mybucket. For example, myfolder1/myfolder2/datalake-population-function.zip.

Data Lake

Data Lake Metadata Testing Data Processing

Themes and Conferences per Pacoid, Episode 8

Domino Data Lab

APRIL 3, 2019

That resulted in server farms, collecting volumes of log data from customer interactions, data which was then aggregated and fed into machine learning algorithms which created data products as pre-computed results, which in turn made web apps smarter and enhanced e-commerce revenue. We keep feeding the monster data.

Machine Learning

Machine Learning Data Governance Metadata Data Science

Themes and Conferences per Pacoid, Episode 12

Domino Data Lab

AUGUST 8, 2019

Once upon a time, circa 2012-ish, data science conferences were replete with talks about an industry hellbent on loading amazing enormous Big Data into some kind of data lake, and applying all kinds of odd astrophysics-ish approaches…for eventual PROFIT! Or something. “Nothing Spreads Like Fear”.

Data Science

Data Science Machine Learning Data Governance Statistics

Access Amazon S3 Iceberg tables from Databricks using AWS Glue Iceberg Rest Catalog in Amazon SageMaker Lakehouse

AWS Big Data

JANUARY 23, 2025

Amazon SageMaker Lakehouse enables a unified, open, and secure lakehouse platform on your existing data lakes and warehouses. Its unified data architecture supports data analysis, business intelligence, machine learning, and generative AI applications, which can now take advantage of a single authoritative copy of data.

Data Lake

Data Lake Data Warehouse Metadata Machine Learning

Redefining enterprise transformation in the age of intelligent ecosystems

CIO Business Intelligence

JANUARY 16, 2025

The mega-vendor era: By 2020, the basis of competition for what are now referred to as mega-vendors was interoperability, automation, intra-ecosystem participation, and unlocking access to data to drive business capabilities and value and to manage risk. Planning and Reasoning: The agents are capable of complex planning and multi-step reasoning.

Enterprise

Enterprise Digital Transformation Scorecard Interactive
