Remove Data Processing Remove Events Remove Metadata
article thumbnail

Modernize your legacy databases with AWS data lakes, Part 2: Build a data lake using AWS DMS data on Apache Iceberg

AWS Big Data

We recommend using AWS Step Functions Workflow Studio , and setting up Amazon S3 event notifications and an SNS FIFO queue to receive the filename as messages. Because a CDC file can contain data for multiple tables, the job loops over the tables in a file and loads the table metadata from the source table ( RDS column names).

Data Lake 104
article thumbnail

Use Amazon Kinesis Data Streams to deliver real-time data to Amazon OpenSearch Service domains with Amazon OpenSearch Ingestion

AWS Big Data

For example, you can use metadata about the Kinesis data stream name to index by data stream ( ${getMetadata("kinesis_stream_name") ), or you can use document fields to index data depending on the CloudWatch log group or other document data ( ${path/to/field/in/document} ).

Metadata 115
Insiders

Sign Up for our Newsletter

This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.

article thumbnail

Implement a custom subscription workflow for unmanaged Amazon S3 assets published with Amazon DataZone

AWS Big Data

The proposed solution involves creating a custom subscription workflow that uses the event-driven architecture of Amazon DataZone. Amazon DataZone keeps you informed of key activities (events) within your data portal, such as subscription requests, updates, comments, and system events. Enter a name for the asset.

article thumbnail

How EUROGATE established a data mesh architecture using Amazon DataZone

AWS Big Data

From here, the metadata is published to Amazon DataZone by using AWS Glue Data Catalog. The applications are hosted in dedicated AWS accounts and require a BI dashboard and reporting services based on Tableau. This process is shown in the following figure.

IoT 107
article thumbnail

Integrate custom applications with AWS Lake Formation – Part 2

AWS Big Data

Add Amplify hosting Amplify can host applications using either the Amplify console or Amazon CloudFront and Amazon Simple Storage Service (Amazon S3) with the option to have manual or continuous deployment. For simplicity, we use the Hosting with Amplify Console and Manual Deployment options.

article thumbnail

Build event-driven data pipelines using AWS Controllers for Kubernetes and Amazon EMR on EKS

AWS Big Data

An event-driven architecture is a software design pattern in which decoupled applications can asynchronously publish and subscribe to events via an event broker. Another option is to use AWS Step Functions , which is a serverless workflow service that integrates with EMR on EKS and EventBridge to build event-driven workflows.

article thumbnail

Ingest, transform, and deliver events published by Amazon Security Lake to Amazon OpenSearch Service

AWS Big Data

When it comes to near-real-time analysis of data as it arrives in Security Lake and responding to security events your company cares about, Amazon OpenSearch Service provides the necessary tooling to help you make sense of the data found in Security Lake. Services such as Amazon Athena and Amazon SageMaker use query access.