Remove 2012 Remove Data Processing Remove Metadata
article thumbnail

Use Amazon OpenSearch Ingestion to migrate to Amazon OpenSearch Serverless

AWS Big Data

Migration of metadata such as security roles and dashboard objects will be covered in another subsequent post. Update the following information for the source: Uncomment hosts and specify the endpoint of the existing OpenSearch Service endpoint. For now, you can leave the default minimum as 1 and maximum as 4.

Metadata 103
article thumbnail

Design a data mesh pattern for Amazon EMR-based data lakes using AWS Lake Formation with Hive metastore federation

AWS Big Data

Organizations have multiple Hive data warehouses across EMR clusters, where the metadata gets generated. The onboarding of producers is facilitated by sharing metadata, whereas the onboarding of consumers is based on granting permission to access this metadata. The producer account will host the EMR cluster and S3 buckets.

Insiders

Sign Up for our Newsletter

This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.

article thumbnail

Orchestrate an end-to-end ETL pipeline using Amazon S3, AWS Glue, and Amazon Redshift Serverless with Amazon MWAA

AWS Big Data

In the second account, Amazon MWAA is hosted in one VPC and Redshift Serverless in a different VPC, which are connected through VPC peering. Cross-account access has been set up between S3 buckets in Account A with resources in Account B to be able to load and unload data.

Metadata 109
article thumbnail

Use Amazon Kinesis Data Streams to deliver real-time data to Amazon OpenSearch Service domains with Amazon OpenSearch Ingestion

AWS Big Data

For example, you can use metadata about the Kinesis data stream name to index by data stream ( ${getMetadata("kinesis_stream_name") ), or you can use document fields to index data depending on the CloudWatch log group or other document data ( ${path/to/field/in/document} ).

article thumbnail

Manage Amazon OpenSearch Service Visualizations, Alerts, and More with GitHub and Jenkins

AWS Big Data

In the trust policy, specify that Amazon Elastic Compute Cloud (Amazon EC2) can assume this role: { "Version": "2012-10-17", "Statement": [ { "Effect": "Allow", "Principal": { "Service": "ec2.amazonaws.com" Launch an EC2 instance Note : Make sure to deploy the EC2 instance for hosting Jenkins in the same VPC as the OpenSearch domain.

article thumbnail

Federate Amazon QuickSight access with open-source identity provider Keycloak

AWS Big Data

Download the SAML metadata file. In the navigation pane under Clients , import the SAML metadata file. Insert your specific host domain name where the Keycloak application resides in the following URL: [link] /realms/aws-realm/protocol/saml/descriptor. Download the Keycloak IdP SAML metadata file from that URL location.

article thumbnail

Petabyte-scale log analytics with Amazon S3, Amazon OpenSearch Service, and Amazon OpenSearch Ingestion

AWS Big Data

After the table is cataloged in your AWS Glue metadata catalog, you can run queries directly on your data in your S3 data lake through OpenSearch Dashboards. You can audit connections to ensure that they are set up in a scalable, cost-efficient, and secure way. Solution overview The following diagram illustrates the solution architecture.

Data Lake 119