Data Architecture, Data Processing and Testing

Eight Top DataOps Trends for 2022

DataKitchen

NOVEMBER 29, 2021

In 2022, data organizations will institute robust automated processes around their AI systems to make them more accountable to stakeholders. Model developers will test for AI bias as part of their pre-deployment testing. Quality test suites will enforce “equity,” like any other performance metric. Data Gets Meshier.
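As a rough illustration of treating "equity" like any other pre-deployment metric, the sketch below computes a demographic parity gap and turns it into a pass/fail check. The function names, the choice of demographic parity as the fairness measure, and the 0.1 threshold are all assumptions made for this example; the excerpt does not say which measure or tooling is meant.

```python
from collections import defaultdict


def positive_rate_by_group(predictions, groups):
    """Return the share of positive (1) predictions for each group label."""
    totals = defaultdict(int)
    positives = defaultdict(int)
    for pred, group in zip(predictions, groups):
        totals[group] += 1
        positives[group] += pred
    return {g: positives[g] / totals[g] for g in totals}


def demographic_parity_gap(predictions, groups):
    """Largest difference in positive-prediction rate between any two groups."""
    rates = positive_rate_by_group(predictions, groups)
    return max(rates.values()) - min(rates.values())


def passes_equity_check(predictions, groups, max_gap=0.1):
    """Hypothetical quality gate: fail when the gap exceeds max_gap (assumed threshold)."""
    return demographic_parity_gap(predictions, groups) <= max_gap


if __name__ == "__main__":
    # Toy binary predictions for two groups, "a" and "b".
    preds = [1, 1, 1, 0, 0, 0, 0, 1]
    grps = ["a", "a", "a", "a", "b", "b", "b", "b"]
    print(demographic_parity_gap(preds, grps))
    print(passes_equity_check(preds, grps))
```

A check like this can sit in the same test suite as accuracy or latency thresholds, so a model that fails it is blocked from deployment in the same way.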

Testing

Testing Data Lake Data Architecture Manufacturing

7 types of tech debt that could cripple your business

CIO Business Intelligence

MARCH 25, 2025

Build up: Databases that have grown in size, complexity, and usage build up the need to rearchitect the model and architecture to support that growth over time. What CIOs can do: To make transitions to new AI capabilities less costly, invest in regression testing and change management practices around AI-enabled large-scale workflows.

Risk

Risk Cost-Benefit Data-driven Digital Transformation

Unlocking near real-time analytics with petabytes of transaction data using Amazon Aurora Zero-ETL integration with Amazon Redshift and dbt Cloud

AWS Big Data

NOVEMBER 27, 2024

Together with price-performance, Amazon Redshift offers capabilities such as serverless architecture, machine learning integration within your data warehouse and secure data sharing across the organization. dbt Cloud is a hosted service that helps data teams productionize dbt deployments. Choose Test Connection.

Data Warehouse

Data Warehouse Analytics Testing Sales

The future of data: A 5-pillar approach to modern data management

CIO Business Intelligence

DECEMBER 11, 2024

Manish Limaye. Pillar #1: Data platform. The data platform pillar comprises tools, frameworks, and processing and hosting technologies that enable an organization to process large volumes of data, both in batch and streaming modes. He is currently a technology advisor to multiple startups and mid-size companies.

Management

Management Data Governance Data Science Reporting

Build a secure data visualization application using the Amazon Redshift Data API with AWS IAM Identity Center

AWS Big Data

MARCH 6, 2025

Copy and save the client ID and client secret needed later for the Streamlit application and the IAM Identity Center application to connect using the Redshift Data API. Generate the client secret and set sign-in redirect URL and sign-out URL to [link] (we will host the Streamlit application locally on port 8501). and v3.12.2.

Visualization

Visualization Sales Data Warehouse Management

Design a data mesh pattern for Amazon EMR-based data lakes using AWS Lake Formation with Hive metastore federation

AWS Big Data

JUNE 10, 2024

Test access to the producer cataloged Amazon S3 data using EMR Serverless in the consumer account. Test access using Athena queries in the consumer account. Test access using SageMaker Studio in the consumer account. It is recommended to use test accounts. The catalog account will host Lake Formation and AWS Glue.

Data Lake

Data Lake Metadata Data Warehouse Data Processing

Build SAML identity federation for Amazon OpenSearch Service domains within a VPC

AWS Big Data

FEBRUARY 7, 2024

Create an Amazon Route 53 public hosted zone such as mydomain.com to be used for routing internet traffic to your domain. For instructions, refer to Creating a public hosted zone. Request an AWS Certificate Manager (ACM) public certificate for the hosted zone. hosted_zone_id – The Route 53 public hosted zone ID.

Dashboards

Dashboards Data Processing Metadata Consulting

4 paths to sustainable AI

CIO Business Intelligence

JANUARY 31, 2024

Weston uses uplift modeling, running a series of A/B tests to determine how potential customers respond to different offers, and then uses the results of those tests to build the model. The size of the data sets is limited by business concerns.
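The simplest quantity behind uplift modeling can be sketched as the difference in response rate between customers who received an offer and a control group who did not. The code below is only that basic estimate on toy data, not Weston's model, which the excerpt does not describe; the function names and offer labels are hypothetical.

```python
def conversion_rate(outcomes):
    """Share of customers who responded; outcomes is a list of 0/1 flags."""
    return sum(outcomes) / len(outcomes)


def uplift(treated_outcomes, control_outcomes):
    """Estimated uplift: response rate with the offer minus response rate without it."""
    return conversion_rate(treated_outcomes) - conversion_rate(control_outcomes)


def best_offer(tests):
    """Pick the offer with the highest estimated uplift.

    tests maps an offer name to a (treated_outcomes, control_outcomes) pair,
    one pair per A/B test.
    """
    return max(tests, key=lambda offer: uplift(*tests[offer]))


if __name__ == "__main__":
    # Toy A/B test results for two hypothetical offers.
    ab_tests = {
        "discount": ([1, 1, 0, 0], [1, 0, 0, 0]),
        "free_shipping": ([1, 1, 1, 0], [1, 0, 0, 0]),
    }
    for name, (treated, control) in ab_tests.items():
        print(name, uplift(treated, control))
    print(best_offer(ab_tests))
```

A full uplift model would go further and predict this difference per customer from their attributes, which is where the results of the repeated A/B tests would serve as training data.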

Cost-Benefit

Cost-Benefit Modeling Testing IoT

Data Integrity, the Basis for Reliable Insights

Sisense

AUGUST 28, 2020

Uncomfortable truth incoming: Most people in your organization don’t think about the quality of their data from intake to production of insights. However, as a data team member, you know how important data integrity (and a whole host of other aspects of data management) is. Data integrity: A process and a state.

Data Integration

Data Integration Testing Data Quality Data-driven

How Swisscom automated Amazon Redshift as part of their One Data Platform solution using AWS CDK – Part 1

AWS Big Data

JUNE 12, 2024

One Data Platform The ODP architecture is based on the AWS Well Architected Framework Analytics Lens and follows the pattern of having raw, standardized, conformed, and enriched layers as described in Modern data architecture. See the following admin user code: admin_secret_kms_key_options = KmsKeyOptions(.

Data Architecture

Data Architecture Cost-Benefit Data-driven Experimentation

Unlock scalability, cost-efficiency, and faster insights with large-scale data migration to Amazon Redshift

AWS Big Data

AUGUST 1, 2024

Success criteria alignment by all stakeholders (producers, consumers, operators, auditors) is key for successful transition to a new Amazon Redshift modern data architecture. The success criteria are the key performance indicators (KPIs) for each component of the data workflow.

Data Warehouse

Data Warehouse KPI Optimization Cost-Benefit

Four Ways Telcos Can Realize Data-Driven Transformation

Cloudera

OCTOBER 19, 2023

While navigating so many simultaneous data-dependent transformations, they must balance the need to level up their data management practices—accelerating the rate at which they ingest, manage, prepare, and analyze data—with that of governing this data.

Data-driven

Data-driven Data Architecture Predictive Modeling Digital Transformation

How smava makes loans transparent and affordable using Amazon Redshift Serverless

AWS Big Data

DECEMBER 21, 2023

Overview of solution: As a data-driven company, smava relies on the AWS Cloud to power their analytics use cases. smava ingests data from various external and internal data sources into a landing stage on the data lake based on Amazon Simple Storage Service (Amazon S3).

Data Lake

Data Lake Data Warehouse Data-driven B2B

How Universal Data Distribution Accelerates Complex DoD Missions

Cloudera

AUGUST 11, 2022

But information broadly, and the management of data specifically, is still “the” critical factor for situational awareness, streamlined operations, and a host of other use cases across today’s tech-driven battlefields. With over 400 connectors and processors, UDD enables a broad range of data distribution capabilities.

Predictive Analytics

Predictive Analytics Data-driven Data Processing Data Architecture

Migrate an existing data lake to a transactional data lake using Apache Iceberg

AWS Big Data

OCTOBER 3, 2023

Over the years, data lakes on Amazon Simple Storage Service (Amazon S3) have become the default repository for enterprise data and are a common choice for a large set of users who query data for a variety of analytics and machine learning use cases. Analytics use cases on data lakes are always evolving. Choose ETL Jobs.

Data Lake

Data Lake Metadata Snapshot Recreation/Entertainment

HEMA accelerates their data governance journey with Amazon DataZone

AWS Big Data

DECEMBER 19, 2024

HEMA has a bespoke enterprise architecture, built around the concept of services. Each service is hosted in a dedicated AWS account and is built and maintained by a product owner and a development team, as illustrated in the following figure. Amazon DataZone is the central piece in this architecture.

Data Governance

Data Governance Publishing Data-driven Metadata

CDOs: Your AI is smart, but your ESG is dumb. Here’s how to fix it

CIO Business Intelligence

MARCH 19, 2025

Integrating ESG into data decision-making: CDOs should embed sustainability into data architecture, ensuring that systems are designed to optimize energy efficiency, minimize unnecessary data replication and promote ethical data use.

IT Data Governance Data-driven Metrics

Migrate Microsoft Azure Synapse Analytics to Amazon Redshift using AWS SCT

AWS Big Data

OCTOBER 18, 2023

Choose Test connection to verify that AWS SCT can connect to your source Azure Synapse project. Choose Test connection to verify that AWS SCT can connect to your target Redshift workgroup. When the test is successful, choose OK. Select Redshift data agent, then choose OK. to indicate local host. Choose Test Task.

Analytics

Analytics Data Warehouse Dashboards Testing

How Cloudera Data Flow Enables Successful Data Mesh Architectures

Cloudera

OCTOBER 7, 2021

Those decentralization efforts appeared under different monikers through time, e.g., data marts versus data warehousing implementations (a popular architectural debate in the era of structured data) then enterprise-wide data lakes versus smaller, typically BU-Specific, “data ponds”.

Metadata

Metadata Cost-Benefit Enterprise Interactive

For IT leaders, operationalized gen AI is still a moving target

CIO Business Intelligence

FEBRUARY 28, 2024

And not only do companies have to get all the basics in place to build for analytics and MLOps, but they also need to build new data structures and pipelines specifically for gen AI. And for some use cases, an expensive, high-end commercial LLM might not be required since a locally-hosted open source model might suffice.

IT Consulting Modeling Enterprise

Migrate a petabyte-scale data warehouse from Actian Vectorwise to Amazon Redshift

AWS Big Data

MAY 30, 2024

Overall, the current architecture didn’t support workload prioritization, therefore a physical model of resources was reserved for this reason. The system had an integration with legacy backend services that were all hosted on premises. Solution overview: Amazon Redshift is an industry-leading cloud data warehouse.

Data Warehouse

Data Warehouse Data Lake Cost-Benefit Structured Data

A guide to efficient Oracle implementation

IBM Big Data Hub

DECEMBER 4, 2023

Assemble a cross-collaborative implementation team with well-defined roles and identify major stakeholders to consult and test the system as the project moves forward. During configuration, an organization constructs its data architecture and defines user roles. Security: Ensure all sensitive data is stored appropriately.

Testing

Testing Consulting Digital Transformation Cost-Benefit

How Zurich Insurance Group built a log management solution on AWS

AWS Big Data

JULY 16, 2024

Historic data analysis – Data stored in Amazon S3 can be queried to satisfy one-time audit or analysis tasks. Eventually, this data could be used to train ML models to support better anomaly detection. Zurich has done testing with Amazon SageMaker and has plans to add this capability in the near future.

Insurance

Insurance Management Cost-Benefit Optimization

How Financial Services and Insurance Streamline AI Initiatives with a Hybrid Data Platform

Cloudera

SEPTEMBER 7, 2023

Introduce advanced AI training and programs, including hands-on projects that simulate real-world financial scenarios, or mentorship programs hosted by AI experts. Prepare for higher AI demands, assessing the state of the institution’s infrastructure capacity while taking into account future data processing needs.

Insurance

Insurance Risk Data-driven Finance

How The Cloud Made ‘Data-Driven Culture’ Possible | Part 2: Cloud Adoption

BizAcuity

MAY 24, 2022

IaaS provides a platform for compute, data storage and networking capabilities. IaaS is mainly used for developing software (testing and development, batch processing), hosting web applications and data analysis. To try and test the platforms in accordance with data strategy and governance.

Data-driven

Data-driven Cost-Benefit Digital Transformation Strategy

Introducing erwin Data Modeler 14.0: The next step in a tradition of data modeling excellence

erwin

SEPTEMBER 16, 2024

The gold standard in data modeling solutions for more than 30 years continues to evolve with its latest release, highlighted by: PostgreSQL 16.x Migration and modernization: It enables seamless transitions between legacy systems and modern platforms, ensuring your data architecture evolves without disruption.

Modeling

Modeling Visualization Data Governance Data Architecture

Best BI Tools For 2024 You Need to Know

FineReport

MARCH 31, 2024

Through meticulous testing and research, we’ve curated a list of the ten best BI tools, ensuring accessibility and efficacy for businesses of all sizes. The selection of the best BI tools stands as a critical step in leveraging data effectively, driving success, and maintaining competitive advantage in modern markets.

Dashboards

Dashboards Visualization Data mining Data-driven

CIO 100 Award winners drive business results with IT

CIO Business Intelligence

AUGUST 7, 2024

But Barnett, who started work on a strategy in 2023, wanted to continue using Baptist Memorial’s on-premise data center for financial, security, and continuity reasons, so he and his team explored options that allowed for keeping that data center as part of the mix.

IT Insurance Cost-Benefit Testing

What Is Embedded Analytics?

Jet Global

MAY 1, 2023

Data Environment First off, the solutions you consider should be compatible with your current data architecture. We have outlined the requirements that most providers ask for: Data Sources Strategic Objective Use native connectivity optimized for the data source. Do what you expect your customers to do.

Analytics

Analytics Cost-Benefit Visualization Dashboards

Design patterns for implementing Hive Metastore for Amazon EMR on EKS

AWS Big Data

FEBRUARY 28, 2025

In modern data architectures, the need to manage and query vast datasets efficiently, consistently, and accurately is paramount. For organizations that deal with big data processing, managing metadata becomes a critical concern. Note that this implementation hasn’t been tested with other EMR on EKS Spark job submission types.

Metadata

Metadata Data Lake Data Processing Data Architecture

Configure cross-account access of Amazon SageMaker Lakehouse multi-catalog tables using AWS Glue 5.0 Spark

AWS Big Data

MAY 9, 2025

Permissions from Prerequisites for managing Amazon Redshift namespaces in the AWS Glue Data Catalog granted to the Lake Formation administrator role on both accounts. An S3 bucket in the producer account to host the sample Iceberg table data. Test this solution in your accounts and share feedback in the comments section.

Data Lake

Data Lake Data Warehouse Marketing Management
