Modeling and Statistics - Data Leaders Brief

Learning Time Series Analysis & Modern Statistical Models

Analytics Vidhya

JANUARY 24, 2023

Introduction: Statistical models are significant for understanding and predicting complex data. A viable area for statistical modeling is time-series analysis. Statistical models […] The post Learning Time Series Analysis & Modern Statistical Models appeared first on Analytics Vidhya.

Statistics

Statistics Modeling Finance Technology

Statistical Modelling and Identifiability of Parameters

Analytics Vidhya

MAY 27, 2021

This article was published as a part of the Data Science Blogathon. Introduction: Identifiability is a very important property of statistical parameters. The post Statistical Modelling and Identifiability of Parameters appeared first on Analytics Vidhya.

Statistics

Statistics Modeling Data Science Publishing

End to End Statistics for Data Science

Analytics Vidhya

OCTOBER 29, 2021

This article was published as a part of the Data Science Blogathon. Introduction to Statistics: Statistics is a type of mathematical analysis that employs quantified models and representations to analyse a set of experimental data or real-world studies. Data processing is […].

Statistics

Statistics Data Science Experimentation Publishing

All about Statistical Modeling

Analytics Vidhya

DECEMBER 14, 2020

This article was published as a part of the Data Science Blogathon. What is a Statistical Model? “Modeling is an art, as well as […]. The post All about Statistical Modeling appeared first on Analytics Vidhya.

Statistics

Statistics Modeling Data Science Publishing

Textual Statistical Analysis Using pyNLPL (Pineapple) Library

Analytics Vidhya

JULY 17, 2024

Introduction: Statistical analysis of text is one of the important steps of text pre-processing. It helps us understand our text data in a deep, mathematical way. This type of analysis can reveal hidden patterns and the weight of specific words in a sentence, and overall helps in building good language models.

Statistics

Statistics Modeling Analytics IT

Boxing and Unboxing of Statistical Models with Gaussian Learning

Analytics Vidhya

OCTOBER 7, 2020

This article was published as a part of the Data Science Blogathon. “[…] Values offer Focus amidst the Chaos” – Glenn C. Stewart. Introduction: Joseph […]. The post Boxing and Unboxing of Statistical Models with Gaussian Learning appeared first on Analytics Vidhya.

Statistics

Statistics Modeling Data Science Publishing

Building Language Models in NLP

Analytics Vidhya

JANUARY 3, 2022

Introduction: A language model in NLP is a probabilistic statistical model that determines the probability of a given sequence of words occurring in a sentence based on the previous words. The post Building Language Models in NLP appeared first on Analytics Vidhya.

Modeling

Modeling Statistics Data Science Publishing
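
As a rough illustration of the idea in the teaser above (the probability of a word sequence built up from each word given the previous words), here is a minimal bigram sketch. It is not taken from the linked article: the toy corpus, the function names `train_bigram` and `sequence_probability`, and the `<s>`/`</s>` boundary markers are all assumptions made for this example, and no smoothing is applied.

```python
from collections import Counter


def train_bigram(sentences):
    """Count context words and bigrams over whitespace-tokenized sentences."""
    contexts, bigrams = Counter(), Counter()
    for sentence in sentences:
        # <s> and </s> are hypothetical start/end markers for this sketch.
        tokens = ["<s>"] + sentence.lower().split() + ["</s>"]
        contexts.update(tokens[:-1])              # every token that has a successor
        bigrams.update(zip(tokens, tokens[1:]))   # (previous word, word) pairs
    return contexts, bigrams


def sequence_probability(sentence, contexts, bigrams):
    """Multiply P(word | previous word) along the sentence (no smoothing)."""
    tokens = ["<s>"] + sentence.lower().split() + ["</s>"]
    prob = 1.0
    for prev, word in zip(tokens, tokens[1:]):
        if contexts[prev] == 0:
            return 0.0  # unseen context: the unsmoothed estimate is zero
        prob *= bigrams[(prev, word)] / contexts[prev]
    return prob


corpus = ["the cat sat", "the cat ran", "the dog sat"]  # toy data, assumed
contexts, bigrams = train_bigram(corpus)
p_seen = sequence_probability("the cat sat", contexts, bigrams)
p_unseen = sequence_probability("the dog ran", contexts, bigrams)
print(p_seen, p_unseen)
```

A bigram model conditions only on the single previous word; practical language models use longer contexts and some form of smoothing so that unseen word pairs do not force the whole probability to zero.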

Building an end-to-end Polynomial Regression Model in R

Analytics Vidhya

NOVEMBER 23, 2021

Regression analysis is used to solve prediction problems based on statistical parameters of the data. In this article, we will look at the use of a polynomial regression model on a simple example using real statistical data. The post Building an end-to-end Polynomial Regression Model in R appeared first on Analytics Vidhya.

Modeling

Modeling Statistics Data Science Publishing

What is a Bernoulli Distribution?

Analytics Vidhya

NOVEMBER 20, 2024

A key idea in data science and statistics is the Bernoulli distribution, named for the Swiss mathematician Jacob Bernoulli. It is crucial to probability theory and a foundational element for more intricate statistical models, ranging from machine learning algorithms to customer behaviour prediction.

Statistics

Statistics Machine Learning Data Science Modeling

Statistics 101: Introduction to the Central Limit Theorem (with implementation in R)

Analytics Vidhya

MAY 2, 2019

Introduction: What is one of the most important and core concepts of statistics that enables us to do predictive modeling, and yet it often […]. The post Statistics 101: Introduction to the Central Limit Theorem (with implementation in R) appeared first on Analytics Vidhya.

Statistics

Statistics Predictive Modeling Modeling Analytics

Introduction to Statistics Using the R Programming Language

Analytics Vidhya

AUGUST 29, 2023

Whether you’re delving into descriptive statistics, probability distributions, or sophisticated regression models, R’s versatility and extensive packages facilitate seamless statistical exploration. R, an open-source tool, empowers data enthusiasts to explore, analyze, and visualize data with precision.

Statistics

Statistics Visualization Modeling Analytics

How to Run Binary Logistic Regression Model with Julius?

Analytics Vidhya

JUNE 8, 2024

Introduction: Logistic regression is a statistical technique used to model the probability of a binary outcome (a categorical variable that can take on two distinct values) based on one or more predictor variables. The post How to Run Binary Logistic Regression Model with Julius? appeared first on Analytics Vidhya.

Modeling

Modeling Statistics Analytics Data Science
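
To make the teaser above concrete, here is a minimal sketch of how a fitted binary logistic regression turns predictor values into a probability. It does not use Julius or reproduce the linked article: the function name `predict_probability`, the intercept, the coefficients, and the input values are all made-up assumptions for illustration.

```python
import math


def predict_probability(features, intercept, coefficients):
    """Return P(outcome = 1) as the logistic (sigmoid) of a linear score."""
    z = intercept + sum(c * x for c, x in zip(coefficients, features))
    # Numerically stable sigmoid: avoid overflow in exp() for large |z|.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


# Hypothetical fitted parameters: two predictors, values chosen arbitrarily.
intercept = -4.0
coefficients = [0.05, 1.2]

p = predict_probability([60, 1], intercept, coefficients)
label = int(p >= 0.5)  # 0.5 is a common default threshold, not a requirement
print(round(p, 4), label)
```

In practice the intercept and coefficients are estimated from data, typically by maximum likelihood, rather than written by hand as they are here.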

Gain Customer’s Confidence in ML Model Predictions

Analytics Vidhya

APRIL 4, 2022

Introduction: One of the key challenges in machine learning is the explainability of the ML model that we are building. In general, an ML model is a black box. As data scientists, we may understand the algorithm and statistical methods used behind the scenes. […].

Modeling

Modeling Machine Learning Statistics Data Science

The Science of T20 Cricket: Decoding Player Performance with Predictive Modeling

Analytics Vidhya

JUNE 1, 2023

With franchise leagues like IPL and BBL, teams rely on statistical models and tools for a competitive edge. The analysis benefits fantasy […] The post The Science of T20 Cricket: Decoding Player Performance with Predictive Modeling appeared first on Analytics Vidhya.

Predictive Modeling

Predictive Modeling Modeling Statistics Optimization

How to Build Your Time Series Model?

Analytics Vidhya

FEBRUARY 20, 2023

Before we take up a time series problem, we must familiarise ourselves with the concept of forecasting. So now the question is, what is a time series? Time series analysis is a statistical technique used to analyze data […] The post How to Build Your Time Series Model? appeared first on Analytics Vidhya.

Modeling

Modeling Statistics Forecasting Analytics

What is F-Beta Score?

Analytics Vidhya

DECEMBER 2, 2024

In machine learning and statistical modeling, how a model is assessed significantly affects the results. On imbalanced datasets, accuracy falls short because it does not capture the trade-off between precision and recall.

Machine Learning

Machine Learning Statistics Measurement Modeling
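
The trade-off described in the entry above can be sketched with the standard F-beta definition, F_beta = (1 + beta^2) * P * R / (beta^2 * P + R), where P is precision and R is recall. The confusion-matrix counts below are invented for illustration and the helper name `f_beta` is an assumption; the code is not from the linked article.

```python
def f_beta(tp, fp, fn, beta=1.0):
    """F-beta from confusion-matrix counts; beta > 1 weights recall more."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denominator = beta ** 2 * precision + recall
    if denominator == 0:
        return 0.0
    return (1 + beta ** 2) * precision * recall / denominator


# Hypothetical imbalanced test set: 16 positives among 1000 examples.
tp, fp, fn, tn = 8, 2, 8, 982

accuracy = (tp + tn) / (tp + fp + fn + tn)   # looks excellent: 0.99
f1 = f_beta(tp, fp, fn, beta=1.0)            # balances precision and recall
f2 = f_beta(tp, fp, fn, beta=2.0)            # penalizes the missed positives more
f_half = f_beta(tp, fp, fn, beta=0.5)        # favors precision

print(accuracy, round(f1, 4), round(f2, 4), round(f_half, 4))
```

With these counts precision is 0.8 and recall is 0.5, so accuracy stays at 0.99 even though half of the positives are missed, while F2 drops below F1 because it weights recall more heavily.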

What are Mean and Variance of the Normal Distribution?

Analytics Vidhya

NOVEMBER 25, 2024

The normal distribution, also known as the Gaussian distribution, is one of the most widely used probability distributions in statistics and machine learning. Understanding its core properties, mean and variance, is important for interpreting data and modelling real-world phenomena.

Statistics

Statistics Machine Learning Modeling Analytics

Model Collapse: An Experiment

O'Reilly on Data

OCTOBER 24, 2023

At some point in the near future, new models will be trained on code that they have written. At least one research group has experimented with training a generative model on content generated by generative AI, and has found that the output, over successive generations, was more tightly constrained, and less likely to be original or unique.

Modeling

Modeling Statistics Reporting Software

A brief introduction to Multilevel Modelling

Analytics Vidhya

JANUARY 20, 2022

Table of contents: Introduction; Multilevel Models; Advantages of Multilevel models; When do we use Multilevel Models; Types of Multilevel Model; Random intercept model; Random coefficient model; Hypothesis testing: Likelihood Ratio Testing; End-Note. Introduction: Suppose you have a dataset of faculty salaries of a university […].

Modeling

Modeling Testing Data Science Publishing

Gaussian Naive Bayes Algorithm for Credit Risk Modelling

Analytics Vidhya

MARCH 1, 2022

Credit evaluations have progressed from being subjective decisions by the bank’s credit experts to a more statistically advanced evaluation. Banks rapidly recognize the increased need for comprehensive credit risk […]. The post Gaussian Naive Bayes Algorithm for Credit Risk Modelling appeared first on Analytics Vidhya.

Risk

Risk Modeling Statistics Data Science

Introduction to Linear Model for Optimization

Analytics Vidhya

DECEMBER 23, 2021

Optimization aims to reduce training errors, and Deep Learning Optimization is concerned with finding a suitable model. Another goal of optimization in deep learning is to minimize generalization errors. In this article, we will […]. The post Introduction to Linear Model for Optimization appeared first on Analytics Vidhya.

Optimization

Optimization Modeling Deep Learning Data Science

Handling Missing Values with Random Forest

Analytics Vidhya

MAY 4, 2022

Introduction to Random Forest: Missing values have always been a concern for any statistical analysis. They significantly reduce the study’s statistical power, which may lead to faulty conclusions. Most of the algorithms used in statistical modelling, such as linear regression, logistic regression, […].

Statistics

Statistics Data Science Publishing Modeling

SARIMA Model for Forecasting Currency Exchange Rates

Analytics Vidhya

JUNE 20, 2023

Currency forecasting may assist people, corporations, and financial organizations in making educated financial decisions. SARIMA is an excellent time series forecasting technique for estimating time series […] The post SARIMA Model for Forecasting Currency Exchange Rates appeared first on Analytics Vidhya.

Forecasting

Forecasting Modeling Analytics Statistics

How Machine Learning Models Fail to Deliver in Real-World Scenarios

Analytics Vidhya

SEPTEMBER 29, 2020

This article was published as a part of the Data Science Blogathon. Introduction: Yesterday, my brother broke an antique at home. I began to […]. The post How Machine Learning Models Fail to Deliver in Real-World Scenarios appeared first on Analytics Vidhya.

Machine Learning

Machine Learning Modeling Data Science Publishing

Unbundling the Graph in GraphRAG

O'Reilly on Data

NOVEMBER 19, 2024

Reasons for using RAG are clear: large language models (LLMs), which are effectively syntax engines, tend to “hallucinate” by inventing answers from pieces of their training data. See the primary sources “REALM: Retrieval-Augmented Language Model Pre-Training” by Kelvin Guu, et al., […] at Facebook, both from 2020.

Unstructured Data

Unstructured Data Structured Data Statistics Modeling

Underlying Engineering Behind Alexa’s Contextual ASR

Analytics Vidhya

SEPTEMBER 17, 2022

This article was published as a part of the Data Science Blogathon. Introduction: Conventionally, an automatic speech recognition (ASR) system leverages a single statistical language model to rectify ambiguities, regardless of context. However, we can improve the system’s accuracy by leveraging contextual information.

Metadata

Metadata Statistics Data Science Publishing

External Data Supports More Accurate Planning

David Menninger's Analyst Perspectives

JANUARY 15, 2025

I use the term external data to include any information about the world outside an organization (including economic and market statistics), competitors (such as pricing and locations) and customers. This provides useful information about what to do next time to achieve a better outcome and how to refine the model to improve its accuracy.

Predictive Modeling

Predictive Modeling Forecasting Predictive Analytics Statistics

Why it’s hard to design fair machine learning models

O'Reilly on Data

SEPTEMBER 27, 2018

They recently wrote a survey paper, “A Critical Review of Fair Machine Learning,” where they carefully examined the standard statistical tools used to check for fairness in machine learning models. Continue reading Why it’s hard to design fair machine learning models.

Machine Learning

Machine Learning Modeling Statistics IT

HOW TO CHOOSE EVALUATION METRICS FOR CLASSIFICATION MODEL

Analytics Vidhya

OCTOBER 11, 2020

This article was published as a part of the Data Science Blogathon. INTRODUCTION: Yay!! So you have successfully built your classification model. What should […]. The post HOW TO CHOOSE EVALUATION METRICS FOR CLASSIFICATION MODEL appeared first on Analytics Vidhya.

Metrics

Metrics Modeling Data Science Publishing

Q-Q plot – Ensure Your ML Model is Based on the Right Distribution

Analytics Vidhya

SEPTEMBER 6, 2021

As the name suggests, Q-Q (quantile-quantile) plots show the quantiles of a sample distribution against the quantiles of a theoretical distribution. The post Q-Q plot – Ensure Your ML Model is Based on the Right Distribution appeared first on Analytics Vidhya.

Modeling

Modeling Data Science Publishing Analytics
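
For readers who want to see what "quantiles of a sample against quantiles of a theoretical distribution" means in practice, the sketch below computes the points of a normal Q-Q plot without drawing it. It is an assumption-laden illustration rather than the linked article's code: the helper name `qq_points`, the sample values, and the plotting positions (i - 0.5) / n are choices made here, and other position formulas are also in common use.

```python
from statistics import NormalDist


def qq_points(sample):
    """Pair each ordered sample value with a standard-normal quantile."""
    n = len(sample)
    ordered = sorted(sample)            # empirical quantiles
    standard_normal = NormalDist()      # mean 0, standard deviation 1
    points = []
    for i, value in enumerate(ordered, start=1):
        p = (i - 0.5) / n               # plotting position, strictly inside (0, 1)
        points.append((standard_normal.inv_cdf(p), value))
    return points


sample = [2.1, -0.3, 0.8, 1.5, -1.2]    # made-up observations
points = qq_points(sample)
for theoretical, observed in points:
    print(f"{theoretical:+.3f}  {observed:+.3f}")
```

If the sample is roughly normal, these points fall close to a straight line when plotted; systematic curvature suggests skew or heavy tails.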

How to Create an ARIMA Model for Time Series Forecasting in Python

Analytics Vidhya

OCTOBER 28, 2020

This article was published as a part of the Data Science Blogathon. Introduction: A popular and widely used statistical method for time series forecasting […]. The post How to Create an ARIMA Model for Time Series Forecasting in Python appeared first on Analytics Vidhya.

Forecasting

Forecasting Modeling Statistics Data Science

Complete R Tutorial To Build Probabilistic Graphical Models!

Analytics Vidhya

OCTOBER 9, 2020

This article was published as a part of the Data Science Blogathon. Introduction: Probabilistic Graphical Models (PGM) capture the complex relationships between random variables. The post Complete R Tutorial To Build Probabilistic Graphical Models! appeared first on Analytics Vidhya.

Modeling

Modeling Data Science Publishing Analytics

Analysis of Imbalanced Datasets – Sample Size vs Accuracy

Analytics Vidhya

JULY 5, 2022

This article was published as a part of the Data Science Blogathon. Introduction to Imbalanced Datasets: The accuracy achieved by many of the machine learning models using traditional statistical algorithms increases by just around 2% or so when the size of the training dataset is increased from 20% to 80%.

Statistics

Statistics Machine Learning Data Science Publishing

Understanding the Log-normal Distribution

Analytics Vidhya

JUNE 17, 2024

Introduction: The log-normal distribution is a fascinating statistical concept commonly used to model data that exhibit right-skewed behavior. This distribution has wide-ranging applications in various fields, such as biology, finance, and engineering.

Statistics

Statistics Finance Modeling Analytics

Proposals for model vulnerability and security

O'Reilly on Data

MARCH 20, 2019

Apply fair and private models, white-hat and forensic model debugging, and common sense to protect machine learning models from malicious actors. Like many others, I’ve known for some time that machine learning models themselves could pose security risks. This is like a denial-of-service (DOS) attack on your model itself.

Modeling

Modeling Machine Learning Predictive Modeling Consulting

Why you should care about debugging machine learning models

O'Reilly on Data

DECEMBER 12, 2019

Not least is the broadening realization that ML models can fail. And that’s why model debugging, the art and science of understanding and fixing problems in ML models, is so critical to the future of ML. Because all ML models make mistakes, everyone who cares about ML should also care about model debugging. [1]

Machine Learning

Machine Learning Modeling Testing Risk Management

5 Statistical Paradoxes Data Scientists Should Know

KDnuggets

FEBRUARY 23, 2023

Knowing these 5 statistical paradoxes is essential for data scientists to improve their analyses and machine learning models.

Statistics

Statistics Machine Learning Modeling Data Science

Decluttering the performance measures of classification models

Analytics Vidhya

DECEMBER 16, 2020

This article was published as a part of the Data Science Blogathon. Introduction: There are so many performance evaluation measures when it comes to […]. The post Decluttering the performance measures of classification models appeared first on Analytics Vidhya.

Measurement

Measurement Modeling Data Science Publishing

11 Important Model Evaluation Metrics for Machine Learning Everyone should know

Analytics Vidhya

AUGUST 5, 2019

Overview: Evaluating a model is a core part of building an effective machine learning model. There are several evaluation metrics, like confusion matrix, cross-validation, […]. The post 11 Important Model Evaluation Metrics for Machine Learning Everyone should know appeared first on Analytics Vidhya.

Machine Learning

Machine Learning Metrics Modeling Analytics

What is the Difference Between Covariance and Correlation?

Analytics Vidhya

JULY 2, 2023

Introduction: Understanding the relationships among variables is integral to statistics. Everything from data-driven decision-making to scientific discoveries to predictive modeling depends on our ability to disentangle the hidden connections and patterns within complex datasets.

Statistics

Statistics Predictive Modeling Data-driven Modeling

Geometrical Approach To Understand Logistic Regression

Analytics Vidhya

JULY 23, 2021

This article was published as a part of the Data Science Blogathon. Introduction: Logistic Regression is another statistical model which is used for […]. The post Geometrical Approach To Understand Logistic Regression appeared first on Analytics Vidhya.

Statistics

Statistics Data Science Publishing Modeling

20+ Questions to Test your Skills on Logistic Regression

Analytics Vidhya

MAY 28, 2021

This article was published as a part of the Data Science Blogathon. Introduction: Logistic Regression, a statistical model, is a very popular and […]. The post 20+ Questions to Test your Skills on Logistic Regression appeared first on Analytics Vidhya.

Testing

Testing Statistics Data Science Publishing

Build Better and Accurate Clusters with Gaussian Mixture Models

Analytics Vidhya

OCTOBER 30, 2019

Overview: Gaussian Mixture Models are a powerful clustering algorithm. Understand how Gaussian Mixture Models work and how to implement them in Python. We’ll also […]. The post Build Better and Accurate Clusters with Gaussian Mixture Models appeared first on Analytics Vidhya.

Modeling

Modeling Analytics Structured Data Statistics

Creating Linear Model, It’s Equation and Visualization for Analysis

Analytics Vidhya

NOVEMBER 25, 2020

This article was published as a part of the Data Science Blogathon. Introduction: Have you ever been tasked with visualizing the relationship between each […]. The post Creating Linear Model, It’s Equation and Visualization for Analysis appeared first on Analytics Vidhya.

Visualization

Visualization Modeling Data Science Publishing
