Article, Modeling and Statistics

End to End Statistics for Data Science

Analytics Vidhya

OCTOBER 29, 2021

This article was published as a part of the Data Science Blogathon. Introduction to Statistics: Statistics is a type of mathematical analysis that employs quantified models and representations to analyse a set of experimental data or real-world studies. Data processing is […].

Statistics

Statistics Data Science Experimentation Publishing

Statistical Modelling and Identifiability of Parameters

Analytics Vidhya

MAY 27, 2021

This article was published as a part of the Data Science Blogathon. Introduction: Identifiability is a very important property of statistical parameters. The post Statistical Modelling and Identifiability of Parameters appeared first on Analytics Vidhya.

Statistics

Statistics Modeling Data Science Publishing

All about Statistical Modeling

Analytics Vidhya

DECEMBER 14, 2020

This article was published as a part of the Data Science Blogathon. What is a Statistical Model? “Modeling is an art, as well as. The post All about Statistical Modeling appeared first on Analytics Vidhya.

Statistics

Statistics Modeling Data Science Publishing

Building Language Models in NLP

Analytics Vidhya

JANUARY 3, 2022

This article was published as a part of the Data Science Blogathon. Introduction: A language model in NLP is a probabilistic statistical model that determines the probability of a given sequence of words occurring in a sentence based on the previous words.
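
As a rough illustration of the definition in this teaser, the sketch below estimates the probability of a word sequence from bigram counts, so that each word is conditioned only on the word before it. It is not taken from the linked article: the function names, the `<s>` and `</s>` boundary markers, the toy corpus, and the absence of smoothing are all assumptions made for the example.

```python
from collections import Counter, defaultdict

def train_bigram(sentences):
    """Count how often each word follows each previous word."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        # <s> and </s> are hypothetical start/end markers for this sketch.
        tokens = ["<s>"] + sentence.lower().split() + ["</s>"]
        for prev, word in zip(tokens, tokens[1:]):
            counts[prev][word] += 1
    return counts

def sequence_probability(counts, sentence):
    """Multiply P(word | previous word) across the sentence (no smoothing)."""
    tokens = ["<s>"] + sentence.lower().split() + ["</s>"]
    prob = 1.0
    for prev, word in zip(tokens, tokens[1:]):
        following = counts.get(prev)
        if not following:
            return 0.0  # the previous word was never seen in training
        prob *= following[word] / sum(following.values())
    return prob

# Toy corpus, invented for illustration only.
corpus = ["the cat sat", "the cat ran", "the dog sat"]
model = train_bigram(corpus)
```

Calling `sequence_probability(model, "the cat sat")` multiplies the four conditional probabilities along the sentence. Any sentence containing a word pair that never appears in the corpus gets probability zero, which is why practical language models add smoothing or back-off.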

Modeling

Modeling Statistics Data Science Publishing

What is a Bernoulli Distribution?

Analytics Vidhya

NOVEMBER 20, 2024

A key idea in data science and statistics is the Bernoulli distribution, named for the Swiss mathematician Jacob Bernoulli. It is crucial to probability theory and a foundational element for more intricate statistical models, ranging from machine learning algorithms to customer behaviour prediction.

Statistics

Statistics Machine Learning Data Science Modeling

Building an end-to-end Polynomial Regression Model in R

Analytics Vidhya

NOVEMBER 23, 2021

This article was published as a part of the Data Science Blogathon. Regression analysis is used to solve prediction problems based on the statistical parameters of data. In this article, we will look at the use of a polynomial regression model on a simple example using real statistical data.

Modeling

Modeling Statistics Data Science Publishing

Boxing and Unboxing of Statistical Models with Gaussian Learning

Analytics Vidhya

OCTOBER 7, 2020

This article was published as a part of the Data Science Blogathon. The post Boxing and Unboxing of Statistical Models with Gaussian Learning appeared first on Analytics Vidhya. Values offer Focus amidst the Chaos” – Glenn C. Stewart Introduction Joseph.

Statistics

Statistics Modeling Data Science Publishing

Introduction to Statistics Using the R Programming Language

Analytics Vidhya

AUGUST 29, 2023

From foundational concepts to advanced techniques, this article is your comprehensive guide. Whether you’re delving into descriptive statistics, probability distributions, or sophisticated regression models, R’s versatility and extensive packages facilitate seamless statistical exploration.

Statistics

Statistics Visualization Modeling Analytics

What are Mean and Variance of the Normal Distribution?

Analytics Vidhya

NOVEMBER 25, 2024

The normal distribution, also known as the Gaussian distribution, is one of the most widely used probability distributions in statistics and machine learning. Understanding its core properties, mean and variance, is important for interpreting data and modelling real-world phenomena.

Statistics

Statistics Machine Learning Modeling Analytics

Gain Customer’s Confidence in ML Model Predictions

Analytics Vidhya

APRIL 4, 2022

This article was published as a part of the Data Science Blogathon. Introduction: One of the key challenges with a machine learning model is the explainability of the model that we are building. In general, an ML model is a black box.

Modeling

Modeling Machine Learning Statistics Data Science

The Science of T20 Cricket: Decoding Player Performance with Predictive Modeling

Analytics Vidhya

JUNE 1, 2023

With franchise leagues like IPL and BBL, teams rely on statistical models and tools for competitive edge. This article explores how data analytics optimizes strategies by leveraging player performances and opposition weaknesses. Introduction: Cricket embraces data analytics for strategic advantage.

Predictive Modeling

Predictive Modeling Modeling Statistics Optimization

How to Build Your Time Series Model?

Analytics Vidhya

FEBRUARY 20, 2023

Introduction: In this article, our focus will be on learning how to solve a time series problem. Time series analysis is a statistical technique used to analyze data […]. Before we take up a time series problem, we must familiarise ourselves with the concept of forecasting.

Modeling

Modeling Statistics Forecasting Analytics

Introduction to Linear Model for Optimization

Analytics Vidhya

DECEMBER 23, 2021

This article was published as a part of the Data Science Blogathon. Optimization: Optimization provides a way to minimize the loss function. Optimization aims to reduce training errors, and Deep Learning Optimization is concerned with finding a suitable model. In this article, we will […].

Optimization

Optimization Modeling Deep Learning Data Science

Handling Missing Values with Random Forest

Analytics Vidhya

MAY 4, 2022

This article was published as a part of the Data Science Blogathon. Introduction to Random Forest: Missing values have always been a concern for any statistical analysis. They significantly reduce the study’s statistical power, which may lead to faulty conclusions.

Statistics

Statistics Data Science Publishing Modeling

A brief introduction to Multilevel Modelling

Analytics Vidhya

JANUARY 20, 2022

This article was published as a part of the Data Science Blogathon. The post A brief introduction to Multilevel Modelling appeared first on Analytics Vidhya.

Modeling

Modeling Testing Data Science Publishing

How Machine Learning Models Fail to Deliver in Real-World Scenarios

Analytics Vidhya

SEPTEMBER 29, 2020

This article was published as a part of the Data Science Blogathon. The post How Machine Learning Models Fail to Deliver in Real-World Scenarios appeared first on Analytics Vidhya. Introduction: Yesterday, my brother broke an antique at home. I began to.

Machine Learning

Machine Learning Modeling Data Science Publishing

Gaussian Naive Bayes Algorithm for Credit Risk Modelling

Analytics Vidhya

MARCH 1, 2022

This article was published as a part of the Data Science Blogathon. Credit evaluations have progressed from being subjective decisions by the bank’s credit experts to a more statistically advanced evaluation. The post Gaussian Naive Bayes Algorithm for Credit Risk Modelling appeared first on Analytics Vidhya.

Risk

Risk Modeling Statistics Data Science

Underlying Engineering Behind Alexa’s Contextual ASR

Analytics Vidhya

SEPTEMBER 17, 2022

This article was published as a part of the Data Science Blogathon. Introduction: Conventionally, an automatic speech recognition (ASR) system leverages a single statistical language model to rectify ambiguities, regardless of context. However, we can improve the system’s accuracy by leveraging contextual information.

Metadata

Metadata Statistics Data Science Publishing

Unbundling the Graph in GraphRAG

O'Reilly on Data

NOVEMBER 19, 2024

Reasons for using RAG are clear: large language models (LLMs), which are effectively syntax engines, tend to “hallucinate” by inventing answers from pieces of their training data. See the primary sources “REALM: Retrieval-Augmented Language Model Pre-Training” by Kelvin Guu, et al., at Google, and “Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks” by Patrick Lewis, et al., at Facebook, both from 2020. What is GraphRAG?

Unstructured Data

Unstructured Data Structured Data Modeling Statistics

Q-Q plot – Ensure Your ML Model is Based on the Right Distribution

Analytics Vidhya

SEPTEMBER 6, 2021

This article was published as a part of the Data Science Blogathon. Introduction: Q-Q plots are also known as Quantile-Quantile plots. The post Q-Q plot – Ensure Your ML Model is Based on the Right Distribution appeared first on Analytics Vidhya.

Modeling

Modeling Data Science Publishing Analytics

HOW TO CHOOSE EVALUATION METRICS FOR CLASSIFICATION MODEL

Analytics Vidhya

OCTOBER 11, 2020

This article was published as a part of the Data Science Blogathon. INTRODUCTION: Yay!! So you have successfully built your classification model. What should. The post HOW TO CHOOSE EVALUATION METRICS FOR CLASSIFICATION MODEL appeared first on Analytics Vidhya.

Metrics

Metrics Modeling Data Science Publishing

Analysis of Imbalanced Datasets – Sample Size vs Accuracy

Analytics Vidhya

JULY 5, 2022

This article was published as a part of the Data Science Blogathon. Introduction to Imbalanced Datasets: The accuracy achieved by many machine learning models using traditional statistical algorithms increases by only about 2% when the size of the training dataset is increased from 20% to 80%.

Statistics

Statistics Machine Learning Data Science Publishing

Understanding the Log-normal Distribution

Analytics Vidhya

JUNE 17, 2024

Introduction: The log-normal distribution is a fascinating statistical concept commonly used to model data that exhibit right-skewed behavior. This distribution has wide-ranging applications in various fields, such as biology, finance, and engineering.

Statistics

Statistics Finance Modeling Analytics

Complete R Tutorial To Build Probabilistic Graphical Models!

Analytics Vidhya

OCTOBER 9, 2020

This article was published as a part of the Data Science Blogathon. Introduction: Probabilistic Graphical Models (PGM) capture the complex relationships between random variables. The post Complete R Tutorial To Build Probabilistic Graphical Models! appeared first on Analytics Vidhya.

Modeling

Modeling Data Science Publishing Analytics

Geometrical Approach To Understand Logistic Regression

Analytics Vidhya

JULY 23, 2021

This article was published as a part of the Data Science Blogathon. Introduction: Logistic Regression is another statistical model which is used for. The post Geometrical Approach To Understand Logistic Regression appeared first on Analytics Vidhya.

Statistics

Statistics Data Science Publishing Modeling

Decluttering the performance measures of classification models

Analytics Vidhya

DECEMBER 16, 2020

This article was published as a part of the Data Science Blogathon. The post Decluttering the performance measures of classification models appeared first on Analytics Vidhya. Introduction: There are so many performance evaluation measures when it comes to.

Measurement

Measurement Modeling Data Science Publishing

How to Create an ARIMA Model for Time Series Forecasting in Python

Analytics Vidhya

OCTOBER 28, 2020

This article was published as a part of the Data Science Blogathon. Introduction A popular and widely used statistical method for time series forecasting. The post How to Create an ARIMA Model for Time Series Forecasting in Python appeared first on Analytics Vidhya.

Forecasting

Forecasting Modeling Statistics Data Science

20+ Questions to Test your Skills on Logistic Regression

Analytics Vidhya

MAY 28, 2021

This article was published as a part of the Data Science Blogathon. Introduction: Logistic Regression, a statistical model, is a very popular and. The post 20+ Questions to Test your Skills on Logistic Regression appeared first on Analytics Vidhya.

Testing

Testing Statistics Data Science Publishing

Creating Linear Model, It’s Equation and Visualization for Analysis

Analytics Vidhya

NOVEMBER 25, 2020

This article was published as a part of the Data Science Blogathon. The post Creating Linear Model, It’s Equation and Visualization for Analysis appeared first on Analytics Vidhya. Introduction: Have you ever been tasked with visualizing the relationship between each.

Visualization

Visualization Modeling Data Science Publishing

Complete Tutorial On Natural Language Processing using spaCy

Analytics Vidhya

SEPTEMBER 28, 2021

This article was published as a part of the Data Science Blogathon. In this tutorial, we will learn: Introduction to Natural Language Processing; Phases of Natural Language Processing; Introduction to spaCy; Installation and local setup of spaCy; Statistical models available in spaCy; Reading and processing text; Spans; Tokenization; Sentence Detection; Stopwords (..)

Statistics

Statistics Data Science Publishing Modeling

Metrics to Evaluate your Classification Model to take the right decisions

Analytics Vidhya

JULY 20, 2021

This article was published as a part of the Data Science Blogathon. Evaluation Metrics for Classification Problem. The post Metrics to Evaluate your Classification Model to take the right decisions appeared first on Analytics Vidhya. Abstract: The most.

Metrics

Metrics Modeling Data Science Publishing

Proposals for model vulnerability and security

O'Reilly on Data

MARCH 20, 2019

Apply fair and private models, white-hat and forensic model debugging, and common sense to protect machine learning models from malicious actors. Like many others, I’ve known for some time that machine learning models themselves could pose security risks. This is like a denial-of-service (DoS) attack on your model itself.

Modeling

Modeling Machine Learning Predictive Modeling Consulting

Why you should care about debugging machine learning models

O'Reilly on Data

DECEMBER 12, 2019

Not least is the broadening realization that ML models can fail. And that’s why model debugging, the art and science of understanding and fixing problems in ML models, is so critical to the future of ML. Because all ML models make mistakes, everyone who cares about ML should also care about model debugging. [1]

Machine Learning

Machine Learning Modeling Testing Risk Management

Logistic Regression and Maximum Likelihood: Explained Simply (Part I)

Analytics Vidhya

MARCH 29, 2022

This article was published as a part of the Data Science Blogathon. Introduction to Linear Regression: Linear regression is a statistical method that presumes a linear relationship between the input and the output variables. [Image 1: Sales vs Budget data with a linear model representation]

Statistics

Statistics Sales Data Science Publishing

Machine Learning Paradigms with Example

Analytics Vidhya

JULY 25, 2022

This article was published as a part of the Data Science Blogathon. Machine Learning is the method of teaching computer programs to do a specific task accurately (essentially a prediction) by training a predictive model using various statistical algorithms leveraging data. For […].

Machine Learning

Machine Learning Predictive Modeling Statistics Data Science

4 Ways to Evaluate your Machine Learning Model: Cross-Validation Techniques (with Python code)

Analytics Vidhya

MAY 21, 2021

This article was published as a part of the Data Science Blogathon. Introduction: Whenever we build any machine learning model, we feed it. The post 4 Ways to Evaluate your Machine Learning Model: Cross-Validation Techniques (with Python code) appeared first on Analytics Vidhya.

Machine Learning

Machine Learning Modeling Data Science Publishing

An Accurate Approach to Data Imputation

Analytics Vidhya

JULY 9, 2022

This article was published as a part of the Data Science Blogathon. Introduction: In order to build machine learning models that are highly generalizable to a wide range of test conditions, training models with high-quality data is essential.

Machine Learning

Machine Learning Data Science Data Collection Testing

The quest for high-quality data

O'Reilly on Data

JUNE 18, 2019

There has been a significant increase in our ability to build complex AI models for predictions, classifications, and various analytics tasks, and there’s an abundance of (fairly easy-to-use) tools that allow data scientists and analysts to provision complex models within days. Data integration and cleaning.

Machine Learning

Machine Learning Data Quality Statistics Modeling

Lets Open the Black Box of Random Forests

Analytics Vidhya

DECEMBER 4, 2020

This article was published as a part of the Data Science Blogathon. Introduction: Random Forests are always referred to as black-box models. Let’s try. The post Lets Open the Black Box of Random Forests appeared first on Analytics Vidhya.

Data Science

Data Science Publishing Modeling Analytics

KL Divergence: The Information Theory Metric that Revolutionized Machine Learning

Analytics Vidhya

JULY 9, 2024

Introduction: Few concepts in mathematics and information theory have impacted modern machine learning and artificial intelligence as profoundly as the Kullback-Leibler (KL) divergence. This powerful metric, also called relative entropy or information gain, has become indispensable in various fields, from statistical inference to deep learning.

Machine Learning

Machine Learning Metrics Deep Learning Statistics

Beyond the hype: Do you really need an LLM for your data?

CIO Business Intelligence

FEBRUARY 6, 2025

This article reflects some of what I’ve learned. The hype around large language models (LLMs) is undeniable. Think about it: LLMs like GPT-3 are incredibly complex deep learning models trained on massive datasets. Even basic predictive modeling can be done with lightweight machine learning in Python or R.

Unstructured Data

Unstructured Data Manufacturing Data Governance Sales

Sydney and the Bard

O'Reilly on Data

FEBRUARY 16, 2023

Large language models like ChatGPT and Google’s LaMDA aren’t designed to give correct results. Remember that these tools aren’t doing math, they’re just doing statistics on a huge body of text. You can train models that are optimized to be correct—but that’s a different kind of model.

Testing

Testing Statistics Modeling Optimization

Managing risk in machine learning

O'Reilly on Data

NOVEMBER 13, 2018

Considerations for a world where ML models are becoming mission critical. As the data community begins to deploy more machine learning (ML) models, I wanted to review some important considerations. Before I continue, it’s important to emphasize that machine learning is much more than building models. Model lifecycle management.

Machine Learning

Machine Learning Risk Management Statistics

Bivariate Feature Analysis in Python

Analytics Vidhya

MARCH 22, 2022

This article was published as a part of the Data Science Blogathon. Introduction: Feature analysis is an important step in building any predictive model. In this article, we will look into a very simple feature analysis technique that can be used in cases such as […].

Predictive Modeling

Predictive Modeling Data Science Publishing Modeling
