#36 Machine Learning & Data Science Challenge 36


Greetings.

I am a machine learning engineer based in India, possessing a sustained interest in machine learning since my undergraduate studies. I have completed Stanford University's machine learning course (Andrew Ng) via Coursera, and IBM's machine learning and deep learning curriculum. My current focus is on machine learning and data science projects, aiming to leverage my expertise for impactful, real-world problem-solving.

VIF (Variance Inflation Factor), Weight of Evidence & Information Value: why and when to use them?

Variance Inflation Factor:

  • It provides an index that measures how much the variance (the square of the estimate's standard deviation) of an estimated regression coefficient is increased because of collinearity.

  • VIF_j = 1 / (1 − R²_j), where R²_j is the coefficient of determination (multiple R²) of the regression of the j-th predictor X_j on all the other independent variables.

  • This auxiliary regression does not involve the dependent variable Y; R²_j only measures how well X_j is explained by the other predictors.

  • As a common rule of thumb, VIF > 5 indicates a multicollinearity problem; some practitioners use a cutoff of 10 instead.
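
The definition above can be sketched in a few lines, assuming NumPy is available. The function name `vif` and the synthetic data are illustrative and not from the original post: each predictor is regressed on the remaining predictors (with an intercept), and the VIF is computed from the resulting R².

```python
import numpy as np

def vif(X):
    """Return the VIF of each column of X (rows are observations).

    Assumes no column is constant; a perfectly collinear column
    yields float("inf").
    """
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    result = []
    for j in range(p):
        y = X[:, j]                                 # j-th predictor as the response
        others = np.delete(X, j, axis=1)            # all remaining predictors
        A = np.column_stack([np.ones(n), others])   # add an intercept column
        coef, *_ = np.linalg.lstsq(A, y, rcond=None)
        resid = y - A @ coef
        ss_res = float(resid @ resid)
        ss_tot = float(((y - y.mean()) ** 2).sum())
        r2 = 1.0 - ss_res / ss_tot                  # R² of X_j on the other predictors
        result.append(float("inf") if r2 >= 1.0 else 1.0 / (1.0 - r2))
    return result

# Synthetic data (an assumption for illustration): x3 is nearly a copy of x1.
rng = np.random.default_rng(0)
x1 = rng.normal(size=200)
x2 = rng.normal(size=200)
x3 = x1 + 0.1 * rng.normal(size=200)
X = np.column_stack([x1, x2, x3])

vifs = vif(X)
for name, value in zip(["x1", "x2", "x3"], vifs):
    print(f"{name}: VIF = {value:.2f}")
```

In this setup x1 and x3 should show large VIFs because each is almost fully explained by the other, while x2 should stay close to 1.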

Understanding VIF:

  • If the variance inflation factor of a predictor variable is 5, the variance of that predictor's coefficient is 5 times as large as it would be if the predictor were uncorrelated with the other predictor variables.

  • Equivalently, the standard error of that coefficient is about 2.24 times (√5 ≈ 2.236) as large as it would be if the predictor were uncorrelated with the other predictor variables.
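
The arithmetic behind the two statements above can be checked with a short sketch. The helper name `se_inflation` is hypothetical; it just takes the square root, since the standard error is the square root of the variance.

```python
import math

def se_inflation(vif):
    """Factor by which the coefficient's standard error grows for a given VIF."""
    return math.sqrt(vif)

for v in (1, 5, 10):
    print(f"VIF = {v:>2}: variance x{v}, standard error x{se_inflation(v):.3f}")
```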

Weight of evidence (WOE) and information value (IV) are simple, yet powerful techniques to perform variable transformation and selection.

The formulas for WOE and IV are:
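
The original formula did not survive in this copy of the post. As a hedged reconstruction, one widely used convention (assumed here, since the sign and direction vary between sources) defines WOE per bin i of a variable from the shares of "good" (non-event) and "bad" (event) cases, and IV as a sum over bins:

```latex
\mathrm{WOE}_i = \ln\!\left(\frac{\text{Good}_i / \text{Good}_{\text{total}}}{\text{Bad}_i / \text{Bad}_{\text{total}}}\right),
\qquad
\mathrm{IV} = \sum_{i}\left(\frac{\text{Good}_i}{\text{Good}_{\text{total}}} - \frac{\text{Bad}_i}{\text{Bad}_{\text{total}}}\right)\mathrm{WOE}_i
```

Under this convention each term of the sum is non-negative, so IV is zero only when every bin has the same share of good and bad cases.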

Here is a simple table that shows how to calculate these values.

The IV value can be used to select variables quickly.
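
To make the calculation concrete, here is a minimal sketch under stated assumptions: WOE for a bin is taken as ln(share of goods / share of bads), and IV as the sum over bins of (share of goods − share of bads) × WOE. The function name `woe_iv`, the bin labels, and the counts are all hypothetical and are not the table from the original post.

```python
import math

def woe_iv(bins):
    """Compute WOE per bin and the total IV.

    bins: list of (label, n_good, n_bad) tuples; all counts must be > 0,
    otherwise the logarithm is undefined.
    """
    total_good = sum(good for _, good, _ in bins)
    total_bad = sum(bad for _, _, bad in bins)
    rows, iv = [], 0.0
    for label, good, bad in bins:
        dist_good = good / total_good          # share of all goods in this bin
        dist_bad = bad / total_bad             # share of all bads in this bin
        woe = math.log(dist_good / dist_bad)
        contribution = (dist_good - dist_bad) * woe
        iv += contribution
        rows.append((label, woe, contribution))
    return rows, iv

# Hypothetical counts for a binned "age" variable (illustration only).
age_bins = [("<30", 100, 50), ("30-50", 300, 30), (">50", 100, 20)]
rows, iv = woe_iv(age_bins)
for label, woe, contribution in rows:
    print(f"{label:>6}  WOE = {woe:+.3f}  IV contribution = {contribution:.3f}")
print(f"IV = {iv:.3f}")
```

Computing IV this way for each candidate variable gives a single number per variable, which is what makes quick ranking and selection possible.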
