#18 Machine Learning & Data Science Challenge 18

What are Naive Bayes Classification and Gaussian Naive Bayes?

  • Bayes’ Theorem gives the probability of an event occurring, given that another event has already occurred.

  • Bayes’ theorem is stated mathematically as the following equation:
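
$$P(A \mid B) = \frac{P(B \mid A)\,P(A)}{P(B)}$$

  • Here P(A | B) is the posterior probability of event A given that event B has occurred, P(B | A) is the likelihood, P(A) is the prior probability of A, and P(B) is the probability of B (the evidence).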

Naive Bayes Classification:

  1. We assume that no pair of features is dependent. For example, the temperature being ‘Hot’ has nothing to do with the humidity, and the outlook being ‘Rainy’ does not affect the wind. Hence, the features are assumed to be independent.

  2. Secondly, each feature is given the same weight (or importance). For example, knowing only the temperature and humidity cannot predict the outcome accurately; no attribute is considered irrelevant, and each is assumed to contribute equally to the outcome. Together, these two assumptions give the ‘naive’ factorization sketched below.
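
As a minimal sketch of that factorization (the weather table below is a made-up placeholder, and Laplace smoothing is added so that unseen feature values do not zero out the product), each class is scored as its prior multiplied by the per-feature conditional probabilities:

```python
# Minimal Naive Bayes sketch for categorical features.
# The weather data below is an illustrative placeholder, not a real dataset.
from collections import Counter, defaultdict

# Toy training rows: (outlook, temperature, humidity, wind) -> play
data = [
    ("Sunny", "Hot", "High", "Weak", "No"),
    ("Sunny", "Hot", "High", "Strong", "No"),
    ("Overcast", "Hot", "High", "Weak", "Yes"),
    ("Rainy", "Mild", "High", "Weak", "Yes"),
    ("Rainy", "Cool", "Normal", "Weak", "Yes"),
    ("Rainy", "Cool", "Normal", "Strong", "No"),
    ("Overcast", "Cool", "Normal", "Strong", "Yes"),
]

n_features = 4
class_counts = Counter(row[-1] for row in data)     # for the prior P(y)
value_counts = defaultdict(Counter)                 # (feature, class) -> value counts
for *features, label in data:
    for j, value in enumerate(features):
        value_counts[(j, label)][value] += 1
n_values = [len({row[j] for row in data}) for j in range(n_features)]  # vocabulary size per feature

def predict(sample):
    """Score each class as P(y) * product over features of P(x_j | y) -- the 'naive' step."""
    scores = {}
    for label, n_label in class_counts.items():
        score = n_label / len(data)                 # prior P(y)
        for j, value in enumerate(sample):
            count = value_counts[(j, label)][value]
            score *= (count + 1) / (n_label + n_values[j])   # Laplace-smoothed P(x_j | y)
        scores[label] = score
    return max(scores, key=scores.get)

print(predict(("Sunny", "Cool", "High", "Strong")))  # most probable class label
```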

Gaussian Naive Bayes:

  • Continuous values associated with each feature are assumed to be distributed according to a Gaussian distribution.

  • A Gaussian distribution is also called a Normal distribution.

  • When plotted, it gives a bell-shaped curve that is symmetric about the mean of the feature values.

  • Fitting a Gaussian Naive Bayes model is as simple as calculating the mean and standard deviation of each input variable (x) for each class value.

$$\text{mean}(x) = \frac{1}{n} \sum_{i=1}^{n} x_i$$

  • Where n is the number of instances and x_i is the value of the input variable for the i-th instance in your training data.

  • We can calculate the standard deviation using the following equation:

$$\text{standard deviation}(x) = \sqrt{\frac{1}{n} \sum_{i=1}^{n} \left(x_i - \text{mean}(x)\right)^2}$$
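
To make the two estimates concrete, here is a minimal Gaussian Naive Bayes sketch using only NumPy (the toy feature values are illustrative placeholders): the per-class mean and standard deviation are computed as above, then a new sample is scored with the Gaussian density combined with the class prior via Bayes’ theorem.

```python
# Minimal Gaussian Naive Bayes sketch using only NumPy.
# The toy data below is an illustrative placeholder, not a real dataset.
import numpy as np

# Two continuous features per row, binary class labels
X = np.array([[6.0, 180.0], [5.9, 190.0], [5.6, 170.0],
              [5.0, 100.0], [5.5, 150.0], [5.4, 130.0]])
y = np.array([0, 0, 0, 1, 1, 1])

classes = np.unique(y)
priors = {c: np.mean(y == c) for c in classes}
means = {c: X[y == c].mean(axis=0) for c in classes}  # mean(x) = (1/n) * sum(x_i), per class
stds = {c: X[y == c].std(axis=0) for c in classes}    # population std, matching the 1/n formula above

def gaussian_pdf(x, mu, sigma):
    """Gaussian (Normal) density evaluated element-wise for each feature."""
    return np.exp(-((x - mu) ** 2) / (2 * sigma ** 2)) / (np.sqrt(2 * np.pi) * sigma)

def predict(x):
    """Score each class as prior * product of per-feature Gaussian likelihoods."""
    scores = {c: priors[c] * np.prod(gaussian_pdf(x, means[c], stds[c])) for c in classes}
    return max(scores, key=scores.get)

print(predict(np.array([5.8, 175.0])))  # most probable class for the new sample
```

In practice, scikit-learn’s sklearn.naive_bayes.GaussianNB implements this same fit-then-score procedure, with numerical safeguards such as variance smoothing.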
