#32 Machine Learning & Data Science Challenge 32

What is KNN Classifier?

  • KNN stands for the K-Nearest Neighbours algorithm. It can be used for both classification and regression.

  • It is one of the simplest machine learning algorithms and is also known as lazy learning.

  • (Why "lazy"? Because it does not build a generalized model during training; the real work happens at testing time, when predictions are made. This makes testing costly in terms of time and memory.)

  • Also called instance-based or memory-based learning.

  • In k-NN classification, the output is a class membership.

  • An object is classified by a plurality vote of its neighbors, with the object being assigned to the class most common among its k nearest neighbors (k is a positive integer, typically small).

  • If k = 1, then the object is assigned to the class of that single nearest neighbor.

  • In k-NN regression, the output is the property value for the object. This value is the average of the values of its k nearest neighbors.

  • Common distance measures such as Euclidean, Manhattan, and Minkowski are valid only for continuous variables. For categorical variables, the Hamming distance should be used instead.
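The classification procedure described above (compute distances, find the k nearest neighbors, take a majority vote) can be sketched from scratch in a few lines of Python. This is a minimal illustration using Euclidean distance; the point names and data are made up for the example.

```python
from collections import Counter
import math

def euclidean(a, b):
    # Straight-line distance between two points of equal dimension.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def knn_predict(X_train, y_train, query, k=3):
    """Classify `query` by majority vote among its k nearest neighbors."""
    # "Lazy learning": nothing is done at training time; all distances
    # are computed at prediction time against the stored training set.
    dists = sorted(zip((euclidean(x, query) for x in X_train), y_train))
    # Labels of the k closest training points, then a plurality vote.
    k_labels = [label for _, label in dists[:k]]
    return Counter(k_labels).most_common(1)[0][0]

# Tiny 2-D example: two well-separated clusters labeled "A" and "B".
X = [(1, 1), (1, 2), (2, 1), (8, 8), (8, 9), (9, 8)]
y = ["A", "A", "A", "B", "B", "B"]
print(knn_predict(X, y, (2, 2), k=3))  # prints "A"
```

Note how the entire training set must be kept in memory and scanned on every prediction, which is exactly why testing is the expensive phase for KNN.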

How to choose the value of K:

  • K is a hyperparameter that needs to be chosen at model-building time.

  • A small number of neighbors gives the most flexible fit, with low bias but high variance; a large number of neighbors produces a smoother decision boundary, with lower variance but higher bias.

  • To avoid tied votes, choose an odd value of K when the number of classes is even. The most commonly used values are said to be 3 and 5.
