KNN is short for K-Nearest Neighbors. As the name indicates, this method uses a data point's K nearest neighbors to decide which label that point belongs to.

KNN keeps all the training data and uses it directly at prediction time, so there is no separate training step. Classification follows these steps (a minimal sketch is given after the list):

  1. Measure the distance between the test point and every training point
  2. Find the K closest neighbors
  3. Take a majority vote over the labels of those K neighbors

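A minimal from-scratch sketch of these three steps, assuming numeric features and Euclidean distance; the function and variable names below are illustrative, and NumPy is used only for the distance computation.

```python
import numpy as np
from collections import Counter

def knn_predict(X_train, y_train, x_test, k=3):
    """Predict the label of a single test point with plain KNN."""
    # Step 1: Euclidean distance from the test point to every training point
    distances = np.linalg.norm(X_train - x_test, axis=1)
    # Step 2: indices of the K closest training points
    nearest = np.argsort(distances)[:k]
    # Step 3: majority vote over the neighbors' labels
    votes = Counter(y_train[i] for i in nearest)
    return votes.most_common(1)[0][0]

# Illustrative usage with a tiny toy dataset
X_train = np.array([[1.0, 1.0], [1.2, 0.8], [5.0, 5.0], [5.2, 4.8]])
y_train = np.array([0, 0, 1, 1])
print(knn_predict(X_train, y_train, np.array([1.1, 0.9]), k=3))
```

With this toy data, two of the three nearest neighbors of `[1.1, 0.9]` carry label 0, so the vote returns 0.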
Key points to take into consideration (one common way to handle both is sketched after the list):

  1. How to choose the number of neighbors?
  2. How to avoid the influence of imbalanced sample sizes across labels?

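One common way to address both points, sketched here with scikit-learn; the iris dataset and the range of K values are purely illustrative. K is chosen by cross-validation, and votes are weighted by inverse distance so that a label with many training samples does not dominate just by crowding the neighborhood.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)

best_k, best_score = None, 0.0
for k in range(1, 16):
    # weights="distance" makes closer neighbors count more in the vote
    clf = KNeighborsClassifier(n_neighbors=k, weights="distance")
    # 5-fold cross-validated accuracy for this choice of K
    score = cross_val_score(clf, X, y, cv=5).mean()
    if score > best_score:
        best_k, best_score = k, score

print(f"best K = {best_k}, cross-validated accuracy = {best_score:.3f}")
```

Picking K this way trades off noise sensitivity (small K) against over-smoothing (large K), and the distance weighting is one simple mitigation for unequal class sizes; resampling the training data is another option.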