k-Nearest Neighbors Algorithm (kNN)
 NOTE  This is an incomplete stub article that I’m displaying for now just in case it is somehow useful in its current state.
Read these:
 source 1
 source 2
 source 3
 source 4
 source 5
 source 6
 source 7
 source 8
 source 9
 source 10
 source 11
 source 12
 source 13
 source 14
 source 15
 source 16
 source 17

no actual training step (kNN is a “lazy learner”; computation is deferred until a query is classified)

“sensitive to the local structure of the data”

input consists of the k closest training examples in the feature space
 output:
 kNN classification  class membership  most common class from k nearest neighbors
 k is a small positive integer
 kNN regression  value  average of the values of k nearest neighbors
 neighbors can be weighted
 closer neighbors have a higher weight
 accounts for the skew that arises when one class has far more examples than the others
 weight = 1/d, where d = distance to the neighbor
 neighbors  taken from the set of objects where the class or value is known ( training set )
 don’t use every data point if you have a large dataset; that is too computationally expensive
 a nearest neighbor search (NNS) algorithm can be used to select neighbors for large data sets
 feature space 
 feature vectors 
k  user-defined constant
 larger values
 reduce effect of noise
 class boundaries are less distinct
 query / test point  unlabeled vector  classified using the labels of the k nearest training examples
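The rules above (majority vote for classification, average for regression, optional 1/d weighting) can be sketched in plain Python; the function names here are my own, not from any library:

```python
from collections import Counter
from math import dist  # Euclidean distance, Python 3.8+

def knn_classify(train, query, k=3, weighted=False):
    """Classify `query` by majority vote among the k nearest
    training examples; `train` is a list of (vector, label) pairs."""
    # Sort the training set by distance to the query point, keep the k closest.
    neighbors = sorted(train, key=lambda ex: dist(ex[0], query))[:k]
    votes = Counter()
    for vec, label in neighbors:
        d = dist(vec, query)
        # Optionally weight each vote by 1/d so closer neighbors count more.
        votes[label] += 1.0 / d if (weighted and d > 0) else 1.0
    return votes.most_common(1)[0][0]

def knn_regress(train, query, k=3):
    """Predict a value as the average of the k nearest neighbors' values;
    `train` is a list of (vector, value) pairs."""
    neighbors = sorted(train, key=lambda ex: dist(ex[0], query))[:k]
    return sum(value for _, value in neighbors) / k

points = [((0, 0), "a"), ((0, 1), "a"), ((1, 0), "a"),
          ((5, 5), "b"), ((5, 6), "b"), ((6, 5), "b")]
print(knn_classify(points, (1, 1), k=3))  # prints "a": the nearer cluster wins
```

Note that this brute-force version scans the whole training set per query, which is exactly the cost that NNS structures are meant to avoid on large datasets.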
Distance Metric:

multiple ways to measure the distance between points
 continuous variables  can take any value in a range

discrete variables  can take only specific values
 common distance metric options:
 Use Euclidean distance for continuous variables

1D: dist = |a - b|.  2D: dist = sqrt((a1 - b1)^2 + (a2 - b2)^2), where the first point is (a1, a2) and the second is (b1, b2). This is the Pythagorean theorem: a^2 + b^2 = c^2.

 Use overlap metric (or Hamming distance) for discrete variables
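A minimal sketch of both metrics (the names are illustrative, not from a library):

```python
from math import sqrt

def euclidean(a, b):
    """Euclidean distance for continuous variables: the n-dimensional
    generalization of the Pythagorean theorem."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def hamming(a, b):
    """Overlap metric (Hamming distance) for discrete variables:
    the number of positions at which the two sequences differ."""
    return sum(x != y for x, y in zip(a, b))

print(euclidean((0, 0), (3, 4)))      # 5.0
print(hamming("karolin", "kathrin"))  # 3
```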
To improve accuracy, find the distance metric with one of these algorithms:
 Large Margin Nearest Neighbor
 Neighbourhood components analysis
 metric (distance function)  a function that finds the distance between two elements in a set
 metric space  the set equipped with the metric
 pseudometric  like a metric, except two distinct elements may have distance 0
Metric learning

supervised metric learning can improve performance
 Feature extraction  “Transforming the input data into the set of features is called feature extraction”

remove redundant data
 Dimension reduction  …..
 Decision boundary  ….
 Data reduction  ….
=======================

OpenCV  real-time computer vision library

face recognition example:
 Haar face detection
 Mean-shift tracking analysis
 PCA or Fisher LDA projection into feature space, followed by kNN classification
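OpenCV ships its own implementations of these pieces, but the last step of that pipeline, projecting into a PCA feature space and then classifying with kNN, can be sketched with NumPy alone. The 20-dimensional vectors below are hypothetical stand-ins for face descriptors, not real image data:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-ins for two people's face descriptors: two classes of
# 20-dim vectors clustered around different means.
X = np.vstack([rng.normal(0.0, 1.0, (30, 20)),
               rng.normal(4.0, 1.0, (30, 20))])
y = np.array([0] * 30 + [1] * 30)

# PCA: center the data, then take principal directions from the SVD.
mean = X.mean(axis=0)
Xc = X - mean
_, _, Vt = np.linalg.svd(Xc, full_matrices=False)
components = Vt[:2]        # keep the top 2 components
Z = Xc @ components.T      # training set projected into feature space

def knn_predict(query, k=5):
    """Project a query into the PCA feature space and take the
    majority label among its k nearest projected neighbors."""
    q = (query - mean) @ components.T
    d = np.linalg.norm(Z - q, axis=1)
    nearest = y[np.argsort(d)[:k]]
    return np.bincount(nearest).argmax()

print(knn_predict(np.full(20, 4.0)))  # a point at the second cluster's mean -> 1
```

Fisher LDA would replace the SVD step with a projection that maximizes between-class separation; the kNN step afterwards is unchanged.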