K-Nearest Neighbors Algorithm (k-NN)
- NOTE - This is an incomplete stub article that I’m displaying for now just in case it is somehow useful in its current state.
- no actual training step - k-NN is a "lazy learning" method, so all computation is deferred until a query is classified
- "sensitive to the local structure of the data"
- input consists of the k closest training examples in the feature space
- output:
- k-NN classification - class membership - the object is assigned to the most common class among its k nearest neighbors (k is a small positive integer)
- k-NN regression - a value - the average of the values of the k nearest neighbors
- neighbors can be weighted
- closer neighbors get a higher weight
- weighting counteracts skew when the class distribution is imbalanced (a very frequent class otherwise tends to dominate the vote)
- common scheme: weight = 1/d
- d = distance to the neighbor
- neighbors are taken from the set of objects whose class or value is already known (the training set)
- don't compare the query against every data point in a large dataset - too computationally expensive
- a nearest neighbor search (NNS) index can be used to find the neighbors efficiently for large datasets (see the sketch below)
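A minimal sketch of indexed neighbor search, assuming SciPy's cKDTree (the notes don't name a specific library):

```python
import numpy as np
from scipy.spatial import cKDTree

rng = np.random.default_rng(0)
train = rng.random((100_000, 3))    # 100k training points in a 3-D feature space

tree = cKDTree(train)               # build the index once

query = np.array([0.5, 0.5, 0.5])
dist, idx = tree.query(query, k=5)  # 5 nearest neighbors without a full scan
print(dist, idx)
```

In low dimensions each query then costs roughly O(log n) instead of a scan over all n points.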
- feature space - the abstract n-dimensional space in which each example is a point
- feature vectors - the numeric vectors (one per example) giving coordinates in that space
- k - a user-defined constant
- larger values of k:
- reduce the effect of noise
- make class boundaries less distinct
- query / test point - an unlabeled vector, classified using the labels of its k nearest training examples (a minimal sketch of the whole step follows)
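Putting the pieces above together (majority vote, optional 1/d weighting, a small k), here is a minimal pure-Python sketch of the classification step; all names are illustrative:

```python
import math
from collections import defaultdict

def knn_classify(train, query, k=3, weighted=True):
    """train: list of (feature_vector, label) pairs; query: a feature vector."""
    # Brute-force distance from the query to every training example.
    dists = sorted((math.dist(query, x), label) for x, label in train)
    votes = defaultdict(float)
    for d, label in dists[:k]:            # only the k closest examples vote
        votes[label] += 1 / d if weighted and d > 0 else 1.0
    return max(votes, key=votes.get)      # heaviest (or most common) class

train = [((1, 1), "a"), ((1, 2), "a"), ((5, 5), "b"), ((6, 5), "b")]
print(knn_classify(train, (2, 2), k=3))   # -> "a"
```

For k-NN regression the same loop would average the neighbors' values instead of voting.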
Distance Metric:
- there are multiple ways to measure the distance between two points
- continuous variables - can take any value in a range
- discrete variables - can take only specific values
- common distance metric options (see the sketch below):
- use Euclidean distance for continuous variables
- 1D: dist = |a - b|
- 2D: dist = sqrt((a1 - b1)^2 + (a2 - b2)^2), for first point (a1, a2) and second point (b1, b2) - this is just the Pythagorean theorem a^2 + b^2 = c^2
- use the overlap metric (or Hamming distance) for discrete variables
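A small sketch of both metrics in plain Python (the function names are my own):

```python
import math

def euclidean(a, b):
    """Euclidean distance for continuous feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def hamming(a, b):
    """Overlap / Hamming distance for discrete feature vectors:
    the number of positions where the two vectors differ."""
    return sum(x != y for x, y in zip(a, b))

print(euclidean((1.0, 2.0), (4.0, 6.0)))                      # sqrt(3^2 + 4^2) = 5.0
print(hamming(("red", "S", "wool"), ("red", "M", "cotton")))  # 2
```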
To improve accuracy, the distance metric itself can be learned with one of these algorithms (see the sketch after this list):
- Large Margin Nearest Neighbor (LMNN)
- Neighbourhood Components Analysis (NCA)
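As a concrete example, scikit-learn ships an NCA implementation that can be chained with its k-NN classifier; a sketch, assuming scikit-learn >= 0.21:

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier, NeighborhoodComponentsAnalysis
from sklearn.pipeline import Pipeline

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Learn a linear transformation of the feature space that improves
# k-NN accuracy, then classify in the transformed space.
model = Pipeline([
    ("nca", NeighborhoodComponentsAnalysis(random_state=0)),
    ("knn", KNeighborsClassifier(n_neighbors=3)),
])
model.fit(X_train, y_train)
print(model.score(X_test, y_test))
```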
- metric (distance function) - a function that gives the distance between any two elements of a set
- metric space - the set together with its metric
- pseudo-metric - a relaxed metric where two distinct elements are allowed to have distance 0
Metric learning
- supervised metric learning can improve performance
- Feature extraction - “Transforming the input data into the set of features is called feature extraction”
- removes redundant data
- Dimension reduction - usually performed on high-dimensional data before k-NN (e.g., with PCA) to avoid the curse of dimensionality (sketch below)
- Decision boundary - k-NN in effect computes the class decision boundary implicitly
- Data reduction - replacing the full training set with a smaller set of representative prototypes that classify nearly as well
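A sketch of dimension reduction before k-NN, using PCA via scikit-learn (the dataset and component count are arbitrary choices):

```python
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import Pipeline

X, y = load_digits(return_X_y=True)  # 64-dimensional inputs
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Project onto 16 principal components before running k-NN,
# so distances are computed in a much smaller space.
model = Pipeline([
    ("pca", PCA(n_components=16)),
    ("knn", KNeighborsClassifier(n_neighbors=5)),
])
model.fit(X_train, y_train)
print(model.score(X_test, y_test))
```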
=======================
- OpenCV - real-time computer vision library
- face recognition example (a hedged sketch follows the list):
- Haar face detection
- Mean-shift tracking analysis
- PCA or Fisher LDA projection into feature space, followed by k-NN classification
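A hedged sketch of part of this pipeline using OpenCV's Python bindings: Haar detection, PCA projection, then cv2.ml.KNearest. The mean-shift tracking step is omitted, and train_paths / train_labels / test_paths are hypothetical placeholders:

```python
import cv2
import numpy as np

# Haar cascade face detector that ships with OpenCV.
cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

def face_vectors(image_paths, size=(32, 32)):
    """Detect the first face in each image and flatten it into a feature vector."""
    vectors = []
    for path in image_paths:
        gray = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
        x, y, w, h = cascade.detectMultiScale(gray, 1.3, 5)[0]  # assumes a face is found
        face = cv2.resize(gray[y:y + h, x:x + w], size)
        vectors.append(face.flatten().astype(np.float32))
    return np.array(vectors)

train = face_vectors(train_paths)   # train_paths, train_labels, test_paths
test = face_vectors(test_paths)     # are hypothetical inputs

# PCA projection into a low-dimensional feature space.
mean, eigvecs = cv2.PCACompute(train, mean=None, maxComponents=20)
train_proj = cv2.PCAProject(train, mean, eigvecs)
test_proj = cv2.PCAProject(test, mean, eigvecs)

# k-NN classification with OpenCV's built-in implementation.
knn = cv2.ml.KNearest_create()
knn.train(train_proj, cv2.ml.ROW_SAMPLE,
          np.array(train_labels, dtype=np.float32).reshape(-1, 1))
_, results, _, _ = knn.findNearest(test_proj, 3)
print(results.ravel())  # one predicted label per test face
```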