Vector quantization / Clustering
Sequential leader
For every new sample:
- if the distance between the sample and an existing cluster is smaller than a given threshold, add the sample to that cluster; otherwise create a new cluster containing the sample
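The rule above can be sketched as follows. This is a minimal illustration, not a reference implementation: the function name sequential_leader is hypothetical, each cluster is assumed to be represented by its first sample (its "leader"), and when several clusters are within the threshold the nearest one is assumed to win.

```python
import math

def sequential_leader(samples, threshold):
    """Single pass over the samples; returns (leaders, clusters)."""
    leaders = []   # assumption: a cluster is represented by its first sample
    clusters = []  # clusters[i] holds the members attached to leaders[i]
    for x in samples:
        best, best_d = None, threshold
        for i, leader in enumerate(leaders):
            d = math.dist(x, leader)
            if d < best_d:  # strictly smaller than the threshold (and than any earlier match)
                best, best_d = i, d
        if best is None:
            # no cluster is close enough: the sample starts a new cluster
            leaders.append(x)
            clusters.append([x])
        else:
            clusters[best].append(x)
    return leaders, clusters
```

The result depends on the order in which samples arrive and on the threshold, since a leader is never revised once created.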
Pairwise Clustering
Initially every sample is a cluster.
Repeat until the desired number of clusters is reached:
- merge the two closest clusters
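A small sketch of this merging loop is given below. The notes do not say how the distance between two clusters is measured, so single linkage (the smallest distance between any two members) is used here purely as an assumption; the names single_link and pairwise_clustering are hypothetical.

```python
import math
from itertools import combinations

def single_link(a, b):
    """Assumed cluster distance: smallest distance between a member of a and a member of b."""
    return min(math.dist(p, q) for p in a for q in b)

def pairwise_clustering(samples, n_clusters, cluster_distance=single_link):
    clusters = [[s] for s in samples]  # initially every sample is a cluster
    target = max(n_clusters, 1)        # cannot go below one cluster
    while len(clusters) > target:
        # find the pair of clusters (i < j) that are closest to each other
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: cluster_distance(clusters[ij[0]], clusters[ij[1]]),
        )
        clusters[i].extend(clusters[j])  # merge cluster j into cluster i
        del clusters[j]
    return clusters
```

This brute-force version recomputes all pairwise cluster distances at every merge, so it is only meant for small sample sets.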
k-means
Randomly choose k initial cluster centers.
For every sample:
- move the center closest to the sample toward the sample, by a fraction of their distance given by a learning rate
- decrease the learning rate over time
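The online update described above could look like the sketch below. Several details are assumptions made for illustration only: the initial centers are k distinct samples drawn at random, the samples are visited in shuffled order for a fixed number of passes, and the learning rate is multiplied by a constant factor after each pass. The name online_kmeans and the parameters epochs, lr0, decay and seed are hypothetical.

```python
import math
import random

def online_kmeans(samples, k, epochs=20, lr0=0.5, decay=0.9, seed=0):
    rng = random.Random(seed)
    # assumption: the initial centers are k distinct samples picked at random
    centers = [list(c) for c in rng.sample(samples, k)]
    lr = lr0
    for _ in range(epochs):
        for x in rng.sample(samples, len(samples)):  # one shuffled pass over the samples
            c = min(centers, key=lambda ctr: math.dist(ctr, x))  # closest center
            for d in range(len(c)):
                c[d] += lr * (x[d] - c[d])  # move it toward the sample
        lr *= decay  # decrease the learning rate over time
    return centers
```

With a learning rate between 0 and 1, each update places the center somewhere on the segment between its old position and the sample, so the distance to that sample shrinks by a factor of (1 - lr).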
k-means++
A variant that initializes the centers so as to give a guarantee on accuracy (the expected clustering cost is within an O(log k) factor of the optimum) and faster convergence:
- choose the first center uniformly at random among the samples
- choose each subsequent center at random among the samples, with probability proportional to the squared distance from the sample to its nearest already-chosen center
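The seeding step can be sketched as below, using squared distances as weights as in the usual formulation of k-means++. The function name kmeanspp_init and the seed parameter are hypothetical, and the sketch assumes that k does not exceed the number of distinct samples (otherwise all weights become zero and the weighted draw fails).

```python
import math
import random

def kmeanspp_init(samples, k, seed=0):
    rng = random.Random(seed)
    centers = [rng.choice(samples)]  # first center: uniform among the samples
    while len(centers) < k:
        # weight of a sample: squared distance to its nearest already-chosen center
        weights = [min(math.dist(x, c) for c in centers) ** 2 for x in samples]
        centers.append(rng.choices(samples, weights=weights, k=1)[0])
    return centers
```

A sample that is already a center has weight zero and is not drawn again; the returned centers are then meant to be handed to a regular k-means run as its starting point.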
Elbow criterion
A way to choose the number of clusters k.
For several values of k, compute the ratio of the intra-cluster variance to the total variance. The optimal k is the one beyond which adding clusters no longer decreases the ratio significantly.
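As an illustration, the ratio and a simple stopping rule could be computed as below. The helper names (sum_sq, variance_ratio, pick_elbow) and the min_gain threshold are hypothetical; what counts as a "significant" decrease is a judgment call that the notes leave open. The ratio is computed from sums of squared distances to the centroid, which equals the ratio of variances because both sums are divided by the same number of samples. The sketch assumes the samples are not all identical, so the total is nonzero.

```python
import math

def sum_sq(points):
    """Sum of squared distances from the points to their centroid."""
    n = len(points)
    c = tuple(sum(p[d] for p in points) / n for d in range(len(points[0])))
    return sum(math.dist(p, c) ** 2 for p in points)

def variance_ratio(clusters):
    """Intra-cluster variance divided by total variance, for one clustering."""
    all_points = [p for cl in clusters for p in cl]
    return sum(sum_sq(cl) for cl in clusters) / sum_sq(all_points)

def pick_elbow(ratios, min_gain=0.05):
    """ratios maps k to its ratio; return the first k whose successor gains less than min_gain."""
    ks = sorted(ratios)
    for k, k_next in zip(ks, ks[1:]):
        if ratios[k] - ratios[k_next] < min_gain:
            return k
    return ks[-1]
```

For example, with ratios {1: 1.0, 2: 0.04, 3: 0.03, 4: 0.02}, going from 2 to 3 clusters gains only 0.01, so pick_elbow returns 2 under the default threshold.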
GNG (Growing Neural Gas)
Kohonen self-organizing maps