K-Means Clustering

Algorithm

Initialize $k$ prototypes $w_j = x_p$, $j \in \{1,\ldots,k\}$, $p \in \{1,\ldots,P\}$, where we pick random points in the dataset (so $k \le P$) to initialize each weight. Each cluster $C_j$ is associated with prototype $w_j$.

while true:
    for each $x_p$ in input set:
        put $x_p$ in the cluster with the nearest prototype $w_j$
    for $j$ in range($k$):
        $w_j = \frac{1}{|C_j|} \sum\limits_{x_l \in C_j} x_l$, where $|C_j|$ is the cluster size
    $E = \sum\limits_{j=1}^{k} \sum\limits_{x_l \in C_j} \|x_l - w_j\|^2$
    if $E$ is no longer decreasing: break
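The loop above can be sketched in NumPy as follows. This is a minimal illustration under stated assumptions, not a reference implementation: the function name `kmeans` and the parameters `max_iter` and `seed` are hypothetical, the prototypes are drawn as $k$ distinct data points, and an empty cluster simply keeps its previous prototype (the notes do not say how to handle that case).

```python
import numpy as np

def kmeans(X, k, max_iter=100, seed=0):
    """Sketch of k-means on X of shape (P, d); returns (W, labels, E)."""
    rng = np.random.default_rng(seed)
    P = X.shape[0]
    if k > P:
        raise ValueError("k must not exceed the number of points P")

    # Initialize each prototype w_j with a distinct random data point x_p.
    W = X[rng.choice(P, size=k, replace=False)].astype(float)

    prev_E = np.inf
    for _ in range(max_iter):  # assumption: cap on iterations as a safety net
        # Squared distance from every point to every prototype, shape (P, k).
        d2 = ((X[:, None, :] - W[None, :, :]) ** 2).sum(axis=2)
        # Put each x_p in the cluster with the nearest prototype.
        labels = d2.argmin(axis=1)

        # Move each prototype to the mean of its cluster.
        for j in range(k):
            members = X[labels == j]
            if len(members) > 0:  # assumption: empty clusters keep their prototype
                W[j] = members.mean(axis=0)

        # E = sum over clusters of squared distances to the prototype.
        E = ((X - W[labels]) ** 2).sum()
        if E >= prev_E:  # E is no longer decreasing
            break
        prev_E = E

    return W, labels, E
```

Calling `kmeans(X, k)` on a `(P, d)` float array returns the prototypes, the cluster index of each point, and the final error $E$. Because the initialization is random, different seeds can end in different local minima of $E$.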