KMeansInitStrategy
- pycave.clustering.kmeans.types.KMeansInitStrategy
Strategy for initializing KMeans centroids.
- random: Centroids are sampled randomly from the data. This has complexity O(n) for n datapoints.
- kmeans++: Centroids are computed iteratively. The first centroid is sampled randomly from the data. Subsequently, centroids are sampled from the remaining datapoints with probability proportional to D(x)^2, where D(x) is the distance of datapoint x to the closest centroid chosen so far. This has complexity O(kn) for k clusters and n datapoints. If done on mini-batches, the complexity increases to O(k^2 n).
alias of Literal['random', 'kmeans++']
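The two strategies can be illustrated with a minimal pure-Python sketch. This is not PyCave's implementation: the names `InitStrategy`, `squared_distance`, and `init_centroids` are hypothetical, the data is assumed to be a list of equal-length numeric tuples containing at least k distinct points, and the mini-batch variant is not covered.

```python
import random
from typing import List, Literal, Sequence, Tuple

Point = Tuple[float, ...]
# Hypothetical local alias mirroring the documented Literal type.
InitStrategy = Literal["random", "kmeans++"]


def squared_distance(a: Point, b: Point) -> float:
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def init_centroids(
    data: Sequence[Point], k: int, strategy: InitStrategy, rng: random.Random
) -> List[Point]:
    """Pick k initial centroids from data (assumes at least k distinct points)."""
    points = list(data)
    if strategy == "random":
        # Sample k datapoints uniformly without replacement.
        return rng.sample(points, k)
    if strategy == "kmeans++":
        # The first centroid is drawn uniformly from the data.
        centroids = [rng.choice(points)]
        # D(x)^2: squared distance of each point to its closest centroid so far.
        closest_sq = [squared_distance(x, centroids[0]) for x in points]
        while len(centroids) < k:
            # Draw the next centroid with probability proportional to D(x)^2.
            # Points already chosen have weight 0 and are not drawn again.
            new = rng.choices(points, weights=closest_sq, k=1)[0]
            centroids.append(new)
            # One O(n) pass per centroid keeps the total cost at O(kn).
            closest_sq = [
                min(d, squared_distance(x, new))
                for x, d in zip(points, closest_sq)
            ]
        return centroids
    raise ValueError(f"unknown strategy: {strategy!r}")
```

Keeping the running minimum in `closest_sq` is what avoids recomputing distances to every earlier centroid, so each new centroid costs a single pass over the data.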