Selecting a Distance Function
Lloyd's algorithm converges for the class of distance functions called Bregman divergences. We provide a number of Bregman divergences. When selecting a distance function, consider the domain of the input data. For example, frequency data is integer-valued, and the similarity of frequencies or probability distributions is best measured using the Kullback-Leibler divergence.
Name                                     Divergence
BregmanDivergence.EUCLIDEAN              Squared Euclidean
BregmanDivergence.RELATIVE_ENTROPY       Kullback-Leibler
BregmanDivergence.DISCRETE_KL            Kullback-Leibler
BregmanDivergence.DISCRETE_SMOOTHED_KL   Kullback-Leibler
BregmanDivergence.SPARSE_SMOOTHED_KL     Kullback-Leibler
BregmanDivergence.LOGISTIC_LOSS          Logistic Loss
BregmanDivergence.GENERALIZED_I          Generalized I
BregmanDivergence.ITAKURA_SAITO          Itakura-Saito
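
To make the divergences named above more concrete, here is a minimal, self-contained sketch of how a Bregman divergence arises from a convex function F and its gradient: D_F(p, q) = F(p) - F(q) - <grad F(q), p - q>. The object BregmanSketch and everything in it are hypothetical names chosen for illustration; they are not part of this library and do not reflect how it is implemented.

```scala
// Hypothetical illustration only; none of these names come from the library.
object BregmanSketch {
  type Vec = Array[Double]

  // Generic Bregman divergence: D_F(p, q) = F(p) - F(q) - <gradF(q), p - q>
  def bregman(f: Vec => Double, gradF: Vec => Vec)(p: Vec, q: Vec): Double = {
    val g = gradF(q)
    val inner = g.indices.map(i => g(i) * (p(i) - q(i))).sum
    f(p) - f(q) - inner
  }

  // F(x) = sum of x_i^2 yields the squared Euclidean distance.
  def squaredEuclidean(p: Vec, q: Vec): Double =
    bregman(x => x.map(v => v * v).sum, x => x.map(2.0 * _))(p, q)

  // F(x) = sum of x_i * log(x_i) (negative entropy) yields
  // sum of p_i * log(p_i / q_i) - p_i + q_i, which equals the
  // Kullback-Leibler divergence when p and q each sum to 1.
  // Assumes every coordinate is strictly positive.
  def kullbackLeibler(p: Vec, q: Vec): Double =
    bregman(
      x => x.map(v => v * math.log(v)).sum,
      x => x.map(v => math.log(v) + 1.0)
    )(p, q)
}
```

For p = (0.5, 0.5) and q = (0.25, 0.75), BregmanSketch.squaredEuclidean(p, q) is 0.125. The Kullback-Leibler value differs depending on argument order, a reminder that Bregman divergences are generally not symmetric, so the order of point and cluster center matters.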
You may construct instances of BregmanDivergence using the BregmanDivergence companion object.
From your BregmanDivergence, you may create an instance of the distance function by using the apply method of the BregmanPointOps companion object.
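
The companion-object pattern described here can be sketched as follows. This is a hedged, self-contained illustration, not the library's code: SketchDivergence and SketchPointOps are hypothetical stand-ins for BregmanDivergence and BregmanPointOps, and their methods and the string-based lookup are assumptions made for the example. Consult the library source for the actual signatures.

```scala
// Hypothetical stand-ins; NOT the library's real definitions or signatures.
trait SketchDivergence {
  def divergence(p: Array[Double], q: Array[Double]): Double
}

object SketchDivergence {
  val Euclidean: SketchDivergence = new SketchDivergence {
    def divergence(p: Array[Double], q: Array[Double]): Double =
      p.zip(q).map { case (a, b) => (a - b) * (a - b) }.sum
  }

  // Companion-object factory: obtain a divergence by name (assumed lookup scheme).
  def apply(name: String): SketchDivergence = name match {
    case "EUCLIDEAN" => Euclidean
    case other => throw new IllegalArgumentException(s"unknown divergence: $other")
  }
}

// Wraps a divergence and exposes it as a distance between a point and a center.
class SketchPointOps(val divergence: SketchDivergence) {
  def distance(point: Array[Double], center: Array[Double]): Double =
    divergence.divergence(point, center)

  // Index of the center with the smallest divergence from the point.
  def closest(centers: Seq[Array[Double]], point: Array[Double]): Int =
    centers.indices.minBy(i => distance(point, centers(i)))
}

object SketchPointOps {
  // The apply method lets callers write SketchPointOps(d) without `new`.
  def apply(d: SketchDivergence): SketchPointOps = new SketchPointOps(d)
}
```

With these stand-ins, SketchPointOps(SketchDivergence("EUCLIDEAN")) yields a distance function; calling closest on the centers (0, 0) and (1, 1) with the point (0.9, 0.8) returns index 1, the assignment step of Lloyd's algorithm.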