Creating a Custom Embedding

How to create a custom embedding

Perhaps you have a dimensionality reduction method that is not provided by one of the standard embeddings. You may create your own embedding by implementing the Embedding trait.

package com.massivedatascience.transforms

trait Embedding extends Serializable {
  /**
   * Transform a weighted vector into another space
   * @param v the weighted vector
   * @return the transformed vector
   */
  def embed(v: WeightedVector): WeightedVector = WeightedVector(embed(v.homogeneous), v.weight)
  def embed(v: Vector): Vector
  def embed(v: VectorIterator): Vector
}

For example, if the number of clusters desired is small but the dimension is high, one may use the method of random projections. At present, no embedding is provided for random projections, but, hey, I have to leave something for you to do!
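As a starting point, here is a minimal sketch of what a random projection embedding might look like. To keep it runnable without Spark on the classpath, it uses a hypothetical stand-in trait, `SimpleEmbedding`, that works on plain `Array[Double]` values; this trait and the `RandomProjectionEmbedding` class are assumptions for illustration and not part of the library. A real implementation would extend `Embedding` and supply both `embed(v: Vector)` and `embed(v: VectorIterator)`.

```scala
import scala.util.Random

// Hypothetical stand-in for the library's Embedding trait, using plain arrays
// so the sketch runs without Spark on the classpath.
trait SimpleEmbedding extends Serializable {
  def embed(v: Array[Double]): Array[Double]
}

// Gaussian random projection from inputDim down to outputDim dimensions.
// The class name, parameters, and default seed are illustrative assumptions.
class RandomProjectionEmbedding(inputDim: Int, outputDim: Int, seed: Long = 42L)
    extends SimpleEmbedding {
  require(inputDim > 0 && outputDim > 0, "dimensions must be positive")

  // outputDim x inputDim matrix with i.i.d. N(0, 1/outputDim) entries,
  // so squared norms are preserved in expectation.
  private val projection: Array[Array[Double]] = {
    val rng = new Random(seed)
    val scale = 1.0 / math.sqrt(outputDim.toDouble)
    Array.fill(outputDim, inputDim)(rng.nextGaussian() * scale)
  }

  // Multiply the projection matrix by the input vector.
  override def embed(v: Array[Double]): Array[Double] = {
    require(v.length == inputDim, s"expected $inputDim dimensions, got ${v.length}")
    projection.map { row =>
      var sum = 0.0
      var i = 0
      while (i < inputDim) { sum += row(i) * v(i); i += 1 }
      sum
    }
  }
}
```

With this sketch, `new RandomProjectionEmbedding(1000, 20).embed(point)` maps a 1000-dimensional point to 20 dimensions. The seed is fixed so that every point is projected with the same matrix, which matters when the embedding is serialized and applied on different workers.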
