matchms.similarity.vector_similarity_functions module

Collection of functions for calculating vector-vector similarities.

matchms.similarity.vector_similarity_functions.cosine_similarity(u: ndarray, v: ndarray) → float64

Calculate cosine similarity score.

Parameters:
  • u – Input vector.

  • v – Input vector.

Returns:

The cosine similarity score between vectors u and v.

Return type:

float64
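As an illustration of the quantity described above, the following is a minimal NumPy sketch of cosine similarity, u·v / (‖u‖ ‖v‖). The helper name cosine_similarity_sketch is hypothetical and is not the matchms implementation. Returning 0 for an all-zero vector is an assumption made here to avoid division by zero.

```python
import numpy as np


def cosine_similarity_sketch(u: np.ndarray, v: np.ndarray) -> np.float64:
    """Hypothetical helper: cosine similarity of two 1-D vectors."""
    uu = np.dot(u, u)  # squared norm of u
    vv = np.dot(v, v)  # squared norm of v
    if uu == 0 or vv == 0:
        # Assumption: an all-zero vector gets a score of 0.
        return np.float64(0.0)
    return np.float64(np.dot(u, v) / np.sqrt(uu * vv))


u = np.array([1.0, 0.0, 1.0])
v = np.array([1.0, 1.0, 0.0])
score = cosine_similarity_sketch(u, v)  # 1 / (sqrt(2) * sqrt(2)) = 0.5
```

The two vectors share one non-zero position and each has norm sqrt(2), which gives a score of 0.5.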

matchms.similarity.vector_similarity_functions.cosine_similarity_matrix(references: ndarray, queries: ndarray) → ndarray

Returns a matrix of cosine similarity scores between all pairs of vectors in references and queries.

Parameters:
  • references – Reference vectors as 2D numpy array. Expects that vector_i corresponds to references[i, :].

  • queries – Query vectors as 2D numpy array. Expects that vector_i corresponds to queries[i, :].

Returns:

Matrix of all-vs-all similarity scores. scores[i, j] will contain the score between the vectors references[i, :] and queries[j, :].

Return type:

ndarray
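The all-vs-all layout described above can be sketched with a single matrix product. This is a vectorized NumPy illustration under stated assumptions, not the matchms code: the name cosine_similarity_matrix_sketch is hypothetical, and pairs involving an all-zero vector are assumed to score 0.

```python
import numpy as np


def cosine_similarity_matrix_sketch(references, queries) -> np.ndarray:
    """Hypothetical helper: scores[i, j] compares references[i, :] with queries[j, :]."""
    references = np.asarray(references, dtype=np.float64)
    queries = np.asarray(queries, dtype=np.float64)
    dots = references @ queries.T  # all pairwise dot products
    denom = np.outer(np.linalg.norm(references, axis=1),
                     np.linalg.norm(queries, axis=1))
    scores = np.zeros_like(dots)
    # Assumption: leave the score at 0 wherever one of the vectors has zero norm.
    np.divide(dots, denom, out=scores, where=denom != 0)
    return scores


references = np.array([[1.0, 0.0], [1.0, 1.0]])
queries = np.array([[1.0, 0.0], [0.0, 2.0], [0.0, 0.0]])
scores = cosine_similarity_matrix_sketch(references, queries)
# scores has shape (2, 3): one row per reference, one column per query.
```

With 2 reference vectors and 3 query vectors the result has shape (2, 3), matching the scores[i, j] convention in the description.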

matchms.similarity.vector_similarity_functions.dice_similarity(u: ndarray, v: ndarray) → float64

Computes the Dice similarity coefficient (DSC) between two boolean 1-D arrays.

The Dice similarity coefficient between u and v is

\[DSC(u,v) = \frac{2|u \cap v|}{|u| + |v|}\]
Parameters:
  • u – Input array. Expects boolean vector.

  • v – Input array. Expects boolean vector.

Returns:

The Dice similarity coefficient between 1-D arrays u and v.

Return type:

float64
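A worked instance of the Dice formula for boolean vectors, where |u| is the number of True entries, might look like the sketch below. The name dice_similarity_sketch is hypothetical and is not the matchms implementation. Returning 0 when both vectors are empty is an assumption.

```python
import numpy as np


def dice_similarity_sketch(u, v) -> np.float64:
    """Hypothetical helper: 2|u ∩ v| / (|u| + |v|) for boolean 1-D arrays."""
    u = np.asarray(u, dtype=bool)
    v = np.asarray(v, dtype=bool)
    total = u.sum() + v.sum()  # |u| + |v|
    if total == 0:
        # Assumption: two all-False vectors get a score of 0.
        return np.float64(0.0)
    return np.float64(2.0 * np.logical_and(u, v).sum() / total)


u = np.array([1, 1, 0, 1], dtype=bool)
v = np.array([1, 0, 0, 1], dtype=bool)
score = dice_similarity_sketch(u, v)  # 2 * 2 / (3 + 2) = 0.8
```

Here u has three set bits, v has two, and they share two, so the coefficient is 4/5.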

matchms.similarity.vector_similarity_functions.dice_similarity_matrix(references: ndarray, queries: ndarray) → ndarray

Returns a matrix of Dice similarity scores between all pairs of vectors in references and queries.

Parameters:
  • references – Reference vectors as 2D numpy array. Expects that vector_i corresponds to references[i, :].

  • queries – Query vectors as 2D numpy array. Expects that vector_i corresponds to queries[i, :].

Returns:

Matrix of all-vs-all similarity scores. scores[i, j] will contain the score between the vectors references[i, :] and queries[j, :].

Return type:

ndarray
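For boolean inputs, the pairwise intersection counts can be obtained from an integer matrix product, which gives one possible vectorized sketch of the all-vs-all Dice matrix. The name dice_similarity_matrix_sketch is hypothetical, the approach is an illustration rather than the matchms implementation, and scoring empty-vs-empty pairs as 0 is an assumption.

```python
import numpy as np


def dice_similarity_matrix_sketch(references, queries) -> np.ndarray:
    """Hypothetical helper: scores[i, j] is the Dice score of references[i, :] and queries[j, :]."""
    refs = np.asarray(references, dtype=bool).astype(np.int64)
    qrys = np.asarray(queries, dtype=bool).astype(np.int64)
    intersections = refs @ qrys.T  # |u ∩ v| for every pair
    sizes = refs.sum(axis=1)[:, None] + qrys.sum(axis=1)[None, :]  # |u| + |v|
    scores = np.zeros(intersections.shape, dtype=np.float64)
    # Assumption: pairs where both vectors are all-False keep a score of 0.
    np.divide(2.0 * intersections, sizes, out=scores, where=sizes != 0)
    return scores


references = np.array([[1, 1, 0, 1], [0, 0, 0, 0]], dtype=bool)
queries = np.array([[1, 0, 0, 1], [1, 1, 0, 1], [0, 0, 0, 0]], dtype=bool)
scores = dice_similarity_matrix_sketch(references, queries)
# scores[0, 0] = 2 * 2 / (3 + 2) = 0.8, scores[0, 1] = 2 * 3 / (3 + 3) = 1.0
```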

matchms.similarity.vector_similarity_functions.jaccard_index(u: ndarray, v: ndarray) → float64

Computes the Jaccard index (or Jaccard similarity coefficient) of two boolean 1-D arrays. The Jaccard index between 1-D boolean arrays u and v is defined as

\[J(u,v) = \frac{|u \cap v|}{|u \cup v|}\]
Parameters:
  • u – Input array. Expects boolean vector.

  • v – Input array. Expects boolean vector.

Returns:

The Jaccard similarity coefficient between vectors u and v.

Return type:

float64
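The Jaccard index counts shared set bits relative to the bits set in either vector. A minimal sketch follows. The name jaccard_index_sketch is hypothetical and is not the matchms implementation. Returning 0 when the union is empty is an assumption.

```python
import numpy as np


def jaccard_index_sketch(u, v) -> np.float64:
    """Hypothetical helper: |u ∩ v| / |u ∪ v| for boolean 1-D arrays."""
    u = np.asarray(u, dtype=bool)
    v = np.asarray(v, dtype=bool)
    union = np.logical_or(u, v).sum()  # |u ∪ v|
    if union == 0:
        # Assumption: two all-False vectors get a score of 0.
        return np.float64(0.0)
    return np.float64(np.logical_and(u, v).sum() / union)


u = np.array([1, 1, 0, 1], dtype=bool)
v = np.array([1, 0, 0, 1], dtype=bool)
score = jaccard_index_sketch(u, v)  # 2 shared bits / 3 bits in the union
```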

matchms.similarity.vector_similarity_functions.jaccard_similarity_matrix(references: ndarray, queries: ndarray) → ndarray

Returns a matrix of Jaccard indices between all pairs of vectors in references and queries.

Parameters:
  • references – Reference vectors as 2D numpy array. Expects that vector_i corresponds to references[i, :].

  • queries – Query vectors as 2D numpy array. Expects that vector_i corresponds to queries[i, :].

Returns:

Matrix of all-vs-all similarity scores. scores[i, j] will contain the score between the vectors references[i, :] and queries[j, :].

Return type:

ndarray
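To make the scores[i, j] indexing convention explicit, the sketch below fills the Jaccard matrix with a plain double loop over reference rows and query rows. The name jaccard_similarity_matrix_sketch is hypothetical, the loop is an illustration rather than the matchms implementation, and pairs with an empty union are assumed to score 0.

```python
import numpy as np


def jaccard_similarity_matrix_sketch(references, queries) -> np.ndarray:
    """Hypothetical helper: scores[i, j] is the Jaccard index of references[i, :] and queries[j, :]."""
    refs = np.asarray(references, dtype=bool)
    qrys = np.asarray(queries, dtype=bool)
    scores = np.zeros((refs.shape[0], qrys.shape[0]), dtype=np.float64)
    for i in range(refs.shape[0]):
        for j in range(qrys.shape[0]):
            union = np.logical_or(refs[i], qrys[j]).sum()
            if union != 0:
                # Assumption: pairs with an empty union keep a score of 0.
                scores[i, j] = np.logical_and(refs[i], qrys[j]).sum() / union
    return scores


references = np.array([[1, 1, 0, 1], [0, 0, 0, 0]], dtype=bool)
queries = np.array([[1, 0, 0, 1], [1, 1, 0, 1], [0, 0, 0, 0]], dtype=bool)
scores = jaccard_similarity_matrix_sketch(references, queries)
# Row i indexes references, column j indexes queries; scores has shape (2, 3).
```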