matchms.similarity.BaseSimilarity module

class matchms.similarity.BaseSimilarity.BaseSimilarity

Bases: object

Similarity function base class. When building a custom similarity measure, inherit from this class and implement the desired methods.

is_commutative: Whether the similarity function is commutative, meaning that the order of the spectrums does not matter (similarity(A, B) == similarity(B, A)). Default is True.

keep_score(score): The .matrix method collects scores in a sparse way and, by default, does not store scores equal to False or 0. Override this method if other values should also be left out of the final collection.
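As a rough illustration of the sparse collection described above, the sketch below stores only the scores that a keep_score-style predicate accepts. It is plain Python rather than matchms code: collect_sparse and the dictionary of precomputed scores are hypothetical names, and the predicate only approximates the documented default of dropping False and 0.

```python
def keep_score(score):
    """Keep every score except False or 0 (approximates the documented default)."""
    return score != 0


def collect_sparse(scores_by_index, keep=keep_score):
    """Collect accepted scores as parallel row, column and value lists (COO-style)."""
    rows, cols, values = [], [], []
    for (row, col), score in scores_by_index.items():
        if keep(score):
            rows.append(row)
            cols.append(col)
            values.append(score)
    return rows, cols, values


# Hypothetical precomputed scores, keyed by (reference index, query index).
scores = {(0, 0): 1.0, (0, 1): 0.0, (1, 0): False, (1, 1): 0.4}

default_result = collect_sparse(scores)
# A stricter predicate, as an overriding subclass might define it (assumption).
strict_result = collect_sparse(scores, keep=lambda score: score > 0.5)
```

With the default predicate only the entries at (0, 0) and (1, 1) are stored; the stricter predicate keeps only (0, 0).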

matrix(references: List[Spectrum], queries: List[Spectrum], array_type: str = 'numpy', is_symmetric: bool = False) → ndarray

Optional: Provide an optimized method to calculate an np.array of similarity scores for the given reference and query spectrums. If a subclass does not override this method, a naive implementation (i.e. a double for-loop) is used.

Parameters:

references – List of reference objects
queries – List of query objects
array_type – Specify the output array type. Can be “numpy” or “sparse”. Default is “numpy” and will return a numpy array. “sparse” will return a COO-sparse array.
is_symmetric – Set to True when references and queries are identical (for instance, in an all-vs-all comparison). Because score[i,j] = score[j,i], the calculation is then about 2x faster.
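The naive double for-loop and the is_symmetric shortcut can be sketched in plain Python as follows. This is not the matchms implementation: naive_matrix and toy_pair are hypothetical names, nested lists stand in for the numpy array, and plain numbers stand in for Spectrum objects.

```python
def naive_matrix(references, queries, pair, is_symmetric=False):
    """Score every reference against every query with a double for-loop."""
    n_rows, n_cols = len(references), len(queries)
    if is_symmetric and n_rows != n_cols:
        raise ValueError("is_symmetric requires references and queries of equal length")
    scores = [[0.0] * n_cols for _ in range(n_rows)]
    for i, reference in enumerate(references):
        # With is_symmetric, only the upper triangle (j >= i) is computed.
        first_col = i if is_symmetric else 0
        for j in range(first_col, n_cols):
            score = pair(reference, queries[j])
            scores[i][j] = score
            if is_symmetric:
                scores[j][i] = score  # mirror, since score[i,j] = score[j,i]
    return scores


calls = []


def toy_pair(a, b):
    """Hypothetical commutative score on plain numbers; records each call."""
    calls.append((a, b))
    return min(a, b) / max(a, b)


values = [1.0, 2.0, 4.0]
full = naive_matrix(values, values, toy_pair)
n_full_calls = len(calls)

calls.clear()
symmetric = naive_matrix(values, values, toy_pair, is_symmetric=True)
n_symmetric_calls = len(calls)
```

For n spectrums the symmetric path evaluates n(n+1)/2 pairs instead of n², which is where the roughly 2x speedup comes from.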

abstract pair(reference: Spectrum, query: Spectrum) → float

Method to calculate the similarity for one input pair.

Parameters:

reference – Single reference spectrum.
query – Single query spectrum.

Returns:

Score as a numpy array (using self.score_datatype), for instance np.asarray(score, dtype=self.score_datatype).
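A minimal sketch of how a custom measure might implement pair is given below. To stay self-contained it does not import matchms: SimilaritySketch is a hypothetical stand-in that mirrors only the documented attributes, spectrums are reduced to lists of m/z values, and the built-in float replaces the float64 score type. SharedPeakFraction is an invented toy measure.

```python
from abc import ABC, abstractmethod


class SimilaritySketch(ABC):
    """Hypothetical stand-in for the documented base class; not matchms code."""

    is_commutative = True
    score_datatype = float  # assumption: built-in float instead of float64

    @abstractmethod
    def pair(self, reference, query):
        """Calculate the similarity for one input pair."""


class SharedPeakFraction(SimilaritySketch):
    """Toy measure: fraction of distinct m/z values found in both inputs."""

    def pair(self, reference, query):
        ref_peaks, query_peaks = set(reference), set(query)
        all_peaks = ref_peaks | query_peaks
        if not all_peaks:
            return self.score_datatype(0.0)
        shared = ref_peaks & query_peaks
        return self.score_datatype(len(shared) / len(all_peaks))


measure = SharedPeakFraction()
score = measure.pair([100.0, 200.0], [200.0, 300.0])
swapped = measure.pair([200.0, 300.0], [100.0, 200.0])
```

Because the toy measure is commutative, score and swapped are equal, which matches the default is_commutative = True.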

score_datatype: alias of float64

sparse_array(references: List[Spectrum], queries: List[Spectrum], idx_row, idx_col, is_symmetric: bool = False)

Optional: Provide an optimized method to calculate a sparse matrix of similarity scores.

Compute similarity scores for the pairs of reference and query spectrums given by the indices idx_row (references) and idx_col (queries). If a subclass does not override this method, a naive implementation (i.e. a for-loop) is used.

Parameters:

references – List of reference objects
queries – List of query objects
idx_row – List/array of row indices
idx_col – List/array of column indices
is_symmetric – Set to True when references and queries are identical (for instance, in an all-vs-all comparison). Because score[i,j] = score[j,i], the calculation is then about 2x faster.
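The index-driven loop can be sketched in plain Python as below. This is an illustration under stated assumptions, not the matchms implementation: naive_sparse_scores and toy_pair are hypothetical names, plain numbers stand in for Spectrum objects, and the sketch assumes one score is returned per (idx_row, idx_col) pair, in the same order.

```python
def naive_sparse_scores(references, queries, idx_row, idx_col, pair):
    """Score only the (reference, query) pairs selected by idx_row and idx_col."""
    if len(idx_row) != len(idx_col):
        raise ValueError("idx_row and idx_col must have the same length")
    scores = []
    for row, col in zip(idx_row, idx_col):
        scores.append(pair(references[row], queries[col]))
    return scores


def toy_pair(a, b):
    """Hypothetical commutative score on plain numbers."""
    return min(a, b) / max(a, b)


references = [1.0, 2.0, 4.0]
queries = [2.0, 8.0]

# Score three selected pairs: (0, 0), (1, 0) and (2, 1).
selected = naive_sparse_scores(references, queries, [0, 1, 2], [0, 0, 1], toy_pair)
```

Only three of the six possible pairs are evaluated here, which is the point of passing explicit indices.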

to_dict() → dict: Return a dictionary representation of a similarity function.
