matchms.calculate_scores module

matchms.calculate_scores.calculate_scores(references: List[object] | Tuple[object] | ndarray, queries: List[object] | Tuple[object] | ndarray, similarity_function: BaseSimilarity, array_type: str = 'numpy', is_symmetric: bool = False) → Scores

Calculate the similarity between all reference objects versus all query objects.

Example that calculates the scores between two spectrums and iterates over the results:

import numpy as np
from matchms import calculate_scores, Spectrum
from matchms.similarity import CosineGreedy

spectrum_1 = Spectrum(mz=np.array([100, 150, 200.]),
                      intensities=np.array([0.7, 0.2, 0.1]),
                      metadata={'id': 'spectrum1'})
spectrum_2 = Spectrum(mz=np.array([100, 140, 190.]),
                      intensities=np.array([0.4, 0.2, 0.1]),
                      metadata={'id': 'spectrum2'})
spectrums = [spectrum_1, spectrum_2]

scores = calculate_scores(spectrums, spectrums, CosineGreedy())

for (reference, query, score) in scores:
    print(f"Cosine score between {reference.get('id')} and {query.get('id')}" +
          f" is {score[0]:.2f} with {score[1]} matched peaks")

This should output:

Cosine score between spectrum1 and spectrum1 is 1.00 with 3 matched peaks
Cosine score between spectrum1 and spectrum2 is 0.83 with 1 matched peaks
Cosine score between spectrum2 and spectrum1 is 0.83 with 1 matched peaks
Cosine score between spectrum2 and spectrum2 is 1.00 with 3 matched peaks

Parameters:

  • references – List, tuple, or NumPy array of reference objects

  • queries – List, tuple, or NumPy array of query objects

  • similarity_function – Similarity measure (a BaseSimilarity instance such as CosineGreedy) that takes a reference object and a query object and returns a score or a tuple of scores

  • array_type – Type of array used to store and compute the scores. Choose from “numpy” or “sparse”. Default is “numpy”.

  • is_symmetric – Set to True when references and queries are identical (for instance, in an all-vs-all comparison). The calculation then uses the fact that score[i,j] = score[j,i] and will be about 2x faster. Default is False.
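
The idea behind is_symmetric can be illustrated with a small, library-independent sketch. This is not the matchms implementation: all_pairs, all_pairs_symmetric and CountingJaccard are hypothetical names introduced only for this illustration, and the toy similarity compares plain sets of peak positions rather than Spectrum objects. It assumes the similarity is commutative, so that score[i,j] = score[j,i] holds.

```python
class CountingJaccard:
    """Toy commutative similarity (hypothetical, not part of matchms) that counts its calls."""

    def __init__(self):
        self.calls = 0

    def __call__(self, a, b):
        self.calls += 1
        return len(a & b) / len(a | b)


def all_pairs(references, queries, similarity):
    """Score every reference against every query."""
    return [[similarity(r, q) for q in queries] for r in references]


def all_pairs_symmetric(items, similarity):
    """Score items against themselves, computing only the upper triangle."""
    n = len(items)
    scores = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            s = similarity(items[i], items[j])
            scores[i][j] = s
            scores[j][i] = s  # mirror instead of recomputing
    return scores


# Three toy "spectra", each reduced to a set of peak positions.
peaks = [
    frozenset({100, 150, 200}),
    frozenset({100, 140, 190}),
    frozenset({150, 200, 250}),
]

full_sim = CountingJaccard()
full = all_pairs(peaks, peaks, full_sim)

sym_sim = CountingJaccard()
sym = all_pairs_symmetric(peaks, sym_sim)

print(full == sym, full_sim.calls, sym_sim.calls)  # True 9 6
```

For n objects the mirrored variant makes n(n+1)/2 similarity calls instead of n², which approaches half the work as n grows. The result is only correct when references and queries are the same collection and the similarity does not depend on argument order.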

Return type: