matchms.Scores module

class matchms.Scores.Scores(references: Union[List[object], Tuple[object], numpy.ndarray], queries: Union[List[object], Tuple[object], numpy.ndarray], similarity_function: matchms.similarity.BaseSimilarity.BaseSimilarity, is_symmetric: bool = False)

Bases: object

Contains reference and query spectrums and the scores between them.

The scores can be retrieved as a matrix with the Scores.scores attribute. The (reference, query, score) tuples can also be iterated over; as the example output below shows, all queries are visited for the first reference before moving on to the next reference.

Example to calculate scores between two reference and two query spectrums and iterate over the scores:

import numpy as np
from matchms import calculate_scores
from matchms import Spectrum
from matchms.similarity import CosineGreedy

spectrum_1 = Spectrum(mz=np.array([100, 150, 200.]),
                      intensities=np.array([0.7, 0.2, 0.1]),
                      metadata={'id': 'spectrum1'})
spectrum_2 = Spectrum(mz=np.array([100, 140, 190.]),
                      intensities=np.array([0.4, 0.2, 0.1]),
                      metadata={'id': 'spectrum2'})
spectrum_3 = Spectrum(mz=np.array([110, 140, 195.]),
                      intensities=np.array([0.6, 0.2, 0.1]),
                      metadata={'id': 'spectrum3'})
spectrum_4 = Spectrum(mz=np.array([100, 150, 200.]),
                      intensities=np.array([0.6, 0.1, 0.6]),
                      metadata={'id': 'spectrum4'})
references = [spectrum_1, spectrum_2]
queries = [spectrum_3, spectrum_4]

similarity_measure = CosineGreedy()
scores = calculate_scores(references, queries, similarity_measure)

for (reference, query, score) in scores:
    print(f"Cosine score between {reference.get('id')} and {query.get('id')}" +
          f" is {score['score']:.2f} with {score['matches']} matched peaks")

Should output

Cosine score between spectrum1 and spectrum3 is 0.00 with 0 matched peaks
Cosine score between spectrum1 and spectrum4 is 0.80 with 3 matched peaks
Cosine score between spectrum2 and spectrum3 is 0.14 with 1 matched peaks
Cosine score between spectrum2 and spectrum4 is 0.61 with 1 matched peaks
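
To make the numbers above easier to check by hand, here is a minimal sketch of a cosine score with greedy peak matching. It is not the CosineGreedy implementation: simple_cosine is a hypothetical helper, the m/z tolerance of 0.1 is an assumption, and intensities are used unweighted.

```python
import numpy as np

def simple_cosine(mz_a, int_a, mz_b, int_b, tolerance=0.1):
    """Hypothetical, simplified stand-in for a greedy cosine score."""
    # Collect every peak pair whose m/z values differ by at most `tolerance`.
    candidates = [(int_a[i] * int_b[j], i, j)
                  for i in range(len(mz_a))
                  for j in range(len(mz_b))
                  if abs(mz_a[i] - mz_b[j]) <= tolerance]
    # Greedy assignment: largest intensity products first, each peak used once.
    used_a, used_b, total = set(), set(), 0.0
    for product, i, j in sorted(candidates, reverse=True):
        if i not in used_a and j not in used_b:
            used_a.add(i)
            used_b.add(j)
            total += product
    # Normalise by the product of the two intensity vector norms.
    norm = np.sqrt(np.sum(int_a ** 2)) * np.sqrt(np.sum(int_b ** 2))
    return total / norm, len(used_a)

# Same peaks as spectrum1, spectrum2, spectrum3 and spectrum4 above.
s1 = (np.array([100, 150, 200.]), np.array([0.7, 0.2, 0.1]))
s2 = (np.array([100, 140, 190.]), np.array([0.4, 0.2, 0.1]))
s3 = (np.array([110, 140, 195.]), np.array([0.6, 0.2, 0.1]))
s4 = (np.array([100, 150, 200.]), np.array([0.6, 0.1, 0.6]))

for name_ref, ref in (("spectrum1", s1), ("spectrum2", s2)):
    for name_query, query in (("spectrum3", s3), ("spectrum4", s4)):
        score, matches = simple_cosine(*ref, *query)
        print(f"{name_ref} vs {name_query}: {score:.2f} ({matches} matched peaks)")
```

Under these assumptions the sketch yields the same rounded scores and match counts as the output listed above; for example, spectrum2 and spectrum4 share only the peak at m/z 100, giving 0.4 * 0.6 divided by the two intensity norms.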

__init__(references: Union[List[object], Tuple[object], numpy.ndarray], queries: Union[List[object], Tuple[object], numpy.ndarray], similarity_function: matchms.similarity.BaseSimilarity.BaseSimilarity, is_symmetric: bool = False)
Parameters
  • references – List of reference objects

  • queries – List of query objects

  • similarity_function – Expected input is an object based on BaseSimilarity. It is expected to provide a .pair() and .matrix() method for computing similarity scores between references and queries.

  • is_symmetric – Set to True when references and queries are identical (for instance, in an all-vs-all comparison). Because score[i,j] = score[j,i] in that case, only about half of the pairs need to be computed, which makes the calculation about 2x faster. Default is False.
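
The saving behind is_symmetric can be sketched without matchms. The helper pairwise_scores and the toy score function below are hypothetical and only illustrate the idea of computing the upper triangle and mirroring it; they are not how the library is implemented.

```python
import numpy as np

def pairwise_scores(items, score_pair, is_symmetric=False):
    """Hypothetical helper: fill an all-vs-all score matrix and count calls."""
    n = len(items)
    scores = np.zeros((n, n))
    calls = 0
    for i in range(n):
        # With symmetry, only the upper triangle (j >= i) is computed.
        start = i if is_symmetric else 0
        for j in range(start, n):
            scores[i, j] = score_pair(items[i], items[j])
            calls += 1
            if is_symmetric:
                scores[j, i] = scores[i, j]  # mirror into the lower triangle
    return scores, calls

def toy_score(a, b):
    """Toy symmetric similarity between two numbers."""
    return 1.0 / (1.0 + abs(a - b))

values = [1.0, 2.0, 4.0, 8.0]
full, full_calls = pairwise_scores(values, toy_score)
half, half_calls = pairwise_scores(values, toy_score, is_symmetric=True)

print(np.allclose(full, half))   # same matrix either way
print(full_calls, half_calls)    # n*n calls versus n*(n+1)/2 calls
```

For n items the symmetric path needs n(n+1)/2 score evaluations instead of n², which approaches the factor of two mentioned above as n grows.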

calculate() → matchms.Scores.Scores

Calculate the similarity between all reference objects and all query objects using the most suitable available implementation of the given similarity_function. The advised way to calculate similarity scores is calculate_scores().

Deprecated since version 0.6.0: Calculate scores via calculate_scores() function.

property scores

Scores as a numpy array.

For example

import numpy as np
from matchms import calculate_scores, Scores, Spectrum
from matchms.similarity import IntersectMz

spectrum_1 = Spectrum(mz=np.array([100, 150, 200.]),
                      intensities=np.array([0.7, 0.2, 0.1]))
spectrum_2 = Spectrum(mz=np.array([100, 140, 190.]),
                      intensities=np.array([0.4, 0.2, 0.1]))
spectrums = [spectrum_1, spectrum_2]

scores = calculate_scores(spectrums, spectrums, IntersectMz()).scores

print(scores[0].dtype)
print(scores.shape)
print(scores)

Should output

float64
(2, 2)
[[1.  0.2]
 [0.2 1. ]]
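
The 0.2 off the diagonal can be reproduced by hand. The sketch below assumes that IntersectMz returns the number of shared m/z values divided by the number of distinct m/z values in both spectrums; mz_overlap is a hypothetical helper written for illustration, not part of matchms.

```python
def mz_overlap(mz_a, mz_b):
    """Hypothetical helper: shared m/z values divided by all distinct m/z values."""
    set_a, set_b = set(mz_a), set(mz_b)
    union = set_a | set_b
    if not union:
        return 0.0
    return len(set_a & set_b) / len(union)

mz_1 = [100.0, 150.0, 200.0]
mz_2 = [100.0, 140.0, 190.0]

# One row per reference, one column per query, as in the matrix above.
matrix = [[mz_overlap(a, b) for b in (mz_1, mz_2)] for a in (mz_1, mz_2)]
print(matrix)
```

The two spectrums share one m/z value (100) out of five distinct values, hence 1/5 = 0.2.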

scores_by_query(query: Union[List[object], Tuple[object], numpy.ndarray], sort: bool = False) → numpy.ndarray

Return all scores for the given query spectrum.

For example

import numpy as np
from matchms import calculate_scores, Scores, Spectrum
from matchms.similarity import CosineGreedy

spectrum_1 = Spectrum(mz=np.array([100, 150, 200.]),
                      intensities=np.array([0.7, 0.2, 0.1]),
                      metadata={'id': 'spectrum1'})
spectrum_2 = Spectrum(mz=np.array([100, 140, 190.]),
                      intensities=np.array([0.4, 0.2, 0.1]),
                      metadata={'id': 'spectrum2'})
spectrum_3 = Spectrum(mz=np.array([110, 140, 195.]),
                      intensities=np.array([0.6, 0.2, 0.1]),
                      metadata={'id': 'spectrum3'})
spectrum_4 = Spectrum(mz=np.array([100, 150, 200.]),
                      intensities=np.array([0.6, 0.1, 0.6]),
                      metadata={'id': 'spectrum4'})
references = [spectrum_1, spectrum_2, spectrum_3]
queries = [spectrum_2, spectrum_3, spectrum_4]

scores = calculate_scores(references, queries, CosineGreedy())
selected_scores = scores.scores_by_query(spectrum_4, sort=True)
print([x[1]["score"].round(3) for x in selected_scores])

Should output

[0.796, 0.613, 0.0]

Parameters
  • query – Single query Spectrum.

  • sort – Set to True to return the scores sorted, using the sort() method of the given similarity_function.

scores_by_reference(reference: Union[List[object], Tuple[object], numpy.ndarray], sort: bool = False) → numpy.ndarray

Return all scores for the given reference spectrum.

Parameters
  • reference – Single reference Spectrum.

  • sort – Set to True to return the scores sorted, using the sort() method of the given similarity_function.
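
How scores_by_query() and scores_by_reference() relate to the score matrix can be sketched with plain numpy. This assumes one row per reference and one column per query, and that sorting means highest score first; by_query and by_reference are hypothetical helpers, and the matrix holds illustrative values only (the last column mirrors the spectrum_4 output shown for scores_by_query()).

```python
import numpy as np

reference_ids = ["spectrum1", "spectrum2", "spectrum3"]
query_ids = ["spectrum2", "spectrum3", "spectrum4"]

# Illustrative values only: rows are references, columns are queries.
score_matrix = np.array([[0.83, 0.0, 0.796],
                         [1.0, 0.136, 0.613],
                         [0.136, 1.0, 0.0]])

def by_query(query_id, sort=False):
    """Hypothetical helper: one column, paired with the reference ids."""
    column = score_matrix[:, query_ids.index(query_id)]
    order = np.argsort(column)[::-1] if sort else np.arange(len(column))
    return [(reference_ids[i], float(column[i])) for i in order]

def by_reference(reference_id, sort=False):
    """Hypothetical helper: one row, paired with the query ids."""
    row = score_matrix[reference_ids.index(reference_id), :]
    order = np.argsort(row)[::-1] if sort else np.arange(len(row))
    return [(query_ids[i], float(row[i])) for i in order]

print(by_query("spectrum4", sort=True))
print(by_reference("spectrum2", sort=True))
```

Selecting by query walks down a column and returns references; selecting by reference walks along a row and returns queries.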