matchms.similarity.ModifiedCosine module

class matchms.similarity.ModifiedCosine.ModifiedCosine(tolerance: float = 0.1, intensity_power: float = 1.0, use_hungarian: bool = False, noise_cutoff: float = 0.01)[source]

Bases: BaseSimilarity

Calculate an approximate modified cosine score between mass spectra.

This is the central Modified Cosine class in matchms. The Modified Cosine score quantifies the similarity between two mass spectra. Two peaks are considered a potential match if their m/z values lie within the given tolerance, either directly or after a mass shift is applied. The mass shift is the difference in precursor m/z between the two spectra.
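The matching rule described above can be sketched in plain Python. This is an illustration only, not the matchms implementation: the function name potential_matches, the representation of peaks as (m/z, intensity) tuples, and the sign convention of the shift (precursor m/z of spectrum 1 minus that of spectrum 2, added to the peaks of spectrum 2) are all assumptions made for this example.

```python
def potential_matches(peaks_1, peaks_2, precursor_mz_1, precursor_mz_2, tolerance=0.1):
    """Return (i, j) index pairs of peaks that are potential matches.

    peaks_1 and peaks_2 are lists of (mz, intensity) tuples (hypothetical layout).
    """
    # Assumed convention: shift = precursor m/z of spectrum 1 minus spectrum 2.
    mass_shift = precursor_mz_1 - precursor_mz_2
    pairs = []
    for i, (mz_1, _) in enumerate(peaks_1):
        for j, (mz_2, _) in enumerate(peaks_2):
            # Direct match: m/z values within tolerance.
            direct = abs(mz_1 - mz_2) <= tolerance
            # Shifted match: within tolerance once the mass shift is applied.
            shifted = abs(mz_1 - (mz_2 + mass_shift)) <= tolerance
            if direct or shifted:
                pairs.append((i, j))
    return pairs


# Two toy spectra whose precursors differ by 5.0.
peaks_a = [(100.0, 0.7), (150.0, 0.2), (200.0, 1.0)]
peaks_b = [(100.0, 0.4), (145.0, 0.2), (195.0, 1.0)]
print(potential_matches(peaks_a, peaks_b, 205.0, 200.0))
```

In this toy input the first peak matches directly, while the other two only match after the shift of 5.0 is applied.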

Matchms provides several implementations of the Modified Cosine score. This class combines them with settings that we believe to be the best choice for most users.

By default, use_hungarian is set to False, which means that a greedy algorithm is used to find the best matches. This is typically faster than the Hungarian algorithm, and for most applications the results are very similar. If you need the exact optimal solution, set use_hungarian to True to use the Hungarian algorithm instead.
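To see why a greedy assignment can differ from the optimal one, consider the following sketch. It is not the matchms code: the weight matrix, the function names, and the brute-force search (used here in place of the Hungarian algorithm, and only feasible for tiny square matrices) are assumptions chosen to keep the example short.

```python
from itertools import permutations


def greedy_assignment(weights):
    """Take the highest-weight (row, column) pairs first, using each row and column once."""
    candidates = sorted(
        ((w, i, j) for i, row in enumerate(weights) for j, w in enumerate(row) if w > 0),
        reverse=True,
    )
    used_rows, used_cols, total = set(), set(), 0.0
    for w, i, j in candidates:
        if i not in used_rows and j not in used_cols:
            used_rows.add(i)
            used_cols.add(j)
            total += w
    return total


def optimal_assignment(weights):
    """Brute-force optimum over all column permutations (square matrix, tiny inputs only)."""
    n = len(weights)
    return max(
        sum(weights[i][perm[i]] for i in range(n)) for perm in permutations(range(n))
    )


# Hypothetical match weights: rows are peaks of spectrum 1, columns peaks of spectrum 2.
weights = [
    [0.9, 0.8],
    [0.8, 0.0],
]
print(greedy_assignment(weights), optimal_assignment(weights))
```

Here the greedy pass takes the 0.9 pair first and leaves the second row with nothing to match, while the optimal assignment pairs both rows for a larger total. Cases like this are the reason the two options can give different scores.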

For more conceptual context, see Watrous et al. [PNAS, 2012, https://www.pnas.org/content/109/26/E1743].

__init__(tolerance: float = 0.1, intensity_power: float = 1.0, use_hungarian: bool = False, noise_cutoff: float = 0.01)[source]

Initialize the modified cosine score class.

Parameters:

tolerance – Peaks are considered a match when their m/z values are <= tolerance apart. Default is 0.1.
intensity_power – The power to raise intensity to in the cosine function. Default is 1.
use_hungarian – Whether to use the Hungarian algorithm to find the best matches. Default is False, which means that the greedy algorithm is used instead. The greedy algorithm is typically faster than the Hungarian algorithm, and for most applications the results are very similar.
noise_cutoff – Minimum relative intensity for a peak to be considered. Default is 0.01. Only used if use_hungarian is False.
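As a rough illustration of how intensity_power and noise_cutoff could enter the calculation, the sketch below filters peaks by relative intensity and then computes a cosine score from already matched peak pairs. The helper names drop_noise and cosine_from_matches, the interpretation of relative intensity as intensity divided by the largest intensity in the spectrum, and the normalisation over all peaks are assumptions for this example, not the confirmed internals of the class.

```python
import math


def drop_noise(peaks, noise_cutoff=0.01):
    """Keep peaks whose intensity relative to the largest peak is >= noise_cutoff."""
    if not peaks:
        return []
    max_intensity = max(intensity for _, intensity in peaks)
    return [(mz, intensity) for mz, intensity in peaks
            if intensity >= noise_cutoff * max_intensity]


def cosine_from_matches(peaks_1, peaks_2, matched_pairs, intensity_power=1.0):
    """Cosine score from matched (i, j) index pairs, with intensities raised to a power."""
    p = intensity_power
    # Sum of intensity products over the matched peak pairs.
    numerator = sum(peaks_1[i][1] ** p * peaks_2[j][1] ** p for i, j in matched_pairs)
    # Normalise by the length of each full intensity vector.
    norm_1 = math.sqrt(sum(intensity ** (2 * p) for _, intensity in peaks_1))
    norm_2 = math.sqrt(sum(intensity ** (2 * p) for _, intensity in peaks_2))
    if norm_1 == 0 or norm_2 == 0:
        return 0.0
    return numerator / (norm_1 * norm_2)


peaks = [(100.0, 3.0), (200.0, 4.0)]
print(drop_noise([(100.0, 1.0), (150.0, 0.005)]))
print(cosine_from_matches(peaks, peaks, [(0, 0), (1, 1)]))
print(cosine_from_matches(peaks, peaks, [(1, 1)]))
```

A power below 1 reduces the dominance of the most intense peaks, while a higher noise_cutoff removes more low-intensity peaks before matching.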

property is_structured_score: bool

Return True if this similarity uses a structured score dtype.

matrix(spectra_1: Sequence[Spectrum], spectra_2: Sequence[Spectrum] | None = None, score_fields: Sequence[str] | None = None, progress_bar: bool = True, n_jobs: int = -1)[source]

Calculate matrix of Modified Cosine scores.

Parameters:

spectra_1 – First collection of input spectra.
spectra_2 – Second collection of input spectra. If None, compare spectra_1 against itself.
score_fields – Requested score fields. Only ("score",) is supported.
progress_bar – When True, show a progress bar.
n_jobs – Number of parallel jobs to run. Default is -1, which means that all available CPUs minus one will be used.

Returns:

Dense score matrix as a Scores object.

Return type:

Scores

pair(spectrum_1: Spectrum, spectrum_2: Spectrum) → tuple[float, int][source]

Calculate the approximate modified cosine score between two spectra.

sparse_matrix(spectra_1, spectra_2=None, idx_row=None, idx_col=None, score_fields=None, score_filter=None, progress_bar: bool = True)

Sparse score computation is not available for this similarity.

to_dict() → dict

Return a dictionary representation of the similarity function.