matchms.similarity.ModifiedCosineHungarian module

class matchms.similarity.ModifiedCosineHungarian.ModifiedCosineHungarian(tolerance: float = 0.1, mz_power: float = 0.0, intensity_power: float = 1.0)[source]

Bases: BaseSimilarityWithSparse

Calculate exact modified cosine score between mass spectra.

The modified cosine score quantifies the similarity between two mass spectra while allowing for an optional precursor-based mass shift. Candidate matches are all peak pairs whose m/z values lie within tolerance of each other, either unshifted or after shifting by precursor_mz(reference) - precursor_mz(query).
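The candidate-pair rule can be sketched in plain Python. This is only an illustration, not the library's implementation: the helper name candidate_pairs is hypothetical, and the direction in which the shift is applied (added to the query m/z before comparing with the reference) is an assumption.

```python
def candidate_pairs(ref_mz, query_mz, mass_shift, tolerance=0.1):
    """Return (ref_index, query_index) pairs that match directly or after shifting."""
    pairs = []
    for i, mz_r in enumerate(ref_mz):
        for j, mz_q in enumerate(query_mz):
            unshifted = abs(mz_r - mz_q) <= tolerance
            # Assumption: the shift is added to the query peak before comparing.
            shifted = abs(mz_r - (mz_q + mass_shift)) <= tolerance
            if unshifted or shifted:
                pairs.append((i, j))
    return pairs


ref_mz = [100.0, 150.0, 200.0]
query_mz = [100.0, 140.0, 190.0]
mass_shift = 210.0 - 200.0  # precursor_mz(reference) - precursor_mz(query)
pairs = candidate_pairs(ref_mz, query_mz, mass_shift)
```

Here the first peak pair matches unshifted, while the other two only match once the precursor difference of 10 is taken into account.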

Peak assignment is solved globally via Hungarian assignment (linear sum assignment), which yields an exact one-to-one maximum-weight matching.
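To see why a global assignment can differ from a greedy one, consider the toy weight matrix below. The sketch uses brute force over permutations as a stand-in for a Hungarian solver, so it is only practical for tiny inputs and assumes no more rows than columns. For realistic sizes a routine such as SciPy's linear_sum_assignment would normally be used. Both function names are hypothetical.

```python
from itertools import permutations


def greedy_matching(weights):
    """Repeatedly take the largest remaining weight, using each row and column once."""
    cells = sorted(
        ((w, i, j) for i, row in enumerate(weights) for j, w in enumerate(row)),
        reverse=True,
    )
    used_rows, used_cols, total = set(), set(), 0.0
    for w, i, j in cells:
        if w > 0 and i not in used_rows and j not in used_cols:
            used_rows.add(i)
            used_cols.add(j)
            total += w
    return total


def exact_matching(weights):
    """Brute-force the best one-to-one assignment (assumes rows <= columns)."""
    n_rows, n_cols = len(weights), len(weights[0])
    best = 0.0
    for cols in permutations(range(n_cols), n_rows):
        best = max(best, sum(weights[i][j] for i, j in enumerate(cols)))
    return best


# Rows are peaks of one spectrum, columns peaks of the other;
# entries are products of peak weights for candidate matches (0 = no match).
weights = [
    [0.9, 0.8],
    [0.8, 0.0],
]
greedy_total = greedy_matching(weights)  # takes 0.9 first, which blocks both 0.8 entries
exact_total = exact_matching(weights)    # pairs the two 0.8 entries instead
```

The greedy pass commits to the single largest product and ends with a lower total than the exact one-to-one matching.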

See Watrous et al. [PNAS, 2012, https://www.pnas.org/content/109/26/E1743] for the modified cosine concept.

__init__(tolerance: float = 0.1, mz_power: float = 0.0, intensity_power: float = 1.0)[source]

Initialize exact modified cosine.

Parameters:
  • tolerance – Two peaks are considered a match when their m/z values differ by at most tolerance. Default is 0.1.

  • mz_power – Exponent applied to the m/z values in the cosine function. The default is 0, in which case the peak intensity products do not depend on the m/z values.

  • intensity_power – The power to raise intensity to in the cosine function. The default is 1.
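As a rough illustration of how the two exponents interact, the sketch below weights each peak as mz ** mz_power * intensity ** intensity_power and normalizes the summed products of matched peaks by the norms of both weight vectors. That is the usual cosine-score convention and is assumed here rather than taken from the implementation. The helper names are hypothetical.

```python
import math


def peak_weights(mz, intensities, mz_power=0.0, intensity_power=1.0):
    """Weight each peak by mz ** mz_power * intensity ** intensity_power."""
    return [m ** mz_power * i ** intensity_power for m, i in zip(mz, intensities)]


def cosine_from_matches(weights_1, weights_2, matches):
    """Sum the weight products of matched peaks and normalize by both vector norms."""
    numerator = sum(weights_1[i] * weights_2[j] for i, j in matches)
    norm_1 = math.sqrt(sum(w * w for w in weights_1))
    norm_2 = math.sqrt(sum(w * w for w in weights_2))
    return numerator / (norm_1 * norm_2)


mz = [100.0, 200.0]
intensities = [1.0, 0.5]

default_weights = peak_weights(mz, intensities)             # mz_power=0: m/z is ignored
mz_weighted = peak_weights(mz, intensities, mz_power=1.0)   # high-m/z peaks gain weight

# A spectrum matched peak-by-peak against itself scores 1 (up to rounding).
self_score = cosine_from_matches(default_weights, default_weights, [(0, 0), (1, 1)])
```

With mz_power=0 the weights equal the intensities. With mz_power=1 the weaker peak at m/z 200 ends up with the same weight as the stronger peak at m/z 100.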

property is_structured_score: bool

Return True if this similarity uses a structured score dtype.

keep_score(score) → bool

Return whether a score should be retained in sparse outputs.

This defines the default sparse retention behavior. Users can override it per call via score_filter=....

Default behavior:
  • scalar score: keep if score != 0

  • structured score: keep if all fields are non-zero
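The default retention rule can be mimicked with a few lines of Python. This is a sketch only: a plain dict stands in for a structured score, and the field names "score" and "matches" are assumptions made for illustration.

```python
def default_keep(score):
    """Mimic the documented default: non-zero scalar, or all fields non-zero."""
    if isinstance(score, dict):  # stand-in for a structured score
        return all(value != 0 for value in score.values())
    return score != 0


keep_scalar_zero = default_keep(0.0)
keep_scalar_nonzero = default_keep(0.3)
keep_structured_partial = default_keep({"score": 0.5, "matches": 0})
keep_structured_full = default_keep({"score": 0.5, "matches": 3})
```

Under this rule a structured score is dropped as soon as a single field is zero, even if the others are not.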

matrix(spectra_1: Sequence[Spectrum], spectra_2: Sequence[Spectrum] | None = None, score_fields: Sequence[str] | None = None, progress_bar: bool = True)

Calculate a dense similarity matrix.

Parameters:
  • spectra_1 – First collection of spectra.

  • spectra_2 – Second collection of spectra. If None, compare spectra_1 against itself. For commutative similarities this automatically uses a symmetric optimization.

  • score_fields – Score fields to return.
      - None means return all available fields.
      - For scalar scores, only ("score",) is valid.
      - For structured scores, this can be a subset such as ("score",).

  • progress_bar – When True, show a progress bar. Default is True.
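The symmetric optimization mentioned for spectra_2=None can be pictured as follows: for a commutative similarity, only the upper triangle needs to be computed and each value is mirrored. This is a minimal sketch with a hypothetical toy similarity on numbers, not the library's code.

```python
def symmetric_matrix(items, similarity):
    """Fill an all-vs-all matrix while evaluating each unordered pair only once."""
    n = len(items)
    scores = [[0.0] * n for _ in range(n)]
    calls = 0
    for i in range(n):
        for j in range(i, n):
            # similarity(a, b) == similarity(b, a), so mirror the value.
            scores[i][j] = scores[j][i] = similarity(items[i], items[j])
            calls += 1
    return scores, calls


def toy_similarity(a, b):
    """Hypothetical commutative similarity on plain numbers."""
    return 1.0 / (1.0 + abs(a - b))


scores, calls = symmetric_matrix([1.0, 2.0, 4.0], toy_similarity)
```

For three items this needs six evaluations instead of nine, and the saving approaches a factor of two as the collection grows.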

Returns:

Dense score result wrapped in a Scores container.

Return type:

Scores

pair(spectrum_1: Spectrum, spectrum_2: Spectrum) → tuple[float, int][source]

Calculate exact modified cosine score between two spectra.

sparse_matrix(spectra_1: Sequence[Spectrum], spectra_2: Sequence[Spectrum] | None = None, idx_row: ArrayLike | None = None, idx_col: ArrayLike | None = None, score_fields: Sequence[str] | None = None, score_filter: Callable[[ndarray], bool] | None = None, progress_bar: bool = True)

Calculate sparse similarity results.

Filtering is applied to the full score before score field projection.

Parameters:
  • spectra_1 – First collection of spectra.

  • spectra_2 – Second collection of spectra. If None, compare spectra_1 against itself.

  • idx_row – Row indices of pairs to compute. If None and idx_col is also None, all pairwise comparisons are considered and only retained scores are stored.

  • idx_col – Column indices of pairs to compute. Must have the same shape as idx_row.

  • score_fields – Score fields to return.
      - None means return all available fields.
      - For scalar scores, only ("score",) is valid.
      - For structured scores, this can be a subset such as ("score",).

  • score_filter – Optional callable receiving the full score and returning whether it should be retained. If None, keep_score() is used.

  • progress_bar – When True, show a progress bar. Default is True.
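The interplay of idx_row, idx_col, score_filter and score_fields can be sketched without the library. The example below scores only the requested index pairs, applies the filter to the full score, and projects to the requested fields afterwards. All names are hypothetical, a dict stands in for a structured score, and the "matches" field is an assumption.

```python
def sparse_scores(items_1, items_2, similarity, idx_row, idx_col,
                  score_filter, score_fields=None):
    """Score only the requested (row, column) pairs and keep those passing the filter."""
    if len(idx_row) != len(idx_col):
        raise ValueError("idx_row and idx_col must have the same length")
    kept = {}
    for row, col in zip(idx_row, idx_col):
        full_score = similarity(items_1[row], items_2[col])
        # The filter sees the full score, before any field projection.
        if not score_filter(full_score):
            continue
        fields = score_fields or tuple(full_score)
        kept[(row, col)] = {name: full_score[name] for name in fields}
    return kept


def toy_similarity(peaks_a, peaks_b):
    """Hypothetical structured score: shared-peak fraction plus match count."""
    shared = len(set(peaks_a) & set(peaks_b))
    return {"score": shared / max(len(peaks_a), len(peaks_b)), "matches": shared}


def at_least_two_matches(score):
    """Example filter: keep a pair only if two or more peaks matched."""
    return score["matches"] >= 2


peaks_1 = [[100, 150, 200], [120, 130]]
peaks_2 = [[100, 150, 210], [120, 300], [500]]

result = sparse_scores(
    peaks_1, peaks_2, toy_similarity,
    idx_row=[0, 1, 1], idx_col=[0, 1, 2],
    score_filter=at_least_two_matches,
    score_fields=("score",),
)
```

Only the pair (0, 0) survives. The filter could still use the match count even though just the "score" field is returned, which is what filtering before projection makes possible.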

Returns:

Sparse score result wrapped in a Scores container.

Return type:

Scores

to_dict() → dict

Return a dictionary representation of the similarity function.