matchms.similarity.cosine_linear_functions module
- matchms.similarity.cosine_linear_functions.linear_cosine_score(spec1, spec2, tolerance, mz_power, intensity_power)
Compute the CosineLinear similarity between two well-separated spectra.
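Within each spectrum, consecutive peaks must be more than 2 * tolerance apart in m/z (as ensured by sirius_merge_close_peaks), so each peak can match at most one peak in the other spectrum. Matching therefore needs only a single O(N + M) two-pointer sweep.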
- Parameters:
spec1 – 2D array (N, 2) with columns [mz, intensity], sorted ascending m/z.
spec2 – 2D array (M, 2) with columns [mz, intensity], sorted ascending m/z.
tolerance – Maximum allowed difference between m/z values for a match.
mz_power – Power to raise m/z values to.
intensity_power – Power to raise intensity values to.
- Returns:
score (float) – Cosine similarity score.
matches (int) – Number of matched peak pairs.
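The two-pointer sweep described above can be sketched as follows. This is a minimal illustration, not the matchms implementation: the function name two_pointer_cosine_sketch, the default powers, and the weighting and normalization (peak weight mz ** mz_power * intensity ** intensity_power, divided by the product of both weight-vector norms) are assumptions made for the example.

```python
import numpy as np


def two_pointer_cosine_sketch(spec1, spec2, tolerance,
                              mz_power=0.0, intensity_power=1.0):
    """Hypothetical sketch of an O(N + M) sweep over two sorted peak lists."""
    # Assumed peak weighting: mz ** mz_power * intensity ** intensity_power.
    w1 = spec1[:, 0] ** mz_power * spec1[:, 1] ** intensity_power
    w2 = spec2[:, 0] ** mz_power * spec2[:, 1] ** intensity_power

    i = j = matches = 0
    dot = 0.0
    while i < len(spec1) and j < len(spec2):
        diff = spec1[i, 0] - spec2[j, 0]
        if abs(diff) <= tolerance:
            # Gaps > 2 * tolerance mean neither peak can match anything else,
            # so both pointers advance.
            dot += w1[i] * w2[j]
            matches += 1
            i += 1
            j += 1
        elif diff < 0:
            i += 1  # peak in spec1 is too low to match; move on
        else:
            j += 1  # peak in spec2 is too low to match; move on

    # Assumed normalization: product of the Euclidean norms of both weight vectors.
    norm = np.sqrt(np.sum(w1 ** 2) * np.sum(w2 ** 2))
    score = dot / norm if norm > 0 else 0.0
    return float(score), matches


spec_a = np.array([[100.0, 1.0], [200.0, 2.0]])
spec_b = np.array([[100.005, 1.0], [300.0, 2.0]])
score, matches = two_pointer_cosine_sketch(spec_a, spec_b, tolerance=0.01)
print(score, matches)  # 0.2 1
```

Only the peaks near m/z 100 fall within the tolerance, so the dot product is 1 and each weight vector has squared norm 5, which gives 1 / 5 = 0.2 with one matched pair.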
- matchms.similarity.cosine_linear_functions.sirius_merge_close_peaks(spec, mz_tolerance)
Merge close peaks following the SIRIUS (Böcker lab) algorithm.
Peaks are processed greedily in descending intensity order. Each peak that has not yet been consumed becomes a representative: it keeps its own m/z and sums the intensities of all unconsumed peaks within a merge window of 2 * mz_tolerance around it. The result is guaranteed to have consecutive m/z gaps > 2 * mz_tolerance. When multiple peaks share the same intensity, the lower-m/z peak is processed first, so the choice of representative peak is deterministic across NumPy and Numba sort implementations.
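The greedy procedure above can be sketched in plain NumPy. This is an illustrative sketch under stated assumptions, not the library's code: the name greedy_merge_sketch is hypothetical, the merge window is taken as inclusive, and the representative peak's own intensity is assumed to be part of the sum.

```python
import numpy as np


def greedy_merge_sketch(spec, mz_tolerance):
    """Hypothetical sketch of greedy merging; spec is (N, 2), sorted by m/z."""
    mz = spec[:, 0]
    intensity = spec[:, 1]
    window = 2.0 * mz_tolerance

    # lexsort uses the LAST key as primary: descending intensity first,
    # then ascending m/z to break ties deterministically.
    order = np.lexsort((mz, -intensity))

    consumed = np.zeros(len(mz), dtype=bool)
    merged = []
    for idx in order:
        if consumed[idx]:
            continue
        # Index range of all peaks inside the (assumed inclusive) merge window.
        lo = np.searchsorted(mz, mz[idx] - window, side="left")
        hi = np.searchsorted(mz, mz[idx] + window, side="right")
        neighbors = np.arange(lo, hi)[~consumed[lo:hi]]
        # The representative keeps its own m/z and takes the summed intensity.
        merged.append((mz[idx], intensity[neighbors].sum()))
        consumed[neighbors] = True

    merged.sort()  # restore ascending m/z order
    return np.array(merged, dtype=float).reshape(-1, 2)


peaks = np.array([
    [100.00, 5.0],
    [100.01, 10.0],
    [100.02, 3.0],
    [200.00, 7.0],
])
result = greedy_merge_sketch(peaks, mz_tolerance=0.01)
print(result)  # two peaks remain: (100.01, 18.0) and (200.0, 7.0)
```

The most intense peak at m/z 100.01 is processed first and absorbs both neighbors within 0.02, so the three close peaks collapse into one peak of intensity 18 while the isolated peak at m/z 200 is left unchanged.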
- Parameters:
spec – 2D array (N, 2) with columns [mz, intensity], sorted by ascending m/z.
mz_tolerance – m/z tolerance that will be used for scoring. Peaks are merged within a window of 2 * mz_tolerance.
- Returns:
(M, 2) array of merged peaks sorted by ascending m/z.
- Return type: