matchms.similarity.NeutralLossesCosine module

class matchms.similarity.NeutralLossesCosine.NeutralLossesCosine(tolerance: float = 0.1, mz_power: float = 0.0, intensity_power: float = 1.0, ignore_peaks_above_precursor: bool = True)

Bases: BaseSimilarity

Calculate ‘neutral losses cosine score’ between mass spectra.

The neutral losses cosine score quantifies the similarity between two mass spectra. It is calculated by finding the best possible matches between the peaks of the two spectra. Two peaks are considered a potential match if their m/z values lie within the given ‘tolerance’ once a mass shift is applied. The mass shift is the difference in precursor m/z between the two spectra. In general, ModifiedCosine is recommended over NeutralLossesCosine because it will, on average, deliver more reliable results.
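The matching rule described above can be illustrated with a minimal, standard-library-only sketch. It is not the matchms implementation; the function name `neutral_loss_candidates` and its arguments are hypothetical, and details such as how the library treats the precursor peak itself may differ. Applying the mass shift to the m/z values is equivalent to comparing neutral losses (precursor m/z minus fragment m/z), which is what the sketch does.

```python
def neutral_loss_candidates(mz_ref, precursor_ref, mz_query, precursor_query, tolerance=0.1):
    """Return index pairs (i, j) whose neutral losses differ by at most `tolerance`.

    Hypothetical helper for illustration only; not part of matchms.
    """
    # Neutral loss of a peak = precursor m/z minus fragment m/z.
    losses_ref = [precursor_ref - mz for mz in mz_ref]
    losses_query = [precursor_query - mz for mz in mz_query]

    pairs = []
    for i, loss_r in enumerate(losses_ref):
        for j, loss_q in enumerate(losses_query):
            # Equivalent to |mz_ref - (mz_query + mass_shift)| <= tolerance,
            # where mass_shift = precursor_ref - precursor_query.
            if abs(loss_r - loss_q) <= tolerance:
                pairs.append((i, j))
    return pairs


# Two toy spectra whose precursors differ by 10 m/z units.
pairs = neutral_loss_candidates(
    [100.0, 150.0, 200.0], 200.0,
    [100.0, 140.0, 190.0], 190.0,
    tolerance=0.2,
)
print(pairs)  # [(1, 1), (2, 2)]
```

The peaks at 100.0 share the same m/z but not the same neutral loss (100 versus 90), so they are not candidates here.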

__init__(tolerance: float = 0.1, mz_power: float = 0.0, intensity_power: float = 1.0, ignore_peaks_above_precursor: bool = True)
Parameters:
  • tolerance – Peaks will be considered a match when <= tolerance apart. Default is 0.1.

  • mz_power – The power to raise mz to in the cosine function. The default is 0, in which case the peak intensity products will not depend on the m/z ratios.

  • intensity_power – The power to raise intensity to in the cosine function. The default is 1.

  • ignore_peaks_above_precursor – By default this is set to True, meaning that peaks with m/z values larger than the precursor-m/z will be ignored (since those would correspond to negative “neutral losses”).
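As a rough illustration of how these parameters interact, the sketch below filters peaks above the precursor, weights each peak as mz**mz_power * intensity**intensity_power, and computes a cosine-style score from a list of candidate index pairs. The greedy one-to-one assignment and the normalisation over all peaks are assumptions made for this sketch, and all function names are hypothetical; the library's actual code may differ.

```python
import math


def drop_above_precursor(mz, intensities, precursor_mz):
    """Keep only peaks with m/z not larger than the precursor m/z."""
    kept = [(m, i) for m, i in zip(mz, intensities) if m <= precursor_mz]
    return [m for m, _ in kept], [i for _, i in kept]


def weighted(mz, intensities, mz_power=0.0, intensity_power=1.0):
    """Weight each peak as mz**mz_power * intensity**intensity_power."""
    return [(m ** mz_power) * (i ** intensity_power) for m, i in zip(mz, intensities)]


def cosine_from_pairs(w_ref, w_query, pairs):
    """Return (score, number of matches) for candidate index pairs.

    Assumption of this sketch: pairs are assigned greedily, largest weight
    product first, and every peak is used at most once.
    """
    ranked = sorted(pairs, key=lambda p: w_ref[p[0]] * w_query[p[1]], reverse=True)
    used_ref, used_query = set(), set()
    total, n_matches = 0.0, 0
    for i, j in ranked:
        if i in used_ref or j in used_query:
            continue
        used_ref.add(i)
        used_query.add(j)
        total += w_ref[i] * w_query[j]
        n_matches += 1
    # Normalise by the vector norms of all weighted peaks in each spectrum.
    norm = math.sqrt(sum(w * w for w in w_ref)) * math.sqrt(sum(w * w for w in w_query))
    return (total / norm if norm > 0 else 0.0), n_matches


# With the default mz_power of 0 the weights equal the intensities.
w_ref = weighted([100.0, 150.0, 200.0], [0.7, 0.2, 0.1])
w_query = weighted([100.0, 140.0, 190.0], [0.4, 0.2, 0.1])
score, n_matches = cosine_from_pairs(w_ref, w_query, [(1, 1), (2, 2)])
print(f"score={score:.3f}, matches={n_matches}")
```

Raising mz_power above 0 would give high-m/z peaks more influence on the score, while the default leaves the score dependent on intensities only.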

keep_score(score)

In the .matrix method, scores are collected in a sparse way, so only the scores for which this method returns True are stored. Override this method if values other than False or 0 should also be left out of the final collection.
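The idea of a score filter can be sketched as follows. The classes below are hypothetical stand-ins written for this illustration and do not derive from BaseSimilarity; they only show how an overridden keep_score changes which values end up in a sparse collection.

```python
class ToySimilarity:
    """Hypothetical stand-in for a similarity class; not part of matchms."""

    def keep_score(self, score):
        # Default-like behaviour described above: drop False or 0.
        return score != 0


class ThresholdedSimilarity(ToySimilarity):
    """Variant that also drops scores below a minimum value."""

    def __init__(self, min_score):
        self.min_score = min_score

    def keep_score(self, score):
        return score >= self.min_score


def collect_sparse(scores, similarity):
    """Return (row, col, value) triples for the scores that should be stored."""
    return [
        (r, c, value)
        for r, row in enumerate(scores)
        for c, value in enumerate(row)
        if similarity.keep_score(value)
    ]


dense = [[0.0, 0.3], [0.9, 0.0]]
print(collect_sparse(dense, ToySimilarity()))             # [(0, 1, 0.3), (1, 0, 0.9)]
print(collect_sparse(dense, ThresholdedSimilarity(0.5)))  # [(1, 0, 0.9)]
```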

matrix(references: List[Spectrum], queries: List[Spectrum], array_type: str = 'numpy', is_symmetric: bool = False) → ndarray

Optional: Provide an optimized method to calculate an np.array of similarity scores for the given reference and query spectra. If no method is added here, the following naive implementation (i.e. a double for-loop) is used.

Parameters:
  • references – List of reference objects

  • queries – List of query objects

  • array_type – Specify the output array type. Can be “numpy” or “sparse”. Default is “numpy” and will return a numpy array. “sparse” will return a COO-sparse array.

  • is_symmetric – Set to True when references and queries are identical (as for instance for an all-vs-all comparison). By using the fact that score[i,j] = score[j,i] the calculation will be about 2x faster.
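The double for-loop and the symmetry shortcut can be sketched in plain Python. This is an illustrative approximation using nested lists rather than numpy arrays; `score_matrix` and `pair_fn` are hypothetical names, and the library's own loop may be organised differently.

```python
def score_matrix(references, queries, pair_fn, is_symmetric=False):
    """Fill a len(references) x len(queries) score matrix with a double for-loop.

    When is_symmetric is True, only the upper triangle is computed and each
    value is mirrored, using score[i][j] == score[j][i].
    """
    n_rows, n_cols = len(references), len(queries)
    if is_symmetric and n_rows != n_cols:
        raise ValueError("is_symmetric requires equally sized inputs")

    scores = [[0.0] * n_cols for _ in range(n_rows)]
    for i in range(n_rows):
        start = i if is_symmetric else 0
        for j in range(start, n_cols):
            scores[i][j] = pair_fn(references[i], queries[j])
            if is_symmetric:
                scores[j][i] = scores[i][j]
    return scores


# Toy "spectra" are plain numbers here; the pair function is their product.
items = [1, 2, 3]
print(score_matrix(items, items, lambda a, b: float(a * b), is_symmetric=True))
```

For n spectra the symmetric variant calls the pair function n(n+1)/2 times instead of n² times, which is where the roughly twofold speed-up comes from.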

pair(reference: Spectrum, query: Spectrum) → Tuple[float, int]

Calculate neutral losses cosine score between two spectra.

Parameters:
  • reference – Single reference spectrum.

  • query – Single query spectrum.

Return type:

Tuple with cosine score and number of matched peaks.

sparse_array(references: List[Spectrum], queries: List[Spectrum], idx_row, idx_col, is_symmetric: bool = False)

Optional: Provide an optimized method to calculate a sparse matrix of similarity scores.

Compute similarity scores for pairs of reference and query spectra as given by the indices idx_row (references) and idx_col (queries). If no method is added here, the following naive implementation (i.e. a for-loop) is used.

Parameters:
  • references – List of reference objects

  • queries – List of query objects

  • idx_row – List/array of row indices

  • idx_col – List/array of column indices

  • is_symmetric – Set to True when references and queries are identical (as for instance for an all-vs-all comparison). By using the fact that score[i,j] = score[j,i] the calculation will be about 2x faster.
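A minimal sketch of the index-driven for-loop is shown below. The function name `scores_at_indices` and the `pair_fn` argument are hypothetical, and the is_symmetric handling is left out for brevity; the sketch only shows that scores are computed for the requested (row, column) pairs and nothing else.

```python
def scores_at_indices(references, queries, idx_row, idx_col, pair_fn):
    """Compute one score per (idx_row[k], idx_col[k]) pair with a single for-loop."""
    if len(idx_row) != len(idx_col):
        raise ValueError("idx_row and idx_col must have the same length")

    values = []
    for row, col in zip(idx_row, idx_col):
        values.append(pair_fn(references[row], queries[col]))
    return values


# Toy inputs: plain numbers instead of spectra, with the sum as "score".
values = scores_at_indices([1, 2, 3], [10, 20], [0, 2], [1, 0], lambda a, b: a + b)
print(values)  # [21, 13]
```

The returned values line up with idx_row and idx_col, so the three sequences together describe a sparse matrix in coordinate form.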

to_dict() → dict

Return a dictionary representation of a similarity function.