matchms.similarity.PrecursorMzMatch module
- class matchms.similarity.PrecursorMzMatch.PrecursorMzMatch(tolerance: float = 0.1, tolerance_type: str = 'Dalton')
Bases: BaseSimilarity
Return True if the spectrums match in precursor m/z (within tolerance), and False otherwise. The match can be evaluated as an absolute m/z difference (tolerance_type="Dalton") or as a relative difference in ppm (tolerance_type="ppm").
Example to calculate scores between two references and two queries and iterate over the scores:
    import numpy as np
    from matchms import calculate_scores
    from matchms import Spectrum
    from matchms.similarity import PrecursorMzMatch

    spectrum_1 = Spectrum(mz=np.array([]),
                          intensities=np.array([]),
                          metadata={"id": "1", "precursor_mz": 100})
    spectrum_2 = Spectrum(mz=np.array([]),
                          intensities=np.array([]),
                          metadata={"id": "2", "precursor_mz": 110})
    spectrum_3 = Spectrum(mz=np.array([]),
                          intensities=np.array([]),
                          metadata={"id": "3", "precursor_mz": 103})
    spectrum_4 = Spectrum(mz=np.array([]),
                          intensities=np.array([]),
                          metadata={"id": "4", "precursor_mz": 111})
    references = [spectrum_1, spectrum_2]
    queries = [spectrum_3, spectrum_4]

    similarity_score = PrecursorMzMatch(tolerance=5.0, tolerance_type="Dalton")
    scores = calculate_scores(references, queries, similarity_score)

    for (reference, query, score) in scores:
        print(f"Precursor m/z match between {reference.get('id')} and {query.get('id')}" +
              f" is {score}")
Should output:
    Precursor m/z match between 1 and 3 is [1.0]
    Precursor m/z match between 2 and 4 is [1.0]
- __init__(tolerance: float = 0.1, tolerance_type: str = 'Dalton')
- Parameters:
tolerance – Specify the tolerance below which two m/z values are counted as a match.
tolerance_type – Choose between a fixed tolerance in Dalton ("Dalton") or a relative difference in ppm ("ppm").
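The two tolerance modes can be illustrated with a small standalone sketch. It does not call matchms: the helper name `precursor_mz_match` is hypothetical, and the ppm branch assumes that the relative difference is measured against the mean of the two precursor m/z values and that the comparison is inclusive. The library's exact implementation may differ on both points.

```python
def precursor_mz_match(mz_ref: float, mz_query: float,
                       tolerance: float = 0.1,
                       tolerance_type: str = "Dalton") -> bool:
    """Hypothetical helper: True if two precursor m/z values agree within tolerance."""
    diff = abs(mz_ref - mz_query)
    if tolerance_type == "Dalton":
        # Absolute difference in m/z units.
        return diff <= tolerance
    if tolerance_type == "ppm":
        # Assumption: relative difference taken against the mean of both values.
        mean_mz = (mz_ref + mz_query) / 2
        return diff / mean_mz * 1e6 <= tolerance
    raise ValueError("tolerance_type must be 'Dalton' or 'ppm'")


print(precursor_mz_match(100.0, 103.0, tolerance=5.0))   # 3 Da apart, 5 Da allowed
print(precursor_mz_match(100.0, 110.0, tolerance=5.0))   # 10 Da apart, 5 Da allowed
print(precursor_mz_match(500.0, 500.004, tolerance=10.0,
                         tolerance_type="ppm"))          # roughly 8 ppm apart
```

A ppm tolerance scales with the m/z value, so the same setting is stricter in absolute terms for small precursors than for large ones.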
- keep_score(score)
In the .matrix method, scores are collected in a sparse way. Override this method if values other than False or 0 should also be excluded from the final collection.
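As a rough illustration of the idea, the sketch below passes each score through a `keep_score` hook before storing it as a (row, column, value) triple. The `collect_sparse` function and the default rule `score != 0` are assumptions made for this example, not the library's implementation.

```python
def keep_score(score) -> bool:
    """Assumed default rule: store everything except False / 0."""
    return score != 0


def collect_sparse(dense_scores, keep=keep_score):
    """Hypothetical collector: keep only the entries accepted by `keep`."""
    entries = []
    for i, row in enumerate(dense_scores):
        for j, score in enumerate(row):
            if keep(score):
                entries.append((i, j, score))
    return entries


dense = [[1.0, 0.0],
         [0.0, 1.0]]
print(collect_sparse(dense))  # [(0, 0, 1.0), (1, 1, 1.0)]
```

Passing a stricter rule, for example `keep=lambda s: s > 0.5`, mimics what overriding the method would achieve for a score with a continuous range.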
- matrix(references: List[Spectrum], queries: List[Spectrum], array_type: str = 'numpy', is_symmetric: bool = False) → ndarray
Compare precursor m/z between all references and queries.
- Parameters:
references – List/array of reference spectrums.
queries – List/array of query spectrums.
array_type – Specify the output array type. Can be "numpy" or "sparse". Default is "numpy" and will return a numpy array. "sparse" will return a COO-sparse array.
is_symmetric – Set to True when references and queries are identical (as for instance for an all-vs-all comparison). By using the fact that score[i,j] = score[j,i] the calculation will be about 2x faster.
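The symmetry shortcut can be sketched in plain Python as follows. This is a simplified stand-in that works on bare precursor m/z values with a Dalton tolerance. The function name `match_matrix` is hypothetical, and the real method operates on Spectrum objects and returns a numpy or sparse array rather than nested lists.

```python
def match_matrix(ref_mzs, query_mzs, tolerance=0.1, is_symmetric=False):
    """Hypothetical stand-in: nested lists of 1.0/0.0 match scores."""
    if is_symmetric and ref_mzs != query_mzs:
        raise ValueError("is_symmetric requires identical references and queries")
    n_rows, n_cols = len(ref_mzs), len(query_mzs)
    scores = [[0.0] * n_cols for _ in range(n_rows)]
    for i in range(n_rows):
        # With symmetry, only the upper triangle (j >= i) is computed.
        start = i if is_symmetric else 0
        for j in range(start, n_cols):
            score = float(abs(ref_mzs[i] - query_mzs[j]) <= tolerance)
            scores[i][j] = score
            if is_symmetric:
                scores[j][i] = score  # mirror, since score[i,j] = score[j,i]
    return scores


mzs = [100.0, 103.0, 110.0]
print(match_matrix(mzs, mzs, tolerance=5.0, is_symmetric=True))
# [[1.0, 1.0, 0.0], [1.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
```

For n spectrums the symmetric path evaluates n(n+1)/2 pairs instead of n², which is where the roughly twofold saving comes from.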
- pair(reference: Spectrum, query: Spectrum) → float
Compare precursor m/z between reference and query spectrum.
- Parameters:
reference – Single reference spectrum.
query – Single query spectrum.
- sparse_array(references: List[Spectrum], queries: List[Spectrum], idx_row, idx_col, is_symmetric: bool = False)
Optional: Provide an optimized method to calculate a sparse matrix of similarity scores.
Compute similarity scores for pairs of reference and query spectrums as given by the indices idx_row (references) and idx_col (queries). If no optimized method is provided here, a naive implementation (i.e. a for-loop over the index pairs) is used.
- Parameters:
references – List of reference objects
queries – List of query objects
idx_row – List/array of row indices
idx_col – List/array of column indices
is_symmetric – Set to True when references and queries are identical (as for instance for an all-vs-all comparison). By using the fact that score[i,j] = score[j,i] the calculation will be about 2x faster.
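A naive for-loop over the index pairs might look like the sketch below. It is a simplified stand-in operating on bare precursor m/z values with a Dalton tolerance, not the library's code; the function name `sparse_scores` is hypothetical.

```python
def sparse_scores(ref_mzs, query_mzs, idx_row, idx_col, tolerance=0.1):
    """Hypothetical stand-in: score only the (row, col) pairs requested."""
    if len(idx_row) != len(idx_col):
        raise ValueError("idx_row and idx_col must have the same length")
    # idx_row indexes the references, idx_col indexes the queries.
    return [float(abs(ref_mzs[i] - query_mzs[j]) <= tolerance)
            for i, j in zip(idx_row, idx_col)]


refs = [100.0, 110.0]
queries = [103.0, 111.0]
print(sparse_scores(refs, queries, idx_row=[0, 1], idx_col=[0, 1], tolerance=5.0))
# [1.0, 1.0]
```

Only the requested pairs are evaluated, so the cost grows with the number of index pairs rather than with the full references-by-queries grid.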