matchms.similarity.IntersectMz module¶
- class matchms.similarity.IntersectMz.IntersectMz(scaling: float = 1.0)[source]¶
Bases:
BaseSimilarity
Example score for illustrating how to build custom spectra similarity score.
IntersectMz will count all exact matches of peaks and divide it by all unique peaks found in both spectrums.
Example of how matchms similarity functions can be used:
import numpy as np from matchms import Spectrum from matchms.similarity import IntersectMz spectrum_1 = Spectrum(mz=np.array([100, 150, 200.]), intensities=np.array([0.7, 0.2, 0.1])) spectrum_2 = Spectrum(mz=np.array([100, 140, 190.]), intensities=np.array([0.4, 0.2, 0.1])) # Construct a similarity function similarity_measure = IntersectMz(scaling=1.0) score = similarity_measure.pair(spectrum_1, spectrum_2) print(f"IntersectMz score is {score:.2f}")
Should output
IntersectMz score is 0.20
- __init__(scaling: float = 1.0)[source]¶
Constructor. Here, function parameters are defined.
- Parameters:
scaling – Scale scores to maximum possible score being ‘scaling’.
- keep_score(score)¶
In the .matrix method scores will be collected in a sparse way. Overwrite this method here if values other than False or 0 should not be stored in the final collection.
- matrix(references: List[Spectrum], queries: List[Spectrum], array_type: str = 'numpy', is_symmetric: bool = False) ndarray ¶
Optional: Provide optimized method to calculate an np.array of similarity scores for given reference and query spectrums. If no method is added here, the following naive implementation (i.e. a double for-loop) is used.
- Parameters:
references – List of reference objects
queries – List of query objects
array_type – Specify the output array type. Can be “numpy” or “sparse”. Default is “numpy” and will return a numpy array. “sparse” will return a COO-sparse array.
is_symmetric – Set to True when references and queries are identical (as for instance for an all-vs-all comparison). By using the fact that score[i,j] = score[j,i] the calculation will be about 2x faster.
- pair(reference: Spectrum, query: Spectrum) float [source]¶
This will calculate the similarity score between two spectra.
- score_datatype¶
alias of
float64
- sparse_array(references: List[Spectrum], queries: List[Spectrum], idx_row, idx_col, is_symmetric: bool = False)¶
Optional: Provide optimized method to calculate an sparse matrix of similarity scores.
Compute similarity scores for pairs of reference and query spectrums as given by the indices idx_row (references) and idx_col (queries). If no method is added here, the following naive implementation (i.e. a for-loop) is used.
- Parameters:
references – List of reference objects
queries – List of query objects
idx_row – List/array of row indices
idx_col – List/array of column indices
is_symmetric – Set to True when references and queries are identical (as for instance for an all-vs-all comparison). By using the fact that score[i,j] = score[j,i] the calculation will be about 2x faster.