matchms.similarity.BaseSimilarity module
- class matchms.similarity.BaseSimilarity.BaseSimilarity
Bases: ABC
Similarity function base class.
When building a custom similarity measure, inherit from this class and implement the desired methods.
- is_commutative
Whether the similarity function is commutative, meaning that the order of spectra does not matter:
similarity(A, B) == similarity(B, A). Default is True.
- score_datatype
NumPy dtype of a single score value. Examples are np.float64 for scalar scores or a structured dtype such as np.dtype([("score", np.float64), ("matches", np.int64)]) for multi-field scores.
- score_fields
Names of the score fields. For scalar scores this should usually be ("score",). For structured scores, this should match the dtype field names, for instance ("score", "matches").
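The relationship between score_datatype and score_fields can be illustrated with NumPy alone. The following is a minimal sketch that does not call matchms; the helper name fields_for is hypothetical and only shows how field names could be derived from a dtype.

```python
import numpy as np

# Scalar and structured dtypes as described for score_datatype.
scalar_dtype = np.float64
structured_dtype = np.dtype([("score", np.float64), ("matches", np.int64)])


def fields_for(dtype):
    """Hypothetical helper: derive score field names from a NumPy dtype."""
    names = np.dtype(dtype).names
    # Plain (non-structured) dtypes have no field names.
    return ("score",) if names is None else tuple(names)


scalar_fields = fields_for(scalar_dtype)
structured_fields = fields_for(structured_dtype)

# Selecting one field of a structured array yields a plain array of that field.
scores = np.zeros((2, 2), dtype=structured_dtype)
only_score = scores["score"]
```

Indexing a structured array by field name, as in the last line, is one way a subset of fields such as ("score",) could be extracted.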
- matrix(spectra_1: Sequence[Spectrum], spectra_2: Sequence[Spectrum] | None = None, score_fields: Sequence[str] | None = None, progress_bar: bool = True)
Calculate a dense similarity matrix.
- Parameters:
spectra_1 – First collection of spectra.
spectra_2 – Second collection of spectra. If None, compare spectra_1 against itself. For commutative similarities this automatically uses a symmetric optimization.
score_fields – Score fields to return.
- None means return all available fields.
- For scalar scores, only ("score",) is valid.
- For structured scores, this can be a subset such as ("score",).
progress_bar – When True, show a progress bar. Default is True.
- Returns:
Dense score result wrapped in a Scores container.
- Return type:
Scores
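The symmetric optimization mentioned for commutative similarities can be pictured as computing only the upper triangle and mirroring it. The sketch below is an illustration under that assumption, not the library's implementation; dense_matrix_sketch and toy_pair are hypothetical names, and plain floats stand in for spectra.

```python
import numpy as np


def dense_matrix_sketch(items, pair, is_commutative=True):
    """Hypothetical all-vs-all scoring of one collection against itself."""
    n = len(items)
    scores = np.zeros((n, n), dtype=np.float64)
    calls = 0
    for i in range(n):
        # For commutative similarities, start at the diagonal and mirror.
        start = i if is_commutative else 0
        for j in range(start, n):
            scores[i, j] = pair(items[i], items[j])
            calls += 1
            if is_commutative:
                scores[j, i] = scores[i, j]
    return scores, calls


def toy_pair(a, b):
    """Toy commutative similarity between two numbers."""
    return 1.0 / (1.0 + abs(a - b))


values = [1.0, 2.0, 4.0]
symmetric, n_sym = dense_matrix_sketch(values, toy_pair, is_commutative=True)
full, n_full = dense_matrix_sketch(values, toy_pair, is_commutative=False)
```

For n items the mirrored variant needs n(n+1)/2 pair evaluations instead of n², which is why is_commutative should only be True when similarity(A, B) == similarity(B, A) actually holds.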
- abstractmethod pair(spectrum_1: Spectrum, spectrum_2: Spectrum)
Calculate the similarity for one pair of spectra.
- Parameters:
spectrum_1 – First spectrum.
spectrum_2 – Second spectrum.
- Returns:
Similarity result for one pair. The returned value should be compatible with self.score_datatype.
- Return type:
score
Examples
- Scalar score:
return np.asarray(score, dtype=self.score_datatype)
- Structured score:
return np.asarray((score, matches), dtype=self.score_datatype)
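To see what the two return statements above produce, here is a small NumPy-only sketch. The functions scalar_pair_result and structured_pair_result are hypothetical stand-ins for the last line of a pair() implementation; they take the dtype as an argument instead of reading self.score_datatype.

```python
import numpy as np

scalar_dtype = np.float64
structured_dtype = np.dtype([("score", np.float64), ("matches", np.int64)])


def scalar_pair_result(score, score_datatype=scalar_dtype):
    """Wrap a single number as a 0-d array of the scalar dtype."""
    return np.asarray(score, dtype=score_datatype)


def structured_pair_result(score, matches, score_datatype=structured_dtype):
    """Wrap a (score, matches) tuple as a 0-d structured array."""
    return np.asarray((score, matches), dtype=score_datatype)


scalar = scalar_pair_result(0.75)
structured = structured_pair_result(0.75, 4)

# Fields of the structured result are read by name.
score_value = float(structured["score"])
match_count = int(structured["matches"])
```

Both results are zero-dimensional arrays, so they can be assigned directly into one cell of a score matrix with the same dtype.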
- score_datatype
alias of float64
- class matchms.similarity.BaseSimilarity.BaseSimilarityWithSparse
Bases: BaseSimilarity
Base similarity class with a default sparse implementation.
This class extends BaseSimilarity by providing a default implementation of sparse_matrix() that applies a score filter to the dense results.
Subclasses can override keep_score() to define the default filtering behavior, and users can also pass a custom score_filter=… to sparse_matrix() for per-call control.
- keep_score(score) → bool
Return whether a score should be retained in sparse outputs.
This defines the default sparse retention behavior. Users can override it per call via score_filter=....
Default behavior:
- scalar score: keep if score != 0
- structured score: keep if all fields are non-zero
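The default retention rule described above can be written down in a few lines. This is a sketch of the documented behavior only, not the library's code; keep_score_sketch is a hypothetical name.

```python
import numpy as np

structured_dtype = np.dtype([("score", np.float64), ("matches", np.int64)])


def keep_score_sketch(score):
    """Hypothetical re-statement of the documented default rule."""
    score = np.asarray(score)
    if score.dtype.names is None:
        # Scalar score: keep if it is non-zero.
        return bool(score != 0)
    # Structured score: keep only if every field is non-zero.
    return all(bool(score[name] != 0) for name in score.dtype.names)


keep_zero = keep_score_sketch(np.float64(0.0))
keep_scalar = keep_score_sketch(np.float64(0.7))
keep_no_matches = keep_score_sketch(np.asarray((0.7, 0), dtype=structured_dtype))
keep_both = keep_score_sketch(np.asarray((0.7, 3), dtype=structured_dtype))
```

Under this rule a structured score with a non-zero score field but zero matches is dropped, which is one reason a subclass or a per-call score_filter might want a different criterion.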
- matrix(spectra_1: Sequence[Spectrum], spectra_2: Sequence[Spectrum] | None = None, score_fields: Sequence[str] | None = None, progress_bar: bool = True)
Calculate a dense similarity matrix.
- Parameters:
spectra_1 – First collection of spectra.
spectra_2 – Second collection of spectra. If None, compare spectra_1 against itself. For commutative similarities this automatically uses a symmetric optimization.
score_fields – Score fields to return.
- None means return all available fields.
- For scalar scores, only ("score",) is valid.
- For structured scores, this can be a subset such as ("score",).
progress_bar – When True, show a progress bar. Default is True.
- Returns:
Dense score result wrapped in a Scores container.
- Return type:
Scores
- abstractmethod pair(spectrum_1: Spectrum, spectrum_2: Spectrum)
Calculate the similarity for one pair of spectra.
- Parameters:
spectrum_1 – First spectrum.
spectrum_2 – Second spectrum.
- Returns:
Similarity result for one pair. The returned value should be compatible with self.score_datatype.
- Return type:
score
Examples
- Scalar score:
return np.asarray(score, dtype=self.score_datatype)
- Structured score:
return np.asarray((score, matches), dtype=self.score_datatype)
- score_datatype
alias of float64
- sparse_matrix(spectra_1: Sequence[Spectrum], spectra_2: Sequence[Spectrum] | None = None, idx_row: ArrayLike | None = None, idx_col: ArrayLike | None = None, score_fields: Sequence[str] | None = None, score_filter: Callable[[ndarray], bool] | None = None, progress_bar: bool = True)
Calculate sparse similarity results.
Filtering is applied to the full score before score field projection.
- Parameters:
spectra_1 – First collection of spectra.
spectra_2 – Second collection of spectra. If None, compare
spectra_1 against itself.
idx_row – Row indices of pairs to compute. If None and idx_col is also None, all pairwise comparisons are considered and only retained scores are stored.
idx_col – Column indices of pairs to compute. Must have the same shape as idx_row.
score_fields – Score fields to return.
- None means return all available fields.
- For scalar scores, only ("score",) is valid.
- For structured scores, this can be a subset such as ("score",).
score_filter – Optional callable receiving the full score and returning whether it should be retained. If None, keep_score() is used.
progress_bar – When True, show a progress bar.
- Returns:
Sparse score result wrapped in a Scores container.
- Return type:
Scores
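The note that filtering is applied to the full score before score field projection means a filter can use a field that is not returned. The sketch below illustrates this, together with the idx_row and idx_col pair selection, using NumPy only. It is an assumption-laden illustration rather than the library's implementation: sparse_sketch is a hypothetical name, and a precomputed dense array stands in for calling pair() on spectra.

```python
import numpy as np

dtype = np.dtype([("score", np.float64), ("matches", np.int64)])
dense = np.zeros((2, 3), dtype=dtype)
dense[0, 1] = (0.8, 4)
dense[1, 2] = (0.3, 1)
dense[1, 0] = (0.9, 0)  # high score but no matching peaks


def sparse_sketch(dense, field, score_filter=None, idx_row=None, idx_col=None):
    """Hypothetical sparse pass: filter on the full score, then project."""
    if idx_row is None and idx_col is None:
        # No indices given: consider every pair.
        idx_row, idx_col = np.indices(dense.shape).reshape(2, -1)
    idx_row, idx_col = np.asarray(idx_row), np.asarray(idx_col)
    if idx_row.shape != idx_col.shape:
        raise ValueError("idx_row and idx_col must have the same shape")
    if score_filter is None:
        # Documented default: keep if all fields are non-zero.
        def score_filter(s):
            return all(s[name] != 0 for name in s.dtype.names)
    rows, cols, values = [], [], []
    for i, j in zip(idx_row.ravel(), idx_col.ravel()):
        full = dense[i, j]
        if score_filter(full):  # the filter sees every field
            rows.append(int(i))
            cols.append(int(j))
            values.append(full[field])  # projection happens afterwards
    return np.array(rows), np.array(cols), np.array(values)


# Filter on "matches" while returning only "score".
rows, cols, values = sparse_sketch(dense, "score", lambda s: s["matches"] >= 2)

# Default filter over all pairs, then over two selected pairs.
d_rows, d_cols, d_values = sparse_sketch(dense, "score")
p_rows, p_cols, p_values = sparse_sketch(dense, "score", idx_row=[1, 1], idx_col=[0, 2])
```

The result is kept in coordinate form (row, column, value), which is one common way to store only the retained pairs.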