matchms.Fingerprints module
- class matchms.Fingerprints.Fingerprints(fingerprint_algorithm: str = 'daylight', fingerprint_method: str = 'bit', nbits: int = 2048, ignore_stereochemistry: bool = False, **kwargs)
Bases: object
Computes and stores an inchikey-to-fingerprint mapping for a list of spectra.
For example:

    from matchms import Fingerprints
    from matchms import Spectrum
    import numpy as np

    spectrum_1 = Spectrum(mz=np.array([100, 150, 200.]),
                          intensities=np.array([0.7, 0.2, 0.1]),
                          metadata={"inchikey": "OTMSDBZUPAUEDD-UHFFFAOYSA-N",
                                    "smiles": "CC"})
    spectrum_2 = Spectrum(mz=np.array([100, 150, 200.]),
                          intensities=np.array([0.7, 0.2, 0.1]),
                          metadata={"inchikey": "UGFAIRIUMAVXCW-UHFFFAOYSA-N",
                                    "smiles": "[C-]#[O+]"})
    spectra = [spectrum_1, spectrum_2]

    fpgen = Fingerprints()
    fpgen.compute_fingerprints(spectra)

    print(fpgen.fingerprint_count)
    print(type(fpgen.get_fingerprint_by_inchikey('OTMSDBZUPAUEDD-UHFFFAOYSA-N')))

Should output:

    2
    <class 'numpy.ndarray'>
- config
The configuration for the fingerprints e.g., used algorithm, nbits, …
- fingerprints
The computed fingerprints. Use after compute_fingerprints().
- fingerprint_count
The number of fingerprints computed.
- to_dataframe
A DataFrame containing the inchikey and fingerprint
- __init__(fingerprint_algorithm: str = 'daylight', fingerprint_method: str = 'bit', nbits: int = 2048, ignore_stereochemistry: bool = False, **kwargs)
- Parameters:
fingerprint_algorithm – The fingerprint algorithm to use. Available options: daylight, morgan1, morgan2, morgan3.
fingerprint_method – The fingerprint method to use. Available options: bit, sparse_bit, count, sparse_count.
nbits – The number of bits or fingerprint size. Defaults to 2048.
ignore_stereochemistry – Determines which version of the inchikey is used. If set to True, only the first 14 characters of the inchikey are used.
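The effect of ignore_stereochemistry on the key can be sketched in plain Python. The helper name mapping_key is hypothetical and not part of matchms. The sketch only assumes the standard 27-character InChIKey layout, in which the first 14 characters encode the molecular connectivity.

```python
def mapping_key(inchikey: str, ignore_stereochemistry: bool = False) -> str:
    """Return the key a fingerprint would be stored under (hypothetical helper)."""
    # With ignore_stereochemistry=True only the first block (14 characters)
    # of the InChIKey is kept, so stereoisomers collapse onto one key.
    return inchikey[:14] if ignore_stereochemistry else inchikey


full_key = "OTMSDBZUPAUEDD-UHFFFAOYSA-N"
print(mapping_key(full_key))                               # full InChIKey
print(mapping_key(full_key, ignore_stereochemistry=True))  # OTMSDBZUPAUEDD
```

With the flag set, two spectra whose inchikeys differ only after the first hyphen would share a single key in this sketch.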
- compute_fingerprint(spectrum: Spectrum) → ndarray | None
Computes a single fingerprint for a given spectrum.
- Parameters:
spectrum – A spectrum for which a fingerprint is to be calculated.
- Returns:
Optional[np.ndarray] – The corresponding fingerprint.
- compute_fingerprints(spectra: list[Spectrum])
Computes fingerprints for a list of spectra.
This first creates a dict of unique spectra and then computes fingerprints for all molecules. Only valid fingerprints are added to the mapping. Query specific fingerprints with get_fingerprint_by_spectrum() or get_fingerprint_by_inchikey().
- Parameters:
spectra – A list of Spectrum objects.
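The two-step behaviour described for compute_fingerprints() (deduplicate first, then keep only valid fingerprints) can be illustrated with a minimal, library-free sketch. It is not the matchms implementation: build_mapping and toy_fingerprint are hypothetical names, plain dicts stand in for Spectrum objects, deduplication by inchikey is an assumption, and the last record uses a made-up placeholder key.

```python
from typing import Callable, Dict, List, Optional

Record = Dict[str, str]
Fingerprint = List[int]


def build_mapping(
    records: List[Record],
    fingerprint_fn: Callable[[Record], Optional[Fingerprint]],
) -> Dict[str, Fingerprint]:
    """Map each unique inchikey to a fingerprint, skipping invalid results."""
    # Step 1: keep one record per inchikey (assumed deduplication criterion).
    unique: Dict[str, Record] = {}
    for record in records:
        inchikey = record.get("inchikey")
        if inchikey and inchikey not in unique:
            unique[inchikey] = record

    # Step 2: compute fingerprints and store only the valid (non-None) ones.
    mapping: Dict[str, Fingerprint] = {}
    for inchikey, record in unique.items():
        fingerprint = fingerprint_fn(record)
        if fingerprint is not None:
            mapping[inchikey] = fingerprint
    return mapping


def toy_fingerprint(record: Record) -> Optional[Fingerprint]:
    """Stand-in for a real fingerprint: flags which of 'C', 'N', 'O', '#' occur."""
    smiles = record.get("smiles")
    if not smiles:
        return None  # no structure available, so no valid fingerprint
    return [1 if symbol in smiles else 0 for symbol in "CNO#"]


records = [
    {"inchikey": "OTMSDBZUPAUEDD-UHFFFAOYSA-N", "smiles": "CC"},
    {"inchikey": "OTMSDBZUPAUEDD-UHFFFAOYSA-N", "smiles": "CC"},  # duplicate
    {"inchikey": "UGFAIRIUMAVXCW-UHFFFAOYSA-N", "smiles": "[C-]#[O+]"},
    {"inchikey": "AAAAAAAAAAAAAA-UHFFFAOYSA-N"},  # placeholder key, no smiles
]

mapping = build_mapping(records, toy_fingerprint)
print(len(mapping))  # 2
```

Deduplicating before computing means each molecule is fingerprinted once, however many spectra refer to it.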