matchms.filtering.metadata_processing.derive_annotation_from_compound_name module

matchms.filtering.metadata_processing.derive_annotation_from_compound_name.derive_annotation_from_compound_name(spectrum_in: Spectrum, annotated_compound_names_file: str | None = None, mass_tolerance: float = 0.1)

Adds SMILES, InChI, and InChIKey based on the compound name by searching PubChem.

This filter only runs if the metadata does not yet contain a valid SMILES or InChI. The SMILES, InChI, and InChIKey are only added if the mass of the annotation found is close enough to the parent mass.
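The two conditions above can be illustrated with a minimal sketch. This is not the matchms implementation: the helper names `needs_lookup` and `accept_annotation` are hypothetical, "valid" is simplified here to "present and non-empty", and the inclusive comparison is an assumption.

```python
def needs_lookup(metadata):
    """Return True if neither a SMILES nor an InChI is present.

    Assumption: a real validity check is stricter than "non-empty".
    """
    return not (metadata.get("smiles") or metadata.get("inchi"))


def accept_annotation(parent_mass, candidate_mass, mass_tolerance=0.1):
    """Return True if the candidate mass lies within the tolerance.

    Assumption: the boundary is treated as inclusive.
    """
    return abs(parent_mass - candidate_mass) <= mass_tolerance
```

With these helpers, a spectrum that already carries a SMILES string would be skipped, and a PubChem hit whose mass differs from the parent mass by more than `mass_tolerance` would be rejected.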

Parameters:
  • spectrum_in – The input spectrum.

  • annotated_compound_names_file (Optional[str]) – Any compound name that was searched for on PubChem will be added to this file. If a compound name is already in this file, the stored entry is used instead of querying PubChem. The file can be reused in future runs, which speeds up the process. If None, the compound names found are still cached for this run, but won’t be reusable in future runs. The CSV file should contain the columns [“compound_name”, “smiles”, “inchi”, “inchikey”, “monoisotopic_mass”].

  • mass_tolerance – Maximum acceptable difference between the parent mass of the query compound and the mass of the PubChem result.
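
As an illustration of the cache file described for annotated_compound_names_file, the sketch below reads and appends to a CSV with the five documented columns. It uses only the standard library; the helpers `load_cache` and `append_to_cache` are hypothetical and do not reflect how matchms reads or writes the file internally.

```python
import csv
import os

# Column names as listed in the parameter description.
COLUMNS = ["compound_name", "smiles", "inchi", "inchikey", "monoisotopic_mass"]


def load_cache(path):
    """Read cached annotations into a dict keyed by compound name."""
    if path is None or not os.path.exists(path):
        return {}
    with open(path, newline="", encoding="utf-8") as handle:
        return {row["compound_name"]: row for row in csv.DictReader(handle)}


def append_to_cache(path, row):
    """Append one annotation, writing the header first if the file is new."""
    is_new = not os.path.exists(path) or os.path.getsize(path) == 0
    with open(path, "a", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=COLUMNS)
        if is_new:
            writer.writeheader()
        writer.writerow(row)
```

A name found in the dict returned by `load_cache` would not need a new lookup; a name that had to be looked up would be stored with `append_to_cache` so that later runs can reuse it.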