matchms.filtering.SpeciesString module

class matchms.filtering.SpeciesString.SpeciesString(dirty: str)

Bases: object

A class to process and clean different types of chemical structure strings including InChI, InChIKey, and SMILES.

The class takes a raw input string, determines the intended structure type, and then cleans the string based on its type.

dirty

Raw input string representing a chemical structure.

Type: str

target

The structure type inferred from the input string: one of ‘inchi’, ‘inchikey’, or ‘smiles’, or None if no valid type could be identified.

Type: str

cleaned

The cleaned structure string.

Type: str

__init__(dirty: str)

Constructs a new instance of the SpeciesString class.

Parameters:

dirty (str) – The raw input string representing a chemical structure.

clean()

Clean the input string based on its determined structure type.

clean_as_inchi()

Search for a valid InChI and harmonize it.
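As a rough illustration of what "search and harmonize" could mean for an InChI, the sketch below strips surrounding whitespace and double quotes, looks for the version layer (`1/` or `1S/`), and returns the string with a canonical `InChI=` prefix. The standalone function, its regular expression, and the empty-string fallback are assumptions made for this example; they are not taken from the matchms source.

```python
import re

# Hypothetical pattern: optional "InChI=" prefix (any case), then the
# version layer "1/" or "1S/" and the rest of the identifier.
_INCHI_BODY = re.compile(r"(?:inchi=)?(1s?/\S+)", re.IGNORECASE)


def clean_as_inchi(dirty: str) -> str:
    """Return the InChI with a canonical prefix, or "" if none is found."""
    # Drop surrounding whitespace and double quotes before matching.
    text = dirty.strip().strip('"').strip()
    match = _INCHI_BODY.match(text)
    if match is None:
        return ""
    # Re-attach the prefix so every cleaned value starts the same way.
    return "InChI=" + match.group(1)
```

With this sketch, a quoted or prefix-less ethanol InChI both come back as `InChI=1S/C2H6O/c1-2-3/h3H,2H2,1H3`, while a string without a version layer yields an empty string.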

clean_as_inchikey()

Search for a valid InChIKey and harmonize it.

clean_as_smiles()

Search for a valid SMILES and harmonize it.

guess_target()

Determine the intended structure type of the input string.
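As a rough illustration of type detection, the sketch below tries a sequence of checks and returns the label of the first one that accepts the string, or None if none does. The function signature, the toy predicates, and the order of the checks are assumptions made for this example; they are not taken from the matchms source.

```python
from typing import Callable, Optional, Sequence, Tuple

# A check pairs a type label with a predicate on the raw string.
Check = Tuple[str, Callable[[str], bool]]


def guess_target(text: str, checks: Sequence[Check]) -> Optional[str]:
    """Return the label of the first check that accepts text, else None."""
    for label, predicate in checks:
        if predicate(text):
            return label
    return None


# Toy predicates for illustration only. The stricter formats come first,
# because a permissive check would otherwise claim their strings too.
checks = [
    ("inchikey", lambda s: len(s) == 27 and s.count("-") == 2),
    ("inchi", lambda s: s.startswith("InChI=")),
    ("smiles", lambda s: s != "" and " " not in s),
]
```

Ordering matters in a design like this: an InChIKey contains only letters and hyphens, so a permissive SMILES check placed first would accept it.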

looks_like_a_smiles()

Return True if the string consists only of characters allowed in SMILES.
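An allowed-character check of this kind can be sketched with a single regular expression. The character set below is an illustrative assumption (atom symbols, ring-closure digits, bonds, branches, charges, bracket atoms, stereo markers, and the disconnection dot) and may differ from the set matchms actually uses.

```python
import re

# Hypothetical character set, chosen for illustration only.
_SMILES_CHARS = re.compile(r"[A-Za-z0-9@+\-\[\]\(\)\\/%=#$.:*]+")


def looks_like_a_smiles(text: str) -> bool:
    """Return True if text is non-empty and uses only the allowed characters."""
    return _SMILES_CHARS.fullmatch(text) is not None
```

A check like this only rules out strings with foreign characters; it does not verify that the string parses as a valid SMILES.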

looks_like_an_inchi()

Search for the first piece of an InChI.
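One way to read "first piece" is the leading part of the identifier: an optional `InChI=` prefix followed by the version layer (`1/`, or `1S/` for a standard InChI). The sketch below tests for that prefix at the start of the string; the pattern and the tolerance for a leading quote are assumptions made for this example, not the expression matchms uses.

```python
import re

# Hypothetical pattern: optional whitespace and opening quote, optional
# "InChI=" prefix (any case), then "1/" or "1S/" and one more character.
_INCHI_START = re.compile(r'\s*"?(?:inchi=)?1s?/\S', re.IGNORECASE)


def looks_like_an_inchi(text: str) -> bool:
    """Return True if text begins like an InChI."""
    return _INCHI_START.match(text) is not None
```

Anchoring the match at the start of the string keeps a SMILES such as `C1/C=C/CCC1`, which contains `1/` in the middle, from being mistaken for an InChI.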

looks_like_an_inchikey()

Return True if the string has the format of an InChIKey.
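An InChIKey is 27 characters long: a block of 14 uppercase letters, a hyphen, a block of 10 uppercase letters, a hyphen, and a single uppercase letter. A format check along those lines might look like the sketch below; the standalone function and the whitespace stripping are assumptions for illustration, and the matchms implementation may be stricter or more lenient.

```python
import re

# InChIKey layout: 14 letters, hyphen, 10 letters, hyphen, 1 letter.
_INCHIKEY = re.compile(r"[A-Z]{14}-[A-Z]{10}-[A-Z]")


def looks_like_an_inchikey(text: str) -> bool:
    """Return True if the whole string matches the InChIKey layout."""
    return _INCHIKEY.fullmatch(text.strip()) is not None
```

For example, the aspirin key `BSYNRYMUTXBXSQ-UHFFFAOYSA-N` passes this check, while a SMILES such as `CCO` does not.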