matchms.Pipeline module

class matchms.Pipeline.Pipeline(workflow: OrderedDict, progress_bar: bool = True, logging_level: str = 'WARNING', logging_file: str | None = None)

Bases: object

Central pipeline class.

The pipeline applies filters to one or two collections of spectra and then executes a sequence of similarity computations and mask steps.

Notes

  • If only spectra_1 is provided during run(), the pipeline assumes a symmetric all-vs-all computation and sets is_symmetric=True.

  • If spectra_2 is also provided, the pipeline computes spectra_1 vs spectra_2 and sets is_symmetric=False.
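
The two cases in the notes can be illustrated with a small standalone sketch. It does not use matchms: plan_comparison is a hypothetical helper that only mirrors the rule described above, where one collection means a symmetric all-vs-all comparison and two collections mean a spectra_1 vs spectra_2 comparison. The score matrix shape it returns is an assumption about how such a comparison is usually laid out, not something this page states.

```python
from typing import Optional, Sequence, Tuple


def plan_comparison(
    spectra_1: Sequence, spectra_2: Optional[Sequence] = None
) -> Tuple[bool, Tuple[int, int]]:
    """Return (is_symmetric, assumed score matrix shape).

    Hypothetical helper for illustration only; not part of matchms.
    """
    if spectra_2 is None:
        # Only one collection: compare it against itself (all-vs-all).
        return True, (len(spectra_1), len(spectra_1))
    # Two collections: compare each entry of spectra_1 with each entry of spectra_2.
    return False, (len(spectra_1), len(spectra_2))


print(plan_comparison(["a", "b", "c"]))              # (True, (3, 3))
print(plan_comparison(["a", "b", "c"], ["x", "y"]))  # (False, (3, 2))
```

In the symmetric case a score computation can in principle skip half of the pairs, which is presumably why the pipeline tracks is_symmetric separately.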

__init__(workflow: OrderedDict, progress_bar: bool = True, logging_level: str = 'WARNING', logging_file: str | None = None)

import_spectra(spectra_1: list[str] | str, spectra_2: list[str] | str | None = None) → None

Import one or two spectra collections from file(s).

run(spectra_1, spectra_2=None, cleaned_spectra_1_file=None, cleaned_spectra_2_file=None, create_report: bool = True)

Execute the pipeline workflow.

matchms.Pipeline.check_score_computation(score_computations: Sequence[str | list[dict]]) → None

Check if score computations are valid.

matchms.Pipeline.create_workflow(yaml_file_name: str | None = None, spectra_1_filters: Iterable[str | Callable | tuple[Callable | str, dict[str, any]]] = (), spectra_2_filters: Iterable[str | Callable | tuple[Callable | str, dict[str, any]]] = (), score_computations: Iterable[str | list[str | dict]] = ()) → OrderedDict

Create a workflow specification for Pipeline.
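
According to the type hints, each entry of spectra_1_filters and spectra_2_filters can take one of three forms: a filter name, a callable, or a (filter, parameters) tuple. The standalone sketch below shows how such mixed entries could be normalized into (name, parameters) pairs. It is not matchms code: normalize_filter_spec and my_filter are hypothetical, and the filter name strings are placeholders rather than real matchms filters.

```python
from typing import Any, Callable, Dict, Tuple, Union

# One filter entry, as read from the signature:
# a name, a callable, or a (name-or-callable, parameters) tuple.
FilterSpec = Union[str, Callable, Tuple[Union[Callable, str], Dict[str, Any]]]


def normalize_filter_spec(spec: FilterSpec) -> Tuple[str, Dict[str, Any]]:
    """Turn one filter entry into a (name, parameters) pair.

    Hypothetical helper for illustration only; not part of matchms.
    """
    if isinstance(spec, tuple):
        func, params = spec
    else:
        func, params = spec, {}
    # Callables are identified by their function name, strings are kept as-is.
    name = func if isinstance(func, str) else func.__name__
    return name, dict(params)


def my_filter(spectrum):
    """Placeholder custom filter that returns its input unchanged."""
    return spectrum


specs = ["some_filter", my_filter, ("other_filter", {"threshold": 0.1})]
for spec in specs:
    print(normalize_filter_spec(spec))
```

Accepting all three forms lets a workflow mix built-in filters referenced by name with custom functions, while the tuple form carries keyword arguments for a single filter.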

matchms.Pipeline.get_unused_filters(yaml_file)

Check which matchms filters are not used in the given YAML file.