matchms.importing package

Functions for importing mass spectral data

Matchms provides import functions for several commonly used data types, such as .mzML, .mzXML, .mgf, or .msp. It is also possible to load data from .json files (tested for json files from GNPS and json files made with matchms). Another option is to load a spectrum based on its Universal Spectrum Identifier (USI) with load_from_usi().

For more extensive import options we recommend building custom importers using pyteomics or pymzml.

To process spectrum metadata, matchms can also make use of known adduct information which is imported via load_adducts.
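Because each file type has its own loader, a small dispatch helper can be convenient. The following is a minimal sketch, not part of matchms: the names LOADER_BY_EXTENSION, loader_name_for and load_spectra are hypothetical, and only the loader names themselves are taken from this page. matchms is imported lazily so the extension lookup works on its own.

```python
from pathlib import Path

# Extension -> name of the loader in matchms.importing (names as listed on this page).
LOADER_BY_EXTENSION = {
    ".json": "load_from_json",
    ".mgf": "load_from_mgf",
    ".msp": "load_from_msp",
    ".mzml": "load_from_mzml",
    ".mzxml": "load_from_mzxml",
}


def loader_name_for(filename):
    """Return the name of the loader that matches the file extension."""
    extension = Path(filename).suffix.lower()
    if extension not in LOADER_BY_EXTENSION:
        raise ValueError(f"No loader known for extension {extension!r}")
    return LOADER_BY_EXTENSION[extension]


def load_spectra(filename, **kwargs):
    """Hypothetical wrapper: load all spectra from a file into a list."""
    # Imported here so that the lookup above does not require matchms.
    from matchms import importing

    loader = getattr(importing, loader_name_for(filename))
    # Most loaders return a generator, load_from_json returns a list;
    # list() gives the same result type for both.
    return list(loader(filename, **kwargs))
```

With matchms installed, load_spectra("pesticides.mgf") would then be equivalent to list(load_from_mgf("pesticides.mgf")).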

matchms.importing.load_from_json(filename: str, metadata_harmonization: bool = True) → List[Spectrum]

Load spectrum(s) from json file.

The JSON document is expected to be formatted like the GNPS spectra library. Spectra with zero peaks are skipped.

Example:

from matchms.importing import load_from_json

file_json = "gnps_testdata.json"
spectrums = load_from_json(file_json)
Parameters
  • filename – Provide filename for json file containing spectrum(s).

  • metadata_harmonization (bool, optional) – Set to False if metadata harmonization to default keys is not desired. The default is True.

matchms.importing.load_from_mgf(source: Union[str, TextIO], metadata_harmonization: bool = True) → Generator[Spectrum, None, None]

Load spectrum(s) from mgf file.

This function creates a matchms.Spectrum object for every spectrum in the given .mgf file (or file-like object).

Examples:

from matchms.importing import load_from_mgf

file_mgf = "pesticides.mgf"
spectra_from_path = list(load_from_mgf(file_mgf))

# Or you can read the file in your application
with open(file_mgf, 'r') as spectra_file:
    spectra_from_file = list(load_from_mgf(spectra_file))
Parameters
  • source – Either a filename (with path) of an .mgf file or a file-like object from an already opened MGF file.

  • metadata_harmonization (bool, optional) – Set to False if metadata harmonization to default keys is not desired. The default is True.
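Since load_from_mgf returns a generator, spectra are only read as they are requested, so a large file can be previewed without loading it completely. A minimal sketch of this, assuming a hypothetical helper take; the fake_loader generator below is only a stand-in so the example runs without an .mgf file:

```python
from itertools import islice


def take(spectra, n):
    """Return the first n items of an iterable without consuming the rest."""
    return list(islice(spectra, n))


def fake_loader():
    # Stand-in for load_from_mgf(...); yields placeholder strings,
    # not Spectrum objects.
    for number in range(1000):
        yield f"spectrum_{number}"


preview = take(fake_loader(), 3)
print(preview)
```

With a real file this would read take(load_from_mgf("pesticides.mgf"), 3). The same pattern applies to the other loaders on this page that return a generator.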

matchms.importing.load_from_msp(filename: str, metadata_harmonization: bool = True) → Generator[Spectrum, None, None]

Load spectrum(s) from msp file.

This function reads a .msp file and converts its contents into Spectrum objects.

Parameters
  • filename – Path of the msp file.

  • metadata_harmonization (bool, optional) – Set to False if metadata harmonization to default keys is not desired. The default is True.

Yields

A Spectrum object for each spectrum in the msp file.

Example:

from matchms.importing import load_from_msp

# Download msp file from MassBank of North America repository at https://mona.fiehnlab.ucdavis.edu/
file_msp = "MoNA-export-GC-MS-first10.msp"
spectrums = list(load_from_msp(file_msp))
matchms.importing.load_from_mzml(filename: str, ms_level: int = 2, metadata_harmonization: bool = True) → Generator[Spectrum, None, None]

Load spectrum(s) from mzml file.

This function creates a matchms.Spectrum object for every spectrum of the desired ms_level found in a given mzML file. For more extensive parsing options, consider using the pyteomics or pymzml packages.

Example:

from matchms.importing import load_from_mzml

file_mzml = "testdata.mzml"
spectrums = list(load_from_mzml(file_mzml))
Parameters
  • filename – Filename for mzml file to import.

  • ms_level – Specify which ms level to import. Default is 2.

  • metadata_harmonization (bool, optional) – Set to False if metadata harmonization to default keys is not desired. The default is True.

matchms.importing.load_from_mzxml(filename: str, ms_level: int = 2, metadata_harmonization: bool = True) → Generator[Spectrum, None, None]

Load spectrum(s) from mzXML file.

This function creates a matchms.Spectrum object for every spectrum of the desired ms_level found in a given mzXML file. For more extensive parsing options, consider using the pyteomics package.

Example:

from matchms.importing import load_from_mzxml

file_mzxml = "testdata.mzxml"
spectrums = list(load_from_mzxml(file_mzxml))
Parameters
  • filename – Filename for mzXML file to import.

  • ms_level – Specify which ms level to import. Default is 2.

  • metadata_harmonization (bool, optional) – Set to False if metadata harmonization to default keys is not desired. The default is True.

matchms.importing.load_from_usi(usi: str, server: str = 'https://metabolomics-usi.ucsd.edu', metadata_harmonization: bool = True)

Load spectrum from metabolomics USI.

The USI server returns JSON data with the keys “peaks”, “n_peaks” and “precursor_mz”.

Example:

from matchms.importing import load_from_usi

spectrum = load_from_usi("mzspec:MASSBANK::accession:SM858102")
print(f"Found spectrum with precursor m/z of {spectrum.get('precursor_mz'):.2f}.")
Parameters
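  • usi – Universal Spectrum Identifier (USI) of the spectrum to load.

  • server (string) – URL of the USI server. The default is 'https://metabolomics-usi.ucsd.edu'.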

  • metadata_harmonization (bool, optional) – Set to False if metadata harmonization to default keys is not desired. The default is True.
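To make the structure of the usi argument more concrete, the sketch below splits a USI string into its colon-separated fields. This is an illustration only and not part of matchms: the names UsiParts and split_usi are hypothetical, and the field names (collection, MS run, index type, index) are an assumption based on the general USI layout. load_from_usi itself takes the complete string.

```python
from typing import NamedTuple


class UsiParts(NamedTuple):
    """Fields of a USI (field names are an assumption, see lead-in)."""
    collection: str
    ms_run: str
    index_type: str
    index: str


def split_usi(usi):
    """Split a USI into its fields; anything after the index stays attached to it."""
    parts = usi.split(":", 4)
    if len(parts) != 5 or parts[0] != "mzspec":
        raise ValueError(f"Not a recognizable USI: {usi!r}")
    return UsiParts(*parts[1:])


# The USI used in the example on this page has an empty MS run field.
print(split_usi("mzspec:MASSBANK::accession:SM858102"))
```

A check like this can catch malformed identifiers before a request is sent to the USI server.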

Submodules