matchms.importing package
Functions for importing mass spectral data
Matchms provides import functions for several commonly used mass spectral data
formats, such as .mzML, .mzXML, .mgf, or .msp. It is also possible to
load spectra from .json files (tested for json files from GNPS or json files
made with matchms), from pickle files, or based on a Universal Spectrum Identifier
(USI) via load_from_usi().
The individual load_from_* functions and load_spectra()
return spectra as Spectrum objects or iterables of
Spectrum objects. For collection-based workflows, use
load_ms2_dataset() to directly load a file as a
SpectraCollection.
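Several of the loaders documented here are generators, so their result can be iterated only once. The sketch below illustrates that behaviour and the usual remedy of wrapping the call in list(). It uses a stand-in generator, fake_loader, which is hypothetical and not part of matchms, so that it runs without any data file.

```python
from typing import Iterator


def fake_loader(n: int) -> Iterator[dict]:
    """Hypothetical stand-in for a generator-based loader such as load_from_mgf."""
    for i in range(n):
        yield {"id": i}


# A generator can be consumed only once.
spectra_iter = fake_loader(3)
first_pass = [s["id"] for s in spectra_iter]
second_pass = [s["id"] for s in spectra_iter]  # already exhausted, so empty

# Materialize the spectra once if they are needed several times.
spectra_list = list(fake_loader(3))
reuse_a = len(spectra_list)
reuse_b = len(spectra_list)
```

This is why the examples on this page wrap generator-based loaders in list() before further use.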
For more extensive import options we recommend building custom importers using pyteomics or pymzml.
To process spectrum metadata, matchms can also make use of known adduct
information which is imported via load_adducts.
- matchms.importing.load_from_json(filename: str, metadata_harmonization: bool = True) → list[Spectrum]
Load spectrum(s) from json file.
The JSON document is expected to be formatted like the GNPS spectra library. Spectra with zero peaks will be skipped.
Example:
    from matchms.importing import load_from_json

    file_json = "gnps_testdata.json"
    spectra = load_from_json(file_json)
- Parameters:
filename – Provide filename for json file containing spectrum(s).
metadata_harmonization (bool, optional) – Set to False if metadata harmonization to default keys is not desired. The default is True.
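To make the "spectra with zero peaks will be skipped" rule concrete, here is a minimal standard-library sketch of such a filter. It is not the matchms implementation. It assumes GNPS-style records that keep their peaks under a "peaks_json" key, possibly as a JSON-encoded string; the helper name iter_nonempty_records and the sample records are made up for illustration.

```python
import json


def iter_nonempty_records(json_text: str):
    """Yield (metadata, peaks) for every record that has at least one peak."""
    for record in json.loads(json_text):
        # Assumption: peaks live under "peaks_json" (GNPS-style layout).
        peaks = record.get("peaks_json", [])
        if isinstance(peaks, str):
            # Assumption: the peak list may be stored as a JSON-encoded string.
            peaks = json.loads(peaks)
        if len(peaks) == 0:
            continue  # skip spectra without peaks
        metadata = {k: v for k, v in record.items() if k != "peaks_json"}
        yield metadata, peaks


# Made-up sample data: record "B" has no peaks and is dropped.
example = json.dumps([
    {"spectrum_id": "A", "peaks_json": "[[100.0, 1.0], [150.5, 0.4]]"},
    {"spectrum_id": "B", "peaks_json": "[]"},
])
kept = [meta["spectrum_id"] for meta, _ in iter_nonempty_records(example)]
```

A consequence worth keeping in mind is that the number of loaded spectra can be smaller than the number of records in the file.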
- matchms.importing.load_from_mgf(filename: str | Path | TextIO, metadata_harmonization: bool = True) → Generator[Spectrum, None, None]
Load spectrum(s) from mgf file.
This function will create a Spectrum object for every spectrum in the given .mgf file (or file-like object).
Examples:
    from matchms.importing import load_from_mgf

    file_mgf = "pesticides.mgf"
    spectra_from_path = list(load_from_mgf(file_mgf))

    # Or you can read the file in your application
    with open(file_mgf, "r") as spectra_file:
        spectra_from_file = list(load_from_mgf(spectra_file))
- Parameters:
filename – Accepts either a filename (with path) of a .mgf file or a file-like object from an already opened MGF file.
metadata_harmonization (bool, optional) – Set to False if metadata harmonization to default keys is not desired. The default is True.
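A file-like object is sufficient for MGF import because the format is line-based: each spectrum sits between a BEGIN IONS and an END IONS line, with KEY=VALUE metadata followed by m/z and intensity pairs. The sketch below is a deliberately small standard-library reader that shows this idea with an in-memory io.StringIO. It is not the parser matchms uses, and the name iter_mgf_blocks is hypothetical.

```python
import io
from typing import Iterator, TextIO


def iter_mgf_blocks(handle: TextIO) -> Iterator[dict]:
    """Yield one dict per BEGIN IONS / END IONS block (illustrative only)."""
    current = None
    for raw in handle:
        line = raw.strip()
        if line == "BEGIN IONS":
            current = {"metadata": {}, "peaks": []}
        elif line == "END IONS":
            if current is not None:
                yield current
            current = None
        elif current is not None and line:
            if "=" in line:
                # Metadata line such as PEPMASS=250.1
                key, value = line.split("=", 1)
                current["metadata"][key.lower()] = value
            else:
                # Peak line: m/z and intensity separated by whitespace
                mz, intensity = line.split()[:2]
                current["peaks"].append((float(mz), float(intensity)))


# Made-up MGF content held in memory instead of on disk.
mgf_text = """BEGIN IONS
TITLE=example
PEPMASS=250.1
100.0 10.0
120.5 55.0
END IONS
"""
blocks = list(iter_mgf_blocks(io.StringIO(mgf_text)))
```

Because the reader only iterates over lines, it works equally well with an open file, a StringIO buffer, or any other text stream.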
- matchms.importing.load_from_msp(filename: str, metadata_harmonization: bool = True) → Generator[Spectrum, None, None]
Load spectrum(s) from an MSP file as Spectrum objects.
Function that reads a .msp file and converts its contents into Spectrum objects.
- Parameters:
filename – Path of the msp file.
metadata_harmonization (bool, optional) – Set to False if metadata harmonization to default keys is not desired. The default is True.
- Yields:
Spectrum objects, one for each spectrum in the msp file.
Example:
    from matchms.importing import load_from_msp

    # Download msp file from MassBank of North America repository at https://mona.fiehnlab.ucdavis.edu/
    file_msp = "MoNA-export-GC-MS-first10.msp"
    spectra = list(load_from_msp(file_msp))
- matchms.importing.load_from_mzml(filename: str | Path, ms_level: int = 2, metadata_harmonization: bool = True) → Generator[Spectrum, None, None]
Load spectrum(s) from mzml file.
This function will create a Spectrum object for every spectrum of the desired ms_level found in a given mzML file. For more extensive parsing options consider using the pyteomics or pymzml packages.
Example:
    from matchms.importing import load_from_mzml

    file_mzml = "testdata.mzml"
    spectra = list(load_from_mzml(file_mzml))
- Parameters:
filename – Filename for mzml file to import.
ms_level – Specify which ms level to import. Default is 2.
metadata_harmonization (bool, optional) – Set to False if metadata harmonization to default keys is not desired. The default is True.
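The effect of the ms_level parameter can be pictured as a filter over the scans in the file: only scans recorded at the requested level are turned into spectra. The sketch below shows that selection on plain dictionaries. The record layout and the "ms_level" key are assumptions made for illustration and do not reflect how matchms reads mzML internally.

```python
def select_ms_level(scans, ms_level=2):
    """Lazily keep only scans recorded at the requested MS level."""
    return (scan for scan in scans if scan.get("ms_level") == ms_level)


# Made-up scan records: one MS1 survey scan and two MS2 fragment scans.
scans = [
    {"scan": 1, "ms_level": 1},
    {"scan": 2, "ms_level": 2},
    {"scan": 3, "ms_level": 2},
]
ms2_scans = [scan["scan"] for scan in select_ms_level(scans)]
ms1_scans = [scan["scan"] for scan in select_ms_level(scans, ms_level=1)]
```

With the default of 2, MS1 scans are therefore absent from the result unless ms_level=1 is requested explicitly.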
- matchms.importing.load_from_mzxml(filename: str | Path, ms_level: int = 2, metadata_harmonization: bool = True) → Generator[Spectrum, None, None]
Load spectrum(s) from mzXML file.
This function will create a Spectrum object for every spectrum of the desired ms_level found in a given mzXML file. For more extensive parsing options consider using the pyteomics package.
Example:
    from matchms.importing import load_from_mzxml

    file_mzxml = "testdata.mzxml"
    spectra = list(load_from_mzxml(file_mzxml))
- Parameters:
filename – Filename for mzXML file to import.
ms_level – Specify which ms level to import. Default is 2.
metadata_harmonization (bool, optional) – Set to False if metadata harmonization to default keys is not desired. The default is True.
- matchms.importing.load_from_pickle(filename: str, metadata_harmonization: bool) → list[Spectrum]
Load spectra stored in a pickle file.
- Parameters:
filename (str) – Pickled file with spectra.
- Returns:
Unpickled object. Should be a list of Spectrum objects.
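Since the unpickled object is only expected to be a list of spectra, it can be worth checking this after loading. The following standard-library sketch shows one way to do that. The helper load_pickled_list is hypothetical and not part of matchms, and plain dicts stand in for Spectrum objects so the example has no dependencies.

```python
import pickle
import tempfile
from pathlib import Path


def load_pickled_list(filename):
    """Unpickle a file and check that it holds a list (illustrative helper)."""
    with open(filename, "rb") as handle:
        # Only unpickle files from trusted sources: pickle can run arbitrary code.
        loaded = pickle.load(handle)
    if not isinstance(loaded, list):
        raise TypeError(f"Expected a list of spectra, got {type(loaded).__name__}")
    return loaded


with tempfile.TemporaryDirectory() as tmp_dir:
    # Round trip: dicts stand in for Spectrum objects in this sketch.
    path = Path(tmp_dir) / "spectra.pickle"
    with open(path, "wb") as handle:
        pickle.dump([{"precursor_mz": 250.1}, {"precursor_mz": 301.2}], handle)
    restored = load_pickled_list(path)

    # A pickle that does not contain a list is rejected.
    bad_path = Path(tmp_dir) / "not_a_list.pickle"
    with open(bad_path, "wb") as handle:
        pickle.dump({"precursor_mz": 250.1}, handle)
    try:
        load_pickled_list(bad_path)
        rejected = False
    except TypeError:
        rejected = True
```

Pickle files should only be loaded when their origin is trusted, because unpickling can execute arbitrary code.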
- matchms.importing.load_from_usi(usi: str, server: str = 'https://metabolomics-usi.gnps2.org', metadata_harmonization: bool = True) → Spectrum | None
Load spectrum from metabolomics USI.
The USI server returns JSON data with the keys "peaks", "n_peaks" and "precursor_mz".
Example:
    from matchms.importing import load_from_usi

    spectrum = load_from_usi("mzspec:MASSBANK::accession:SM858102")
    print(f"Found spectrum with precursor m/z of {spectrum.get('precursor_mz'):.2f}.")
- Parameters:
usi – The USI of the spectrum to load.
server (str) – URL of the USI server to query.
metadata_harmonization (bool, optional) – Set to False if metadata harmonization to default keys is not desired. The default is True.
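The signature allows load_from_usi() to return None, so calling code should not assume that a spectrum is always available. The sketch below shows one defensive pattern. The helper describe_precursor is hypothetical, and it only relies on the object offering a get() method, so a plain dict stands in for a Spectrum and no network access is needed.

```python
def describe_precursor(spectrum) -> str:
    """Return a short message that also covers a missing spectrum."""
    if spectrum is None:
        return "No spectrum returned for this USI."
    precursor_mz = spectrum.get("precursor_mz")
    if precursor_mz is None:
        return "Spectrum found, but it has no precursor m/z."
    return f"Found spectrum with precursor m/z of {precursor_mz:.2f}."


# Stand-ins for the possible outcomes of a lookup.
message_missing = describe_precursor(None)
message_no_mz = describe_precursor({})
message_found = describe_precursor({"precursor_mz": 250.123})
```

In real use, the argument would be the return value of load_from_usi().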
- matchms.importing.load_ms2_dataset(file: str, metadata_harmonization: bool = True, ftype: str = 'auto', **kwargs) → SpectraCollection
Load spectra from a file as a SpectraCollection.
- Parameters:
file – Path to file containing spectra.
metadata_harmonization – If True, harmonize metadata during import.
ftype – File type to use for import. By default, "auto" guesses the file type from the file extension. Alternatively, pass an explicit file type, for example "mzml", "json", "mgf", "msp", "mzxml", or "pickle".
**kwargs – Additional keyword arguments to pass to the SpectraCollection constructor, for example mz_precision.
- Returns:
Imported spectra as a collection.
- Return type:
SpectraCollection
- matchms.importing.load_spectra(file: str, metadata_harmonization: bool = True, ftype: str | None = 'auto') → list[Spectrum] | Generator[Spectrum, None, None]
Load spectra from a file as matchms Spectrum objects.
The following file extensions can be loaded with this function: mzML, json, mgf, msp, mzxml and pickle.
A pickled file is expected to directly contain a list of matchms Spectrum objects.
- Parameters:
file – Path to file containing spectra.
metadata_harmonization – If True, harmonize metadata during import.
ftype – File type to use for import. By default, "auto" guesses the file type from the file extension. Alternatively, pass an explicit file type, for example "mzml", "json", "mgf", "msp", "mzxml", or "pickle".
- Returns:
Imported spectra.
- Return type:
list[Spectrum] | Generator[Spectrum, None, None]
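The ftype="auto" behaviour amounts to deriving the file type from the file extension unless an explicit type is given. The sketch below shows one plausible way to do that with pathlib. It is an assumption about the general approach, not the matchms implementation: the helper guess_ftype, its case-insensitive matching, and its error handling are all hypothetical.

```python
from pathlib import Path

# File types named in the documentation above.
KNOWN_FTYPES = {"mzml", "json", "mgf", "msp", "mzxml", "pickle"}


def guess_ftype(file: str, ftype: str = "auto") -> str:
    """Return the file type to use, guessing from the extension for "auto"."""
    if ftype != "auto":
        chosen = ftype.lower()
    else:
        # ".mzML" -> "mzml"; assumption: matching ignores letter case.
        chosen = Path(file).suffix.lstrip(".").lower()
    if chosen not in KNOWN_FTYPES:
        raise ValueError(f"Unsupported file type: {chosen!r}")
    return chosen


guessed = guess_ftype("data/testdata.mzML")
explicit = guess_ftype("spectra.bin", ftype="mgf")
try:
    guess_ftype("notes.txt")
    unsupported_rejected = False
except ValueError:
    unsupported_rejected = True
```

Passing an explicit ftype is the safer choice for files whose extension does not match their content.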
Submodules
- matchms.importing.load_from_json module
- matchms.importing.load_from_mgf module
- matchms.importing.load_from_msp module
- matchms.importing.load_from_mzml module
- matchms.importing.load_from_mzxml module
- matchms.importing.load_from_pickle module
- matchms.importing.load_from_usi module
- matchms.importing.load_spectra module
- matchms.importing.parsing_utils module