matchms.networking.SimilarityNetwork module

class matchms.networking.SimilarityNetwork.SimilarityNetwork(top_n: int = 20, max_links: int = 10, score_cutoff: float = 0.7, link_method: str = 'single', keep_unconnected_nodes: bool = True)[source]

Bases: object

Create a similarity network from all-vs-all spectrum similarities.

For example:

import numpy as np
from matchms import Spectrum, calculate_scores
from matchms.similarity import ModifiedCosineGreedy
from matchms.networking import SimilarityNetwork

spectrum_1 = Spectrum(
    mz=np.array([100, 150, 200.0]),
    intensities=np.array([0.7, 0.2, 0.1]),
    metadata={"precursor_mz": 100.0, "test_id": "one"},
)
spectrum_2 = Spectrum(
    mz=np.array([104.9, 140, 190.0]),
    intensities=np.array([0.4, 0.2, 0.1]),
    metadata={"precursor_mz": 105.0, "test_id": "two"},
)

modified_cosine = ModifiedCosineGreedy(tolerance=0.2)
spectra = [spectrum_1, spectrum_2]
scores = calculate_scores(spectra, spectra, modified_cosine)

identifiers = [s.get("test_id") for s in spectra]

ms_network = SimilarityNetwork()
ms_network.create_network(scores, identifiers=identifiers, score_name="score")

nodes = list(ms_network.graph.nodes())
nodes.sort()
print(nodes)

Should output:

['one', 'two']

__init__(top_n: int = 20, max_links: int = 10, score_cutoff: float = 0.7, link_method: str = 'single', keep_unconnected_nodes: bool = True)[source]

Parameters:

top_n – Consider an edge between node A and node B if the score falls into the top_n hits of A or B (link_method="single"), or into the top_n hits of both A and B (link_method="mutual"). From those potential links, only max_links are kept per node, so top_n must be >= max_links.
max_links – Maximum number of outgoing links to add per node. Default is 10. Due to incoming links, total degree can be higher.
score_cutoff – Threshold for similarities. Edges are only created for similarities >= score_cutoff.
link_method – Choose between "single" and "mutual".
    - "single" adds all eligible top_n links.
    - "mutual" only adds a link if both nodes rank each other within their respective top_n lists.
keep_unconnected_nodes – If True (default), all identifiers are included as nodes even if they have no edges. If False, isolated nodes are removed.
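
The interplay of top_n, max_links, score_cutoff and link_method can be illustrated with a minimal pure-Python sketch. This is not the matchms implementation: the function name select_links, the use of a nested list as the score matrix, and the exact tie-breaking are assumptions made for illustration only.

```python
def select_links(scores, top_n=20, max_links=10, score_cutoff=0.7,
                 link_method="single"):
    """Return undirected edges (i, j) with i < j from a square score matrix.

    Hypothetical helper; it only mirrors the parameter descriptions above.
    """
    if top_n < max_links:
        raise ValueError("top_n must be >= max_links")
    if link_method not in ("single", "mutual"):
        raise ValueError("link_method must be 'single' or 'mutual'")

    n = len(scores)
    # For every node: its best-scoring neighbours above the cutoff, best first.
    top_hits = []
    for i in range(n):
        candidates = [j for j in range(n)
                      if j != i and scores[i][j] >= score_cutoff]
        candidates.sort(key=lambda j: scores[i][j], reverse=True)
        top_hits.append(candidates[:top_n])

    edges = set()
    for i in range(n):
        added = 0  # outgoing links added for node i
        for j in top_hits[i]:
            if added >= max_links:
                break
            # "mutual": node i must also appear in the top_n hits of node j.
            if link_method == "mutual" and i not in top_hits[j]:
                continue
            edges.add((min(i, j), max(i, j)))
            added += 1
    return edges


# Node 2 ranks node 0 first, but node 0 ranks node 1 first.
example_scores = [
    [1.0, 0.9, 0.8],
    [0.9, 1.0, 0.75],
    [0.8, 0.75, 1.0],
]
print(sorted(select_links(example_scores, top_n=1, max_links=1,
                          link_method="single")))  # [(0, 1), (0, 2)]
print(sorted(select_links(example_scores, top_n=1, max_links=1,
                          link_method="mutual")))  # [(0, 1)]
```

In the "single" case node 0 ends up with two edges although max_links is 1, because the second edge is an incoming link from node 2.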

create_network(scores: Scores, identifiers: Sequence[str], score_name: str | None = None) → None[source]

Create a similarity network from a square all-vs-all Scores object.

Parameters:

scores – Matchms Scores object containing all-vs-all similarities. The score matrix must be square.
identifiers – Node identifiers corresponding to the rows/columns of the score matrix. Must have length equal to scores.shape[0].
score_name – Name of the score field to use. If None:
    - scalar Scores: the only field is used
    - multi-field Scores: "score" is used if present
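
The input requirements above can be checked before calling create_network. The following is a hedged sketch, not part of matchms: check_all_vs_all and resolve_score_name are hypothetical helpers, the score matrix is modelled as a nested list, scalar scores are modelled as a single field name, and raising ValueError for the cases the documentation leaves open is an assumption.

```python
def check_all_vs_all(score_matrix, identifiers):
    """Raise ValueError unless the matrix is square and matches identifiers."""
    n_rows = len(score_matrix)
    if any(len(row) != n_rows for row in score_matrix):
        raise ValueError("Score matrix must be square.")
    if len(identifiers) != n_rows:
        raise ValueError("Need exactly one identifier per row/column.")


def resolve_score_name(field_names, score_name=None):
    """Pick the score field following the rules described for score_name."""
    if score_name is not None:
        # Assumption: an explicitly requested field must exist.
        if score_name not in field_names:
            raise ValueError(f"Unknown score field: {score_name!r}")
        return score_name
    if len(field_names) == 1:
        return field_names[0]  # scalar scores: the only field is used
    if "score" in field_names:
        return "score"         # multi-field scores: prefer "score"
    # Assumption: the documentation does not say what happens here.
    raise ValueError("score_name is required for these scores.")


check_all_vs_all([[1.0, 0.5], [0.5, 1.0]], ["one", "two"])
print(resolve_score_name(["score", "matches"]))  # score
print(resolve_score_name(["cosine"]))            # cosine
```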

export_to_file(filename: str, graph_format: str = 'graphml')[source]

Save the network to a file.

Parameters:

filename – Path to output file.
graph_format – Output format. Supported formats are: "cyjs", "gexf", "gml", "graphml", "json".
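
If the output format should follow the file extension, a small wrapper can derive graph_format from filename. This is only a sketch: format_from_filename is a hypothetical helper that is not part of matchms, and it assumes that the file extension equals the format name.

```python
from pathlib import Path

# Formats listed above for export_to_file.
SUPPORTED_FORMATS = ("cyjs", "gexf", "gml", "graphml", "json")


def format_from_filename(filename, default="graphml"):
    """Return a graph_format value derived from the file extension."""
    suffix = Path(filename).suffix.lower().lstrip(".")
    if not suffix:
        return default  # no extension: fall back to the default format
    if suffix not in SUPPORTED_FORMATS:
        raise ValueError(f"Unsupported graph format: {suffix!r}")
    return suffix


print(format_from_filename("network.gexf"))  # gexf
print(format_from_filename("network"))       # graphml
```

The returned string could then be passed as graph_format, for example ms_network.export_to_file(filename, graph_format=format_from_filename(filename)).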

export_to_graphml(filename: str)[source]

Save the network as GraphML.