Author: Charles Tapley Hoyt
Estimated Run Time: 45 seconds
This notebook shows how to combine multiple graphs from different sources and summarize them together. This is useful in projects where several curators write BEL scripts that should be joined for scientific use but kept separate for provenance.
import os
import time
import sys
import pybel
import pybel_tools
from pybel_tools.summary import info_str
print(sys.version)
3.6.3 (default, Oct 9 2017, 09:47:56) [GCC 4.2.1 Compatible Apple LLVM 9.0.0 (clang-900.0.37)]
print(time.asctime())
Thu Mar 15 14:37:02 2018
pybel.utils.get_version()
'0.11.2-dev'
pybel_tools.utils.get_version()
'0.5.2-dev'
bms_base = os.environ['BMS_BASE']
human_dir = os.path.join(bms_base, 'cbn', 'Human-2.0')
mouse_dir = os.path.join(bms_base, 'cbn', 'Mouse-2.0')
rat_dir = os.path.join(bms_base, 'cbn', 'Rat-2.0')
In this notebook, pickled instances of networks from the Causal Biological Networks database are used.
%%time
graphs = []
for d in (human_dir, mouse_dir, rat_dir):
    for p in os.listdir(d):
        if not p.endswith('gpickle'):
            continue
        path = os.path.join(d, p)
        g = pybel.from_pickle(path)
        graphs.append(g)
CPU times: user 291 ms, sys: 78.2 ms, total: 369 ms
Wall time: 451 ms
len(graphs)
138
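The directory scan above could be factored into a small generator so that the file filter lives in one place. This is only a sketch: the name `iter_gpickle_paths` and its `suffix` parameter are hypothetical and not part of PyBEL, and the sorting is an added assumption to make the order reproducible.

```python
import os


def iter_gpickle_paths(directories, suffix='gpickle'):
    """Yield the full path of every file ending in ``suffix``, one directory at a time.

    Hypothetical helper for illustration; it is not part of PyBEL.
    """
    for directory in directories:
        # os.listdir returns names in arbitrary order, so sort for reproducibility
        for name in sorted(os.listdir(directory)):
            if name.endswith(suffix):
                yield os.path.join(directory, name)
```

With this helper, the loading loop could be written as `graphs = [pybel.from_pickle(path) for path in iter_gpickle_paths((human_dir, mouse_dir, rat_dir))]`.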
The graphs are combined with the union function, which retains all nodes and edges from each graph.
%%time
combine = pybel.struct.union(graphs)
CPU times: user 42.4 s, sys: 165 ms, total: 42.5 s
Wall time: 42.7 s
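To make the idea of a union concrete, here is a toy sketch that uses plain Python sets rather than PyBEL graphs. It is not how `pybel.struct.union` is implemented. The dictionary layout, the function name `union_sketch`, and the assumption that a node shared by two graphs appears only once in the result are all illustrative.

```python
def union_sketch(graphs):
    """Merge toy graphs by taking the set union of their nodes and edges.

    Each toy graph is a dict with a 'nodes' set and an 'edges' set of
    (source, target, relation) tuples. The inputs are left unmodified.
    """
    nodes, edges = set(), set()
    for graph in graphs:
        nodes |= graph['nodes']
        edges |= graph['edges']
    return {'nodes': nodes, 'edges': edges}


g1 = {'nodes': {'A', 'B'}, 'edges': {('A', 'B', 'increases')}}
g2 = {'nodes': {'B', 'C'}, 'edges': {('B', 'C', 'decreases')}}

merged = union_sketch([g1, g2])
print(len(merged['nodes']), len(merged['edges']))  # 3 2
```

Node `B` occurs in both toy graphs but is counted once in the merged result, while every edge from either graph is kept.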
print(info_str(combine))
Nodes: 5343
Edges: 28766
Citations: 4580
Authors: 0
Network density: 0.001007837278459561
Components: 466
Average degree: 5.383866741530975
Because networks are represented as Python objects, they can easily be manipulated and passed to existing functions that produce the appropriate summaries.
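As an illustration of passing graph objects to ordinary functions, the sketch below collects node and edge counts for each graph in a list. It relies only on the networkx-style methods `number_of_nodes()` and `number_of_edges()`, which PyBEL graphs inherit from networkx. The function `size_summary` and the `ToyGraph` stand-in are hypothetical and exist only so the example runs without any pickled networks.

```python
def size_summary(graphs):
    """Return a (nodes, edges) pair for each graph, in input order."""
    return [(g.number_of_nodes(), g.number_of_edges()) for g in graphs]


class ToyGraph:
    """Stand-in exposing the two networkx-style methods used above."""

    def __init__(self, nodes, edges):
        self._nodes = nodes
        self._edges = edges

    def number_of_nodes(self):
        return self._nodes

    def number_of_edges(self):
        return self._edges


rows = size_summary([ToyGraph(3, 2), ToyGraph(5, 7)])
print(rows)  # [(3, 2), (5, 7)]
```

Under the same assumptions, `size_summary(graphs)` would give per-network counts for provenance, and `size_summary([combine])` would give the counts for the joined network.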