In the introductory_tutorial
we ran through building structural covariance network analyses using scona
🍪.
In this tutorial we'll cover some of the visualisation tools to communicate these results.
Click on any of the links below to jump to that section
plot_degree
report_global_measures
import scona as scn
import scona.datasets as datasets
import numpy as np
import networkx as nx
import pandas as pd
from IPython.display import display
import matplotlib.pyplot as plt
import seaborn as sns
%matplotlib inline
%load_ext autoreload
%autoreload 2
If you're not sure about this step, please check out the introductory_tutorial
notebook for more explanation.
# Read in sample data from the NSPN WhitakerVertes PNAS 2016 paper.
df, names, covars, centroids = datasets.NSPN_WhitakerVertes_PNAS2016.import_data()
# calculate residuals of the matrix df for the columns of names
df_res = scn.create_residuals_df(df, names, covars)
# create a correlation matrix over the columns of df_res
M = scn.create_corrmat(df_res, method='pearson')
# Initialise a weighted graph G from the correlation matrix M
G = scn.BrainNetwork(network=M, parcellation=names, centroids=centroids)
# Threshold G at cost 10 to create a binary graph with 10% as many edges as the complete graph G.
G10 = G.threshold(10)
# Create a GraphBundle object that contains the G10 graph called "real_graph"
bundleGraphs = scn.GraphBundle([G10], ["real_graph"])
# Add ten random graphs to this bundle
# (In real life you'd want more than 10 random graphs,
# but this step can take quite a long time to run so
# for the demo we just create 10)
bundleGraphs.create_random_graphs("real_graph", 10)
Creating 10 random graphs - may take a little while
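To get a feel for what "thresholding at cost 10" means, here is a minimal sketch using only numpy. It is an illustration under simplifying assumptions, not scona's implementation: the helper `threshold_at_cost` is hypothetical, the data are random numbers, and scona's own thresholding may differ in detail (for example in how it keeps the graph connected).

```python
import numpy as np


def threshold_at_cost(corrmat, cost):
    """Binarise a symmetric matrix, keeping the strongest `cost` percent of edges.

    Hypothetical helper for illustration only (not part of scona).
    """
    n = corrmat.shape[0]
    upper = np.triu_indices(n, k=1)          # each node pair counted once
    weights = corrmat[upper]
    n_keep = int(round(len(weights) * cost / 100))
    strongest = np.argsort(weights)[::-1][:n_keep]
    adjacency = np.zeros((n, n), dtype=int)
    rows, cols = upper[0][strongest], upper[1][strongest]
    adjacency[rows, cols] = 1
    adjacency[cols, rows] = 1                # keep the matrix symmetric
    return adjacency


# Made-up data: 50 "participants" by 20 "regions"
rng = np.random.default_rng(0)
data = rng.normal(size=(50, 20))
corrmat = np.corrcoef(data, rowvar=False)    # 20 x 20 correlation matrix

adjacency = threshold_at_cost(corrmat, cost=10)
n_edges = adjacency.sum() // 2
n_possible = 20 * 19 // 2
print('{:d} of {:d} possible edges kept'.format(n_edges, n_possible))
```

With 20 nodes there are 190 possible edges, so a cost of 10 keeps the 19 strongest.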
plot_degree
The degree of each node is the number of edges adjacent to the node. For example, if a node is connected to four other nodes then its degree is 4. If it is connected to 50 other nodes, its degree is 50.
Brain networks are usually "scale-free" which means that their degree distribution follows a power law. You can think of them as having a "heavy tail": there are a small number of nodes that have a large number of connections.
This is in contrast to, for example, an Erdős–Rényi graph, where each pair of nodes is connected independently with the same fixed probability. This graph is often called a binomial graph because each possible connection is a yes/no trial, so the degree of each node follows a binomial distribution.
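If you'd like to see the binomial behaviour for yourself, the sketch below builds one Erdős–Rényi graph by flipping a biased coin for every pair of nodes. It uses only the standard library. The number of nodes (200) and the connection probability (0.1) are illustrative values, and the helper names are hypothetical rather than anything from scona.

```python
import itertools
import random
from math import comb


def erdos_renyi_degrees(n_nodes, p, seed=0):
    """Return the degree of every node in one random graph where each
    pair of nodes is connected independently with probability p."""
    rng = random.Random(seed)
    degrees = [0] * n_nodes
    for i, j in itertools.combinations(range(n_nodes), 2):
        if rng.random() < p:
            degrees[i] += 1
            degrees[j] += 1
    return degrees


def binomial_pmf(k, n_trials, p):
    """Probability of exactly k successes in n_trials yes/no trials."""
    return comb(n_trials, k) * p ** k * (1 - p) ** (n_trials - k)


n_nodes, p = 200, 0.1          # illustrative values only
er_degrees = erdos_renyi_degrees(n_nodes, p)

# Each node has (n_nodes - 1) possible partners, so its degree is
# binomial with mean p * (n_nodes - 1)
mean_degree = sum(er_degrees) / n_nodes
expected_mean = p * (n_nodes - 1)
pmf_total = sum(binomial_pmf(k, n_nodes - 1, p) for k in range(n_nodes))

print('mean degree = {:.2f}, binomial mean = {:.2f}'.format(mean_degree, expected_mean))
```

The degrees cluster tightly around the binomial mean, with no heavy tail of highly connected nodes.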
One of the first things to check for the structural covariance network analysis with scona
is that our degree distribution shows this pattern.
The degree distribution is already saved in the G10
graph object.
But we'll just spend a few moments showing how you can access that information.
You can make a dictionary of the node ids (the dictionary key) and their degree (the dictionary value).
degrees = dict(G10.degree())
# Print the degree of every 50th node to show what's inside this dictionary
for node_id, degree in list(degrees.items())[::50]:
    print('Node: {:3d} has degree = {:2d}'.format(node_id, degree))
Node:   0 has degree = 47
Node:  50 has degree = 16
Node: 100 has degree = 80
Node: 150 has degree =  9
Node: 200 has degree = 11
Node: 250 has degree = 37
Node: 300 has degree = 25
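The same pattern works on any networkx graph, which makes it easy to try out on something small. The sketch below uses a made-up four-node graph (not the brain network) and counts how many nodes have each degree, which is the information a degree distribution plot summarises.

```python
from collections import Counter

import networkx as nx

# A made-up toy graph: node 0 is a small "hub"
toy = nx.Graph()
toy.add_edges_from([(0, 1), (0, 2), (0, 3), (1, 2)])

# Dictionary of node id (key) and degree (value)
toy_degrees = dict(toy.degree())

# Count how many nodes have each degree
degree_counts = Counter(toy_degrees.values())
for degree in sorted(degree_counts):
    print('degree {:d}: {:d} node(s)'.format(degree, degree_counts[degree]))
```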
You can see the information for a specific node from the graph itself.
Although note that the degree needs to be calculated. It hasn't been added to the attributes yet.
# Display the nodal attributes
G10.nodes[150]
{'name': 'lh_insula_part3', 'x': -37.400137000000001, 'y': -8.5937070000000002, 'z': 4.4363890000000001, 'centroids': array([-37.400137, -8.593707, 4.436389])}
scona
has a command for that.
Let's go ahead and add the degree to the nodal attributes, along with a few other measures.
# Calculate nodal measures for graph G10
G10.calculate_nodal_measures()
# Display the nodal attributes
G10.nodes[150]
{'name': 'lh_insula_part3', 'x': -37.400137000000001, 'y': -8.5937070000000002, 'z': 4.4363890000000001, 'centroids': array([-37.400137, -8.593707, 4.436389]), 'module': 1, 'degree': 9, 'closeness': 0.39308578745198464, 'betweenness': 0.0011242664849842761, 'shortest_path_length': 2.5357142857142856, 'clustering': 0.5277777777777778, 'participation_coefficient': 0.0}
Look at all that information!
We only want to visualise the degree distribution at the moment though.
plot_degree_dist
# import the function to plot the degree distribution
from scona.visualisations import plot_degree_dist
We only need the BrainNetwork graph to plot the degree distribution.
By default we add an Erdős–Rényi random graph that has the same number of nodes as our BrainNetwork Graph for comparison. The default colours are blue for the degree distribution of the real graph and a grey line for the random graph.
plot_degree_dist(G10)
The random graph is a good sanity check that your degree distribution is not random...but it rather swamps the plot. So this example allows you to plot only the degree distribution of the real graph, without the random graph.
plot_degree_dist(G10, binomial_graph=False)
You can save this figure in any location.
You can do that by passing a file name and (optional) directory path to the figure_name
option.
If you don't set a directory path the figure will be saved in the local directory.
For this tutorial we'll save the output in a figures
folder inside this tutorials
directory.
plot_degree_dist(G10, binomial_graph=False, figure_name="figures/DegreeDistribution.png")
☝️ Did you see an error message?
The code checks to see if the directory that you want to save your figure to actually exists. If it doesn't then it creates the directory, but gives you a little warning first to check that it isn't coming as a surprise (for example if you have tried to save your figure in the wrong place!)
We have the tutorials/figures
directory specifically ignored in this project so we shouldn't ever see changes there.
Note that if you don't pass a file ending the file will be saved as a png
by default.
If you add a file extension allowed by matplotlib
(eg .jpg
, .svg
, .pdf
etc) then the figure will be saved in that format.
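The saving behaviour described above (a default png extension, plus a warning before a missing directory is created) can be sketched with the standard library. This is only an illustration of the idea: `resolve_figure_path` is a hypothetical helper and is not scona's actual code.

```python
import os
import tempfile
import warnings


def resolve_figure_path(figure_name, default_ext=".png"):
    """Hypothetical helper: add a default extension and make sure the
    target directory exists, warning before creating it."""
    root, ext = os.path.splitext(figure_name)
    if not ext:
        figure_name = root + default_ext
    directory = os.path.dirname(figure_name)
    if directory and not os.path.isdir(directory):
        warnings.warn("The directory {} does not exist and "
                      "will be created.".format(directory))
        os.makedirs(directory)
    return figure_name


# Try it out in a temporary directory so nothing is left behind
with tempfile.TemporaryDirectory() as tmp:
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        path = resolve_figure_path(
            os.path.join(tmp, "figures", "DegreeDistribution"))
    directory_created = os.path.isdir(os.path.join(tmp, "figures"))
    n_warnings = len(caught)
    extension = os.path.splitext(path)[1]

print(extension, directory_created, n_warnings)
```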
You can pass a pair of colours to the plot_degree_dist
function.
The first colour is that of the histogram for the real graph.
The second colour is the line for the Erdős-Rényi graph.
In the example below, we've chosen red and black 🎨
plot_degree_dist(G10, color=["red", "black"])
report_global_measures
One of the first things we want to know is how the global attributes of the network compare to those of random networks.
Specifically we'll calculate:
a: assortativity
C: clustering
E: efficiency
L: shortest path
M: modularity
sigma: small world
and plot a bar chart that compares the real network to the random graphs.
# Calculate the global measures
bundleGraphs_measures = bundleGraphs.report_global_measures()
# Show the dataframe so we can see the measures
display(bundleGraphs_measures)
graph | assortativity | average_clustering | average_shortest_path_length | efficiency | modularity |
---|---|---|---|---|---|
real_graph | 0.090769 | 0.449889 | 2.376243 | 0.479840 | 0.382855 |
real_graph_R0 | -0.088234 | 0.226946 | 2.084902 | 0.519406 | 0.121712 |
real_graph_R1 | -0.092617 | 0.247408 | 2.079741 | 0.520277 | 0.124544 |
real_graph_R2 | -0.087816 | 0.223323 | 2.084056 | 0.519651 | 0.130772 |
real_graph_R3 | -0.096873 | 0.230480 | 2.085008 | 0.519368 | 0.123251 |
real_graph_R4 | -0.079377 | 0.223141 | 2.084310 | 0.519523 | 0.123564 |
real_graph_R5 | -0.088001 | 0.233167 | 2.078832 | 0.520325 | 0.128539 |
real_graph_R6 | -0.093329 | 0.227506 | 2.076399 | 0.520728 | 0.125078 |
real_graph_R7 | -0.061999 | 0.227014 | 2.097064 | 0.517494 | 0.129245 |
real_graph_R8 | -0.080110 | 0.227883 | 2.088434 | 0.518872 | 0.120861 |
real_graph_R9 | -0.073402 | 0.234468 | 2.088646 | 0.518867 | 0.116025 |
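For intuition about what each column in the table holds, the sketch below computes the same kinds of measures directly with networkx on its built-in karate club graph (a stand-in, not the brain network). The choice of networkx functions here is an assumption about what is conceptually being measured. It is not taken from scona's source, and scona's values may be computed differently (for example, the module partition used for modularity).

```python
import networkx as nx
from networkx.algorithms import community

# A small, connected example graph that ships with networkx
toy = nx.karate_club_graph()

# Modularity needs a partition of the nodes into modules.
# Greedy modularity maximisation is just one way to get one.
modules = community.greedy_modularity_communities(toy, weight=None)

toy_measures = {
    'assortativity': nx.degree_assortativity_coefficient(toy),
    'average_clustering': nx.average_clustering(toy),
    'average_shortest_path_length': nx.average_shortest_path_length(toy),
    'efficiency': nx.global_efficiency(toy),
    'modularity': community.modularity(toy, modules, weight=None),
}

for name, value in toy_measures.items():
    print('{:30s} {:.4f}'.format(name, value))
```

The small world coefficient is left out of this sketch because it is defined relative to random reference graphs.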
Now you have everything you need to plot the network measures of the BrainNetwork graph and compare them to the values obtained from the 10 random graphs stored inside the graph bundle bundleGraphs.
plot_network_measures
# import the function to plot network measures
from scona.visualisations import plot_network_measures
There are 2 required parameters for the plot_network_measures
function:
a GraphBundle object (e.g. bundleGraphs)
the name of the real network in the GraphBundle (e.g. "real_graph")
The default colours are blue and grey, and by default the error bars show 95% confidence intervals.
plot_network_measures(bundleGraphs, real_network="real_graph")
You'll probably want to save the beautiful figure you've made!
You can do that by passing a file name and (optional) directory path to the figure_name
option.
If you don't set a directory path the figure will be saved in the local directory.
For this tutorial we'll save the output in a figures
folder inside this tutorials
directory.
For fun, we'll also adjust the colours to make the real network orange (#FF4400) and the random network turquoise (#00BBFF).
plot_network_measures(bundleGraphs, "real_graph",
                      figure_name="figures/NetworkMeasuresDemo",
                      color=["#FF4400", "#00BBFF"])
You might not want to show the legend. That's fine!
We'll also use this example to save an svg
file.
plot_network_measures(bundleGraphs, "real_graph",
                      figure_name="figures/NetworkMeasuresDemoNoLegend.svg",
                      show_legend=False)
You might not want to show the random graphs.
In this case you have to create a new graph bundle that only contains the real graph, and pass that to the plot_network_measures
function.
For this example we've also changed the colour to green (to show off 😉).
# Create a new graph bundle
realBundle = scn.GraphBundle([G10], ["real_graph"])
plot_network_measures(realBundle, real_network="real_graph",
                      color=["green"])
The variance of measures obtained from random graphs is - by default - shown as the 95% confidence interval.
They're calculated by bootstrapping the random graphs. There's more information in the seaborn documentation if you're curious.
But you don't have to calculate them. You can plot the standard deviations instead if you'd prefer. (These are a bit larger than the 95% confidence intervals so they're a bit easier to see in the plot below.)
plot_network_measures(bundleGraphs, real_network="real_graph",
                      ci="sd")
Alternatively you could show the 99% confidence interval.
plot_network_measures(bundleGraphs, real_network="real_graph",
                      ci=99)
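If you're wondering how a bootstrapped confidence interval differs from a standard deviation, here is a minimal sketch using only the standard library. It uses the ten random-graph modularity values from the table above. The helper `bootstrap_ci` is hypothetical and uses a simple percentile bootstrap of the mean, so its numbers will not exactly match the error bars that seaborn draws.

```python
import random
import statistics

# Modularity of the 10 random graphs, copied from the table above
random_modularity = [0.121712, 0.124544, 0.130772, 0.123251, 0.123564,
                     0.128539, 0.125078, 0.129245, 0.120861, 0.116025]


def bootstrap_ci(values, ci=95, n_boot=2000, seed=0):
    """Percentile bootstrap confidence interval for the mean.

    Hypothetical helper for illustration only.
    """
    rng = random.Random(seed)
    # Resample with replacement many times and record each mean
    means = sorted(
        statistics.mean(rng.choices(values, k=len(values)))
        for _ in range(n_boot)
    )
    tail = (100 - ci) / 200
    low = means[int(tail * n_boot)]
    high = means[int((1 - tail) * n_boot) - 1]
    return low, high


mean_mod = statistics.mean(random_modularity)
sd_mod = statistics.stdev(random_modularity)
low95, high95 = bootstrap_ci(random_modularity, ci=95)
low99, high99 = bootstrap_ci(random_modularity, ci=99)

print('mean = {:.4f}, sd = {:.4f}'.format(mean_mod, sd_mod))
print('95% CI = ({:.4f}, {:.4f})'.format(low95, high95))
print('99% CI = ({:.4f}, {:.4f})'.format(low99, high99))
```

The confidence interval describes uncertainty in the mean, so it shrinks as you add random graphs, whereas the standard deviation describes the spread of the individual values.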
You can't publish results with 10 random graphs. These don't give meaningful variations. So let's add 90 more random graphs.
(This still isn't enough, but much better than 10! We'd recommend that you run 1000 random graphs for publication quality results.)
This takes some time (around 5 minutes) so the cell below is commented out by default.
Remove the #
at the start of each of the lines below to run the commands yourself.
#bundleGraphs.create_random_graphs("real_graph", 90)
#print (len(bundleGraphs))
Congratulations! 🎉
You created an additional 90 random graphs, giving you a total of 100 random graphs and 1 real graph, and you managed to answer some of your emails while waiting.
Here's a beautiful plot of your network measures with 95% confidence intervals....which you can't see because the random networks are all so similar to each other 🤦
#plot_network_measures(bundleGraphs, real_network="real_graph")
The plot_rich_club function requires a GraphBundle object, which is the scona way of handling across-network comparisons. Basically, it is a dictionary containing BrainNetwork objects as values and strings (the corresponding names of the BrainNetwork objects) as keys.
You also need to pass the name of the real graph in the GraphBundle (e.g. "Real_Graph") as a string.
Let's create input for the function
print ('Woo')
Woo
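# instantiate the GraphBundle object with the BrainNetwork Graph and corresponding name for this Graph
# (H is not defined earlier in this tutorial; it is assumed to be a thresholded BrainNetwork such as G10 above)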
bundleGraphs = scn.GraphBundle([H], ["Real_Graph"])
This creates a dictionary-like object with BrainNetwork H keyed by 'Real_Graph'.
bundleGraphs
{'Real_Graph': <scona.classes.BrainNetwork at 0x7fa66f8ce160>}
Now add a series of random graphs created by edge swap randomisation of H (keyed by 'Real_Graph').
The create_random_graphs method of the GraphBundle class takes in a real network (in our case Real_Graph) and creates a number (10 in the example below) of random graphs. The output is a dictionary of all these graphs.
# Note that 10 is not usually a sufficient number of random graphs to do meaningful analysis,
# it is used here for time considerations
bundleGraphs.create_random_graphs("Real_Graph", 10)
Creating 10 random graphs - may take a little while
bundleGraphs
{'Real_Graph': <scona.classes.BrainNetwork at 0x7fa66f8ce160>, 'Real_Graph_R0': <scona.classes.BrainNetwork at 0x7fa66cc893c8>, 'Real_Graph_R1': <scona.classes.BrainNetwork at 0x7fa6a26c6b70>, 'Real_Graph_R2': <scona.classes.BrainNetwork at 0x7fa68f7a0d30>, 'Real_Graph_R3': <scona.classes.BrainNetwork at 0x7fa68f7a0e10>, 'Real_Graph_R4': <scona.classes.BrainNetwork at 0x7fa68f7a0b00>, 'Real_Graph_R5': <scona.classes.BrainNetwork at 0x7fa66cc89400>, 'Real_Graph_R6': <scona.classes.BrainNetwork at 0x7fa66cc894a8>, 'Real_Graph_R7': <scona.classes.BrainNetwork at 0x7fa66cc89438>, 'Real_Graph_R8': <scona.classes.BrainNetwork at 0x7fa66cc89390>, 'Real_Graph_R9': <scona.classes.BrainNetwork at 0x7fa66cc89358>}
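To see what edge swap randomisation does, here is a small sketch using networkx's `double_edge_swap` on the built-in karate club graph. This illustrates the general idea that swapping edges keeps every node's degree the same. Whether scona's own randomisation works exactly this way is an assumption, and the number of swaps (50) is an arbitrary illustrative value.

```python
import networkx as nx

original = nx.karate_club_graph()
randomised = original.copy()

# Repeatedly pick two edges (u-v and x-y) and rewire them to u-x and v-y.
# The graph is modified in place.
nx.double_edge_swap(randomised, nswap=50, max_tries=5000, seed=0)

# Every node keeps its degree...
same_degrees = dict(original.degree()) == dict(randomised.degree())

# ...but the edges themselves are different
original_edges = set(map(frozenset, original.edges()))
randomised_edges = set(map(frozenset, randomised.edges()))
edges_changed = original_edges != randomised_edges

print('Degrees preserved: {}'.format(same_degrees))
print('Edges changed:     {}'.format(edges_changed))
```

Because the degrees are preserved, any difference between the real and random graphs cannot be explained by the degree distribution alone.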
Well done! The required input is ready: a GraphBundle that contains the real network keyed by "Real_Graph" and 10 random graphs. Now let's plot the rich club coefficient values of our BrainNetwork graph and compare them to the rich club values obtained from the 10 random graphs stored inside the GraphBundle.
# import the function to plot rich club values
from scona.visualisations import plot_rich_club
# plot the figure and display without saving to a file
plot_rich_club(bundleGraphs, real_network="Real_Graph")
# show rich club values for degrees from 55 to 65
rich_club_df = bundleGraphs.report_rich_club()
rich_club_df.iloc[55:66, :]
degree | Real_Graph | Real_Graph_R0 | Real_Graph_R1 | Real_Graph_R2 | Real_Graph_R3 | Real_Graph_R4 | Real_Graph_R5 | Real_Graph_R6 | Real_Graph_R7 | Real_Graph_R8 | Real_Graph_R9 |
---|---|---|---|---|---|---|---|---|---|---|---|
55 | 0.566783 | 0.500581 | 0.494774 | 0.472706 | 0.500581 | 0.480836 | 0.486643 | 0.479675 | 0.477352 | 0.499419 | 0.501742 |
56 | 0.574390 | 0.509756 | 0.501220 | 0.473171 | 0.501220 | 0.482927 | 0.485366 | 0.481707 | 0.480488 | 0.502439 | 0.508537 |
57 | 0.578205 | 0.510256 | 0.511538 | 0.480769 | 0.510256 | 0.488462 | 0.487179 | 0.491026 | 0.488462 | 0.503846 | 0.515385 |
58 | 0.599099 | 0.522523 | 0.513514 | 0.483483 | 0.533033 | 0.490991 | 0.496997 | 0.496997 | 0.490991 | 0.513514 | 0.518018 |
59 | 0.606723 | 0.534454 | 0.527731 | 0.494118 | 0.546218 | 0.500840 | 0.500840 | 0.500840 | 0.505882 | 0.529412 | 0.526050 |
60 | 0.606723 | 0.534454 | 0.527731 | 0.494118 | 0.546218 | 0.500840 | 0.500840 | 0.500840 | 0.505882 | 0.529412 | 0.526050 |
61 | 0.615054 | 0.552688 | 0.535484 | 0.509677 | 0.556989 | 0.505376 | 0.518280 | 0.513978 | 0.522581 | 0.537634 | 0.556989 |
62 | 0.625287 | 0.560920 | 0.547126 | 0.512644 | 0.560920 | 0.503448 | 0.521839 | 0.528736 | 0.519540 | 0.542529 | 0.558621 |
63 | 0.652422 | 0.564103 | 0.561254 | 0.535613 | 0.561254 | 0.521368 | 0.532764 | 0.541311 | 0.532764 | 0.566952 | 0.572650 |
64 | 0.653333 | 0.580000 | 0.570000 | 0.536667 | 0.560000 | 0.543333 | 0.540000 | 0.546667 | 0.540000 | 0.573333 | 0.583333 |
65 | 0.663043 | 0.586957 | 0.583333 | 0.547101 | 0.561594 | 0.550725 | 0.539855 | 0.547101 | 0.536232 | 0.576087 | 0.594203 |
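If you're curious what a single number in the table above means, the sketch below computes the rich club coefficient by hand for a made-up toy graph. It uses the usual definition: for a degree k, take the nodes with degree greater than k and ask what fraction of the possible edges between them actually exist. It is an assumption that this matches how scona computes the values, and the helper function is hypothetical.

```python
from collections import Counter

# Toy graph: a fully connected "club" of nodes 0-3,
# plus two poorly connected nodes (4 and 5)
toy_edges = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3),
             (0, 4), (1, 5)]


def rich_club_coefficient(edges, k):
    """Density of connections among nodes with degree greater than k.

    Returns None when fewer than two nodes qualify.
    """
    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    club = {node for node, d in degree.items() if d > k}
    if len(club) < 2:
        return None
    club_edges = sum(1 for u, v in edges if u in club and v in club)
    possible_edges = len(club) * (len(club) - 1) / 2
    return club_edges / possible_edges


toy_rich_club = {k: rich_club_coefficient(toy_edges, k) for k in range(5)}
for k, value in toy_rich_club.items():
    print('k = {:d}: {}'.format(k, value))
```

At k = 0 every node is included and only 8 of the 15 possible edges exist. From k = 1 the two poorly connected nodes drop out and the remaining club is fully connected.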
More examples of plotting rich club values:
plot_rich_club(bundleGraphs, real_network="Real_Graph",figure_name="Rich_club_values", color=["#FF4400", "#00BBFF"])
To save a figure, pass your own location (the path to the file) to figure_name.
Note: if the location does not exist, we will notify you and try to create the necessary directories automatically.
plot_rich_club(bundleGraphs, real_network="Real_Graph", figure_name="/home/pilot/GSoC/mynewdir1/Rich_Club_Values",
               show_legend=False)
/home/pilot/anaconda3/lib/python3.6/site-packages/scona/helpers.py:25: UserWarning: The path - /home/pilot/GSoC/mynewdir1/Rich_Club_Values does not exist. But we will create this directory for you and store the figure there. "directory for you and store the figure there.".format(path_name))
If you don't want to show the random graphs, simply do not create random graphs in the GraphBundle.
realGraph = scn.GraphBundle([H], ["Real_Graph"])
realGraph
{'Real_Graph': <scona.classes.BrainNetwork at 0x7fa66f8ce160>}
plot_rich_club(realGraph, real_network="Real_Graph", color=["green"])