## Run GraphScope like NetworkX

GraphScope provides a set of graph analysis interfaces compatible with NetworkX.

In this article, we show how to use GraphScope to perform graph analysis the same way you would with NetworkX.

### How does NetworkX perform graph analysis?

Usually, the graph analysis process of NetworkX starts with the construction of a graph.

In the following example, we first create an empty graph and then add vertices and edges to it through the NetworkX interface.

In [ ]:
# Install graphscope package if you are NOT in the Playground

!pip3 install graphscope

In [ ]:
import networkx

In [ ]:
# Initialize an empty graph
G = networkx.Graph()

# Add edges (1, 2) and (1, 3) by add_edges_from interface
G.add_edges_from([(1, 2), (1, 3)])

# Add vertex "4" by add_node interface
G.add_node(4)

Then we can query the graph information.

In [ ]:
# Query the number of vertices by number_of_nodes interface.
G.number_of_nodes()

In [ ]:
# Similarly, query the number of edges by number_of_edges interface.
G.number_of_edges()

In [ ]:
# Query the degree of each vertex by degree interface.
sorted(d for n, d in G.degree())


Finally, call the built-in algorithms of NetworkX to analyze the graph G.

In [ ]:
# Run 'connected components' algorithm
list(networkx.connected_components(G))

In [ ]:
# Run 'clustering' algorithm
networkx.clustering(G)
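
To make the two algorithms above less of a black box, here is a minimal pure-Python sketch of what they compute. It is not NetworkX's implementation; the helper names and the toy adjacency dictionary are hypothetical and chosen only for illustration. A connected component is found by a simple graph traversal, and the local clustering coefficient of a vertex is the fraction of its neighbor pairs that are themselves connected.

```python
from itertools import combinations


def connected_components(adj):
    """Yield each connected component of an undirected graph as a set."""
    seen = set()
    for start in adj:
        if start in seen:
            continue
        component, stack = {start}, [start]
        while stack:
            for nbr in adj[stack.pop()]:
                if nbr not in component:
                    component.add(nbr)
                    stack.append(nbr)
        seen |= component
        yield component


def local_clustering(adj):
    """Fraction of each vertex's neighbor pairs that are themselves linked."""
    result = {}
    for v, nbrs in adj.items():
        k = len(nbrs)
        if k < 2:
            # Fewer than two neighbors: no pairs to check
            result[v] = 0.0
            continue
        links = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
        result[v] = 2 * links / (k * (k - 1))
    return result


# Hypothetical toy graph: triangle 1-2-3, a tail 3-4, and an isolated vertex 5
adj = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4}, 4: {3}, 5: set()}
components = list(connected_components(adj))
coefficients = local_clustering(adj)
print(components)
print(coefficients)
```

In this toy graph, vertex 3 has three neighbors but only one linked pair among them, so its coefficient is 1/3, while vertices 1 and 2 sit entirely inside the triangle.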


### How to use NetworkX interface from GraphScope

Graph Building

To use the NetworkX interface from GraphScope, we just need to replace import networkx as nx with import graphscope.nx as nx.

Here we use the nx.Graph() interface to create an empty undirected graph.

In [ ]:
import graphscope
graphscope.set_option(show_log=True)
import graphscope.nx as nx

# Initialize an empty graph
G = nx.Graph()
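
Because only the import line changes, the same script can be written so that it runs with either library. The sketch below is an illustrative convenience rather than something the tutorial requires: it assumes you want to fall back to plain NetworkX when the graphscope package is not installed.

```python
# Prefer GraphScope's NetworkX-compatible module; fall back to plain NetworkX
# when graphscope is not installed (an assumption made for this sketch).
try:
    import graphscope.nx as nx
except ImportError:
    import networkx as nx

# The rest of the code is identical for both libraries
G = nx.Graph()
G.add_edges_from([(1, 2), (1, 3)])
G.add_node(4)
print(G.number_of_nodes(), G.number_of_edges())
```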


Just as in NetworkX, you can add vertices with add_node or add_nodes_from, and add edges with add_edge or add_edges_from.

In [ ]:
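# Add one vertex by add_node interface
G.add_node(1)

# Or add a batch of vertices from iterable list
G.add_nodes_from([2, 3])
# Vertices can also carry attributes
G.add_nodes_from([(4, {"color": "red"}), (5, {"color": "green"})])

# Similarly, add one edge by add_edge interface
e = (2, 3)
G.add_edge(*e)

# Or add a batch of edges from iterable list
G.add_edges_from([(1, 2), (1, 3)])
# Edges can also carry attributes
G.add_edges_from([(1, 2), (2, 3, {'weight': 3.1415})])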
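
The attribute dictionaries passed while adding vertices and edges can be read back through the nodes and edges views. The sketch below runs against plain NetworkX so that it works without a GraphScope session; given the compatibility described above, swapping the import for graphscope.nx is expected to behave the same, though that is an assumption here rather than something verified.

```python
import networkx as nx

G = nx.Graph()
G.add_nodes_from([(4, {"color": "red"}), (5, {"color": "green"})])
G.add_edges_from([(1, 2), (2, 3, {"weight": 3.1415})])

# Attributes are stored per vertex and per edge
node_color = G.nodes[4]["color"]
edge_weight = G.edges[2, 3]["weight"]

# An edge added without attributes has an empty attribute dictionary
plain_edge_attrs = dict(G.edges[1, 2])

# Endpoints of added edges are created automatically as vertices
all_nodes = sorted(G.nodes)
print(node_color, edge_weight, plain_edge_attrs, all_nodes)
```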


Query Graph

Just as in NetworkX, you can query the number of vertices and edges with the number_of_nodes and number_of_edges interfaces, or query the neighbors of a vertex with the adj interface.

In [ ]:
# Query the number of vertices by number_of_nodes interface.
G.number_of_nodes()

In [ ]:
# Similarly, query the number of edges by number_of_edges interface.
G.number_of_edges()

In [ ]:
# list the vertices in graph G
list(G.nodes)

In [ ]:
# list the edges in graph G
list(G.edges)

In [ ]:
# query the neighbors of vertex '1'
list(G.adj[1])

In [ ]:
# query the degree of vertex '1'
G.degree(1)
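
As a self-contained illustration of these queries, the sketch below builds the small graph from the first section and inspects it. It uses plain NetworkX so it can run anywhere; the variable names are chosen for this example only.

```python
import networkx as nx

G = nx.Graph()
G.add_edges_from([(1, 2), (1, 3)])
G.add_node(4)

# Two equivalent ways to list the neighbors of vertex 1
neighbors_via_adj = sorted(G.adj[1])
neighbors_via_method = sorted(G.neighbors(1))

# Degree of a single vertex, and of every vertex at once
degree_of_1 = G.degree(1)
degrees = dict(G.degree())
print(neighbors_via_adj, degree_of_1, degrees)
```

Vertex 4 was added without any edges, so it appears in the degree mapping with degree 0.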


Delete

Just as in NetworkX, you can remove vertices with the remove_node or remove_nodes_from interface, and remove edges with the remove_edge or remove_edges_from interface.

In [ ]:
# remove one vertex by remove_node interface
G.remove_node(5)
list(G.nodes)

In [ ]:
# remove a batch of vertices by remove_nodes_from interface
G.remove_nodes_from([4, 5])
list(G.nodes)

In [ ]:
# remove one edge by remove_edge interface
G.remove_edge(1, 2)
list(G.edges)

In [ ]:
# remove a batch of edges by remove_edges_from interface
G.remove_edges_from([(1, 3), (2, 3)])
list(G.edges)

In [ ]:
# query the number of vertices after removal
G.number_of_nodes()

In [ ]:
# query the number of edges after removal
G.number_of_edges()
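
The single-item and batch removal interfaces differ in how they treat missing items, which is why the batch calls above can mention a vertex that is already gone. The sketch below shows this with plain NetworkX; whether graphscope.nx raises the same exception type is not confirmed here.

```python
import networkx as nx

G = nx.Graph()
G.add_edges_from([(1, 2), (2, 3)])
G.add_nodes_from([4, 5])

G.remove_node(5)
# Batch removal silently skips vertex 5, which no longer exists
G.remove_nodes_from([4, 5])

# Single removal of a missing vertex raises an error instead
try:
    G.remove_node(5)
    raised = False
except nx.NetworkXError:
    raised = True

# Removing a vertex also removes every edge attached to it
G.remove_node(2)
remaining_nodes = sorted(G.nodes)
remaining_edges = list(G.edges)
print(raised, remaining_nodes, remaining_edges)
```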


Graph Analysis

The graph analysis interface in GraphScope is also compatible with NetworkX.

In the following examples, we use connected_components to find the connected components of the graph, clustering to get the clustering coefficient of each vertex, and all_pairs_shortest_path to compute the shortest paths between every pair of vertices.

In [ ]:
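# Build the graph (same edges and vertex as in the NetworkX example above)
G = nx.Graph()
G.add_edges_from([(1, 2), (1, 3)])
G.add_node(4)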

In [ ]:
# Run connected_components
list(nx.connected_components(G))

In [ ]:
# Run clustering
nx.clustering(G)

In [ ]:
# Run all_pairs_shortest_path
sp = dict(nx.all_pairs_shortest_path(G))
sp[3]
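
The result of all_pairs_shortest_path is a nested mapping: source vertex, then target vertex, then the path as a list of vertices. A small sketch with plain NetworkX, using the example graph from the first section, makes the shape concrete.

```python
import networkx as nx

G = nx.Graph()
G.add_edges_from([(1, 2), (1, 3)])
G.add_node(4)

sp = dict(nx.all_pairs_shortest_path(G))

# Paths starting at vertex 3, keyed by target vertex
paths_from_3 = sp[3]
# Vertex 4 is isolated, so it only reaches itself
paths_from_4 = sp[4]
print(paths_from_3)
print(paths_from_4)
```

Targets that cannot be reached from the source are simply absent from the inner mapping, so vertex 4 does not appear in the paths from vertex 3.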


Graph Display

As in NetworkX, you can draw a graph with the draw interface, which relies on the drawing functions of Matplotlib.

Install matplotlib first if you are not in the Playground environment.

In [ ]:
!pip3 install matplotlib


In [ ]:
# Create a star graph: one center vertex connected to 5 outer vertices
G = nx.star_graph(5)

# Draw
nx.draw(G, with_labels=True, font_weight='bold')
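
Before drawing, it can help to check the shape of the generated graph. In plain NetworkX, star_graph(n) creates one center vertex (labeled 0) plus n outer vertices; the sketch below verifies this without needing Matplotlib. That graphscope.nx follows the same convention is assumed from its NetworkX compatibility.

```python
import networkx as nx

G = nx.star_graph(5)

vertex_count = G.number_of_nodes()
edge_count = G.number_of_edges()
center_degree = G.degree(0)
print(vertex_count, edge_count, center_degree)
```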


### The performance speed-up of GraphScope over NetworkX can reach several orders of magnitude

Let's run a simple experiment to see how much GraphScope improves algorithm performance compared with NetworkX.

We run the clustering algorithm on a Twitter dataset.

In [ ]:
!wget https://raw.githubusercontent.com/GraphScope/gstest/master/twitter.e -P /tmp


In [ ]:
import os
import graphscope.nx as gs_nx
import networkx as nx

In [ ]:
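# Loading graph in NetworkX
# (assumes the edge list downloaded above is at /tmp/twitter.e)
g1 = nx.read_edgelist(
    os.path.expandvars('/tmp/twitter.e'), nodetype=int, data=False, create_using=nx.Graph
)
type(g1)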

In [ ]:
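# Loading graph in GraphScope
# (assumes the edge list downloaded above is at /tmp/twitter.e)
g2 = gs_nx.read_edgelist(
    os.path.expandvars('/tmp/twitter.e'), nodetype=int, data=False, create_using=gs_nx.Graph
)
type(g2)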
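
Edge-list files such as the one downloaded above are plain text with one edge per line. The sketch below shows how NetworkX parses such lines, using hypothetical sample data; it assumes whitespace-separated integer vertex ids, which may not match the real file exactly.

```python
import networkx as nx

# Hypothetical sample lines in "source target [extra columns]" form
lines = ["1 2", "2 3 0.5", "3 1"]

# nodetype=int converts vertex ids to integers;
# data=False ignores any columns after the two endpoints
g = nx.parse_edgelist(lines, nodetype=int, data=False, create_using=nx.Graph)

parsed_nodes = sorted(g.nodes)
parsed_edge_count = g.number_of_edges()
extra_column_attrs = dict(g.edges[2, 3])
print(parsed_nodes, parsed_edge_count, extra_column_attrs)
```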


Run the algorithm in both GraphScope and NetworkX and display the elapsed time of each.

In [ ]:
%%time
# GraphScope
ret_gs = gs_nx.clustering(g2)

In [ ]:
%%time
# NetworkX
ret_nx = nx.clustering(g1)

In [ ]:
# Result comparison
ret_gs == ret_nx
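
Clustering coefficients are floating-point values, so two implementations can disagree in the last digit even when both are correct, and an exact == comparison may then report a mismatch. The helpers below are a hypothetical sketch (their names and tolerances are not part of either library) for comparing two result dictionaries within a tolerance and for timing a call outside a notebook, where the %%time magic is unavailable.

```python
import math
import time


def timed(func, *args, **kwargs):
    """Call func and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = func(*args, **kwargs)
    return result, time.perf_counter() - start


def results_close(a, b, rel_tol=1e-9, abs_tol=1e-12):
    """True if both dicts have the same keys and nearly equal values."""
    if a.keys() != b.keys():
        return False
    return all(
        math.isclose(a[k], b[k], rel_tol=rel_tol, abs_tol=abs_tol) for k in a
    )


# Hypothetical stand-ins for two clustering results that differ by rounding
ret_a = {1: 1 / 3, 2: 0.0}
ret_b = {1: 0.3333333333333334, 2: 0.0}

same, elapsed = timed(results_close, ret_a, ret_b)
print(ret_a == ret_b, same)
```

With the real results, the equivalent call would be results_close(ret_gs, ret_nx), assuming both behave like dictionaries keyed by vertex.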
