Graph Operations with NetworkX-Compatible APIs¶

GraphScope supports graph operations through NetworkX-compatible APIs. This tutorial follows the organization of the NetworkX tutorial to introduce these APIs.

In [ ]:

# Install graphscope package if you are NOT in the Playground

!pip3 install graphscope

In [ ]:

# Import the graphscope and graphscope networkx module.

import graphscope
import graphscope.nx as nx

graphscope.set_option(show_log=False)  # disable log output (set to True to enable it)

Creating a graph¶

To create an empty graph, simply create a Graph object.

In [ ]:

G = nx.Graph()

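Nodes¶

The graph G can be grown in several ways. graphscope.nx accepts a subset of hashable Python objects as nodes: int, str, float, tuple, and bool objects. Starting from an empty graph and simple manipulations, you can add one node at a time,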

In [ ]:

G.add_node(1)

or add nodes from any iterable container, such as a list:

In [ ]:

G.add_nodes_from([2, 3])

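You can also add nodes together with their attributes by passing a container of 2-tuples of the form (node, node_attribute_dict), as shown below.

Node attributes are discussed further below.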

In [ ]:

G.add_nodes_from(
    [
        (4, {"color": "red"}),
        (5, {"color": "green"}),
    ]
)

Nodes from one graph can also be added directly to another:

In [ ]:

H = nx.path_graph(10)
G.add_nodes_from(H)

After the operations above, G contains the nodes of H.

In [ ]:

list(G.nodes)

In [ ]:

list(G.nodes.data())  # shows the node attributes

Edges¶

G can also be grown by adding one edge at a time,

In [ ]:

G.add_edge(1, 2)
e = (2, 3)
G.add_edge(*e)  # unpack edge tuple

In [ ]:

list(G.edges)

or by adding a list of edges,

In [ ]:

G.add_edges_from([(1, 2), (1, 3)])

In [ ]:

list(G.edges)

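or by adding any ebunch of edges. An ebunch is any iterable container of edge tuples. An edge tuple can be a 2-tuple holding the two end nodes, e.g. (1, 3), or a 3-tuple holding the two nodes followed by an edge attribute dictionary, e.g. (2, 3, {'weight': 3.1415}).

Edge attributes are discussed further below.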

In [ ]:

G.add_edges_from([(2, 3, {"weight": 3.1415})])

In [ ]:

list(G.edges.data())  # shows the edge attributes
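
To make the two edge-tuple shapes concrete, the following plain-Python sketch normalizes an ebunch into (u, v, attribute_dict) triples. The helper name normalize_ebunch is hypothetical and is not part of graphscope.nx; it only illustrates how 2-tuples and 3-tuples can be told apart.

```python
def normalize_ebunch(ebunch):
    """Yield (u, v, attrs) for every edge tuple in an ebunch (illustrative helper)."""
    for edge in ebunch:
        if len(edge) == 2:
            u, v = edge
            attrs = {}  # a bare (u, v) pair carries no attributes
        elif len(edge) == 3:
            u, v, attrs = edge
        else:
            raise ValueError(f"edge tuple {edge!r} must have 2 or 3 items")
        # copy the dict so later changes by the caller do not leak in
        yield u, v, dict(attrs)


normalized = list(normalize_ebunch([(1, 3), (2, 3, {"weight": 3.1415})]))
print(normalized)
```

Any iterable works as input here, for example a list, a set of tuples, or a generator.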

In [ ]:

G.add_edges_from(H.edges)

In [ ]:

list(G.edges)

You can also add nodes and edges in a single call with .update(edges=..., nodes=...):

In [ ]:

G.update(edges=[(10, 11), (11, 12)], nodes=[10, 11, 12])

In [ ]:

list(G.nodes)

In [ ]:

list(G.edges)

Adding nodes or edges that already exist is not an error; they are silently ignored. For example, after removing all nodes and edges,

In [ ]:

G.clear()

we add new nodes and edges, and graphscope.nx ignores any that are already present.

In [ ]:

G.add_edges_from([(1, 2), (1, 3)])
G.add_node(1)
G.add_edge(1, 2)
G.add_node("spam")  # adds node "spam"
G.add_nodes_from("spam")  # adds 4 nodes: 's', 'p', 'a', 'm'
G.add_edge(3, "m")

At this point G has 8 nodes and 3 edges, which can be checked as follows:

In [ ]:

G.number_of_nodes()

In [ ]:

G.number_of_edges()
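
The counts above can be reproduced with a small plain-Python model of an undirected simple graph stored as a dict of dicts. This is a sketch for intuition only: the helper functions are hypothetical and do not reflect how graphscope.nx stores graphs internally.

```python
# Model: node -> {neighbor: edge_attribute_dict}
adj = {}


def add_node(n):
    # setdefault leaves an existing entry untouched, so re-adding is a no-op
    adj.setdefault(n, {})


def add_edge(u, v):
    add_node(u)
    add_node(v)
    # reuse the attribute dict if the edge already exists
    attrs = adj[u].get(v, {})
    adj[u][v] = attrs
    adj[v][u] = attrs


for u, v in [(1, 2), (1, 3)]:
    add_edge(u, v)
add_node(1)        # already present: ignored
add_edge(1, 2)     # already present: ignored
add_node("spam")   # one node named "spam"
for ch in "spam":  # a string iterates as characters: 's', 'p', 'a', 'm'
    add_node(ch)
add_edge(3, "m")

num_nodes = len(adj)
# every undirected edge is stored in both directions, so halve the total
num_edges = sum(len(nbrs) for nbrs in adj.values()) // 2
print(num_nodes, num_edges)  # 8 3
```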

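Examining elements of a graph¶

We can examine the nodes and edges of a graph. Four basic graph properties are available for this: G.nodes, G.edges, G.adj and G.degree. These are set-like views of the nodes, edges, neighbors (adjacencies), and degrees of the nodes in a graph, and they offer a read-only view of the graph structure. The views are also dict-like: you can look up node and edge attributes through them and iterate over the data attributes with the methods .items() and .data('span'), where 'span' stands for an attribute name.

If you need a specific container type rather than a view, you can build one from the view. Here we use lists, though sets, dicts, tuples and other containers may be a better fit in other contexts.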

In [ ]:

list(G.nodes)

In [ ]:

list(G.edges)

In [ ]:

list(G.adj[1])  # or list(G.neighbors(1))

In [ ]:

G.degree[1]  # the number of edges incident to 1
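
The difference between a live view and a container built from it can be illustrated with Python's own dict views, which are also set-like. This is only an analogy that uses a plain dict as a stand-in for the graph; it is not graphscope.nx code.

```python
# A plain dict standing in for the adjacency structure of a graph
adj = {1: {2: {}}, 2: {1: {}}}

nodes_view = adj.keys()  # live, set-like view of the keys
nodes_list = list(adj)   # snapshot taken right now

adj[3] = {}              # "add a node" after both were created

print(3 in nodes_view)   # True: the view follows the underlying dict
print(3 in nodes_list)   # False: the list is frozen at creation time

# set-like operations work directly on the view
common = sorted(nodes_view & {1, 3, 99})
print(common)            # [1, 3]
```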

You can report the edges and degrees for a subset of nodes using an nbunch. An nbunch is any of: None (meaning all nodes), a single node, or an iterable container of nodes.

In [ ]:

G.edges([2, "m"])

In [ ]:

G.degree([2, 3])
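
The three forms of an nbunch can be sketched in plain Python as follows. The function nbunch_nodes and the small adjacency dict are assumptions made for illustration; they are not the graphscope.nx implementation.

```python
# Adjacency sets for a small graph with edges 1-2, 1-3, 3-"m" and an isolated node "spam"
adj = {1: {2, 3}, 2: {1}, 3: {1, "m"}, "m": {3}, "spam": set()}


def nbunch_nodes(adj, nbunch=None):
    """Resolve an nbunch to a list of nodes that exist in adj."""
    if nbunch is None:
        return list(adj)  # None means all nodes
    try:
        if nbunch in adj:
            return [nbunch]  # a single node
    except TypeError:
        pass  # unhashable container such as a list
    # otherwise treat it as an iterable of nodes and skip unknown ones
    return [n for n in nbunch if n in adj]


all_nodes = nbunch_nodes(adj)
single = nbunch_nodes(adj, "spam")
subset = nbunch_nodes(adj, [2, "m", 99])
degrees = [(n, len(adj[n])) for n in nbunch_nodes(adj, [2, 3])]
print(degrees)  # [(2, 1), (3, 2)]
```

Note that in this sketch a string that is itself a node, such as "spam", is resolved as one node rather than as an iterable of characters.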

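Removing elements from a graph¶

Nodes and edges can be removed from the graph in much the same way as they are added, using the methods Graph.remove_node(), Graph.remove_nodes_from(), Graph.remove_edge() and Graph.remove_edges_from(), e.g.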

In [ ]:

G.remove_node(2)
G.remove_nodes_from("spam")
list(G.nodes)

In [ ]:

list(G.edges)

In [ ]:

G.remove_edge(1, 3)
G.remove_edges_from([(1, 2), (2, 3)])
list(G.edges)

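Using the graph constructors¶

Graph objects do not have to be built up incrementally: data specifying the graph structure can be passed directly to the Graph/DiGraph constructors. When you create a graph by instantiating one of the graph classes, you can specify the data in several formats, as shown below.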

In [ ]:

G.add_edge(1, 2)
H = nx.DiGraph(G)  # create a DiGraph using the connections from G
list(H.edges())

In [ ]:

edgelist = [(0, 1), (1, 2), (2, 3)]
H = nx.Graph(edgelist)
list(H.edges)

Accessing edges and neighbors¶

In addition to the views Graph.edges and Graph.adj, edges and neighbors can be accessed using subscript notation:

In [ ]:

G = nx.Graph([(1, 2, {"color": "yellow"})])

In [ ]:

G[1]  # same as G.adj[1]

In [ ]:

G[1][2]

In [ ]:

G.edges[1, 2]

If the edge already exists, you can get or set its attributes using subscript notation:

In [ ]:

G.add_edge(1, 3)
G[1][3]["color"] = "blue"
G.edges[1, 3]

In [ ]:

G.edges[1, 2]["color"] = "red"
G.edges[1, 2]

You can quickly examine all (node, adjacency) pairs using G.adjacency() or G.adj.items(), as shown below.

Note that for undirected graphs, adjacency iteration sees each edge twice.

In [ ]:

FG = nx.Graph()
FG.add_weighted_edges_from([(1, 2, 0.125), (1, 3, 0.75), (2, 4, 1.2), (3, 4, 0.375)])
for n, nbrs in FG.adj.items():
    for nbr, eattr in nbrs.items():
        wt = eattr["weight"]
        if wt < 0.5:
            print(f"({n}, {nbr}, {wt:.3})")

All edges and their attributes can be accessed conveniently through the edges property, as shown below.

In [ ]:

for (u, v, wt) in FG.edges.data("weight"):
    if wt < 0.5:
        print(f"({u}, {v}, {wt:.3})")
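
Why adjacency iteration reports each undirected edge twice, while edge iteration reports it once, can be seen in a plain-Python model of the same weighted graph. The dict-of-dicts layout below is an assumption chosen for illustration, not the internal storage of graphscope.nx.

```python
weighted_edges = [(1, 2, 0.125), (1, 3, 0.75), (2, 4, 1.2), (3, 4, 0.375)]

# An undirected edge u-v is recorded under both u and v
adj = {}
for u, v, w in weighted_edges:
    adj.setdefault(u, {})[v] = {"weight": w}
    adj.setdefault(v, {})[u] = {"weight": w}

# Walking the adjacency therefore meets each light edge in both directions
from_adjacency = [
    (n, nbr, data["weight"])
    for n, nbrs in adj.items()
    for nbr, data in nbrs.items()
    if data["weight"] < 0.5
]

# Deduplicate by treating (u, v) and (v, u) as the same unordered pair
seen = set()
unique = []
for n, nbr, w in from_adjacency:
    key = frozenset((n, nbr))
    if key not in seen:
        seen.add(key)
        unique.append((n, nbr, w))

print(len(from_adjacency))  # 4
print(unique)               # [(1, 2, 0.125), (3, 4, 0.375)]
```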

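Adding attributes to graphs, nodes, and edges¶

Attributes such as weights, labels, and colors can be attached to graphs, nodes, or edges.

Each graph, node, and edge can hold key/value attribute pairs; by default these are empty. Attributes can be added or changed using add_edge, add_node, or direct manipulation of the attribute dictionaries.

Graph attributes¶

Assign graph attributes when creating a new graph,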

In [ ]:

G = nx.Graph(day="Friday")
G.graph

or modify them after creation:

In [ ]:

G.graph["day"] = "Monday"
G.graph

Node attributes¶

Add node attributes using add_node(), add_nodes_from(), or G.nodes.

In [ ]:

G.add_node(1, time="5pm")
G.add_nodes_from([3], time="2pm")
G.nodes[1]

In [ ]:

G.nodes[1]["room"] = 714
G.nodes.data()

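Note that adding a node to G.nodes does not add it to the graph; use G.add_node() to add new nodes. The same applies to edges.

Edge attributes¶

Add or change edge attributes using add_edge(), add_edges_from(), or subscript notation.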

In [ ]:

G.add_edge(1, 2, weight=4.7)
G.add_edges_from([(3, 4), (4, 5)], color="red")
G.add_edges_from([(1, 2, {"color": "blue"}), (2, 3, {"weight": 8})])
G[1][2]["weight"] = 4.7
G.edges[3, 4]["weight"] = 4.2

In [ ]:

G.edges.data()

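The special attribute weight should be numeric, because algorithms that require weighted edges use it.

Extracting subgraphs and edge subgraphs¶

graphscope.nx supports extracting a subgraph as a deep copy by passing in a set of nodes or a set of edges.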

In [ ]:

G = nx.path_graph(10)
# induce a subgraph by nodes
H = G.subgraph([0, 1, 2])
list(H.nodes)

In [ ]:

list(H.edges)

In [ ]:

# induce an edge subgraph by edges
K = G.edge_subgraph([(1, 2), (3, 4)])
list(K.nodes)

In [ ]:

list(K.edges)

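Note that subgraph extraction here differs from NetworkX: NetworkX returns a view of the subgraph, whereas graphscope.nx returns a subgraph or edge subgraph that is independent of the original graph.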
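
What "independent of the original graph" means in practice can be sketched with a plain dict-of-dicts model. The helper independent_subgraph is hypothetical; it mimics the deep-copy behavior described above and is not the graphscope.nx implementation.

```python
import copy

# Path graph 0 - 1 - 2 - 3 as a dict-of-dicts adjacency
adj = {n: {} for n in range(4)}
for u, v in [(0, 1), (1, 2), (2, 3)]:
    adj[u][v] = {"weight": 1}
    adj[v][u] = {"weight": 1}


def independent_subgraph(adj, keep):
    """Return a node-induced subgraph that shares no data with adj."""
    keep = set(keep)
    return {
        n: {nbr: copy.deepcopy(attrs) for nbr, attrs in nbrs.items() if nbr in keep}
        for n, nbrs in adj.items()
        if n in keep
    }


sub = independent_subgraph(adj, [0, 1, 2])

# Changing the original afterwards does not affect the extracted subgraph
adj[0][1]["weight"] = 99

print(sorted(sub))          # [0, 1, 2]
print(sub[0][1]["weight"])  # 1
```

With a view, as returned by NetworkX, the last line would instead reflect the updated weight, because a view reads through to the original graph.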

Copying graphs¶

Use the to_directed method to obtain a directed representation of a graph.

In [ ]:

DG = G.to_directed()  # returns a "deepcopy" directed representation of G.
list(DG.edges)

In [ ]:

# or with
DGv = G.to_directed(as_view=True)  # return a view.
list(DGv.edges)

In [ ]:

# or with
DG = nx.DiGraph(G)  # return a "deepcopy" of directed representation of G.
list(DG.edges)

Or obtain a copy of the graph with the copy method.

In [ ]:

H = G.copy()  # as_view defaults to False, so this returns a "deepcopy" copy
list(H.edges)

In [ ]:

# or with
H = G.copy(as_view=False)  # return a "deepcopy" copy
list(H.edges)

In [ ]:

# or with
H = nx.Graph(G)  # return a "deepcopy" copy
list(H.edges)

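Note that graphscope.nx does not support shallow copies.

Directed graphs¶

The DiGraph class provides additional methods and properties specific to directed edges, e.g. DiGraph.out_edges, DiGraph.in_degree, DiGraph.predecessors(), DiGraph.successors(), etc.

To allow algorithms to work with both graph types easily, the directed version of neighbors is equivalent to successors(), and degree reports the sum of in_degree and out_degree.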

In [ ]:

DG = nx.DiGraph()
DG.add_weighted_edges_from([(1, 2, 0.5), (3, 1, 0.75)])

In [ ]:

DG.out_degree(1, weight="weight")

In [ ]:

DG.degree(1, weight="weight")

In [ ]:

list(DG.successors(1))

In [ ]:

list(DG.neighbors(1))

In [ ]:

list(DG.predecessors(1))
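
The relationship between successors, predecessors and the weighted degrees shown above can be checked with a plain-Python model of the same two edges. The succ/pred dictionaries and helper functions are assumptions for illustration, not the graphscope.nx data structures.

```python
edges = [(1, 2, 0.5), (3, 1, 0.75)]

# succ[u][v] holds the weight of edge u -> v; pred[v][u] mirrors it
succ, pred = {}, {}
for u, v, w in edges:
    for n in (u, v):
        succ.setdefault(n, {})
        pred.setdefault(n, {})
    succ[u][v] = w
    pred[v][u] = w


def out_degree(n):
    return sum(succ[n].values())  # total weight of outgoing edges


def in_degree(n):
    return sum(pred[n].values())  # total weight of incoming edges


def degree(n):
    return in_degree(n) + out_degree(n)


print(out_degree(1))  # 0.5
print(degree(1))      # 1.25
print(list(succ[1]))  # [2]  successors, which is what neighbors reports
print(list(pred[1]))  # [3]  predecessors
```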

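In graphscope.nx, some algorithms work only on directed graphs and others only on undirected graphs. If you want to treat a directed graph as undirected, you can convert it with Graph.to_undirected():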

In [ ]:

H = DG.to_undirected()  # return a "deepcopy" of the undirected representation of DG.
list(H.edges)

In [ ]:

# or with
H = nx.Graph(DG)  # create an undirected graph H from the directed graph DG
list(H.edges)

A DiGraph can also reverse its edges with DiGraph.reverse().

In [ ]:

K = DG.reverse()  # return a "deepcopy" of the reversed graph.
list(K.edges)

In [ ]:

# or with
K = DG.reverse(copy=False)  # return a view of the reversed graph.
list(K.edges)

Analyzing graphs¶

The structure of G can be analyzed using various graph-theoretic functions, such as:

In [ ]:

G = nx.Graph()
G.add_edges_from([(1, 2), (1, 3)])
G.add_node(4)
sorted(d for n, d in G.degree())

In [ ]:

nx.builtin.clustering(G)

graphscope.nx provides built-in algorithms for graph analysis; for details, see builtin algorithm.
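
For reference, the quantity that a clustering function reports can be computed by hand. The sketch below uses the standard local clustering coefficient for unweighted undirected graphs, 2T / (k(k - 1)), where T is the number of edges among a node's k neighbors. It is a plain-Python illustration under that assumption and says nothing about how nx.builtin.clustering is implemented or how it formats its result.

```python
from itertools import combinations


def clustering(adj):
    """Local clustering coefficient of every node in an undirected graph."""
    result = {}
    for n, nbrs in adj.items():
        k = len(nbrs)
        if k < 2:
            result[n] = 0.0  # fewer than two neighbors: no possible triangle
            continue
        # count the edges that connect pairs of neighbors of n
        links = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
        result[n] = 2 * links / (k * (k - 1))
    return result


# Same structure as G above: edges 1-2 and 1-3, plus the isolated node 4
adj = {1: {2, 3}, 2: {1}, 3: {1}, 4: set()}
before = clustering(adj)
print(before)  # {1: 0.0, 2: 0.0, 3: 0.0, 4: 0.0}

# Closing the triangle 1-2-3 raises the coefficient of its three nodes to 1.0
adj[2].add(3)
adj[3].add(2)
after = clustering(adj)
print(after)   # {1: 1.0, 2: 1.0, 3: 1.0, 4: 0.0}
```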

Creating a graph from a GraphScope graph object¶

Besides building graphs the NetworkX way, we can also create them in the standard GraphScope way, which is covered in the next tutorial. Here is a simple example:

In [ ]:

# we load a GraphScope graph with load_ldbc
from graphscope.dataset import load_ldbc

graph = load_ldbc(directed=False)

# create graph with the GraphScope graph object
G = nx.Graph(graph)

Converting a graph to a GraphScope graph¶

Just as a graphscope.nx Graph can be created from a GraphScope graph, a graphscope.nx Graph can also be converted to a GraphScope graph, e.g.

In [ ]:

nodes = [(0, {"foo": 0}), (1, {"foo": 1}), (2, {"foo": 2})]
edges = [(0, 1, {"weight": 0}), (0, 2, {"weight": 1}), (1, 2, {"weight": 2})]
G = nx.Graph()
G.update(edges, nodes)
g = graphscope.g(G)
