Exercise 10.1 - Solution

Signal Classification using Dynamic Graph Convolutional Neural Networks

On their long journey through the universe, cosmic particles interact with the galactic magnetic field $B$ before reaching the Earth. Since these particles carry a charge $q$, they are deflected in the field by the Lorentz force $F = q \cdot v \times B$. Sources of cosmic particles are located all over the sky, so the arrival distribution of cosmic particles is, in general, isotropic. However, particles originating from the same source form street-like patterns on top of this isotropic background, caused by their deflection in the galactic magnetic field.
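
As a quick illustration of the deflection, the Lorentz force can be evaluated directly with NumPy (a minimal sketch with made-up values for $q$, $v$, and $B$; this is not part of the exercise data):

import numpy as np

q = 1.602e-19                      # particle charge in C (illustrative value)
v = np.array([0.0, 0.0, 3.0e8])    # particle velocity in m/s (illustrative value)
B = np.array([1.0e-10, 0.0, 0.0])  # galactic magnetic field in T (illustrative value)

F = q * np.cross(v, B)             # Lorentz force F = q * v x B
print(F)                           # deflecting force, perpendicular to v and B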

In this task, we want to classify whether a simulated set of $500$ arriving cosmic particles contains street-like patterns (signal) or originates from an isotropic background.

Training graph networks can be computationally demanding; thus, we recommend using a GPU for this task.
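
You can check whether TensorFlow detects a GPU with the snippet below (a minimal sketch; the device listing depends on your setup):

import tensorflow as tf

# an empty list means that training will run on the CPU
print(tf.config.list_physical_devices('GPU'))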

In [16]:
from tensorflow import keras
import numpy as np
from matplotlib import pyplot as plt

layers = keras.layers

print("keras", keras.__version__)
keras 2.4.0

Download EdgeConv Layer

In [2]:
import gdown
import os

url = "https://raw.githubusercontent.com/DeepLearningForPhysicsResearchBook/deep-learning-physics/main/edgeconv.py"
output = 'edgeconv.py'

if not os.path.exists(output):
    gdown.download(url, output, quiet=False)

from edgeconv import EdgeConv

Download Data

In [3]:
url = "https://drive.google.com/u/0/uc?export=download&confirm=HgGH&id=1XKN-Ik7BDyMWdQ230zWS2bNxXL3_9jZq"
output = 'cr_sphere.npz'

if not os.path.exists(output):
    gdown.download(url, output, quiet=True)
In [4]:
f = np.load(output)
n_train = 10000
x_train, x_test = f['data'][:-n_train], f['data'][-n_train:]
labels = keras.utils.to_categorical(f['label'], num_classes=2)
y_train, y_test = labels[:-n_train], labels[-n_train:]
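
To get a feeling for the data, one can also inspect the class balance of the labels (a minimal sketch):

# count how many sky maps belong to each of the two classes
print(np.unique(f['label'], return_counts=True))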
In [5]:
print("x_train.shape", x_train.shape)
print("y_train.shape", y_train.shape)
x_train.shape (40000, 500, 4)
y_train.shape (40000, 2)
In [6]:
# define coordinates for very first EdgeConv
train_points, test_points = x_train[..., :3], x_test[..., :3]

# Use the normalized energy together with the coordinates as features for the convolutional layers
train_features, test_features = x_train[..., -1, np.newaxis], x_test[..., -1, np.newaxis]
train_features = np.concatenate([train_features, train_points], axis=-1)
test_features = np.concatenate([test_features, test_points], axis=-1)

train_input_data = [train_points, train_features]
test_input_data = [test_points, test_features]
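
A quick shape check (a minimal sketch) confirms that the coordinates and features match the network inputs defined further below:

print("train_points.shape  ", train_points.shape)    # (40000, 500, 3)
print("train_features.shape", train_features.shape)  # (40000, 500, 4)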

Plot an example sky map

In [7]:
def scatter(v, c=None, zlabel="", title="", **kwargs):

    def vec2ang(v):
        x, y, z = np.asarray(v)
        phi = np.arctan2(y, x)
        theta = np.arctan2(z, (x * x + y * y) ** .5)
        return phi, theta

    lons, lats = vec2ang(v)
    lons = -lons
    fig = plt.figure(figsize=kwargs.pop('figsize', [12, 6]))
    ax = fig.add_axes([0.1, 0.1, 0.85, 0.9], projection="hammer")
    ax.set_title(title)
    events = ax.scatter(lons, lats, c=c, s=12, lw=2)

    cbar = plt.colorbar(events, orientation='horizontal', shrink=0.85, pad=0.05, aspect=30, label=zlabel)
    plt.xticks(fontsize=16)
    plt.yticks(fontsize=16)
    return fig


test_id = 0
example_map = x_test[test_id]
fig = scatter(example_map[:, 0:3].T, c=example_map[:, 3], zlabel="Energy (normed)", title="Event %i" % test_id)

Design DGCNN

Start by defining a kernel network

Design a kernel network. The inputs to the kernel network are the coordinates of the central pixel and of its neighboring pixels. Hint: using layers.BatchNormalization can help to stabilize the training of a DGCNN.

You can make use of the code snippet below.

Note that the output of the DNN should be (None, nodes), where None is a placeholder for the batch size.

In this case, we perform subtraction and concatenate the result with the central pixel value to combine translational invariance with local information.

In [8]:
def kernel_nn(data, nodes=16):
    d1, d2 = data  # get xi ("central" pixel) and xj ("neighborhood" pixels)

    dif = layers.Subtract()([d1, d2])  # perform subtraction for translational invariance
    x = layers.Concatenate(axis=-1)([d1, dif])  # add information on the absolute pixel value

    x = layers.Dense(nodes, use_bias=False, activation="relu")(x)
    x = layers.BatchNormalization()(x)

    x = layers.Dense(nodes, use_bias=False, activation="relu")(x)
    x = layers.BatchNormalization()(x)

    x = layers.Dense(nodes, use_bias=False, activation="relu")(x)
    x = layers.BatchNormalization()(x)
    return x

Build complete graph network model

In the first layer, it might be advantageous to choose the nearest neighbors using the coordinates of the cosmic rays, but to perform the convolution using their energies as well. Thus, we input y = EdgeConv(...)([points_input, feats_input]) into the first EdgeConv layer.
If we later want to perform a dynamic EdgeConv (we want to update the graph), we simply input z = EdgeConv(...)(y).

To specify the size of the "convolutional filter", make use of the next_neighbors argument (it searches for the $k$ nearest neighbors of each cosmic ray).

In [9]:
points_input = layers.Input((500, 3))
feats_input = layers.Input((500, 4))

x = EdgeConv(lambda a: kernel_nn(a, nodes=8), next_neighbors=8)([points_input, feats_input])  # conv with fixed graph
x = layers.Activation("relu")(x)
x = EdgeConv(lambda a: kernel_nn(a, nodes=16), next_neighbors=8)([points_input, x])  # conv with fixed graph
x = layers.Activation("relu")(x)
x = EdgeConv(lambda a: kernel_nn(a, nodes=32), next_neighbors=8)([x, x])  # conv with dynamic graph
x = layers.Activation("relu")(x)
x = layers.GlobalAveragePooling1D(name="embedding")(x)
out = layers.Dense(2, name="classification", activation="softmax")(x)

model = keras.models.Model([points_input, feats_input], out)
print(model.summary())
Model: "model"
__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to                     
==================================================================================================
input_1 (InputLayer)            [(None, 500, 3)]     0                                            
__________________________________________________________________________________________________
input_2 (InputLayer)            [(None, 500, 4)]     0                                            
__________________________________________________________________________________________________
edge_conv (EdgeConv)            (None, 500, 8)       288         input_1[0][0]                    
                                                                 input_2[0][0]                    
__________________________________________________________________________________________________
activation (Activation)         (None, 500, 8)       0           edge_conv[0][0]                  
__________________________________________________________________________________________________
edge_conv_1 (EdgeConv)          (None, 500, 16)      960         input_1[0][0]                    
                                                                 activation[0][0]                 
__________________________________________________________________________________________________
activation_1 (Activation)       (None, 500, 16)      0           edge_conv_1[0][0]                
__________________________________________________________________________________________________
edge_conv_2 (EdgeConv)          (None, 500, 32)      3456        activation_1[0][0]               
                                                                 activation_1[0][0]               
__________________________________________________________________________________________________
activation_2 (Activation)       (None, 500, 32)      0           edge_conv_2[0][0]                
__________________________________________________________________________________________________
embedding (GlobalAveragePooling (None, 32)           0           activation_2[0][0]               
__________________________________________________________________________________________________
classification (Dense)          (None, 2)            66          embedding[0][0]                  
==================================================================================================
Total params: 4,770
Trainable params: 4,434
Non-trainable params: 336
__________________________________________________________________________________________________
None

You can inspect the kernel network using:

In [10]:
model.layers[2].kernel_func.summary()
Model: "model"
__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to                     
==================================================================================================
input_3 (InputLayer)            [(None, 8)]          0                                            
__________________________________________________________________________________________________
reshape (Reshape)               (None, 2, 4)         0           input_3[0][0]                    
__________________________________________________________________________________________________
split_layer (SplitLayer)        [(None, 1, 4), (None 0           reshape[0][0]                    
__________________________________________________________________________________________________
reshape_1 (Reshape)             (None, 4)            0           split_layer[0][0]                
__________________________________________________________________________________________________
reshape_2 (Reshape)             (None, 4)            0           split_layer[0][1]                
__________________________________________________________________________________________________
subtract (Subtract)             (None, 4)            0           reshape_1[0][0]                  
                                                                 reshape_2[0][0]                  
__________________________________________________________________________________________________
concatenate (Concatenate)       (None, 8)            0           reshape_1[0][0]                  
                                                                 subtract[0][0]                   
__________________________________________________________________________________________________
dense (Dense)                   (None, 8)            64          concatenate[0][0]                
__________________________________________________________________________________________________
batch_normalization (BatchNorma (None, 8)            32          dense[0][0]                      
__________________________________________________________________________________________________
dense_1 (Dense)                 (None, 8)            64          batch_normalization[0][0]        
__________________________________________________________________________________________________
batch_normalization_1 (BatchNor (None, 8)            32          dense_1[0][0]                    
__________________________________________________________________________________________________
dense_2 (Dense)                 (None, 8)            64          batch_normalization_1[0][0]      
__________________________________________________________________________________________________
batch_normalization_2 (BatchNor (None, 8)            32          dense_2[0][0]                    
==================================================================================================
Total params: 288
Trainable params: 240
Non-trainable params: 48
__________________________________________________________________________________________________

The kernel network maps the energies and positions of 2 cosmic rays (the central and the neighboring cosmic ray) to 8 features.
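
The 288 parameters reported in the summary above can be verified by hand (a small worked check):

# after subtraction and concatenation, the first Dense layer sees 4 + 4 = 8 input features
dense_params = 3 * (8 * 8)       # three Dense(8) layers without bias, each 8 inputs x 8 outputs
bn_params = 3 * (4 * 8)          # each BatchNormalization stores 4 parameters per feature
print(dense_params + bn_params)  # 192 + 96 = 288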

The kernel network in the third layer maps from 16 extracted features (of 2 cosmic rays) to 32 new features and looks like this:

In [11]:
model.layers[6].kernel_func.summary()
Model: "model"
__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to                     
==================================================================================================
input_5 (InputLayer)            [(None, 32)]         0                                            
__________________________________________________________________________________________________
reshape (Reshape)               (None, 2, 16)        0           input_5[0][0]                    
__________________________________________________________________________________________________
split_layer (SplitLayer)        [(None, 1, 16), (Non 0           reshape[0][0]                    
__________________________________________________________________________________________________
reshape_1 (Reshape)             (None, 16)           0           split_layer[0][0]                
__________________________________________________________________________________________________
reshape_2 (Reshape)             (None, 16)           0           split_layer[0][1]                
__________________________________________________________________________________________________
subtract (Subtract)             (None, 16)           0           reshape_1[0][0]                  
                                                                 reshape_2[0][0]                  
__________________________________________________________________________________________________
concatenate (Concatenate)       (None, 32)           0           reshape_1[0][0]                  
                                                                 subtract[0][0]                   
__________________________________________________________________________________________________
dense (Dense)                   (None, 32)           1024        concatenate[0][0]                
__________________________________________________________________________________________________
batch_normalization (BatchNorma (None, 32)           128         dense[0][0]                      
__________________________________________________________________________________________________
dense_1 (Dense)                 (None, 32)           1024        batch_normalization[0][0]        
__________________________________________________________________________________________________
batch_normalization_1 (BatchNor (None, 32)           128         dense_1[0][0]                    
__________________________________________________________________________________________________
dense_2 (Dense)                 (None, 32)           1024        batch_normalization_1[0][0]      
__________________________________________________________________________________________________
batch_normalization_2 (BatchNor (None, 32)           128         dense_2[0][0]                    
==================================================================================================
Total params: 3,456
Trainable params: 3,264
Non-trainable params: 192
__________________________________________________________________________________________________

Train the model

In [12]:
model.compile(loss="binary_crossentropy",
              optimizer=keras.optimizers.Adam(3E-3, decay=1E-4),
              metrics=['acc'])

Now fit the model for a few epochs:

In [13]:
history = model.fit(train_input_data, y_train, batch_size=64, epochs=4)
Epoch 1/4
625/625 [==============================] - 552s 879ms/step - loss: 0.4689 - acc: 0.7555
Epoch 2/4
625/625 [==============================] - 514s 822ms/step - loss: 0.1228 - acc: 0.9608
Epoch 3/4
625/625 [==============================] - 514s 822ms/step - loss: 0.0975 - acc: 0.9684
Epoch 4/4
625/625 [==============================] - 515s 824ms/step - loss: 0.0901 - acc: 0.9719
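
After training, the performance on the held-out test set can be checked with the test_input_data prepared above (a minimal sketch; the numbers depend on your training run):

# evaluate the trained model on the 10000 test sky maps
loss, acc = model.evaluate(test_input_data, y_test, batch_size=64)
print("test loss: %.4f, test accuracy: %.4f" % (loss, acc))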

Visualization of the underlying graph

To inspect the changing neighborhood relations of the nodes (the last EdgeConv uses a dynamic graph), we visualize the underlying graph structure.

Note that plotting may take some time, so be a bit patient.

To perform the relatively complex plotting, we make use of networkx and sklearn.
If you haven't installed these packages yet, run the cell below.

In [14]:
import sys
!{sys.executable} -m pip install sklearn
!{sys.executable} -m pip install networkx
Defaulting to user installation because normal site-packages is not writeable
Requirement already satisfied: sklearn in /home/jonas/.local/lib/python3.6/site-packages (0.0)
Requirement already satisfied: scikit-learn in /home/jonas/.local/lib/python3.6/site-packages (from sklearn) (0.24.1)
Requirement already satisfied: scipy>=0.19.1 in /home/jonas/.local/lib/python3.6/site-packages (from scikit-learn->sklearn) (1.5.4)
Requirement already satisfied: joblib>=0.11 in /home/jonas/.local/lib/python3.6/site-packages (from scikit-learn->sklearn) (1.0.1)
Requirement already satisfied: numpy>=1.13.3 in /home/jonas/.local/lib/python3.6/site-packages (from scikit-learn->sklearn) (1.19.5)
Requirement already satisfied: threadpoolctl>=2.0.0 in /home/jonas/.local/lib/python3.6/site-packages (from scikit-learn->sklearn) (2.1.0)
WARNING: You are using pip version 21.0.1; however, version 21.1.2 is available.
You should consider upgrading via the '/usr/bin/python3 -m pip install --upgrade pip' command.
Defaulting to user installation because normal site-packages is not writeable
Requirement already satisfied: networkx in /home/jonas/.local/lib/python3.6/site-packages (2.5.1)
Requirement already satisfied: decorator<5,>=4.3 in /home/jonas/.local/lib/python3.6/site-packages (from networkx) (4.4.2)
WARNING: You are using pip version 21.0.1; however, version 21.1.2 is available.
You should consider upgrading via the '/usr/bin/python3 -m pip install --upgrade pip' command.
In [15]:
import tensorflow.keras.backend as K
from sklearn.neighbors import kneighbors_graph
import networkx as nx

edge_layers = [l for l in model.layers if "edge_conv" in l.name]
coord_mask = [np.sum(np.linalg.norm(inp_d[test_id], axis=-1)) == 500 for inp_d in train_input_data]
assert True in coord_mask, "For plotting the spherical graph at least one input has to have 3 dimensions XYZ"
fig, axes = plt.subplots(ncols=len(edge_layers), figsize=(5 * len(edge_layers), 5))

for i, e_layer in enumerate(edge_layers):
    points_in, feats_in = model.inputs
    coordinates = e_layer.get_input_at(0)
    functor = K.function(model.inputs, coordinates)
    sample_input = [inp[np.newaxis, test_id] for inp in train_input_data]

    if isinstance(e_layer.input, list):
        layer_points, layer_features = functor(sample_input)
    else:
        layer_points = functor(sample_input)

    layer_points = np.squeeze(layer_points)
    adj = kneighbors_graph(layer_points, e_layer.next_neighbors)
    g = nx.DiGraph(adj)

    for c, s in zip(coord_mask, sample_input):
        if c:
            pos = s
            break

    axes[i].set_title("Graph in %s" % e_layer.name)
    nx.draw(g, cmap=plt.get_cmap('viridis'), pos=pos.squeeze()[:, :-1],
            node_size=10, width=0.5, arrowsize=5, ax=axes[i])
    axes[i].axis('equal')
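
Finally, the trained model can be used to classify the example sky map plotted earlier (a minimal sketch, reusing test_id and the test inputs prepared above):

# predict the class probabilities for the single example event
pred = model.predict([test_points[test_id:test_id + 1], test_features[test_id:test_id + 1]])
print("predicted class probabilities:", pred[0])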