We first design an input layer to build the user (item) attribute graph $\mathcal{A}_u$ ($\mathcal{A}_i$). We calculate two kinds of proximity scores between the nodes - preference proximity and attribute proximity (can be calculated with cosine similarity).
After calculating the overall proximity between two nodes, it becomes a natural choice to build a k-NN graph as adopted in (Monti, Bronstein, and Bresson 2017). Such a method will keep a fixed number of neighbors once the graph is constructed.
In the constructed attribute graph $\mathcal{A}_u$ and $\mathcal{A}_i$, each nodes has an attached multi-hot attribute encoding and a unique one-hot representation denoting its identity. Due to the huge number of users and items in the web-scale recommender systems, the dimensionality of nodes’ one-hot representation is extremely high. Moreover, the multi-hot attribute representation simply combines multiple types of attributes into one long vector without considering their interactive relations. The goal of interaction layer is to reduce the dimensionality for one-hot identity representation and learn the high-order attribute interactions for multi-hot attribute representation. To this end, we first set up a lookup table to transform a node’s one-hot representation into the low-dimensional dense vector. The lookup layers correspond to two parameter matrices $M \in \mathbb{R}^{M×D}$ and $N \in \mathbb{R}^{ N×D}$. Each entry $m_u \in \mathbb{R}^D$ and $n_i \in \mathbb{R}^D$ encodes the user $u$’s preference and the item $i$’s property, respectively. Note that $m_u$ and $n_i$ for cold start nodes are meaningless, since no interaction is observed to train their preference embedding. Inspired by (He and Chua 2017), we capture the high-order attribute interactions with a *Bi-Interactive pooling operation*, in addition to the linear combination operation.
Intuitively, different neighbors have different relations to a node. Furthermore, one neighbor usually has multiple attributes. For example, in a social network, a user’s neighborhood may consist of classmates, family members, colleagues, and so on, and each neighbor may have several attributes such as age, gender, and occupation. Since all these attributes (along with the preferences) are now encoded in the node’s embedding, it is necessary to pay different attentions to different dimensions of the neighbor node’s embedding. However, existing GCN (Kipf and Welling 2017) or GAT (Veliˇckovi´c et al. 2018) structures cannot do this because they are at the coarse granularity. GCN treats all neighbors equally and GAT differentiates the importance of neighbors at the node level. To solve this problem, we design a gated-GNN structure to aggregate the fine-grained neighbor information.
Given a user $u$’s final representation $\tilde{p}_u$ and an item $i$’s final representation $\tilde{q}_i$ after the gated-GNN layer, we model the predicted rating of the user $u$ to the item $i$ as:
$$\hat{R}_{u,i} = MLP([\tilde{p}_u; \tilde{q}_i]) + \tilde{p}_u\tilde{q}_i^T + b_u + b_i + \mu,$$where the MLP function is the multilayer perceptron implemented with one hidden layer, and $b_u$, $b_i$ , and $\mu$ denotes user bias, item bias, and global bias, respectively. The second term is inner product interaction function (Koren, Bell, and Volinsky 2009), and we add the first term to capture the complicated nonlinear interaction between the user and the item.
The cold start problem is caused by the lack of historical interactions for cold start nodes. We view this as a missing preference problem, and solve it by employing the variational autoencoder structure to reconstruct the preference from the attribute distribution.
For the rating prediction loss, we employ the square loss as the objective function:
$$L_{pred} = \sum_{u,i \in \mathcal{T}} (\hat{R}_{u,i} - R_{u,i})^2,$$where $\mathcal{T}$ denotes the set of instances for training, i.e., $\mathcal{T} = {(u, i, r_{u,i}, a_u, a_i)}$, $R_{u,i}$ is ground truth rating in the training set $\mathcal{T}$ , and $\hat{R}_{u,i}$ is the predicted rating.
The reconstruction loss function in eVAE is defined as follows:
$$L_{recon} = − KL(q_\phi(z_u|x_u)||p(z_u)) + \mathbb{E}_{q_\phi}(z_u|x_u)[\log p_θ(x'_u|z_u)] + ||x'_u − m_u||_2,$$where the first two terms are same as those in standard VAE, and the last one is our extension for the approximation part.
The overall loss function then becomes:
$$L = L_{pred} + L_{recon},$$where $L_{pred}$ is the task-specific rating prediction loss, and $L_{recon}$ is the reconstruction loss.
import torch
import torch.nn as nn
import torch.nn.init as init
import torch.nn.functional as F
import torch.optim as optim
from torch.autograd import Variable
from torch.utils.tensorboard import SummaryWriter
import numpy as np
import pandas as pd
from torch.utils.data import Dataset, DataLoader
import os
import json
import time
import pickle
import argparse
from collections import OrderedDict
import warnings
warnings.filterwarnings('ignore')
parser = argparse.ArgumentParser()
parser.add_argument("--lr", default=0.0005, type=float,
help="learning rate.")
parser.add_argument("--dropout", default=0.5, type=float,
help="dropout rate.")
parser.add_argument("--batch_size", default=128, type=int,
help="batch size when training.")
parser.add_argument("--gpu", default="0", type=str,
help="gpu card ID.")
parser.add_argument("--epochs", default=20, type=str,
help="training epoches.")
parser.add_argument("--clip_norm", default=5.0, type=float,
help="clip norm for preventing gradient exploding.")
parser.add_argument("--embed_size", default=30, type=int, help="embedding size for users and items.")
parser.add_argument("--attention_size", default=50, type=int, help="embedding size for users and items.")
parser.add_argument("--item_layer1_nei_num", default=10, type=int)
parser.add_argument("--user_layer1_nei_num", default=10, type=int)
parser.add_argument("--vae_lambda", default=1, type=int)
_StoreAction(option_strings=['--vae_lambda'], dest='vae_lambda', nargs=None, const=None, default=1, type=<class 'int'>, choices=None, help=None, metavar=None)
!wget -q --show-progress https://github.com/sparsh-ai/coldstart-recsys/raw/main/data/AGNN/ml100k.zip
!unzip ml100k.zip
ml100k.zip 100%[===================>] 15.31M 96.6MB/s in 0.2s Archive: ml100k.zip creating: ml100k/ inflating: ml100k/neighbor_aspect_extension_2_zscore_warm_uuii.pkl inflating: ml100k/uiinfo.pkl inflating: ml100k/ics_train.dat inflating: ml100k/warm_val.dat inflating: ml100k/neighbor_aspect_extension_2_zscore_ics_uuii_0.20.pkl inflating: ml100k/ucs_train.dat inflating: ml100k/ucs_val.dat inflating: ml100k/neighbor_aspect_extension_2_zscore_ucs_uuii.pkl inflating: ml100k/ics_val.dat inflating: ml100k/warm_train.dat creating: ml100k/source_data/ inflating: ml100k/source_data/item_content.dat extracting: ml100k/source_data/ml-100k.zip inflating: ml100k/source_data/item_url_all.dat extracting: ml100k/source_data/__
def get_data_list(ftrain, batch_size):
f = open(ftrain, 'r')
train_list = []
for eachline in f:
eachline = eachline.strip().split('\t')
u, i, l = int(eachline[0]), int(eachline[1]), float(eachline[2])
train_list.append([u, i, l])
num_batches_per_epoch = int((len(train_list) - 1) / batch_size) + 1
return num_batches_per_epoch, train_list
def get_batch_instances(train_list, user_feature_dict, item_feature_dict, item_director_dict, item_writer_dict, item_star_dict, item_country_dict, batch_size, user_nei_dict, item_nei_dict, shuffle=True):
num_batches_per_epoch = int((len(train_list) - 1) / batch_size) + 1
def data_generator(train_list):
data_size = len(train_list)
user_feature_arr = np.array(list(user_feature_dict.values()))
max_user_cate_size = user_feature_arr.shape[1]
item_genre_arr = np.array(list(item_feature_dict.values())) #len=6 ,0
item_director_arr = np.array(list(item_director_dict.values())) #len=3 ,6
item_writer_arr = np.array(list(item_writer_dict.values())) #len=3, 9
item_star_arr = np.array(list(item_star_dict.values())) #len=3, 12
item_country_arr = np.array(list(item_country_dict.values())) #len=8, 15
item_feature_arr = np.concatenate([item_genre_arr, item_director_arr, item_writer_arr, item_star_arr, item_country_arr], axis=1)
max_item_cate_size = item_feature_arr.shape[1]
item_layer1_nei_num = FLAGS.item_layer1_nei_num
user_layer1_nei_num = FLAGS.user_layer1_nei_num
if shuffle == True:
np.random.shuffle(train_list)
train_list = np.array(train_list)
for batch_num in range(num_batches_per_epoch):
start_index = batch_num * batch_size
end_index = min((batch_num + 1) * batch_size, data_size)
current_batch_size = end_index - start_index
u = train_list[start_index: end_index][:, 0].astype(np.int)
i = train_list[start_index: end_index][:, 1].astype(np.int)
l = train_list[start_index: end_index][:, 2]
i_self_cate = np.zeros([current_batch_size, max_item_cate_size], dtype=np.int)
i_onehop_id = np.zeros([current_batch_size, item_layer1_nei_num], dtype=np.int)
i_onehop_cate = np.zeros([current_batch_size, item_layer1_nei_num, max_item_cate_size], dtype=np.int)
u_self_cate = np.zeros([current_batch_size, max_user_cate_size], dtype=np.int)
u_onehop_id = np.zeros([current_batch_size, user_layer1_nei_num], dtype=np.int)
u_onehop_cate = np.zeros([current_batch_size, user_layer1_nei_num, max_user_cate_size], dtype=np.int)
for index, each_i in enumerate(i):
i_self_cate[index] = item_feature_arr[each_i] #item_self_cate
tmp_one_nei = item_nei_dict[each_i][0]
tmp_prob = item_nei_dict[each_i][1]
if len(tmp_one_nei) > item_layer1_nei_num: #re-sampling
tmp_one_nei = np.random.choice(tmp_one_nei, item_layer1_nei_num, replace=False, p=tmp_prob)
elif len(tmp_one_nei) < item_layer1_nei_num:
tmp_one_nei = np.random.choice(tmp_one_nei, item_layer1_nei_num, replace=True, p=tmp_prob)
tmp_one_nei[-1] = each_i
i_onehop_id[index] = tmp_one_nei #item_1_neigh
i_onehop_cate[index] = item_feature_arr[tmp_one_nei] #item_1_neigh_cate
for index, each_u in enumerate(u):
u_self_cate[index] = user_feature_dict[each_u] # item_self_cate
tmp_one_nei = user_nei_dict[each_u][0]
tmp_prob = user_nei_dict[each_u][1]
if len(tmp_one_nei) > user_layer1_nei_num: # re-sampling
tmp_one_nei = np.random.choice(tmp_one_nei, user_layer1_nei_num, replace=False, p=tmp_prob)
elif len(tmp_one_nei) < user_layer1_nei_num:
tmp_one_nei = np.random.choice(tmp_one_nei, user_layer1_nei_num, replace=True, p=tmp_prob)
tmp_one_nei[-1] = each_u
u_onehop_id[index] = tmp_one_nei # user_1_neigh
u_onehop_cate[index] = user_feature_arr[tmp_one_nei] # user_1_neigh_cate
yield ([u, i, l, u_self_cate, u_onehop_id, u_onehop_cate, i_self_cate, i_onehop_id, i_onehop_cate])
return data_generator(train_list)
class VAE(nn.Module):
def __init__(self, embed_size):
super(VAE, self).__init__()
Z_dim = X_dim = h_dim = embed_size
self.Z_dim = Z_dim
self.X_dim= X_dim
self.h_dim = h_dim
self.embed_size= embed_size
def init_weights(m):
if isinstance(m, nn.Linear):
nn.init.xavier_uniform(m.weight)
if m.bias is not None:
nn.init.constant(m.bias, 0)
# =============================== Q(z|X) ======================================
self.dense_xh = nn.Linear(X_dim, h_dim)
init_weights(self.dense_xh)
self.dense_hz_mu = nn.Linear(h_dim, Z_dim)
init_weights(self.dense_hz_mu)
self.dense_hz_var = nn.Linear(h_dim, Z_dim)
init_weights(self.dense_hz_var)
# =============================== P(X|z) ======================================
self.dense_zh = nn.Linear(Z_dim, h_dim)
init_weights(self.dense_zh)
self.dense_hx = nn.Linear(h_dim, X_dim)
init_weights(self.dense_hx)
def Q(self, X):
h = nn.ReLU()(self.dense_xh(X))
z_mu = self.dense_hz_mu(h)
z_var = self.dense_hz_var(h)
return z_mu, z_var
def sample_z(self, mu, log_var):
mb_size = mu.shape[0]
eps = Variable(torch.randn(mb_size, self.Z_dim)).cuda()
return mu + torch.exp(log_var / 2) * eps
def P(self, z):
h = nn.ReLU()(self.dense_zh(z))
X = self.dense_hx(h)
return X
class AGNN(torch.nn.Module):
def __init__(self, user_size, item_size, gender_size, age_size, occupation_size, genre_size, director_size, writer_size, star_size, country_size, embed_size, attention_size, dropout):
super(AGNN, self).__init__()
self.user_size = user_size
self.item_size = item_size
self.gender_size = gender_size
self.age_size = age_size
self.occupation_size = occupation_size
self.genre_size = genre_size
self.director_size = director_size
self.writer_size = writer_size
self.star_size = star_size
self.country_size = country_size
self.embed_size = embed_size
self.dropout = dropout
self.attention_size = attention_size
def init_weights(m):
if isinstance(m, nn.Linear):
nn.init.xavier_uniform(m.weight)
if m.bias is not None:
nn.init.constant(m.bias, 0)
self.user_embed = torch.nn.Embedding(self.user_size, self.embed_size)
self.item_embed = torch.nn.Embedding(self.item_size, self.embed_size)
nn.init.xavier_uniform(self.user_embed.weight)
nn.init.xavier_uniform(self.item_embed.weight)
self.user_bias = torch.nn.Embedding(self.user_size, 1)
self.item_bias = torch.nn.Embedding(self.item_size, 1)
nn.init.constant(self.user_bias.weight, 0)
nn.init.constant(self.item_bias.weight, 0)
self.miu = torch.nn.Parameter(torch.zeros(1), requires_grad=True)
self.gender_embed = torch.nn.Embedding(self.gender_size, self.embed_size)
self.gender_embed.weight.data.normal_(0, 0.05)
self.age_embed = torch.nn.Embedding(self.age_size, self.embed_size)
self.age_embed.weight.data.normal_(0, 0.05)
self.occupation_embed = torch.nn.Embedding(self.occupation_size, self.embed_size)
self.occupation_embed.weight.data.normal_(0, 0.05)
self.genre_embed = torch.nn.Embedding(self.genre_size, self.embed_size)
self.genre_embed.weight.data.normal_(0, 0.05)
self.director_embed = torch.nn.Embedding(self.director_size, self.embed_size)
self.director_embed.weight.data.normal_(0, 0.05)
self.writer_embed = torch.nn.Embedding(self.writer_size, self.embed_size)
self.writer_embed.weight.data.normal_(0, 0.05)
self.star_embed = torch.nn.Embedding(self.star_size, self.embed_size)
self.star_embed.weight.data.normal_(0, 0.05)
self.country_embed = torch.nn.Embedding(self.country_size, self.embed_size)
self.country_embed.weight.data.normal_(0, 0.05)
#--------------------------------------------------
self.dense_item_self_biinter = nn.Linear(self.embed_size, self.embed_size)
self.dense_item_self_siinter = nn.Linear(self.embed_size, self.embed_size)
self.dense_item_onehop_biinter = nn.Linear(self.embed_size, self.embed_size)
self.dense_item_onehop_siinter = nn.Linear(self.embed_size, self.embed_size)
self.dense_user_self_biinter = nn.Linear(self.embed_size, self.embed_size)
self.dense_user_self_siinter = nn.Linear(self.embed_size, self.embed_size)
self.dense_user_onehop_biinter = nn.Linear(self.embed_size, self.embed_size)
self.dense_user_onehop_siinter = nn.Linear(self.embed_size, self.embed_size)
init_weights(self.dense_item_self_biinter)
init_weights(self.dense_item_self_siinter)
init_weights(self.dense_item_onehop_biinter)
init_weights(self.dense_item_onehop_siinter)
init_weights(self.dense_user_self_biinter)
init_weights(self.dense_user_self_siinter)
init_weights(self.dense_user_onehop_biinter)
init_weights(self.dense_user_onehop_siinter)
self.dense_item_cate_self = nn.Linear(2 * self.embed_size, self.embed_size)
self.dense_item_cate_hop1 = nn.Linear(2 * self.embed_size, self.embed_size)
self.dense_user_cate_self = nn.Linear(2 * self.embed_size, self.embed_size)
self.dense_user_cate_hop1 = nn.Linear(2 * self.embed_size, self.embed_size)
init_weights(self.dense_item_cate_self)
init_weights(self.dense_item_cate_hop1)
init_weights(self.dense_user_cate_self)
init_weights(self.dense_user_cate_hop1)
self.dense_item_addgate = nn.Linear(self.embed_size * 2, self.embed_size)
init_weights(self.dense_item_addgate)
self.dense_item_erasegate = nn.Linear(self.embed_size * 2, self.embed_size)
init_weights(self.dense_item_erasegate)
self.dense_user_addgate = nn.Linear(self.embed_size * 2, self.embed_size)
init_weights(self.dense_user_addgate)
self.dense_user_erasegate = nn.Linear(self.embed_size * 2, self.embed_size)
self.user_vae = VAE(embed_size)
self.item_vae = VAE(embed_size)
#----------------------------------------------------
#concat, mlp
self.FC_pre = nn.Linear(2 * embed_size, 1)
init_weights(self.FC_pre)
"""# dot
self.user_bias = nn.Embedding(self.user_size, 1)
self.item_bias = nn.Embedding(self.item_size, 1)
self.user_bias.weight.data.normal_(0, 0.01)
self.item_bias.weight.data.normal_(0, 0.01)
self.bias = torch.nn.Parameter(torch.rand(1), requires_grad=True)
self.bias.data.uniform_(0, 0.1)"""
self.sigmoid = nn.Sigmoid()
self.tanh = nn.Tanh()
self.relu = nn.ReLU()
self.leakyrelu = nn.LeakyReLU()
self.dropout = nn.Dropout(p=0.2)
def feat_interaction(self, feature_embedding, fun_bi, fun_si, dimension):
summed_features_emb_square = (torch.sum(feature_embedding, dim=dimension)).pow(2)
squared_sum_features_emb = torch.sum(feature_embedding.pow(2), dim=dimension)
deep_fm = 0.5 * (summed_features_emb_square - squared_sum_features_emb)
deep_fm = self.leakyrelu(fun_bi(deep_fm))
bias_fm = self.leakyrelu(fun_si(feature_embedding.sum(dim=dimension)))
nfm = deep_fm + bias_fm
return nfm
def forward(self, user, item, user_self_cate, user_onehop_id, user_onehop_cate, item_self_cate, item_self_director, item_self_writer, item_self_star, item_self_country, item_onehop_id, item_onehop_cate, item_onehop_director, item_onehop_writer, item_onehop_star, item_onehop_country, mode='train'):
uids_list = user.cuda()
sids_list = item.cuda()
if mode == 'train' or mode == 'warm':
user_embedding = self.user_embed(torch.autograd.Variable(uids_list))
item_embedding = self.item_embed(torch.autograd.Variable(sids_list))
if mode == 'ics':
user_embedding = self.user_embed(torch.autograd.Variable(uids_list))
if mode == 'ucs':
item_embedding = self.item_embed(torch.autograd.Variable(sids_list))
batch_size = item_self_cate.shape[0]
cate_size = item_self_cate.shape[1]
director_size = item_self_director.shape[1]
writer_size = item_self_writer.shape[1]
star_size = item_self_star.shape[1]
country_size = item_self_country.shape[1]
user_onehop_size = user_onehop_id.shape[1]
item_onehop_size = item_onehop_id.shape[1]
#------------------------------------------------------GCN-item
# K=2
item_onehop_id = self.item_embed(Variable(item_onehop_id))
item_onehop_cate = self.genre_embed(Variable(item_onehop_cate).view(-1, cate_size)).view(batch_size,item_onehop_size,cate_size, -1)
item_onehop_director = self.director_embed(Variable(item_onehop_director).view(-1, director_size)).view(batch_size, item_onehop_size, director_size, -1)
item_onehop_writer = self.writer_embed(Variable(item_onehop_writer).view(-1, writer_size)).view(batch_size, item_onehop_size, writer_size, -1)
item_onehop_star = self.star_embed(Variable(item_onehop_star).view(-1, star_size)).view(batch_size, item_onehop_size, star_size, -1)
item_onehop_country = self.country_embed(Variable(item_onehop_country).view(-1, country_size)).view(batch_size, item_onehop_size, country_size, -1)
item_onehop_feature = torch.cat([item_onehop_cate, item_onehop_director, item_onehop_writer, item_onehop_star, item_onehop_country], dim=2)
item_onehop_embed = self.dense_item_cate_hop1(torch.cat([self.feat_interaction(item_onehop_feature, self.dense_item_onehop_biinter, self.dense_item_onehop_siinter, dimension=2), item_onehop_id], dim=-1))
# K=1
item_self_cate = self.genre_embed(Variable(item_self_cate))
item_self_director = self.director_embed(Variable(item_self_director))
item_self_writer = self.writer_embed(Variable(item_self_writer))
item_self_star = self.star_embed(Variable(item_self_star))
item_self_country = self.country_embed(Variable(item_self_country))
item_self_feature = torch.cat([item_self_cate, item_self_director, item_self_writer, item_self_star, item_self_country], dim=1)
item_self_feature = self.feat_interaction(item_self_feature, self.dense_item_self_biinter, self.dense_item_self_siinter, dimension=1)
if mode == 'ics':
item_mu, item_var = self.item_vae.Q(item_self_feature)
item_z = self.item_vae.sample_z(item_mu, item_var)
item_embedding = self.item_vae.P(item_z)
item_self_embed = self.dense_item_cate_self(torch.cat([item_self_feature, item_embedding], dim=-1))
item_addgate = self.sigmoid(self.dense_item_addgate(torch.cat([item_self_embed.unsqueeze(1).repeat(1, item_onehop_size, 1), item_onehop_embed], dim=-1))) # 商品的邻居门,控制邻居信息多少作为输入
item_erasegate = self.sigmoid(self.dense_item_erasegate(torch.cat([item_self_embed, item_onehop_embed.mean(dim=1)], dim=-1)))
item_onehop_embed_final = (item_onehop_embed * item_addgate).mean(1)
item_self_embed = (1 - item_erasegate) * item_self_embed
item_gcn_embed = self.leakyrelu(item_self_embed + item_onehop_embed_final) # [batch, embed]
#----------------------------------------------------------GCN-user
# K=2
user_onehop_id = self.user_embed(Variable(user_onehop_id))
user_onehop_gender_emb = self.gender_embed(Variable(user_onehop_cate[:, :, 0]))
user_onehop_age_emb = self.age_embed(Variable(user_onehop_cate[:, :, 1]))
user_onehop_occupation_emb = self.occupation_embed(Variable(user_onehop_cate[:, :, 2]))
user_onehop_feat = torch.cat([user_onehop_gender_emb.unsqueeze(2), user_onehop_age_emb.unsqueeze(2), user_onehop_occupation_emb.unsqueeze(2)], dim=2)
user_onehop_embed = self.dense_user_cate_hop1(torch.cat([self.feat_interaction(user_onehop_feat, self.dense_user_onehop_biinter, self.dense_user_onehop_siinter, dimension=2), user_onehop_id], dim=-1))
# K=1
user_gender_emb = self.gender_embed(Variable(user_self_cate[:, 0]))
user_age_emb = self.age_embed(Variable(user_self_cate[:, 1]))
user_occupation_emb = self.occupation_embed(Variable(user_self_cate[:, 2]))
user_self_feature = torch.cat([user_gender_emb.unsqueeze(1), user_age_emb.unsqueeze(1), user_occupation_emb.unsqueeze(1)], dim=1)
user_self_feature = self.feat_interaction(user_self_feature, self.dense_user_self_biinter, self.dense_user_onehop_siinter, dimension=1)
if mode == 'ucs':
user_mu, user_var = self.user_vae.Q(user_self_feature)
user_z = self.user_vae.sample_z(user_mu, user_var)
user_embedding = self.user_vae.P(user_z)
user_self_embed = self.dense_user_cate_self(torch.cat([user_self_feature, user_embedding], dim=-1))
user_addgate = self.sigmoid(self.dense_user_addgate(torch.cat([user_self_embed.unsqueeze(1).repeat(1, user_onehop_size, 1), user_onehop_embed],dim=-1)))
user_erasegate = self.sigmoid(self.dense_user_erasegate(torch.cat([user_self_embed, user_onehop_embed.mean(dim=1)], dim=-1)))
user_onehop_embed_final = (user_onehop_embed * user_addgate).mean(dim=1)
user_self_embed = (1 - user_erasegate) * user_self_embed
user_gcn_embed = self.leakyrelu(user_self_embed + user_onehop_embed_final)
#--------------------------------------------------norm
item_mu, item_var = self.item_vae.Q(item_self_feature)
item_z = self.item_vae.sample_z(item_mu, item_var)
item_preference_sample = self.item_vae.P(item_z)
user_mu, user_var = self.user_vae.Q(user_self_feature)
user_z = self.user_vae.sample_z(user_mu, user_var)
user_preference_sample = self.user_vae.P(user_z)
recon_loss = torch.norm(item_preference_sample - item_embedding) + torch.norm(user_preference_sample - user_embedding)
kl_loss = torch.mean(0.5 * torch.sum(torch.exp(item_z) + item_mu ** 2 - 1. - item_var, 1)) + \
torch.mean(0.5 * torch.sum(torch.exp(user_z) + user_mu ** 2 - 1. - user_var, 1))
####################################prediction#####################################################
#concat -> mlp
bu = self.user_bias(Variable(uids_list))
bi = self.item_bias(Variable(sids_list))
#pred = (user_gcn_embed * item_gcn_embed).sum(1, keepdim=True) + bu + bi + (self.miu).repeat(batch_size, 1)
tmp = torch.cat([user_gcn_embed, item_gcn_embed], dim=1)
pred = self.FC_pre(tmp) + (user_gcn_embed * item_gcn_embed).sum(1, keepdim=True) + bu + bi + (self.miu).repeat(batch_size, 1)
return pred.squeeze(), recon_loss, kl_loss
def metrics(model, test_dataloader):
label_lst, pred_lst = [], []
rmse, mse, mae = 0,0,0
count = 0
for batch_data in test_dataloader:
user = torch.LongTensor(batch_data[0]).cuda()
item = torch.LongTensor(batch_data[1]).cuda()
label = torch.FloatTensor(batch_data[2]).cuda()
user_self_cate = torch.LongTensor(batch_data[3]).cuda()
user_onehop_id = torch.LongTensor(batch_data[4]).cuda()
user_onehop_cate = torch.LongTensor(batch_data[5]).cuda()
item_self_cate, item_self_director, item_self_writer, item_self_star, item_self_country = torch.LongTensor(
batch_data[6])[:, 0:6].cuda(), torch.LongTensor(batch_data[6])[:, 6:9].cuda(), torch.LongTensor(
batch_data[6])[:, 9:12].cuda(), torch.LongTensor(batch_data[6])[:, 12:15].cuda(), torch.LongTensor(
batch_data[6])[:, 15:].cuda()
item_onehop_id = torch.LongTensor(batch_data[7]).cuda()
item_onehop_cate, item_onehop_director, item_onehop_writer, item_onehop_star, item_onehop_country = torch.LongTensor(
batch_data[8])[:, :, 0:6].cuda(), torch.LongTensor(batch_data[8])[:, :, 6:9].cuda(), torch.LongTensor(
batch_data[8])[:, :, 9:12].cuda(), torch.LongTensor(batch_data[8])[:, :, 12:15].cuda(), torch.LongTensor(
batch_data[8])[:, :, 15:].cuda()
prediction, recon_loss, kl_loss = model(user, item, user_self_cate, user_onehop_id, user_onehop_cate, item_self_cate,
item_self_director, item_self_writer, item_self_star, item_self_country, item_onehop_id,
item_onehop_cate, item_onehop_director, item_onehop_writer, item_onehop_star,
item_onehop_country, mode = mode)
prediction = prediction.cpu().data.numpy()
prediction = prediction.reshape(prediction.shape[0])
label = label.cpu().numpy()
my_rmse = np.sum((prediction - label) ** 2)
my_mse = np.sum((prediction - label) ** 2)
my_mae = np.sum(np.abs(prediction - label))
# my_rmse = torch.sqrt(torch.sum((prediction - label) ** 2) / FLAGS.batch_size)
rmse+=my_rmse
mse+=my_mse
mae+=my_mae
count += len(user)
label_lst.extend(list([float(l) for l in label]))
pred_lst.extend(list([float(l) for l in prediction]))
my_mse = mse/count
my_rmse = np.sqrt(rmse/count)
my_mae = mae/count
return my_rmse, my_mse, my_mae, label_lst, pred_lst
if __name__ == '__main__':
#item cold start
f_info = 'ml100k/uiinfo.pkl'
f_neighbor = 'ml100k/neighbor_aspect_extension_2_zscore_ics_uuii_0.20.pkl'
f_train = 'ml100k/ics_train.dat'
f_test = 'ml100k/ics_val.dat'
f_model = 'ml100k/agnn_ics_'
mode = 'ics'
"""# user cold start
f_info = 'ml100k/uiinfo.pkl'
f_neighbor = 'ml100k/neighbor_aspect_extension_2_zscore_ucs_uuii.pkl'
f_train = 'ml100k/ucs_train.dat'
f_test = 'ml100k/ucs_val.dat'
f_model = 'ml100k/agnn_ucs_'
mode = 'ucs'"""
"""# warm start
f_info = 'ml100k/uiinfo.pkl'
f_neighbor = 'ml100k/neighbor_aspect_extension_2_zscore_warm_uuii.pkl'
f_train = 'ml100k/warm_train.dat'
f_test = 'ml100k/warm_val.dat'
f_model = 'ml100k/agnn_warm_'
mode = 'warm'"""
FLAGS = parser.parse_args(args={})
print("\nParameters:")
print(FLAGS.__dict__)
with open(f_neighbor, 'rb') as f:
neighbor_dict = pickle.load(f)
user_nei_dict = neighbor_dict['user_nei_dict']
item_nei_dict = neighbor_dict['item_nei_dict']
director_num = neighbor_dict['director_num']
writer_num = neighbor_dict['writer_num']
star_num = neighbor_dict['star_num']
country_num = neighbor_dict['country_num']
item_director_dict = neighbor_dict['item_director_dict'] #dict[i]=[x,x,x]
item_writer_dict = neighbor_dict['item_writer_dict'] #dict[i]=[x,x,x]
item_star_dict = neighbor_dict['item_star_dict'] #dict[i]=[x,x,x]
item_country_dict = neighbor_dict['item_country_dict'] #dict[i]=[x,x,x,x,x,x,x,x]
with open(f_info, 'rb') as f:
item_info = pickle.load(f)
user_num = item_info['user_num']
item_num = item_info['item_num']
gender_num = item_info['gender_num']
age_num = item_info['age_num']
occupation_num = item_info['occupation_num']
genre_num = item_info['genre_num']
user_feature_dict = item_info['user_feature_dict'] #gender, age, occupation dict[u]=[x,x,x]
item_feature_dict = item_info['item_feature_dict'] #genre dict[i]=[x,x,x,x,x,x]
print("user_num {}, item_num {}, gender_num {}, age_num {}, occupation_num {}, genre_num {}, director_num {}, writer_num {}, star_num {}, country_num {}, mode {} ".format(user_num, item_num, gender_num, age_num, occupation_num, genre_num, director_num, writer_num, star_num, country_num, mode))
train_steps, train_list = get_data_list(f_train, batch_size=FLAGS.batch_size)
test_steps, test_list = get_data_list(f_test, batch_size=FLAGS.batch_size)
model = AGNN(user_num, item_num, gender_num, age_num, occupation_num, genre_num, director_num, writer_num, star_num, country_num, FLAGS.embed_size, FLAGS.attention_size, FLAGS.dropout)
model.cuda()
loss_function = torch.nn.MSELoss(size_average=False)
optimizer = optim.Adam(filter(lambda p: p.requires_grad, model.parameters()), lr=FLAGS.lr, weight_decay=0.001)
writer = SummaryWriter() # For visualization
#f_loss_curve = open('tmp_loss_curve.txt', 'w')
best_rmse = 5
count = 0
for epoch in range(FLAGS.epochs):
#tmp_main_loss, tmp_vae_loss = [], []
model.train() # Enable dropout (if have).
start_time = time.time()
train_dataloader = get_batch_instances(train_list, user_feature_dict, item_feature_dict, item_director_dict, item_writer_dict, item_star_dict, item_country_dict, batch_size=FLAGS.batch_size, user_nei_dict=user_nei_dict, item_nei_dict=item_nei_dict, shuffle=True)
for idx, batch_data in enumerate(train_dataloader): #u, i, l, u_self_cate, u_onehop_id, u_onehop_rating, u_onehop_cate, i_self_cate, i_onehop_id, i_onehop_cate
user = torch.LongTensor(batch_data[0]).cuda()
item = torch.LongTensor(batch_data[1]).cuda()
label = torch.FloatTensor(batch_data[2]).cuda()
user_self_cate = torch.LongTensor(batch_data[3]).cuda()
user_onehop_id = torch.LongTensor(batch_data[4]).cuda()
user_onehop_cate = torch.LongTensor(batch_data[5]).cuda()
item_self_cate, item_self_director, item_self_writer, item_self_star, item_self_country = torch.LongTensor(batch_data[6])[:, 0:6].cuda(), torch.LongTensor(batch_data[6])[:, 6:9].cuda(), torch.LongTensor(batch_data[6])[:, 9:12].cuda(), torch.LongTensor(batch_data[6])[:, 12:15].cuda(), torch.LongTensor(batch_data[6])[:, 15:].cuda()
item_onehop_id = torch.LongTensor(batch_data[7]).cuda()
item_onehop_cate, item_onehop_director, item_onehop_writer, item_onehop_star, item_onehop_country = torch.LongTensor(batch_data[8])[:, :, 0:6].cuda(), torch.LongTensor(batch_data[8])[:, :, 6:9].cuda(), torch.LongTensor(batch_data[8])[:, :, 9:12].cuda(), torch.LongTensor(batch_data[8])[:, :, 12:15].cuda(), torch.LongTensor(batch_data[8])[:, :, 15:].cuda()
model.zero_grad()
prediction, recon_loss, kl_loss = model(user, item, user_self_cate, user_onehop_id, user_onehop_cate, item_self_cate, item_self_director, item_self_writer, item_self_star, item_self_country, item_onehop_id, item_onehop_cate, item_onehop_director, item_onehop_writer, item_onehop_star, item_onehop_country, mode='train')
label = Variable(label)
main_loss = loss_function(prediction, label)
loss = main_loss + FLAGS.vae_lambda * (recon_loss + kl_loss)
loss.backward()
# nn.utils.clip_grad_norm(model.parameters(), FLAGS.clip_norm)
optimizer.step()
writer.add_scalar('data/loss', loss.data, count)
count += 1
tmploss = torch.sqrt(loss / FLAGS.batch_size)
print(50 * '#')
print('epoch: ', epoch, ' ', tmploss.detach())
model.eval()
print('time = ', time.time() - start_time)
test_dataloader = get_batch_instances(test_list, user_feature_dict, item_feature_dict, item_director_dict, item_writer_dict, item_star_dict, item_country_dict, batch_size=FLAGS.batch_size, user_nei_dict=user_nei_dict, item_nei_dict=item_nei_dict, shuffle=False)
rmse, mse, mae, label_lst, pred_lst = metrics(model, test_dataloader)
print('test rmse,mse,mae: ', rmse,mse,mae)
"""if (rmse < best_rmse):
best_rmse = rmse
f_name = f_model + str(best_rmse)[:7] + '.dat' #f_model + str(best_rmse)[:7] + '.dat'
#torch.save(model, f_name)
f = open(f_name, 'w')
res_dict = {}
res_dict['label'] = label_lst
res_dict['pred'] = pred_lst
json.dump(res_dict, f)
f.close()
print('save result ok')"""
Parameters: {'lr': 0.0005, 'dropout': 0.5, 'batch_size': 128, 'gpu': '0', 'epochs': 20, 'clip_norm': 5.0, 'embed_size': 30, 'attention_size': 50, 'item_layer1_nei_num': 10, 'user_layer1_nei_num': 10, 'vae_lambda': 1} user_num 944, item_num 1683, gender_num 2, age_num 7, occupation_num 21, genre_num 19, director_num 1112, writer_num 2016, star_num 2568, country_num 128, mode ics ################################################## epoch: 0 tensor(0.5626, device='cuda:0') time = 47.153045654296875 test rmse,mse,mae: 1.0325273649289848 1.066112759327193 0.8309432697873491 ################################################## epoch: 1 tensor(0.5689, device='cuda:0') time = 43.06915545463562 test rmse,mse,mae: 1.0266010076226462 1.0539096288518324 0.824870026105042 ################################################## epoch: 2 tensor(0.6943, device='cuda:0') time = 42.44387221336365 test rmse,mse,mae: 1.0571549817251507 1.1175766553863038 0.8638132363016552 ################################################## epoch: 3 tensor(0.5527, device='cuda:0') time = 42.13542699813843 test rmse,mse,mae: 1.0517702752306248 1.1062207118587042 0.8609723428611776 ################################################## epoch: 4 tensor(0.5443, device='cuda:0') time = 43.24214553833008 test rmse,mse,mae: 1.0358337215744027 1.0729514987506772 0.8453118500108566 ################################################## epoch: 5 tensor(0.4815, device='cuda:0') time = 42.43221974372864 test rmse,mse,mae: 1.0231939566804946 1.0469258729874857 0.8253297993686741 ################################################## epoch: 6 tensor(0.4968, device='cuda:0') time = 42.93929886817932 test rmse,mse,mae: 1.0359929845205296 1.0732814639757542 0.8411421427834342 ################################################## epoch: 7 tensor(0.4895, device='cuda:0') time = 42.229517459869385 test rmse,mse,mae: 1.0532774718633056 1.1093934327347565 0.8607855100322045 ################################################## epoch: 8 tensor(0.5074, device='cuda:0') time = 42.55834674835205 test rmse,mse,mae: 1.0247800932391622 1.050174239499266 0.824709580819818 ################################################## epoch: 9 tensor(0.4578, device='cuda:0') time = 41.558250427246094 test rmse,mse,mae: 1.0922955741421183 1.19310962129046 0.8990770569855391 ################################################## epoch: 10 tensor(0.5202, device='cuda:0') time = 42.752477407455444 test rmse,mse,mae: 1.0405772957327133 1.0828011083944065 0.8449710432355982 ################################################## epoch: 11 tensor(0.5196, device='cuda:0') time = 41.88001823425293 test rmse,mse,mae: 1.0647690947606663 1.1337332251574486 0.8694145861863434 ################################################## epoch: 12 tensor(0.5833, device='cuda:0') time = 41.569355964660645 test rmse,mse,mae: 1.0307747052978313 1.0624964930818313 0.832099199058856 ################################################## epoch: 13 tensor(0.5642, device='cuda:0') time = 41.10500383377075 test rmse,mse,mae: 1.0283506332487717 1.05750502490315 0.8281871831344915 ################################################## epoch: 14 tensor(0.5756, device='cuda:0') time = 41.294716119766235 test rmse,mse,mae: 1.05585361314044 1.1148268523817215 0.8628918800500965 ################################################## epoch: 15 tensor(0.5247, device='cuda:0') time = 41.94802975654602 test rmse,mse,mae: 1.0287578659232348 1.0583427466989286 0.831378873784134 ################################################## epoch: 16 tensor(0.5186, device='cuda:0') time = 41.139464139938354 test rmse,mse,mae: 1.0345978113820935 1.070392631316618 0.8337778758005447 ################################################## epoch: 17 tensor(0.4468, device='cuda:0') time = 42.24388003349304 test rmse,mse,mae: 1.027209256353785 1.0551588563388956 0.8190214309314482 ################################################## epoch: 18 tensor(0.4668, device='cuda:0') time = 41.94647932052612 test rmse,mse,mae: 1.0637774120911032 1.1316223824752447 0.8678730945240749 ################################################## epoch: 19 tensor(0.4666, device='cuda:0') time = 41.875648021698 test rmse,mse,mae: 1.063158955390531 1.1303069644270851 0.8644698513033106
END