The AutoRec model introduced in this article was proposed by researchers at the Australian National University in 2015. It combines the idea of the autoencoder (AutoEncoder) with collaborative filtering (Collaborative Filtering), yielding a simple recommendation model built on a neural network with a single hidden layer.
It can be said that this model helped open the way to using deep learning for recommender systems and provided ideas for building more complex deep learning networks.
The original paper is only 2 pages long, concise and clear, which makes it well suited as an introductory model for learning about deep learning recommender systems.
Foreword
This article introduces the basic principles of the AutoRec model, including the network structure, loss function, recommendation process, and experimental results, and provides an implementation based on PyTorch.
Introduction to AutoRec Models
The AutoRec model resembles an MLP (multi-layer perceptron): it is a standard 3-layer neural network (counting the input layer) that combines the ideas of the autoencoder (AutoEncoder) and collaborative filtering (Collaborative Filtering). More precisely, AutoRec is a standard autoencoder structure. Its basic principle is to use the co-occurrence matrix from collaborative filtering to auto-encode item vectors or user vectors, then use the reconstructed vectors to obtain each user's predicted ratings for all items. Sorting these predictions gives the items to recommend.
Here is a brief introduction to autoencoders:
An autoencoder is an unsupervised method for compressing data dimensions and expressing data features. It is a type of neural network that tries to copy the input to the output after training. The autoencoder consists of an encoder and a decoder, and the structure is as follows:
The following is the overall model block diagram of AutoRec:
The diagram shows the item-based AutoRec model. The entire model has only 3 layers: the blue dots represent the hidden-layer neurons, and the red boxes represent the input of the model.
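To make the three-layer structure concrete, here is a minimal NumPy sketch of one forward pass for a single item vector. It assumes a sigmoid hidden activation and a linear output, which matches the network code later in this article. The parameter names V, mu, W, b, the toy sizes, and the random initialisation are illustrative assumptions, not trained values.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(0)
m, k = 6, 3                      # m users, k hidden units (toy sizes)

# Encoder (V, mu) and decoder (W, b) parameters, randomly initialised here.
V, mu = rng.normal(size=(k, m)), np.zeros(k)
W, b = rng.normal(size=(m, k)), np.zeros(m)

# Ratings given to one item by the m users; 0 marks "not rated".
r_item = np.array([5.0, 0.0, 3.0, 0.0, 4.0, 1.0])

hidden = sigmoid(V @ r_item + mu)   # encode: m -> k
r_hat = W @ hidden + b              # decode: k -> m, one predicted score per user
```

The output r_hat has one entry per user, including users who have not rated the item, which is what makes the reconstruction usable for prediction.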
Loss function
First, the loss function of a standard autoencoder is given, as follows:
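The formula was an image in the original post. As a reconstruction in the notation of the AutoRec paper, where S is the set of input vectors, h(r; θ) is the reconstruction of the input r, g and f are the hidden and output activation functions, and θ = {W, V, μ, b} are the parameters:

```latex
\min_{\theta} \sum_{\mathbf{r} \in S} \left\| \mathbf{r} - h(\mathbf{r}; \theta) \right\|_2^2,
\qquad
h(\mathbf{r}; \theta) = f\left( W \cdot g(V \mathbf{r} + \mu) + b \right)
```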
The loss function of the AutoRec model also constrains the parameters: an L2 regularization term is added to prevent overfitting, and only observed ratings contribute to the reconstruction error, so the loss function becomes:
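The formula was an image in the original post. As a reconstruction in the notation of the AutoRec paper for the item-based model, where r^(i) is the rating vector of item i, n is the number of items, h(r; θ) = f(W · g(V r + μ) + b) is the autoencoder's reconstruction, the subscript O means that only observed ratings are counted, and λ is the regularization strength:

```latex
\min_{\theta} \sum_{i=1}^{n} \left\| \mathbf{r}^{(i)} - h(\mathbf{r}^{(i)}; \theta) \right\|_{\mathcal{O}}^2
+ \frac{\lambda}{2} \left( \| W \|_F^2 + \| V \|_F^2 \right)
```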
AutoRec-based recommendation process
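As described in the introduction, recommending with AutoRec amounts to reconstructing the rating vectors, reading off the predicted ratings, and sorting them. The sketch below illustrates the sorting step for one user with NumPy. The function name top_n_items, the toy matrices, and the choice to exclude items the user has already rated are assumptions made for illustration, not part of the original code.

```python
import numpy as np

def top_n_items(predicted, observed_mask, user, n):
    """Return indices of the n highest-scoring items the user has not rated."""
    scores = predicted[user].astype(float).copy()
    scores[observed_mask[user] == 1] = -np.inf   # assumption: skip already-rated items
    return np.argsort(-scores)[:n]

# Toy reconstructed rating matrix: 2 users x 4 items (made-up numbers).
predicted = np.array([[4.5, 3.0, 4.9, 2.0],
                      [1.0, 4.2, 3.3, 4.8]])
# 1 marks a rating the user has already given.
observed_mask = np.array([[1, 0, 0, 0],
                          [0, 0, 0, 1]])

recs_user0 = top_n_items(predicted, observed_mask, user=0, n=2)
recs_user1 = top_n_items(predicted, observed_mask, user=1, n=2)
```

In practice the predicted matrix would come from the trained network's output rather than hand-written numbers.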
Experimental comparison
The authors ran comparative experiments on the MovieLens 1M, MovieLens 10M, and Netflix datasets. The evaluation metric is RMSE (root mean square error), and the baselines are the U-RBM, I-RBM, BiasedMF, and LLORMA algorithms. The results are as follows:
Comparative experiment results 1
Comparative experiment results 2
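For reference, the RMSE metric used in these comparisons is computed only over ratings that actually exist in the test set. A minimal NumPy sketch, assuming a 0/1 mask that marks the observed test ratings (the function name masked_rmse and the toy numbers are illustrative):

```python
import numpy as np

def masked_rmse(predicted, actual, mask):
    """Root mean square error over the entries where mask == 1."""
    squared_error = ((predicted - actual) * mask) ** 2
    return float(np.sqrt(squared_error.sum() / mask.sum()))

actual = np.array([[5.0, 0.0],
                   [3.0, 4.0]])
mask = np.array([[1.0, 0.0],      # the 0 entry is unobserved and is ignored
                 [1.0, 1.0]])
predicted = np.array([[4.0, 2.5],
                      [3.0, 2.0]])

rmse = masked_rmse(predicted, actual, mask)   # sqrt((1 + 0 + 4) / 3)
```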
The authors also ran ablation experiments to measure how the choice of activation function affects the final results.
Ablation experiment
In addition, the effect of the number of hidden-layer neurons on the results was evaluated. As the number of hidden-layer neurons increases, the RMSE decreases steadily.
Code practice
The code is written in PyTorch and consists of the data preprocessing and loading file dataloader.py, the network model definition network.py, the trainer trainer.py, and the test file autorec_test.py.
The data preprocessing part is relatively simple. The data is the MovieLens 1M dataset. This part mainly builds the co-occurrence matrix and splits the ratings into a training set and a test set.
Part of the code is as follows:
import torch
import numpy as np
import torch.utils.data as Data

def dataProcess(filename, num_users, num_items, train_ratio):
    # Each line of ratings.dat has the form user::item::rating::timestamp
    with open(filename, 'r') as fp:
        lines = fp.readlines()
    num_total_ratings = len(lines)
    user_train_set = set()
    user_test_set = set()
    item_train_set = set()
    item_test_set = set()
    # Rating matrices and 0/1 masks marking which entries are observed
    train_r = np.zeros((num_users, num_items))
    test_r = np.zeros((num_users, num_items))
    train_mask_r = np.zeros((num_users, num_items))
    test_mask_r = np.zeros((num_users, num_items))
    # Randomly split the ratings into training and test indices
    random_perm_idx = np.random.permutation(num_total_ratings)
    train_idx = random_perm_idx[0:int(num_total_ratings * train_ratio)]
    test_idx = random_perm_idx[int(num_total_ratings * train_ratio):]

    ''' Train '''
    for itr in train_idx:
        line = lines[itr]
        user, item, rating, _ = line.split("::")
        # IDs in the file start at 1; matrix indices start at 0
        user_idx = int(user) - 1
        item_idx = int(item) - 1
        train_r[user_idx][item_idx] = int(rating)
        train_mask_r[user_idx][item_idx] = 1
        user_train_set.add(user_idx)
        item_train_set.add(item_idx)

    ''' Test '''
    for itr in test_idx:
        line = lines[itr]
        user, item, rating, _ = line.split("::")
        user_idx = int(user) - 1
        item_idx = int(item) - 1
        test_r[user_idx][item_idx] = int(rating)
        test_mask_r[user_idx][item_idx] = 1
        user_test_set.add(user_idx)
        item_test_set.add(item_idx)

    return train_r, train_mask_r, test_r, test_mask_r, user_train_set, item_train_set, user_test_set, item_test_set

def Construct_DataLoader(train_r, train_mask_r, batchsize):
    # Cast to float32 to match the default dtype of the nn.Linear weights
    torch_dataset = Data.TensorDataset(torch.from_numpy(train_r).float(), torch.from_numpy(train_mask_r).float())
    return Data.DataLoader(dataset=torch_dataset, batch_size=batchsize, shuffle=True)
The network model code is also relatively simple: essentially two fully connected layers with a Sigmoid activation function between them.
The code is as follows:
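import torch
import numpy as np
import torch.nn as nn

class AutoRec(nn.Module):
    """
    AutoRec model (described as item-based in the original code).
    Note: the input size is num_items, so each input row is one user's rating vector.
    """
    def __init__(self, config):
        super(AutoRec, self).__init__()
        self._num_items = config['num_items']
        self._hidden_units = config['hidden_units']
        self._lambda_value = config['lambda']
        self._config = config

        # Encoder: num_items -> hidden_units, followed by Sigmoid
        self._encoder = nn.Sequential(
            nn.Linear(self._num_items, self._hidden_units),
            nn.Sigmoid()
        )
        # Decoder: hidden_units -> num_items, linear output
        self._decoder = nn.Sequential(
            nn.Linear(self._hidden_units, self._num_items)
        )

    def forward(self, input):
        return self._decoder(self._encoder(input))

    def loss(self, res, input, mask, optimizer):
        # Squared reconstruction error over observed ratings only
        squared_error = ((res - input) * mask).pow(2).sum()
        # L2 penalty on the weight matrices (2-D parameters); biases are skipped.
        # The parameters are used directly (not .data) so the penalty contributes gradients.
        reg = 0
        for group in optimizer.param_groups:
            for param in group['params']:
                if param.dim() == 2:
                    reg = reg + param.pow(2).sum()
        cost = squared_error + reg * self._lambda_value * 0.5
        return cost, squared_error

    def recommend_user(self, r_u, N):
        """
        :param r_u: rating vector of a single user over all items
        :param N: number of items to recommend
        """
        predict = self.forward(torch.from_numpy(r_u).float())
        predict = predict.detach().numpy()
        indexs = np.argsort(-predict)[:N]
        return indexs

    def recommend_item(self, user, test_r, N):
        """
        :param user: index of the target user
        :param test_r: rating matrix of shape (num_users, num_items)
        :param N: number of items to recommend
        """
        # The encoder expects vectors of length num_items, so the user's row
        # of test_r is fed to the network rather than item columns.
        predict = self.forward(torch.from_numpy(test_r[user]).float())
        recommends = predict.detach().numpy()
        indexs = np.argsort(-recommends)[:N]
        # Predicted scores of the top-N items, as in the original return value
        return recommends[indexs]

    def evaluate(self, test_r, test_mask_r, user_test_set, user_train_set, item_test_set, item_train_set):
        test_r_tensor = torch.from_numpy(test_r).type(torch.FloatTensor)
        test_mask_r_tensor = torch.from_numpy(test_mask_r).type(torch.FloatTensor)

        # Note: the network is fed test_r itself here; feeding the training
        # matrix would be the stricter evaluation protocol.
        with torch.no_grad():
            res = self.forward(test_r_tensor)

        # Users and items that appear only in the test set get a default rating of 3
        unseen_user_test_list = list(user_test_set - user_train_set)
        unseen_item_test_list = list(item_test_set - item_train_set)
        for user in unseen_user_test_list:
            for item in unseen_item_test_list:
                if test_mask_r[user, item] == 1:
                    res[user, item] = 3

        mse = ((res - test_r_tensor) * test_mask_r_tensor).pow(2).sum()
        RMSE = mse.detach().cpu().numpy() / (test_mask_r == 1).sum()
        RMSE = np.sqrt(RMSE)
        print('test RMSE : ', RMSE)

    def saveModel(self):
        torch.save(self.state_dict(), self._config['model_name'])

    def loadModel(self, map_location):
        state_dict = torch.load(self._config['model_name'], map_location=map_location)
        self.load_state_dict(state_dict, strict=False)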
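The trainer in trainer.py is not shown in this article. As a rough idea of what training with the masked, L2-regularized loss might look like, here is a minimal self-contained sketch. It is an assumption-laden illustration, not the author's trainer: the nn.Sequential network stands in for the AutoRec class, and the toy sizes, the random ratings, the value of lam, and the helper name step are all made up for the example.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
num_users, num_items, hidden_units, lam = 6, 5, 3, 0.1   # toy sizes (assumed)

# Toy ratings in 1..5 with roughly half of the entries observed.
mask = (torch.rand(num_users, num_items) > 0.5).float()
ratings = torch.randint(1, 6, (num_users, num_items)).float() * mask

# Stand-in for the AutoRec network: Linear -> Sigmoid -> Linear.
model = nn.Sequential(
    nn.Linear(num_items, hidden_units), nn.Sigmoid(),
    nn.Linear(hidden_units, num_items),
)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-2)

def step():
    """One full-batch gradient step on the masked, regularized loss."""
    optimizer.zero_grad()
    recon = model(ratings)
    squared_error = ((recon - ratings) * mask).pow(2).sum()
    # L2 penalty on the weight matrices only (2-D parameters).
    reg = sum(p.pow(2).sum() for p in model.parameters() if p.dim() == 2)
    loss = squared_error + 0.5 * lam * reg
    loss.backward()
    optimizer.step()
    return loss.item()

losses = [step() for _ in range(200)]
```

A real trainer would iterate over mini-batches from the DataLoader and would typically save the model afterwards with saveModel.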
The test code covers model training, recommends 5 items to each of 3 selected users, and evaluates the RMSE metric on the test set.
The code is as follows:
import torch
from AutoRec.trainer import Trainer
from AutoRec.network import AutoRec
from AutoRec.dataloader import dataProcess

autorec_config = {
    'train_ratio': 0.9,
    'num_epoch': 200,
    'batch_size': 100,
    'optimizer': 'adam',
    'adam_lr': 1e-3,
    'l2_regularization': 1e-4,
    'num_users': 6040,
    'num_items': 3952,
    'hidden_units': 500,
    'lambda': 1,
    'device_id': 2,
    'use_cuda': False,
    'data_file': '../Data/ml-1m/ratings.dat',
    'model_name': '../Models/AutoRec.model'
}

if __name__ == "__main__":
    # Build the rating matrices and the train/test split
    train_r, train_mask_r, test_r, test_mask_r, \
        user_train_set, item_train_set, user_test_set, item_test_set = \
        dataProcess(autorec_config['data_file'], autorec_config['num_users'], autorec_config['num_items'], autorec_config['train_ratio'])

    # Load a previously trained model from disk
    autorec = AutoRec(config=autorec_config)
    autorec.loadModel(map_location=torch.device('cpu'))

    # Recommend 5 items to each of three users
    print("Recommendations for user 1: ", autorec.recommend_user(test_r[0], 5))
    print("Recommendations for user 2: ", autorec.recommend_user(test_r[9], 5))
    print("Recommendations for user 3: ", autorec.recommend_user(test_r[23], 5))

    # Report RMSE on the test set
    autorec.evaluate(test_r, test_mask_r, user_test_set=user_test_set, user_train_set=user_train_set,
                     item_test_set=item_test_set, item_train_set=item_train_set)
The test results are as follows:
Summary
The AutoRec model is a pioneering application of deep learning to recommender systems. It uses a single-hidden-layer autoencoder to generalize over user and item ratings, which gives the model a degree of generalization and expressive power. A slight increase in model complexity, such as more hidden-layer neurons, produces a clear improvement in performance.
References
- "Deep Learning Recommender System" -- Zhe Wang
- https://blog.csdn.net/quiet_girl/article/details/84401029
- https://zhuanlan.zhihu.com/p/163673436
- https://zhuanlan.zhihu.com/p/159087297
- https://github.com/NeWnIx5991/AutoRec-for-CF/blob/master/autorec.py