Deep learning recommendation system (3) NeuralCF and its application on the ml-1m movie data set

In 2016, with the introduction of a large number of excellent deep learning models such as Microsoft's Deep Crossing, Google's Wide&Deep, and FNN and PNN, recommender systems fully entered the deep learning era, and deep models remain mainstream today. Compared with traditional models, deep learning recommendation models have two main advantages:

  • Compared with traditional machine learning models, deep learning models have stronger expressive capabilities and can mine more hidden patterns in data.

  • The structure of a deep learning model is very flexible and can be adjusted to the business scenario and data characteristics, so that the model fits the application scenario well.

The deep learning recommendation model takes the multi-layer perceptron (MLP) as the core and evolves by changing the neural network structure.

Traditional recommendation algorithms have two fundamental ideas:

  • One is the representation of users and items, that is, how to better represent user and item feature information. The latent factor model (matrix factorization, MF) belongs to this idea: it uses embeddings to represent users and items, and then expresses a user's preference for an item through the inner product of the two vectors.

  • The second is feature interaction, that is, considering the interaction information between features to enrich the expressive power of the data. The Factorization Machine (FM) family is dedicated to this problem.

Deep learning recommendation algorithms still follow these two fundamental ideas:

  • Some deep learning algorithms still focus on the representation of users and items. The AutoRec and Deep Crossing models are evolutions in the complexity and depth of neural networks; both solve the recommendation problem from the representation perspective, since they do not deliberately model the interaction between features.

  • Other deep learning algorithms specialize in the idea of feature interaction. The Neural CF and PNN models focus more on how features are crossed.

  • Of course, we will also pay attention to other developments of deep learning in recommendation.

1 Principle of NeuralCF model

1.1 Matrix factorization model MF

Matrix factorization builds on the co-occurrence matrix of collaborative filtering, using denser latent vectors to represent users and items and mining their implicit interests and characteristics. To a certain extent, it makes up for collaborative filtering's limited ability to handle sparse matrices.

The learning process of the latent vectors in matrix factorization can be regarded as a simple neural network: the user vector and item vector can be viewed as the outputs of embedding layers, and the final rating (the predicted value) is the "similarity" obtained from the inner product of the user vector and the item vector. This inner product can be regarded as the computation of a single neural unit.
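
For user $u$ and item $i$ with $K$-dimensional latent vectors $\mathbf{p}_u$ and $\mathbf{q}_i$, the MF prediction is simply

$$\hat{r}_{ui} = \mathbf{p}_u^{\top}\mathbf{q}_i = \sum_{k=1}^{K} p_{uk}\,q_{ik}$$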

When actually training and evaluating matrix factorization models, it is often found that the model is prone to underfitting. The reason is that the structure of matrix factorization is relatively simple; in particular, the "output layer" (also called the "scoring layer") cannot effectively fit the optimization objective. This calls for a model with stronger expressive power, and motivated by this, researchers from the National University of Singapore proposed the NeuralCF model.

1.2 Structure of NeuralCF model

An important improvement of NeuralCF is to replace the inner product operation in matrix factorization with a multi-layer neural network plus an output layer. This has two benefits:

  • First, the user vector and item vector can interact more fully, yielding more valuable feature-combination information.

  • Second, it introduces more nonlinearity, making the model more expressive.

1.2.1 Details of the NeuralCF model

The NeuralCF model can be regarded as a general NCF framework, because the interaction between the two embedding vectors can take many forms (see the formulas below):

  • If the interaction is an inner product operation, the NCF framework degenerates into an ordinary GMF (Generalized Matrix Factorization) model.

  • If the interaction is a multi-layer neural network, the framework becomes an MLP network.
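
In the notation of the NCF paper, with user embedding $\mathbf{p}_u$ and item embedding $\mathbf{q}_i$, the two instantiations can be written as

$$\hat{y}_{ui}^{\,\mathrm{GMF}} = \sigma\big(\mathbf{h}^{\top}(\mathbf{p}_u \odot \mathbf{q}_i)\big) \qquad \hat{y}_{ui}^{\,\mathrm{MLP}} = \sigma\big(\mathbf{h}^{\top}\phi_L(\cdots\phi_1([\mathbf{p}_u;\mathbf{q}_i])\cdots)\big)$$

where $\odot$ is the element-wise product, $[\cdot\,;\cdot]$ denotes concatenation, and each $\phi_\ell$ is a fully connected layer (with ReLU activations in the implementation below).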

1.2.2 NeuralCF hybrid model

For the NCF framework, the above gives two example models, GMF and MLP, based on different interaction methods: the former models feature interaction in a linear manner, the latter in a nonlinear manner.

The NeuralCF hybrid model integrates the original NeuralCF model proposed above with the generalized matrix factorization (GMF) model that uses the element-wise product as its interaction. This gives the model stronger feature-combination and nonlinear capabilities.
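
Concretely, the hybrid model concatenates the outputs of the two branches before the final scoring layer:

$$\hat{y}_{ui} = \sigma\big(\mathbf{h}^{\top}\big[\,\mathbf{p}_u^{G} \odot \mathbf{q}_i^{G}\,;\ \mathrm{MLP}([\mathbf{p}_u^{M};\mathbf{q}_i^{M}])\,\big]\big)$$

where the GMF branch and the MLP branch each have their own embedding layers (superscripts $G$ and $M$), exactly as in the code of Section 1.3.3.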

1.2.3 Advantages and disadvantages of NeuralCF model

The NeuralCF model actually proposes a model framework: on top of the two embedding layers for the user vector and item vector, it uses different interaction layers to cross and combine features, and different interaction layers can be spliced together flexibly. From this we can see the advantage of deep learning in building recommendation models: leveraging the theoretical ability of neural networks to fit arbitrary functions, one can flexibly combine different features and increase or decrease model complexity on demand.

The NeuralCF model also has limitations. Since it is built on the idea of collaborative filtering, it does not introduce other types of features, which undoubtedly wastes valuable information in practical applications.

1.3 NeuralCF model code implementation

The NeuralCF hybrid model integrates the original NeuralCF model with the generalized matrix factorization model that uses the element-wise product as its interaction. Below, GMF, MLP, and the hybrid NeuralCF are implemented in turn.

1.3.1 Implementation of the GMF model

import torch
import torch.nn as nn

class GMF(nn.Module):

    def __init__(self, num_users, num_items, latent_dim):
        super(GMF, self).__init__()
        self.MF_Embedding_User = nn.Embedding(num_embeddings=num_users, embedding_dim=latent_dim)
        self.MF_Embedding_Item = nn.Embedding(num_embeddings=num_items, embedding_dim=latent_dim)

        self.linear = nn.Linear(latent_dim, 1)
        self.sigmoid = nn.Sigmoid()

    def forward(self, inputs):
        # `inputs` is a batch of [user_id, item_id] rows, so always index by column
        # (inputs[:, 0], inputs[:, 1]); inputs[0] would select a single sample instead

        # cast the input to long for the embedding lookup
        inputs = inputs.long()

        # user and item embeddings
        MF_Embedding_User = self.MF_Embedding_User(inputs[:, 0])
        MF_Embedding_Item = self.MF_Embedding_Item(inputs[:, 1])

        # element-wise product of the two latent vectors
        predict_vec = torch.mul(MF_Embedding_User, MF_Embedding_Item)

        # linear scoring layer + sigmoid
        linear = self.linear(predict_vec)
        output = self.sigmoid(linear)

        return output

if __name__ == '__main__':
    # build a GMF model
    model = GMF(num_users=50, num_items=20, latent_dim=10)
    print(model)
    # test data: batch size 1, two features (user_id, item_id)
    x = torch.rand(size=(1, 2), dtype=torch.float32)
    print(model(x))
GMF(
  (MF_Embedding_User): Embedding(50, 10)
  (MF_Embedding_Item): Embedding(20, 10)
  (linear): Linear(in_features=10, out_features=1, bias=True)
  (sigmoid): Sigmoid()
)
tensor([[0.4614]], grad_fn=<SigmoidBackward0>)

1.3.2 Implementation of the MLP model

import torch
import torch.nn as nn
import torch.nn.functional as F


class MLP(nn.Module):

    def __init__(self, num_users, num_items, layers=[20, 64, 32, 16]):
        super(MLP, self).__init__()

        # embedding layers (user and item each get half of the first layer's width)
        self.MLP_Embedding_User = nn.Embedding(num_embeddings=num_users, embedding_dim=layers[0]//2)
        self.MLP_Embedding_Item = nn.Embedding(num_embeddings=num_items, embedding_dim=layers[0]//2)

        # fully connected network
        self.dnn_network = nn.ModuleList(
            [
                nn.Linear(in_size, out_size) for in_size, out_size in zip(layers[:-1], layers[1:])
            ]
        )

        self.linear = nn.Linear(layers[-1], 1)
        self.sigmoid = nn.Sigmoid()

    def forward(self, inputs):
        # `inputs` is a batch of [user_id, item_id] rows, so always index by column

        # cast the input to long for the embedding lookup
        inputs = inputs.long()

        # user and item embeddings
        MLP_Embedding_User = self.MLP_Embedding_User(inputs[:, 0])
        MLP_Embedding_Item = self.MLP_Embedding_Item(inputs[:, 1])

        # concatenate the two latent vectors
        x = torch.cat([MLP_Embedding_User, MLP_Embedding_Item], dim=-1)

        # fully connected network
        for linear in self.dnn_network:
            x = linear(x)
            x = F.relu(x)

        x = self.linear(x)
        output = self.sigmoid(x)

        return output


if __name__ == '__main__':
    # build an MLP model
    net = MLP(num_users=50, num_items=20)
    print(net)
    # test data: batch size 1, two features (user_id, item_id)
    x = torch.rand(size=(1, 2), dtype=torch.float32)
    print(net(x))
MLP(
  (MLP_Embedding_User): Embedding(50, 10)
  (MLP_Embedding_Item): Embedding(20, 10)
  (dnn_network): ModuleList(
    (0): Linear(in_features=20, out_features=64, bias=True)
    (1): Linear(in_features=64, out_features=32, bias=True)
    (2): Linear(in_features=32, out_features=16, bias=True)
  )
  (linear): Linear(in_features=16, out_features=1, bias=True)
  (sigmoid): Sigmoid()
)
tensor([[0.4674]], grad_fn=<SigmoidBackward0>)

1.3.3 Implementation of NeuralCF model

import torch
import torch.nn as nn
import torch.nn.functional as F


class NeuralCF(nn.Module):

    def __init__(self, num_users, num_items, latent_dim, layers=[20, 64, 32, 16]):
        super(NeuralCF, self).__init__()
        # embedding layers of the GMF branch
        self.MF_Embedding_User = nn.Embedding(num_embeddings=num_users, embedding_dim=latent_dim)
        self.MF_Embedding_Item = nn.Embedding(num_embeddings=num_items, embedding_dim=latent_dim)

        # embedding layers of the MLP branch
        self.MLP_Embedding_User = nn.Embedding(num_embeddings=num_users, embedding_dim=layers[0] // 2)
        self.MLP_Embedding_Item = nn.Embedding(num_embeddings=num_items, embedding_dim=layers[0] // 2)

        # fully connected network of the MLP branch
        self.dnn_network = nn.ModuleList(
            [
                nn.Linear(in_size, out_size) for in_size, out_size in zip(layers[:-1], layers[1:])
            ]
        )
        self.linear = nn.Linear(layers[-1], latent_dim)

        # scoring layer applied after merging the two branches
        self.linear2 = nn.Linear(2 * latent_dim, 1)
        self.sigmoid = nn.Sigmoid()

    def forward(self, inputs):

        # 1. the GMF branch
        # `inputs` is a batch of [user_id, item_id] rows, so always index by column

        # cast the input to long for the embedding lookup
        inputs = inputs.long()

        # user and item embeddings of the GMF branch
        MF_Embedding_User = self.MF_Embedding_User(inputs[:, 0])
        MF_Embedding_Item = self.MF_Embedding_Item(inputs[:, 1])
        # element-wise product of the two latent vectors
        mf_vec = torch.mul(MF_Embedding_User, MF_Embedding_Item)

        # 2. the MLP branch
        MLP_Embedding_User = self.MLP_Embedding_User(inputs[:, 0])
        MLP_Embedding_Item = self.MLP_Embedding_Item(inputs[:, 1])
        # concatenate the two latent vectors
        x = torch.cat([MLP_Embedding_User, MLP_Embedding_Item], dim=-1)
        # fully connected network
        for linear in self.dnn_network:
            x = linear(x)
            x = F.relu(x)
        # project to the same dimension as the GMF output
        mlp_vec = self.linear(x)

        # 3. merge the two branches
        vector = torch.cat([mf_vec, mlp_vec], dim=-1)

        # linear scoring layer + sigmoid
        linear = self.linear2(vector)
        output = self.sigmoid(linear)

        return output


if __name__ == '__main__':
    net = NeuralCF(num_users=50, num_items=20, latent_dim=10)
    print(net)
    # test data: batch size 1, two features (user_id, item_id)
    x = torch.rand(size=(1, 2), dtype=torch.float32)
    print(net(x))
NeuralCF(
  (MF_Embedding_User): Embedding(50, 10)
  (MF_Embedding_Item): Embedding(20, 10)
  (MLP_Embedding_User): Embedding(50, 10)
  (MLP_Embedding_Item): Embedding(20, 10)
  (dnn_network): ModuleList(
    (0): Linear(in_features=20, out_features=64, bias=True)
    (1): Linear(in_features=64, out_features=32, bias=True)
    (2): Linear(in_features=32, out_features=16, bias=True)
  )
  (linear): Linear(in_features=16, out_features=10, bias=True)
  (linear2): Linear(in_features=20, out_features=1, bias=True)
  (sigmoid): Sigmoid()
)
tensor([[0.5512]], grad_fn=<SigmoidBackward0>)

2 Application of NeuralCF on the ml-1m movie dataset

The dataset used is a preprocessed version of the MovieLens movie rating dataset.

Download address: https://github.com/hexiangnan/neural_collaborative_filtering

2.1 Data preprocessing

import scipy.sparse as sp
import numpy as np


# load test.rating: one [user, item] pair per line (the test positives)
def load_rating_file_as_list(filename):
    ratingList = []
    with open(filename, "r") as f:
        line = f.readline()
        while line is not None and line != "":
            arr = line.split("\t")
            user, item = int(arr[0]), int(arr[1])
            ratingList.append([user, item])     # [user ID, movie ID]
            line = f.readline()
    return ratingList

# load test.negative: 99 sampled negative items per test instance
def load_negative_file(filename):
    negativeList = []
    with open(filename, "r") as f:
        line = f.readline()
        while line is not None and line != "":
            arr = line.split("\t")
            negatives = []
            for x in arr[1:]:
                negatives.append(int(x))
            negativeList.append(negatives)
            line = f.readline()
    return negativeList


def load_rating_file_as_matrix(filename):
    """
    Read .rating file and Return dok matrix.
    The first line of .rating file is: num_users\t num_items
    """
    # Get number of users and items
    num_users, num_items = 0, 0   # track the max user/item IDs to size the sparse matrix
    with open(filename, "r") as f:
        line = f.readline()
        while line is not None and line != "":
            arr = line.split("\t")
            u, i = int(arr[0]), int(arr[1])
            num_users = max(num_users, u)
            num_items = max(num_items, i)
            line = f.readline()
    # Construct matrix
    # dok_matrix builds a sparse matrix incrementally; storage stays sparse (toarray() gives a dense view)
    mat = sp.dok_matrix((num_users + 1, num_items + 1), dtype=np.float32)
    with open(filename, "r") as f:
        line = f.readline()
        while line is not None and line != "":
            arr = line.split("\t")
            user, item, rating = int(arr[0]), int(arr[1]), float(arr[2])
            if rating > 0:
                mat[user, item] = 1.0
            line = f.readline()
    return mat   # 0/1 matrix: 1 if the user rated the item, 0 otherwise


class Dataset():
    def __init__(self, path):
        #  Convert the [user, movie, rating] data into a 0/1 sparse matrix: 1 if the user rated the movie, 0 otherwise.
        #  The matrix has one row per user and one column per item; a 1 marks a movie the user is interested in.
        self.trainMatrix = load_rating_file_as_matrix(path + '.train.rating')

        # test-set positives
        # the test data [user ID, movie ID] is wrapped into a list
        # 6040 elements, each of the form (userID, itemID)
        self.testRatings = load_rating_file_as_list(path + '.test.rating')

        # test-set negatives
        # 6040 elements, each of length 99, aligned with testRatings: each user has 1 positive and 99 negatives
        self.testNegatives = load_negative_file(path + '.test.negative')
        assert len(self.testRatings) == len(self.testNegatives)

    def Getdataset(self):
        return (self.trainMatrix, self.testRatings, self.testNegatives)



if __name__ == '__main__':
    # load the raw data and process it
    '''
Processed movie dataset: MovieLens 1 Million (ml-1m)
Dataset address: https://github.com/hexiangnan/neural_collaborative_filtering

train.rating:

Train file.
Each line is a training instance: userID\t itemID\t rating\t timestamp (if have)

test.rating:

Test file (positive instances).
Each line is a testing instance: userID\t itemID\t rating\t timestamp (if have)

test.negative:

Test file (negative instances).
Each line corresponds to the line of test.rating, containing 99 negative samples.
Each line is in the format: (userID,itemID)\t negativeItemID1\t negativeItemID2 ...
    '''
    path = 'Data/ml-1m'
    dataset = Dataset(path)
    train, testRatings, testNegatives = dataset.Getdataset()

    for item in testNegatives:
        print(len(item))
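
For intuition: a dok_matrix behaves like a dictionary keyed by (row, col) pairs, which is exactly how the training-data construction in Section 2.2 iterates over it (train.keys() and the (u, j) in train membership test). A tiny standalone illustration, not part of the pipeline:

import numpy as np
import scipy.sparse as sp

m = sp.dok_matrix((2, 3), dtype=np.float32)
m[0, 1] = 1.0
print(list(m.keys()))   # [(0, 1)] -- only the stored (user, item) entries
print((0, 1) in m)      # True -- the membership test used during negative sampling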

2.2 Load data

import os
os.environ['KMP_DUPLICATE_LIB_OK']="TRUE"

import torch
from torch.utils.data import DataLoader, Dataset, TensorDataset

import numpy as np
import torch.nn as nn


from torchkeras import summary

import warnings
warnings.filterwarnings('ignore')

device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
from MovieDataSet import Dataset

# 1. load the preprocessed data
path = 'Data/ml-1m'
dataset = Dataset(path)
train, testRatings, testNegatives = dataset.Getdataset()
num_users, num_items = train.shape
num_users, num_items # (6040, 3706)

# build the training data: every rated (user, item) pair is a positive sample;
# negatives are drawn by sampling items the user has not rated
def get_train_instances(train, num_negatives):
    user_input, item_input, labels = [], [], []
    num_items = train.shape[1]
    for (u, i) in train.keys():  # train.keys() holds the rated (user, item) pairs
        # positive instance
        user_input.append(u)
        item_input.append(i)
        labels.append(1)

        # negative instance
        for t in range(num_negatives):
            j = np.random.randint(num_items)
            while (u, j) in train:
                j = np.random.randint(num_items)
            #print(u, j)
            user_input.append(u)
            item_input.append(j)
            labels.append(0)
    return user_input, item_input, labels
def get_train(train, num_negatives=4, batch_size=64):
    user_input, item_input, labels = get_train_instances(train, num_negatives)
    train_x = np.vstack([user_input, item_input]).T
    labels = np.array(labels)
    # wrap everything into a TensorDataset and DataLoader
    train_dataset = TensorDataset(torch.tensor(train_x), torch.tensor(labels).float())
    dl_train = DataLoader(train_dataset, batch_size=batch_size, shuffle=True)

    return dl_train

dl_train = get_train(train)
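
As a quick sanity check (a minimal sketch; the printed shapes assume the default batch_size=64), one batch can be inspected:

features, labels = next(iter(dl_train))
print(features.shape, labels.shape)   # torch.Size([64, 2]) torch.Size([64])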

2.3 Create a model

# 2. build the model
from _03_NeuralCF import NeuralCF

num_factors = 8
layers = [num_factors*2, 64, 32, 16]
net = NeuralCF(num_users, num_items, num_factors, layers)

summary(net, input_shape=(2,))
==========================================================================
Layer (type)                            Output Shape              Param #
==========================================================================
Embedding-1                                  [-1, 8]               48,320
Embedding-2                                  [-1, 8]               29,648
Embedding-3                                  [-1, 8]               48,320
Embedding-4                                  [-1, 8]               29,648
Linear-5                                    [-1, 64]                1,088
Linear-6                                    [-1, 32]                2,080
Linear-7                                    [-1, 16]                  528
Linear-8                                     [-1, 8]                  136
Linear-9                                     [-1, 1]                   17
Sigmoid-10                                   [-1, 1]                    0
==========================================================================
Total params: 159,785
Trainable params: 159,785
Non-trainable params: 0
--------------------------------------------------------------------------
Input size (MB): 0.000008
Forward/backward pass size (MB): 0.001175
Params size (MB): 0.609531
Estimated Total Size (MB): 0.610714
--------------------------------------------------------------------------

2.4 Model evaluation function

import torch
import numpy as np
import heapq


# Global variables that are shared across processes
_model = None
_testRatings = None
_testNegatives = None
_K = None

# Hit Ratio
# returns 1 if the positive item (gtItem) appears in the top-K ranked list, else 0
def getHitRatio(ranklist, gtItem):
    for item in ranklist:
        if item == gtItem:
            return 1
    return 0

# NDCG
'''
When we issue a retrieval/recommendation query, the system returns a ranked list of items.
The list might be [A,B,C,G,D,E,F] or [C,F,A,E,D]; the question is how to judge which list is better.

NDCG evaluates the quality of a ranking and is common in search and recommendation tasks.
With a single positive item at 0-indexed position i, it reduces to log(2) / log(i + 2) = 1 / log2(i + 2).
For details, see:
  https://zhuanlan.zhihu.com/p/448686098
'''
def getNDCG(ranklist, gtItem):
    for i in range(len(ranklist)):
        item = ranklist[i]
        if item == gtItem:
            return np.log(2) / np.log( i + 2)
    return 0


def try_gpu(i=0):
    if torch.cuda.device_count() >= i + 1:
        return torch.device(f'cuda:{i}')
    return torch.device('cpu')



def eval_one_rating(idx):   # score one test instance (1 positive + 99 negatives)

    device = try_gpu()

    rating = _testRatings[idx]
    items = _testNegatives[idx]
    u = rating[0]
    gtItem = rating[1]
    # append the positive item, giving 1 positive and 99 negatives in the list
    items.append(gtItem)

    # Get prediction scores
    map_item_score = {}
    users = np.full(len(items), u, dtype='int32')

    test_data = torch.tensor(np.vstack([users, np.array(items)]).T).to(device)
    predictions = _model(test_data)
    for i in range(len(items)):
        item = items[i]
        map_item_score[item] = predictions[i].data.cpu().numpy()[0]
    items.pop()

    # Evaluate top rank list
    ranklist = heapq.nlargest(_K, map_item_score, key=lambda k: map_item_score[k])  # heap-based selection of the top-K items by score
    hr = getHitRatio(ranklist, gtItem)
    ndcg = getNDCG(ranklist, gtItem)
    return hr, ndcg

def evaluate_model(model, testRatings, testNegatives, K):
    """
    Evaluate the performance (Hit_Ratio, NDCG) of top-K recommendation
    Return: score of each test rating.
    """
    global _model
    global _testRatings
    global _testNegatives
    global _K

    _model = model
    _testNegatives = testNegatives
    _testRatings = testRatings
    _K = K

    hits, ndcgs = [], []
    for idx in range(len(_testRatings)):
        (hr, ndcg) = eval_one_rating(idx)
        hits.append(hr)
        ndcgs.append(ndcg)
    return hits, ndcgs




if __name__ == '__main__':
    from MovieDataSet import Dataset
    from _01_GMF import GMF


    path = 'Data/ml-1m'
    dataset = Dataset(path)
    train, testRatings, testNegatives = dataset.Getdataset()

    model = GMF(num_users=train.shape[0],num_items=train.shape[1],latent_dim=10)

    hits, ndcgs = evaluate_model(model,testRatings,testNegatives,10)
    print(hits)
    print(ndcgs)
    # initial (untrained) scores
    hr, ndcg = np.array(hits).mean(), np.array(ndcgs).mean()
    print('Init: HR=%.4f, NDCG=%.4f' % (hr, ndcg))
# first import the model evaluation function
from _01_model_evalute import evaluate_model

topK = 10
# compute the initial evaluation of the untrained model
(hits, ndcgs) = evaluate_model(net, testRatings, testNegatives, topK)

hr, ndcg = np.array(hits).mean(), np.array(ndcgs).mean()
print('Init: HR=%.4f, NDCG=%.4f' %(hr, ndcg))
Init: HR=0.1030, NDCG=0.0463

2.5 Model training

# the Animator and Timer helper classes are described at:
# https://blog.csdn.net/qq_44665283/article/details/130598697?spm=1001.2014.3001.5502
from AnimatorClass import Animator
from TimerClass import Timer


def train_ch(net, dl_train, testRatings, testNegatives, num_epochs=10, lr=0.001, topK=10):
    print('training on', device)
    net.to(device)

    # model training
    best_hr, best_ndcg, best_iter = 0, 0, -1
    log_step_freq = 10000

    loss_func = nn.BCELoss()
    optimizer = torch.optim.Adam(params=net.parameters(), lr=lr)
    # animated training curves
    animator = Animator(xlabel='epoch', xlim=[1, num_epochs],legend=['train loss', 'test hr', 'test ndcg'],figsize=(8.0, 6.0))
    timer, num_batches = Timer(), len(dl_train)

    for epoch in range(num_epochs):
        # training phase
        net.train()
        loss_sum = 0.0
        for step, (features, labels) in enumerate(dl_train, 1):
            timer.start()
            features, labels = features.to(device), labels.to(device)
            # zero the gradients
            optimizer.zero_grad()
            # forward pass
            predictions = net(features)
            loss = loss_func(predictions, labels.unsqueeze(1))
            # backward pass to compute gradients
            loss.backward()
            optimizer.step()
            timer.stop()

            loss_sum += loss.item()

            if step % log_step_freq == 0:
                animator.add(epoch + step / num_batches,(loss_sum/step,None, None))


        # evaluation phase
        net.eval()
        (hits, ndcgs) = evaluate_model(net, testRatings, testNegatives, topK)
        hr, ndcg = np.array(hits).mean(), np.array(ndcgs).mean()
        animator.add(epoch + 1 ,(None, hr, ndcg))

        if hr > best_hr:
            best_hr, best_ndcg, best_iter = hr, ndcg, epoch
            torch.save(net.state_dict(), 'Pre_train/m1-1m_NeuralCF.pkl')

        info = (epoch, loss_sum/step, hr, ndcg)
        print(("\nEPOCH = %d, loss = %.3f, hr = %.3f, ndcg = %.3f") % info)

    print(f'{num_batches * num_epochs / timer.sum():.1f} examples/sec on {str(device)}')
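
With everything above in place, training is launched with a single call (a minimal sketch using the defaults defined in train_ch):

train_ch(net, dl_train, testRatings, testNegatives, num_epochs=10, lr=0.001, topK=10)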

Note:

The NeuralCF model can also be pre-trained: initialize NeuralCF directly with parameters already trained for GMF and MLP, and then continue training. This exercises one's grasp of how to access the parameters of each layer of the structure. The steps are as follows:

  • First, build the GMF and MLP models and load their saved parameters (a minimal sketch follows this list).
  • Then build the NeuralCF model, locate the corresponding layers, and copy the corresponding parameters into them.
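
A minimal sketch of the first step, assuming the GMF and MLP weights were saved as state dicts; the checkpoint paths here are hypothetical placeholders:

# hypothetical checkpoint paths -- adjust to wherever your pretrained weights were saved
gmf = GMF(num_users, num_items, latent_dim=8)
gmf.load_state_dict(torch.load('Pre_train/ml-1m_GMF.pkl'))

mlp = MLP(num_users, num_items, layers=[16, 64, 32, 16])
mlp.load_state_dict(torch.load('Pre_train/ml-1m_MLP.pkl'))

neural_mf = NeuralCF(num_users, num_items, latent_dim=8, layers=[16, 64, 32, 16])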

The key steps are as follows:

# take a copy of NeuralCF's current parameters
old_param = neural_mf.state_dict()

# overwrite the embedding weights with the pretrained GMF/MLP embeddings
old_param['MF_Embedding_User.weight'] = gmf.state_dict().get('MF_Embedding_User.weight')
old_param['MF_Embedding_Item.weight'] = gmf.state_dict().get('MF_Embedding_Item.weight')
old_param['MLP_Embedding_User.weight'] = mlp.state_dict().get('MLP_Embedding_User.weight')
old_param['MLP_Embedding_Item.weight'] = mlp.state_dict().get('MLP_Embedding_Item.weight')

# copy the weights and biases of the three fully connected layers
for i in range(3):
    old_param['dnn_network.' + str(i) + '.weight'] = mlp.state_dict().get('dnn_network.' + str(i) + '.weight')
    old_param['dnn_network.' + str(i) + '.bias'] = mlp.state_dict().get('dnn_network.' + str(i) + '.bias')

# load the merged parameters back into NeuralCF
neural_mf.load_state_dict(old_param)
