Preface
This project applies a matrix decomposition algorithm to the play histories of game players. Its goal is to select, from a large catalogue, the games best suited to each player, producing a reasonably accurate game recommendation system.
First, the project collects and analyzes data on the games that players have played, including game name, play time, game ratings and other information. These data form a large user-game interaction matrix, in which each row represents a player, each column represents a game, and each value represents the interaction between that player and that game.
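The construction of the user-game interaction matrix can be sketched as follows. This is a minimal illustration, not the project's actual preprocessing code: the records are made up, and the names `user2idx` and `game2idx` are assumptions (only `idx2user` and `idx2game` appear later in this article).

```python
import numpy as np

# Hypothetical (user_id, game_title, hours_played) records, for illustration only.
records = [
    ("u1", "Game A", 10.0),
    ("u1", "Game B", 2.5),
    ("u2", "Game B", 40.0),
    ("u3", "Game C", 1.0),
]

# Map raw ids to consecutive row/column indices, and back again.
user2idx = {u: i for i, u in enumerate(sorted({r[0] for r in records}))}
game2idx = {g: j for j, g in enumerate(sorted({r[1] for r in records}))}
idx2user = {i: u for u, i in user2idx.items()}
idx2game = {j: g for g, j in game2idx.items()}

# Rows are players, columns are games; 0 means "no interaction observed".
matrix = np.zeros((len(user2idx), len(game2idx)))
for user, game, hours in records:
    matrix[user2idx[user], game2idx[game]] = hours

# Fraction of empty cells; real interaction matrices are mostly empty.
sparsity = 1.0 - np.count_nonzero(matrix) / matrix.size
print(matrix)
print("sparsity: {:.2f}".format(sparsity))
```

A dense array is used here only because the toy example is tiny; for a full-size dataset a sparse representation would likely be more practical.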
Next, the project uses a matrix decomposition algorithm to approximately replace the sparse user-game matrix with two small matrices - the feature-game matrix and the user-feature matrix. This decomposition process maps players and games into a latent feature space, allowing potential relationships between players and games to be inferred.
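To make the size reduction concrete, the sketch below compares the number of cells in the full user-game matrix with the number of parameters in the two small matrices. The user and game counts are the ones quoted for the Kaggle dataset in this article; the latent dimension `n_features = 20` is an assumed value chosen only for illustration.

```python
import numpy as np

n_users, n_games = 12393, 5155   # counts quoted for the dataset in this article
n_features = 20                  # assumed latent dimension, illustration only

full_entries = n_users * n_games                   # cells in the full matrix
factor_params = n_features * (n_users + n_games)   # cells in the two factors
print(full_entries, factor_params)

# Toy factors: multiplying them yields a score for every player-game pair.
rng = np.random.default_rng(0)
user_feature = rng.normal(size=(4, 3))   # 4 players x 3 latent features
feature_game = rng.normal(size=(3, 5))   # 3 latent features x 5 games
scores = user_feature @ feature_game     # 4 players x 5 games
print(scores.shape)
```

Each predicted score is the dot product of one player's feature row and one game's feature column, which is what places players and games in the same latent space.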
Once the model is trained, the system can predict the games a player might like based on their gaming history. This prediction is based on how similar the player is to other players and how similar the game is to other games. As a result, the system can provide personalized game recommendations for each player, taking into account their gaming preferences and historical behavior.
Overall, the goal of this project is to provide a more accurate game recommendation system through matrix decomposition and latent factor models. This kind of personalized recommendation can improve players' gaming experience, and also help gaming platforms provide better game promotion and increase user stickiness.
Overall design
This part includes the overall system structure diagram and system flow chart.
Overall system structure diagram
The overall structure of the system is shown in the figure.
System flow chart
The system flow is shown in the figure.
Operating environment
This part includes the Python environment, TensorFlow environment, and PyQt5 environment.
See the blog for details: https://blog.csdn.net/qq_31136513/article/details/133148686#_38
Module implementation
This project includes 4 modules: data preprocessing, model construction, model training and saving, and model application. The function and related code of each module are described below.
1. Data preprocessing
The dataset (steam-video-games) comes from Kaggle, and the link address is https://www.kaggle.com/tamber/steam-video-games . Each record includes the user's ID, the game name, the behavior (purchase or play), and the play time. In total, the dataset contains 12,393 users and 5,155 games.
Place the dataset in a folder under the Jupyter working directory.
See the blog for details: https://blog.csdn.net/qq_31136513/article/details/133148686#1__97
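As a rough sketch of the loading step, the snippet below parses rows into purchase and play records using only the standard library. The column order (user id, game title, behavior, value) and the absence of a header row are assumptions about the file layout, and the sample rows are made up; check them against the downloaded file before relying on this.

```python
import csv
import io
from collections import defaultdict

# Made-up rows in the assumed layout: user id, game title, behavior, value.
sample_csv = io.StringIO(
    '1001,"Game A",purchase,1.0\n'
    '1001,"Game A",play,12.5\n'
    '1002,"Game B",purchase,1.0\n'
)

purchased = defaultdict(set)       # user id -> set of purchased titles
hours_played = defaultdict(dict)   # user id -> {title: hours played}

for row in csv.reader(sample_csv):
    # Keep the first four fields and ignore any extra trailing columns.
    user_id, title, behavior, value = row[:4]
    if behavior == "purchase":
        purchased[user_id].add(title)
    elif behavior == "play":
        hours_played[user_id][title] = float(value)

print(dict(purchased))
print(dict(hours_played))
```

To read the real file, replace `sample_csv` with an opened file handle, for example `open(path, newline="", encoding="utf-8")`, where `path` points to the dataset in the working directory.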
2. Model construction
After the data is loaded into the model, the model structure needs to be defined and the loss function optimized.
1) Define the model structure
Using the matrix decomposition algorithm, the user-game sparse matrix is approximately replaced by two small matrices - the feature-game matrix and the user-feature matrix.
See the blog for details: https://blog.csdn.net/qq_31136513/article/details/133151049#1_54
2) Optimize the loss function
An L2 norm penalty is often added to the loss function of matrix factorization algorithms. This project therefore also adds the L2 norm to its loss function to avoid overfitting. The model parameters are optimized with the Adagrad optimizer.
See the blog for details: https://blog.csdn.net/qq_31136513/article/details/133151049#2_91
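The idea of an L2-regularized loss minimized with Adagrad can be sketched in plain NumPy. This is not the project's TensorFlow code: the toy matrix, the hyperparameters (`n_features`, `lam`, `lr`, `eps`) and the hand-written Adagrad update are all assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy interaction matrix (0 = unobserved); values are illustrative only.
R = np.array([[5.0, 3.0, 0.0],
              [4.0, 0.0, 1.0],
              [0.0, 1.0, 5.0]])
mask = R != 0   # only observed cells contribute to the error

n_features, lam, lr, eps = 2, 0.01, 0.1, 1e-8   # assumed hyperparameters
U = rng.normal(scale=0.1, size=(R.shape[0], n_features))   # user-feature matrix
V = rng.normal(scale=0.1, size=(n_features, R.shape[1]))   # feature-game matrix
accum_U = np.zeros_like(U)   # Adagrad running sums of squared gradients
accum_V = np.zeros_like(V)

def loss(U, V):
    """Squared error on observed cells plus an L2 penalty on both factors."""
    err = (R - U @ V) * mask
    return np.sum(err ** 2) + lam * (np.sum(U ** 2) + np.sum(V ** 2))

initial_loss = loss(U, V)
for _ in range(500):
    err = (R - U @ V) * mask
    grad_U = -2 * err @ V.T + 2 * lam * U
    grad_V = -2 * U.T @ err + 2 * lam * V
    # Adagrad: scale each step by the root of the accumulated squared gradients.
    accum_U += grad_U ** 2
    accum_V += grad_V ** 2
    U -= lr * grad_U / (np.sqrt(accum_U) + eps)
    V -= lr * grad_V / (np.sqrt(accum_V) + eps)
final_loss = loss(U, V)

print("loss before: {:.3f}, after: {:.3f}".format(initial_loss, final_loss))
```

The L2 term keeps the entries of both factor matrices small, which discourages the model from fitting the few observed cells too closely. Adagrad gives each parameter its own step size that shrinks as that parameter accumulates gradient.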
3. Model training and saving
The dataset used in this project lists a game's DLC (downloadable content) as a separate game. Therefore, when calculating accuracy, a DLC and its base game are treated as the same game, and games in the same series can also be treated as the same game.
See the blog for details: https://blog.csdn.net/qq_31136513/article/details/133151049#3__105
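One possible way to apply this rule is sketched below. The matching heuristic (two titles count as the same game when one title is a prefix of the other) is an assumption, and `same_game` and `precision_at_k_titles` are hypothetical names, not the functions used in the project.

```python
def same_game(title_a, title_b):
    """Assumed heuristic: titles match if one is a prefix of the other."""
    a, b = title_a.lower().strip(), title_b.lower().strip()
    return a.startswith(b) or b.startswith(a)


def precision_at_k_titles(recommended, purchased, k=None):
    """Fraction of the top-k recommended titles that match a purchased title."""
    top = list(recommended) if k is None else list(recommended)[:k]
    if not top:
        return 0.0
    hits = sum(any(same_game(rec, own) for own in purchased) for rec in top)
    return hits / len(top)


# Made-up titles: a DLC, an unrelated game, and a sequel in the same series.
recommended = ["Game X - Expansion Pack", "Game Y", "Game Z 2"]
purchased = ["Game X", "Game Z"]
print(precision_at_k_titles(recommended, purchased))
```

A prefix test is crude: it would also match unrelated games whose titles merely share a beginning, so a curated mapping from DLC to base game would be more reliable where one is available.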
1) Model training
See the blog for details: https://blog.csdn.net/qq_31136513/article/details/133151049#1_148
2) Model saving
In order to facilitate the use of the model, the training results need to be saved using Joblib.
See the blog for details: https://blog.csdn.net/qq_31136513/article/details/133151049#2_187
4. Model application
This module has two parts. The first lays out the page, then obtains and checks the input data. The second passes the checked data to the previously saved model to produce the recommendations.
1) Make a page
See the blog for details: https://blog.csdn.net/qq_31136513/article/details/133151109#1_80
2) Model import and call
See the blog for details: https://blog.csdn.net/qq_31136513/article/details/133151109#2_290
3) Model application code
See the blog for details: https://blog.csdn.net/qq_31136513/article/details/133151109#3_341
System test
This part includes training accuracy, test results and model application.
1. Training accuracy
The accuracy on the training set reaches more than 81%, as shown in the figure.
2. Test results
Feed the test data into the model, then use the accuracy calculation function from the steps above to compare the recommended games with the games actually purchased.
The relevant code is as follows:
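import numpy as np

# test_users_idx, rec, train_matrix, test_matrix, idx2user, idx2game, k and
# precision_at_k are assumed to be defined in the earlier steps.
n_examples = 5
# Pick a few test users at random, without repetition.
users = np.random.choice(test_users_idx, size=n_examples, replace=False)
# For each user, sort game indices by predicted score, highest first.
rec_games = np.argsort(-rec)
for user in users:
    # Games the user already has in the training data.
    purchase_history = np.where(train_matrix[user, :] != 0)[0]
    recommendations = rec_games[user, :]
    # Drop games already owned and keep the top k (np.isin replaces the deprecated np.in1d).
    new_recommendations = recommendations[~np.isin(recommendations, purchase_history)][:k]
    # Games the user actually purchased in the test data.
    actual_purchases = np.where(test_matrix[user, :] != 0)[0]
    print('Games recommended for the player with id {0}: '.format(idx2user[user]))
    print(','.join([idx2game[game] for game in new_recommendations]))
    print('Games the player actually purchased: ')
    print(','.join([idx2game[game] for game in actual_purchases]))
    precision = 100 * precision_at_k(new_recommendations, actual_purchases)
    print('Precision: {:.2f}%'.format(precision))
    print('\n')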
The test set output results are shown in the figure.
3. Model application
This section includes program usage instructions and test results.
1) Program usage instructions
Open the program and the initial interface is as shown in the figure.
The interface has 5 drop-down input boxes and 6 buttons. Type a game name or choose one from the drop-down options, then click the "Please enter the game time" button, as shown in the figure.
If the corresponding game name is entered correctly, you can enter the game time, as shown in the figure.
If it is incorrect, a dialog box will pop up asking for correct input, as shown in the figure.
When all data is entered correctly, click the “推荐开始” ("Start recommendation") button and a dialog box will pop up giving the recommended games, as shown in the figure.
If there is data that has not been entered, a dialog box will pop up, as shown in the figure.
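The input checks described above can be sketched independently of the GUI. The function below is a hypothetical helper, not the project's PyQt5 code: `validate_inputs`, its messages, and the rule that play time must be a positive number are all assumptions.

```python
def validate_inputs(entries, known_games):
    """Check a list of (game_name, hours_text) pairs; return (ok, message)."""
    for position, (name, hours_text) in enumerate(entries, start=1):
        name, hours_text = name.strip(), hours_text.strip()
        # Missing data: either field left empty.
        if not name or not hours_text:
            return False, "Entry {}: game and play time are both required.".format(position)
        # The game name must match a game the model knows about.
        if name not in known_games:
            return False, "Entry {}: unknown game '{}'.".format(position, name)
        # The play time must parse as a positive number (assumed rule).
        try:
            hours = float(hours_text)
        except ValueError:
            return False, "Entry {}: play time must be a number.".format(position)
        if hours <= 0:
            return False, "Entry {}: play time must be positive.".format(position)
    return True, "ok"


known_games = {"Game A", "Game B"}   # made-up catalogue
print(validate_inputs([("Game A", "3.5"), ("Game B", "10")], known_games))
print(validate_inputs([("Game Q", "3.5")], known_games))
```

In a GUI, the returned message would be shown in the pop-up dialog box, and the recommendation step would run only when the first element of the result is `True`.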
2) Test results
The test results are shown in the figure.
Project source code download
For details, please see my blog resource download page