Python: a linear regression model to predict PUBG (PlayerUnknown's Battlegrounds) player rankings

Copyright notice: if you have questions about this article, contact the author on WeChat (kxymxzs); messages are welcome. https://blog.csdn.net/MG_ApinG/article/details/86366684

Competition website: the Kaggle PUBG dataset

Competition description: in a PUBG match, up to 100 players start each game. Players can be on teams, and a team's final ranking depends on how many other teams are still alive when it is eliminated. During a match, players can pick up different kinds of ammunition, revive teammates who have been knocked down but not killed, drive vehicles, swim, run, shoot, and so on.

The dataset provided on the competition website contains a large number of anonymized PUBG match statistics, formatted so that each row holds one player's statistics for one match. The data comes from all match types: solo, duo, squad, and custom (a match is not guaranteed to have 100 players, nor a team to have at most four players).
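To check the one-row-per-player layout described above, it can help to count players per match and the largest team in each match. The sketch below uses a tiny hand-made table rather than the real file; the column names `matchId`, `groupId`, and `matchType` are assumed to mirror the Kaggle file, and the values are invented for illustration.

```python
import pandas as pd

# Hypothetical sample: one row per player, as in the description above.
sample = pd.DataFrame({
    "matchId": ["m1", "m1", "m1", "m2", "m2"],
    "groupId": ["g1", "g1", "g2", "g3", "g4"],
    "matchType": ["duo", "duo", "duo", "solo", "solo"],
    "kills": [2, 0, 1, 3, 0],
})

# Number of players (rows) in each match.
players_per_match = sample.groupby("matchId").size()

# Size of every team, then the largest team within each match.
team_sizes = sample.groupby(["matchId", "groupId"]).size()
max_team_size = team_sizes.groupby(level="matchId").max()

print(players_per_match)
print(max_team_size)
```

Running the same two group-bys on the real data would show directly which matches have fewer than 100 players or unusually large teams.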

The competition requires building a model that predicts each player's final placement from their match statistics, on a scale from 1 (first place) to 0 (last place).
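Because the target is bounded between 0 and 1 while an ordinary linear regression is unbounded, some raw predictions may fall outside the valid range. A minimal sketch of one way to handle this is to clip the predictions; the helper name `clip_placement` is hypothetical and is not part of the original script.

```python
import numpy as np

def clip_placement(pred):
    """Clamp predicted placements into the valid [0, 1] range."""
    return np.clip(np.asarray(pred, dtype=float), 0.0, 1.0)

# Hypothetical raw model outputs, two of them outside the valid range.
raw = [-0.12, 0.35, 1.07]
print(clip_placement(raw))
```

Clipping never moves a prediction further from a target that itself lies in [0, 1], so it cannot increase the squared error.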

Tools: Python 3

import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
from sklearn import metrics
from sklearn.model_selection import cross_val_predict
import matplotlib.pyplot as plt

filePath01 = r'F://temp_data/train_V2.csv'
data01 = pd.read_csv(filePath01)
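print(data01.head())  # show the first five rows; use data01.tail() for the last five
print(data01.shape)  # show the dimensions of the data (rows, columns)
# NOTE: 'rank' is the target column name used in this article; in Kaggle's train_V2.csv
# the placement column is named 'winPlacePerc', so adjust the name to match your file.
data01 = data01.dropna(subset=['rank'])  # drop rows with a missing target
# Keep numeric features only: LinearRegression cannot fit string columns such as IDs.
X = data01.drop(['rank'], axis=1).select_dtypes(include='number')
y = data01['rank']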
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=2333)

linreg = LinearRegression()
linreg.fit(X_train, y_train)
# Inspect the fitted intercept and coefficients
print(linreg.intercept_)
print(linreg.coef_)

# Predict on the test set
y_pred = linreg.predict(X_test)

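# Compute the mean squared error (MSE) with scikit-learn
print("MSE:", metrics.mean_squared_error(y_test, y_pred))
# Compute the root mean squared error (RMSE) with scikit-learn
print("RMSE:", np.sqrt(metrics.mean_squared_error(y_test, y_pred)))
# With MSE or RMSE in hand, if another method yields different coefficients and we
# need to choose between models, pick the parameters that give the smaller MSE.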

# Use cross-validation for a more robust evaluation; 10-fold is used here (cv=10)
predicted = cross_val_predict(linreg, X, y, cv=10)
# Compute the MSE with scikit-learn
print("MSE:", metrics.mean_squared_error(y, predicted))
# Compute the RMSE with scikit-learn
print("RMSE:", np.sqrt(metrics.mean_squared_error(y, predicted)))

# Plot measured against predicted values to see how they relate
fig, ax = plt.subplots()
ax.scatter(y, predicted)
ax.plot([y.min(), y.max()], [y.min(), y.max()], 'k--', lw=4)
ax.set_xlabel('Measured')
ax.set_ylabel('Predicted')
plt.show()
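
The script's comments suggest choosing between models by their MSE. A minimal sketch of that comparison, using cross-validated RMSE on synthetic data, is shown below. The helper `cv_rmse`, the synthetic arrays, and the `Ridge` model are illustrative assumptions and are not part of the original script; `Ridge` simply stands in for "another method".

```python
import numpy as np
from sklearn.linear_model import LinearRegression, Ridge
from sklearn.model_selection import cross_val_score

# Synthetic stand-in data (not the PUBG dataset): a linear signal plus noise.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(200, 3))
y_demo = X_demo @ np.array([0.5, -0.2, 0.1]) + rng.normal(scale=0.05, size=200)

def cv_rmse(model, X, y, cv=10):
    """Return the RMSE derived from the mean cross-validated MSE."""
    # scikit-learn reports negated MSE so that larger scores are always better.
    neg_mse = cross_val_score(model, X, y, cv=cv, scoring="neg_mean_squared_error")
    return float(np.sqrt(-neg_mse.mean()))

candidates = {"linear": LinearRegression(), "ridge": Ridge(alpha=1.0)}
scores = {name: cv_rmse(model, X_demo, y_demo) for name, model in candidates.items()}

# Keep the candidate with the smallest cross-validated error.
best = min(scores, key=scores.get)
print(scores, best)
```

On the real data, `X_demo` and `y_demo` would be replaced by the `X` and `y` built earlier in the script.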

