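1. Data processing
1. Load the data and inspect its details.
2. Check whether the data contains any NaN entries.
3. Fill in the missing entries.
4. Split the data into training data and test data.
5. Normalize the data.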
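The five steps above can be sketched as follows. This is a minimal illustration, not the original pipeline: the toy DataFrame, the column names `target`, `f1`, `f2`, the forward-fill strategy, and the 80/20 split are all assumptions made for the example.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Hypothetical toy data standing in for the real spreadsheet.
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.normal(size=(20, 3)), columns=["target", "f1", "f2"])
df.iloc[3, 1] = np.nan  # inject two gaps so there is something to fill
df.iloc[7, 2] = np.nan

# 1. Load the data and inspect it.
print(df.head())
print(df.describe())

# 2. Count NaN entries per column.
nan_counts = df.isna().sum()
print(nan_counts)

# 3. Fill the gaps (forward fill: copy the previous row's value).
df_filled = df.ffill()

# 4. Split into training and test data.
x = df_filled[["f1", "f2"]].to_numpy()
y = df_filled["target"].to_numpy()
train_x, test_x, train_y, test_y = train_test_split(
    x, y, test_size=0.2, random_state=0
)

# 5. Standardize the features; fit the scaler on the training data only.
scaler = StandardScaler()
train_x = scaler.fit_transform(train_x)
test_x = scaler.transform(test_x)
```

Fitting the scaler on the training split and only applying it to the test split keeps test-set statistics from leaking into training.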
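2. Model prediction
Multilayer perceptron regression
Ensemble regression
Linear regression
SVM regression
KNN regression
Decision tree regression
Random forest regression
AdaBoost regression
GBRT regression
Bagging regression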
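One way to compare the regressors listed above is to fit each on the same split and print its R^2 score. The sketch below uses synthetic data from `make_regression`; the data and every hyperparameter except the MLP settings are assumptions for illustration, and "ensemble regression" is taken here to mean gradient boosting.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import (AdaBoostRegressor, BaggingRegressor,
                              GradientBoostingRegressor, RandomForestRegressor)
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsRegressor
from sklearn.neural_network import MLPRegressor
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR
from sklearn.tree import DecisionTreeRegressor

# Synthetic stand-in for the real dataset (assumption).
x, y = make_regression(n_samples=200, n_features=5, noise=10.0, random_state=0)
train_x, test_x, train_y, test_y = train_test_split(
    x, y, test_size=0.25, random_state=0
)
scaler = StandardScaler()
train_x = scaler.fit_transform(train_x)
test_x = scaler.transform(test_x)

models = {
    "MLP": MLPRegressor(solver="lbfgs", hidden_layer_sizes=(20, 20, 20),
                        random_state=1, max_iter=500),
    "linear regression": LinearRegression(),
    "SVM": SVR(),
    "KNN": KNeighborsRegressor(),
    "decision tree": DecisionTreeRegressor(random_state=0),
    "random forest": RandomForestRegressor(n_estimators=20, random_state=0),
    "AdaBoost": AdaBoostRegressor(random_state=0),
    "GBRT": GradientBoostingRegressor(learning_rate=0.1, random_state=0),
    "bagging": BaggingRegressor(random_state=0),
}

# Fit every model on the same split and record its R^2 on the test data.
scores = {}
for name, model in models.items():
    model.fit(train_x, train_y)
    scores[name] = model.score(test_x, test_y)
    print(f"{name}: R^2 = {scores[name]:.3f}")
```

The ranking on synthetic linear data says nothing about the real dataset; the point is only the shared fit-and-score loop.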
3. Model integration
Use k-fold cross-validation.
Use several of the better-performing models.
Use the results from the result data.
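The notes do not say how the better models are combined. One common option, shown here purely as an assumption, is to average their predictions within each fold and score the blended prediction. The data is synthetic and the choice of three models is illustrative.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor, RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score
from sklearn.model_selection import KFold

# Synthetic stand-in for the real dataset (assumption).
x, y = make_regression(n_samples=150, n_features=4, noise=5.0, random_state=0)


def make_models():
    """Return fresh, unfitted models so each fold starts from scratch."""
    return [
        LinearRegression(),
        GradientBoostingRegressor(random_state=0),
        RandomForestRegressor(n_estimators=20, random_state=0),
    ]


kf = KFold(n_splits=5, shuffle=True, random_state=0)
fold_scores = []
for train_index, test_index in kf.split(x):
    preds = []
    for model in make_models():
        model.fit(x[train_index], y[train_index])
        preds.append(model.predict(x[test_index]))
    # Blend by taking the plain average of the individual predictions.
    blended = np.mean(preds, axis=0)
    fold_scores.append(r2_score(y[test_index], blended))

print("per-fold R^2:", np.round(fold_scores, 3))
print("mean R^2:", round(float(np.mean(fold_scores)), 3))
```

Averaging the per-fold scores gives a single number that is less sensitive to one lucky or unlucky split.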
import warnings

import numpy as np
import pandas as pd
from sklearn import ensemble, preprocessing
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import KFold
from sklearn.neural_network import MLPRegressor

warnings.filterwarnings("ignore")

# Load the spreadsheet; the first column becomes the row index.
# read_excel in current pandas has no encoding parameter, so it is dropped.
df = pd.read_excel("xxx.xlsx", index_col=0)
# Forward-fill missing values (replaces the deprecated fillna(method='ffill')).
df = df.ffill()
data = df.values.astype('float')
x = data[:, 1:]          # features: every column after the first
y = np.log(data[:, 0])   # target: first column, log-transformed

kf = KFold(n_splits=5, shuffle=True)
for train_index, test_index in kf.split(x):
    train_x = x[train_index]
    test_x = x[test_index]
    train_y = y[train_index]
    test_y = y[test_index]

    # Standardize features and target using training-fold statistics only.
    ss_x = preprocessing.StandardScaler()
    train_x = ss_x.fit_transform(train_x)
    test_x = ss_x.transform(test_x)
    ss_y = preprocessing.StandardScaler()
    train_y = ss_y.fit_transform(train_y.reshape(-1, 1)).ravel()
    test_y = ss_y.transform(test_y.reshape(-1, 1)).ravel()

    # Multilayer perceptron
    model_mlp = MLPRegressor(solver='lbfgs', hidden_layer_sizes=(20, 20, 20), random_state=1)
    model_mlp.fit(train_x, train_y)
    mlp_score = model_mlp.score(test_x, test_y)
    print("sklearn MLP regression score", mlp_score)

    # Gradient boosting
    model_gbr = GradientBoostingRegressor(learning_rate=0.1)
    model_gbr.fit(train_x, train_y)
    gbr_score = model_gbr.score(test_x, test_y)
    print("sklearn ensemble (gradient boosting) regression score", gbr_score)

    # Bagging
    model_br = ensemble.BaggingRegressor()
    model_br.fit(train_x, train_y)
    model_brscore = model_br.score(test_x, test_y)
    print("sklearn bagging regression score", model_brscore)

    # Random forest
    model_rfr = ensemble.RandomForestRegressor(n_estimators=20)
    model_rfr.fit(train_x, train_y)
    model_rfrscore = model_rfr.score(test_x, test_y)
    print("sklearn random forest regression score", model_rfrscore)
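Because the script log-transforms the target and then standardizes it, model predictions come out in scaled log units. The sketch below shows one way to map them back to the original units by undoing the two transforms in reverse order. The synthetic data and the use of `LinearRegression` are assumptions for illustration; `ss_y` mirrors the scaler name used in the script.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import StandardScaler

# Synthetic, strictly positive target so that the log is defined (assumption).
rng = np.random.default_rng(0)
x = rng.normal(size=(100, 3))
y_raw = np.exp(1.0 + x @ np.array([0.5, -0.3, 0.2]))

# Forward transforms, in the same order as the script: log, then standardize.
y_log = np.log(y_raw)
ss_y = StandardScaler()
y_scaled = ss_y.fit_transform(y_log.reshape(-1, 1)).ravel()

model = LinearRegression().fit(x, y_scaled)
pred_scaled = model.predict(x)

# Inverse transforms, in reverse order: un-standardize, then exponentiate.
pred_log = ss_y.inverse_transform(pred_scaled.reshape(-1, 1)).ravel()
pred_raw = np.exp(pred_log)

print(pred_raw[:5])
print(y_raw[:5])
```

The R^2 values printed by the script are unaffected by the standardization step, since R^2 does not change under a linear rescaling of the target, but they are measured in log space rather than in the original units.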
Scores obtained using the data in which the years field has been processed
Reference: https://www.jianshu.com/p/f92d9ac14692