Training a simple model with XGBoost

XGBoost documentation: http://xgboost.readthedocs.io/en/latest/python/python_intro.html

import numpy as np
import xgboost as xgb

# Toy regression data: 5 samples, 2 features each
rawData = np.array([[2, 4], [3, 4], [1, 2], [4, 5], [7, 8]])
label = np.array([6, 7, 3, 9, 15])

# Wrap the data in DMatrix, XGBoost's internal data structure
dtrain = xgb.DMatrix(rawData, label=label)
deval = xgb.DMatrix(np.array([[3, 5], [3, 6]]), label=np.array([8, 9]))

# Training parameters; note that newer XGBoost releases rename
# 'reg:linear' to 'reg:squarederror' and replace 'silent' with 'verbosity'
param = {'max_depth': 2, 'eta': 1, 'silent': 1, 'objective': 'reg:linear'}
# param['nthread'] = 4
# param['eval_metric'] = 'auc'

# Watchlist: metrics on these (DMatrix, name) pairs are printed every round
evallist = [(deval, 'eval'), (dtrain, 'train')]


num_round = 10
# Pass the watchlist via the 'evals' keyword
bst = xgb.train(param, dtrain, num_round, evals=evallist)

bst.save_model('0001.model')

# Predict on unseen data; labels are not required at prediction time
dtest = xgb.DMatrix(np.array([[2, 4], [7, 8]]))
ypred = bst.predict(dtest)

print(ypred)
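
As a minimal sketch of how the '0001.model' file saved above could be loaded back for prediction (the bst_loaded and dnew names are just placeholders):

import numpy as np
import xgboost as xgb

# Reload the booster that was saved above as '0001.model'
bst_loaded = xgb.Booster()
bst_loaded.load_model('0001.model')

# Prediction input must again be wrapped in a DMatrix
dnew = xgb.DMatrix(np.array([[2, 4], [7, 8]]))
print(bst_loaded.predict(dnew))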

Parameters (an example configuration combining them is sketched after this list):
max_depth: maximum depth of each tree
eta: learning rate (step size for each boosting round)
silent: whether to suppress extra log output (replaced by verbosity in newer versions)
objective: training objective (loss function)
booster: type of booster to use (e.g. gbtree)
gamma: minimum loss reduction required to make a further split
tree_method: tree construction algorithm
lambda: L2 regularization term, helps prevent overfitting
alpha: L1 regularization term, helps prevent overfitting
subsample: fraction of training samples used per tree, helps prevent overfitting
colsample_bytree: fraction of features sampled when building each tree, helps prevent overfitting
colsample_bylevel: fraction of features sampled at each tree level, helps prevent overfitting
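
To make the list concrete, here is a rough sketch of a param dict that combines these keys; the values are illustrative placeholders, not tuned recommendations.

param = {
    'booster': 'gbtree',              # tree booster
    'objective': 'reg:squarederror',  # squared-error regression ('reg:linear' in older releases)
    'max_depth': 4,                   # maximum tree depth
    'eta': 0.1,                       # learning rate / step size
    'gamma': 0.1,                     # minimum loss reduction required to split
    'tree_method': 'hist',            # histogram-based tree construction
    'lambda': 1.0,                    # L2 regularization on leaf weights
    'alpha': 0.0,                     # L1 regularization on leaf weights
    'subsample': 0.8,                 # fraction of rows sampled per tree
    'colsample_bytree': 0.8,          # fraction of features sampled per tree
    'colsample_bylevel': 1.0,         # fraction of features sampled per level
}
bst = xgb.train(param, dtrain, num_boost_round=50, evals=[(deval, 'eval')])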

Parameter tuning can only take you so far; what ultimately determines model accuracy is the quality of the sample features, since good features make the underlying patterns easy for the model to learn.

Reposted from blog.csdn.net/witsmakemen/article/details/80806948