Improved Random Forest Regression Based on the Seagull Optimization Algorithm (with Code)


Abstract: To improve the regression prediction accuracy of random forest (RF), the seagull optimization algorithm is used to tune two RF parameters: the number of trees and the minimum number of samples per leaf.

1. Dataset

The dataset is as follows:

data.mat contains the input data and the output data.

Input data dimensions: 2000*2

Output data dimensions: 2000*1

The RF model therefore has an input dimension of 2 and an output dimension of 1.

2. RF model

For background on random forests, please refer to a standard machine learning textbook.

3. RF based on seagull algorithm optimization

For the principles of the seagull optimization algorithm, see this blog post: https://blog.csdn.net/u011835903/article/details/107535864

The seagull algorithm optimizes two parameters: the number of trees in the RF and the minimum number of samples per leaf. The fitness function is the sum of the RF's mean squared error (MSE) on the training set and on the test set; the lower the fitness, the better.
$$\text{fitness} = \text{MSE}[\text{predict}(\text{train})] + \text{MSE}[\text{predict}(\text{test})]$$
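The fitness described above can be sketched in Python roughly as follows. This is a minimal illustration, not the post's own code: the names `mse`, `fitness`, and `rf_fit_predict` are hypothetical, and the RF backend is assumed to be scikit-learn's `RandomForestRegressor` (with `n_estimators` as the number of trees and `min_samples_leaf` as the minimum leaf size). Because the optimizer works with continuous positions, the sketch rounds each candidate to integers before training, which is also an assumption.

```python
def mse(y_true, y_pred):
    """Mean squared error between two equal-length sequences."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def rf_fit_predict(n_trees, min_leaf, X_train, y_train, eval_sets):
    """Train an RF and predict on each set in eval_sets.

    Assumption: scikit-learn is the RF backend. It is imported lazily so
    the rest of the sketch can be used without scikit-learn installed.
    """
    from sklearn.ensemble import RandomForestRegressor

    model = RandomForestRegressor(
        n_estimators=n_trees, min_samples_leaf=min_leaf, random_state=0
    )
    model.fit(X_train, y_train)
    return [list(model.predict(X)) for X in eval_sets]


def fitness(x, X_train, y_train, X_test, y_test, fit_predict=rf_fit_predict):
    """fitness = MSE[predict(train)] + MSE[predict(test)]; lower is better.

    x[0] is the number of trees, x[1] the minimum leaf size. Both are
    rounded to integers and kept at 1 or above (an assumption).
    """
    n_trees = max(1, round(x[0]))
    min_leaf = max(1, round(x[1]))
    pred_train, pred_test = fit_predict(
        n_trees, min_leaf, X_train, y_train, [X_train, X_test]
    )
    return mse(y_train, pred_train) + mse(y_test, pred_test)
```

The `fit_predict` argument makes the model pluggable, so the same fitness can be evaluated with a different regressor or with a cheap stand-in while debugging the optimizer.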

4. Test results

The data is split as follows: 1900 samples are used for training and 100 samples for testing.

The seagull algorithm parameters are set as follows:
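As a rough illustration of the 1900/100 split, here is a small Python sketch. The post does not say whether the split is random or sequential, so the shuffling, the seed, and the helper name `split_train_test` are all assumptions, and the placeholder data merely mimics the 2000*2 and 2000*1 shapes of data.mat.

```python
import random


def split_train_test(inputs, outputs, n_train=1900, seed=0):
    """Shuffle sample indices and split them into train and test parts.

    Assumption: a random split. The original post only states the sizes.
    """
    indices = list(range(len(inputs)))
    random.Random(seed).shuffle(indices)
    train_idx, test_idx = indices[:n_train], indices[n_train:]

    def pick(data, ids):
        return [data[i] for i in ids]

    return (
        pick(inputs, train_idx),
        pick(outputs, train_idx),
        pick(inputs, test_idx),
        pick(outputs, test_idx),
    )


# Placeholder data with the same shapes as the dataset (2000*2 and 2000*1).
inputs = [[float(i), float(2 * i)] for i in range(2000)]
outputs = [float(i) for i in range(2000)]
P_train, T_train, P_test, T_test = split_train_test(inputs, outputs)
```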

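%% Define the seagull optimization parameters
pop = 20; % population size
Max_iteration = 30; % maximum number of iterations
dim = 2; % dimension: number of trees and minimum leaf size
lb = [1,1]; % lower bounds
ub = [50,20]; % upper bounds
fobj = @(x) fun(x,Pn_train,Tn_train,Pn_test,Tn_test); % fitness: train MSE + test MSE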
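To show how parameters like these drive the search, here is a compact Python sketch of a seagull-style optimizer. It follows the commonly published seagull optimization update rules (a migration step with a linearly decreasing factor `A`, followed by a spiral attack around the best position); the blog's own implementation may differ. The function name `seagull_search` and the constants `fc`, `u`, and `v` are assumptions, and the objective in the usage line is a toy function, not the RF fitness.

```python
import math
import random


def seagull_search(fobj, lb, ub, pop=20, max_iter=30, fc=2.0, u=1.0, v=1.0, seed=0):
    """Minimize fobj over the box [lb, ub]; returns (best_pos, best_fit, history)."""
    rng = random.Random(seed)
    dim = len(lb)

    def clip(x):
        # Keep every coordinate inside its lower and upper bound.
        return [min(max(xi, lo), hi) for xi, lo, hi in zip(x, lb, ub)]

    positions = [[rng.uniform(lo, hi) for lo, hi in zip(lb, ub)] for _ in range(pop)]
    best_pos, best_fit = list(positions[0]), float("inf")
    history = []

    def evaluate():
        nonlocal best_pos, best_fit
        for p in positions:
            f = fobj(p)
            if f < best_fit:
                best_fit, best_pos = f, list(p)

    for t in range(max_iter):
        evaluate()
        history.append(best_fit)
        A = fc - t * (fc / max_iter)  # decreases linearly from fc toward 0
        for i, p in enumerate(positions):
            B = 2.0 * A * A * rng.random()
            k = rng.uniform(0.0, 2.0 * math.pi)
            r = u * math.exp(k * v)
            # Product of the spiral coordinates x' * y' * z'.
            spiral = (r * math.cos(k)) * (r * math.sin(k)) * (r * k)
            new_pos = []
            for d in range(dim):
                c_s = A * p[d]                  # collision avoidance
                m_s = B * (best_pos[d] - p[d])  # move toward the best seagull
                d_s = abs(c_s + m_s)
                new_pos.append(d_s * spiral + best_pos[d])
            positions[i] = clip(new_pos)

    evaluate()  # score the final positions as well
    return best_pos, best_fit, history


# Toy usage with the same bounds as above; the objective is a stand-in.
best, best_fit, history = seagull_search(
    lambda x: (x[0] - 30) ** 2 + (x[1] - 5) ** 2, lb=[1, 1], ub=[50, 20]
)
```

In the RF setting, `fobj` would train a forest for each candidate and return the fitness, so the total number of RF trainings is about `pop * (max_iter + 1)`, which is why the population and iteration counts are kept small.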


Judging by the MSE results, the seagull-optimized RF performs better than the RF with unoptimized parameters.

5. Matlab code

6. Python code

