Stratified Sampling of Data by Class in MATLAB

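function [train_x,train_y,test_x,test_y] = getdata()
data = xlsread('XXX.xlsx');   % read the numeric data from the spreadsheet
data(isnan(data)) = 0;        % replace NaN entries with 0 (the rows are kept, not removed)
labels = data(:,end);         % the last column holds the class labels
train_x = [];
train_y = [];
test_x = [];
test_y = [];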

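%% Stratified sampling: within each class, draw a fixed fraction of the rows for the training set; the remaining rows form the test set
ratio = 0.7;                 % fraction of each class drawn for the training set
classes = unique(labels);    % actual label values, which need not be 1..K
for k = 1:numel(classes)
    cate = find(labels == classes(k));             % rows belonging to the current class
    nTrain = round(numel(cate)*ratio);             % number of training rows for this class
    train = cate(randperm(numel(cate), nTrain));   % rows of the current class drawn for training
    test = setdiff(cate, train);                   % remaining rows of the current class go to the test set
    train_x = [train_x; data(train,1:end-1)];
    train_y = [train_y; labels(train)];
    test_x = [test_x; data(test,1:end-1)];
    test_y = [test_y; labels(test)];
end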
end
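
The function above reads from a spreadsheet, so it cannot be run without the data file. As a rough check of the same idea, the sketch below applies the per-class split to synthetic labels and counts how many rows of each class land in the training set. The label vector, the class sizes and the names ratio, trainIdx, testIdx and trainCounts are illustrative assumptions, not part of the original post.

```matlab
% Sketch only: stratified split on synthetic labels (hypothetical data)
labels = [ones(10,1); 2*ones(20,1); 3*ones(30,1)];   % three classes of sizes 10, 20, 30
ratio  = 0.7;                                        % fraction of each class used for training

classes  = unique(labels);
trainIdx = [];
testIdx  = [];
for k = 1:numel(classes)
    cate   = find(labels == classes(k));             % rows belonging to this class
    nTrain = round(numel(cate) * ratio);             % training rows for this class
    pick   = cate(randperm(numel(cate), nTrain));    % random rows, without replacement
    trainIdx = [trainIdx; pick];
    testIdx  = [testIdx; setdiff(cate, pick)];       % the rest of the class is test data
end

% Count the training rows per class; with these sizes the counts are 7, 14, 21
trainCounts = arrayfun(@(c) sum(labels(trainIdx) == c), classes);
disp(trainCounts');
```

Each class contributes 70% of its rows to the training set, so the class proportions of the full data carry over to both subsets. If the Statistics and Machine Learning Toolbox is available, cvpartition(labels,'HoldOut',0.3) should give a comparable stratified hold-out split in one call.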

Reposted from blog.csdn.net/qq_46523755/article/details/105536932