Wikipedia dataset preprocessing
from scipy.io import loadmat

path_to_mat = "wikipedia_info/raw_features.mat"
matstuff = loadmat(path_to_mat)
I_tr = matstuff["I_tr"]  # train image features, 2173 x 128
T_tr = matstuff["T_tr"]  # train text features,  2173 x 10
I_te = matstuff["I_te"]  # test image features,   693 x 128
T_te = matstuff["T_te"]  # test text features,    693 x 10

# Each line of the .list files is "text_id\timage_id\tcategory";
# take the last tab field and shift the 1-based category to a 0-based label.
get_truth = lambda x: [int(i.split("\t")[-1]) - 1
                       for i in open(x).read().split("\n")[:-1]]
train_truth = get_truth("wikipedia_info/trainset_txt_img_cat.list")
test_truth = get_truth("wikipedia_info/testset_txt_img_cat.list")
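The expected shapes noted in the comments can be sanity-checked without the actual .mat file; below, random arrays stand in for the loaded features (hypothetical data, only the shapes matter):

```python
import numpy as np

# Stand-ins for the arrays loaded from raw_features.mat
# (2173 train samples / 693 test samples, per the comments above).
I_tr = np.random.rand(2173, 128)  # train image features
T_tr = np.random.rand(2173, 10)   # train text features
I_te = np.random.rand(693, 128)   # test image features
T_te = np.random.rand(693, 10)    # test text features

# Image and text feature matrices must pair up row-wise within each split.
assert I_tr.shape[0] == T_tr.shape[0] == 2173
assert I_te.shape[0] == T_te.shape[0] == 693
```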
Usage of lambda:
https://blog.csdn.net/zjuxsl/article/details/79437563
Using the MS COCO dataset:
https://github.com/cocodataset/cocoapi/blob/master/PythonAPI/pycocoDemo.ipynb