DatawhaleAI Summer Camp - Baseline Understanding of CV Direction

The goal of this summer camp is to complete a competition task in computer vision (CV). Here I write a little about my experience.
Competition Link: Brain PET Image Analysis and Disease Prediction Challenge.
The baseline provided by the organizer is divided into the following steps. It is small but complete: it touches the three problems that every AI model has to solve, namely data, model, and optimization.
1. Data preparation and preprocessing (data)
2. Feature extraction and model training (model, optimization)
Next, I will describe what I learned from these two aspects.

Data Preparation and Preprocessing (Data)

For data preprocessing, the baseline uses the glob.glob() function to read the file paths and then the np.random.shuffle() function to shuffle them.
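
A minimal sketch of this step. The directory layout (Train/NC and Train/MCI) and the file names are hypothetical; they are created here as empty placeholder files so the snippet runs on its own. The fixed seed is also my addition, to make the shuffle reproducible:

```python
import glob
import os
import tempfile

import numpy as np

# Build a throwaway directory with empty placeholder files.
# The folder and file names are hypothetical, not the competition's real layout.
root = tempfile.mkdtemp()
for label in ("NC", "MCI"):
    os.makedirs(os.path.join(root, "Train", label))
    for i in range(3):
        open(os.path.join(root, "Train", label, f"{i}.nii"), "w").close()

# glob.glob() returns every path matching the wildcard pattern as a list.
train_path = glob.glob(os.path.join(root, "Train", "*", "*.nii"))

# np.random.shuffle() reorders the list in place and returns None.
np.random.seed(0)  # assumption: fixed seed for reproducibility
np.random.shuffle(train_path)

print(len(train_path))
```

Because the shuffle happens in place, the result should not be assigned back (train_path = np.random.shuffle(train_path) would leave train_path set to None).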

In this project I worked with medical images for the first time and gained a preliminary understanding of the nibabel library and the .nii image format.

An image in .nii format is read as follows:

import nibabel as nib
img = nib.load(path)

Printing img after loading it gives the following information:

<class 'nibabel.nifti1.Nifti1Image'>
data shape (128, 128, 63, 1)
affine: 
[[  2.05940509   0.           0.         128.        ]
 [  0.           2.05940509   0.         128.        ]
 [  0.           0.           2.42500019  63.        ]
 [  0.           0.           0.           1.        ]]
metadata:
<class 'nibabel.nifti1.Nifti1Header'> object, endian='>'
sizeof_hdr      : 348
data_type       : b''
db_name         : b'041_S_1425'
extents         : 16384
session_error   : 0
regular         : b'r'
dim_info        : 0
dim             : [  4 128 128  63   1   0   0   0]
intent_p1       : 0.0
intent_p2       : 0.0
intent_p3       : 0.0
intent_code     : none
datatype        : uint16
bitpix          : 16
slice_start     : 0
pixdim          : [1.        2.059405  2.059405  2.4250002 0.        0.        0.
 0.       ]
vox_offset      : 0.0
scl_slope       : nan
scl_inter       : nan
slice_end       : 0
slice_code      : unknown
xyzt_units      : 2
cal_max         : 0.0
cal_min         : 0.0
slice_duration  : 0.0
toffset         : 0.0
glmax           : 32767
glmin           : 0
descrip         : b''
aux_file        : b''
qform_code      : scanner
sform_code      : unknown
quatern_b       : 0.0
quatern_c       : 0.0
quatern_d       : 0.0
qoffset_x       : 128.0
qoffset_y       : 128.0
qoffset_z       : 63.0
srow_x          : [0. 0. 0. 0.]
srow_y          : [0. 0. 0. 0.]
srow_z          : [0. 0. 0. 0.]
intent_name     : b''
magic           : b'n+1'

As can be seen, a .nii file carries a lot of information. What matters most for our experiment is the data shape and the data itself. The former gives the dimensions of the data, here 128*128*63*1. The four dimensions are the length and width of each slice, the number of slices, and the number of channels. The baseline reads the data itself through img.dataobj. The img.get_data() function also works in older nibabel versions, but it has been deprecated in favor of img.get_fdata().
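
To make the shape handling concrete, here is a small sketch that uses a random NumPy array as a stand-in for the array behind img.dataobj, so it runs without a .nii file. The variable names and the choice of 10 slices are illustrative assumptions, not necessarily what the baseline does:

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for np.asarray(img.dataobj): same shape and dtype as in the header above.
volume = rng.integers(0, 32768, size=(128, 128, 63, 1), dtype=np.uint16)

# Drop the trailing channel axis to get a plain 3-D volume.
volume_3d = volume[:, :, :, 0]

# Pick 10 slice indices at random (with replacement) along the third axis.
slice_idx = rng.choice(volume_3d.shape[2], size=10)
random_img = volume_3d[:, :, slice_idx]

print(volume_3d.shape)   # (128, 128, 63)
print(random_img.shape)  # (128, 128, 10)
```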

Feature Extraction and Model Training (Model, Optimization)

In traditional machine learning, the quality of the extracted features often determines the final result of the model. The baseline computes a few simple pixel statistics from each image for training:

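    feat = [
        (random_img != 0).sum(),               # number of non-zero pixels
        (random_img == 0).sum(),               # number of zero pixels
        random_img.mean(),                     # mean
        random_img.std(),                      # standard deviation
        len(np.where(random_img.mean(0))[0]),  # number of non-zero means along axis 0 (columns)
        len(np.where(random_img.mean(1))[0]),  # number of non-zero means along axis 1 (rows)
        random_img.mean(0).max(),              # largest mean along axis 0 (columns)
        random_img.mean(1).max()               # largest mean along axis 1 (rows)
    ]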
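
To see what each entry measures, the same expressions can be run on a tiny hand-made array (the toy values below are mine, not competition data). Note that for a 3-D array, mean(0) and mean(1) average over the first and second axes and return 2-D arrays, so the "column" and "row" wording is only approximate:

```python
import numpy as np

# Toy 3-D array of shape (2, 3, 2); the values are made up for illustration.
random_img = np.array([
    [[0, 0], [1, 2], [0, 0]],
    [[0, 0], [3, 4], [5, 6]],
])

mean_axis0 = random_img.mean(0)  # averages over axis 0, result shape (3, 2)
mean_axis1 = random_img.mean(1)  # averages over axis 1, result shape (2, 2)

# np.where(a) with a single argument returns one index array per dimension,
# so the length of the first one equals the number of non-zero entries in a.
feat = [
    (random_img != 0).sum(),       # 6 non-zero values
    (random_img == 0).sum(),       # 6 zeros
    random_img.mean(),             # 21 / 12 = 1.75
    random_img.std(),
    len(np.where(mean_axis0)[0]),  # 4 non-zero means along axis 0
    len(np.where(mean_axis1)[0]),  # 4 non-zero means along axis 1
    mean_axis0.max(),              # 3.0
    mean_axis1.max(),              # 10 / 3
]
print(feat)
```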

After the features are computed, the class label of the current image is appended to them. I think this technique is worth learning.
The code is as follows:

    # Determine the sample class from the path ('NC' = normal, 'MCI' = abnormal)
    if 'NC' in path:
        return feat + ['NC']
    else:
        return feat + ['MCI']

Here, feat + ['NC'] concatenates two lists, producing a new list that is one item longer than the original features, of the form [..., ..., ..., 'NC']. This is the first time I have seen lists used this way.
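
A short illustration of list concatenation with +, using made-up feature values. For comparison it also shows append(), which changes the list in place rather than building a new one:

```python
feat = [0.5, 1.2, 3.0]          # hypothetical feature values

labeled = feat + ['NC']         # + builds a new list; feat itself is unchanged
print(labeled)                  # [0.5, 1.2, 3.0, 'NC']
print(len(feat), len(labeled))  # 3 4

# append() would instead modify the list in place.
feat_copy = list(feat)
feat_copy.append('NC')
print(feat_copy == labeled)     # True
```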

For model training we use sklearn, the classic machine learning package. We only need to define a model, fit it to the training data with model.fit(), and predict on the test set with model.predict() to get the predictions, which is very beginner-friendly.
The baseline trains a logistic regression model; its code is as follows:

import numpy as np
from sklearn.linear_model import LogisticRegression
m = LogisticRegression(max_iter=1000)
m.fit(
    np.array(train_feat)[:, :-1].astype(np.float32),  # features
    np.array(train_feat)[:, -1]                       # class labels
)
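
The astype(np.float32) call matters because each row mixes numbers with a string label, so np.array() turns the whole array into strings. A minimal sketch with four made-up rows in the same layout as train_feat (the values are hypothetical):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Hypothetical rows: numeric features first, class label last.
train_feat = [
    [0.1, 1.0, 'NC'],
    [0.2, 0.9, 'NC'],
    [0.9, 0.1, 'MCI'],
    [1.0, 0.2, 'MCI'],
]

arr = np.array(train_feat)
print(arr.dtype.kind)  # U: mixing floats and strings yields an array of strings

X = arr[:, :-1].astype(np.float32)  # every column but the last, converted back to floats
y = arr[:, -1]                      # last column: class labels as strings

m = LogisticRegression(max_iter=1000)
m.fit(X, y)
print(m.classes_)      # ['MCI' 'NC'], sorted alphabetically
print(m.predict(X))
```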

I also tried a random forest classifier for comparison and found that performance improved to some extent as the n_estimators parameter increased.

from sklearn.ensemble import RandomForestClassifier
model = RandomForestClassifier(n_estimators=2000, max_depth=20, n_jobs=-1, oob_score=True, random_state=10)

model.fit(
    np.array(train_feat)[:, :-1].astype(np.float32),  # features
    np.array(train_feat)[:, -1]                       # class labels
)
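
Since oob_score=True is set, one possible way to compare different n_estimators values without touching the test set is the out-of-bag accuracy. The sketch below uses synthetic data from make_classification as a stand-in for the real features, and the three tree counts are arbitrary choices of mine:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Synthetic stand-in data with 8 features, the same length as feat.
X, y = make_classification(n_samples=300, n_features=8, n_informative=4,
                           random_state=10)

oob_scores = {}
for n in (25, 100, 400):
    model = RandomForestClassifier(n_estimators=n, max_depth=20,
                                   oob_score=True, random_state=10)
    model.fit(X, y)
    # oob_score_ is the accuracy measured on the samples that each tree
    # did not see in its bootstrap sample.
    oob_scores[n] = model.oob_score_

for n, score in oob_scores.items():
    print(f"n_estimators={n}: OOB accuracy = {score:.3f}")
```

How much the score changes with the number of trees depends on the data, so the numbers from this toy run say nothing about the competition set.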

Of course, since the baseline and its hand-designed features are quite basic, the predictions are mediocre, with accuracy even below that of a coin toss. I will improve on this gradually in further practice.
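
To check where a model stands relative to chance, one option is to cross-validate it next to a trivial baseline. This is a sketch on synthetic data, not the competition's evaluation procedure; DummyClassifier with strategy="most_frequent" and 5-fold cross-validation are my own choices:

```python
from sklearn.datasets import make_classification
from sklearn.dummy import DummyClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# Synthetic two-class data standing in for the extracted features.
X, y = make_classification(n_samples=200, n_features=8, n_informative=4,
                           random_state=0)

# Trivial baseline that always predicts the most frequent training class.
chance = cross_val_score(DummyClassifier(strategy="most_frequent"), X, y, cv=5)
model = cross_val_score(LogisticRegression(max_iter=1000), X, y, cv=5)

print(f"chance-level accuracy: {chance.mean():.3f}")
print(f"model accuracy:        {model.mean():.3f}")
```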

Origin blog.csdn.net/code_zhao/article/details/131865784