Notes on nnUNet Source Code Details

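nnUNet is a sizable codebase, and after a long break I find I have forgotten many details when I come back to it. Some of the small points are recorded below: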

Part 1. Data preparation before the experiment:

First split each 4D image into 3D volumes, one per modality, with file names ending in _0000.nii.gz, _0001.nii.gz, ...
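As a minimal sketch of that naming convention (the helper name `modality_filenames` and the case identifier are hypothetical, not part of nnUNet), the per-modality file names can be generated like this:

```python
def modality_filenames(case_id, num_modalities):
    """Return one 3D file name per modality: <case>_0000.nii.gz, <case>_0001.nii.gz, ..."""
    return [f"{case_id}_{m:04d}.nii.gz" for m in range(num_modalities)]


print(modality_filenames("patient01", 2))
```

The four-digit, zero-padded suffix is the modality index.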

Cropping

1 The gt_segmentations folder exists from the cropping step onward; it is simply a copy of the original label files.

[Code from preprocessing/cropping.py]
output_folder_gt = os.path.join(self.output_folder, "gt_segmentations")
       
maybe_mkdir_p(output_folder_gt)

for j, case in enumerate(list_of_files):
    if case[-1] is not None:
        shutil.copy(case[-1], output_folder_gt)

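2 properties are also created in the cropping step. Cropping removes the region of the image that carries no information, extracts the array and the properties from the nii.gz files, and saves them to npz and pkl files respectively.

- If a seg file exists: the array saved in the npz has shape [modality+1, d, h, w], where modality is the number of image modalities and the extra 1 is the segmentation label. [At first I wondered whether, for multi-target supervision, the additional labels could simply be stacked as extra channels at this point. It turns out they cannot: the code is inconsistent about this, and I suspect the author never finished this part. The label file must be 3D and is given its extra dimension via array[None]; otherwise an assert in the code fails.] At this stage the seg array already contains values equal to -1.

- If no seg file exists: the nonzero mask is used as seg, with values in [-1, 0]. [This case no longer occurs, because seg can never be None; an error is raised much earlier.]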
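The array layout described above can be illustrated with a small NumPy sketch. The function name `stack_data_and_seg` is hypothetical, and the rule that -1 marks background voxels outside the nonzero mask is my reading of the cropping code rather than something stated in these notes. The real cropping step also crops to the nonzero bounding box, which is left out here.

```python
import numpy as np


def stack_data_and_seg(data, seg):
    """Stack a 3D label map onto a (modality, d, h, w) image array."""
    assert seg.ndim == 3, "the label must be 3D; seg[None] adds the channel axis"
    nonzero_mask = np.any(data != 0, axis=0)   # True where any modality is nonzero
    seg = seg.astype(np.float32)               # astype returns a copy
    seg[(seg == 0) & ~nonzero_mask] = -1       # assumed: background outside the mask
    return np.concatenate([data.astype(np.float32), seg[None]], axis=0)


data = np.zeros((2, 2, 3, 3), dtype=np.float32)
data[0, 0, 1, 1] = 5.0    # a labelled voxel
data[1, 1, 2, 2] = 3.0    # a nonzero but unlabelled voxel
seg = np.zeros((2, 3, 3), dtype=np.int16)
seg[0, 1, 1] = 1

stacked = stack_data_and_seg(data, seg)
print(stacked.shape)  # (3, 2, 3, 3): two modalities plus one seg channel
```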

The saved properties:

properties["original_size_of_raw_data"] = np.array(data_itk[0].GetSize())[[2, 1, 0]]
properties["original_spacing"] = np.array(data_itk[0].GetSpacing())[[2, 1, 0]]
properties["list_of_data_files"] = data_files
properties["seg_file"] = seg_file

properties["itk_origin"] = data_itk[0].GetOrigin()
properties["itk_spacing"] = data_itk[0].GetSpacing()
properties["itk_direction"] = data_itk[0].GetDirection()

properties["crop_bbox"] = bbox
properties['classes'] = np.unique(seg)
properties["size_after_cropping"] = data[0].shape

### Added later on:
### 'use_nonzero_mask_for_norm', 'size_after_resampling', 'spacing_after_resampling', 
### 'classes_in_slice_per_axis', 'number_of_voxels_per_class'

Dataset analysis based on the cropped data

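3 The props_per_case_file.pkl file in the crop folder stores ['has_classes', 'only_one_region', 'volume_per_class', 'region_volume_per_class']: [analysis of seg]

'has_classes': the classes present in seg, including -1.
'only_one_region': whether all classes together, and each class separately, form a single connected component
'volume_per_class': the real-world volume of each class (voxel count × prod(spacing))
'region_volume_per_class': the volume of every connected component of each class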
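A minimal sketch of the 'volume_per_class' computation, assuming the spacing is given per axis in millimetres. The function below is illustrative and not the nnUNet implementation; the connected-component entries are left out.

```python
import numpy as np


def volume_per_class(seg, spacing, classes):
    """Real-world volume per class: voxel count times the volume of one voxel."""
    voxel_volume = float(np.prod(spacing))
    return {c: float(np.sum(seg == c)) * voxel_volume for c in classes}


seg = np.zeros((4, 4, 4), dtype=np.int16)
seg[:2, :2, :2] = 1                      # 8 voxels of class 1
volumes = volume_per_class(seg, (2.0, 0.5, 0.5), [1, 2])
print(volumes)                           # class 1: 8 voxels * 0.5 mm^3 = 4.0
```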

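4 intensityproperties.pkl in the crop folder stores: [analysis of the images themselves]

results[mod_id]['local_props'] = props_per_case  # the statistics below, computed per case; the entries that follow are global over the whole dataset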
results[mod_id]['median'] = median
results[mod_id]['mean'] = mean
results[mod_id]['sd'] = sd
results[mod_id]['mn'] = mn
results[mod_id]['mx'] = mx
results[mod_id]['percentile_99_5'] = percentile_99_5
results[mod_id]['percentile_00_5'] = percentile_00_5
### mod_id indexes the modality

5 dataset_properties.pkl brings everything together and contains the contents of the two files above.

dataset_properties['all_sizes'] = sizes # size after cropping
dataset_properties['all_spacings'] = spacings # original spacing reverted
dataset_properties['segmentation_props_per_patient'] = segmentation_props_per_patient
dataset_properties['class_dct'] = class_dct  # {int: class name} 0, 1, 2, ...
dataset_properties['all_classes'] = all_classes  # classes > 0: [1, 2, 3, ...]
dataset_properties['modalities'] = modalities  # {idx: modality name}
dataset_properties['intensityproperties'] = intensityproperties
dataset_properties['size_reductions'] = size_reductions  # {patient_id: size_reduction=after/before}

With everything in place, the actual experiment planning begins

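6 What does self.determine_whether_to_use_mask_for_norm do?

It decides one property, properties['use_nonzero_mask_for_norm'], and it is also this function that adds the property to the pkl file of every case in the crop folder.

[As far as I can tell, this mainly affects: 1) non-CT modalities: how the normalization statistics are computed, i.e. whether only the foreground is considered; data[c][mask == 0] = 0

2) CT modality: data[c][seg[-1] < 0] = 0]

If the modality is CT, the property is False. Otherwise a check is needed: if half of the cases have a size_reduction below 3/4 (size_reduction = size after cropping / size before), it is True; otherwise False.

PS: this property is a dict holding one True or False per modality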
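The decision rule above can be sketched in a few lines. This is an illustration, not the nnUNet code: the function name is hypothetical, and "half of the cases below 3/4" is expressed as the median size_reduction being below 3/4.

```python
from statistics import median


def decide_use_nonzero_mask_for_norm(modalities, size_reductions):
    """Return {modality index: bool}, one flag per modality."""
    use_mask = {}
    for idx, name in modalities.items():
        if name == "CT":
            use_mask[idx] = False                       # never for CT
        else:
            use_mask[idx] = median(size_reductions) < 3 / 4
    return use_mask


print(decide_use_nonzero_mask_for_norm({0: "CT", 1: "T1"}, [0.5, 0.6, 0.9]))
```

With size reductions of 0.5, 0.6 and 0.9 the median is 0.6, so the non-CT modality gets True.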

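7 self.get_target_spacing() takes the per-axis median of the spacings:

np.percentile(np.vstack(spacings), TARGET_SPACING_PERCENTILE, 0) # TARGET_SPACING_PERCENTILE = 50

It first computes dataset-level statistics such as the median spacing and the median shape after resampling, and then iteratively derives the network architecture parameters from the settings in the configuration file. [So you can change the values in configuration.py to adjust the network design.]

The source moves the axis with the largest spacing to the first position; do_dummy_2D_data_aug is also determined here.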

plan = {
                'batch_size': batch_size,
                'num_pool_per_axis': network_num_pool_per_axis,
                'patch_size': input_patch_size,
                'median_patient_size_in_voxels': new_median_shape,
                'current_spacing': current_spacing, # after the transpose
                'original_spacing': original_spacing, # also after the transpose
                'do_dummy_2D_data_aug': do_dummy_2D_data_aug,
                'pool_op_kernel_sizes': pool_op_kernel_sizes,
                'conv_kernel_sizes': conv_kernel_sizes,
            }
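The target spacing and the axis reordering from item 7 can be sketched as follows. The percentile call mirrors the line quoted above; the construction of the two permutations (which correspond to transpose_forward and transpose_backward in the plans) is my own illustration, and the function name is hypothetical.

```python
import numpy as np

TARGET_SPACING_PERCENTILE = 50


def target_spacing_and_transpose(spacings):
    """Median spacing per axis, plus a permutation that puts the coarsest axis first."""
    target = np.percentile(np.vstack(spacings), TARGET_SPACING_PERCENTILE, 0)
    max_axis = int(np.argmax(target))
    forward = [max_axis] + [i for i in range(3) if i != max_axis]
    backward = [forward.index(i) for i in range(3)]   # inverse permutation
    return target, forward, backward


spacings = [(1.0, 1.0, 5.0), (1.0, 1.0, 3.0), (2.0, 2.0, 4.0)]
target, forward, backward = target_spacing_and_transpose(spacings)
print(target, forward, backward)
```

Applying `forward` and then `backward` to any per-axis quantity restores the original order.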

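8 When planning the 3D configurations, deciding whether a second stage is needed is simple: compare the voxel count of the median patient with the voxel count of the input patch obtained from the iteration above:

if np.prod(self.plans_per_stage[-1]['median_patient_size_in_voxels'], dtype=np.int64) / \
        architecture_input_voxels_here < HOW_MUCH_OF_A_PATIENT_MUST_THE_NETWORK_SEE_AT_STAGE0:
    more = False
else:
    more = True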
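The same check as a standalone function, for experimenting with shapes. The function name is hypothetical, and the threshold value 4 used in the example calls is an assumption for illustration; the actual value is whatever HOW_MUCH_OF_A_PATIENT_MUST_THE_NETWORK_SEE_AT_STAGE0 is set to.

```python
from math import prod


def needs_lowres_stage(median_shape, patch_size, threshold):
    """True if the median patient is at least `threshold` times larger than the patch."""
    ratio = prod(median_shape) / prod(patch_size)
    return ratio >= threshold


print(needs_lowres_stage((512, 512, 512), (128, 128, 128), 4))  # ratio 64
print(needs_lowres_stage((100, 100, 100), (96, 96, 96), 4))     # ratio about 1.13
```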

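9 The high_res plan is made first. Depending on item 8, the planner then decides whether to make a low_res plan, which is also found step by step: the spacing is increased little by little, by a factor of 1.01 per step.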
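A simplified sketch of that search, under two loud assumptions: all axes are scaled together, and the patch size is held fixed, whereas the planner recomputes the architecture at each step. The function and parameter names are hypothetical.

```python
from math import prod


def grow_lowres_spacing(spacing, median_shape, patch_voxels, threshold, factor=1.01):
    """Coarsen the spacing by `factor` per step until the median image is small enough."""
    original = tuple(spacing)
    current = list(spacing)
    shape = list(median_shape)
    while prod(shape) > threshold * patch_voxels:
        current = [s * factor for s in current]
        # shape the median image would have after resampling to the coarser spacing
        shape = [max(1, round(n * o / s)) for n, o, s in zip(median_shape, original, current)]
    return current, shape


lowres_spacing, lowres_shape = grow_lowres_spacing(
    (1.0, 1.0, 1.0), (256, 256, 256), patch_voxels=128 ** 3, threshold=4
)
print(lowres_spacing, lowres_shape)
```

Because the step is only 1%, the low_res spacing ends up just coarse enough to satisfy the stopping condition.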

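10 Once the plans for both stages are made, the normalization scheme is determined:

normalization_schemes = self.determine_normalization_scheme()
# returns a dict as well, with values CT / nonCT
# there was a post-processing step here, but it seems to have been removed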
plans = {'num_stages': len(list(self.plans_per_stage.keys())),
         'num_modalities': num_modalities,
         'modalities': modalities,
         'normalization_schemes': normalization_schemes,
         'dataset_properties': self.dataset_properties,
         'list_of_npz_files': self.list_of_cropped_npz_files,
         'original_spacings': spacings,
         'original_sizes': sizes,
         'preprocessed_data_folder': self.preprocessed_output_folder,
         'num_classes': len(all_classes),
         'all_classes': all_classes,
         'base_num_features': Generic_UNet.BASE_NUM_FEATURES_3D,
         'use_mask_for_norm': use_nonzero_mask_for_normalization,
         'keep_only_largest_region': only_keep_largest_connected_component,
         'min_region_size_per_class': min_region_size_per_class,
         'min_size_per_class': min_size_per_class,
         'transpose_forward': self.transpose_forward,
         'transpose_backward': self.transpose_backward,
         'data_identifier': self.data_identifier,  ### ......nnUNet
         'plans_per_stage': self.plans_per_stage}

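Planning is done; preprocessing begins

11 First the gt_segmentations folder is copied over from the crop folder

data, seg, properties = self.resample_and_normalize(data, target_spacing, properties, seg, force_separate_z)

resample_patient() runs first, followed by normalization

2D and 3D are essentially the same, except that 2D keeps the spacing of axis 0 (the depth dimension by default) and resamples only the other two axes.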
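To make the 2D/3D difference concrete, here is a sketch of how the resampled shape follows from the spacings. The helper `resampled_shape` and its `keep_axis0` flag are hypothetical names used only for illustration.

```python
def resampled_shape(shape, original_spacing, target_spacing, keep_axis0=False):
    """New shape = old shape * original spacing / target spacing, per axis."""
    target = list(target_spacing)
    if keep_axis0:
        target[0] = original_spacing[0]     # 2D case: leave axis 0 untouched
    return [int(round(n * o / t)) for n, o, t in zip(shape, original_spacing, target)]


shape, original, target = (40, 512, 512), (5.0, 0.5, 0.5), (2.5, 1.0, 1.0)
print(resampled_shape(shape, original, target))                   # [80, 256, 256]
print(resampled_shape(shape, original, target, keep_axis0=True))  # [40, 256, 256]
```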

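12 After preprocessing, some useful information is collected for every case:

For example, for each of the three axes, the slices that contain each class.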

classes_in_slice = OrderedDict()
for axis in range(3):
    other_axes = tuple([i for i in range(3) if i != axis])
    classes_in_slice[axis] = OrderedDict()
    for c in all_classes:
        valid_slices = np.where(np.sum(seg_map == c, axis=other_axes) > 0)[0]
        classes_in_slice[axis][c] = valid_slices # props['classes_in_slice_per_axis']

And the number of voxels per class:

number_of_voxels_per_class = OrderedDict()
for c in all_classes:
    number_of_voxels_per_class[c] = np.sum(seg_map == c)

Part 2. Training

13 run_training.py starts by calling get_default_configuration to obtain some experiment settings, such as:

plans_file        : # the pkl file holding the plans
output_folder_name: # path for the output results
dataset_directory : # join(preprocessing_output_dir, task), the folder with the preprocessed data
batch_dice        : # True for 2d, 3d_fullres and 3d_cascade_fullres; False for 3d_lowres???
stage             : # which plan (stage) to use; the code checks that the arg you pass is valid. The four networks are ['2d', '3d_lowres', '3d_fullres', '3d_cascade_fullres']
trainer_class     : # the class of the trainer

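14 On initialization, nnUNetTrainerCascadeFullRes defines the folder that holds the lowres predictions. Everything else is inherited from the parent class nnUNetTrainer, which just declares a set of self attributes to be assigned and used later. [This part of nnUNet is nicely written; the structure is worth borrowing.]

if self.output_folder is not None: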
    self.folder_with_segs_prev_stage = join(network_training_output_dir, "3d_lowres", task, previous_trainer + "__" + plans_identifier, "pred_next_stage")
    self.folder_with_segs_from_prev_stage_for_train = join(self.dataset_directory, "segs_prev_stage")

15 Methods of nnUNetTrainerCascadeFullRes:

self.initialize:

1 process_plans:
    super(nnUNetTrainerCascadeFullRes, self).process_plans(plans) # fills in the self attributes declared at initialization
    self.num_input_channels += (self.num_classes - 1)  # for seg from prev stage; why -1???
2 setup_DA_params:
    # sets parameters related to data augmentation
    In nnUNetTrainer:
        self.data_aug_params['selected_seg_channels'] = [0]
    In addition, in nnUNetTrainerCascadeFullRes:
        self.data_aug_params['selected_seg_channels'] = [0, 1] # overrides the definition in the parent method
        self.data_aug_params['all_segmentation_labels'] = list(range(1, self.num_classes)) # ???
        self.data_aug_params['move_last_seg_chanel_to_data'] = True # ???
        self.data_aug_params['advanced_pyramid_augmentations'] = True # ???
3 if training:
    If the folder self.folder_with_segs_from_prev_stage_for_train exists, it is deleted. A fresh one is then created, and the npz files from self.folder_with_segs_from_prev_stage are copied into it.
    Then self.folder_with_segs_from_prev_stage_for_train is assigned to self.folder_with_segs_from_prev_stage. It looks convoluted, but it simply brings the latest results from the low_res folder into high_res training.
    After that the dataset and dataloader are defined. Cascade mode differs from the others in that has_prev_stage=True, and pad_mode is slightly different.

Apart from validate, the remaining operations of the cascade trainer are the same as in the ordinary trainer class.

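16 About force_fg (force_foreground) in dataset_loading:

3D: a foreground class is chosen at random first, and then one voxel of that class is chosen as the starting point [subject to a validity check on that point]

2D: some details differ, but the idea is the same as for 3D, except that it uses props['classes_in_slice_per_axis'] mentioned in item 12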
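A rough sketch of the 3D foreground sampling idea, assuming the patch is no larger than the image on any axis. The function name is hypothetical, and clamping the patch to the image bounds is a simplification of my own; the actual dataloader handles borders differently.

```python
import numpy as np


def sample_fg_patch_start(seg, patch_size, rng):
    """Pick a random foreground class, then a random voxel of it, and centre a patch there."""
    fg_classes = [c for c in np.unique(seg) if c > 0]
    if len(fg_classes) == 0:
        # no foreground in this case: fall back to a uniformly random patch
        return [int(rng.integers(0, s - p + 1)) for s, p in zip(seg.shape, patch_size)]
    chosen = rng.choice(fg_classes)
    voxels = np.argwhere(seg == chosen)          # (n, 3) coordinates of that class
    center = voxels[rng.integers(len(voxels))]
    # clamp so the patch stays inside the image (simplified validity check)
    return [int(np.clip(c - p // 2, 0, s - p)) for c, p, s in zip(center, patch_size, seg.shape)]


seg = np.zeros((20, 20, 20), dtype=np.int16)
seg[15, 3, 10] = 2
start = sample_fg_patch_start(seg, (8, 8, 8), np.random.default_rng(0))
print(start)
```

Whatever the random draw, the returned patch contains the sampled foreground voxel.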

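17 Notes on augmentation:

1 DataChannelSelectionTransform:
    Selects which channels of data to use. It is an entry in augmentation_params; the default is None, in which case the transform is not applied.
2 SegChannelSelectionTransform:
    Same as above, selecting which channels to use. The difference is a boolean keep_discarded_seg parameter that decides whether the unused channels are kept in the dict. It is [0] in the ordinary nnUNetTrainer and [0, 1] in the cascade, because the dataloader has already concatenated the previous stage's prediction onto the seg channels. Open question: when does it get merged into data?
*3 Convert3DTo2DTransform:
    Temporarily reshapes the channel and depth dimensions into a single channel*depth dimension and records the original shape in the dict.
    # only used when dummy_2D is True
*4 SpatialTransform:
    The workhorse: the usual spatial augmentations, elastic deformation, rotation and scaling.
    # Samples and channels are handled separately, i.e. two loops over the first two dimensions process one array at a time. The affine transform is computed and applied directly as a coordinate mapping.
*5 Convert2DTo3DTransform:
    After the spatial transform, reshapes back to the original dimensions, 2D->3D.
*6 GammaTransform:
    Gamma transform on data, so it is best to keep only image information under the data key of the dict and put everything else into seg.
*7 MirrorTransform:
    As the name says.
*8 MaskTransform:
    Similar to the step in preprocessing: sets data to 0 wherever seg is below 0.
    # The index of the mask within seg must be given; the code defaults to 0, i.e. the first channel.

9 RemoveLabelTransform:
    Changes some values in seg, e.g. turns every -1 into 0 (background).

*10 MoveSegAsOneHotToData:
    Cascade only, and the answer to the question in item 2: converts the required previous-stage prediction to one-hot form, concatenates it onto data, and removes the corresponding channel from seg. The prediction is in channel 1 by default.
    ## two more augmentations are not written up yet

11 RenameTransform:
    Renames keys in the dict, e.g. seg becomes target.
12 NumpyToTensor:
    Changes the type of the data in the dict from array to tensor; the default type is 'float'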
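The MoveSegAsOneHotToData step (item 10) is the least obvious of these, so here is a NumPy sketch of the idea. It is an illustration written for these notes, not the library's implementation, and the function name is hypothetical. It also suggests why num_input_channels grows by num_classes - 1 in item 15: list(range(1, num_classes)) has num_classes - 1 entries, one one-hot channel per foreground label.

```python
import numpy as np


def move_seg_as_one_hot_to_data(data, seg, channel_id, all_seg_labels):
    """Append seg[:, channel_id] to data as one-hot channels and drop it from seg."""
    prev_pred = seg[:, channel_id]                           # (b, x, y, z)
    one_hot = np.stack([prev_pred == label for label in all_seg_labels], axis=1)
    new_data = np.concatenate([data, one_hot.astype(data.dtype)], axis=1)
    keep = [i for i in range(seg.shape[1]) if i != channel_id]
    return new_data, seg[:, keep]


num_classes = 3                                   # background + 2 foreground labels
data = np.zeros((1, 2, 4, 4, 4), dtype=np.float32)
seg = np.zeros((1, 2, 4, 4, 4), dtype=np.float32)
seg[0, 1, 0, 0, 0] = 2                            # previous-stage prediction in channel 1
new_data, new_seg = move_seg_as_one_hot_to_data(data, seg, 1, list(range(1, num_classes)))
print(new_data.shape, new_seg.shape)              # (1, 4, 4, 4, 4) (1, 1, 4, 4, 4)
```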

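18 Notes on the statistics tracked during training:

train_losses_epoch = []     # loss returned by each training iteration [local variable]
self.all_tr_losses = []     # per epoch, the mean loss over all training iterations
val_losses = []             # loss returned by each validation iteration [local variable]
self.all_val_losses = []    # per epoch, the mean loss over all validation iterations

self.train_loss_MA = None ## with multi-task supervision, tracking only the main task here is enough,
    # because this value mainly drives the scheduling of the rest of the training
    # if it is None, it is set to self.all_tr_losses[-1];
    # once it is not None, it is set to 0.93 * self.train_loss_MA + 0.07 * self.all_tr_losses[-1]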
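The moving-average update described above, written out as a small function. The function name is hypothetical; the 0.93 / 0.07 weights are the ones quoted in the notes.

```python
def update_train_loss_ma(train_loss_ma, latest_epoch_loss, alpha=0.93):
    """Exponential moving average of the per-epoch training loss."""
    if train_loss_ma is None:
        return latest_epoch_loss          # first epoch: start from the raw value
    return alpha * train_loss_ma + (1 - alpha) * latest_epoch_loss


ma = None
for epoch_loss in [1.0, 0.0, 0.0]:
    ma = update_train_loss_ma(ma, epoch_loss)
print(ma)
```

The high weight on the old value makes the average react slowly, which smooths out epoch-to-epoch noise.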

############################ on_epoch_end ############################

############################ finish_online_evaluation #######################
# used together with run_online_evaluation inside run_iteration
self.online_eval_tp = []  # presumably the tp of all classes per case, later summed over axis 0
self.online_eval_fp = []  # presumably the fp of all classes per case
self.online_eval_fn = []  # presumably the fn of all classes per case
global_dc_per_class       # Dice per class: [dice, dice, dice, ...]

self.all_val_eval_metrics # a list storing the mean Dice over all classes; it is used by
                          # update_eval_criterion_MA

# reset at the end:
# self.online_eval_foreground_dc = [] # this one seems never to be used
# self.online_eval_tp = []
# self.online_eval_fp = []
# self.online_eval_fn = []
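A sketch of how per-class Dice can be computed from the accumulated counts, using the standard formula 2*TP / (2*TP + FP + FN). The function name is hypothetical, and skipping classes with an empty denominator is an assumption about how undefined values are handled.

```python
def global_dice_per_class(tp_batches, fp_batches, fn_batches):
    """Sum the counts over batches (axis 0), then compute one Dice value per class."""
    tp = [sum(col) for col in zip(*tp_batches)]
    fp = [sum(col) for col in zip(*fp_batches)]
    fn = [sum(col) for col in zip(*fn_batches)]
    dice = []
    for t, p, n in zip(tp, fp, fn):
        denominator = 2 * t + p + n
        if denominator > 0:               # class absent everywhere: Dice undefined, skip
            dice.append(2 * t / denominator)
    return dice


# two batches, two classes; the second class never appears
dice = global_dice_per_class([[8, 0], [2, 0]], [[1, 0], [1, 0]], [[2, 0], [0, 0]])
mean_dice = sum(dice) / len(dice)         # the value appended to all_val_eval_metrics
print(dice, mean_dice)
```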

############################ plot_progress ############################
Simple plotting of:
self.all_tr_losses
self.all_val_losses

### maybe_update_lr ###
For schedulers such as ReduceLROnPlateau, self.train_loss_MA is used

############################ maybe_save_checkpoint ############################

############################ update_eval_criterion_MA ############################
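# similar to self.train_loss_MA; tracking the main task's metric is enough
self.val_eval_criterion_MA = None:
    # While it is still None: if len(self.all_val_eval_metrics) == 0 it is set to
    # -self.all_val_losses[-1]; otherwise self.all_val_eval_metrics[-1] is used  # ... the metric
    # of the last class??? [harmless for binary segmentation, but not very sensible for multi-class]

    # once it is not None, same as self.train_loss_MA: 0.9 * old value + 0.1 * the new value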
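The same logic as a standalone sketch, including the fallback to the negated validation loss when no evaluation metric has been recorded. The function name is hypothetical, and applying the fallback on every update rather than only the first one is my assumption.

```python
def update_eval_criterion_ma(current_ma, all_val_eval_metrics, all_val_losses, alpha=0.9):
    """Moving average of the validation criterion (higher is better)."""
    if len(all_val_eval_metrics) == 0:
        new_value = -all_val_losses[-1]       # a loss is negated so that higher is better
    else:
        new_value = all_val_eval_metrics[-1]  # most recently appended metric
    if current_ma is None:
        return new_value
    return alpha * current_ma + (1 - alpha) * new_value


print(update_eval_criterion_ma(None, [], [0.4]))        # -0.4
print(update_eval_criterion_ma(0.5, [0.6, 0.7], [0.4]))
```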

############################ manage_patience ############################
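# Apart from recording a few best-so-far values, the decisions are driven mainly by each epoch's
# self.train_loss_MA and self.val_eval_criterion_MA
# The main job is to decide when to stop once lr has fallen below self.lr_threshold. [While lr is
# above self.lr_threshold, training always continues. The mechanism that adjusts lr in time lives
# inside the lr_scheduler and does not show up here.]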
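A heavily simplified sketch of that stopping rule. All names here are hypothetical, and pushing the reference epoch forward by half the patience while lr is still high is an assumption of mine about how the scheduler is given more time; only the "stop when patience is exhausted and lr is below the threshold" part comes from the notes above.

```python
def manage_patience_step(epoch, best_epoch, patience, lr, lr_threshold):
    """Return (continue_training, best_epoch) after one epoch."""
    if epoch - best_epoch > patience:
        if lr > lr_threshold:
            # lr is still high: keep training and let the scheduler lower it first
            best_epoch = epoch - patience // 2
        else:
            return False, best_epoch          # no recent improvement and lr is tiny: stop
    return True, best_epoch


print(manage_patience_step(100, 40, 50, lr=1e-3, lr_threshold=1e-6))  # (True, 75)
print(manage_patience_step(100, 40, 50, lr=1e-7, lr_threshold=1e-6))  # (False, 40)
```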

To be continued...