Brain PET Image Analysis and Disease Prediction Challenge (DataWhale AI Training Camp Phase 1, CV Direction)

Introduction to the competition

As the title says, this post is about the Brain PET Image Analysis and Disease Prediction Challenge, the CV-direction task of phase 1 of the DataWhale training camp. It is a long-running competition. Within the training camp, DataWhale trained students in three stages: a baseline walkthrough and testing, a CNN-based method, and a method based on Baidu PaddlePaddle.
The competition link is as follows: Competition link

Note: all code may be uploaded to GitHub later, depending on the situation.

Personal plan

I already have a little background, so I will not summarize the earlier training-camp content (setting up the environment, using PaddlePaddle, and so on); a quick read-through was enough. However, since this is my first time taking part in a competition like this and I have almost no hands-on experience, the ideas in this article are essentially based on write-ups found online and suggestions from groupmates [1]. Starting from the CNN method, I used New Bing to help me write and tune parameters bit by bit. The final score ended up only slightly better than the CNN baseline.

Data processing (denoising)

Since the visualized images initially appeared to contain some noise, I tried a few basic denoising methods, shown below, although they did not seem to make much difference to the results.

    import cv2
    import numpy as np

    # Convert the image into a grayscale image (image here is a PIL image or RGB array)
    image = np.array(image)
    gray = cv2.cvtColor(image, cv2.COLOR_RGB2GRAY)
    # Non-local means denoising
    gray = cv2.fastNlMeansDenoising(gray)
    # Median filter denoising
    gray = cv2.medianBlur(gray, 3)
    # Gaussian filter denoising
    gray = cv2.GaussianBlur(gray, (3, 3), 0)

Data processing (adaptive cropping)

For adaptive cropping of the denoised images, the main idea is to extract the contour of the image content, crop the image to the contour's bounding box, and then pad the shorter side to match the longer one, yielding a square image so that the subsequent resize introduces no distortion. The specific code is as follows:

import os
from datetime import datetime

import cv2
import numpy as np
from PIL import Image

def crop_brain_contour(image, plot=False):
    """
    This function takes an image and returns the image with only the brain part.
    """
    # Convert the image into a grayscale image
    image = np.array(image)
    gray = cv2.cvtColor(image, cv2.COLOR_RGB2GRAY)
    # Non-local means denoising
    gray = cv2.fastNlMeansDenoising(gray)
    # Median filter denoising
    gray = cv2.medianBlur(gray, 3)
    # Gaussian filter denoising
    gray = cv2.GaussianBlur(gray, (3, 3), 0)
    # Threshold the image with Otsu's method
    ret, thresh = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)
    # Alternative: triangle thresholding
    # ret, thresh = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_TRIANGLE)
    # Alternative: adaptive thresholding (mean)
    # thresh = cv2.adaptiveThreshold(gray, 255, cv2.ADAPTIVE_THRESH_MEAN_C, cv2.THRESH_BINARY, 11, 2)
    # Alternative: adaptive thresholding (Gaussian)
    # thresh = cv2.adaptiveThreshold(gray, 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C, cv2.THRESH_BINARY, 11, 2)
    # Find the contours of the image
    contours, hierarchy = cv2.findContours(thresh.astype(np.uint8), cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)[-2:] 
    # Select the biggest contour
    cnt = max(contours, key=cv2.contourArea) 
    # Get the coordinates of the bounding box of the contour
    x, y, w, h = cv2.boundingRect(cnt)   
    # Crop the image using the coordinates of the bounding box
    crop = image[y:y+h, x:x+w]

    # Compute the border sizes needed to pad the crop to a square
    top = bottom = left = right = 0
    if h > w:
        diff = h - w
        left = right = diff // 2
    else:
        diff = w - h
        top = bottom = diff // 2
    # Pad the image with cv2.copyMakeBorder
    bordered_image = cv2.copyMakeBorder(crop, top, bottom, left, right, cv2.BORDER_CONSTANT, value=0)
    if plot:
        # Get the current time and format it as a string
        timestamp = datetime.now().strftime('%Y-%m-%d_%H-%M-%S')
        # Build the file name
        filename = f'cropped_image_{timestamp}.png'
        # Build the file path
        filepath = os.path.join('images', filename)
        pil_img = Image.fromarray(bordered_image)
        pil_img.save(filepath)
    return bordered_image
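
To illustrate how this function might be called, here is a minimal usage sketch reusing the imports above; the file path and the 224-pixel target size are my assumptions, not values from the competition code:

img = Image.open('train/NC/example.png').convert('RGB')  # hypothetical path
square = crop_brain_contour(img)         # cropped to the brain and zero-padded to a square
square = cv2.resize(square, (224, 224))  # a square input resizes without distortion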

Data augmentation (multiple random slicing)

The original data is in the three-dimensional NIfTI (.nii) format, and the CNN code provided by the training camp only extracts some of the channels, i.e., turns the volume into 2D images by slicing. To match the model input I set the number of channels to 3, which creates a problem: if a volume originally has, say, 60 slices, a lot of information is thrown away. So instead of sampling channels once per volume, I sample randomly several times. This reduces the information loss to some extent and doubles as data augmentation. This step gave a fairly noticeable improvement. The implementation code is as follows:

import nibabel as nib
import numpy as np
import torch
from PIL import Image
from torch.utils.data import Dataset

DATA_CACHE = {}
class XunFeiDataset(Dataset):
    def __init__(self, img_path, transform=None, num_slices=1):
        self.img_path = img_path
        self.num_slices = num_slices
        self.transform = transform
    
    def __getitem__(self, index):
        if self.img_path[index] in DATA_CACHE:
            img = DATA_CACHE[self.img_path[index]]
        else:
            img = nib.load(self.img_path[index]) 
            img = img.dataobj[:,:,:, 0]
            DATA_CACHE[self.img_path[index]] = img

        # Randomly sample 3 channels, repeated num_slices times
        images = []
        for _ in range(self.num_slices):
            idx = np.random.choice(range(img.shape[-1]), 3)
            slice_img = img[:, :, idx]

            # Convert the 3-channel slice array into a PIL image
            slice_img = slice_img.astype(np.uint8)
            pil_img = Image.fromarray(slice_img)
            cropped_img = crop_brain_contour(pil_img)
            slice_img = np.array(cropped_img)

            slice_img = slice_img.astype(np.float32)

            if self.transform is not None:
                slice_img = self.transform(image=slice_img)['image']

            slice_img = slice_img.transpose([2,0,1])
            images.append(slice_img)

        images = np.stack(images)
               
        return images, torch.from_numpy(np.array(int('NC' in self.img_path[index])))
    
    def __len__(self):
        return len(self.img_path)
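
Note that __getitem__ returns a stack of num_slices slice-images per sample, so a DataLoader batch has shape (batch, num_slices, 3, H, W). Below is a minimal sketch of one way to fold the slice dimension into the batch before the forward pass; the glob pattern, batch size, and the assumption that a transform resizes every slice to a common size (otherwise collation fails) are mine:

import glob
from torch.utils.data import DataLoader

train_loader = DataLoader(
    XunFeiDataset(glob.glob('train/*/*.nii'), transform=None, num_slices=4),
    batch_size=2, shuffle=True,
)

for imgs, labels in train_loader:
    b, s, c, h, w = imgs.shape
    imgs = imgs.reshape(b * s, c, h, w)   # fold slices into the batch dimension
    labels = labels.repeat_interleave(s)  # repeat the label once per slice
    break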

Parameter tuning (learning rate, number of iterations, etc.)

In terms of model selection, I tried resnet18, resnet50, resnet101, efficientnet_b7, and efficientnetv2_l. Generally speaking, resnet50 and efficientnet_b7 performed better; the differences among the others were not particularly large, though some over-fitting and under-fitting did appear. The code is shown below:

import timm
import torch.nn as nn

class XunFeiNet(nn.Module):
    def __init__(self):
        super(XunFeiNet, self).__init__()
        
        model = timm.create_model('tf_efficientnet_b7_ns', pretrained=True)
        # model = timm.create_model('tf_efficientnetv2_l', pretrained=True)
        model.classifier = nn.Linear(model.classifier.in_features, 2)
        self.dropout = nn.Dropout(p=0.5)
        self.efficientnet = model
        
    def forward(self, img):        
        out = self.efficientnet(img)
        out = self.dropout(out)
        return out
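
A quick shape sanity check (purely illustrative; the 224-pixel input is my assumption, and EfficientNet-B7 is normally run at larger resolutions):

import torch

net = XunFeiNet()
net.eval()
with torch.no_grad():
    logits = net(torch.randn(2, 3, 224, 224))
print(logits.shape)  # torch.Size([2, 2]) -- two classes, MCI and NC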

For the learning rate, I chose a smaller value, since with larger values the loss was hard to bring down and I have not found a better fix for that yet. I used the AdamW optimizer with a weight-decay coefficient. This part probably varies from person to person and is for reference only. The specific code is as follows:

model = XunFeiNet()
# model = CLIPnet(num_classes=2)
# print(model)
device = 'cuda:1'
model = model.to(device)
# criterion = nn.CrossEntropyLoss().to(device)
weights = [0.15, 0.85]  # per-class weights (index 0: MCI, index 1: NC)
weights = torch.tensor(weights).to(device)
criterion = nn.CrossEntropyLoss(weight=weights).to(device)
optimizer = torch.optim.AdamW(model.parameters(), 0.00001, weight_decay=0.0001)
# optimizer = torch.optim.SGD(model.parameters(), 0.01)

In addition, regarding the loss design: after several submissions it became clear that the test set probably contains fewer MCI samples. During training the accuracy was often high, yet the predictions contained too many MCI labels. I therefore weighted the loss so that the model pays more attention to learning the NC class. An L2 regularization term was also added to the training loss. Judging from the results there is some improvement, but the effect is not very obvious.
The relevant code is as follows:

def train(train_loader, model, criterion, optimizer, l2_lambda=1e-6):
    model.train()
    train_loss = 0.0
    for i, (input, target) in enumerate(train_loader):
        input = input.to(device, non_blocking=True)
        target = target.to(device, non_blocking=True)
        output = model(input)
        loss = criterion(output, target)
        l2_norm = sum(p.pow(2.0).sum() for p in model.parameters())
        loss += l2_lambda * l2_norm
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        # if i % 20 == 0:
        #     print(loss.item())      
        train_loss += loss.item()
    return train_loss/len(train_loader)
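
For completeness, here is a minimal sketch of the surrounding epoch loop, plus one way to aggregate the multiple random slices at prediction time by averaging softmax probabilities; the epoch count, test_loader, and the aggregation scheme are my assumptions rather than the exact submission code:

for epoch in range(20):
    avg_loss = train(train_loader, model, criterion, optimizer)
    print(f'epoch {epoch}: train loss {avg_loss:.4f}')

model.eval()
preds = []
with torch.no_grad():
    for imgs, _ in test_loader:
        b, s, c, h, w = imgs.shape
        out = model(imgs.reshape(b * s, c, h, w).to(device))
        probs = out.view(b, s, -1).softmax(dim=-1).mean(dim=1)  # average over slices
        preds.append(probs.argmax(dim=-1).cpu())  # 0 = MCI, 1 = NC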

Summary

There are some other code details that I will not go into here; I will refine them later when I have time. After a week of tinkering, I only barely surpassed the score most students get by simply predicting NC for everything, so I do not feel I have a great solution to share; the above is essentially a reproduction of, and reflection on, other people's solutions, offered for reference. Overall, the focus of the code should still be data preprocessing and data augmentation. I had also wanted to optimize in a multi-modal direction, for example combining frequency-domain and pixel-domain information, or image and text information, and experimenting with models such as CLIP, but time and ability are limited (mainly, group-meeting pressure was too great, so I slacked off) and I did not get to it. I will add it once it is done; it should bring some improvement.

In short, this was my first taste of a competition, and it counts as hands-on experience with the whole process of this kind of contest. I have finally done what I did not manage to do back in college. Although there are always twists and turns, I am moving forward. Keep working hard.

Reference links and literature:

1. A senior's reference solution
