1 Siamese Network (Twin Network)

A Siamese (twin) network is mainly used to measure how similar two inputs are. It takes two inputs (Input1 and Input2) and feeds each one into a neural network (Network1 and Network2). Each network maps its input into a new space, producing a representation of that input. The loss is then computed from the two representations and evaluates the similarity of the two inputs. See the Reference at the end for details.

A Siamese network is effectively a single network, because the two networks (Network1 and Network2) have the same structure and share the same weights. If the two networks have different structures or do not share weights, the model is called a pseudo-Siamese network.
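
The difference between the two variants can be sketched in a few lines of PyTorch. This is only an illustration: the layer sizes and the names shared_branch, branch_a and branch_b are made up for the example and do not come from the project code.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Siamese: ONE module is applied to both inputs, so the weights are shared.
shared_branch = nn.Linear(4, 2)

# Pseudo-Siamese: two separate modules, each with its own weights.
branch_a = nn.Linear(4, 2)
branch_b = nn.Linear(4, 2)

x = torch.randn(3, 4)  # the same input is fed to both branches

siamese_out1, siamese_out2 = shared_branch(x), shared_branch(x)
pseudo_out1, pseudo_out2 = branch_a(x), branch_b(x)

# Shared weights: identical inputs give identical representations.
same_for_siamese = torch.equal(siamese_out1, siamese_out2)
# Independent weights: identical inputs generally give different representations.
same_for_pseudo = torch.equal(pseudo_out1, pseudo_out2)

# Count the trainable parameters of each variant.
siamese_params = sum(p.numel() for p in shared_branch.parameters())
pseudo_params = sum(p.numel() for p in branch_a.parameters()) \
    + sum(p.numel() for p in branch_b.parameters())
```

With shared weights the parameter count is that of a single branch, which is why a Siamese network can be implemented as one module called twice.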

Several loss functions can be used with a Siamese network:

Contrastive loss (the traditional choice for Siamese networks);
Triplet loss (see Deep metric learning using Triplet network);
Softmax loss: treat the task as binary classification, mapping the absolute difference of the two outputs to a single output node;
Other losses, such as cosine loss, exp function, Euclidean distance and so on.
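
To make the first two options concrete, here is a small pure-Python sketch that evaluates them on hand-picked 2-D vectors. The contrastive form and its default margin of 2.0 follow the ContrastiveLoss class used later in this post (label 0 = same, 1 = different). The triplet form is a common hinge variant with an assumed margin of 1.0; it is not necessarily the exact formulation of the paper cited above.

```python
import math

def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def contrastive_loss(u, v, label, margin=2.0):
    """label 0: same class, penalize distance. label 1: different, penalize closeness within the margin."""
    d = euclidean(u, v)
    return (1 - label) * d ** 2 + label * max(margin - d, 0.0) ** 2

def triplet_loss(anchor, positive, negative, margin=1.0):
    """Hinge-style triplet loss (assumed form): the positive should be closer than the negative by `margin`."""
    return max(euclidean(anchor, positive) - euclidean(anchor, negative) + margin, 0.0)

origin = [0.0, 0.0]
near = [1.0, 0.0]   # distance 1 from origin
far = [3.0, 4.0]    # distance 5 from origin

same_near = contrastive_loss(origin, near, label=0)   # 1**2 = 1
diff_near = contrastive_loss(origin, near, label=1)   # (2 - 1)**2 = 1
diff_far = contrastive_loss(origin, far, label=1)     # already beyond the margin: 0
good_triplet = triplet_loss(origin, near, far)        # max(1 - 5 + 1, 0) = 0
bad_triplet = triplet_loss(origin, far, near)         # max(5 - 1 + 1, 0) = 5
```

A different pair that is already farther apart than the margin contributes no loss, so training effort goes to the pairs that are still confused.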

2 Project code

Using the ORL data set, we perform face verification with a Siamese network.

The project code can be divided into five parts:

Prelude: import the relevant libraries and define parameters and helper functions;
Data preparation: prepare the data set and wrap it in a Dataset and a DataLoader;
Model preparation: build the model and define a custom loss function;
Training: train the model and plot the results;
Test: visualize the model's results.

2.1 Prelude

First import the relevant libraries and define the helper functions and parameters.

import torch
import torchvision
import torch.nn as nn
from torch import optim
import torch.nn.functional as F
import torchvision.transforms as transforms
from torch.utils.data import DataLoader,Dataset
import matplotlib.pyplot as plt
import torchvision.utils
import numpy as np
import random
from PIL import Image
import PIL.ImageOps 
print(torch.__version__)  #1.1.0
print(torchvision.__version__)  #0.3.0


# Define some hyperparameters
train_batch_size = 32        # batch size used during training
train_number_epochs = 50     # number of training epochs

def imshow(img,text=None,should_save=False): 
    # Display a tensor image; the input has shape (C,H,W)
    npimg = img.numpy() # convert the tensor to an ndarray
    plt.axis("off")
    if text:
        plt.text(75, 8, text, style='italic',fontweight='bold',
            bbox={'facecolor':'white', 'alpha':0.8, 'pad':10})
    plt.imshow(np.transpose(npimg, (1, 2, 0))) # convert to (H,W,C)
    plt.show()    

def show_plot(iteration,loss):
    # Plot how the loss changes over the iterations
    plt.plot(iteration,loss)
    plt.show()

2.2 Preparing Data

2.2.1 ORL face data set

The ORL face data set contains 400 images of 40 different people, collected by the Olivetti Research Laboratory in Cambridge, England, between April 1992 and April 1994. The data set consists of 40 directories, one per person, each holding 10 images. All images are stored as grayscale PGM files, 92 pixels wide and 112 pixels high. The images in each directory were taken at different times and vary in lighting, facial expression (eyes open / eyes closed, smiling / not smiling), and facial details (glasses / no glasses). All images were taken against a dark, uniform background with the subject in a frontal position (with some slight sideways movement).

In the downloaded data set, the training folder contains the images of 37 people, and the testing folder contains the images of the remaining 3 people, which are set aside for the test stage later.

2.2.2 Custom Dataset and DataLoader

A custom Dataset needs to implement the __getitem__ and __len__ methods. Each item is a pair of images plus a label indicating whether they differ: 0 means the same person, 1 means different people.

# Custom Dataset class; each call to __getitem__(self,index) returns (img0, img1, label) with label 0/1
class SiameseNetworkDataset(Dataset):
    
    def __init__(self,imageFolderDataset,transform=None,should_invert=True):
        self.imageFolderDataset = imageFolderDataset    
        self.transform = transform
        self.should_invert = should_invert
        
    def __getitem__(self,index):
        img0_tuple = random.choice(self.imageFolderDataset.imgs) # pick a random image (index is ignored); its class is img0_tuple[1]
        should_get_same_class = random.randint(0,1) # so that roughly half of the pairs are same-class
        if should_get_same_class:
            while True:
                # keep sampling until an image of the same class is found
                img1_tuple = random.choice(self.imageFolderDataset.imgs) 
                if img0_tuple[1]==img1_tuple[1]:
                    break
        else:
            while True:
                # keep sampling until an image of a different class is found
                img1_tuple = random.choice(self.imageFolderDataset.imgs) 
                if img0_tuple[1] !=img1_tuple[1]:
                    break

        img0 = Image.open(img0_tuple[0])
        img1 = Image.open(img1_tuple[0])
        img0 = img0.convert("L")
        img1 = img1.convert("L")
        
        if self.should_invert:
            img0 = PIL.ImageOps.invert(img0)
            img1 = PIL.ImageOps.invert(img1)

        if self.transform is not None:
            img0 = self.transform(img0)
            img1 = self.transform(img1)
        
        return img0, img1, torch.from_numpy(np.array([int(img1_tuple[1]!=img0_tuple[1])],dtype=np.float32))
    
    def __len__(self):
        return len(self.imageFolderDataset.imgs)
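
The retry loops in __getitem__ above keep drawing random images until the class condition is met. As an alternative sketch (not the author's method), the file list can be grouped by class once, so that a pair is drawn without retries. The helper names build_class_index and sample_pair and the file list below are hypothetical; only the (path, class_id) tuple format mirrors ImageFolder.imgs.

```python
import random
from collections import defaultdict

def build_class_index(samples):
    """Group image paths by class id; `samples` is a list of (path, class_id) tuples."""
    by_class = defaultdict(list)
    for path, class_id in samples:
        by_class[class_id].append(path)
    return dict(by_class)

def sample_pair(by_class, rng=random):
    """Return (path0, path1, label): label 0 for the same person, 1 for different people."""
    want_same = rng.randint(0, 1)              # about half of the pairs are same-class
    class0 = rng.choice(list(by_class))
    if want_same:
        class1 = class0
    else:
        class1 = rng.choice([c for c in by_class if c != class0])
    path0 = rng.choice(by_class[class0])
    path1 = rng.choice(by_class[class1])       # may equal path0 for same-class pairs
    return path0, path1, int(class0 != class1)

# Hypothetical file list: 3 people with 10 images each.
samples = [(f"s{c}/{i}.pgm", c) for c in range(3) for i in range(10)]
by_class = build_class_index(samples)
pairs = [sample_pair(by_class, random.Random(seed)) for seed in range(200)]
```

Like the original class, this sketch can return the same image twice in a same-class pair. Picking the class uniformly matches the original behaviour only when every person has the same number of images, as is the case for ORL.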
    
    
    
# Define the folder dataset
training_dir = "./data/faces/training/"  # path to the training set
folder_dataset = torchvision.datasets.ImageFolder(root=training_dir)

# Define the image dataset
transform = transforms.Compose([transforms.Resize((100,100)), # caution: passing an int and passing a tuple behave differently
                                transforms.ToTensor()])
siamese_dataset = SiameseNetworkDataset(imageFolderDataset=folder_dataset,
                                        transform=transform,
                                        should_invert=False)

# Define the image DataLoader
train_dataloader = DataLoader(siamese_dataset,
                            shuffle=True,
                            batch_size=train_batch_size)
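
The comment on transforms.Resize above warns that an int and a tuple are treated differently. The pure-Python sketch below mimics the documented size rule: an int sets the shorter edge and keeps the aspect ratio, while a tuple is read as (height, width). The function resize_output_size is a hypothetical helper written for this illustration, and the exact rounding inside torchvision may differ between versions.

```python
def resize_output_size(width, height, size):
    """Return the (width, height) that Resize would produce, following its documented rule.

    int   -> the shorter edge becomes `size`, the aspect ratio is kept
    tuple -> interpreted as (height, width), the aspect ratio is ignored
    """
    if isinstance(size, int):
        if width <= height:
            return size, int(size * height / width)
        return int(size * width / height), size
    out_height, out_width = size
    return out_width, out_height

orl_width, orl_height = 92, 112                      # ORL image size from section 2.2.1

int_result = resize_output_size(orl_width, orl_height, 100)
tuple_result = resize_output_size(orl_width, orl_height, (100, 100))
```

Under these assumptions Resize(100) would yield images 100 wide and 121 high, not 100 x 100, so the first fully connected layer of the model in section 2.3, which expects 8*100*100 inputs, would no longer match.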

2.2.3 Visualizing the data set

It is usually worth looking at what the data actually looks like first. This code can be skipped in an actual run.

vis_dataloader = DataLoader(siamese_dataset,
                        shuffle=True,
                        batch_size=8)
example_batch = next(iter(vis_dataloader)) # fetch one batch of image pairs
# example_batch[0] has shape torch.Size([8, 1, 100, 100])
concatenated = torch.cat((example_batch[0],example_batch[1]),0) 
imshow(torchvision.utils.make_grid(concatenated, nrow=8))
print(example_batch[2].numpy())

A note on torchvision.utils.make_grid: it combines several images into a single image by tiling them in a grid. The input is a four-dimensional tensor (B, C, H, W); a list of images of the same size is also accepted. The result still has to go through numpy() and transpose() before plt can display it.

# https://pytorch.org/docs/stable/_modules/torchvision/utils.html#make_grid
torchvision.utils.make_grid(tensor, nrow=8, padding=2, normalize=False, range=None, scale_each=False, pad_value=0)

# Example
t = torchvision.utils.make_grid(concatenated, nrow=8)
concatenated.size()  #torch.Size([16, 1, 100, 100])
t.size() #torch.Size([3, 206, 818]); for a (batch,1,H,W) tensor the single channel is repeated three times, see the source in the official docs
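
The height and width in torch.Size([3, 206, 818]) above follow from the grid layout: each cell is the image plus padding, and one more padding strip closes the grid. The sketch below reproduces that arithmetic in pure Python; grid_size is a hypothetical helper, and it covers only the case of two or more images.

```python
import math

def grid_size(num_images, height, width, nrow=8, padding=2):
    """Height and width of the tiled grid for num_images >= 2 images of equal size."""
    cols = min(nrow, num_images)            # images per row
    rows = math.ceil(num_images / cols)     # number of rows needed
    grid_height = rows * (height + padding) + padding
    grid_width = cols * (width + padding) + padding
    return grid_height, grid_width

# 16 images of 100 x 100, 8 per row, as in the example above
grid_h, grid_w = grid_size(16, 100, 100, nrow=8, padding=2)

# 2 images side by side, as used later in the test section
pair_h, pair_w = grid_size(2, 100, 100)
```

Two rows of 102 pixels plus 2 give 206, and eight columns of 102 pixels plus 2 give 818.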

2.3 Preparing the model

Define the custom model and the loss function.

# Build the model
class SiameseNetwork(nn.Module):
    def __init__(self):
        super().__init__()
        self.cnn1 = nn.Sequential(
            nn.ReflectionPad2d(1),
            nn.Conv2d(1, 4, kernel_size=3),
            nn.ReLU(inplace=True),
            nn.BatchNorm2d(4),
            
            nn.ReflectionPad2d(1),
            nn.Conv2d(4, 8, kernel_size=3),
            nn.ReLU(inplace=True),
            nn.BatchNorm2d(8),

            nn.ReflectionPad2d(1),
            nn.Conv2d(8, 8, kernel_size=3),
            nn.ReLU(inplace=True),
            nn.BatchNorm2d(8),
        )

        self.fc1 = nn.Sequential(
            nn.Linear(8*100*100, 500),
            nn.ReLU(inplace=True),

            nn.Linear(500, 500),
            nn.ReLU(inplace=True),

            nn.Linear(500, 5))

    def forward_once(self, x):
        output = self.cnn1(x)
        output = output.view(output.size()[0], -1)
        output = self.fc1(output)
        return output

    def forward(self, input1, input2):
        output1 = self.forward_once(input1)
        output2 = self.forward_once(input2)
        return output1, output2
    
    
# Custom ContrastiveLoss
class ContrastiveLoss(torch.nn.Module):
    """
    Contrastive loss function.
    Based on: http://yann.lecun.com/exdb/publis/pdf/hadsell-chopra-lecun-06.pdf
    """

    def __init__(self, margin=2.0):
        super(ContrastiveLoss, self).__init__()
        self.margin = margin

    def forward(self, output1, output2, label):
        euclidean_distance = F.pairwise_distance(output1, output2, keepdim = True)
        loss_contrastive = torch.mean((1-label) * torch.pow(euclidean_distance, 2) +
                                      (label) * torch.pow(torch.clamp(self.margin - euclidean_distance, min=0.0), 2))

        return loss_contrastive
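
The input size 8*100*100 of the first nn.Linear layer above depends on the convolution stack preserving the 100 x 100 resolution. A short sketch of the standard convolution size formula shows why; conv2d_out is a hypothetical helper, and the reflection padding of 1 is modelled as ordinary padding, which has the same effect on the output size.

```python
def conv2d_out(size, kernel_size, padding=0, stride=1):
    """Output length along one spatial dimension of a convolution."""
    return (size + 2 * padding - kernel_size) // stride + 1

# Three blocks of ReflectionPad2d(1) followed by a 3x3 convolution
side = 100
for _ in range(3):
    side = conv2d_out(side, kernel_size=3, padding=1)

flattened = 8 * side * side          # 8 output channels in the last block

# Without the padding, each 3x3 convolution would shrink the image by 2 pixels
unpadded = 100
for _ in range(3):
    unpadded = conv2d_out(unpadded, kernel_size=3)
```

If the input resolution or the padding changes, the first argument of nn.Linear has to be recomputed in the same way.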

2.4 Training

net = SiameseNetwork().cuda() # create the model and move it to the GPU
criterion = ContrastiveLoss() # define the loss function
optimizer = optim.Adam(net.parameters(), lr = 0.0005) # define the optimizer

counter = []
loss_history = [] 
iteration_number = 0


# Start training
for epoch in range(0, train_number_epochs):
    for i, data in enumerate(train_dataloader, 0):
        img0, img1 , label = data
        # img0 has shape torch.Size([32, 1, 100, 100]), where 32 is the batch size; label has shape torch.Size([32, 1])
        img0, img1 , label = img0.cuda(), img1.cuda(), label.cuda() # move the data to the GPU
        optimizer.zero_grad()
        output1,output2 = net(img0, img1)
        loss_contrastive = criterion(output1, output2, label)
        loss_contrastive.backward()
        optimizer.step()
        if i % 10 == 0 :
            iteration_number +=10
            counter.append(iteration_number)
            loss_history.append(loss_contrastive.item())
    print("Epoch number: {} , Current loss: {:.4f}\n".format(epoch,loss_contrastive.item()))
    
show_plot(counter, loss_history)
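
The training code above calls .cuda() directly, which assumes that a GPU is available. A minimal sketch of a device-agnostic variant follows; the tiny linear model and the random batch are stand-ins invented for the example, not part of the project.

```python
import torch
import torch.nn as nn

# Use the GPU when one is available, otherwise fall back to the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Hypothetical stand-in model and batch, only to show the device handling.
model = nn.Linear(10, 5).to(device)
batch = torch.randn(4, 10).to(device)

output = model(batch)
```

In the training loop, the same pattern would replace each .cuda() call with .to(device).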

2.5 Test

Now test with the images of the three people in the testing folder. Note: the model has never seen images of these three people.

# Define the test dataset and dataloader

# Define the folder dataset
testing_dir = "./data/faces/testing/"  # path to the test set
folder_dataset_test = torchvision.datasets.ImageFolder(root=testing_dir)

# Define the image dataset
transform_test = transforms.Compose([transforms.Resize((100,100)), 
                                     transforms.ToTensor()])
siamese_dataset_test = SiameseNetworkDataset(imageFolderDataset=folder_dataset_test,
                                        transform=transform_test,
                                        should_invert=False)

# Define the image DataLoader
test_dataloader = DataLoader(siamese_dataset_test,
                            shuffle=True,
                            batch_size=1)


# Generate comparison images
net.eval() # use the running BatchNorm statistics at test time
dataiter = iter(test_dataloader)
x0,_,_ = next(dataiter)

for i in range(10):
    _,x1,label2 = next(dataiter)
    concatenated = torch.cat((x0,x1),0)
    with torch.no_grad(): # no gradients are needed for evaluation
        output1,output2 = net(x0.cuda(),x1.cuda())
    euclidean_distance = F.pairwise_distance(output1, output2)
    imshow(torchvision.utils.make_grid(concatenated),'Dissimilarity: {:.2f}'.format(euclidean_distance.item()))
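
The loop above only displays the distance for each pair. To turn it into a verification decision, one simple option is to compare the distance with a threshold. The sketch below is pure Python; the distances, labels and threshold values are made-up illustrative numbers, not results from the trained model.

```python
def is_same_person(distance, threshold):
    """Declare a match when the embedding distance falls below the threshold."""
    return distance < threshold

def accuracy_at(distances, labels, threshold):
    """Fraction of correct decisions; labels use 0 = same person, 1 = different people."""
    correct = sum(
        int(is_same_person(d, threshold) == (label == 0))
        for d, label in zip(distances, labels)
    )
    return correct / len(distances)

# Hypothetical distances and ground-truth labels for four pairs
distances = [0.3, 0.5, 1.8, 2.1]
labels = [0, 0, 1, 1]

acc_loose = accuracy_at(distances, labels, threshold=1.0)   # all four pairs decided correctly
acc_strict = accuracy_at(distances, labels, threshold=0.4)  # the 0.5 pair is wrongly rejected
```

In practice the threshold would be chosen on held-out pairs, for example by scanning candidate values and keeping the one with the best accuracy.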

Reference

PyTorch practice project 3: Siamese Network (Twin Network)