Python Meta-Learning - Implementation of General Artificial Intelligence Chapter 2 Reading Notes

Code of this book: https://github.com/sudharsan13296/Hands-On-Meta-Learning-With-Python
This book’s ISBN number: 9787115539670

Chapter 1: Meta-learning

Chapter 2: Metric-based one-shot learning algorithm: Siamese network

2.1 What is a Siamese network

One-shot learning learns from only one training instance per category. Siamese networks are mainly used in applications where each category has few data points, and they can learn from those few data points.

A Siamese network consists of two symmetric neural networks that share the same weights and architecture and are joined at the end by an energy function E.

We feed image X1 to network A and image X2 to network B. The role of both networks is to generate embeddings for the input images, that is, feature vectors. We then feed these embeddings to the energy function, which gives us the similarity between the two inputs. The energy function can be basically any similarity measure, such as Euclidean distance or cosine similarity.

2.1.1 Architecture of the Siamese network

We use Euclidean distance as the energy function. If X1 and X2 are similar, the value of E is small. If the inputs are not similar, the value of E is large.
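
To make the "small E for similar inputs" idea concrete, here is a minimal NumPy sketch of the two energy functions mentioned above. The three-dimensional embeddings are made-up values for illustration only, and the helper names euclidean_energy and cosine_similarity are my own, not from the book.

```python
import numpy as np

def euclidean_energy(a, b):
    """Euclidean distance: small when the embeddings are similar."""
    return float(np.sqrt(np.sum((a - b) ** 2)))

def cosine_similarity(a, b):
    """Cosine similarity: close to 1 when the embeddings point the same way."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Hypothetical embeddings, made up for illustration.
emb_x1 = np.array([1.0, 2.0, 3.0])
emb_similar = np.array([1.1, 2.0, 2.9])
emb_different = np.array([-3.0, 0.5, -1.0])

e_similar = euclidean_energy(emb_x1, emb_similar)
e_different = euclidean_energy(emb_x1, emb_different)
print(e_similar < e_different)
```

Note that the two measures run in opposite directions: a smaller Euclidean distance means more similar, while a larger cosine similarity means more similar.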

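The inputs of the Siamese network (X1, X2) come as pairs with a binary label Y that indicates whether the pair is similar or different. The contrastive loss is:
Loss = Y * E^2 + (1 - Y) * max(margin - E, 0)^2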
In the above formula, Y represents the true label, which is 1 when the two inputs are similar and 0 when they are dissimilar. E is the energy function, which can be any distance metric. The variable margin holds the constraint: when two inputs are not similar and their distance is greater than the margin, there is no loss.
When E < margin and Y = 0, Loss = (margin - E)^2. The network is trained to minimize this loss, so E is pushed toward values greater than margin. For two dissimilar vectors, the farther apart the better.
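
The cases described above can be checked with a few lines of plain Python. This is only a sketch of the per-pair loss; pair_loss is a hypothetical helper name, and margin = 1 is assumed to match the code later in these notes.

```python
def pair_loss(y, e, margin=1.0):
    """Contrastive loss for one pair: y is 1 (similar) or 0 (dissimilar), e is the distance."""
    return y * e ** 2 + (1 - y) * max(margin - e, 0.0) ** 2

print(pair_loss(1, 0.2))  # similar pair, small distance: small loss
print(pair_loss(0, 0.2))  # dissimilar pair inside the margin: penalized
print(pair_loss(0, 1.5))  # dissimilar pair beyond the margin: no loss
```

The last call shows the role of the margin: once a dissimilar pair is farther apart than margin, the loss for that pair is exactly zero and nothing more is learned from it.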

2.1.2 Applications of the Siamese network

The goal of the signature verification task is to decide whether a signature is authentic. The Siamese network is trained with positive signature pairs and negative signature pairs. Features are extracted from each signature with a convolutional network, and similarity is judged by measuring the distance between the two feature vectors. When a new signature appears, its features are extracted and compared with the stored feature vector of the signer. If the distance is less than a certain threshold, the signature is accepted as authentic; otherwise it is rejected.
In other words: store the feature vectors of samples from each category, then use them for feature comparison.
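
The accept/reject step can be sketched as follows. The stored embeddings, the signer names, and the threshold of 0.5 are all made-up assumptions for illustration; in practice the embeddings would come from the trained network and the threshold would be tuned on validation data.

```python
import numpy as np

# Hypothetical stored embeddings, one per enrolled signer (made-up values).
stored_features = {
    "alice": np.array([0.9, 0.1, 0.3]),
    "bob": np.array([0.2, 0.8, 0.5]),
}

def verify(signer, new_features, threshold=0.5):
    """Accept the signature if it is close enough to the signer's stored embedding."""
    distance = np.linalg.norm(stored_features[signer] - new_features)
    return bool(distance < threshold)

genuine_attempt = np.array([0.85, 0.15, 0.35])
forged_attempt = np.array([0.1, 0.9, 0.6])

print(verify("alice", genuine_attempt))
print(verify("alice", forged_attempt))
```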

2.2 Using Siamese Network for Face Recognition

A Siamese network requires its inputs to be pairs with labels, so the data must be created this way. We randomly take two images from the same folder and label them as a positive pair; we take one image from each of two different folders and label them as a negative pair. As shown in the figure, the photos in a positive pair show the same person, while the photos in a negative pair show different people.
A positive pair comes from the same category; a negative pair comes from different categories.
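
The pairing rule above could be sketched like this. It is not the book's data-loading code: make_pairs is a hypothetical helper, the folder names and file names are placeholders, and strings stand in for the loaded images.

```python
import random

def make_pairs(images_by_person, n_pairs, seed=0):
    """Build n_pairs positive and n_pairs negative pairs.

    Label 1 means the same person, label 0 means different people.
    """
    rng = random.Random(seed)
    people = list(images_by_person)
    pairs, labels = [], []
    for _ in range(n_pairs):
        # Positive pair: two distinct images from the same folder.
        person = rng.choice(people)
        pairs.append(tuple(rng.sample(images_by_person[person], 2)))
        labels.append(1)
        # Negative pair: one image from each of two different folders.
        p1, p2 = rng.sample(people, 2)
        pairs.append((rng.choice(images_by_person[p1]),
                      rng.choice(images_by_person[p2])))
        labels.append(0)
    return pairs, labels

# Hypothetical folder contents; file names stand in for loaded images.
folders = {
    "s1": ["s1/1.pgm", "s1/2.pgm", "s1/3.pgm"],
    "s2": ["s2/1.pgm", "s2/2.pgm", "s2/3.pgm"],
    "s3": ["s3/1.pgm", "s3/2.pgm", "s3/3.pgm"],
}
pairs, labels = make_pairs(folders, n_pairs=4)
```

Sampling the same number of positive and negative pairs keeps the two labels balanced, which matches the half-and-half split described in the data set section.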
We feed one image of the pair into network A and the other into network B (in practice these are the same network). The only role of the two networks is to extract feature vectors. The two output feature vectors are then passed to the energy function, which measures their similarity (here, Euclidean distance is the energy function). In this way, we train the network on image pairs so that it learns the semantic similarity between them.

Define the data set
We have 20,000 data points, 10,000 of which are positive pairs and the remaining 10,000 negative pairs. The positive and negative pairs are concatenated into the complete data set:

X = np.concatenate([x_geuine_pair, x_imposite_pair], axis=0)/255
Y = np.concatenate([y_genuine, y_imposite], axis=0)

X.shape
(2000, 2, 1, 56, 46)
1*56*46 is the image channel count and size, 2 means one pair of images, and 2000 is the total number of pairs.

Y.shape
(2000, 1)
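
To see how this layout is used, here is a small NumPy sketch with random stand-in arrays (only 4 pairs per class, a made-up number). The names x_pos, x_neg, y_pos, and y_neg are placeholders, not the book's variables. The values are already in [0, 1), so the division by 255 from the original snippet is not needed here.

```python
import numpy as np

# Tiny stand-in arrays with the same layout as in the text (values are random).
n = 4  # pairs per class, hypothetical
x_pos = np.random.rand(n, 2, 1, 56, 46)
x_neg = np.random.rand(n, 2, 1, 56, 46)
y_pos = np.ones((n, 1))
y_neg = np.zeros((n, 1))

X = np.concatenate([x_pos, x_neg], axis=0)
Y = np.concatenate([y_pos, y_neg], axis=0)

# Axis 1 indexes the two images of a pair: one goes to each branch.
img_a = X[:, 0]
img_b = X[:, 1]
print(X.shape, Y.shape, img_a.shape)
```

Slicing along axis 1 is what turns one array of pairs into the two separate inputs for network A and network B.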

Build the Siamese network
First, define the base network, which is basically a convolutional network used for feature extraction. An input image is convolved to obtain a feature map, which is flattened into a vector and passed through two linear (fully connected) layers with 128 and 50 outputs, finally giving a feature vector of length 50.
The output F1(x) has shape (1, 50).

Feed an image pair into the base network F1() and it will return the embeddings, i.e. the feature vectors:
feat_vecs_a = F1(img_a)
feat_vecs_b = F1(img_b)
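
The key point is that both calls go through the same function with the same weights. Below is a rough NumPy stand-in for F1 that shows only the flatten step and the two linear layers. The convolutional part is left out, the weights are random, and the ReLU between the layers is my assumption, so treat it as a shape check rather than the book's model.

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical weights for the two linear layers; the convolutional part is omitted.
W1 = rng.normal(scale=0.01, size=(1 * 56 * 46, 128))
W2 = rng.normal(scale=0.01, size=(128, 50))

def F1(img):
    """Map one image of shape (1, 56, 46) to a feature vector of shape (1, 50)."""
    flat = img.reshape(1, -1)            # flatten to (1, 2576)
    hidden = np.maximum(flat @ W1, 0.0)  # first linear layer (128 units), ReLU assumed
    return hidden @ W2                   # second linear layer (50 units)

img_a = rng.random((1, 56, 46))
img_b = rng.random((1, 56, 46))
feat_vecs_a = F1(img_a)  # both calls use the same W1 and W2
feat_vecs_b = F1(img_b)
print(feat_vecs_a.shape, feat_vecs_b.shape)
```

Because W1 and W2 are shared, two identical images always map to identical feature vectors, which is what makes the distance between embeddings meaningful.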

feat_vecs_a and feat_vecs_b are the feature vectors of the image pair. Next, input these feature vectors into the energy function, calculate the distance between them, and use the Euclidean distance as the energy function:

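from keras import backend as K  # K is the Keras backend

def euclidean_distance(vects):
    x, y = vects
    # Sum the squared differences over the feature axis; the result has shape (batch, 1)
    return K.sqrt(K.sum(K.square(x - y), axis=1, keepdims=True))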

Next, the loss function is defined as the contrastive_loss function:

# y_true is the label; y_pred is the distance for each pair
# K.maximum(margin - y_pred, 0): once a negative pair's distance exceeds margin, it is no longer optimized
def contrastive_loss(y_true, y_pred):
    margin = 1
    return K.mean(y_true * K.square(y_pred) + (1 - y_true) * K.square(K.maximum(margin - y_pred, 0)))
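
To sanity-check the two functions without Keras, the same formulas can be mirrored in NumPy. The function names with the _np suffix and the two-dimensional embeddings are my own choices for illustration.

```python
import numpy as np

def euclidean_distance_np(x, y):
    # Same formula as the Keras version; keepdims gives shape (batch, 1).
    return np.sqrt(np.sum(np.square(x - y), axis=1, keepdims=True))

def contrastive_loss_np(y_true, y_pred, margin=1.0):
    # Mean over the batch of the per-pair contrastive loss.
    return np.mean(y_true * np.square(y_pred)
                   + (1 - y_true) * np.square(np.maximum(margin - y_pred, 0)))

# Two hypothetical pairs of 2-dimensional embeddings.
feats_a = np.array([[0.0, 0.0], [0.0, 0.0]])
feats_b = np.array([[0.3, 0.4], [3.0, 4.0]])
labels = np.array([[1.0], [0.0]])  # first pair similar, second dissimilar

dist = euclidean_distance_np(feats_a, feats_b)
loss = contrastive_loss_np(labels, dist)
print(dist.shape, loss)
```

Here the similar pair is 0.5 apart and contributes 0.25, while the dissimilar pair is 5 apart, well beyond the margin, and contributes nothing, so the batch mean is about 0.125.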

Summary

Composition of the data set: [figure not available]
Overall algorithm process: [figure not available]

Thoughts

The Siamese architecture is only suited to judging the similarity of two things at a time; it cannot judge similarity across multiple categories at once.

Origin blog.csdn.net/qq_56039091/article/details/127308730