[Deep Learning_4.4] Face Recognition Model Construction: Face Verification and Face Recognition

Face recognition tasks fall into two categories:

Face Verification asks: is this face person X? (a 1:1 match against a claimed identity)

Face Recognition asks: who is this face? (a 1:K match against a database)

FaceNet is a neural network that encodes a face photo into a 128-dimensional vector; whether two photos show the same person is decided by comparing the distance between their two 128-dimensional vectors.

In this example, the activations of a pre-trained ConvNet are used as the encoding, with inputs in channels-first format (m, n_C, n_H, n_W).

Encode a photo of a face into a 128-dimensional vector

The FaceNet model requires a lot of data and time to train, so this program directly uses a pre-trained FaceNet model. The network architecture follows the Inception design of Szegedy et al.

The model takes m face photos of shape 3*96*96 (channels first) as input and outputs an encoding matrix of shape (m, 128).

The last layer of the neural network is a fully connected layer with 128 neurons, so each photo is output as a 128-dimensional vector.
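
The img_to_encoding helper used later in this post wraps this forward pass. A minimal sketch of what it might look like, assuming OpenCV (cv2) for image loading and a Keras model with channels-first 96*96 input; the actual helper shipped with the course materials may differ in details:

import cv2
import numpy as np

def img_to_encoding(image_path, model):
    # Load the image with OpenCV (BGR order) and flip to RGB
    img = cv2.imread(image_path, 1)[..., ::-1]
    # Move channels first and scale pixel values to [0, 1]
    img = np.transpose(img, (2, 0, 1)) / 255.0
    # Add the batch dimension: shape (1, 3, 96, 96)
    x = np.array([img])
    # Forward pass through the network -> a (1, 128) encoding
    return model.predict_on_batch(x)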


Model training goal:

If two pictures are of the same person, the distance between their encodings should be small; otherwise, the distance should be large.
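
Formally, for an anchor image A (a reference photo), a positive P (another photo of the same person), and a negative N (a photo of a different person), training pushes the encodings f(·) to satisfy:

$$\|f(A^{(i)}) - f(P^{(i)})\|_2^2 + \alpha < \|f(A^{(i)}) - f(N^{(i)})\|_2^2$$

where the margin α (0.2 in the code below) prevents the trivial solution in which all encodings collapse to the same point.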


The first step, define Triplet loss

The principle of the triplet loss is to pull the encodings of two photos of the same person (Anchor and Positive) closer together, and to push the encodings of photos of different people (Anchor and Negative) further apart.

Loss function (formula (3) referenced in the code below):

$$\mathcal{J} = \sum_{i=1}^{m} \max\left( \|f(A^{(i)}) - f(P^{(i)})\|_2^2 - \|f(A^{(i)}) - f(N^{(i)})\|_2^2 + \alpha,\ 0 \right)$$


Implementation steps:

1. Compute the distance between the encodings of the anchor and the positive

2. Compute the distance between the encodings of the anchor and the negative

3. Subtract the two distances and add the margin alpha

4. Take the maximum of the result and 0, then sum over the training examples


Code:

import tensorflow as tf

def triplet_loss(y_true, y_pred, alpha = 0.2):
    """
    Implementation of the triplet loss as defined by formula (3)
    
    Arguments:
    y_true -- true labels, required when you define a loss in Keras, you don't need it in this function.
    y_pred -- python list containing three objects:
            anchor -- the encodings for the anchor images, of shape (None, 128)
            positive -- the encodings for the positive images, of shape (None, 128)
            negative -- the encodings for the negative images, of shape (None, 128)
    
    Returns:
    loss -- real number, value of the loss
    """
    
    anchor, positive, negative = y_pred[0], y_pred[1], y_pred[2]
    
    ### START CODE HERE ### (≈ 4 lines)
    # Step 1: Compute the (encoding) distance between the anchor and the positive, you will need to sum over axis=-1
    pos_dist = tf.reduce_sum(tf.square(tf.subtract(anchor, positive)),axis=-1)
    # Step 2: Compute the (encoding) distance between the anchor and the negative, you will need to sum over axis=-1
    neg_dist = tf.reduce_sum(tf.square(tf.subtract(anchor, negative)),axis=-1)
    # Step 3: subtract the two previous distances and add alpha.
    basic_loss = tf.add(tf.subtract(pos_dist,neg_dist), alpha)
    # Step 4: Take the maximum of basic_loss and 0.0. Sum over the training examples.
    loss = tf.reduce_sum(tf.maximum(basic_loss, 0.))
    ### END CODE HERE ###
    
    return loss
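
A quick smoke test of the loss on random tensors, assuming the TensorFlow 1.x session API that the original course code uses (the values are arbitrary):

with tf.Session() as test:
    tf.set_random_seed(1)
    y_true = (None, None, None)  # not used by triplet_loss
    y_pred = (tf.random_normal([3, 128], mean=6, stddev=0.1, seed=1),
              tf.random_normal([3, 128], mean=1, stddev=1, seed=1),
              tf.random_normal([3, 128], mean=3, stddev=4, seed=1))
    # Evaluate the loss inside the session
    print("loss = " + str(triplet_loss(y_true, y_pred).eval()))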

The second step, compile the model with the Triplet loss and load the pre-trained weights

Since training FaceNet from scratch is too expensive, the trained weights are loaded directly:

FRmodel.compile(optimizer = 'adam', loss = triplet_loss, metrics = ['accuracy'])
load_weights_from_FaceNet(FRmodel)
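
For completeness, FRmodel itself is built by a helper from the course materials. A minimal sketch of the full setup, assuming the faceRecoModel and load_weights_from_FaceNet helpers from the course's fr_utils and inception_blocks_v2 files (the module names may differ in your copy):

from fr_utils import load_weights_from_FaceNet, img_to_encoding
from inception_blocks_v2 import faceRecoModel

# Build the Inception-based FaceNet graph with channels-first input (3, 96, 96)
FRmodel = faceRecoModel(input_shape=(3, 96, 96))
# Compile with the triplet loss defined above, then load pre-trained weights
FRmodel.compile(optimizer = 'adam', loss = triplet_loss, metrics = ['accuracy'])
load_weights_from_FaceNet(FRmodel)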


Face Verification Implementation

The third step, build the database. Each authorized person's photo is encoded once into a 128-dimensional vector and stored under their name:

database = {}
database["danielle"] = img_to_encoding("images/danielle.png", FRmodel)
database["younes"] = img_to_encoding("images/younes.jpg", FRmodel)
database["tian"] = img_to_encoding("images/tian.jpg", FRmodel)
database["andrew"] = img_to_encoding("images/andrew.jpg", FRmodel)
database["kian"] = img_to_encoding("images/kian.jpg", FRmodel)
database["dan"] = img_to_encoding("images/dan.jpg", FRmodel)
database["sebastiano"] = img_to_encoding("images/sebastiano.jpg", FRmodel)
database["bertrand"] = img_to_encoding("images/bertrand.jpg", FRmodel)
database["kevin"] = img_to_encoding("images/kevin.jpg", FRmodel)
database["felix"] = img_to_encoding("images/felix.jpg", FRmodel)
database["benoit"] = img_to_encoding("images/benoit.jpg", FRmodel)
database["arnaud"] = img_to_encoding("images/arnaud.jpg", FRmodel)

The fourth step, verify a face:

1. Encode the input photo (output a 128-dimensional vector)

2. Compute the distance between the input photo's encoding and the stored encoding of the claimed identity

3. If the distance is less than 0.7, open the door

Code:

def verify(image_path, identity, database, model):
    """
    Function that verifies if the person on the "image_path" image is "identity".
    
    Arguments:
    image_path -- path to an image
    identity -- string, name of the person you'd like to verify the identity. Has to be a resident of the Happy house.
    database -- python dictionary mapping names of allowed people's names (strings) to their encodings (vectors).
    model -- your Inception model instance in Keras
    
    Returns:
    dist -- distance between the image_path and the image of "identity" in the database.
    door_open -- True, if the door should open. False otherwise.
    """
    
    ### START CODE HERE ###
    
    # Step 1: Compute the encoding for the image. Use img_to_encoding() see example above. (≈ 1 line)
    encoding = img_to_encoding(image_path, model)
    
    # Step 2: Compute distance with identity's image (≈ 1 line)
    dist = np.linalg.norm(encoding-database[identity])
    
    # Step 3: Open the door if dist < 0.7, else don't open (≈ 3 lines)
    if dist < 0.7:
        print("It's " + str(identity) + ", welcome home!")
        door_open = True
    else:
        print("It's not " + str(identity) + ", please go away")
        door_open = False
        
    ### END CODE HERE ###
        
    return dist, door_open
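
Example usage, assuming the visitor's photo has been saved as images/camera_0.jpg (an illustrative file name):

dist, door_open = verify("images/camera_0.jpg", "younes", database, FRmodel)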

Face Recognition Implementation

1. Encode the target image and output a 128-dimensional vector

2. Find the image with the smallest distance from the target image in the database

a. Initialize the variable min_dist to a large enough value, such as 100

b. Loop over the database, computing the distance between each stored encoding and the target encoding; whenever a distance is smaller than min_dist, update min_dist and record the name, so that after the loop the closest image has been found

Code:

def who_is_it(image_path, database, model):
    """
    Implements face recognition for the happy house by finding who is the person on the image_path image.
    
    Arguments:
    image_path -- path to an image
    database -- database containing image encodings along with the name of the person on the image
    model -- your Inception model instance in Keras
    
    Returns:
    min_dist -- the minimum distance between image_path encoding and the encodings from the database
    identity -- string, the name prediction for the person on image_path
    """
    
    ### START CODE HERE ### 
    
    ## Step 1: Compute the target "encoding" for the image. Use img_to_encoding() see example above. ## (≈ 1 line)
    encoding = img_to_encoding(image_path, model)
    
    ## Step 2: Find the closest encoding ##
    
    # Initialize "min_dist" to a large value, say 100 (≈1 line)
    min_dist = 100
    
    # Loop over the database dictionary's names and encodings.
    for (name, db_enc) in database.items():
        
        # Compute L2 distance between the target "encoding" and the current "db_enc" from the database. (≈ 1 line)
        dist = np.linalg.norm(encoding - db_enc)

        # If this distance is less than the min_dist, then set min_dist to dist, and identity to name. (≈ 3 lines)
        if dist < min_dist:
            min_dist = dist
            identity = name


    ### END CODE HERE ###
    
    if min_dist > 0.7:
        print("Not in the database.")
    else:
        print ("it's " + str(identity) + ", the distance is " + str(min_dist))
        
    return min_dist, identity
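
Example usage, again with the illustrative test photo:

min_dist, identity = who_is_it("images/camera_0.jpg", database, FRmodel)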

Refer to Andrew Ng's deep learning course.
