Testing FaceNet, a TensorFlow-based face recognition model

Face recognition is widely applied and advancing quickly. For example, the best reported results on the LFW (Labeled Faces in the Wild) benchmark are already above 99.8% accuracy:

Method                               Ref.  Accuracy (mean ± std. error)
Uni-Ubi                              60    0.9900 ± 0.0032
FaceNet                              62    0.9963 ± 0.0009
Baidu                                64    0.9977 ± 0.0006
AuthenMetric                         65    0.9977 ± 0.0009
MMDFR                                67    0.9902 ± 0.0019
CW-DNA-1                             70    0.9950 ± 0.0022
Faceall                              71    0.9967 ± 0.0007
JustMeTalk                           72    0.9887 ± 0.0016
Facevisa                             74    0.9955 ± 0.0014
pose+shape+expression augmentation   75    0.9807 ± 0.0060
ColorReco                            76    0.9940 ± 0.0022
Asaphus                              77    0.9815 ± 0.0039
Yi+AI                                78    0.9968 ± 0.0009
Dahua-FaceImage                      80    0.9978 ± 0.0007
Easen Electron                       81    0.9978 ± 0.0006
Skytop Gaia                          82    0.9630 ± 0.0023
CNN-3DMM estimation                  83    0.9235 ± 0.0129
Samtech Facequest                    84    0.9971 ± 0.0018
XYZ Robot                            87    0.9895 ± 0.0020
THU CV-AI Lab                        88    0.9973 ± 0.0008
dlib                                 90    0.9938 ± 0.0027
Aureus                               91    0.9920 ± 0.0030
YouTu Lab, Tencent                   63    0.9980 ± 0.0023
Orion Star                           92    0.9965 ± 0.0032
Yuntu WiseSight                      93    0.9943 ± 0.0045
PingAn AI Lab                        89    0.9980 ± 0.0016
Turing123                            94    0.9940 ± 0.0040
Hisign                               95    0.9968 ± 0.0030
VisionLabs V2.0                      38    0.9978 ± 0.0007
Deepmark                             96    0.9923 ± 0.0016
Force Infosystems                    97    0.9973 ± 0.0028
ReadSense                            98    0.9982 ± 0.0007

Many of the entries above come from commercial companies, so very few of them are open source. Only Google's FaceNet is tested here.

The FaceNet architecture is shown below:

[Figure: FaceNet architecture]

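As the architecture shows, FaceNet has no softmax layer. Instead, the network output is L2-normalized directly to obtain the image representation, that is, the feature embedding. The deep network in front of it can be any existing mature model, such as any of the models in TensorFlow-Slim.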
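To make the L2 normalization step concrete, here is a minimal NumPy sketch. The function name `l2_normalize`, the `eps` guard, and the 4-dimensional input values are assumptions chosen for illustration (the FaceNet paper uses 128-dimensional embeddings); this is not code from the facenet repository.

```python
import numpy as np


def l2_normalize(features, eps=1e-10):
    """Scale each row of `features` to unit Euclidean length."""
    norms = np.sqrt(np.sum(np.square(features), axis=1, keepdims=True))
    # eps guards against division by zero for an all-zero row.
    return features / np.maximum(norms, eps)


# Hypothetical raw network outputs for two images (4-D for readability).
raw = np.array([[3.0, 4.0, 0.0, 0.0],
                [1.0, 1.0, 1.0, 1.0]])
embeddings = l2_normalize(raw)

print(embeddings)
print(np.linalg.norm(embeddings, axis=1))  # every row now has length 1
```

After normalization all embeddings lie on the unit hypersphere, so the squared Euclidean distance between any two of them is bounded between 0 and 4.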

The final stage, Triplet Loss, uses a loss function defined over triplets. Its code is shown below:

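import tensorflow as tf


def triplet_loss(anchor, positive, negative, alpha):
    """Calculate the triplet loss according to the FaceNet paper.

    Args:
      anchor: the embeddings for the anchor images.
      positive: the embeddings for the positive images.
      negative: the embeddings for the negative images.
      alpha: the margin enforced between positive and negative distances.

    Returns:
      the triplet loss according to the FaceNet paper as a float tensor.
    """
    # TensorFlow 1.x API; under TensorFlow 2.x use tf.compat.v1.variable_scope.
    with tf.variable_scope('triplet_loss'):
        # Squared Euclidean distance from each anchor to its positive and negative.
        pos_dist = tf.reduce_sum(tf.square(tf.subtract(anchor, positive)), 1)
        neg_dist = tf.reduce_sum(tf.square(tf.subtract(anchor, negative)), 1)

        # Hinge on the margin: the loss is zero once neg_dist exceeds pos_dist by alpha.
        basic_loss = tf.add(tf.subtract(pos_dist, neg_dist), alpha)
        loss = tf.reduce_mean(tf.maximum(basic_loss, 0.0), 0)

    return loss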

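As the code shows, a triplet is simply three examples, (anchor, pos, neg), judged by their distance relationship. The goal is that, for as many triplets as possible, the distance between the anchor and the positive example is smaller than the distance between the anchor and the negative example by at least the margin alpha.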
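A small worked example may help show how the margin behaves. The sketch below mirrors the same arithmetic in NumPy; the name `triplet_loss_np`, the 2-D embeddings, and `alpha = 0.2` are assumptions picked for illustration only.

```python
import numpy as np


def triplet_loss_np(anchor, positive, negative, alpha):
    """NumPy mirror of the triplet loss arithmetic, for illustration."""
    pos_dist = np.sum(np.square(anchor - positive), axis=1)
    neg_dist = np.sum(np.square(anchor - negative), axis=1)
    return np.mean(np.maximum(pos_dist - neg_dist + alpha, 0.0))


# Two hypothetical triplets of 2-D embeddings.
anchor = np.array([[1.0, 0.0], [0.0, 1.0]])
positive = np.array([[1.0, 0.0], [1.0, 0.0]])
negative = np.array([[0.0, 1.0], [0.0, 1.0]])

# Triplet 1: pos_dist = 0, neg_dist = 2, so 0 - 2 + 0.2 < 0 and it contributes 0.
# Triplet 2: pos_dist = 2, neg_dist = 0, so it contributes 2 - 0 + 0.2 = 2.2.
loss = triplet_loss_np(anchor, positive, negative, alpha=0.2)
print(loss)  # mean of [0.0, 2.2]
```

The first triplet already satisfies the margin and adds nothing to the loss; only the second, where the negative sits closer to the anchor than the positive, produces a gradient.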

The learning process is illustrated in the figure below:

[Figure: triplet loss learning]

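Testing (code: https://github.com/davidsandberg/facenet)

FaceNet itself does not require face alignment, but the repository provides MTCNN-based alignment, and on LFW the aligned images scored noticeably higher.

Running the provided evaluation code on LFW gave an accuracy of 99.2%.

Of course, higher scores have been reported.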
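For readers unfamiliar with how a verification accuracy such as the LFW score is computed, the following simplified sketch classifies image pairs as "same person" when their embedding distance falls below a threshold. The distances, labels, and threshold grid are made-up values, and the function name `verification_accuracy` is hypothetical. The standard LFW protocol selects the threshold with 10-fold cross-validation; this sketch skips that step and picks the threshold on the same pairs it scores.

```python
import numpy as np


def verification_accuracy(distances, is_same, threshold):
    """Fraction of pairs classified correctly when pairs with a distance
    below `threshold` are predicted to show the same person."""
    predicted_same = distances < threshold
    return float(np.mean(predicted_same == is_same))


# Hypothetical embedding distances for six image pairs and their ground truth.
distances = np.array([0.3, 0.5, 1.4, 1.1, 0.9, 1.6])
is_same = np.array([True, True, False, False, True, False])

# Sweep candidate thresholds and keep the most accurate one.
thresholds = np.arange(0.0, 4.0, 0.01)
accuracies = [verification_accuracy(distances, is_same, t) for t in thresholds]
best = int(np.argmax(accuracies))
print(thresholds[best], accuracies[best])
```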



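In addition, the repository provides code that compares the distance between two images. Running it for debugging gave the following result:

[Figure: output of the two-image distance comparison]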
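The idea behind such a comparison can be sketched as a pairwise Euclidean distance matrix over embeddings. The embeddings below are hypothetical unit-length 2-D vectors rather than real network outputs, and `pairwise_distances` is an illustrative helper, not the repository's script.

```python
import numpy as np


def pairwise_distances(embeddings):
    """Euclidean distance between every pair of embedding rows."""
    # Broadcasting yields an (n, n, d) array of row differences.
    diff = embeddings[:, np.newaxis, :] - embeddings[np.newaxis, :, :]
    return np.sqrt(np.sum(np.square(diff), axis=2))


# Hypothetical embeddings: rows 0 and 1 stand in for two photos of one person,
# row 2 for a photo of someone else.
emb = np.array([[0.6, 0.8],
                [0.8, 0.6],
                [-0.6, 0.8]])
dist = pairwise_distances(emb)
print(np.round(dist, 4))
```

A small distance (rows 0 and 1) suggests the same identity and a large one (rows 0 and 2) a different identity; where to draw the line is the threshold question that the LFW evaluation addresses.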

