Person Re-identification with DG-Net: Joint Discriminative and Generative Learning for Person Re-identification

I. Resources

1. Paper Link:

(1) https://arxiv.org/abs/1904.07223

2. Code:

(1) https://github.com/layumi/Person_reID_baseline_pytorch

(2) https://github.com/NVlabs/DG-Net

3. Videos:

(1) Bilibili: https://www.bilibili.com/video/av51439240/

(2) Tencent Video: https://v.qq.com/x/page/t0867x53ady.html

II. Terminology

(1) Structural SIMilarity (SSIM): a metric that measures how similar two images are;

(2) Fréchet Inception Distance (FID): measures the quality and diversity of generated images as the distance between the Inception-feature distributions of real and generated images, FID = ||μ_r − μ_g||² + Tr(Σ_r + Σ_g − 2(Σ_r Σ_g)^{1/2}); the lower the FID, the better the diversity and quality of the generated images;

(3) Rank@n: the probability that a correct match appears among the top n (highest-confidence) results returned for a query;

(4) mean Average Precision (mAP): the Average Precision (AP) computed for each query, then averaged over all queries (see the evaluation sketch after this list).
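To make Rank@n and mAP concrete, here is a minimal NumPy sketch that computes both from a precomputed query-gallery distance matrix. It is illustrative only: the `evaluate` function and the toy IDs are my own, not from the DG-Net repository, and it assumes every query has at least one correct gallery match; the official re-id protocol additionally discards same-camera matches, which is omitted here.

```python
import numpy as np

def evaluate(dist, q_ids, g_ids, n=1):
    """Rank@n and mAP from a (num_query, num_gallery) distance matrix."""
    rank_hits, aps = [], []
    for i in range(dist.shape[0]):
        order = np.argsort(dist[i])               # gallery sorted, nearest first
        matches = g_ids[order] == q_ids[i]        # True where the person ID agrees
        rank_hits.append(matches[:n].any())       # Rank@n: any hit in the top n
        hit_pos = np.flatnonzero(matches)         # ranked positions of the hits
        precisions = (np.arange(len(hit_pos)) + 1) / (hit_pos + 1)
        aps.append(precisions.mean())             # AP: mean precision at each hit
    return float(np.mean(rank_hits)), float(np.mean(aps))

# Toy example: 2 queries, 4 gallery images, person IDs 0 and 1.
dist = np.array([[0.1, 0.9, 0.4, 0.8],
                 [0.2, 0.7, 0.3, 0.6]])
q_ids = np.array([0, 1])
g_ids = np.array([0, 1, 0, 1])
rank1, mAP = evaluate(dist, q_ids, g_ids, n=1)
print(f"Rank@1 = {rank1:.2f}, mAP = {mAP:.2f}")   # Rank@1 = 0.50, mAP = 0.71
```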

III. Abstract

Abstract: Person re-identification (re-id) remains challenging due to significant intra-class variations across different cameras. Recently, there has been a growing interest in using generative models to augment training data and enhance the invariance to input changes. The generative pipelines in existing methods, however, stay relatively separate from the discriminative re-id learning stages. Accordingly, re-id models are often trained in a straightforward manner on the generated data. In this paper, we seek to improve learned re-id embeddings by better leveraging the generated data. To this end, we propose a joint learning framework that couples re-id learning and data generation end-to-end. Our model involves a generative module that separately encodes each person into an appearance code and a structure code, and a discriminative module that shares the appearance encoder with the generative module. By switching the appearance or structure codes, the generative module is able to generate high-quality cross-id composed images, which are online fed back to the appearance encoder and used to improve the discriminative module. The proposed joint learning framework renders significant improvement over the baseline without using generated data, leading to the state-of-the-art performance on several benchmark datasets.


IV. Framework

The joint learning framework consists of an image-generation module and a discriminative module for person re-identification. The generative module encodes each pedestrian into an appearance code (identity-related appearance: clothes, shoes, carried items such as phones and bags) and a structure code (pose, hair, face shape, and background). By swapping the appearance and structure codes of two pedestrians, it synthesizes high-quality cross-id images, as if the two people had exchanged clothes, and these images are fed back online to refine the discriminative module, which shares the appearance encoder with the generative module. Training with the data produced by this joint framework yields a large improvement over the baseline, as the sketch below illustrates.
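Below is a heavily simplified, hypothetical PyTorch sketch of the code-swapping idea. The module names follow the paper's notation (appearance encoder E_a, structure encoder E_s, decoder G), but the layer choices, dimensions, and the 751-class head (the number of Market-1501 training identities) are illustrative stand-ins, not the official DG-Net implementation; see the NVlabs repository above for the real networks.

```python
import torch
import torch.nn as nn

# Hypothetical stand-ins for DG-Net's networks. The real encoders/decoder are
# much deeper, and in the paper E_s reads a gray-scale image so that color
# information is forced into the appearance code.
class AppearanceEncoder(nn.Module):           # E_a: image -> appearance code
    def __init__(self, dim=64):
        super().__init__()
        self.conv = nn.Conv2d(3, dim, 4, 2, 1)
        self.pool = nn.AdaptiveAvgPool2d(1)   # one global vector per image
    def forward(self, x):
        return self.pool(torch.relu(self.conv(x))).flatten(1)  # (B, dim)

class StructureEncoder(nn.Module):            # E_s: image -> spatial code
    def __init__(self, dim=64):
        super().__init__()
        self.conv = nn.Conv2d(3, dim, 4, 2, 1)
    def forward(self, x):
        return torch.relu(self.conv(x))       # (B, dim, H/2, W/2)

class Decoder(nn.Module):                     # G: (appearance, structure) -> image
    def __init__(self, dim=64):
        super().__init__()
        self.deconv = nn.ConvTranspose2d(2 * dim, 3, 4, 2, 1)
    def forward(self, a, s):
        # Broadcast the appearance vector over the structure map, then decode.
        a_map = a[:, :, None, None].expand(-1, -1, s.size(2), s.size(3))
        return torch.tanh(self.deconv(torch.cat([a_map, s], dim=1)))

Ea, Es, G = AppearanceEncoder(), StructureEncoder(), Decoder()
id_head = nn.Linear(64, 751)   # re-id classifier on E_a (751 Market-1501 train IDs)

x_i = torch.randn(1, 3, 256, 128)   # image of person i
x_j = torch.randn(1, 3, 256, 128)   # image of person j
x_cross = G(Ea(x_i), Es(x_j))       # cross-id image: i's appearance, j's structure
logits = id_head(Ea(x_cross))       # generated image fed back through the shared E_a
print(x_cross.shape, logits.shape)  # [1, 3, 256, 128] and [1, 751]
```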

V. Results

DG-Net is evaluated on the Market-1501, DukeMTMC-reID, and MSMT17 datasets, where it achieves state-of-the-art performance; see the paper for the full result tables.

VI. Study Notes

(1) Knowledge map

(2) Illustrated walkthrough

VII. Related Blogs

https://blog.csdn.net/weixin_43013761/article/details/102364512


 
