paper: Conditional Generative Adversarial Nets

Copyright notice: free to repost. https://blog.csdn.net/Z609834342/article/details/83443834

Summary: The original GAN is unsupervised. This paper adds extra information y to make the model more capable; the idea is direct and the implementation simple, and it is probably the earliest paper to feed y into a GAN.

Key sentences:

  1. Adversarial nets have the advantages that Markov chains are never needed, only backpropagation is used to obtain gradients, no inference is required during learning, and a wide variety of factors and interactions can easily be incorporated into the model.
  2. In an unconditioned generative model, there is no control on modes of the data being generated. However, by conditioning the model on additional information it is possible to direct the data generation process.
  3. it remains challenging to scale such models to accommodate an extremely large number of predicted output categories. A second issue is that much of the work to date has focused on learning one-to-one mappings from input to output. However, many interesting problems are more naturally thought of as a probabilistic one-to-many mapping. For instance in the case of image labeling there may be many different tags that could appropriately be applied to a given image,
    and different (human) annotators may use different (but typically synonymous or related) terms to describe the same image. (Pay attention to the challenges in related fields and the problems with mainstream models; this matters. Clever insights and breakthroughs are rare even worldwide over many years, so most of the work lies in solving these problems step by step. This is especially important in the early stages of learning, since it helps organize domain knowledge into a coherent system.)
  4. One way to help address the first issue is to leverage additional information from other modalities: for instance, by using natural language corpora to learn a vector representation for labels in which geometric relations are semantically meaningful. When making predictions in such spaces, we benefit from the fact that when making prediction errors we are still often 'close' to the truth (e.g. predicting 'table' instead of 'chair'), and also from the fact that we can naturally make predictive generalizations to labels that were not seen during training time. (Sometimes I wonder why even a dataset as simple as MNIST cannot reach 100% accuracy, when it is so easy for a human. Does the model over-complicate something simple, or does it lack the ability to associate? Or something else? Well, better to read more than to overthink. I find the 'table' instead of 'chair' substitution here quite interesting: the essence of machine learning is generalization, and this kind of 'association' directly improves generalization. For humans, imagination is also extremely important in learning, isn't it?)
  5. y could be any kind of auxiliary information, such as class labels or data from other modalities. We can perform the conditioning by feeding y into both the discriminator and generator as additional input layer.
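The conditioning in sentence 5 can be sketched concretely: a one-hot encoding of y is concatenated onto the generator's noise input and onto the discriminator's data input. A minimal sketch in NumPy, with illustrative MNIST-like sizes (100-d noise, 28x28 images, 10 classes); the function names and shapes here are my own assumptions for illustration, not from the paper:

```python
import numpy as np

def one_hot(label, num_classes=10):
    """Encode a class label y as a one-hot vector."""
    v = np.zeros(num_classes)
    v[label] = 1.0
    return v

def generator_input(z, y, num_classes=10):
    """CGAN generator sees [z; y]: noise concatenated with the condition."""
    return np.concatenate([z, one_hot(y, num_classes)])

def discriminator_input(x, y, num_classes=10):
    """CGAN discriminator sees [x; y]: flattened data concatenated with the same condition."""
    return np.concatenate([x.ravel(), one_hot(y, num_classes)])

# Hypothetical sizes: 100-d noise, 28x28 image, label y = 3.
z = np.random.randn(100)
x = np.random.randn(28, 28)
print(generator_input(z, 3).shape, discriminator_input(x, 3).shape)  # (110,) (794,)
```

Both networks then process these concatenated vectors with their usual layers, so the only change from the unconditional GAN is the extra input dimensions carrying y.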

