(few-shot) Few-shot learning: classification overview and survey

Few-shot learning: fundamentals [blog]

1. Related problems

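Semi-supervised learning learns an optimal hypothesis from both labeled and unlabeled data. Positive-unlabeled learning is a special case of semi-supervised learning in which only positive and unlabeled samples are available. Active learning lets the algorithm select informative unlabeled samples, has an expert label them, and feeds the labels back to the model. FSL can therefore be either supervised or semi-supervised learning, depending on what kind of data is available alongside the limited supervised information.
Imbalanced learning learns from a dataset in which the distribution of the class label y is severely skewed. In FSL tasks, every class has only a few samples to begin with.
Transfer learning moves knowledge from a source domain with abundant data to a target domain with scarce data. Domain adaptation is one kind of transfer learning problem. Another related transfer learning problem is zero-shot learning, which learns features on seen classes and recognizes new, never-seen classes from combinations of those features. By the definition of FSL, FSL is not necessarily transfer learning; but when extra supervised information is used to improve performance on the target dataset, it is in essence transfer learning, except that the source domain may not have much data either.
Meta-learning, or learning to learn, improves the performance P on a new task T by using the provided dataset together with knowledge that a meta-learner has extracted elsewhere. Specifically, the meta-learner learns meta-information (generic knowledge) across many tasks and can then use task-specific information to generalize quickly to a new task. Many FSL methods are meta-learning methods that use the meta-learner as prior knowledge.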
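
To make "learning across many tasks" concrete, the sketch below samples one N-way K-shot task (often called an episode) from a labeled dataset. The function name `sample_episode`, its parameters, and the toy integer dataset are all hypothetical choices for illustration; real benchmarks define their own class splits and sampling protocols.

```python
import random


def sample_episode(dataset, n_way, k_shot, n_query, rng):
    """Sample one N-way K-shot task from a dict mapping label -> list of examples."""
    # Pick which classes take part in this task.
    classes = rng.sample(sorted(dataset), n_way)
    support, query = [], []
    for label in classes:
        # Draw disjoint support and query examples for the class.
        examples = rng.sample(dataset[label], k_shot + n_query)
        support += [(x, label) for x in examples[:k_shot]]
        query += [(x, label) for x in examples[k_shot:]]
    return support, query


# Hypothetical toy dataset: 5 classes with 10 integer "examples" each.
dataset = {f"class_{c}": list(range(c * 10, c * 10 + 10)) for c in range(5)}
support, query = sample_episode(
    dataset, n_way=3, k_shot=2, n_query=4, rng=random.Random(0)
)
```

A meta-learner would typically see many such episodes during training, adapting on each support set and being scored on the matching query set.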

2. Future directions [ref]

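Few-shot object detection: most current few-shot learning work focuses on image classification, and work on object detection has only just started this year. Compared with image classification, object detection matters more and is more useful in real applications.
Unsupervised few-shot learning: the conventional setting transfers knowledge from a large amount of labeled base-class data to few-shot novel-class tasks. In many real scenarios, large amounts of labeled base-class data are themselves hard to obtain. Can few-shot learning instead use large amounts of unlabeled base-class data (or a small amount of labeled plus a large amount of unlabeled base-class data)?
Federated learning: when a user has very little local data, how can other users' data help train the model while preserving privacy? Or, when one user's data is distributed across several devices and each device holds very little, how can the distributed data be trained jointly at minimal communication cost?
Few-shot learning in more domains: few-shot text classification, recommender-system cold start (how to recommend to new users with only a few interactions), and so on.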

3. Selected recent high-quality papers

[AAAI2020] ([paper] [code]) Learning from the Past: Continual Meta-Learning via Bayesian Graph Modeling

[ICCV2019] ([paper]) PANet: Few-Shot Image Semantic Segmentation with Prototype Alignment

[ICLR2019] ([paper] [code]) A Closer Look at Few-Shot Classification

4. Related

Continual Meta-Learning via Bayesian Graph Modeling

[Gao et al. 2019] Gao, T.; Han, X.; Liu, Z.; and Sun, M.
2019. Hybrid attention-based prototypical networks for
noisy few-shot relation classification. In AAAI.

[Hu et al. 2019] Hu, T.; Yang, P.; Zhang, C.; Yu, G.; Mu, Y.;
and Snoek, C. G. M. 2019. Attention-based multi-context
guiding for few-shot semantic segmentation. In AAAI

[Ma and Zhang 2019] Ma, T., and Zhang, A. 2019. Affinitynet: Semi-supervised few-shot learning for disease type prediction. In AAAI.

[Hariharan and Girshick 2017] Hariharan, B., and Girshick,
R. B. 2017. Low-shot visual recognition by shrinking and
hallucinating features. In ICCV.

As an optimizer that gathers gradient flows from different tasks

[Finn, Abbeel, and Levine 2017] Finn, C.; Abbeel, P.; and
Levine, S. 2017. Model-agnostic meta-learning for fast
adaptation of deep networks. In ICML

[Nichol, Achiam, and Schulman 2018] Nichol, A.; Achiam,
J.; and Schulman, J. 2018. On first-order meta-learning
algorithms. CoRR abs/1803.02999

[Lee and Choi 2018] Lee, Y., and Choi, S. 2018. Gradient-based meta-learning with learned layerwise metric and subspace. In ICML.

As a classification weight generator that hallucinates classifiers for novel classes

[Qiao et al. 2018] Qiao, S.; Liu, C.; Shen, W.; and Yuille,
A. L. 2018. Few-shot image recognition by predicting parameters from activations. In CVPR.

[Rusu et al. 2016] Rusu, A. A.; Rabinowitz, N. C.; Desjardins, G.; Soyer, H.; Kirkpatrick, J.; Kavukcuoglu, K.; Pascanu, R.; and Hadsell, R. 2016. Progressive neural networks.
CoRR abs/1606.04671

[Gidaris and Komodakis 2019] Gidaris, S., and Komodakis,
N. 2019. Generating classification weights with GNN denoising autoencoders for few-shot learning. In CVPR.

As a Metric that measures similarity between the query and support examples

[Vinyals et al. 2016] Vinyals, O.; Blundell, C.; Lillicrap, T.;
Kavukcuoglu, K.; and Wierstra, D. 2016. Matching networks for one shot learning. In NeurIPS.

[Snell, Swersky, and Zemel 2017] Snell, J.; Swersky, K.; and
Zemel, R. S. 2017. Prototypical networks for few-shot learning. In NeurIPS.

[Battaglia et al. 2018] Battaglia, P. W.; Hamrick, J. B.; Bapst,
V.; Sanchez-Gonzalez, A.; Zambaldi, V. F.; Malinowski, M.;
Tacchetti, A.; et al. 2018. Relational inductive biases, deep
learning, and graph networks. CoRR abs/1806.01261

Graph structure

[Garcia and Bruna 2018] Garcia, V., and Bruna, J. 2018.
Few-shot learning with graph neural networks. In ICLR.

[Kim et al. 2019] Kim, J.; Kim, T.; Kim, S.; and Yoo, C. D.
2019. Edge-labeling graph neural network for few-shot
learning. In CVPR.

[Liu et al. 2019] Liu, Y.; Lee, J.; Park, M.; Kim, S.; Yang,
E.; Hwang, S. J.; and Yang, Y. 2019. Learning to propagate labels: Transductive propagation network for few-shot
learning. In ICLR.

[Li et al. 2019b] Li, W.; Xu, J.; Huo, J.; Wang, L.; Gao, Y.;
and Luo, J. 2019b. Distribution consistency based covariance metric networks for few-shot learning. In AAAI.

Catastrophic Forgetting:
[Kemker et al. 2018] Kemker, R.; McClure, M.; Abitino, A.;
Hayes, T. L.; and Kanan, C. 2018. Measuring catastrophic
forgetting in neural networks. In AAAI.

Insufficient Robustness:
[Zhang et al. 2019] Zhang, Y.; Pal, S.; Coates, M.; and
Üstebay, D. 2019. Bayesian graph convolutional neural networks for semi-supervised classification. In AAAI.

Meta-Learning

Optimization-based methods:

Either learn a good parameter initialization or leverage an optimizer as the meta-learner to adjust the model weights.
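
As a rough illustration of "learning a good parameter initialization", the sketch below follows the Reptile update from the first-order meta-learning paper by Nichol et al. listed in this section, not full MAML, which also differentiates through the inner loop. The scalar model, the toy tasks y = slope * x, and every hyperparameter are assumptions made for this example.

```python
import random


def inner_sgd(w, slope, xs, lr=0.1, steps=5):
    """Adapt scalar weight w to one task (y = slope * x) with a few SGD steps."""
    for _ in range(steps):
        # Gradient of the mean squared error (w*x - slope*x)^2 with respect to w.
        grad = sum(2 * (w * x - slope * x) * x for x in xs) / len(xs)
        w -= lr * grad
    return w


def reptile(task_slopes, meta_steps=200, meta_lr=0.1, seed=0):
    """Learn an initialization that adapts quickly to tasks drawn from task_slopes."""
    rng = random.Random(seed)
    xs = [-1.0, -0.5, 0.5, 1.0]  # hypothetical fixed inputs shared by all tasks
    w_init = 0.0
    for _ in range(meta_steps):
        slope = rng.choice(task_slopes)        # sample a task
        w_task = inner_sgd(w_init, slope, xs)  # adapt to it
        # Reptile update: move the initialization toward the adapted weight.
        w_init += meta_lr * (w_task - w_init)
    return w_init


w0 = reptile(task_slopes=[1.5, 2.0, 2.5])
```

Here the learned initialization `w0` settles near the middle of the task slopes, so a handful of inner steps gets closer to any one task than the same steps started from zero.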

[Ravi and Larochelle 2017] Ravi, S., and Larochelle, H.
2017. Optimization as a model for few-shot learning. In
ICLR.

[Finn, Xu, and Levine 2018] Finn, C.; Xu, K.; and Levine,
S. 2018. Probabilistic model-agnostic meta-learning. In
NeurIPS

[Yoon et al. 2018] Yoon, J.; Kim, T.; Dia, O.; Kim, S.; Bengio, Y.; and Ahn, S. 2018. Bayesian model-agnostic meta-learning. In NeurIPS, 7343–7353.

[Li et al. 2017] Li, Z.; Zhou, F.; Chen, F.; and Li, H. 2017.
Meta-sgd: Learning to learn quickly for few shot learning.
CoRR

[Nichol, Achiam, and Schulman 2018] Nichol, A.; Achiam,
J.; and Schulman, J. 2018. On first-order meta-learning
algorithms. CoRR abs/1803.02999

[Lee and Choi 2018] Lee, Y., and Choi, S. 2018. Gradient-based meta-learning with learned layerwise metric and subspace. In ICML.

[Rusu et al. 2019] Rusu, A. A.; Rao, D.; Sygnowski, J.;
Vinyals, O.; Pascanu, R.; Osindero, S.; and Hadsell, R.
2019. Meta-learning with latent embedding optimization.
In ICLR

Generation based methods:

Learn to augment few-shot data with a generative meta-learner, or learn to predict the classification weights of novel classes.
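
One simple way to picture "predicting classification weights" is weight imprinting, in the spirit of the Qi et al. paper listed later in these notes: the weight vector for a novel class is set to the normalized mean of its normalized support embeddings. This is a closed-form stand-in, not one of the learned weight generators cited below; the embeddings, class names, and function names are hypothetical, and embeddings are assumed to be non-zero.

```python
import math


def l2_normalize(vec):
    """Scale a non-zero vector to unit length."""
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec]


def imprint_weight(embeddings):
    """Build a classifier weight vector for a novel class from a few embeddings."""
    normalized = [l2_normalize(e) for e in embeddings]
    dim = len(normalized[0])
    mean = [sum(e[i] for e in normalized) / len(normalized) for i in range(dim)]
    return l2_normalize(mean)


def predict(weights, embedding):
    """Cosine-similarity classifier: pick the class with the largest dot product."""
    z = l2_normalize(embedding)
    scores = {
        label: sum(w * x for w, x in zip(vec, z)) for label, vec in weights.items()
    }
    return max(scores, key=scores.get)


# Hypothetical base classifier with two already-trained classes.
weights = {
    "cat": l2_normalize([1.0, 0.0, 0.0]),
    "dog": l2_normalize([0.0, 1.0, 0.0]),
}
# Add a novel class from two support embeddings, with no gradient steps.
weights["bird"] = imprint_weight([[0.1, 0.0, 1.0], [0.0, 0.2, 0.9]])
label = predict(weights, [0.1, 0.1, 0.8])
```

The novel class is added by extending the existing classifier with one new weight vector, which is why the base classes keep working unchanged.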

[Wang et al. 2018] Wang, Y.; Girshick, R. B.; Hebert, M.;
and Hariharan, B. 2018. Low-shot learning from imaginary
data. In CVPR.

[Rusu et al. 2016] Rusu, A. A.; Rabinowitz, N. C.; Desjardins, G.; Soyer, H.; Kirkpatrick, J.; Kavukcuoglu, K.; Pascanu, R.; and Hadsell, R. 2016. Progressive neural networks.
CoRR abs/1606.04671

[Qiao et al. 2018] Qiao, S.; Liu, C.; Shen, W.; and Yuille,
A. L. 2018. Few-shot image recognition by predicting parameters from activations. In CVPR.

[Gidaris and Komodakis 2019] Gidaris, S., and Komodakis,
N. 2019. Generating classification weights with GNN denoising autoencoders for few-shot learning. In CVPR.

Metric based methods:

Learn a proper distance metric as the meta-learner.
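
A minimal sketch of the metric-based idea, loosely following prototypical networks (Snell et al., listed below): each class is represented by the mean of its support embeddings, and a query is assigned to the nearest prototype. The embedding network, which is where the learning actually happens, is omitted here; the 2-D "embeddings", labels, and function names are made up for illustration.

```python
from collections import defaultdict


def class_prototypes(support):
    """Average the embeddings of each class in the support set.

    `support` is a list of (embedding, label) pairs with equal-length embeddings.
    """
    sums = {}
    counts = defaultdict(int)
    for emb, label in support:
        if label not in sums:
            sums[label] = [0.0] * len(emb)
        sums[label] = [s + e for s, e in zip(sums[label], emb)]
        counts[label] += 1
    return {
        label: [s / counts[label] for s in vec] for label, vec in sums.items()
    }


def squared_euclidean(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def classify(query, prototypes):
    """Return the label whose prototype is closest to the query embedding."""
    return min(
        prototypes, key=lambda label: squared_euclidean(query, prototypes[label])
    )


# Toy 2-way 2-shot episode with hand-made 2-D embeddings.
support = [
    ([0.0, 0.0], "cat"),
    ([0.2, 0.0], "cat"),
    ([1.0, 1.0], "dog"),
    ([1.0, 0.8], "dog"),
]
prototypes = class_prototypes(support)
prediction = classify([0.9, 0.9], prototypes)
```

Other methods in this family keep the same compare-to-support structure but swap the fixed distance for something learned, such as an attention kernel or a relation module.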

[Vinyals et al. 2016] Vinyals, O.; Blundell, C.; Lillicrap, T.;
Kavukcuoglu, K.; and Wierstra, D. 2016. Matching networks for one shot learning. In NeurIPS.

[Snell, Swersky, and Zemel 2017] Snell, J.; Swersky, K.; and
Zemel, R. S. 2017. Prototypical networks for few-shot learning. In NeurIPS.

[Ren et al. 2018] Ren, M.; Triantafillou, E.; Ravi, S.; Snell,
J.; Swersky, K.; Tenenbaum, J. B.; Larochelle, H.; and
Zemel, R. S. 2018. Meta-learning for semi-supervised few-shot classification. In ICLR.

[Bertinetto et al. 2019] Bertinetto, L.; Henriques, J. F.; Torr,
P. H. S.; and Vedaldi, A. 2019. Meta-learning with differentiable closed-form solvers. In ICLR

[Sung et al. 2018] Sung, F.; Yang, Y.; Zhang, L.; Xiang, T.;
Torr, P. H. S.; and Hospedales, T. M. 2018. Learning to
compare: Relation network for few-shot learning. In CVPR.

[Yan, Zhang, and He 2019] Yan, S.; Zhang, S.; and He, X.
2019. A dual attention network with semantic embedding
for few-shot learning. In AAAI.

[Li et al. 2019a] Li, H.; Eigen, D.; Dodge, S.; Zeiler, M.; and
Wang, X. 2019a. Finding task-relevant features for few-shot
learning by category traversal. In CVPR

[Kim et al. 2019] Kim, J.; Kim, T.; Kim, S.; and Yoo, C. D.
2019. Edge-labeling graph neural network for few-shot
learning. In CVPR.

A Closer Look at Few-Shot Classification

Oriol Vinyals, Charles Blundell, Tim Lillicrap, Daan Wierstra, et al. Matching networks for one
shot learning. In Advances in Neural Information Processing Systems (NIPS), 2016

Jake Snell, Kevin Swersky, and Richard Zemel. Prototypical networks for few-shot learning. In
Advances in Neural Information Processing Systems (NIPS), 2017.

Chelsea Finn, Pieter Abbeel, and Sergey Levine. Model-agnostic meta-learning for fast adaptation
of deep networks. In Proceedings of the International Conference on Machine Learning (ICML),
2017.

Sachin Ravi and Hugo Larochelle. Optimization as a model for few-shot learning. In Proceedings
of the International Conference on Learning Representations (ICLR), 2017

Flood Sung, Yongxin Yang, Li Zhang, Tao Xiang, Philip HS Torr, and Timothy M Hospedales.
Learning to compare: Relation network for few-shot learning. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2018

Victor Garcia and Joan Bruna. Few-shot learning with graph neural networks. In Proceedings of the
International Conference on Learning Representations (ICLR), 2018.

Hang Qi, Matthew Brown, and David G Lowe. Low-shot learning with imprinted weights. In
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2018

initialization based methods

Sachin Ravi and Hugo Larochelle. Optimization as a model for few-shot learning. In Proceedings
of the International Conference on Learning Representations (ICLR), 2017

Chelsea Finn, Pieter Abbeel, and Sergey Levine. Model-agnostic meta-learning for fast adaptation
of deep networks. In Proceedings of the International Conference on Machine Learning (ICML),
2017.

metric learning methods

Oriol Vinyals, Charles Blundell, Tim Lillicrap, Daan Wierstra, et al. Matching networks for one
shot learning. In Advances in Neural Information Processing Systems (NIPS), 2016

Jake Snell, Kevin Swersky, and Richard Zemel. Prototypical networks for few-shot learning. In
Advances in Neural Information Processing Systems (NIPS), 2017.

hallucination based methods

Antreas Antoniou, Amos Storkey, and Harrison Edwards. Data augmentation generative adversarial
networks. In Proceedings of the International Conference on Learning Representations Workshops (ICLR Workshops), 2018

Bharath Hariharan and Ross Girshick. Low-shot visual recognition by shrinking and hallucinating
features. In Proceedings of the IEEE International Conference on Computer Vision (ICCV), 2017.

Spyros Gidaris and Nikos Komodakis. Dynamic few-shot visual learning without forgetting. In
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2018

Hang Qi, Matthew Brown, and David G Lowe. Low-shot learning with imprinted weights. In
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2018