LeCun and Tian Yuandong co-author a 70-page "cookbook" of self-supervised learning

Source | Heart of the Machine (WeChat ID: almosthuman2014)

"About self-supervised learning, everything you want to know but dare not ask is here." Yann LeCun, Turing Award winner and chief scientist of Meta artificial intelligence, just sent such a tweet.

In the tweet, LeCun introduced a "cookbook" (a highly practical, hands-on, recipe-style paper) co-authored by himself, Meta AI researcher and research manager Tian Yuandong, and others. The cookbook runs 70 pages and covers what self-supervised learning is, why it matters, its origins, its families of methods, and how to train, deploy, and scale it. It is a rare learning resource. "If you want to study self-supervised learning, you'd better read this book," Tian Yuandong added.

Paper link: https://arxiv.org/pdf/2304.12210v1.pdf

Self-supervised learning can be regarded as an "ideal state" of machine learning: the model learns directly from unlabeled data, with no annotation required. It relies on auxiliary tasks (pretext tasks) to mine supervisory signals from large-scale unlabeled data and trains the network on this constructed supervision, so that the network learns representations valuable for downstream tasks. The advantage of self-supervised learning is that it can exploit vast amounts of unlabeled data without manual labeling, saving substantial labor and time while allowing more data to be used for training, which in turn improves model performance.
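To make the idea of a pretext task concrete, here is a minimal PyTorch sketch (our own illustration, not code from the cookbook) of one classic pretext task, rotation prediction: labels are manufactured from the data itself by rotating each image and asking the network to predict the rotation angle. The backbone, shapes, and random stand-in data are all assumptions made for illustration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Hypothetical stand-in backbone; any CNN/ViT encoder would do.
encoder = nn.Sequential(
    nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
    nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(),
)
head = nn.Linear(64, 4)  # predicts one of 4 rotations: 0/90/180/270 degrees
opt = torch.optim.Adam(list(encoder.parameters()) + list(head.parameters()), lr=1e-3)

def rotation_batch(images):
    """Rotate each image by a random multiple of 90 degrees.
    The rotation index doubles as a free, self-generated label."""
    labels = torch.randint(0, 4, (images.size(0),))
    rotated = torch.stack([torch.rot90(img, int(k), dims=(1, 2))
                           for img, k in zip(images, labels)])
    return rotated, labels

images = torch.randn(16, 3, 32, 32)   # unlabeled images (random stand-in data)
x, y = rotation_batch(images)
loss = F.cross_entropy(head(encoder(x)), y)  # supervision mined from the data itself
opt.zero_grad(); loss.backward(); opt.step()
# After pretraining, `encoder` supplies representations for downstream tasks.
```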

It is well known that Yann LeCun has long been an active proponent of self-supervised learning, and in recent years Meta has published a series of papers on it. LeCun firmly believes that self-supervised learning is a necessary prerequisite for capable AI systems: it can help them build models of the world and thereby acquire human-like abilities such as reasoning, common sense, and the ability to transfer skills and knowledge from one environment to another. The success of GPT-4 is strong evidence of the effectiveness of self-supervised learning. However, LeCun is not bullish on the autoregressive approach (predicting the next word) adopted by the GPT family, preferring instead to build a "world model."

The cookbook has been well-received on social media.

What is self-supervised learning? Why is it so important?

In 2021, Yann LeCun et al. published a blog post titled "Self-supervised learning: The dark matter of intelligence," in which they call self-supervised learning (SSL) the "dark matter of intelligence" and a promising avenue for advancing machine learning.

Self-supervised learning (SSL) is fundamental to the success of deep learning in natural language processing, driving advances from automatic machine translation to large language models trained on web-scale unlabeled text corpora. In computer vision, SSL has pushed the boundaries of data scale, as in the SEER model trained on 1 billion images. SSL methods for computer vision can match, and in some cases outperform, models trained on labeled data, even on competitive benchmarks such as ImageNet. SSL has also been successfully applied to other modalities such as video, audio, and time series.

Self-supervised learning defines a pretext task based on unlabeled input to produce descriptive, intelligible representations. In natural language, a common SSL objective is to mask a word in a text and predict the surrounding words. This objective of predicting the context around a word encourages the model to capture relationships among words without any labels. The same SSL model representation can then be used for a range of downstream tasks such as cross-lingual translation, summarization, and even text generation, among many others. In computer vision, analogous objectives exist in models such as MAE and BYOL, which learn to predict masked patches of an image or its representation. Other SSL objectives encourage two views of the same image, formed for example by color augmentation or cropping, to map to similar representations.
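To make the "two views" objective concrete, the sketch below (our own PyTorch illustration, not code from the paper) implements an InfoNCE-style contrastive loss of the kind used in SimCLR-like methods: embeddings of two augmented views of the same image are pulled together, while views of different images act as negatives. The embedding dimension and batch are arbitrary stand-ins.

```python
import torch
import torch.nn.functional as F

def info_nce_loss(z1, z2, temperature=0.1):
    """SimCLR-style contrastive (NT-Xent) loss for two batches of view embeddings.
    z1[i] and z2[i] embed two augmentations of the same image; every other
    embedding in the combined 2N batch serves as a negative."""
    z1, z2 = F.normalize(z1, dim=1), F.normalize(z2, dim=1)
    z = torch.cat([z1, z2], dim=0)          # (2N, d)
    sim = z @ z.t() / temperature           # pairwise cosine similarities
    sim.fill_diagonal_(float('-inf'))       # exclude self-similarity
    n = z1.size(0)
    # the positive for row i is the other view of the same image
    targets = torch.cat([torch.arange(n, 2 * n), torch.arange(0, n)])
    return F.cross_entropy(sim, targets)

# Two "views" of the same batch; in practice these come from random crops,
# color jitter, etc. applied to the same images, then passed through the encoder.
z1, z2 = torch.randn(8, 128), torch.randn(8, 128)
print(info_nce_loss(z1, z2))
```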

The ability to train on large amounts of unlabeled data brings many benefits. Whereas traditional supervised learning trains on a specific task, usually known in advance from the available labeled data, SSL learns useful general representations that serve many tasks. SSL is particularly useful in domains such as medicine, where labels are expensive or the target task cannot be known in advance. There is also evidence that SSL models can learn representations that are more robust to adversarial examples, label corruption, and input perturbations, and that are fairer than supervised models. SSL is therefore an area of growing interest. However, like cooking, training SSL models is a delicate art with a high barrier to entry.

Why Write a Self-Supervised Learning Cookbook

While researchers are familiar with many of the components of SSL, successfully training SSL models involves a dizzying array of choices, from auxiliary tasks to training hyperparameters. SSL research faces high barriers to entry, including:

1. High computational cost;

2. The lack of fully transparent papers detailing the complex implementations required to fully realize the potential of SSL;

3. The lack of a unified vocabulary and theoretical framework for SSL.

Since SSL establishes a paradigm distinct from traditional reconstruction-based unsupervised learning methods such as (denoising, variational) autoencoders, the vocabulary for understanding SSL within a unified framework is limited. In fact, attempts to unify SSL methods under a single framework only began last year. Without a common basis for describing the different components of SSL methods, it becomes more challenging for researchers to study them. At the same time, SSL urgently needs new researchers if it is to be deployed in the real world, yet many open questions remain about SSL's generalization guarantees, fairness, and robustness to adversarial attacks and even natural variation. These questions are critical to the reliability of SSL methods.

Furthermore, empirically driven SSL involves many moving parts (chiefly hyperparameters) that can strongly influence the final representation yet are not always documented in detail in published work. This means that, to start working on SSL methods, one must first conduct an exhaustive empirical investigation to fully grasp the impact and behavior of all these components. Such empirical blind spots are highly limiting, since closing them requires significant computational resources and prior hands-on experience. In short, SOTA performance arises from seemingly disparate but overlapping approaches, there is little theoretical research to guide them, and yet such models are being deployed widely in the real world. A cookbook that unifies the technology and its related methods is therefore essential to lowering the barrier to SSL research.

The researchers' goal is to lay the foundations of SSL research in the form of a cookbook and to present the latest SSL methods, thereby lowering the barrier to entry.

Just as in cooking, success starts with the basic techniques: chopping, sautéing, and so on. The researchers therefore begin in Chapter 2 by introducing the basic techniques of self-supervised learning using a common vocabulary. Specifically, they describe the main families of methods as well as the theoretical threads that link their objectives within a unified perspective, and they highlight key concepts, such as loss terms or training objectives, in concept boxes.

Next, "chefs" must learn to skillfully apply these techniques to form "delicious dishes," which requires learning existing recipes, combining ingredients and evaluating dishes. In Chapter 3, the researchers present practical considerations for a successful SSL approach, discussing common training methods, including hyperparameter selection, how to assemble components such as network architectures and optimizers, and how to evaluate SSL methods.

They also share practical tips from seasoned researchers on common training configurations and pitfalls to avoid. The authors hope the cookbook will serve as a practical foundation for successfully training and exploring self-supervised learning.

See the original paper for more details.

Reference link: https://zhuanlan.zhihu.com/p/66063089
