This article is part of a series on LLMs; it is a translation of "Self-Alignment with Instruction Backtranslation".
Self-Alignment with Instruction Backtranslation
Abstract
We propose a scalable approach to building high-quality instruction-following language models by automatically labeling human-written text with corresponding instructions. Our method, named instruction backtranslation, starts with a language model fine-tuned on a small amount of seed data, together with a given web corpus. The seed model is used to construct training examples by generating instruction prompts for web documents (self-augmentation) and then selecting high-quality examples from among these candidates (self-curation). This data is then used to fine-tune a stronger model. Fine-tuning LLaMa with two iterations of our method yields a model that outperforms all other LLaMa-based models on the Alpaca leaderboard that do not rely on distillation data, demonstrating highly effective self-alignment.
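The self-augmentation and self-curation steps described above can be sketched as a simple iterative loop. The sketch below is illustrative only: the functions `generate_instruction`, `score_example`, and `fine_tune` are hypothetical stand-ins for the paper's seed-model calls, implemented here as toy stubs so the control flow can be run end to end.

```python
# Minimal sketch of the instruction backtranslation loop (assumptions noted below).

def generate_instruction(model, document):
    # Self-augmentation: a backward model predicts an instruction
    # for which `document` would be a good response. (Toy stub.)
    return f"Write a passage about: {document[:20]}"

def score_example(model, instruction, document):
    # Self-curation: the seed model rates candidate pairs on a 1-5 scale.
    # Here, document length is a toy proxy for quality.
    return 5 if len(document) > 40 else 2

def fine_tune(model, examples):
    # Stand-in for supervised fine-tuning on the curated pairs.
    return {"trained_on": model.get("trained_on", 0) + len(examples)}

def backtranslation_round(model, corpus, threshold=5):
    # One iteration: augment every document, keep only high-scoring pairs,
    # then fine-tune on the curated set.
    candidates = [(generate_instruction(model, doc), doc) for doc in corpus]
    curated = [(ins, doc) for ins, doc in candidates
               if score_example(model, ins, doc) >= threshold]
    return fine_tune(model, curated), curated

corpus = ["short doc",
          "a much longer web document that passes the quality filter"]
model = {}
for _ in range(2):  # the paper runs two iterations of this procedure
    model, curated = backtranslation_round(model, corpus)
print(len(curated))  # only the long document survives curation -> 1
```

In the actual method, both roles are played by the same LLaMa model: it is first fine-tuned on (output, instruction) seed pairs to generate instructions, and its own quality ratings filter the candidates used for the next round of fine-tuning.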
1 Introduction
2 Methods
3 Experiments
4 Limitations
5 Related Work
6 Conclusion
We propose a scalable approach to fine-tuning large language models to follow instructions. Our approach exploits large quantities of unlabeled data via an iterative self-training algorithm, which we call instruction backtranslation. The method uses the model itself to augment and curate high-quality training examples, improving its own performance. On the Alpaca leaderboard, our fine-tuned model outperforms all other non-distilled instruction-following models while using fewer human-annotated examples. Future work could extend this approach to larger unlabeled corpora, which our analysis suggests should yield further gains.