Real-time tracking of research trends | New papers selected on September 20 from Baichuan, Google DeepMind, and other institutions

As a researcher, you need to search and browse large volumes of academic literature every day to keep up with the latest scientific and technological advances and research results.

However, traditional retrieval and reading methods can no longer keep pace with researchers' needs.

AMiner AI is a literature knowledge tool that integrates retrieval, reading, and knowledge Q&A. It helps you search and read papers more efficiently, keeps you current on the latest research trends in your field, and makes research work more comfortable.

If you want an in-depth conversation about a particular paper, simply paste the paper link into your browser, or go directly to the AMiner AI page: https://www.aminer.cn/chat/g/explain

List of selected new papers on September 20, 2023:

1. Language Modeling Is Compression

The paper frames language modeling as compression and highlights the potential of large language models as both predictors and compressors. It also points out that viewing the prediction problem as a compression problem yields new insights into scaling laws, tokenization, and in-context learning. The authors demonstrate the compression capabilities of large language models: Chinchilla 70B compresses ImageNet patches to 43.4% of their original size and LibriSpeech audio samples to 16.4%, outperforming the domain-specific compressors PNG (58.5%) and FLAC (30.3%). Finally, the authors show that prediction and compression are equivalent, so any compressor (such as gzip) can be used to build a conditional generative model.
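
To make the prediction-compression equivalence concrete, here is a minimal sketch (with a toy stand-in model, not Chinchilla) showing that a model's total negative log2-likelihood over a sequence is exactly the code length an ideal arithmetic coder would achieve:

```python
import math

# Under arithmetic coding, a model that assigns probability p(x_t | x_<t) to
# each symbol needs about -log2(p) bits for it, so the model's cross-entropy
# over a sequence *is* its compressed size. The toy "repeat the previous
# character" model below is a hypothetical stand-in for a real LLM.

def compressed_size_bits(text, prob):
    """Ideal arithmetic-coding length: sum of -log2 p(symbol | context)."""
    bits = 0.0
    for i, ch in enumerate(text):
        bits += -math.log2(prob(ch, text[:i]))
    return bits

def toy_prob(ch, context):
    # Assign probability 0.5 to repeating the previous character,
    # and spread the rest uniformly over the remaining byte values.
    if context and ch == context[-1]:
        return 0.5
    return 0.5 / 255

text = "aaaabbbbcccc"
bits = compressed_size_bits(text, toy_prob)
raw_bits = 8 * len(text)
print(f"model: {bits:.1f} bits vs raw: {raw_bits} bits "
      f"({100 * bits / raw_bits:.1f}% of original size)")
```

The better the model predicts the next symbol, the fewer bits the coder spends on it, which is why LLM compression rates track cross-entropy.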

Paper link:
https://www.aminer.cn/pub/650a566d3fda6d7f067ece3e/?f=cs

2. OpenBA: An Open-sourced 15B Bilingual Asymmetric seq2seq Model Pre-trained from Scratch

This paper introduces OpenBA, an open-source bilingual asymmetric seq2seq model with 15 billion parameters, intended to contribute an LLM variant to the Chinese-oriented open-source model community. The authors train OpenBA from scratch using effective and efficient techniques and a three-stage training strategy. Their model achieves very competitive performance with only 380 billion training tokens, outperforming LLaMA-70B on the BELEBELE benchmark, BLOOM-176B on the MMLU benchmark, and GLM-130B on the C-Eval (hard) benchmark. The report also provides the main details for training an analogous model, including pre-training data processing, bilingual Flan data collection, the empirical observations that informed the model architecture design, the training objectives at each stage, and other enhancement techniques.
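
As a rough illustration of what "asymmetric" means here, the sketch below contrasts a balanced encoder-decoder layout with a shallow-encoder/deep-decoder one; the layer counts are placeholders, not OpenBA's actual hyperparameters:

```python
from dataclasses import dataclass

# Illustrative configs only; consult the paper for OpenBA's real architecture.
@dataclass
class Seq2SeqConfig:
    d_model: int
    num_encoder_layers: int
    num_decoder_layers: int

# Conventional symmetric layout: equal depth on both sides.
symmetric = Seq2SeqConfig(d_model=4096, num_encoder_layers=24, num_decoder_layers=24)

# Asymmetric layout: shift depth toward the decoder, keeping total depth
# comparable while favoring generation capacity over input encoding.
asymmetric = Seq2SeqConfig(d_model=4096, num_encoder_layers=12, num_decoder_layers=36)
```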

Paper link:
https://www.aminer.cn/pub/650a566d3fda6d7f067ece64/?f=cs

3. Multimodal Foundation Models: From Specialists to General-Purpose Assistants

This paper provides an overview of the taxonomy and evolution of multimodal foundation models, focusing on the transition from specialized models to general-purpose assistants. The research covers five core topics in two categories. (i) First, it surveys well-established areas: multimodal foundation models built for specific purposes, covering two topics, methods for learning visual backbones for visual understanding, and text-to-image generation. (ii) It then presents recent advances in exploratory, open-ended research: multimodal foundation models that aim to become general-purpose assistants, covering three topics, unified vision models inspired by large language models, end-to-end training of multimodal large language models, and chaining multimodal tools with language models. The target audience is researchers, graduate students, and professionals in the computer vision and vision-language multimodal communities who want to understand the fundamentals and recent advances of multimodal foundation models.

Paper link:
https://www.aminer.cn/pub/650a56593fda6d7f067ea000/?f=cs

4. Baichuan 2: Open Large-scale Language Models

The article addresses two issues: most powerful large language models are closed-source, or limited in their capability for languages other than English. To address this, the authors propose Baichuan 2, a family of large-scale multilingual language models with 7 billion and 13 billion parameters, trained from scratch on 2.6 trillion tokens. Baichuan 2 matches or exceeds other open-source models of similar size on public benchmarks such as MMLU, CMMLU, GSM8K, and HumanEval, and also excels in vertical domains such as medicine and law. The authors release all pre-trained model checkpoints so that the research community can better understand the training dynamics of Baichuan 2.
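
Since the checkpoints are public, loading one with Hugging Face transformers might look like the minimal sketch below (the repo id is the commonly referenced one for the 7B base model; verify it against the official release notes):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Baichuan 2 ships custom modeling code, so trust_remote_code is required.
model_id = "baichuan-inc/Baichuan2-7B-Base"

tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id, device_map="auto", trust_remote_code=True
)

inputs = tokenizer("The key advantage of open-source language models is",
                   return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```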

Paper link:
https://www.aminer.cn/pub/650a566d3fda6d7f067eccc7/?f=cs

5. SlimPajama-DC: Understanding Data Combinations for LLM Training

This paper uses SlimPajama to train large language models and to understand how different data combinations (web text, Wikipedia, GitHub, books, etc.) affect training. SlimPajama is a rigorously deduplicated, multi-source dataset refined to 627B tokens from the extensive 1.2T-token RedPajama dataset contributed by Together. The study, called SlimPajama-DC, is an empirical analysis aimed at revealing the fundamental characteristics and best practices of using SlimPajama for large language model training. Two important observations emerged: (1) global versus local deduplication: the authors analyze and discuss how global deduplication (across different dataset sources) and local deduplication (within a single dataset source) affect trained-model performance; (2) the proportion of high-quality, highly deduplicated multi-source data in the combination: to study this, they build six configurations of the SlimPajama dataset and train each with a 1.3B Cerebras-GPT model using ALiBi and SwiGLU. The best configuration significantly outperforms a model trained on RedPajama with the same number of training tokens. All 1.3B models are trained on a Cerebras 16×CS-2 cluster totaling 80 PFLOP/s in bf16 mixed precision. The findings are further extended to a 7B model with large-batch training (for example, increasing data diversity after global deduplication proves crucial).
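
To make the global-versus-local distinction concrete, here is a toy sketch using exact hashing (real pipelines typically use fuzzy methods such as MinHash):

```python
import hashlib

def dedup_local(sources):
    """Local: drop duplicates within each source independently."""
    out = {}
    for name, docs in sources.items():
        seen, kept = set(), []
        for d in docs:
            h = hashlib.sha1(d.encode()).hexdigest()
            if h not in seen:
                seen.add(h)
                kept.append(d)
        out[name] = kept
    return out

def dedup_global(sources):
    """Global: drop duplicates across *all* sources with one shared seen-set."""
    seen, out = set(), {}
    for name, docs in sources.items():
        kept = []
        for d in docs:
            h = hashlib.sha1(d.encode()).hexdigest()
            if h not in seen:
                seen.add(h)
                kept.append(d)
        out[name] = kept
    return out

sources = {"web": ["a", "b", "a"], "wiki": ["b", "c"]}
print(dedup_local(sources))   # {'web': ['a', 'b'], 'wiki': ['b', 'c']}
print(dedup_global(sources))  # {'web': ['a', 'b'], 'wiki': ['c']}
```

Note that global deduplication removes cross-source repeats that local deduplication leaves in place, which changes the effective mixture proportions of the combined dataset.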

Paper link:
https://www.aminer.cn/pub/650a566d3fda6d7f067eced6/?f=cs

6. Q-Transformer: Scalable Offline Reinforcement Learning via Autoregressive Q-Functions

This paper addresses how to train multi-task policies at scale from large offline datasets via the Q-Transformer method. The method uses the scalable sequence representation provided by Transformers for Q-functions trained with offline temporal-difference backups. By discretizing each action dimension and representing the Q-value of each action dimension as a separate token, efficient high-capacity sequence-modeling techniques can be applied to Q-learning. The authors describe several design decisions that enable good performance in offline reinforcement learning training and show that Q-Transformer outperforms prior offline reinforcement learning algorithms and imitation-learning techniques on a large, diverse suite of real-world robotic manipulation tasks.
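
A minimal sketch of the per-dimension discretization idea, with illustrative bin counts and action ranges rather than the paper's exact settings:

```python
import numpy as np

# Each continuous action dimension is mapped to an integer bin index; the
# sequence of indices is then treated as tokens a Transformer can predict
# autoregressively, with one Q-value per candidate bin.

NUM_BINS = 256

def discretize(action, low, high, num_bins=NUM_BINS):
    """Map each continuous action dimension to a token in [0, num_bins)."""
    scaled = (action - low) / (high - low)                 # normalize to [0, 1]
    return np.clip((scaled * num_bins).astype(int), 0, num_bins - 1)

def undiscretize(tokens, low, high, num_bins=NUM_BINS):
    """Map tokens back to the centers of their bins."""
    return low + (tokens + 0.5) / num_bins * (high - low)

action = np.array([0.1, -0.7, 0.3])        # e.g., a 3-DoF end-effector delta
low, high = np.full(3, -1.0), np.full(3, 1.0)
tokens = discretize(action, low, high)
print(tokens, undiscretize(tokens, low, high))
```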

Paper link:
https://www.aminer.cn/pub/650a566d3fda6d7f067ecc23/?f=cs

7. FoleyGen: Visually-Guided Audio Generation

The paper notes that video-to-audio (V2A) generation is challenging because of the complex relationship between high-dimensional visual and auditory data and the difficulty of temporal synchronization. To address this, the authors introduce FoleyGen, an open-domain V2A generation system built on the language-modeling paradigm. FoleyGen uses an off-the-shelf neural audio codec to convert bidirectionally between waveforms and discrete tokens; audio tokens are generated by a single Transformer model conditioned on visual features extracted by a visual encoder. A common problem in V2A generation is misalignment between the generated audio and the actions visible in the video, so the paper explores three novel visual attention mechanisms. It also provides an exhaustive evaluation of multiple visual encoders, each pre-trained on single-modal or multimodal tasks. Experimental results on the VGGSound dataset show that FoleyGen outperforms previous systems on all objective metrics and in human evaluation.
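
At a high level, the pipeline can be summarized with the skeleton below, in which every component is a stubbed placeholder rather than the authors' actual API:

```python
# Hypothetical skeleton of a FoleyGen-style V2A pipeline; the three callables
# are placeholders standing in for the real components described in the paper.

def generate_audio(video_frames, visual_encoder, token_lm, audio_codec):
    """Frames -> visual features -> discrete audio tokens -> waveform."""
    # 1. Extract per-frame visual features from a pre-trained visual encoder.
    features = visual_encoder(video_frames)
    # 2. A single Transformer autoregressively emits discrete audio tokens,
    #    attending to the visual features (this is where the paper's visual
    #    attention mechanisms apply) to keep audio aligned with on-screen actions.
    audio_tokens = token_lm.generate(condition=features)
    # 3. An off-the-shelf neural audio codec decodes tokens back to a waveform.
    return audio_codec.decode(audio_tokens)
```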

Paper link:
https://www.aminer.cn/pub/650a566d3fda6d7f067ecdb6/?f=cs

8. Stabilizing RLHF through Advantage Model and Selective Rehearsal

The report notes the revolutionary role of large language models (LLMs) in natural language processing, but aligning these models with human values and preferences through RLHF remains a significant challenge, marked by instabilities such as reward hacking and catastrophic forgetting. To stabilize RLHF training, this technical report proposes two innovative methods: 1) an Advantage Model, which directly models the advantage score, that is, the extra reward relative to the expected reward, and regulates score distributions across tasks to prevent reward hacking; 2) Selective Rehearsal, which mitigates catastrophic forgetting by strategically selecting data for PPO training and knowledge rehearsal. Experimental analysis on public and proprietary datasets shows that the proposed methods not only improve the stability of RLHF training but also achieve higher reward scores and win rates.
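
A minimal sketch of the advantage-score idea, with illustrative numbers (not the paper's implementation):

```python
# Advantage = response reward minus an estimate of the expected reward for
# that prompt, making scores comparable across prompts/tasks and harder for
# the policy to "hack" by exploiting prompts with inflated raw rewards.

def advantage_scores(rewards, expected_rewards):
    return [r - e for r, e in zip(rewards, expected_rewards)]

rewards = [2.5, 0.75, 1.25]    # reward-model scores per response
expected = [2.0, 1.0, 1.0]     # per-prompt expected-reward baselines
print(advantage_scores(rewards, expected))  # [0.5, -0.25, 0.25]
```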

Paper link:
https://www.aminer.cn/pub/650a566d3fda6d7f067ecc5a/?f=cs


END

We have added a "Daily Selected New Papers" topic on the AMiner homepage. Click "Subscribe" and "Add to Knowledge Base" to receive all the paper information!

View all featured new papers: https://www.aminer.cn


Origin blog.csdn.net/AI_Conf/article/details/133175466