Real-time tracking of research trends | August 14: selected new papers from Meta AI, Tsinghua University, the University of Hong Kong, and other institutions, with ChatPaper reviews

As a researcher, you need to search and browse a large volume of academic literature every day to keep up with the latest scientific and technological progress and research results. Traditional retrieval and reading methods can no longer keep pace with these needs.

ChatPaper is a literature tool that integrates retrieval, reading, and knowledge Q&A. It helps you retrieve and read papers more efficiently, stay on top of the latest research trends in your field, and make research work more comfortable.

Combined with the frontier-news subscription feature, ChatPaper selects the most popular new arXiv papers of the day and compiles them into a digest, so everyone can catch up on frontier developments more quickly.

If you want an in-depth conversation about a particular paper, paste the paper's link into your browser or go directly to the ChatPaper page: https://www.aminer.cn/chat/g/explain

List of selected new papers on August 14, 2023:
1. Composable Function-preserving Expansions for Transformer Architectures

https://www.aminer.cn/pub/64d9a6813fda6d7f061d30d1/

ChatPaper review: Training state-of-the-art neural network models carries high computational and time costs, and model scale is considered a key factor in reaching and improving the state of the art. Growing a neural network normally requires re-initializing all parameters from scratch, because the architecture change alters the parameters and offers no direct way to transfer knowledge from the smaller model. To address this, the paper proposes six composable transformations that incrementally increase the size of a Transformer-based network while preserving its function, allowing the model's capacity to grow as needed. For each transformation, the paper provides a proof of exact function preservation under minimal initialization constraints. The approach could enable an efficient training pipeline for larger, more powerful models by progressively expanding the architecture during training.
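
To make "function preserving" concrete, here is a minimal PyTorch sketch of one classic expansion of this kind: widening an MLP's hidden layer while keeping the network's output unchanged, by giving the new hidden units zero output weights. This illustrates the general idea only, not the paper's exact six constructions; the function and variable names are our own.

```python
import torch
import torch.nn as nn

def expand_mlp_hidden(w_in: nn.Linear, w_out: nn.Linear, extra: int):
    """Widen a 2-layer MLP's hidden dimension without changing its function.

    New hidden units receive random input weights but zero output weights,
    so w_out(act(w_in(x))) is unchanged for any activation `act`.
    """
    d_hidden, d_in = w_in.weight.shape
    d_out = w_out.weight.shape[0]

    new_in = nn.Linear(d_in, d_hidden + extra)
    new_out = nn.Linear(d_hidden + extra, d_out)
    with torch.no_grad():
        new_in.weight[:d_hidden] = w_in.weight        # copy old input weights
        new_in.bias[:d_hidden] = w_in.bias
        new_out.weight[:, :d_hidden] = w_out.weight   # copy old output weights
        new_out.bias.copy_(w_out.bias)
        new_out.weight[:, d_hidden:].zero_()          # new units contribute nothing
    return new_in, new_out

# Quick check: outputs match before and after expansion.
w1, w2 = nn.Linear(16, 64), nn.Linear(64, 16)
x = torch.randn(2, 16)
n1, n2 = expand_mlp_hidden(w1, w2, extra=32)
assert torch.allclose(w2(torch.relu(w1(x))), n2(torch.relu(n1(x))), atol=1e-6)
```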

2. Improving Joint Speech-Text Representations Without Alignment

https://www.aminer.cn/pub/64d9a6873fda6d7f061d372d/

ChatPaper review: The paper addresses the sequence-length mismatch between speech and text and proposes a consistency loss to mitigate it. Previous methods handle the length difference with upsampling heuristics or explicit alignment models, which are relatively complex. The research shows that, by ignoring sequence length, joint speech-text encoders can naturally form consistent cross-modal representations. The authors therefore propose a consistency loss that tolerates sequences of different lengths and assumes the best alignment. Experiments show that this loss improves downstream word error rate (WER) and is effective in both large-scale monolingual and multilingual systems.
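
One simple way to realize such a length-tolerant consistency term (a hedged sketch, not necessarily the paper's exact objective) is to mean-pool each modality's encoder output over its own time axis, so the pooled vectors become comparable regardless of sequence length:

```python
import torch.nn.functional as F

def pooled_consistency_loss(speech_emb, speech_mask, text_emb, text_mask):
    """Length-agnostic consistency between speech and text encoder outputs.

    speech_emb: (B, T_s, D); speech_mask: (B, T_s), 1 for valid frames.
    text_emb:   (B, T_t, D); text_mask:   (B, T_t), 1 for valid tokens.
    Mean-pooling over each modality's own time axis removes the length
    mismatch before the two representations are compared.
    """
    def masked_mean(x, mask):
        m = mask.unsqueeze(-1).float()
        return (x * m).sum(dim=1) / m.sum(dim=1).clamp(min=1.0)

    s = masked_mean(speech_emb, speech_mask)  # (B, D)
    t = masked_mean(text_emb, text_mask)      # (B, D)
    return 1.0 - F.cosine_similarity(s, t, dim=-1).mean()
```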

3. Self-Alignment with Instruction Backtranslation

https://www.aminer.cn/pub/64d9a6873fda6d7f061d37b9/

ChatPaper review: The research addresses the problem of building high-quality instruction-following language models and proposes a scalable approach. By automatically generating instructions for human-written web text and selecting high-quality examples for self-training, the language model can annotate the text without any teacher signal. After two iterations of improvement, the model outperforms other LLaMA-based models on the Alpaca leaderboard without relying on distilled data, indicating that self-alignment works very well.
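
A hedged sketch of the loop this describes, with the LLM operations abstracted as callables; `generate_instruction`, `rate_quality`, and `finetune` are hypothetical stand-ins, not the paper's code:

```python
from typing import Callable, List, Tuple

def instruction_backtranslation(
    generate_instruction: Callable[[str], str],   # backward model: text -> instruction
    rate_quality: Callable[[str, str], int],      # model self-rates a pair, e.g. 1-5
    finetune: Callable[[List[Tuple[str, str]]], None],
    web_texts: List[str],
    iterations: int = 2,
    threshold: int = 5,
) -> List[Tuple[str, str]]:
    """Self-augment then self-curate instruction data (illustrative only)."""
    curated: List[Tuple[str, str]] = []
    for _ in range(iterations):
        # Self-augmentation: predict an instruction for each unlabeled text.
        candidates = [(generate_instruction(t), t) for t in web_texts]
        # Self-curation: keep only pairs the current model rates highly.
        curated = [(i, r) for i, r in candidates if rate_quality(i, r) >= threshold]
        # Fine-tune on the curated pairs before the next round.
        finetune(curated)
    return curated
```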

4. BOLAA: Benchmarking and Orchestrating LLM-augmented Autonomous Agents

https://www.aminer.cn/pub/64d9a6813fda6d7f061d303e/

ChatPaper Overview: The article addresses two questions. The first is that research on LLM-augmented Autonomous Agents (LAAs) is still in its infancy, with only limited exploration available so far; to address this, the authors provide a comprehensive comparison of LAAs, covering both agent architectures and LLM backbones, and propose a new strategy for orchestrating communication among multiple LAAs. The second concerns the best choices for designing an LAA architecture and selecting an LLM, and the compatibility between the two.
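
As a rough illustration of what orchestrating multiple LAAs could look like, here is a minimal sketch in which a controller routes each subtask to the most relevant agent; the `Agent` interface and relevance scoring are assumptions for illustration, not BOLAA's actual design:

```python
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class Agent:
    name: str
    relevance: Callable[[str], float]  # how well this agent fits a subtask
    act: Callable[[str], str]          # run the agent's LLM-backed policy

def orchestrate(split_task: Callable[[str], List[str]],
                agents: List[Agent], task: str) -> List[str]:
    """Split a task into subtasks and dispatch each to the best-fitting agent."""
    results = []
    for subtask in split_task(task):
        best = max(agents, key=lambda a: a.relevance(subtask))
        results.append(best.act(subtask))
    return results
```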

5. Foundation Model is Efficient Multimodal Multitask Model Selector

https://www.aminer.cn/pub/64d9a6873fda6d7f061d37bc/

ChatPaper review: The paper explores an understudied but important problem: how to predict the performance of pre-trained neural networks on multi-modal tasks, such as image recognition, referring expression comprehension, captioning, visual question answering, and text question answering, without fine-tuning them. A brute-force approach is to fine-tune all models on all target datasets, which incurs high computational cost. Although recent advanced methods adopt lightweight metrics to measure model transferability, they often rely heavily on prior knowledge of a single task and therefore cannot be applied in multi-modal, multi-task scenarios. To address this, the authors propose an Efficient Multi-task Model Selector (EMMS), which leverages large-scale foundation models to convert the diverse label formats of different downstream tasks (e.g., categories, text, and bounding boxes) into unified noisy label embeddings. EMMS then estimates model transferability via simple weighted linear regression, which can be solved efficiently by an alternating minimization algorithm with convergence guarantees. Extensive experiments on 5 downstream tasks across 24 datasets show that EMMS is fast, effective, and general enough to assess the transferability of pre-trained models, making it the first model selection method for multi-task scenarios.
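
A simplified sketch of the weighted-linear-regression idea with alternating minimization (assuming all label embeddings share one shape; the real EMMS formulation, constraints, and guarantees are not reproduced here):

```python
import numpy as np

def emms_score(features, label_embs, iters=5):
    """Toy transferability estimate in the spirit of EMMS.

    features:   (N, d) frozen features a candidate model extracts on the
                target data.
    label_embs: list of K arrays, each (N, m): label embeddings produced
                by foundation models from heterogeneous labels.
    Alternates between fitting a linear head (least squares) and
    re-weighting the label embeddings; returns the negative residual
    (higher = more transferable).
    """
    K = len(label_embs)
    w = np.ones(K) / K
    A = np.stack([Y.ravel() for Y in label_embs], axis=1)  # (N*m, K)
    for _ in range(iters):
        # Step 1: weights fixed -> fit the linear head by least squares.
        target = sum(wk * Y for wk, Y in zip(w, label_embs))
        head, *_ = np.linalg.lstsq(features, target, rcond=None)
        # Step 2: head fixed -> re-fit the weights, then project them
        # back to be non-negative and sum to one.
        pred = (features @ head).ravel()
        w, *_ = np.linalg.lstsq(A, pred, rcond=None)
        w = np.clip(w, 0.0, None)
        w = w / w.sum() if w.sum() > 0 else np.ones(K) / K
    target = sum(wk * Y for wk, Y in zip(w, label_embs))
    return -float(np.linalg.norm(features @ head - target))
```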

6. LittleMu: Deploying an Online Virtual Teaching Assistant via Heterogeneous Sources Integration and Chain of Teach Prompts

https://www.aminer.cn/pub/64d9a6813fda6d7f061d3022/

ChatPaper Overview: This paper proposes LittleMu, a virtual MOOC teaching assistant, to address the scarcity of real interaction scenarios and the complexity of training data on online education platforms. By heterogeneously integrating structured, semi-structured, and unstructured knowledge sources, LittleMu can provide accurate answers to a wide range of questions. The system also designs a prompting scheme called "Chain of Teach" to handle complex, previously uncollected questions. Beyond question answering, it offers other educational services such as knowledge-grounded chatting. The authors evaluated the system both offline and in online deployment. Since May 2020, LittleMu has served over 300,000 queries from more than 80,000 users on the XuetangX MOOC platform, continuing to contribute to more convenient and equitable education.

7. DiLogics: Creating Web Automation Programs With Diverse Logics

https://www.aminer.cn/pub/64d9a6813fda6d7f061d2faf/

ChatPaper review: Web automation can improve productivity, but accurately translating tasks into web UI operations and scaling to new specifications is challenging. Existing tools can automate tasks that repeat the same UI operation (for example, entering text into each field in turn), but they do not support tasks whose operations differ across input conditions. This article introduces DiLogics, a programming-by-demonstration system that uses natural language processing to help users create web automation programs that handle diverse specifications. DiLogics first semantically segments the input data into structured task steps; by recording a user demonstration for each step, it generalizes the web macro to novel but semantically similar task requirements. The evaluation shows that non-experts can effectively use DiLogics to create automation programs that satisfy diverse input instructions, making it an efficient, intuitive, and expressive way to develop web automation programs.
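
A hedged sketch of the step-matching idea: each new task step is matched to the semantically closest recorded demonstration, whose UI macro would then be replayed with the new step's data. The `embed` function and macro objects are hypothetical stand-ins, not DiLogics' actual implementation:

```python
import numpy as np

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-8))

def generalize_steps(new_steps, demonstrations, embed):
    """Map each new task step to the closest demonstrated step's macro.

    new_steps:      list of natural-language step descriptions
    demonstrations: list of (description, macro) pairs recorded from the user
    embed:          text -> np.ndarray embedding function (hypothetical)
    """
    demo_vecs = [(embed(desc), macro) for desc, macro in demonstrations]
    plan = []
    for step in new_steps:
        v = embed(step)
        _, macro = max(demo_vecs, key=lambda dm: cosine(v, dm[0]))
        plan.append((step, macro))  # replay `macro` with this step's data
    return plan
```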


How to use ChatPaper?

Using ChatPaper is simple: open the AMiner homepage and go to the ChatPaper page from the navigation bar at the top of the page, or from the lower right corner.

On the ChatPaper page, you can choose to converse with a single paper or with your entire personal document library. You can upload a local PDF or search for papers directly on AMiner.

A ChatPaper usage tutorial is available on the AMiner site.

Source: blog.csdn.net/AI_Conf/article/details/132295725