Popular papers in July丨Llama 2's open-source release sweeps the large-model world, AI develops its own software, and Transformers scale to 1 billion tokens

60% of 2023 has already passed, and AI has kept demonstrating astonishing capabilities. The popular papers for July have been updated. Compared with previous roundups, which emphasized each paper's influence, this time we pay more attention to what the technology actually brings us.

First, the open-source release of Llama 2 has drawn the attention of the entire large-model world: it is free for commercial use and comes in three parameter sizes of 7 billion, 13 billion, and 70 billion, with variants optimized for dialogue use cases. Thanks to its strong performance and flexibility, the open-source Llama 2 has already demonstrated its strength across many application scenarios.

Professor Sun Maosong's team at Tsinghua University studied how to organize multiple large-model agents into a group that operates a virtual technology company for collaborative software development. This is a new concept: AI supplies the imagination, and we have good reason to expect this approach to be applied more widely in the future.

LONGNET, proposed by Microsoft, can scale Transformer sequence lengths to 1 billion tokens. This means the Transformer can process much longer text sequences and thus achieve better results on more natural language processing tasks.

Here we present the 17 most representative popular papers. If you want all the papers, please click the link at the end of the article.

1.Llama 2: Open Foundation and Fine-Tuned Chat Models

Meta has open-sourced Llama 2, free for commercial use, in three parameter sizes (7 billion, 13 billion, and 70 billion), with variants optimized for conversational use cases.
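For readers who want to try the open-source weights, here is a minimal sketch of generating a reply with the 7B chat variant through the Hugging Face `transformers` library. The model id `meta-llama/Llama-2-7b-chat-hf` and the need to have been granted access to the gated repository are assumptions about Meta's release, not details from the paper itself.

```python
# Minimal sketch: generate a reply with the Llama-2-7B chat weights.
# Assumes the Hugging Face model id "meta-llama/Llama-2-7b-chat-hf" and that
# access to the gated repository has already been granted.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "meta-llama/Llama-2-7b-chat-hf"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.float16, device_map="auto"
)

# Llama 2 chat models expect the [INST] ... [/INST] prompt format.
prompt = "[INST] Summarize what Llama 2 is in one sentence. [/INST]"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```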

2.Artificial Intelligence for Science in Quantum, Atomistic, and Continuum Systems

A review of AI for Science by 63 scholars from 4 institutions. The paper lays out the problems artificial intelligence faces in quantum, atomistic, and continuum systems, discusses common technical challenges that cut across these areas, and provides a categorized list of learning and educational resources to promote further research and development in AI for Science.

3.Meta-Transformer: A Unified Framework for Multimodal Learning

The authors propose a framework called Meta-Transformer that uses a frozen encoder for multimodal perception without any paired multimodal training data. The results point to a promising future for building unified multimodal intelligence with Transformers.

4.Optimized Network Architectures for Large Language Model Training with Billions of Parameters

The authors find that the communication pattern of LLM training is unique: it requires high-bandwidth any-to-any communication only within small groups of GPUs, while traffic outside these groups is minimal, sparse, and evenly distributed. Building on this observation, they propose a new network architecture that partitions the cluster into sets of GPUs connected by a non-blocking any-to-any high-bandwidth interconnect, called HB domains. This design can reduce network cost by up to 75% without compromising LLM training performance.
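To get an intuition for why restricting any-to-any connectivity to small groups can cut cost so sharply, the toy calculation below counts how many GPU pairs actually need a high-bandwidth path when the cluster is split into HB domains. The cluster and domain sizes are made-up illustrative numbers, not figures from the paper.

```python
# Toy illustration: fraction of GPU pairs that need an any-to-any
# high-bandwidth path when the cluster is split into HB domains.
# The numbers below are illustrative, not taken from the paper.

def pair_count(n: int) -> int:
    """Number of unordered GPU pairs among n GPUs."""
    return n * (n - 1) // 2

total_gpus = 1024        # hypothetical cluster size
hb_domain_size = 8       # hypothetical HB domain size (e.g. one NVSwitch group)
num_domains = total_gpus // hb_domain_size

all_pairs = pair_count(total_gpus)
intra_domain_pairs = num_domains * pair_count(hb_domain_size)

print(f"all GPU pairs:         {all_pairs}")
print(f"intra-HB-domain pairs: {intra_domain_pairs}")
print(f"share needing any-to-any bandwidth: {intra_domain_pairs / all_pairs:.2%}")
```

Under these toy numbers, well under 1% of GPU pairs ever need the expensive any-to-any fabric, which is the intuition behind the reported cost savings.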

5.TokenFlow: Consistent Diffusion Features for Consistent Video Editing

The authors propose a framework that leverages the power of text-to-image diffusion models for text-driven video editing: given a source video and a target text prompt, it generates a high-quality edited video.

6.Communicative Agents for Software Development

Professor Sun Maosong's team at Tsinghua University recently studied how to organize multiple large-model agents into a group that operates a virtual technology company (ChatDev) for collaborative software development. Given only a natural-language requirement, ChatDev can generate software for the user fully automatically.
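The chat-chain idea can be illustrated with a toy loop in which role-playing agents pass messages to one another. The `chat` helper below is a stand-in for any LLM chat API, and the roles and prompts are simplified inventions for illustration, not ChatDev's actual prompts or architecture.

```python
# Toy sketch of a role-playing "software company" dialogue, loosely inspired
# by ChatDev. The `chat` helper is a stand-in for any LLM chat API; the roles
# and prompts are simplified illustrations, not the paper's actual prompts.
from typing import Dict, List

def chat(system_prompt: str, history: List[Dict[str, str]]) -> str:
    """Placeholder for a real LLM call (swap in any chat-completion API)."""
    return f"(reply written while following: {system_prompt})"

ROLES = {
    "CTO": "You design the software architecture for the given requirement.",
    "Programmer": "You write code implementing the agreed design.",
    "Reviewer": "You review the code and point out bugs or missing pieces.",
}

def run_phase(requirement: str, speakers: List[str], turns: int = 2) -> List[Dict[str, str]]:
    """Let the listed roles take turns responding to the shared history."""
    history = [{"role": "user", "content": requirement}]
    for turn in range(turns):
        speaker = speakers[turn % len(speakers)]
        reply = chat(ROLES[speaker], history)
        history.append({"role": speaker, "content": reply})
    return history

# A full pipeline would chain phases such as design -> coding -> review,
# each phase being a short dialogue between two of the roles.
design = run_phase("Build a command-line to-do list app.", ["CTO", "Programmer"])
print(design[-1]["content"])
```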
7. Retentive Network: A Successor to Transformer for Large Language Models

The paper proposes RetNet, a network architecture for building large language models that simultaneously achieves training parallelism, low-cost inference, and strong performance.
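The low-cost inference claim comes from retention's recurrent form: the state is updated as S_n = γ·S_{n-1} + k_nᵀ v_n and the output is q_n·S_n, so generation needs only a constant-size state per head instead of a growing KV cache. Below is a simplified single-head sketch with arbitrary dimensions; the paper's complex rotations, multi-scale decay, and normalization are omitted.

```python
# Simplified single-head sketch of RetNet's recurrent retention:
#   S_n = gamma * S_{n-1} + k_n^T v_n,   o_n = q_n S_n
# Rotary-style phase terms, multi-scale gamma, and normalization from the
# paper are omitted; dimensions are arbitrary illustrative values.
import numpy as np

d_k, d_v = 4, 4          # illustrative head dimensions
gamma = 0.9              # decay factor (a per-head constant in the paper)
seq_len = 6

rng = np.random.default_rng(0)
q = rng.standard_normal((seq_len, d_k))
k = rng.standard_normal((seq_len, d_k))
v = rng.standard_normal((seq_len, d_v))

S = np.zeros((d_k, d_v))                   # recurrent state: constant size
outputs = []
for n in range(seq_len):
    S = gamma * S + np.outer(k[n], v[n])   # state update
    outputs.append(q[n] @ S)               # retention output for step n
outputs = np.stack(outputs)
print(outputs.shape)                       # (seq_len, d_v)
```

Unrolling the recurrence gives o_n = Σ_{m≤n} γ^{n-m} (q_n·k_m) v_m, which is exactly the parallel form used during training, hence the same model supports both parallel training and O(1)-memory decoding.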

8.DreamTeacher: Pretraining Image Backbones with Deep Generative Models

This work introduces DreamTeacher, a self-supervised feature representation learning framework that utilizes generative networks to pre-train image backbones for downstream tasks.

9.In-context Autoencoder for Context Compression in a Large Language Model

The paper introduces a model named In-context Autoencoder (ICAE) for context compression in large language models.
10. A Survey on Graph Neural Networks for Time Series: Forecasting, Classification, Imputation, and Anomaly Detection

A comprehensive overview of GNNs for time series, covering forecasting, classification, anomaly detection, and missing-data imputation tasks.
11.CAME: Confidence-guided Adaptive Memory Efficient Optimization

An ACL 2023 outstanding paper. Researchers from the National University of Singapore, Huawei Noah's Ark Lab, and other institutions propose the CAME optimizer, which matches Adam's performance while reducing memory consumption. Training a large language model with CAME can greatly reduce the cost of model training.
12.VoxPoser: Composable 3D Value Maps for Robotic Manipulation with Language Models

The latest embodied-intelligence work from Fei-Fei Li's team. A robot connected to a large model is evaluated at scale in both simulated and real robot environments, and it can perform more than 30 everyday manipulation tasks specified in free-form natural language.
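The core idea of composable 3D value maps can be illustrated on a toy voxel grid: a map marking where the task wants the end-effector to go is combined with a map penalizing regions to avoid, and a planner would then move toward the high-value region. The grid size, weights, and map contents below are illustrative assumptions, not the paper's implementation, which builds these maps from LLM-generated code and a vision-language model.

```python
# Toy illustration of composing 3D value maps on a voxel grid, loosely in the
# spirit of VoxPoser. Grid size, weights, and map contents are made up.
import numpy as np

grid = 32
affordance = np.zeros((grid, grid, grid))   # high where the task wants the end-effector
avoidance = np.zeros((grid, grid, grid))    # negative near regions to keep away from

affordance[20, 10, 5] = 1.0                 # e.g. "the handle of the drawer"
avoidance[10:14, 24:28, 4:7] = -0.5         # e.g. "keep away from the vase"

# Compose the maps into a single value map and pick the best target voxel.
value_map = affordance + avoidance
target_voxel = np.unravel_index(np.argmax(value_map), value_map.shape)
print("target voxel:", target_voxel)
```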
13. A Survey on Graph Classification and Link Prediction based on GNN

This article introduces graph classification and link prediction methods based on graph neural networks. It first explains the basic principles of graph convolutional neural networks in detail, then describes graph neural network models based on attention mechanisms and autoencoders, and finally covers their applications and related datasets for tasks such as node classification, graph classification, and link prediction.
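The graph convolution underlying most of the surveyed models has a compact form, H' = σ(D̂^{-1/2} Â D̂^{-1/2} H W), where Â is the adjacency matrix with self-loops and D̂ its degree matrix. Below is a minimal NumPy sketch of a single layer; the random toy graph and feature sizes are arbitrary illustrative values.

```python
# Minimal sketch of one graph convolutional (GCN) layer:
#   H' = ReLU( D^{-1/2} (A + I) D^{-1/2} H W )
# The random toy graph and feature dimensions are arbitrary illustrative values.
import numpy as np

def gcn_layer(A: np.ndarray, H: np.ndarray, W: np.ndarray) -> np.ndarray:
    A_hat = A + np.eye(A.shape[0])                 # add self-loops
    deg = A_hat.sum(axis=1)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(deg))       # symmetric normalization
    return np.maximum(D_inv_sqrt @ A_hat @ D_inv_sqrt @ H @ W, 0.0)

rng = np.random.default_rng(0)
A = (rng.random((5, 5)) > 0.6).astype(float)
A = np.maximum(A, A.T)                             # make the toy graph undirected
H = rng.standard_normal((5, 8))                    # 5 nodes, 8 input features
W = rng.standard_normal((8, 4))                    # project to 4 output features
print(gcn_layer(A, H, W).shape)                    # (5, 4)
```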
14.LONGNET: Scaling Transformers to 1,000,000,000 Tokens

The paper introduces LONGNET, a Transformer variant that can scale sequence length to over 1 billion tokens without sacrificing performance on shorter sequences.
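LONGNET's key ingredient is dilated attention: the sequence is split into segments and, within each segment, a position attends only to positions sharing its offset under a dilation rate, with several (segment length, dilation) configurations mixed so that cost grows roughly linearly with sequence length. The sketch below only computes which positions a query may attend to under one such configuration; the segment length and dilation rate are illustrative values, not the paper's settings.

```python
# Toy sketch of the sparsification pattern behind dilated attention (LONGNET).
# For one (segment_len, dilation) configuration, each position only attends to
# positions in its own segment that share its offset modulo `dilation`, so the
# per-segment attention cost shrinks by roughly a factor of dilation^2.
# Segment length and dilation below are illustrative, not the paper's settings.

def dilated_attention_targets(seq_len: int, segment_len: int, dilation: int):
    """Return, for each position, the positions it may attend to."""
    targets = []
    for pos in range(seq_len):
        seg_start = (pos // segment_len) * segment_len
        seg_end = min(seg_start + segment_len, seq_len)
        allowed = [t for t in range(seg_start, seg_end)
                   if (t - seg_start) % dilation == (pos - seg_start) % dilation]
        targets.append(allowed)
    return targets

for pos, allowed in enumerate(dilated_attention_targets(16, segment_len=8, dilation=4)):
    print(pos, allowed)
```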

15.Segment Anything Meets Point Tracking

The paper proposes the SAM-PT method, which extends the capabilities of the SAM model to track and segment any target in dynamic videos.
16.Generate Anything Anywhere in Any Scene

The paper introduces a text-to-image diffusion model capable of generating arbitrary objects anywhere in arbitrary scenes.
17. RT-2: Vision-Language-Action Models Transfer Web Knowledge to Robotic Control

This research addresses how to directly apply vision-language models trained on Internet-scale data to end-to-end robot control, improving generalization and enabling emergent semantic reasoning.


Click the link below to download the "July Must-Read Papers Collection":

https://www.aminer.cn/topic/64d08d4d7dcf6a339bc6713c
