Track research trends in real time | Selected new papers from Google, the Max Planck Institute for the Science of Light, and other institutions

As a researcher, you need to search and browse a large volume of academic literature every day to keep up with the latest scientific and technological progress. However, traditional retrieval and reading methods can no longer keep pace with researchers' needs.

AMiner AI is a literature knowledge tool that integrates retrieval, reading, and knowledge Q&A. It helps you retrieve and read papers more efficiently, follow the latest research trends in your field, and make research work easier.

Combined with its frontier-news subscription feature, AMiner AI selects the day's most popular new arXiv papers and compiles them into a paper digest, so that everyone can catch up on the latest developments more quickly.

If you want to have an in-depth conversation about a particular paper, you can copy the paper link into your browser or go directly to the AMiner AI page: https://www.aminer.cn/chat/g/explain

List of selected new papers on September 14, 2023:

1.MagiCapture: High-Resolution Multi-Concept Portrait Customization

The paper addresses a problem with current personalization methods for facial image generation: the generated images often fall short of commercial quality and contain unrealistic artifacts. This is especially noticeable in portrait generation, where any unnatural trace in a human face is easy to spot because of our inherent human bias. To solve this problem, the authors introduce MagiCapture, a personalization method that integrates subject and style concepts to generate high-resolution portrait images from only a few subject and style reference images. The main challenge is the absence of ground truth for the composed concepts, which lowers the quality of the final output and causes the identity of the source subject to shift. To address these issues, the authors propose a novel Attention Refocusing loss coupled with auxiliary priors, both of which support robust learning in this weakly supervised setting. The pipeline also includes additional post-processing steps to ensure highly realistic results. MagiCapture outperforms other baselines in both quantitative and qualitative evaluations, and also generalizes to non-human subjects.

https://www.aminer.cn/pub/65026d513fda6d7f06474c11/?f=cs

2.Large Language Models for Compiler Optimization

The paper explores a novel application of large language models to compiler optimization. The researchers present a 7B-parameter Transformer model, trained from scratch, that optimizes LLVM assembly for code size. The model takes unoptimized assembly as input and outputs a list of compiler options that best optimize the program. During training, the model is also asked to predict the instruction counts before and after optimization, as well as the optimized code itself. These auxiliary learning tasks significantly improve the model's optimization performance and depth of understanding. The researchers evaluated the model on a large suite of test programs. Their method achieves a 3.0% improvement over the compiler in reducing instruction counts, outperforming two state-of-the-art baselines that require thousands of compilations. Furthermore, the model shows surprisingly strong code reasoning abilities, generating compilable code 91% of the time and perfectly emulating the compiler's output 70% of the time. The paper thus covers both the challenges and the progress in applying large language models to compiler optimization.

https://www.aminer.cn/pub/65026d513fda6d7f06474cc3/?f=cs

3.Statistical Rejection Sampling Improves Preference Optimization

The paper addresses the problem of aligning language models with human preferences and describes the limitations of existing approaches. Previous approaches have mainly used Reinforcement Learning from Human Feedback (RLHF) via online reinforcement learning methods such as Proximal Policy Optimization (PPO). Offline alternatives such as Sequence Likelihood Calibration (SLiC) and Direct Preference Optimization (DPO) have their own limitation: the maximum likelihood estimator (MLE) of the target optimal policy requires labeled preference pairs sampled from that policy, but DPO has no reward model, which limits its ability to sample preference pairs from the optimal policy. To address these issues, the paper introduces a new method called Statistical Rejection Sampling Optimization (RSO), which uses rejection sampling to source preference data from the target optimal policy and thereby estimate that policy more accurately. The paper also proposes a unified framework that improves the loss functions of SLiC and DPO from a preference-modeling perspective. Through extensive experiments on three different tasks, the paper shows that RSO performs better in evaluations by both Large Language Models (LLMs) and human raters.

https://www.aminer.cn/pub/65026d513fda6d7f06474b0e/?f=cs
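
The rejection sampling idea mentioned in the RSO summary above can be illustrated with a minimal Python sketch. This is a simplified illustration, not the paper's implementation. It assumes that candidates were drawn from some proposal policy, that each candidate already has a scalar reward, and that the target distribution is proportional to the proposal density times exp(reward / beta). The function name, the `beta` parameter, and the sample data are all hypothetical.

```python
import math
import random


def rejection_sample(candidates, rewards, beta, rng):
    """Keep each candidate with probability exp((reward - max_reward) / beta).

    Assumption (not from the digest): the target density is proportional to
    proposal_density * exp(reward / beta), so the acceptance ratio simplifies
    to the expression above.
    """
    if beta <= 0:
        raise ValueError("beta must be positive")
    max_reward = max(rewards)
    accepted = []
    for candidate, reward in zip(candidates, rewards):
        accept_prob = math.exp((reward - max_reward) / beta)
        # rng.random() is in [0.0, 1.0), so the top-reward candidate
        # (accept_prob == 1.0) is always kept.
        if rng.random() < accept_prob:
            accepted.append(candidate)
    return accepted


# Hypothetical candidates and reward scores, for illustration only.
candidates = ["response A", "response B", "response C"]
rewards = [0.2, 1.5, 0.9]
kept = rejection_sample(candidates, rewards, beta=0.5, rng=random.Random(0))
print(kept)
```

In this sketch, a small `beta` keeps only candidates whose reward is close to the maximum, while a large `beta` accepts almost everything, so the accepted set stays close to the proposal distribution.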

4.Text-Guided Generation and Editing of Compositional 3D Avatars

The paper points out that existing methods for creating and editing realistic 3D facial avatars either lack realism, produce unrealistic shapes, or do not support editing, such as modifying the hairstyle. The researchers argue that existing methods are limited because they take a monolithic modeling approach, using a single representation for the head, face, hair, and accessories, when in fact these parts have very different structural characteristics and benefit from different representations. Building on this observation, the researchers generate avatars with a compositional model in which the head, face, and upper body are represented with traditional 3D meshes, while hair, clothing, and accessories are represented with Neural Radiance Fields (NeRF). The model-based mesh representation provides a strong geometric prior for the facial region, improving realism and making the person's appearance editable. By using NeRF for the remaining components, the approach can model and synthesize parts with complex geometry and appearance, such as curly hair and fluffy scarves. The resulting system synthesizes these high-quality compositional avatars from text descriptions. Experimental results show that the approach produces avatars that are more realistic than those of existing methods and that are editable because of their compositional nature. For example, it can seamlessly transfer compositional features such as hairstyles, scarves, and other accessories between avatars, supporting applications such as virtual try-on.

https://www.aminer.cn/pub/65026d513fda6d7f06474d08/?f=cs

5.DreamStyler: Paint by Style Inversion with Text-to-Image Diffusion Models

The paper points out that text prompts alone, used as the only constraint, are limited in their ability to express the unique characteristics of an artwork, such as brushwork, color tone, or composition. To solve this problem, the authors introduce DreamStyler, a new framework for artistic image synthesis that performs well at both text-to-image synthesis and style transfer. DreamStyler optimizes a multi-stage textual embedding with a context-aware text prompt, resulting in outstanding image quality. In addition, with content and style guidance, DreamStyler is flexible enough to accommodate a range of style references. Experiments demonstrate strong performance across multiple scenarios, suggesting considerable potential for artwork creation.

https://www.aminer.cn/pub/65026d513fda6d7f06474c3b/?f=cs

6.TrafficGPT: Viewing, Processing and Interacting with Traffic Foundation Models

The article describes a current problem: large language models struggle with traffic problems, especially with handling numerical data and interacting with simulations. Although specialized traffic foundation models exist, they are usually designed only for specific tasks and offer limited input-output interaction. Combining the two kinds of models could enhance their ability to solve complex traffic problems and provide insightful recommendations. To bridge this gap, the authors propose TrafficGPT, which integrates ChatGPT with traffic foundation models. Through this integration, TrafficGPT can view, analyze, and process traffic data, and provide in-depth decision support for urban transportation system management. It can also intelligently decompose complex tasks and use traffic foundation models to complete them step by step. In addition, TrafficGPT can assist humans in traffic control decisions through natural language dialogue, and allows interactive feedback and revision of its results. By seamlessly combining large language models with traffic expertise, TrafficGPT not only advances traffic management but also offers a new way to apply artificial intelligence capabilities in this field.

https://www.aminer.cn/pub/65026d513fda6d7f06474b51/?f=cs

7.Deep Quantum Graph Dreaming: Deciphering Neural Network Insights into Quantum Experiments

The article describes the challenge that the opacity of deep neural networks poses for interpreting what they learn about quantum optics experiments. Although neural networks can help scientists make new discoveries, understanding their internal logic is very difficult. To address this, the authors use an explainable-AI technique called deep dreaming, which originated in computer vision, to explore what neural networks learn about quantum optics experiments. They first train a deep neural network on the properties of quantum systems. After training, they "invert" the network, asking it how it imagines a quantum system with a specific property and how it would continuously modify the quantum system to change that property. The authors find that the network can shift the initial distribution of the quantum system's properties, and that the strategies the network has learned can be conceptualized. Interestingly, they find that the first layers of the network recognize simple properties, while deeper layers identify complex quantum structures and even quantum entanglement. This mirrors long-known properties from computer vision, which the authors now identify in a complex natural science task. The method has potential applications in developing new AI-based scientific discovery techniques in quantum physics.

https://www.aminer.cn/pub/65026d513fda6d7f06474cbc/?f=cs


How to use AMiner AI?

Using AMiner AI is simple. Open the AMiner homepage and enter the AMiner AI page from the navigation bar at the top of the page or from the lower right corner.

On the AMiner AI page, you can choose to have a conversation based on a single document or a conversation based on the entire database (personal document database). You can choose to upload a local PDF or directly search for documents on AMiner.

Click to view: AMiner AI usage tutorial

Origin blog.csdn.net/AI_Conf/article/details/132971975