Real-time tracking of research trends | Selected new papers from Li Hang, Daniela L. Rus, and others for August 11, with ChatPaper reviews

As a researcher, you need to search and browse a large volume of academic literature every day to keep up with the latest scientific and technological progress and research results. However, traditional retrieval and reading methods can no longer meet researchers' needs.

ChatPaper is a literature knowledge tool that integrates retrieval, reading, and knowledge Q&A. It helps you retrieve and read papers more efficiently, keep up with the latest research trends in your field, and make research work easier.

Combined with the cutting-edge news subscription feature, ChatPaper selects the most popular new arXiv papers of the day and compiles a review of each, so that everyone can catch up on the latest developments more quickly.

If you want an in-depth conversation about a particular paper, you can copy the paper link into your browser or go straight to the ChatPaper page: https://www.aminer.cn/chat/g/explain

List of selected new papers on August 11, 2023:

1. Follow Anything: Open-set detection, tracking, and following in real-time. Read the original text:

https://www.aminer.cn/pub/64d5b21d3fda6d7f060d0db9/

ChatPaper review: The paper introduces a robotic system called "Follow Anything" (FAn) that detects, tracks, and follows any object in real time. The system uses an open-vocabulary, multimodal model that is not restricted to concepts seen at training time and can be applied to novel categories at inference time through text, image, or click queries. By leveraging the rich visual descriptors of large-scale pre-trained models (foundation models), FAn detects and segments objects by matching multimodal queries (text, images, clicks) against the input image sequence. The detected and segmented objects are then tracked across image frames, with occlusion and object re-appearance taken into account. The authors demonstrate FAn in a real-time control loop on a real-world robotic system (a micro aerial vehicle) and report its ability to seamlessly follow objects of interest. FAn can be deployed on a laptop with a lightweight (6-8 GB) graphics card and processes 6-20 frames per second. To facilitate rapid adoption and extensibility, all code has been open sourced.
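To make the idea of matching a query against visual descriptors more concrete, here is a minimal sketch. It is not FAn's actual code: it assumes that the query and each image patch have already been turned into descriptor vectors by some pre-trained model, and it compares them with cosine similarity. The function names, the toy 3-dimensional descriptors, and the 0.8 threshold are all hypothetical choices made for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def match_query(query, patch_descriptors, threshold=0.8):
    """Return the (row, col) positions of patches that resemble the query.

    patch_descriptors is a grid (list of rows) of descriptor vectors.
    The threshold is an illustrative assumption, not a value from the paper.
    """
    matches = []
    for r, row in enumerate(patch_descriptors):
        for c, descriptor in enumerate(row):
            if cosine_similarity(query, descriptor) >= threshold:
                matches.append((r, c))
    return matches


# Toy 2x2 grid of made-up 3-dimensional descriptors.
grid = [
    [[1.0, 0.0, 0.0], [0.9, 0.1, 0.0]],
    [[0.0, 1.0, 0.0], [0.0, 0.0, 1.0]],
]
query = [1.0, 0.0, 0.0]  # stands in for a text, image, or click query descriptor
matched_patches = match_query(query, grid)
print(matched_patches)
```

In this toy setup the matched positions form a coarse mask of the queried object. A real system would work with high-dimensional descriptors per pixel or patch and would hand the resulting region to a tracker for the following frames.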

2. OpenProteinSet: Training data for structural biology at scale. Read the original text:

https://www.aminer.cn/pub/64d5b2153fda6d7f060d0070/

ChatPaper review: The paper highlights a problem with training data in structural biology: generating multiple sequence alignments (MSAs) is computationally intensive and time-consuming, so the research community lacks datasets comparable to those used to train models such as AlphaFold2, which limits progress in machine learning for proteins. To solve this problem, the authors introduce OpenProteinSet, an open-source dataset that contains more than 16 million MSAs, the associated structural homologs from the Protein Data Bank, and AlphaFold2 protein structure predictions. The authors have successfully used OpenProteinSet to retrain AlphaFold2, and they expect it to be widely used as training and validation data for tasks as diverse as protein structure, function, and design, as well as for large-scale multimodal machine learning research.

3. AudioLDM 2: Learning Holistic Audio Generation with Self-supervised Pretraining. Read the original text:

https://www.aminer.cn/pub/64d5b21d3fda6d7f060d0db5/

ChatPaper review: Different types of audio generation, such as speech, music, and sound effects, each have their own specific objectives and biases, which makes a unified approach difficult. This paper addresses that challenge by proposing a unified audio generation framework. The framework uses a self-supervised pre-trained model to learn a universal representation of audio, and a GPT-2 model translates inputs of any modality into this representation. During generation, a latent diffusion model conditioned on this representation is trained with self-supervised audio generation learning. Experimental results show that the framework achieves new state-of-the-art or competitive performance on text-to-audio, text-to-music, and text-to-speech generation.

4. Trustworthy LLMs: a Survey and Guideline for Evaluating Large Language Models’ Alignment. Read the original text:

https://www.aminer.cn/pub/64d5b2153fda6d7f060d00a4/

ChatPaper review: The paper points out that the lack of clear guidance on assessing whether large language models (LLMs) are aligned with social norms, values, and regulations is a major challenge for practitioners, and that this obstacle hinders the systematic iteration and deployment of LLMs. To address this issue, the paper provides a comprehensive survey of the key dimensions of LLM trustworthiness. The survey covers seven main categories: reliability, safety, fairness, resistance to misuse, explainability and reasoning, adherence to social norms, and robustness. Each main category is further divided into subcategories, 29 in total. In addition, 8 subcategories were selected for further investigation, and corresponding measurement studies were conducted on several widely used LLMs. The measurements show that, in general, more aligned models perform better in terms of overall trustworthiness. However, the effectiveness of alignment varies across the trustworthiness categories considered, which highlights the importance of more fine-grained analysis, testing, and continuous improvement of LLM alignment. By shedding light on these key dimensions, the paper aims to provide valuable insights and guidance to practitioners in the field. Understanding and addressing these issues is critical to achieving reliable and ethical deployment of LLMs in a variety of applications.

5. Explainable AI applications in the Medical Domain: a systematic review. Read the original text:

https://www.aminer.cn/pub/64d5b2153fda6d7f060d00c9/

ChatPaper review: The paper explores the application of explainable artificial intelligence (XAI) in the medical field. While AI applications in medicine have been successful in retrospective studies, practical deployments have been rare. Medical AI faces multiple challenges, including how to build user trust, comply with relevant regulations, and use data ethically. Explainable AI aims to help people understand and trust the results of AI. The paper reviews 198 relevant articles published in recent years and summarizes several findings. First, the reviewed solutions primarily employ model-agnostic explainable AI techniques. Second, deep learning models are more widely used than other types of machine learning models. Third, although explainability has been applied to increase trust, few studies report physician involvement in the process. Finally, visual and interactive user interfaces are more useful for understanding the system's explanations and recommendations. More collaborative research between medical and AI experts is needed to guide the development of appropriate frameworks for designing, implementing, and evaluating explainable AI solutions in medicine.

6. LLM As DBA. Read the original text:

https://www.aminer.cn/pub/64d5b21d3fda6d7f060d0cab/

ChatPaper review: The paper proposes D-Bot, a database administrator based on a large language model (LLM). D-Bot continuously acquires database maintenance experience from text sources and provides reasonable, well-founded, and timely diagnosis and optimization advice for target databases. The research mainly addresses the difficult and tedious work that database administrators face when managing a large number of database instances.


How to use ChatPaper?

Using ChatPaper is simple. Open the AMiner homepage and enter the ChatPaper page from the navigation bar at the top of the page or from the lower right corner.

On the ChatPaper page, you can choose to have a conversation based on a single document or on your entire library (your personal document library). You can upload a local PDF or search for documents directly on AMiner.

ChatPaper usage tutorial: Click here to view

Origin blog.csdn.net/AI_Conf/article/details/132273142