PMC-LLaMA: Towards Building Open-source Language Models for Medicine

This article is part of a series on LLMs; it presents a translation of the paper "PMC-LLaMA: Towards Building Open-source Language Models for Medicine".

Abstract

Recently, large language models (LLMs) have demonstrated extraordinary capabilities in natural language understanding. Although proficient in everyday conversation and question answering, they often falter in domains that demand accuracy, such as medical applications, due to a lack of domain-specific knowledge. This paper describes the process of building a powerful open-source language model tailored to medicine, named PMC-LLaMA. The contributions are threefold: (i) a systematic study of adapting a general-purpose foundation language model to the medical domain, covering data-centric knowledge injection through the integration of 4.8 million biomedical academic papers and 30,000 medical textbooks, together with comprehensive fine-tuning aligned with domain-specific instructions; (ii) a large-scale, comprehensive dataset for instruction tuning, covering medical question answering (QA), reasoning rationales, and dialogue, totaling 202M tokens; (iii) thorough ablation studies demonstrating the effectiveness of each proposed component. Evaluated on various public medical QA benchmarks, the lightweight PMC-LLaMA, with only 13 billion parameters, shows superior performance, even surpassing ChatGPT. All models, code, and datasets are available at https://github.com/chaoyi-wu/PMC-LLaMA .
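The instruction-tuning dataset described above pairs medical questions with answers, rationales, and dialogue. As a minimal sketch, the snippet below shows how one multiple-choice medical QA item might be rendered into an instruction-style training example; the prompt template and field names here are generic illustrative assumptions, not the authors' exact format (which is defined in their repository).

```python
# Illustrative sketch: turning a multiple-choice medical QA item into an
# instruction-tuning training string. The template is an assumption, not
# the exact PMC-LLaMA format.

def format_instruction(question: str, options: dict, answer: str) -> str:
    """Render one QA item as an instruction-style example."""
    opts = "\n".join(f"{key}. {text}" for key, text in sorted(options.items()))
    return (
        "### Instruction:\n"
        "Answer the following medical question by choosing one option.\n\n"
        f"### Question:\n{question}\n{opts}\n\n"
        f"### Response:\n{answer}"
    )

sample = format_instruction(
    question="Which vitamin deficiency causes scurvy?",
    options={"A": "Vitamin A", "B": "Vitamin B12",
             "C": "Vitamin C", "D": "Vitamin D"},
    answer="C. Vitamin C",
)
print(sample)
```

During fine-tuning, examples like this are tokenized and the model is trained with the usual causal language-modeling objective, so that at inference time the "### Response:" section is what the model generates.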

Introduction

Related Work

Problem Definition

Dataset Construction

Experiments

Results

Conclusion

In this paper, we systematically studied how to build a medical-specific large language model on top of an open-source foundation model, covering data-centric knowledge injection and medical-specific instruction tuning. The resulting PMC-LLaMA is the first open-source medical-specific language model; it shows excellent performance on various medical benchmarks, surpassing ChatGPT and LLaMA-2 with far fewer parameters.

Origin: blog.csdn.net/c_cpp_csharp/article/details/132847145