How to search PubMed elegantly in R with the RISmed package

PubMed is one of the largest life science literature databases, and a keyword search on a topic often returns thousands of records. How do you screen them? Besides combining keywords carefully, you can use R to pull out the relevant fields of every record in one pass and browse them quickly. RISmed is an R package for retrieving and analyzing literature data from PubMed: it can run searches, download abstracts, extract author information, and more. A typical workflow looks like this.

# Install and load the package
#install.packages("RISmed") # Install RISmed first if it is not already installed
library(RISmed)
## Warning: package 'RISmed' was built under R version 4.2.3
# View the package documentation
help(package="RISmed")
# Define the search topic, e.g. "gestational diabetes"
search_topic<-"gestational diabetes"
search_query<-EUtilsSummary(search_topic,db="pubmed",type="esearch",mindate=2018,maxdate=2023)
## Warning in any(is.na(WhichArgs)) || sapply(WhichArgs, length) > 1: 'length(x) =
## 2 > 1' in coercion to 'logical(1)'

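This searches PubMed for records matching "gestational diabetes" that were added between 2018 and 2023. The warning is raised inside the package's argument handling and did not prevent the search from running here.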
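Keyword combinations can also be assembled programmatically before they are passed to EUtilsSummary. The sketch below uses a hypothetical helper, build_query (it is not part of RISmed), that quotes each term, appends a PubMed field tag, and joins the terms with a Boolean operator. The second term, "metformin", is only an illustration.

```r
# Hypothetical helper (not part of RISmed): build a PubMed query string
# from several terms, each restricted to one search field.
build_query <- function(terms, field = "Title/Abstract", operator = "AND") {
  tagged <- sprintf('"%s"[%s]', terms, field)
  paste(tagged, collapse = paste0(" ", operator, " "))
}

query <- build_query(c("gestational diabetes", "metformin"))
cat(query, "\n")

# The resulting string could be used in place of search_topic, e.g.
# EUtilsSummary(query, db = "pubmed", type = "esearch",
#               mindate = 2018, maxdate = 2023)
```

Building the string in a function keeps the quoting and field tags consistent when you try several term combinations.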

# Inspect the query summary
summary(search_query)
## Query:
## ("diabetes, gestational"[MeSH Terms] OR ("diabetes"[All Fields] AND "gestational"[All Fields]) OR "gestational diabetes"[All Fields] OR ("gestational"[All Fields] AND "diabetes"[All Fields])) AND 2018/01/01:2023/12/31[Date - Entry] 
## 
## Result count:  12775
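The result count (12775) is the number of matching records, which is not necessarily the number a later fetch returns: the number of records retrieved is capped by retmax, which defaults to 1000 according to the RISmed documentation (worth confirming with help(EUtilsSummary) for your installed version). One possible approach, sketched below, is to split the date range into one query per year and raise retmax for each. The cap of 5000 is an illustrative assumption, and the commented-out calls need network access.

```r
# Split the 2018-2023 range into one window per year so that each query
# stays small. The per-year cap is an illustrative assumption.
years <- 2018:2023
per_year_cap <- 5000

plan <- data.frame(mindate = years, maxdate = years, retmax = per_year_cap)
print(plan)

# Sketch of how the plan could be used (requires network access, not run here):
# library(RISmed)
# records_by_year <- lapply(seq_len(nrow(plan)), function(i) {
#   q <- EUtilsSummary("gestational diabetes", db = "pubmed", type = "esearch",
#                      mindate = plan$mindate[i], maxdate = plan$maxdate[i],
#                      retmax = plan$retmax[i])
#   EUtilsGet(q)
# })
```

QueryCount(search_query) returns the total count as a number, so you can compare it with the number of records you actually fetched.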
# Fetch the records, including abstracts
records<- EUtilsGet(search_query)

class(records)
## [1] "Medline"
## attr(,"package")
## [1] "RISmed"
#str(records)
# Get author information
authors <- Author(records)

This extracts the author list of each retrieved record; the result has one element per record.
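To see who publishes most on the topic, the per-record author tables can be stacked and tabulated. The sketch below assumes each element of the list is a data frame with LastName, ForeName, Initials and order columns; check str(authors[[1]]) on your own results before relying on that shape. The mock list stands in for real output, and its names are invented for illustration.

```r
# Mock data standing in for Author(records); the names are invented.
authors_mock <- list(
  data.frame(LastName = c("Li", "Wang"), ForeName = c("Mei", "Jun"),
             Initials = c("M", "J"), order = 1:2),
  data.frame(LastName = c("Wang", "Smith"), ForeName = c("Jun", "Anna"),
             Initials = c("J", "A"), order = 1:2)
)

# Stack the per-record data frames and count records per author.
all_authors <- do.call(rbind, authors_mock)
full_name <- paste(all_authors$LastName, all_authors$Initials)
author_counts <- sort(table(full_name), decreasing = TRUE)
print(head(author_counts, 10))
```

With real data, replace authors_mock with authors. Records without author information may need to be filtered out first.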

# Collect the search results into a data frame
pubmed_data <- data.frame('Title'=ArticleTitle(records),
                          'Year'=YearAccepted(records),
                          'journal'=ISOAbbreviation(records))

head(pubmed_data)
##                                                                                                                                                                                          Title
## 1                                                   Preconceptional and prenatal exposure to air pollutants and risk of gestational diabetes in the MADRES prospective pregnancy cohort study.
## 2                                                                                                                                   Mechanism and recent updates on insulin-related disorders.
## 3                                                                         Environmental tobacco smoke increased risk of gestational diabetes mellitus: A birth cohort study in Sichuan, China.
## 4 Development and feasibility of a theory-guided and evidence-based physical activity intervention in pregnant women with high risk for gestational diabetes mellitus: a pilot clinical trial.
## 5                                                                       Association between serum copper level and reproductive health of Women in the United States: a cross-sectional study.
## 6                                                          Prediction of large-for-gestational age at 36 weeks' gestation: two-dimensional vs three-Dimensional vs magnetic resonance imaging.
##   Year                   journal
## 1 2023      Lancet Reg Health Am
## 2 2023        World J Clin Cases
## 3 2023    Diabetes Metab Res Rev
## 4 2023  BMC Pregnancy Childbirth
## 5   NA  Int J Environ Health Res
## 6 2023 Ultrasound Obstet Gynecol
pubmed_data[1:3,1]
## [1] "Preconceptional and prenatal exposure to air pollutants and risk of gestational diabetes in the MADRES prospective pregnancy cohort study."
## [2] "Mechanism and recent updates on insulin-related disorders."                                                                                
## [3] "Environmental tobacco smoke increased risk of gestational diabetes mellitus: A birth cohort study in Sichuan, China."
write.csv(pubmed_data,file='diabetes.csv')
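The Year column comes from YearAccepted, which is NA when a record carries no acceptance date (row 5 above). A possible workaround, sketched with mock vectors in place of YearAccepted(records) and YearPubmed(records), is to fall back to the PubMed year whenever the accepted year is missing.

```r
# Mock vectors standing in for YearAccepted(records) and YearPubmed(records).
year_accepted <- c(2023, 2023, NA, 2022)
year_pubmed   <- c(2023, 2023, 2023, 2022)

# Use the accepted year where available, otherwise the PubMed year.
year_filled <- ifelse(is.na(year_accepted), year_pubmed, year_accepted)
print(year_filled)
```

The filled vector could then be used as the 'Year' column when building pubmed_data.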
# Visualize the results

library(ggplot2)
## Warning: package 'ggplot2' was built under R version 4.2.3
# Extract publication years and count articles per year
pub_years <- YearPubmed(records)
article_counts <- table(pub_years)

# Build a data frame
data_df <- data.frame(Year = as.factor(names(article_counts)), Counts = as.integer(article_counts))

p<-ggplot(data=data_df, aes(x=Year, y=Counts,fill=Year)) +
  geom_bar(stat="identity", width=0.5)+
  labs(y = "Number of articles",title="PubMed articles on gestational diabetes")+
  scale_fill_brewer(palette="Dark2")

p
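Note that table() only lists years that actually occur in the data, so a year with no records would silently disappear from the x axis. A minimal sketch of one way to keep every year in the range, using a mock vector in place of YearPubmed(records):

```r
# Mock vector standing in for YearPubmed(records).
pub_years_mock <- c(2018, 2018, 2020, 2023, 2023, 2023)

# Fix the factor levels so that years with zero articles are still counted.
counts <- table(factor(pub_years_mock, levels = 2018:2023))
data_df_full <- data.frame(Year = factor(names(counts)),
                           Counts = as.integer(counts))
print(data_df_full)
```

The resulting data frame has the same Year and Counts columns as data_df, so it can be passed to the same ggplot call.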


Origin blog.csdn.net/qq_42458954/article/details/133939035