In the earlier article "Teach you how to use the R language to make a clinical decision curve", we showed how to draw a clinical decision curve with the rmda package. However, rmda only supports logistic regression models, and the original stdca.R for Cox regression models can no longer be downloaded. A reader left a message recommending the ggDCA package, so today let's demonstrate how to use ggDCA to draw a clinical decision curve for a Cox regression model.
The ggDCA package was written by Uncle Y, our R language guru and a PhD supervisor at Southern Medical University. With ggDCA we can draw clinical decision curves for both logistic regression and Cox regression models. Plotting is very simple, and using the package also helped me understand how the curves are drawn. ggDCA is still being improved.
Let's demonstrate, again using the breast cancer data from earlier articles.
First load the packages and import the data:
library(ggDCA)
library(rms)
library(foreign)
bc <- read.spss("E:/r/test/Breast cancer survival agec.sav",
                use.value.labels = FALSE, to.data.frame = TRUE)
Let’s take a look at the data first:
age is age, pathsize is the pathological tumor size (cm), lnpos is the number of positive axillary lymph nodes, histgrad is the histopathological grade, er is the estrogen receptor status, pr is the progesterone receptor status, status indicates whether the outcome event (death) occurred, pathscat is the pathological tumor size category (a grouping variable), ln_yesno indicates whether there is lymph node enlargement, and time is the survival time. The last variable, agec, is one we created ourselves and can be ignored.
First convert the categorical variables we will use to factors and delete rows with missing values:
bc$histgrad<-as.factor(bc$histgrad)
bc$er<-as.factor(bc$er)
bc$pr<-as.factor(bc$pr)
bc$ln_yesno<-as.factor(bc$ln_yesno)
names(bc)
bc <- na.omit(bc)
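One thing to keep in mind: na.omit() on the whole data frame drops every row that has a missing value in any column, including columns the models never use. Below is a minimal sketch of an alternative, restricting to the needed columns first. The toy data frame and its values are made up purely for illustration and are not the breast cancer data.

```r
# Toy data frame (hypothetical values, for illustration only)
toy <- data.frame(
  time   = c(10, 25, 40, 60, 72),
  status = c(1, 0, 1, 0, 1),
  age    = c(45, 52, NA, 61, 58),
  unused = c(NA, NA, 3, 4, NA)   # a column no model will use
)

# Dropping NAs across all columns keeps only fully complete rows
nrow(na.omit(toy))

# Restricting to the modelling columns first keeps more rows
keep <- c("time", "status", "age")
toy_model <- na.omit(toy[, keep])
nrow(toy_model)
```

Whether this matters for the breast cancer data depends on how many missing values sit in the unused columns.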
Randomly split the data set into a training set and a validation set (roughly 70% / 30%):
set.seed(1)
index <- sample(2,nrow(bc),replace = TRUE,prob=c(0.7,0.3))
traindata <- bc[index==1,]
testdata <- bc[index==2,]
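Note that sample(2, ...) assigns each row independently, so the 70/30 proportions are only approximate. The sketch below compares this with drawing exactly 70% of the row numbers. It uses a hypothetical 1000-row data frame called toy as a stand-in for bc; either approach works for the steps that follow.

```r
set.seed(1)
toy <- data.frame(id = 1:1000)  # hypothetical stand-in for bc

# Approach used above: each row lands in group 1 with probability 0.7
index <- sample(2, nrow(toy), replace = TRUE, prob = c(0.7, 0.3))
prop.table(table(index))        # close to, but not exactly, 0.7 / 0.3

# Alternative: draw exactly 70% of the row numbers without replacement
train_rows  <- sample(nrow(toy), size = round(0.7 * nrow(toy)))
train_exact <- toy[train_rows, , drop = FALSE]
test_exact  <- toy[-train_rows, , drop = FALSE]
c(train = nrow(train_exact), test = nrow(test_exact))
```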
Then use the training set to build 3 predictive models (the predictors are chosen arbitrarily, just for demonstration):
# Build 3 models
f1<-cph(Surv(time,status)~age,traindata)
f2<-cph(Surv(time,status)~er+histgrad+pr,traindata)
f3<-cph(Surv(time,status)~er+histgrad+pr+age+ln_yesno,traindata)
Now draw the curves. If no time point is set, the median time is used by default:
d_train <- dca(f1)
ggplot(d_train)  # if no time is set, the median time is used by default
Decision curve for a single model at 3 years (time is in months, so times=36):
d_train <- dca(f1,
times=36)
ggplot(d_train)
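To show what a decision curve at a fixed time point represents, here is a hand-rolled sketch of the net benefit calculation for a Cox model. It uses the commonly described formula: net benefit = true positive rate - false positive rate x pt / (1 - pt), with the rates estimated by Kaplan-Meier among patients whose predicted risk is at least the threshold pt. This is only an illustration of the principle and is not claimed to be what ggDCA does internally. The lung data set from the survival package (time in days), the horizon of 365 days, and the helper net_benefit() are all assumptions made for the example.

```r
library(survival)

# Illustrative data: the lung data set that ships with the survival package
d <- na.omit(lung[, c("time", "status", "age", "ph.ecog")])
d$event <- as.integer(d$status == 2)   # in lung, status 2 means death

fit <- coxph(Surv(time, event) ~ age + ph.ecog, data = d)

# Predicted risk of death by the chosen horizon (days): 1 - S(horizon | x)
horizon <- 365
sf   <- survfit(fit, newdata = d)
risk <- 1 - as.numeric(summary(sf, times = horizon)$surv)

# Net benefit at threshold pt, using Kaplan-Meier among "high risk" patients
net_benefit <- function(pt) {
  high <- risk >= pt
  if (!any(high)) return(0)            # nobody treated: net benefit is 0
  km <- survfit(Surv(time, event) ~ 1, data = d[high, ])
  s  <- summary(km, times = horizon, extend = TRUE)$surv
  p_high <- mean(high)
  tp <- (1 - s) * p_high               # treated and had the event
  fp <- s * p_high                     # treated but event-free
  tp - fp * pt / (1 - pt)
}

thresholds <- c(0.3, 0.5, 0.7)
nb <- sapply(thresholds, net_benefit)
nb
```

Plotting nb against a finer grid of thresholds gives the model curve. The "treat all" line follows from the same formula with every patient counted as high risk, and "treat none" is the horizontal line at 0.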
Decision curves for multiple models at 5 years:
d_train <- dca(f1,f2,f3,
times=60)  # multiple models at 5 years
ggplot(d_train)
Decision curves for multiple models at both 3 and 5 years:
d_train <- dca(f1,f2,f3,
times=c(36,60))  # multiple models at 3 and 5 years
ggplot(d_train)
Decision curves for multiple models at 5 years on the validation set:
d_test <- dca(f1,f2,f3,
times=60,
new.data=testdata)  # evaluate the models on the validation set
ggplot(d_test)
I am a bit picky about these dashed lines and think solid lines look nicer:
d_train <- dca(f1,f2,f3,
times=60)  # multiple models at 5 years
ggplot(d_train,linetype=1)
Decision curves for a logistic regression model are drawn in almost the same way. Only the model-building step differs, so producing them is just as simple.
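As a rough sketch of the logistic case, the example below fits a model with lrm() from rms and passes it to dca() in the same way as the Cox models, without the times argument. The infert data set from base R and the chosen predictors are used only for illustration, and the call pattern is assumed to mirror the Cox examples above.

```r
library(ggDCA)
library(rms)

# Illustrative data: infert from base R's datasets package (case is 0/1)
fit_lrm <- lrm(case ~ spontaneous + induced, data = infert)

# Same two calls as for the Cox models, just without the times argument
d_lrm <- dca(fit_lrm)
ggplot(d_lrm)
```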