Drawing calibration curves with confidence intervals using the R package CalibrationCurves


A calibration curve shows how closely a model's predicted probabilities agree with the observed outcomes. Calibration is an important part of evaluating a prediction model, and many R functions can draw calibration curves. They generally fall into two types. The first follows the grouping used by the Hosmer-Lemeshow test: the predicted probabilities are divided into 10 equal-sized groups, and the mean predicted value is compared with the observed event rate within each group. The second draws a continuous calibration curve.
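
As a rough illustration of the grouped approach, the following base-R sketch splits predicted probabilities into 10 groups and compares the mean prediction with the observed event rate in each. The data are simulated and the names (`n`, `x`, `p`, `y`, `cal`) are made up for this example; it is not the code of any particular package.

```r
# Simulated data for illustration only (not the article's data set)
set.seed(1)
n <- 1000
x <- rnorm(n)
p <- plogis(-1 + x)        # predicted probabilities
y <- rbinom(n, 1, p)       # observed 0/1 outcomes

# Cut the predicted probabilities into 10 groups of roughly equal size
breaks <- quantile(p, probs = seq(0, 1, by = 0.1))
grp <- cut(p, breaks = breaks, include.lowest = TRUE, labels = FALSE)

# Mean predicted probability vs. observed event rate in each group
cal <- data.frame(
  group     = 1:10,
  n         = as.vector(table(grp)),
  predicted = as.vector(tapply(p, grp, mean)),
  observed  = as.vector(tapply(y, grp, mean))
)
print(cal)

# Points close to the dashed diagonal indicate good calibration
plot(cal$predicted, cal$observed, xlim = c(0, 1), ylim = c(0, 1),
     xlab = "Mean predicted probability", ylab = "Observed proportion")
abline(0, 1, lty = 2)
```
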
Earlier articles covered drawing calibration curves of both kinds, continuous and grouped. Today we introduce the CalibrationCurves package. As the name suggests, it is an R package for drawing calibration curves, and its distinguishing feature is that it can draw a calibration curve with a confidence interval. Let's try it out.

library(CalibrationCurves)
bc<-read.csv("E:/r/test/zaochan.csv",sep=',',header=TRUE)

This is a data set on preterm low-birth-weight infants (reply "Premature birth data" to the public account to get the data); a birth weight below 2500 g counts as low. The variables are as follows: low indicates whether the baby is a low-birth-weight infant (less than 2500 g), age is the mother's age, lwt is the mother's weight at the last menstrual period, race is the mother's race, smoke indicates smoking during pregnancy, ptl is the number of previous premature births, ht indicates a history of hypertension, ui indicates uterine irritability, ftv is the number of physician visits during early pregnancy, and bwt is the newborn's birth weight. Let's first convert the categorical variables into factors:

bc$race<-ifelse(bc$race=="black",1,ifelse(bc$race=="white",2,3))
bc$smoke<-ifelse(bc$smoke=="nonsmoker",0,1)
bc$race<-factor(bc$race)
bc$ht<-factor(bc$ht)
bc$ui<-factor(bc$ui)

Split the data into a training set (60%) and a test set (40%):

set.seed(123)
tr1 <- sample(nrow(bc), 0.6*nrow(bc))  # random sampling without replacement
bc_train <- bc[tr1,]   # 60% of the data (training set)
bc_test <- bc[-tr1,]   # 40% of the data (test set)

Fit a logistic regression model on the training set:

fit<-glm(low ~ age + lwt + race + smoke + ptl + ht + ui + ftv,
         family = binomial("logit"),
         data = bc_train )

A calibration curve compares predicted values with actual values, so we need both the predicted probabilities and the observed outcomes. First we generate the predicted probabilities:

pr1<- predict(fit,type = c("response"))

Besides the predicted probabilities, we also need the observed outcome Y:

yval<-bc_train$low
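
The same two vectors can also be produced for held-out data by passing `newdata` to `predict()`. The sketch below shows the pattern on simulated data; the data frame `dat`, its columns `x1` and `x2`, and the names `pr_test` and `yval_test` are hypothetical stand-ins, not part of the original analysis.

```r
# Simulated stand-in for the article's data (hypothetical columns x1, x2)
set.seed(123)
n <- 500
dat <- data.frame(x1 = rnorm(n), x2 = rbinom(n, 1, 0.4))
dat$low <- rbinom(n, 1, plogis(-1 + 0.8 * dat$x1 + 0.5 * dat$x2))

# 60/40 split, mirroring the split used in the article
idx   <- sample(nrow(dat), 0.6 * nrow(dat))
train <- dat[idx, ]
test  <- dat[-idx, ]

fit_sim <- glm(low ~ x1 + x2, family = binomial("logit"), data = train)

# Predicted probabilities and observed outcomes on the held-out rows
pr_test   <- predict(fit_sim, newdata = test, type = "response")
yval_test <- test$low

# pr_test and yval_test play the same roles as pr1 and yval
head(data.frame(pr_test, yval_test))
```

Calibration assessed on held-out rows is usually a more honest check than calibration on the rows used to fit the model.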

Drawing the calibration curve is very simple and takes a single line of code: pass the predicted probabilities as the first argument and the observed outcomes as the second.

valProbggplot(pr1, yval)
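
To give a sense of what a smoothed calibration curve with a confidence band involves, here is a minimal base-R sketch that smooths the 0/1 outcome against the predicted probability with `loess()` and adds pointwise 95% limits. It illustrates the general idea on simulated data and should not be read as the package's internal code; all names (`n`, `x`, `p`, `y`, `grid`) are made up for the example.

```r
# Simulated predictions and outcomes (illustration only)
set.seed(2)
n <- 800
x <- rnorm(n)
p <- plogis(-0.5 + x)      # predicted probabilities
y <- rbinom(n, 1, p)       # observed 0/1 outcomes

# Smooth the observed outcome as a function of the predicted probability
sm   <- loess(y ~ p, degree = 2)
grid <- seq(min(p), max(p), length.out = 100)
pred <- predict(sm, newdata = data.frame(p = grid), se = TRUE)

# Pointwise 95% confidence limits around the smoothed curve
crit  <- qt(0.975, pred$df)
lower <- pred$fit - crit * pred$se.fit
upper <- pred$fit + crit * pred$se.fit

plot(grid, pred$fit, type = "l", xlim = c(0, 1), ylim = c(0, 1),
     xlab = "Predicted probability", ylab = "Observed proportion")
lines(grid, lower, lty = 3)
lines(grid, upper, lty = 3)
abline(0, 1, lty = 2)      # ideal calibration line
```
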

The plot can also be customized, including line colors and line types.

valProbggplot(pr1, yval, CL.smooth = TRUE, logistic.cal = TRUE, lty.log = 2,
              col.log = "red", lwd.log = 1.5)
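
A logistic calibration line is commonly summarized by an intercept and a slope, obtained by regressing the outcome on the logit of the predicted probability. The base-R sketch below shows that calculation on simulated data; the variable names are hypothetical, and it is meant as an explanation of the concept rather than a description of how the package computes its line.

```r
# Simulated data and a fitted logistic model (illustration only)
set.seed(3)
n  <- 600
x1 <- rnorm(n)
x2 <- rnorm(n)
y  <- rbinom(n, 1, plogis(-1 + x1 + 0.5 * x2))
fit_sim <- glm(y ~ x1 + x2, family = binomial("logit"))
p <- predict(fit_sim, type = "response")

# Logistic recalibration: regress the outcome on the logit of the prediction
lp    <- qlogis(p)
recal <- glm(y ~ lp, family = binomial("logit"))

cal_intercept <- unname(coef(recal)[1])
cal_slope     <- unname(coef(recal)[2])
print(c(intercept = cal_intercept, slope = cal_slope))
```

When the predictions come from the same data the logistic model was fitted on, as here, the intercept and slope come out at essentially 0 and 1 by construction, so the logistic calibration line is most informative on data that was not used for fitting.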


valProbggplot(pr1, yval, CL.smooth = TRUE, logistic.cal = TRUE, lty.log = 9,
              col.log = "red", lwd.log = 1.5, col.ideal = colors()[10], lwd.ideal = 0.5)


Origin blog.csdn.net/dege857/article/details/132892250