Survival curves are widely used in clinical practice to show the cumulative survival rate or the cumulative event (morbidity) rate of patients. As shown in the figure below, a Kaplan-Meier survival curve makes survival, event occurrence, and other key data at different time points clear at a glance.
Today we will demonstrate how to make two plots like these: a survival curve and a cumulative hazard curve. The color figures are from our last SEER database article, titled: Incidence, Prognostic Factors and Survival Outcome in Patients With Primary Hepatic Lymphoma. We again use our previous breast cancer data. It is not ideal for this purpose, but I could not find a better dataset. If you need the data, follow the official account and reply "breast cancer" to get it.
Two packages, survival and survminer, are required and must be installed in advance. The foreign package is also needed to read the SPSS data file.
We first load the packages and import the breast cancer data:
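Before loading anything, it can help to check which of these packages are missing. The sketch below only reports them and prints the install command; it does not install anything itself. The names `needed` and `missing_pkgs` are illustrative and not part of the original workflow.

```r
# Packages used in this tutorial (foreign is needed for read.spss())
needed <- c("survival", "survminer", "foreign")

# requireNamespace() returns FALSE instead of raising an error when a package is absent
installed <- vapply(needed, requireNamespace, logical(1), quietly = TRUE)
missing_pkgs <- needed[!installed]

if (length(missing_pkgs) > 0) {
  message("Install first: install.packages(c(",
          paste0('"', missing_pkgs, '"', collapse = ", "), "))")
} else {
  message("All required packages are installed.")
}
```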
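library(survival)
library(survminer)
library(foreign)  # provides read.spss()
bc <- read.spss("E:/r/test/Breast cancer survival agec.sav",
                use.value.labels = FALSE, to.data.frame = TRUE)  # keep numeric codes, return a data frame
bc <- na.omit(bc)  # drop rows with missing values
names(bc)  # list the variable names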
Let's take a look at the variables first:
age is the patient's age, pathsize is the pathological tumor size (cm), lnpos is positive axillary lymph nodes, histgrad is the histopathological grade, er is the estrogen receptor status, pr is the progesterone receptor status, status indicates whether the outcome event (death) occurred, pathscat is the pathological tumor size category (a grouping variable), ln_yesno indicates whether there is lymph node enlargement, and time is the survival time. The last variable, agec, is one we created ourselves and can be ignored here.
This time we want to examine whether lymph node enlargement (ln_yesno) affects the survival outcome of breast cancer patients.
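Before fitting, it is worth confirming how the grouping and event variables are coded, because `Surv()` normally expects the event indicator as 0 = censored and 1 = event (1/2 or logical values are also accepted). A minimal sketch on a made-up data frame: `bc_demo` and all of its values are invented for illustration and only mimic the column names of the real data.

```r
# Toy data with the same column names as the tutorial's data set (values are invented)
bc_demo <- data.frame(
  time     = c(12, 30, 45, 60, 18, 25, 50, 70),
  status   = c(1, 0, 1, 0, 1, 1, 0, 0),   # 1 = death, 0 = censored
  ln_yesno = c(1, 1, 1, 1, 0, 0, 0, 0)    # 1 = lymph node enlargement, 0 = none
)

# Group sizes and number of events per group
counts <- table(ln_yesno = bc_demo$ln_yesno, status = bc_demo$status)
print(counts)

# Stop early if the event indicator contains anything other than 0/1
stopifnot(all(bc_demo$status %in% c(0, 1)))
```

Running the same `table()` call on `bc` shows whether either group has too few events for a meaningful comparison.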
First, we fit the Kaplan-Meier model to get the number of events, survival times, and other statistics for each group of patients.
fit <- survfit(Surv(time, status) ~ ln_yesno,
               data = bc)  # fit Kaplan-Meier curves by lymph node group
summary(fit)  # survival table for each group
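The per-group numbers can also be pulled out of the fitted object programmatically. The sketch below uses the `lung` data set that ships with the survival package as a stand-in for `bc` (an assumption, since the breast cancer file is not bundled with R); `sex` plays the role of `ln_yesno`, and `fit_demo` is an illustrative name.

```r
library(survival)

# lung ships with survival; its status is coded 1 = censored, 2 = dead
fit_demo <- survfit(Surv(time, status) ~ sex, data = lung)

# One row per group: number of events, median survival time and its 95% CI
group_table <- summary(fit_demo)$table
print(group_table[, c("events", "median", "0.95LCL", "0.95UCL")])
```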
Use ggsurvplot() to draw the survival curves:
ggsurvplot(fit, data = bc)
You can also add confidence intervals and a number-at-risk table:
ggsurvplot(fit, data = bc,
           conf.int = TRUE,    # show confidence intervals
           risk.table = TRUE)  # add the number-at-risk table
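The risk table reports how many patients in each group are still under observation (at risk) at each x-axis break. The same numbers can be read from the fit with `summary(fit, times = ...)`. A sketch on the `lung` data set from the survival package, used here as a stand-in for `bc`; the time points are arbitrary and `fit_demo` and `risk_df` are illustrative names.

```r
library(survival)

fit_demo <- survfit(Surv(time, status) ~ sex, data = lung)

# Number at risk in each group at selected follow-up times (days)
at_risk <- summary(fit_demo, times = c(0, 250, 500, 750))
risk_df <- data.frame(group  = at_risk$strata,
                      time   = at_risk$time,
                      n_risk = at_risk$n.risk)
print(risk_df)
```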
You can also add the overall survival curve for all patients and the p-value:
```r
ggsurvplot(fit,              # fitted survfit object
           data = bc,
           conf.int = TRUE,  # show confidence intervals
           pval = TRUE,      # add the log-rank p-value
           add.all = TRUE)   # add the survival curve for all patients
```
![ggsurvplot output](https://img-blog.csdnimg.cn/20210226103523522.png?x-oss-process=image/watermark,type_ZmFuZ3poZW5naGVpdGk,shadow_10,text_aHR0cHM6Ly9ibG9nLmNzZG4ubmV0L2RlZ2U4NTc=,size_16,color_FFFFFF,t_70)
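The p-value that `pval = TRUE` displays comes, by default, from a log-rank test comparing the groups. As a rough sketch, the same test can be run directly with `survdiff()` from the survival package; the `lung` data set stands in for `bc` here, and `lr` and `p_value` are illustrative names.

```r
library(survival)

# Log-rank test comparing the two groups (the default rho = 0 gives the log-rank test)
lr <- survdiff(Surv(time, status) ~ sex, data = lung)

# Chi-square statistic with (number of groups - 1) degrees of freedom
df <- length(lr$n) - 1
p_value <- pchisq(lr$chisq, df = df, lower.tail = FALSE)
print(signif(p_value, 3))
```

Computing the value this way avoids typing a p-value into the plot by hand, which is easy to get out of sync with the fitted model.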
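The plot can be polished further:
```r
ggsurvplot(fit,                            # fitted survfit object
           data = bc,                      # data used to fit the model
           conf.int = TRUE,                # show confidence intervals
           pval = TRUE,                    # compute the log-rank p-value instead of hard-coding it
           pval.method = TRUE,             # print the test name next to the p-value
           surv.median.line = "hv",        # mark median survival with horizontal and vertical lines
           risk.table = TRUE,              # add the number-at-risk table
           risk.table.col = "strata",      # color the risk table by stratum
           xlab = "Follow up time(d)",     # x-axis label
           legend = c(0.7, 0.2),           # legend position
           legend.title = "Kaplan-Meier",  # legend title
           legend.labs = c("ln_yesno=0", "ln_yesno=1"),  # labels in strata order (assumes 0/1 coding)
           break.x.by = 10,                # x-axis tick spacing
           break.y.by = 0.1,               # y-axis tick spacing
           palette = c("#E7B800", "#2E9FDF"),  # line colors
           ggtheme = theme_bw())           # black-and-white theme with grid lines
```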
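Next, plot the cumulative hazard function. (To plot the cumulative event, i.e. death, rate instead, use fun = "event".)
ggsurvplot(fit, data = bc,
           conf.int = TRUE,  # add confidence intervals
           fun = "cumhaz")   # plot the cumulative hazard curve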
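The `fun` argument transforms the estimated survival probability S(t) before plotting: `"event"` plots 1 - S(t) and `"cumhaz"` plots -log S(t). A sketch that computes both by hand from a survfit object, again using the `lung` data set from the survival package as a stand-in for `bc`; the variable names are illustrative.

```r
library(survival)

fit_demo <- survfit(Surv(time, status) ~ sex, data = lung)

surv       <- fit_demo$surv  # Kaplan-Meier estimate S(t), both groups stacked
cum_event  <- 1 - surv       # the transformation behind fun = "event"
cum_hazard <- -log(surv)     # the transformation behind fun = "cumhaz"

print(head(data.frame(time = fit_demo$time, surv, cum_event, cum_hazard)))
```

Because both curves are simple transformations of the same Kaplan-Meier estimate, the group comparison and its log-rank p-value are the same whichever one is plotted.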
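It can also be polished further:
ggsurvplot(fit, data = bc,
           conf.int = TRUE,                # add confidence intervals
           fun = "cumhaz",                 # plot the cumulative hazard curve
           pval = TRUE,                    # compute the log-rank p-value instead of hard-coding it
           pval.method = TRUE,             # print the test name next to the p-value
           risk.table = TRUE,              # add the number-at-risk table
           risk.table.col = "strata",      # color the risk table by stratum
           xlab = "Follow up time(d)",     # x-axis label
           legend = c(0.2, 0.8),           # legend position
           legend.title = "Kaplan-Meier",  # legend title
           legend.labs = c("ln_yesno=0", "ln_yesno=1"),  # labels in strata order (assumes 0/1 coding)
           break.x.by = 10,                # x-axis tick spacing
           break.y.by = 0.05,              # y-axis tick spacing
           palette = c("#E7B800", "#2E9FDF"),  # line colors
           ggtheme = theme_bw())           # black-and-white theme with grid lines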