Survival Analysis and Visualization in R

 

Link to the full article: http://tecdat.cn/?p=5438

 

Survival analysis is a set of statistical methods for investigating the time it takes for an event of interest to occur.

Survival analysis is used in various fields, such as:

Cancer research, for analyzing patient survival time

Sociology, for "event history analysis"

Engineering, for "failure time analysis"

In cancer research, typical research questions include:

What is the impact of certain clinical characteristics on patient survival?

What is the probability that an individual survives three years?

Are there differences in survival between groups of patients?

 

Basic concepts

We start by defining the basic terms of survival analysis, including:

Survival time and events

Survival function and hazard function

Survival time and event types in cancer research

There are different types of events, including:

Relapse

Death

The time from "response to treatment" (complete remission) to the occurrence of the event of interest is commonly called survival time (or time to event).

The two most important measures in cancer research are: i) the time to death; and ii) the relapse-free survival time, which corresponds to the time between response to treatment and recurrence of the disease. It is also known as disease-free survival time and event-free survival time.

As mentioned above, survival analysis focuses on the expected duration until an event of interest occurs (relapse or death).

 

Kaplan-Meier survival estimate

The Kaplan-Meier (KM) method is a nonparametric method for estimating survival probability from observed survival times (Kaplan and Meier, 1958).

The KM survival curve is a plot of the KM survival probability against time. It provides a useful summary of the data and can be used to estimate measures such as median survival time.
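To make the estimator concrete, here is a minimal sketch that computes the KM (product-limit) estimate by hand and compares it with survfit(). The toy data frame and its values are invented for illustration and are not part of the original article.

```r
library(survival)

# Hypothetical toy data: follow-up time and event indicator
# (1 = event observed, 0 = censored). Values are made up.
toy <- data.frame(
  time   = c(2, 3, 3, 5, 6, 8, 8, 9),
  status = c(1, 1, 0, 1, 0, 1, 1, 0)
)

# Product-limit estimate: at each distinct event time t_i,
# S(t_i) = S(t_{i-1}) * (1 - d_i / n_i),
# where d_i = number of events at t_i and n_i = number still at risk
event_times <- sort(unique(toy$time[toy$status == 1]))
n_at_risk   <- sapply(event_times, function(t) sum(toy$time >= t))
n_events    <- sapply(event_times, function(t) sum(toy$time == t & toy$status == 1))
km_manual   <- cumprod(1 - n_events / n_at_risk)

# The same estimate from survfit(), evaluated at the event times
km_fit <- summary(survfit(Surv(time, status) ~ 1, data = toy),
                  times = event_times)

data.frame(time = event_times, n.risk = n_at_risk, n.event = n_events,
           manual = km_manual, survfit = km_fit$surv)
```

The two columns manual and survfit should agree, which shows that censored observations leave the risk set without pulling the curve down.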

Survival analysis in R

Install and load the required R packages

We will use two R packages:

survival, for computing survival analyses

survminer, for summarizing and visualizing the results of survival analysis

Install the packages:

install.packages(c("survival", "survminer"))

Load the packages:

library("survival")
library("survminer")

Example data set

We will use the lung cancer data provided in the survival package.

data("lung")
head(lung)

  inst time status age sex ph.ecog ph.karno pat.karno meal.cal wt.loss
1    3  306      2  74   1       1       90       100     1175      NA
2    3  455      2  68   1       0       90        90     1225      15
3    3 1010      1  56   1       0       90        90       NA      15
4    5  210      2  57   1       1       90        60     1150      11
5    1  883      2  60   1       0      100        90       NA       0
6   12 1022      1  74   1       1       50        80      513       0

inst: institution code

time: survival time in days

status: censoring status, 1 = censored, 2 = dead

age: age in years

sex: male = 1, female = 2

ph.ecog: ECOG performance score (0 = good, 5 = dead)

ph.karno: Karnofsky performance score (bad = 0, good = 100) as rated by a physician

pat.karno: Karnofsky performance score as rated by the patient

meal.cal: calories consumed at meals

wt.loss: weight loss in the last six months

Computing survival curves: survfit()

We want to compute the survival probability by sex.

The function survfit() [in the survival package] can be used to compute the Kaplan-Meier survival estimate. Its main arguments include:

A survival object created with the function Surv()

The data set containing the variables
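As a small illustration of what Surv() produces, the sketch below builds the survival object for the lung data and inspects it. It assumes the lung data set is available once the survival package is loaded; the name surv_obj is chosen here for illustration only.

```r
library(survival)

# In lung, status is coded 1 = censored, 2 = dead;
# Surv() accepts this 1/2 coding and treats 2 as the event
surv_obj <- with(lung, Surv(time, status))

# Printed values are survival times in days;
# a trailing "+" marks a censored observation
head(surv_obj)

# Number of censored observations (1) and deaths (2)
table(lung$status)
```

This object is what goes on the left-hand side of the formula passed to survfit().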

 

To compute the survival curves, type this:

fit <- survfit(Surv(time, status) ~ sex, data = lung)
print(fit)

Call: survfit(formula = Surv(time, status) ~ sex, data = lung)

        n events median 0.95LCL 0.95UCL
sex=1 138    112    270     212     310
sex=2  90     53    426     348     550

By default, the function print() shows a short summary of the survival curves. It prints the number of observations, the number of events, the median survival and the confidence limits for the median.

If you want to display a more complete summary of the survival curves, type this:

# Summary of survival curves
summary(fit)
# Access the short summary table
summary(fit)$table
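The summary object can also be queried for specific quantities. The sketch below pulls the median survival per group from the summary table and, as an assumed example, the estimated survival probability at one year (365 days). The time point and the variable names are chosen here for illustration.

```r
library(survival)

fit <- survfit(Surv(time, status) ~ sex, data = lung)

# Median survival time (in days) for each group, from the summary table
tbl <- summary(fit)$table
med <- tbl[, "median"]
med

# Estimated survival probability at 365 days, with confidence limits
one_year <- summary(fit, times = 365)
data.frame(strata = one_year$strata,
           time   = one_year$time,
           surv   = one_year$surv,
           lower  = one_year$lower,
           upper  = one_year$upper)
```

The same pattern, with a different value for times, answers questions such as "what is the probability of surviving beyond a given time" for any time point covered by the follow-up.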

 

Visualizing survival curves

We generate survival curves for the two groups of subjects.
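One way to draw these curves is with ggsurvplot() from survminer. The sketch below refits the model so that it runs on its own. The particular options shown (p-value, confidence bands, risk table, median lines) are an illustrative selection, not a required set.

```r
library(survival)
library(survminer)

fit <- survfit(Surv(time, status) ~ sex, data = lung)

p <- ggsurvplot(
  fit,
  data = lung,
  pval = TRUE,                 # add the log-rank test p-value
  conf.int = TRUE,             # add confidence bands
  risk.table = TRUE,           # add a number-at-risk table below the plot
  risk.table.col = "strata",   # color the risk table by group
  linetype = "strata",         # use a different line type per group
  surv.median.line = "hv"      # mark median survival with h/v lines
)
print(p)
```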

 


 

 

 

 

The legend.labs argument changes the legend labels.
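A minimal sketch of relabeling the legend, assuming the same fit by sex as above. The labels follow the strata order (sex=1, then sex=2), which in the lung data corresponds to male and female.

```r
library(survival)
library(survminer)

fit <- survfit(Surv(time, status) ~ sex, data = lung)

# Labels must be given in the same order as the strata: sex=1, sex=2
p_labs <- ggsurvplot(
  fit,
  data = lung,
  legend.title = "Sex",
  legend.labs = c("Male", "Female")
)
print(p_labs)
```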


 

 

 

The median survival time for each group is the time at which the survival probability S(t) equals 0.5.

 

The survival curves can be shortened using the xlim argument.
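For example, the sketch below restricts the time axis to the first 600 days. The cutoff of 600 is an arbitrary value chosen here for illustration.

```r
library(survival)
library(survminer)

fit <- survfit(Surv(time, status) ~ sex, data = lung)

# Show only the first 600 days of follow-up
p_short <- ggsurvplot(
  fit,
  data = lung,
  conf.int = TRUE,
  xlim = c(0, 600)
)
print(p_short)
```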


 

 

 

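Note that three commonly used transformations can be specified with the fun argument: "log" (log transformation of the survival function), "event" (cumulative events, f(y) = 1 - y) and "cumhaz" (cumulative hazard function, f(y) = -log(y)).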
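A short sketch of the fun argument in use, again refitting the model so the block runs on its own. The object names are chosen here for illustration.

```r
library(survival)
library(survminer)

fit <- survfit(Surv(time, status) ~ sex, data = lung)

# Cumulative events: plots 1 - S(t)
p_event <- ggsurvplot(fit, data = lung, fun = "event")
print(p_event)

# Cumulative hazard: plots -log(S(t))
p_cumhaz <- ggsurvplot(fit, data = lung, fun = "cumhaz")
print(p_cumhaz)

# Survival function on a log scale
p_log <- ggsurvplot(fit, data = lung, fun = "log")
print(p_log)
```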

 

 

 

 

The cumulative hazard is commonly used to estimate the hazard probability. It is defined as H(t) = -log(S(t)).

 

Summarizing survival curves: Kaplan-Meier life table

As mentioned above, you can use the function summary() to obtain a complete summary of the survival curves:

summary(fit)

 

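When survival curves have been fitted with one or more variables, the surv_summary object (the data frame returned by the surv_summary() function in survminer) contains extra columns representing those variables. This makes it possible to facet the output of ggsurvplot() by strata or by some combination of factors.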
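A minimal sketch of surv_summary() on the fit by sex, assuming the survminer package is installed. The name res_sum is chosen here for illustration.

```r
library(survival)
library(survminer)

fit <- survfit(Surv(time, status) ~ sex, data = lung)

# One row per time point and stratum, as a regular data frame
res_sum <- surv_summary(fit, data = lung)
head(res_sum)

# Besides the usual life-table columns, the grouping variable
# (here: sex) is available as its own column
names(res_sum)

# The short per-group summary is attached as an attribute
attr(res_sum, "table")
```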

 

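Log-rank test for comparing survival curves: survdiff()

The log-rank test is the most widely used method for comparing two or more survival curves. The null hypothesis is that there is no difference in survival between the two groups.

The function survdiff() can be used as follows:

surv_diff <- survdiff(Surv(time, status) ~ sex, data = lung)
surv_diff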

Call:
survdiff(formula = Surv(time, status) ~ sex, data = lung)

        N Observed Expected (O-E)^2/E (O-E)^2/V
sex=1 138      112     91.6      4.55      10.3
sex=2  90       53     73.4      5.68      10.3

 Chisq= 10.3  on 1 degrees of freedom, p= 0.00131

 

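The log-rank test for a difference in survival gives a p-value of p = 0.0013, indicating that the sex groups differ significantly in survival.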
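If the test statistic or the p-value is needed as a number rather than as printed output, it can be recovered from the survdiff object. The sketch below is one way to do this, using the chi-square statistic and a degrees of freedom equal to the number of groups minus one.

```r
library(survival)

surv_diff <- survdiff(Surv(time, status) ~ sex, data = lung)

# Chi-square statistic of the log-rank test
chisq <- surv_diff$chisq

# Degrees of freedom: number of groups minus one
df <- length(surv_diff$n) - 1

# Upper-tail probability of the chi-square distribution
p_value <- pchisq(chisq, df = df, lower.tail = FALSE)

c(chisq = chisq, df = df, p = p_value)
```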

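Complex survival curves

In this section, we compute survival curves using a combination of multiple factors. Next, we facet the output of ggsurvplot() by a combination of factors.

fit2 <- survfit(Surv(time, status) ~ sex + rx + adhere, data = colon)

Visualize the output using survminer. The resulting plot shows survival curves by the sex variable, faceted according to the values of rx and adhere.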
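A sketch of how such a faceted plot could be produced: the ggplot object stored in the ggsurvplot() result is extended with facet_grid() from ggplot2. The choice of fun = "event" and the legend position are illustrative assumptions.

```r
library(survival)
library(survminer)
library(ggplot2)

fit2 <- survfit(Surv(time, status) ~ sex + rx + adhere, data = colon)

# Plot cumulative events with confidence bands
ggsurv <- ggsurvplot(fit2, data = colon, fun = "event", conf.int = TRUE)

# Facet the underlying ggplot by treatment (rows) and adherence (columns)
p_facet <- ggsurv$plot +
  theme_bw() +
  theme(legend.position = "right") +
  facet_grid(rx ~ adhere)
print(p_facet)
```

This works because the plot data carries the rx and adhere columns described in the surv_summary paragraph above.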

 


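Summary

Survival analysis is a set of statistical approaches for data analysis in which the outcome variable of interest is the time until an event occurs.

 

In this article, we demonstrated how to perform and visualize survival analyses using a combination of two R packages: survival (for the analysis) and survminer (for the visualization).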

 


Origin www.cnblogs.com/tecdat/p/11324615.html