In SCI papers we often see tables like these, usually labelled Table 2. They mainly report single-factor (univariate) analyses of the relationship between each factor and the outcome, or comparisons across grouping variables. As the following figure shows, such tables appear in countless papers. Today we show you how to make such a table, step by step, using example data.
Let's make a table like the one shown in the figure below. First, what does this table express? It mainly shows the relationship between each single factor and the outcome variable. Continuous factors are grouped before they are analyzed, and categorical variables such as gender and smoking are compared across their groups as well.
We use a breast cancer survival data set that comes with SPSS for the demonstration. First we import the data into R and delete the rows with missing values (this is only to demonstrate how to make the table; in a real analysis, missing values should not be deleted this way):
library(foreign)
library(survival)
bc <- read.spss("E:/ r/Breast cancer survival agec.sav",
                use.value.labels = FALSE, to.data.frame = TRUE)
bc <- na.omit(bc)
Check the variables in the data:
head(bc)
age is age, pathsize is the pathological tumor size (cm), lnpos is the number of positive axillary lymph nodes, histgrad is the histopathological grade, er is the estrogen receptor status, pr is the progesterone receptor status, status is the outcome event (death), pathscat is the pathological tumor size category (a grouping variable), ln_yesno indicates whether there is lymph node enlargement, and time is the survival time. The last variable, agec, is one we created ourselves, so you can ignore it.
Suppose we want to know how age relates to survival. We can first group age:
age1 <- cut(bc$age, breaks = 3, labels = c(1, 2, 3))  # divide age equally into 3 intervals, named 1, 2, 3
dc <- cbind(bc, age1)  # add the new variable to the data table
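It may help to see what "equally divided" means for cut(): with breaks = 3 the intervals have equal width, not equal numbers of patients. The sketch below uses a small made-up age vector (an assumption for illustration, not the breast cancer data) and contrasts this with a tertile split, which is one possible alternative rather than the method used in this post.

```r
# Hypothetical ages, for illustration only (not taken from the breast cancer data)
age <- c(25, 31, 38, 44, 47, 52, 58, 63, 70, 85)

# Equal-width grouping, as in the text: the range 25 to 85 is cut into 3 intervals of the same width
age_width <- cut(age, breaks = 3, labels = c(1, 2, 3))
table(age_width)  # group sizes can differ

# Alternative: cut at the tertiles so the groups have similar sizes
tertiles <- quantile(age, probs = c(0, 1/3, 2/3, 1))
age_tertile <- cut(age, breaks = tertiles, include.lowest = TRUE, labels = c(1, 2, 3))
table(age_tertile)
```

Which split is appropriate depends on the study; clinically meaningful cut points are often preferred over either automatic rule.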
First, let's do a single factor analysis of the relationship between age and the death outcome. Strictly speaking, because the data include a time variable, Cox regression should be used, but here I only use a generalized linear model (logistic regression) to demonstrate. The steps are the same with Cox regression.
Start with the univariate analysis of death outcome and age by putting age1 into the model:
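For readers who want the Cox version, here is a minimal sketch of the same univariate step using coxph() from the survival package. The data are simulated and the names sim, event_time and censor_time are assumptions made for this example; with the real data you would use the time and status columns instead.

```r
library(survival)

# Simulated survival data, for illustration only
set.seed(2)
n <- 300
sim <- data.frame(age1 = factor(sample(1:3, n, replace = TRUE)))
rate <- 0.05 * exp(0.3 * (sim$age1 == 2) + 0.7 * (sim$age1 == 3))
event_time <- rexp(n, rate)        # latent time to death
censor_time <- runif(n, 5, 40)     # latent censoring time
sim$time <- pmin(event_time, censor_time)
sim$status <- as.integer(event_time <= censor_time)  # 1 = death observed

# Univariate Cox model: same formula pattern, with Surv(time, status) as the outcome
f.cox <- coxph(Surv(time, status) ~ age1, data = sim)
summary(f.cox)

# Hazard ratios and 95% CI, the Cox counterpart of the OR table
exp(coef(f.cox))
exp(confint(f.cox))
```

The table is then filled in with hazard ratios (HR) rather than odds ratios (OR).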
f.age <- glm(status ~ age1, family = binomial, data = dc)
summary(f.age)
Calculate the OR and 95% CI:
exp(confint(f.age))
exp(coef(f.age))
The result has come out. How do we read it? Look mainly at age12 and age13. We divided age1 into 3 intervals, so why are there only two here? Because age12 and age13 are each compared with age11, which is the reference group, so the OR for age11 is taken as 1. We first create a table in Word and fill in the age rows.
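Copying numbers by hand from two separate outputs is error-prone, so it can help to collect the OR, the 95% CI and the p-value in one data frame first. The sketch below is one way to do that; or_table is a hypothetical helper written for this example, and the data are simulated rather than the breast cancer data.

```r
# Simulated data, for illustration only
set.seed(1)
n <- 300
sim <- data.frame(age1 = factor(sample(1:3, n, replace = TRUE)))
sim$status <- rbinom(n, 1, plogis(-1 + 0.4 * (sim$age1 == 2) + 0.9 * (sim$age1 == 3)))

fit <- glm(status ~ age1, family = binomial, data = sim)

# Hypothetical helper: one row per non-reference level, ready to copy into the table
or_table <- function(fit) {
  est <- exp(coef(fit))                          # odds ratios
  ci <- suppressMessages(exp(confint(fit)))      # 95% CI on the OR scale
  p <- summary(fit)$coefficients[, "Pr(>|z|)"]   # Wald p-values
  out <- data.frame(term = names(est),
                    OR = round(est, 2),
                    lower = round(ci[, 1], 2),
                    upper = round(ci[, 2], 2),
                    p = signif(p, 3),
                    row.names = NULL)
  out[out$term != "(Intercept)", ]               # the intercept is not reported in the table
}

or_table(fit)
```

The reference level (age11 in the text) does not appear in the output; its row in the table is simply written as 1 (reference).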
Next, let's do a categorical variable, ln_yesno (whether there is lymph node enlargement). First convert it to a categorical variable, then continue the analysis:
f.ln_yesno <- glm(status ~ ln_yesno, family = binomial, data = dc)
summary(f.ln_yesno)
Then calculate the OR and 95% CI:
exp(confint(f.ln_yesno))
exp(coef(f.ln_yesno))
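The text mentions converting ln_yesno to a categorical variable, but the code for that step is not shown. A minimal sketch of one way to do it with factor() follows, assuming the variable is coded 0/1 (an assumption; check the actual codes with head(bc) first). The vector and the labels "no" and "yes" are made up for illustration.

```r
# Hypothetical 0/1 codes, for illustration only
ln_yesno <- c(0, 1, 1, 0, 1)

# Convert to a factor; the first level becomes the reference group in glm()
ln_f <- factor(ln_yesno, levels = c(0, 1), labels = c("no", "yes"))
levels(ln_f)

# To compare against a different group, change the reference level
ln_f2 <- relevel(ln_f, ref = "yes")
levels(ln_f2)
```

For a variable with only two codes the OR is the same whether or not it is converted, but the conversion matters as soon as a variable has three or more categories.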
Continue to fill in the table.
So far we have handled both a continuous variable (grouped into intervals) and a categorical variable, and the other indicators can be done the same way. This table looks a bit troublesome, but in fact it is just a matter of changing the variable, and it took less than 20 minutes to make.
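Since each step only changes the variable, the repetition can be put in a loop. The sketch below fits one logistic model per variable and stacks the rows; univariate_row is a hypothetical helper, and the data are simulated with the same column names as in the text purely for illustration.

```r
# Simulated data with the same column names as in the text, for illustration only
set.seed(3)
n <- 400
sim <- data.frame(
  age1 = factor(sample(1:3, n, replace = TRUE)),
  ln_yesno = factor(sample(0:1, n, replace = TRUE))
)
sim$status <- rbinom(n, 1, plogis(-1.5 + 0.5 * (sim$age1 == 3) + 0.8 * (sim$ln_yesno == 1)))

# Hypothetical helper: fit status ~ var and return OR with 95% CI for each non-reference level
univariate_row <- function(var, data) {
  fit <- glm(reformulate(var, response = "status"), family = binomial, data = data)
  ci <- suppressMessages(exp(confint(fit)))
  keep <- names(coef(fit)) != "(Intercept)"
  data.frame(variable = var,
             term = names(coef(fit))[keep],
             OR = round(exp(coef(fit))[keep], 2),
             lower = round(ci[keep, 1], 2),
             upper = round(ci[keep, 2], 2),
             row.names = NULL)
}

# One model per variable, stacked into a single table
table2 <- do.call(rbind, lapply(c("age1", "ln_yesno"), univariate_row, data = sim))
table2
```

Adding another indicator then only means adding its name to the vector of variable names.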