In biomedical and other research papers, "Table 1" usually presents descriptive statistics of the baseline characteristics. Computing these in R one variable at a time with summary() and then copying the results into an Excel table is time-consuming and error-prone!
The tableone package was created to solve exactly this problem quickly and easily. The key point is that its learning cost is very low, probably just a few minutes.
I. Loading the R packages and data
## install.packages("tableone")
library(tableone)
library(survival)
data(pbc)
head(pbc)
II. Single-group summaries
1 Summarizing the entire data set
To summarize the entire pbc data set, simply call CreateTableOne():
tab1 <- CreateTableOne(data = pbc)
print(tab1)
Because the categorical variables in this data set are coded as numbers, they are treated like continuous variables and shown as mean (standard deviation).
2 Specifying variable types
dput(names(pbc))  # print the variable names of the data set
## Variables to summarize
myVars <- c("time", "status", "trt", "age", "sex", "ascites", "edema", "bili", "copper", "ast", "stage")
## Variables to convert to categorical variables
catVars <- c("status", "trt", "ascites", "stage")
## Create a TableOne object
tab2 <- CreateTableOne(vars = myVars, data = pbc, factorVars = catVars)
print(tab2, showAllLevels = TRUE)
With showAllLevels = TRUE, every level of each categorical variable is shown in the output.
Here a few variables were picked arbitrarily to demonstrate the function. Categorical variables are now displayed as counts and percentages.
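To make the difference between the two summaries concrete, here is a minimal base R sketch on a small made-up vector. The name stage_demo and its values are hypothetical and are not taken from pbc; this only illustrates the idea and is not how tableone computes its output internally.

```r
# Hypothetical numeric-coded categorical variable (made-up values)
stage_demo <- c(1, 2, 2, 3, 3, 3, 4, 4, 4, 4)

# Treated as a number, it would be summarized by mean and SD
mean_sd <- c(mean = mean(stage_demo), sd = sd(stage_demo))

# Treated as a factor, it is summarized by counts and percentages per level
counts <- table(factor(stage_demo))
percents <- round(100 * prop.table(counts), 1)

print(mean_sd)
print(cbind(n = counts, percent = percents))
```

The mean of a stage code has no natural interpretation, which is why such variables are better listed in factorVars.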
3 Non-normally distributed variables
By default, continuous variables are assumed to be normally distributed, so the continuous variables above are reported as mean (standard deviation).
Variables that are not normally distributed in the actual data can be specified through the nonnormal argument, and they are then shown as median (quartiles).
# Assume that "bili", "ast", and "copper" are non-normally distributed
biomarkers <- c("bili","copper","ast")
print(tab2, nonnormal = biomarkers)
As can be seen, "bili", "ast", and "copper" are now summarized as median (quartiles). If nonnormal = TRUE is set, all continuous variables are treated as non-normally distributed.
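Why the median (quartiles) form suits skewed data can be seen in a small base R sketch. The vector bili_demo below is made up for illustration and is not taken from pbc.

```r
# Hypothetical right-skewed measurements (made-up values)
bili_demo <- c(0.6, 0.8, 1.1, 1.4, 2.3, 3.2, 6.6, 14.5, 28.0)

# The mean and SD are pulled upward by the few large values
mean_sd <- c(mean = mean(bili_demo), sd = sd(bili_demo))

# Median with the 25th and 75th percentiles is less sensitive to the tail
med_iqr <- quantile(bili_demo, probs = c(0.5, 0.25, 0.75))

print(round(mean_sd, 2))
print(med_iqr)
```

In this toy example the mean is well above the median, so a mean (standard deviation) summary would give a misleading picture of a typical value.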
III. Multi-group summaries
1 Grouped statistics
In practice, the data often need to be summarized by group according to some variable. The following shows a summary grouped by trt:
tab3 <- CreateTableOne(vars = myVars, strata = "trt" , data = pbc, factorVars = catVars)
tab3
Note that observations with NA in the grouping variable are omitted.
The results show that the data are grouped by trt, summary statistics are computed for each group, and test p-values are reported.
Test methods: by default, categorical variables are compared with the chi-square test (chisq.test()) and continuous variables with analysis of variance (oneway.test()). With two groups, the analysis of variance is equivalent to the t test.
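The two-group equivalence can be checked directly in base R. This is a minimal sketch on made-up data (the data frame demo is hypothetical), and it assumes equal variances in both calls, which to my understanding is the setting tableone uses by default for oneway.test().

```r
# Hypothetical two-group data (made-up values)
demo <- data.frame(
  value = c(5.1, 4.8, 6.0, 5.5, 5.9, 6.8, 7.1, 6.2, 7.5, 6.9),
  group = factor(rep(c("A", "B"), each = 5))
)

# One-way ANOVA assuming equal variances
p_anova <- oneway.test(value ~ group, data = demo, var.equal = TRUE)$p.value

# Two-sample t test assuming equal variances
p_ttest <- t.test(value ~ group, data = demo, var.equal = TRUE)$p.value

# With two groups the F statistic is the square of the t statistic,
# so the two p-values agree up to floating-point error
print(c(anova = p_anova, t_test = p_ttest))
print(isTRUE(all.equal(p_anova, p_ttest)))
```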
2 Specifying test methods
For non-normal data displayed as median (quartiles), it is also better not to use the t test:
Non-normally distributed continuous variables are tested with kruskal.test(); with two groups, kruskal.test() and wilcox.test() are equivalent.
Categorical variables can be tested with Fisher's exact test (fisher.test()); use the exact argument to name the variables that should be tested this way.
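The kruskal.test() and wilcox.test() equivalence for two groups can also be checked in base R. The sketch below uses made-up data (demo is hypothetical). One assumption to note: the p-values match when wilcox.test() uses the normal approximation without continuity correction, hence exact = FALSE and correct = FALSE; with its defaults the numbers can differ slightly.

```r
# Hypothetical two-group data with a skewed outcome (made-up values)
demo <- data.frame(
  value = c(0.7, 1.2, 0.9, 3.4, 1.8, 2.1, 2.5, 4.7, 1.5, 6.3, 3.9, 8.8),
  group = factor(rep(c("A", "B"), each = 6))
)

# Kruskal-Wallis rank sum test
p_kruskal <- kruskal.test(value ~ group, data = demo)$p.value

# Wilcoxon rank sum test, normal approximation, no continuity correction
p_wilcox <- wilcox.test(value ~ group, data = demo,
                        exact = FALSE, correct = FALSE)$p.value

print(c(kruskal = p_kruskal, wilcoxon = p_wilcox))
print(isTRUE(all.equal(p_kruskal, p_wilcox)))
```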
# addOverall adds a column with the overall summary
tab4 <- CreateTableOne(vars = myVars, strata = "trt", data = pbc, factorVars = catVars, addOverall = TRUE)
# exact names the variables to be tested with Fisher's exact test
print(tab4, nonnormal = biomarkers, exact = "stage")
IV. Exporting the results
Export the results in one step with write.csv():
tab4Mat <- print(tab4, nonnormal = biomarkers, exact = "stage", quote = FALSE, noSpaces = TRUE, printToggle = FALSE, showAllLevels = TRUE)
## Save as a CSV file
write.csv(tab4Mat, file = "myTable.csv")
After that, just apply a table format in Excel to get the style you (or the journal) prefer.
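To see what actually lands in the CSV file, here is a small base R round trip. The matrix tab_demo is a made-up stand-in for a printed table, with hypothetical row and column names, and the file is written to a temporary location.

```r
# Hypothetical character matrix standing in for the printed table
tab_demo <- matrix(
  c("10", "5.2 (1.1)", "12", "6.0 (1.3)"),
  nrow = 2,
  dimnames = list(c("n", "age (mean (SD))"), c("group1", "group2"))
)

# Write to a temporary CSV file; row names become the first column
csv_path <- tempfile(fileext = ".csv")
write.csv(tab_demo, file = csv_path)

# Read it back as text to check what a spreadsheet program would receive
tab_back <- read.csv(csv_path, row.names = 1, colClasses = "character")
print(tab_back)
```

Because every cell is already formatted text, the file opens in Excel with one variable per row and one group per column.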
References:
https://cran.r-project.org/web/packages/tableone/vignettes/introduction.html