ggpubr is a well-established R package for drawing publication-quality figures. It builds on ggplot2, and its own description says it helps you create and customize "ggplot2"-based publication-ready plots, which makes it a good fit for figures in SCI journal papers. Today we use ggpubr to draw box plots annotated with the statistical results that such papers usually require. Let's start by loading the package and the data. We use the ToothGrowth dataset that ships with R.
library(ggpubr)
data("ToothGrowth")
df <- ToothGrowth
This dataset ships with R and describes the effect of vitamin C on tooth growth in guinea pigs. The response is the length of odontoblasts (the cells responsible for tooth growth) in 60 guinea pigs. Each animal received one of three dose levels of vitamin C (0.5, 1, and 2 mg/day) via one of two delivery methods: orange juice (coded as OJ) or ascorbic acid (a form of vitamin C, coded as VC).
len: tooth length. supp: the delivery method, with two levels, orange juice (OJ) or ascorbic acid (VC). dose: the vitamin C dose in mg/day.
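Before plotting, it can help to check the structure described above. The following is a minimal base R sketch; the variable name counts is introduced here only for illustration.

```r
# Load the built-in dataset and inspect its structure
data("ToothGrowth")
str(ToothGrowth)   # 60 observations of len, supp and dose

# Count animals in each delivery-method x dose combination
counts <- table(ToothGrowth$supp, ToothGrowth$dose)
counts             # 10 animals per cell
```

A balanced design like this one means every box in the plots below is based on the same number of animals.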
The function for drawing box plots is ggboxplot. Let's first draw a basic box plot. x is the grouping variable; if it is stored as a number, it should be converted to a factor. y is the tooth length and must be a continuous variable.
p <- ggboxplot(data=ToothGrowth, x = "supp", y = "len",
color = "supp", palette = "npg", add = "jitter")
p
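The factor conversion mentioned above can be done explicitly in base R. This is a small sketch for the numeric dose column; the copy named tg is a hypothetical name used only here, so the original data frame stays untouched.

```r
# Work on a copy so the original ToothGrowth is not modified
data("ToothGrowth")
tg <- ToothGrowth

# dose is stored as a number; turn it into a categorical variable
tg$dose <- factor(tg$dose)

levels(tg$dose)    # the three dose groups as character labels
```

The level labels are the strings that later identify the dose groups when we list the comparisons to draw.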
The palette argument sets the color palette. Depending on the style you need, the journal-style options are "npg", "aaas", "lancet", "jco", "ucscgb", "uchicago", "simpsons" and "rickandmorty". The argument is optional. Let's change the style and see the result:
p <- ggboxplot(data=ToothGrowth, x = "supp", y = "len",
color = "supp", palette = "ucscgb", add = "jitter")
p
add = "jitter" overlays the individual data points with a little random horizontal scatter. Let's try the plot without them:
p <- ggboxplot(data=ToothGrowth, x = "supp", y = "len",
color = "supp", palette = "ucscgb")
p
A strength of ggpubr is that it can run the statistical tests on the plotted data and display the results neatly on the figure. Let's demonstrate this by adding a p-value to the plot:
p + stat_compare_means()
With two groups, the default is the Wilcoxon rank-sum test. To switch to a t-test:
p + stat_compare_means(method = "t.test")
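To see the numbers behind these labels, the same two tests can be run directly in base R. This is a sketch that assumes stat_compare_means() hands the work to wilcox.test() and t.test() with their default settings; the names res_wilcox and res_t are introduced only for illustration.

```r
data("ToothGrowth")

# Wilcoxon rank-sum test; exact = FALSE uses the normal approximation,
# which avoids the warning R gives when the data contain ties
res_wilcox <- wilcox.test(len ~ supp, data = ToothGrowth, exact = FALSE)

# Welch two-sample t-test (base R does not assume equal variances by default)
res_t <- t.test(len ~ supp, data = ToothGrowth)

# Collect both p-values side by side
c(wilcoxon = res_wilcox$p.value, t_test = res_t$p.value)
```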
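To run a paired comparison of the two groups, use ggpaired, which connects each pair of observations with a line. ToothGrowth is not truly paired data, since it records 60 different animals, so it is used here only to show the syntax: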
ggpaired(ToothGrowth, x = "supp", y = "len",
color = "supp", line.color = "gray", line.size = 0.4,
palette = "ucscgb")+
stat_compare_means(paired = TRUE)
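The paired test can be sketched in base R as follows. It assumes observations are paired by their order within each group, and that the label comes from a paired Wilcoxon (signed-rank) test, the default method; oj, vc and res_paired are illustrative names.

```r
data("ToothGrowth")

# Split the tooth lengths by delivery method, keeping row order
oj <- ToothGrowth$len[ToothGrowth$supp == "OJ"]
vc <- ToothGrowth$len[ToothGrowth$supp == "VC"]

# A paired test needs the same number of observations in both groups
stopifnot(length(oj) == length(vc))

# Wilcoxon signed-rank test on the pairwise differences;
# exact = FALSE avoids warnings about ties and zero differences
res_paired <- wilcox.test(oj, vc, paired = TRUE, exact = FALSE)
res_paired$p.value
```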
To compare more than two groups, such as the different doses, we must first list the pairs of groups to compare. Here we compare the three dose groups 0.5, 1, and 2 with one another.
my_comparisons <- list( c("0.5", "1"), c("1", "2"), c("0.5", "2") )
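With more groups, typing every pair by hand gets tedious. As a sketch, base R's combn() can generate all pairs from the group labels; all_pairs is a hypothetical name, and the pairs come out in a different order from the hand-written list.

```r
data("ToothGrowth")

# Group labels as character strings
doses <- levels(factor(ToothGrowth$dose))

# Every unordered pair of labels, returned as a list of length-2 vectors
all_pairs <- combn(doses, 2, simplify = FALSE)
all_pairs
```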
Then draw the plot:
ggboxplot(ToothGrowth, x = "dose", y = "len",
color = "dose", palette = "npg")+
stat_compare_means(comparisons = my_comparisons, label.y = c(29, 35, 40))+ # Pairwise p-values, one bracket per comparison
stat_compare_means(label.y = 45) # Global p-value (Kruskal-Wallis by default)
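The tests behind this figure can be reproduced roughly in base R. The sketch assumes the global label is a Kruskal-Wallis test and the brackets are pairwise Wilcoxon tests; setting p.adjust.method = "none" is an assumption made here, and "holm" can be used instead to correct for multiple testing.

```r
data("ToothGrowth")

# Global test across the three dose groups
res_global <- kruskal.test(len ~ dose, data = ToothGrowth)
res_global$p.value

# Pairwise Wilcoxon tests between dose groups, without p-value adjustment;
# exact = FALSE is passed on to wilcox.test() to avoid tie warnings
res_pairs <- pairwise.wilcox.test(ToothGrowth$len, ToothGrowth$dose,
                                  p.adjust.method = "none", exact = FALSE)
res_pairs$p.value   # lower-triangular matrix of p-values
```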
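We can also set the 0.5 group as the reference and compare each of the other groups with it. Here the global p-value comes from an ANOVA, and each comparison against the reference group uses a t-test shown as a significance symbol.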
# Multiple pairwise test against a reference group
ggboxplot(ToothGrowth, x = "dose", y = "len",
color = "dose", palette = "npg")+
stat_compare_means(method = "anova", label.y = 40)+ # Add global p-value
stat_compare_means(aes(label = after_stat(p.signif)),
method = "t.test", ref.group = "0.5")
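A base R sketch of the same idea: one ANOVA for the global p-value, then a t-test of each dose group against the 0.5 reference. The names tg, ref, others and p_vs_ref are illustrative, and the p-values are left unadjusted for simplicity.

```r
data("ToothGrowth")

# Treat dose as a categorical variable
tg <- transform(ToothGrowth, dose = factor(dose))

# Global one-way ANOVA p-value
anova_p <- summary(aov(len ~ dose, data = tg))[[1]][["Pr(>F)"]][1]

# Welch t-test of every non-reference group against the reference group
ref <- "0.5"
others <- setdiff(levels(tg$dose), ref)
p_vs_ref <- sapply(others, function(g) {
  t.test(tg$len[tg$dose == g], tg$len[tg$dose == ref])$p.value
})

anova_p
p_vs_ref   # named by the group compared with the reference
```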
It is also possible to compare groups within subsets of the data, which is very practical. Here we facet the plot by dose, so that the two delivery methods are compared separately at each dose level.
p <- ggboxplot(ToothGrowth, x = "supp", y = "len",
color = "supp", palette = "npg",
add = "jitter",
facet.by = "dose", short.panel.labs = FALSE)
p
You can also add the p-value to each panel:
p + stat_compare_means(
aes(label = paste0("p = ", after_stat(p.format)))
)
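The per-panel p-values can be approximated in base R by splitting the data by dose and testing within each subset. This sketch assumes the default Wilcoxon rank-sum test; by_dose and p_by_dose are hypothetical names.

```r
data("ToothGrowth")

# One data frame per dose level
by_dose <- split(ToothGrowth, ToothGrowth$dose)

# Compare the two delivery methods within each dose level
p_by_dose <- sapply(by_dose, function(d) {
  wilcox.test(len ~ supp, data = d, exact = FALSE)$p.value
})

# Format the p-values for display, one row per dose
data.frame(dose = names(p_by_dose),
           p = format.pval(p_by_dose, digits = 2),
           row.names = NULL)
```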
Each panel shows, for one dose level, whether the two delivery methods differ in their effect on guinea pig tooth length.
To summarize, ggpubr is a very useful R package with many plotting functions, and it makes it easy to add statistical comparisons to box plots.