Forest plots are very common in papers. They are mostly used to show the effect sizes (for example odds ratios or hazard ratios) of explanatory variables on an outcome in multi-factor analysis, so that the results can be read at a glance. We previously showed how to draw a forest plot in the article "Quickly draw a multi-factor regression analysis forest map in R language (1)", but that plot was fairly basic, not very polished, and could not handle more complex layouts. Today we introduce the forestploter package, which can be thought of as a more capable alternative to the forestplot package. It is simple to use, gives finer control over the graphics, and can draw both single-group and multi-group forest plots.
First, load the R packages and read in the data:
library(grid)
library(forestploter)
dt<-read.csv("E:/r/test/forest2.csv",sep=',',header=TRUE)
This is the forest plot data set (reply "forest map data 2" to the official account to get it). If that is too much trouble, you can also download it here: https://download.csdn.net/download/dege857/86945654?spm=1001.2014.3001.5501
Let me explain the first few variables. Subgroup: the group and subgroup labels. Treatment: the number of cases in the treatment group. Placebo: the number of cases in the placebo (control) group. est: the effect estimate, which can be an OR or HR. low: the lower limit of the confidence interval. hi: the upper limit of the confidence interval.
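For readers who do not have the CSV file at hand, the column layout described above can be mimicked with a small made-up data frame. This is only a sketch: `toy` is a hypothetical name, and every number below is invented for illustration rather than taken from forest2.csv.

```r
# Hypothetical toy data with the same six leading columns as dt.
# All values are invented for illustration only.
toy <- data.frame(
  Subgroup  = c("All Patients", "Sex", "Male", "Female"),
  Treatment = c(781, NA, 535, 246),      # cases in the treatment group
  Placebo   = c(780, NA, 548, 232),      # cases in the control group
  est       = c(1.30, NA, 1.07, 2.13),   # effect estimate (e.g. OR or HR)
  low       = c(0.90, NA, 0.47, 1.23),   # lower confidence limit
  hi        = c(1.70, NA, 1.67, 3.03),   # upper confidence limit
  stringsAsFactors = FALSE
)
str(toy)
```

The "Sex" row plays the role of a group heading, which is why its counts and estimates are missing.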
Let's draw a basic forest plot first. The remaining columns are not needed yet, so we keep only the first six.
dt <- dt[,1:6]
View(dt)
The data are now trimmed down. Next we tidy the format: rows that belong to a subgroup are indented by one space so they stand out from the group headings.
dt$Subgroup <- ifelse(is.na(dt$Placebo),
dt$Subgroup,
paste0(" ", dt$Subgroup))
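To see what this indentation step does, here is a minimal sketch on a made-up vector. The names `subgroup`, `placebo` and `indented` are hypothetical and used only for this illustration.

```r
# Made-up labels: one heading row (Placebo is NA) followed by two subgroup rows.
subgroup <- c("Sex", "Male", "Female")
placebo  <- c(NA, 548, 232)

# Heading rows are left alone; rows that carry counts get one leading space.
indented <- ifelse(is.na(placebo), subgroup, paste0(" ", subgroup))
print(indented)
```

Using more spaces inside `paste0()` would give a deeper indent.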
Next, we replace the NA (missing) values in the treatment and control columns with empty strings, so that nothing is printed in those cells
dt$Treatment <- ifelse(is.na(dt$Treatment), "", dt$Treatment)
dt$Placebo <- ifelse(is.na(dt$Placebo), "", dt$Placebo)
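One side effect worth knowing: after this step the count columns are no longer numeric. A small sketch with made-up counts (the names `counts` and `shown` are hypothetical):

```r
counts <- c(781, NA, 535)                     # made-up counts, one missing
shown  <- ifelse(is.na(counts), "", counts)   # NA becomes an empty string

print(shown)
print(class(shown))   # the whole column is now character
```

This is fine for display in the plot, but any arithmetic on the counts should be done before the conversion.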
Generate a variable se, the standard error on the log scale derived from the upper confidence limit. It is used to set the size of the squares when drawing
dt$se <- (log(dt$hi) - log(dt$est))/1.96
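The formula above follows from the usual way a 95% confidence interval for a ratio is built on the log scale, hi = exp(log(est) + 1.96 * se). The sketch below works through one made-up estimate, assuming the interval really was constructed that way; the numbers are invented.

```r
est <- 1.5   # made-up effect estimate
hi  <- 2.0   # made-up upper 95% confidence limit

# Rearranging hi = exp(log(est) + 1.96 * se) for se:
se <- (log(hi) - log(est)) / 1.96
print(se)

# 1.96 is the rounded 97.5% quantile of the standard normal distribution.
print(qnorm(0.975))

# Plugging se back in recovers the upper limit.
print(exp(log(est) + 1.96 * se))
```

If the intervals in your own data were computed differently (for example exact or bootstrap intervals), se obtained this way is only an approximation, which is usually good enough for sizing the squares.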
Create a blank column that reserves space for the confidence interval plot; it will be used when drawing
dt$` ` <- paste(rep(" ", 20), collapse = " ")
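The blank column is just a string of spaces, and its length is what reserves room for the plot. A small sketch of how the string is built (`blank` and `wider` are hypothetical names):

```r
# 20 single spaces joined by a single space separator.
blank <- paste(rep(" ", 20), collapse = " ")
print(nchar(blank))

# Repeating more spaces gives a longer string and so a wider plotting column.
wider <- paste(rep(" ", 30), collapse = " ")
print(nchar(wider))
```

If the intervals look cramped, increasing the repeat count is the simplest adjustment.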
Create a text column with the HR and its 95% confidence interval
dt$`HR (95% CI)` <- ifelse(is.na(dt$se), "",
                           sprintf("%.2f (%.2f to %.2f)",
                                   dt$est, dt$low, dt$hi)) # sprintf() builds a formatted string from the values
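Here is what that label looks like for one made-up row with values and one heading row without. The variable names are hypothetical and the numbers invented.

```r
est <- c(1.5, NA)   # second entry mimics a heading row with no estimate
low <- c(1.1, NA)
hi  <- c(2.0, NA)

# Two decimals for each number; heading rows get an empty label.
label <- ifelse(is.na(est), "",
                sprintf("%.2f (%.2f to %.2f)", est, low, hi))
print(label)
```

Changing `%.2f` to `%.1f` or `%.3f` changes the number of decimals shown.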
The data are now ready, so we can draw the forest plot.
p <- forest(dt[,c(1:3, 8:9)],
            est = dt$est,       # effect estimate
            lower = dt$low,     # lower confidence limit
            upper = dt$hi,      # upper confidence limit
            sizes = dt$se,      # size of the squares
            ci_column = 4,      # column in which to draw the CI; must be the blank column
            ref_line = 1,
            arrow_lab = c("Placebo Better", "Treatment Better"),
            xlim = c(0, 4),
            ticks_at = c(0.5, 1, 2, 3),
            footnote = "This is the demo data. Please feel free to change\nanything you want.")
p
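The whole preparation and the basic call can also be tried end to end without the CSV file. The sketch below uses a made-up data frame (`toy` and `p_toy` are hypothetical names, all numbers invented) and only calls `forest()` with arguments that already appear in this article. The call is guarded so that the script still runs if forestploter is not installed.

```r
# Hypothetical toy data; every number here is invented for illustration.
toy <- data.frame(
  Subgroup  = c("All Patients", "Sex", " Male", " Female"),
  Treatment = c("781", "", "535", "246"),
  Placebo   = c("780", "", "548", "232"),
  est = c(1.30, NA, 1.07, 2.13),
  low = c(0.90, NA, 0.47, 1.23),
  hi  = c(1.70, NA, 1.67, 3.03),
  stringsAsFactors = FALSE
)

# Same preparation steps as in the article.
toy$se  <- (log(toy$hi) - log(toy$est)) / 1.96
toy$` ` <- paste(rep(" ", 20), collapse = " ")
toy$`HR (95% CI)` <- ifelse(is.na(toy$se), "",
                            sprintf("%.2f (%.2f to %.2f)",
                                    toy$est, toy$low, toy$hi))

p_toy <- NULL
if (requireNamespace("forestploter", quietly = TRUE)) {
  p_toy <- forestploter::forest(
    toy[, c("Subgroup", "Treatment", "Placebo", " ", "HR (95% CI)")],
    est = toy$est,
    lower = toy$low,
    upper = toy$hi,
    sizes = toy$se,
    ci_column = 4,            # the blank column is the fourth one selected
    ref_line = 1,
    xlim = c(0, 4),
    ticks_at = c(0.5, 1, 2, 3)
  )
  plot(p_toy)
}
```

Selecting the columns by name instead of by position makes it easier to see that the blank column ends up in position 4, which is the value given to `ci_column`.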
Some forest plots also show p-values. If your plot has no p-value column, you can add one. Here we simply fill in a placeholder value for every row:
dt$p <- rep("<0.05", nrow(dt))  # placeholder p-values for demonstration only
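In a real analysis the p-values come from the model rather than a constant string. A minimal sketch of turning numeric p-values into display labels, assuming a hypothetical vector `p_raw` of made-up values:

```r
# Made-up p-values; NA mimics a heading row with no test.
p_raw <- c(0.0004, 0.032, 0.41, NA)

# Very small values are shown as "<0.001", the rest with three decimals,
# and missing values as an empty string.
p_label <- ifelse(is.na(p_raw), "",
                  ifelse(p_raw < 0.001, "<0.001", sprintf("%.3f", p_raw)))
print(p_label)
```

The resulting character vector can be assigned to the p column in the same way as the placeholder.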
Then redraw the plot with the new column included:
p <- forest(dt[,c(1:3, 8:10)],
            est = dt$est,       # effect estimate
            lower = dt$low,     # lower confidence limit
            upper = dt$hi,      # upper confidence limit
            sizes = dt$se,      # size of the squares
            ci_column = 4,      # column in which to draw the CI; must be the blank column
            ref_line = 1,
            arrow_lab = c("Placebo Better", "Treatment Better"),
            xlim = c(0, 4),
            ticks_at = c(0.5, 1, 2, 3),
            footnote = "This is the demo data. Please feel free to change\nanything you want.")
p
The p-value column now appears in the plot.
Once the plot is generated, we can also adjust its details. Here we move the first row (all patients) to the end, rename it Overall, and later mark it as a summary row.
dt_tmp <- rbind(dt[-1, ], dt[1, ])
dt_tmp[nrow(dt_tmp), 1] <- "Overall"
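The row shuffle is easier to follow on a tiny made-up table. The sketch below also builds the logical vector that marks the last row as the summary, in the same way as the is_summary argument used in the forest call later on. `toy`, `toy_tmp` and `is_summary` are hypothetical names and the numbers are invented.

```r
toy <- data.frame(
  Subgroup = c("All Patients", "Male", "Female"),
  est      = c(1.30, 1.07, 2.13),
  stringsAsFactors = FALSE
)

# Drop the first row, then append it at the bottom.
toy_tmp <- rbind(toy[-1, ], toy[1, ])

# Rename the label in the last row, first column.
toy_tmp[nrow(toy_tmp), 1] <- "Overall"
print(toy_tmp)

# FALSE for every ordinary row, TRUE for the final summary row.
is_summary <- c(rep(FALSE, nrow(toy_tmp) - 1), TRUE)
print(is_summary)
```

Note that the original row names travel with the rows, so the printed row names appear out of order; this does not affect the plot.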
The forest_theme function controls the appearance of the forest plot. We define a theme with forest_theme first and then pass it to forest when drawing.
tm <- forest_theme(base_size = 10,          # base text size
                   # Confidence interval point shape, line type/color/width
                   ci_pch = 15,             # shape of the point estimate
                   ci_col = "#762a83",      # CI color
                   ci_fill = "blue",        # CI fill color
                   ci_alpha = 0.8,          # CI transparency
                   ci_lty = 1,              # CI line type
                   ci_lwd = 1.5,            # CI line width
                   ci_Theight = 0.2,        # height of the T-shaped ends of the CI; default is NULL
                   # Reference line width/type/color (the vertical dashed line in the middle)
                   refline_lwd = 1,
                   refline_lty = "dashed",
                   refline_col = "grey20",
                   # Vertical line width/type/color (an optional extra vertical line, not shown unless set)
                   vertline_lwd = 1,
                   vertline_lty = "dashed",
                   vertline_col = "grey20",
                   # Change summary color for filling and borders
                   summary_fill = "yellow", # fill color of the summary diamond
                   summary_col = "#4575b4",
                   # Footnote font size/face/color
                   footnote_cex = 0.6,
                   footnote_fontface = "italic",
                   footnote_col = "red")
With the theme defined, we can draw the plot directly
pt <- forest(dt_tmp[,c(1:3, 8:9)],
est = dt_tmp$est,
lower = dt_tmp$low,
upper = dt_tmp$hi,
sizes = dt_tmp$se,
is_summary = c(rep(FALSE, nrow(dt_tmp)-1), TRUE),
ci_column = 4,
ref_line = 1,
arrow_lab = c("Placebo Better", "Treatment Better"),
xlim = c(0, 4),
ticks_at = c(0.5, 1, 2, 3),
footnote = "This is the demo data. Please feel free to change\nanything you want.",
theme = tm)
plot(pt)
We can also edit individual parts of the plot. For example, to make the text in the third row red and italic
g <- edit_plot(p, row = 3, gp = gpar(col = "red", fontface = "italic"))
g
Change the color of the squares and interval lines in rows 3, 6, 11, and 13 to green
g <- edit_plot(g,
row = c(3, 6, 11, 13),
col = 4,
which = "ci",
gp = gpar(col = "green"))
g
Make the text in rows 2, 5, 10, 13, 17, and 20 bold
g <- edit_plot(g,
row = c(2, 5, 10, 13, 17, 20),
gp = gpar(fontface = "bold"))
g
Change the background of the fifth row to green
g <- edit_plot(g, row = 5, which = "background",
gp = gpar(fill = "darkolivegreen1"))
g
Insert text in the header, spanning columns 2 and 3
g <- insert_text(g,
text = "Treatment group",
col = 2:3,
part = "header",
gp = gpar(fontface = "bold"))
g
Add an underline under the header
g <- add_underline(g, part = "header")
g
Insert text at row 10
g <- insert_text(g,
text = "This is a long text. Age and gender summarised above.\nBMI is next",
row = 10,
just = "left",
gp = gpar(cex = 0.6, col = "green", fontface = "italic"))
g
Next, let's see how to draw a multi-group forest plot. A multi-group forest plot involves several sets of estimates, so we re-import the data
dt<-read.csv("E:/r/test/forest2.csv",sep=',',header=TRUE)
Set the indentation to make it look better. This step is the same as before
dt$Subgroup <- ifelse(is.na(dt$Placebo),
                      dt$Subgroup,
                      paste0(" ", dt$Subgroup)) # indent by one space when Placebo is not missing
Because there are two groups, we create two sample-size columns, n1 and n2. This step is basically the same as before
dt$n1 <- ifelse(is.na(dt$Treatment), "", dt$Treatment) # replace missing values with empty strings
dt$n2 <- ifelse(is.na(dt$Placebo), "", dt$Placebo)
Because we want to draw two forest plots side by side, we add two blank columns to hold them
dt$`CVD outcome` <- paste(rep(" ", 20), collapse = " ")
dt$`COPD outcome` <- paste(rep(" ", 20), collapse = " ")
Set the theme of the forest plot. This step works the same way as before
tm <- forest_theme(base_size = 10,
                   refline_lty = "solid",               # reference line type
                   ci_pch = c(15, 18),
                   ci_col = c("#377eb8", "#4daf4a"),
                   footnote_col = "blue",
                   legend_name = "Group",               # legend title
                   legend_value = c("Trt 1", "Trt 2"),  # group labels in the legend
                   vertline_lty = c("dashed", "dotted"),
                   vertline_col = c("#d6604d", "#bababa"))
Finally, draw the plot. Note that ci_column = c(3, 5) means the intervals are drawn in columns 3 and 5. est_gp1 and est_gp2 form the first group and are drawn in columns 3 and 5 respectively, est_gp3 and est_gp4 form the second group, and so on
p <- forest(dt[,c(1, 19, 21, 20, 22)],
est = list(dt$est_gp1,
dt$est_gp2,
dt$est_gp3,
dt$est_gp4),
lower = list(dt$low_gp1,
dt$low_gp2,
dt$low_gp3,
dt$low_gp4),
upper = list(dt$hi_gp1,
dt$hi_gp2,
dt$hi_gp3,
dt$hi_gp4),
ci_column = c(3, 5),
ref_line = 1,
vert_line = c(0.5, 2),
nudge_y = 0.2,
theme = tm)
p
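The pairing between list entries, groups and columns can be written out as a small table. The sketch below is plain R and does not call forestploter; it only encodes the ordering rule as described above (entries are filled column by column within a group, then group by group), which is our reading of how the package assigns them. The names `est_names`, `idx` and `mapping` are hypothetical.

```r
ci_column <- c(3, 5)
est_names <- c("est_gp1", "est_gp2", "est_gp3", "est_gp4")

n_col <- length(ci_column)    # number of CI columns
idx   <- seq_along(est_names) # position of each entry in the list

mapping <- data.frame(
  list_entry = est_names,
  group      = (idx - 1) %/% n_col + 1,          # entries 1-2 -> group 1, 3-4 -> group 2
  column     = ci_column[(idx - 1) %% n_col + 1] # alternates between columns 3 and 5
)
print(mapping)
```

Writing the table out like this before calling forest helps catch a mixed-up order in the est, lower and upper lists, which must all follow the same sequence.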
That completes the two-group forest plot, which can be fine-tuned in the same way as the single-group one. This is probably the most detailed article I have written; it ran to 15 pages.