A heat map is a chart that uses color to encode the values in a matrix. Many R packages can draw heat maps; we previously covered this in "R language pheatmap package draws heat maps". Today we introduce the R language linkET package, which draws a correlation heat map combined with a network. Such a plot shows the pairwise correlations within one dataset as a heat map and links a second dataset to it, so the relationships between several datasets can be compared in one figure.
The linkET package was written by a Chinese developer. It currently has to be installed from GitHub with devtools::install_github. The author describes the package with the idea that everything can be linked to everything else, which is quite interesting. Let's install the package first:
devtools::install_github("Hy4m/linkET", force = TRUE)
RStudio may ask whether you want to update other packages. Enter 3 (None in the usual prompt) to skip the updates; passing upgrade = "never" to install_github avoids the prompt entirely.
After installing the package, we load it and import the data. Because the plot relates two datasets, we need to import two data frames:
library(linkET)
library(ggplot2)
library(dplyr)
varespec <- read.csv("E:/r/test/varespec.csv", sep = ",", header = TRUE)
varechem <- read.csv("E:/r/test/varechem.csv", sep = ",", header = TRUE)
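As a hedged aside: the same two tables also ship as built-in datasets in the vegan package, and a CSV written with write.csv() keeps the row names as an extra first column unless it is read back with row.names = 1. Whether the CSV files used here were exported with row names is not stated, so the sketch below uses a made-up two-species table (toy, a hypothetical stand-in, not the real data) to show the difference. An extra column would shift the column indices used later.

```r
# Toy stand-in for a site-by-species table (values are made up)
toy <- data.frame(Callvulg = c(0.55, 0.67),
                  Empenigr = c(11.13, 0.17),
                  row.names = c("18", "15"))

tmp <- tempfile(fileext = ".csv")
write.csv(toy, tmp)                          # row names are written as the first column

with_extra <- read.csv(tmp)                  # row names come back as a column named "X"
clean      <- read.csv(tmp, row.names = 1)   # first column is used as row names instead

ncol(with_extra)   # 3: X, Callvulg, Empenigr
ncol(clean)        # 2: Callvulg, Empenigr

# Alternative source, only if the vegan package is installed
if (requireNamespace("vegan", quietly = TRUE)) {
  data(varespec, varechem, package = "vegan")
  print(dim(varespec))   # documented as 24 rows by 44 columns
}
```

Checking ncol() right after import is a cheap way to confirm that the species table really has 44 columns before indexing into it.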
Both datasets come from Väre, H., Ohtonen, R. and Oksanen, J. (1995) Effects of reindeer grazing on understorey vegetation in dry Pinus sylvestris forests. Journal of Vegetation Science 6, 523–530. The varespec data frame has 24 rows and 44 columns; the columns are estimated cover values for 44 species. The variable names are formed from the scientific names, and the data description says that anyone familiar with the vegetation type will recognize them at a glance. I cannot tell which plants they are, but they are plant names in any case. The varechem data frame has 24 rows and 14 columns and gives the soil properties of the same sites as the varespec data frame. The chemical measurements have self-explanatory names, mostly element symbols. Baresoil is the estimated cover of bare soil and Humdepth is the thickness of the humus layer.
The linkET package can draw a heat map for a single dataset or a correlation heat map between two datasets; both are shown below.
A few of its functions need introducing first. The correlate function computes the correlations among the columns of a data frame:
correlate(varechem)
It can also compute the correlation coefficients between two different datasets (here, the first 30 species columns against the soil variables):
correlate(varespec[1:30], varechem)
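To make the two call forms concrete, here is a minimal base R sketch of the same idea using cor() on the built-in mtcars data (a stand-in chosen only so the example runs without extra files). It assumes correlate() uses Pearson correlation by default, as cor() does. correlate() returns its own result object rather than a bare matrix, so this illustrates the underlying computation, not linkET's output format.

```r
x <- mtcars[, 1:3]   # mpg, cyl, disp
y <- mtcars[, 4:6]   # hp, drat, wt

r_xx <- cor(x)       # one dataset: 3 x 3, symmetric, 1 on the diagonal
r_xy <- cor(x, y)    # two datasets: rows are columns of x, columns are columns of y

dim(r_xy)
rownames(r_xy)
colnames(r_xy)
```

With one input the result is square and symmetric; with two inputs it is rectangular in general, one row per column of the first dataset and one column per column of the second.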
After calculating the correlations, we can plot them. First, a single dataset:
correlate(varechem) %>%
  as_md_tbl() %>%
  qcorrplot() +
  geom_square()
Next, two datasets:
correlate(varespec[1:30], varechem) %>%
  qcorrplot() +
  geom_square()
Customize the colors:
correlate(varespec[1:30], varechem) %>%
  qcorrplot() +
  geom_square() +
  scale_fill_gradientn(colours = RColorBrewer::brewer.pal(11, "RdBu"))
The qcorrplot function can also be called directly on the correlation result to visualize the coefficient matrix:
qcorrplot(correlate(varechem)) +
  geom_square() +
  scale_fill_gradientn(colours = RColorBrewer::brewer.pal(11, "RdBu"))
The type argument restricts the plot to part of the matrix. For example, to keep only the lower triangle:
qcorrplot(correlate(varechem), type = "lower") +
  geom_square() +
  scale_fill_gradientn(colours = RColorBrewer::brewer.pal(11, "RdBu"))
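Half of the matrix is enough because the correlation matrix of a dataset with itself is symmetric, so the upper triangle repeats the lower one. A small base R sketch of what keeping the lower triangle means, again on mtcars as a stand-in (the variable names are illustrative only):

```r
r <- cor(mtcars[, 1:4])

lower_only <- r
lower_only[upper.tri(lower_only)] <- NA   # blank out the redundant upper triangle

n <- ncol(r)
n_pairs <- sum(lower.tri(r))              # distinct variable pairs below the diagonal
n_pairs == n * (n - 1) / 2                # TRUE: 6 pairs for 4 variables

round(lower_only, 2)
```

By the same formula, the 14 variables in varechem give 14 * 13 / 2 = 91 distinct pairs.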
You can also define your own style with the set_corrplot_style() and geom_shaping() functions. For example, to draw circles instead of squares and use a red, white and blue color scale:
set_corrplot_style(colours = c("red", "white", "blue"))
qcorrplot(correlate(varechem), type = "lower") +
  geom_shaping(marker = "circle")
To restore the default style, use
set_default_style()
Next, let's draw the correlation network heat map. Before drawing, we need to run a Mantel test. The package author describes mantel_test as performing Mantel and partial Mantel tests for dissimilarity matrices. Note that spec_select takes column indices, and the four groups below together cover exactly the 44 columns of varespec:
mantel <- mantel_test(varespec, varechem,
                      spec_select = list(Spec01 = 1:7,
                                         Spec02 = 8:18,
                                         Spec03 = 19:37,
                                         Spec04 = 38:44))
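For readers unfamiliar with the test, the sketch below shows the idea behind a Mantel test in base R: correlate the entries of two distance matrices, then shuffle the sites of one matrix to see how often a correlation at least as large arises by chance. It is a simplified illustration on simulated data, not linkET's implementation. The function mantel_sketch, the Euclidean distances and the simulated matrices are all assumptions made for this example.

```r
set.seed(42)
n_sites <- 24
spec <- matrix(rpois(n_sites * 5, lambda = 5), nrow = n_sites)  # simulated species counts
env  <- matrix(rnorm(n_sites * 3), nrow = n_sites)              # simulated soil variables

# Hypothetical helper: Mantel statistic with a one-sided permutation p value
mantel_sketch <- function(d_x, d_y, nperm = 999) {
  x <- as.vector(d_x)              # lower triangle of the first distance matrix
  m <- as.matrix(d_y)              # full symmetric form of the second one
  lower <- lower.tri(m)
  r_obs <- cor(x, m[lower])        # observed correlation between the distances
  r_perm <- replicate(nperm, {
    idx <- sample(nrow(m))         # shuffle site labels (rows and columns together)
    cor(x, m[idx, idx][lower])
  })
  list(r = r_obs, p = (sum(r_perm >= r_obs) + 1) / (nperm + 1))
}

res <- mantel_sketch(dist(spec), dist(env))
res$r
res$p
```

Rows and columns are permuted together because distances between sites are not independent observations; shuffling whole sites keeps each matrix internally consistent.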
After obtaining the Mantel r and p value for each group, we bin them into classes:
mantel <- mantel %>%
  mutate(rd = cut(r, breaks = c(-Inf, 0.2, 0.4, Inf),
                  labels = c("< 0.2", "0.2 - 0.4", ">= 0.4"),
                  right = FALSE),  # left-closed bins, so the labels hold exactly at 0.2 and 0.4
         pd = cut(p, breaks = c(-Inf, 0.01, 0.05, Inf),
                  labels = c("< 0.01", "0.01 - 0.05", ">= 0.05"),
                  right = FALSE))
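One detail of cut() is worth knowing here: by default it builds right-closed intervals, so a value exactly on a break falls into the lower bin, whereas right = FALSE builds left-closed intervals that agree with labels such as ">= 0.4". A small base R sketch with made-up values (r_vals is purely illustrative):

```r
r_vals <- c(0.1, 0.2, 0.4, 0.5)
breaks <- c(-Inf, 0.2, 0.4, Inf)
labs   <- c("< 0.2", "0.2 - 0.4", ">= 0.4")

# Default: intervals (-Inf, 0.2], (0.2, 0.4], (0.4, Inf)
default_bins <- cut(r_vals, breaks = breaks, labels = labs)

# right = FALSE: intervals [-Inf, 0.2), [0.2, 0.4), [0.4, Inf)
left_closed <- cut(r_vals, breaks = breaks, labels = labs, right = FALSE)

data.frame(r_vals, default_bins, left_closed)
```

Only the values sitting exactly on a break (0.2 and 0.4 here) change bins, so the two variants differ only when a statistic lands on a boundary.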
With the bins computed, we can draw the combined plot:
qcorrplot(correlate(varechem), type = "lower", diag = FALSE) +
  geom_square() +
  geom_couple(aes(colour = pd, size = rd),
              data = mantel,
              curvature = nice_curvature())
The plot can be refined further. The connecting lines are too thick and not very attractive, so let's set their widths manually (this version also switches the fill to the RdBu palette):
qcorrplot(correlate(varechem), type = "lower", diag = FALSE) +
  geom_square() +
  geom_couple(aes(colour = pd, size = rd),
              data = mantel,
              curvature = nice_curvature()) +
  scale_fill_gradientn(colours = RColorBrewer::brewer.pal(11, "RdBu")) +
  scale_size_manual(values = c(0.5, 1, 2))
Customize the line colors:
qcorrplot(correlate(varechem), type = "lower", diag = FALSE) +
  geom_square() +
  geom_couple(aes(colour = pd, size = rd),
              data = mantel,
              curvature = nice_curvature()) +
  scale_fill_gradientn(colours = RColorBrewer::brewer.pal(11, "RdBu")) +
  scale_size_manual(values = c(0.5, 1, 2)) +
  scale_colour_manual(values = color_pal(3))
Change the legend titles:
qcorrplot(correlate(varechem), type = "lower", diag = FALSE) +
  geom_square() +
  geom_couple(aes(colour = pd, size = rd),
              data = mantel,
              curvature = nice_curvature()) +
  scale_fill_gradientn(colours = RColorBrewer::brewer.pal(11, "RdBu")) +
  scale_size_manual(values = c(0.5, 1, 2)) +
  scale_colour_manual(values = color_pal(3)) +
  guides(size = guide_legend(title = "Mantel's r",
                             override.aes = list(colour = "grey35"),
                             order = 2),
         colour = guide_legend(title = "Mantel's p",
                               override.aes = list(size = 3),
                               order = 1),
         fill = guide_colorbar(title = "Pearson's r", order = 3))
Many other details can be adjusted, so I will not go through them one by one. To get the data, reply "Network heat map data" to the official account.