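An UpSet plot arranges co-occurring variables into sets and shows the frequency of each combination as a bar chart. The trick is that it makes it easy to see which sets make up each intersection. This addresses a limitation of Venn diagrams, which cannot display the data adequately when there are too many sets.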
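Its main features are:
Use perceptually effective visual encodings, that is, encodings that make the data easy to read accurately.
Visualize not only individual intersections but also combinations of intersections (for example, all intersections that involve two specific sets).
Visualize attributes of the intersections. The size of an intersection is not the only thing of interest; we also want to know whether the data associated with different intersections differ or are similar.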
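To make the idea of "intersections" and "combinations of intersections" concrete, here is a minimal base-R sketch. It assumes a toy 0/1 membership table (the data frame `membership` and its set names A, B, C are hypothetical, not from UpSetR) and counts the elements in each exclusive intersection, which is roughly what the bars of an UpSet plot represent.

```r
# Hypothetical toy data: one row per element, one 0/1 column per set.
membership <- data.frame(
  A = c(1, 1, 0, 1, 0, 1),
  B = c(1, 0, 1, 1, 0, 1),
  C = c(0, 0, 1, 1, 0, 0)
)

# Label each row with the exact combination of sets it belongs to.
combo <- apply(membership, 1, function(row) {
  members <- names(membership)[row == 1]
  if (length(members) == 0) "(none)" else paste(members, collapse = "&")
})

# Size of each exclusive intersection, largest first.
intersection_sizes <- sort(table(combo), decreasing = TRUE)
print(intersection_sizes)

# A "combination of intersections": every element that is in both A and B,
# regardless of whether it is also in C.
in_a_and_b <- sum(membership$A == 1 & membership$B == 1)
print(in_a_and_b)
```

The last value aggregates the "A&B" and "A&B&C" intersections, which is the kind of grouping the second feature above refers to.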
In R, UpSet plots can be drawn with the UpSetR package.
Installation
Install the latest released version from CRAN
install.packages("UpSetR")
Download the latest development code of UpSetR from GitHub using devtools with
devtools::install_github("hms-dbmi/UpSetR")
Examples
library(UpSetR)
# Load the two example datasets that ship with UpSetR
movies <- read.csv(system.file("extdata", "movies.csv", package = "UpSetR"), header = TRUE, sep = ";")
mutations <- read.csv(system.file("extdata", "mutations.csv", package = "UpSetR"), header = TRUE, sep = ",")
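The movies dataset was created by the GroupLens lab and curated by Bilal Alsallakh. The mutations dataset was originally created by the TCGA Consortium and represents the 100 most frequently mutated genes in a glioblastoma multiforme cohort.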
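The bundled CSV files already store set membership as one 0/1 column per set. When your own data starts out as named lists of elements instead, UpSetR's `fromList()` can convert it to that layout. The sketch below is only an illustration: the list `gene_sets` and its contents are made up, and the call is guarded so the snippet still runs when UpSetR is not installed.

```r
# Hypothetical input: each list entry is one set, given by its element names.
gene_sets <- list(
  setA = c("g1", "g2", "g3", "g5"),
  setB = c("g2", "g3", "g4"),
  setC = c("g3", "g5", "g6")
)

if (requireNamespace("UpSetR", quietly = TRUE)) {
  library(UpSetR)
  # Convert the lists to a 0/1 data frame with one column per set.
  binary <- fromList(gene_sets)
  # Draw the plot, ordering intersections by size.
  upset(binary, order.by = "freq")
}
```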
upset(movies,
      attribute.plots = list(gridrows = 60,
                             plots = list(list(plot = scatter_plot, x = "ReleaseDate", y = "AvgRating"),
                                          list(plot = scatter_plot, x = "ReleaseDate", y = "Watches"),
                                          list(plot = scatter_plot, x = "Watches", y = "AvgRating"),
                                          list(plot = histogram, x = "ReleaseDate")),
                             ncols = 2))
upset(mutations, sets = c("PTEN", "TP53", "EGFR", "PIK3R1", "RB1"), sets.bar.color = "#56B4E9",
order.by = "freq", empty.intersections = "on")
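An example that uses two set queries (war movies and noir movies), together with attribute plots that show the average rating (top) and the average rating against the number of times a movie was watched (bottom).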
upset(movies,
      attribute.plots = list(gridrows = 100, ncols = 1,
                             plots = list(list(plot = histogram, x = "AvgRating", queries = TRUE),
                                          list(plot = scatter_plot, y = "AvgRating", x = "Watches", queries = TRUE))),
      sets = c("Action", "Adventure", "Children", "War", "Noir"),
      queries = list(list(query = intersects, params = list("War"), active = TRUE),
                     list(query = intersects, params = list("Noir"))))
For more detailed plotting options, see: Chapter 8 UpSet plot | ComplexHeatmap Complete Reference (jokergoo.github.io)
Reference:
Visualizing Intersecting Sets (jku-vds-lab.at)
http://www.nature.com/nmeth/journal/v11/n8/abs/nmeth.3033.html
If you use the UpSetR package, please cite:
Jake R Conway, Alexander Lex, Nils Gehlenborg. UpSetR: An R Package for the Visualization of Intersecting Sets and their Properties. doi: https://doi.org/10.1093/bioinformatics/btx364