The corrgram() function in the corrgram package displays a correlation matrix graphically.
Code sample
options(digits=2)                  # print numbers with two significant digits
cor(mtcars)                        # the numeric correlation matrix
install.packages("corrgram")       # needed only once
library(corrgram)
corrgram(mtcars, order=TRUE, lower.panel=panel.shade,
         upper.panel=panel.pie, text.panel=panel.txt,
         main="Corrgram of mtcars intercorrelations")
Start with the lower triangle of cells, those below the main diagonal. By default, blue shading with hatching that runs from lower left to upper right marks a positive correlation between the two variables that meet at that cell.
Conversely, red shading with hatching that runs from upper left to lower right marks a negative correlation. The darker and more saturated the color, the stronger the correlation.
In this figure the rows and columns of the matrix have been reordered so that variables with similar correlation patterns sit together.
The upper triangle of cells shows the same information as pie charts. Color plays the same role as above, and the strength of the correlation is shown by the size of the filled pie slice.
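To connect the colors to numbers, it can help to look up a couple of entries in the correlation matrix directly. The sketch below is only illustrative; the two variable pairs (mpg with wt, cyl with disp) are picked here as examples and are not singled out in the text above.

```r
# Pairwise correlations for the built-in mtcars data set
r <- cor(mtcars)

# mpg vs. wt: a negative correlation, so its cell is drawn in red
r_mpg_wt <- r["mpg", "wt"]

# cyl vs. disp: a positive correlation, so its cell is drawn in blue
r_cyl_disp <- r["cyl", "disp"]

print(round(c(mpg_wt = r_mpg_wt, cyl_disp = r_cyl_disp), 2))
```

Both values are large in magnitude, so both cells should appear strongly saturated in the corrgram.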
The corrgram() function has the following format:
corrgram(x, order=, panel=, text.panel=, diag.panel=)
where x is a data frame with one observation per row.
When order=TRUE, the variables are reordered using a principal component analysis of the correlation matrix, which makes patterns in the bivariate relationships easier to see.
The panel option sets the type of off-diagonal panel to use. Alternatively, lower.panel and upper.panel set the panel types below and above the main diagonal, respectively.
The text.panel and diag.panel options control the panels on the main diagonal.
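The idea behind order=TRUE can be sketched in base R. The code below is an assumption-laden illustration, not corrgram's own implementation: it places each variable in the plane of the first two principal components of the correlation matrix and sorts the variables by angle, so that variables pointing in similar directions end up next to each other. The names ev, alpha, ord, and r_ordered are made up for this example, and the resulting order may differ from the one corrgram produces.

```r
# Correlation matrix of the built-in mtcars data
r <- cor(mtcars)

# Loadings of each variable on the first two principal components
ev <- eigen(r)$vectors[, 1:2]

# Angle of each variable in the plane of those two components
alpha <- atan2(ev[, 2], ev[, 1])

# Sort variables by angle and reorder rows and columns together
ord <- order(alpha)
r_ordered <- r[ord, ord]

print(colnames(r_ordered))
```

Reordering rows and columns with the same permutation keeps the matrix symmetric and leaves every correlation value unchanged; only the layout changes.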
Panel options for the corrgram() function
Off-diagonal
panel.pie: the filled portion of the pie represents the size of the correlation
panel.shade: the depth of the shading represents the size of the correlation
panel.ellipse: draws a confidence ellipse and a smoothed curve
panel.pts: draws a scatter plot
panel.conf: prints the correlation and its confidence interval
Main diagonal
panel.txt: prints the variable name
panel.minmax: prints the minimum and maximum values of the variable along with the variable name
panel.density: draws a kernel density curve and prints the variable name
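Two of the panels listed above, panel.conf and panel.density, can be combined as in the following sketch. The particular combination and the plot title are choices made for this illustration only, and the corrgram package is assumed to be installed.

```r
library(corrgram)

# Lower triangle: correlation with its confidence interval
# Upper triangle: pies
# Diagonal: kernel density curve plus the variable name
corrgram(mtcars, order = TRUE,
         lower.panel = panel.conf,
         upper.panel = panel.pie,
         diag.panel  = panel.density,
         text.panel  = panel.txt,
         main = "mtcars correlations with confidence intervals")
```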
Second code sample
corrgram(mtcars, order=TRUE, lower.panel=panel.ellipse,
         upper.panel=panel.pts, text.panel=panel.txt, diag.panel=panel.minmax,
         main="Corrgram of mtcars intercorrelations")
The figure shows a corrgram of the correlations among the variables in the mtcars data frame. The lower triangle contains smoothed fit curves and confidence ellipses, and the upper triangle contains scatter plots. The main diagonal shows the minimum and maximum of each variable.
Third code sample
corrgram(mtcars, order=TRUE, lower.panel=panel.shade,
         upper.panel=NULL, text.panel=panel.txt,
         main="Corrgram of mtcars intercorrelations")
11.4 Mosaic plots
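Mosaic plots are used to examine the distribution of more than two categorical variables.
In a mosaic plot, the areas of the nested rectangles are proportional to the cell frequencies of a multidimensional contingency table.
Color or shading can represent the residuals from a fitted model.
The mosaic() function in the vcd package draws mosaic plots.
The mosaicplot() function in the base R installation can also draw mosaic plots, but it offers fewer features than mosaic() in the vcd package.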
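For comparison, a base R version needs no extra package. The following is a minimal sketch using the built-in Titanic table; the title text is made up for this example.

```r
# Base R mosaic plot of the four-way Titanic table;
# shade = TRUE colors the tiles by residuals from a log-linear model
titanic_dims <- dim(Titanic)
mosaicplot(Titanic, shade = TRUE,
           main = "Titanic (base R mosaicplot)")
```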
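The example uses the Titanic data set from the base installation. It gives the number of passengers who survived or died, cross-classified by class, sex, and age group.
Data: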
ftable(Titanic)
                   Survived  No Yes
Class Sex    Age
1st   Male   Child            0   5
             Adult          118  57
      Female Child            0   1
             Adult            4 140
2nd   Male   Child            0  11
             Adult          154  14
      Female Child            0  13
             Adult           13  80
3rd   Male   Child           35  13
             Adult          387  75
      Female Child           17  14
             Adult           89  76
Crew  Male   Child            0   0
             Adult          670 192
      Female Child            0   0
             Adult            3  20
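The mosaic() function can be called as
mosaic(table)
where table is a contingency table in array form. Alternatively, use
mosaic(formula, data=)
where formula is a standard R formula and data specifies a data frame or a table. Adding shade=TRUE colors the plot according to the Pearson residuals from a fitted model.
Adding legend=TRUE displays a legend for the residuals.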
Code sample
library(vcd)
mosaic(Titanic, shade=TRUE, legend=TRUE)
# or, equivalently
mosaic(~Class+Sex+Age+Survived, data=Titanic, shade=TRUE, legend=TRUE)
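The formula interface also makes it easy to plot only some of the variables. The sketch below, which assumes the vcd package is installed, restricts the plot to Class and Survived; the choice of these two variables is just for illustration.

```r
library(vcd)

# Two-way view: the formula collapses the table over Sex and Age
mosaic(~ Class + Survived, data = Titanic, shade = TRUE, legend = TRUE)

# The same two-way table built explicitly, passed in array form
class_by_survival <- margin.table(Titanic, c(1, 4))
mosaic(class_by_survival, shade = TRUE, legend = TRUE)
```

With fewer variables the tiles are larger and easier to read, at the cost of hiding how sex and age interact with survival.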
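A mosaic plot carries a great deal of information. For example: (1) the survival rate rises sharply moving from the crew to first class; (2) most children were in third and second class; (3) most women in first class survived, whereas only about half of the women in third class did; (4) there were few women among the crew.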
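These readings of the plot can be checked against the raw counts in base R. The following is a small sketch; the object names (rate_by_class, rate_women, children, crew_women) are invented for this example.

```r
# Survival rate by class, collapsing over Sex and Age
by_class <- margin.table(Titanic, c(1, 4))
rate_by_class <- prop.table(by_class, 1)[, "Yes"]

# Survival rate of women by class
women <- margin.table(Titanic[, "Female", , ], c(1, 3))
rate_women <- prop.table(women, 1)[, "Yes"]

# Number of children in each class, and number of women in the crew
children <- margin.table(Titanic[, , "Child", ], 1)
crew_women <- sum(Titanic["Crew", "Female", , ])

print(round(rate_by_class, 2))
print(round(rate_women, 2))
print(children)
print(crew_women)
```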
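An extended mosaic plot adds color and shading to represent the residuals from a fitted model. In this example, blue shading marks cross-classifications that occur more often than expected under the assumption that survival is unrelated to class, sex, and age group; red shading marks those that occur less often than expected.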
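The residuals behind the shading can be approximated in base R. The sketch below assumes the fitted model is mutual independence of all four variables, which is an assumption made here for illustration; it fits that model with loglin() and computes Pearson residuals by hand. The names fit, expected, and pearson are made up for this example.

```r
# Fit the mutual-independence model to the four-way table
fit <- loglin(Titanic, margin = list(1, 2, 3, 4),
              fit = TRUE, print = FALSE)
expected <- fit$fit

# Pearson residual for each cell: (observed - expected) / sqrt(expected)
pearson <- (Titanic - expected) / sqrt(expected)

# Positive values would be shaded blue, negative values red
print(round(pearson["1st", "Female", "Adult", ], 1))
print(round(pearson["3rd", "Male", "Adult", ], 1))
```

Adult women in first class who survived have a large positive residual, while adult men in third class who survived have a negative one, which matches the direction of the shading described above.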
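Summary:
The corrgram() function in the corrgram package displays a correlation matrix graphically.
Mosaic plots are used to examine the distribution of more than two categorical variables.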