In R, histograms can be drawn with hist, plot, ggplot2, and other tools.
hist
Syntax
hist(x, breaks = "Sturges",
freq = NULL, probability = !freq,
include.lowest = TRUE, right = TRUE,
density = NULL, angle = 45, col = NULL, border = NULL,
main = paste("Histogram of" , xname),
xlim = range(breaks), ylim = NULL,
xlab = xname, ylab,
axes = TRUE, plot = TRUE, labels = FALSE,
nclass = NULL, warn.unused = TRUE, ...)
Parameters
x – a vector of values for which the histogram is desired.
breaks – one of the following:
- a vector giving the breakpoints between histogram cells
- a function to compute the vector of breakpoints
- a single number giving the number of cells for the histogram
- a character string naming an algorithm to compute the number of cells
- a function to compute the number of cells
In the last three cases the number is only a suggestion, because the breakpoints are set to pretty values.
freq – logical.
- TRUE – the histogram graphic is a representation of frequencies, the counts component of the result
- FALSE – probability densities, component density, are plotted (so that the histogram has a total area of one)
probability – an alias for !freq, for S compatibility.
include.lowest – logical; if TRUE, an x[i] equal to the breaks value will be included in the first (or last, for right = FALSE) bar. This will be ignored (with a warning) unless breaks is a vector.
right – logical; if TRUE, the histogram cells are right-closed (left open) intervals.
density – the density of shading lines, in lines per inch. The default value of NULL means that no shading lines are drawn. Non-positive values of density also inhibit the drawing of shading lines.
angle – the slope of shading lines, given as an angle in degrees (counter-clockwise).
col – a colour to be used to fill the bars. The default of NULL yields unfilled bars.
border – the color of the border around the bars. The default is to use the standard foreground color.
main, xlab, ylab – these arguments to title have useful defaults here.
xlim, ylim – the range of x and y values with sensible defaults. Note that xlim is not used to define the histogram (breaks), but only for plotting (when plot = TRUE).
axes – logical. If TRUE (default), axes are drawn if the plot is drawn.
plot – logical. If TRUE (default), a histogram is plotted, otherwise a list of breaks and counts is returned. In the latter case, a warning is used if (typically graphical) arguments are specified that only apply to the plot = TRUE case.
labels – logical or character string. Additionally draw labels on top of bars, if not FALSE; see plot.histogram.
nclass – numeric (integer). For S(-PLUS) compatibility only, nclass is equivalent to breaks for a scalar or character argument.
warn.unused – logical. If plot = FALSE and warn.unused = TRUE, a warning will be issued when graphical parameters are passed to hist.default().
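The different forms accepted by breaks can be compared without drawing anything, by calling hist with plot = FALSE and inspecting the returned object. The following is a minimal sketch; the seed, the sample data, and the variable names (h_vec, h_num, h_str, h_fun) are illustrative assumptions rather than part of the original text.

```r
set.seed(1)  # assumed seed, only so that the sketch is reproducible
x <- rnorm(n = 1000, mean = 24.2, sd = 2.2)

# breaks as a vector of breakpoints: used exactly as given
# (the vector has to cover the whole range of x)
h_vec <- hist(x, breaks = seq(floor(min(x)), ceiling(max(x)), by = 1), plot = FALSE)

# breaks as a single number: a suggestion only
h_num <- hist(x, breaks = 30, plot = FALSE)

# breaks as a character string naming an algorithm
h_str <- hist(x, breaks = "Scott", plot = FALSE)

# breaks as a function that returns the number of cells
h_fun <- hist(x, breaks = function(v) ceiling(sqrt(length(v))), plot = FALSE)

# Number of cells actually used in each case
sapply(list(vector = h_vec, number = h_num, string = h_str, fun = h_fun),
       function(h) length(h$counts))
```

Only the vector form fixes the breakpoints exactly; in the other cases the number of cells that comes back may differ from the number requested.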
Examples
Example 1 – using hist
# Prepare the data
data <- rnorm(n = 1000, mean = 24.2, sd = 2.2)
# Draw the histogram
hist(data)
Example 2 – using hist - adjusting the number of bins
# Prepare the data
data <- rnorm(n = 1000, mean = 24.2, sd = 2.2)
# Draw the histogram
hist(data, breaks=30)
Example 3 – using hist - density histogram
# Prepare the data
data <- rnorm(n = 1000, mean = 24.2, sd = 2.2)
# Draw the histogram
hist(data, freq=FALSE)
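The statement that a density histogram has a total area of one can be checked numerically from the object that hist returns. This is a small sketch under assumed names (bin_width, area) and an assumed seed; it does not draw anything.

```r
set.seed(1)  # assumed seed, only so that the sketch is reproducible
data <- rnorm(n = 1000, mean = 24.2, sd = 2.2)

# Compute the histogram without plotting it
h <- hist(data, plot = FALSE)

# Area of each bar is its height (density) times its width
bin_width <- diff(h$breaks)
area <- sum(h$density * bin_width)
all.equal(area, 1)

# Each density value is the count divided by (sample size * bin width)
all.equal(h$density, h$counts / (length(data) * bin_width))
```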
Example 4 – using hist - density histogram + density curve
# Prepare the data
data <- rnorm(n = 1000, mean = 24.2, sd = 2.2)
# Draw the histogram
hist( data, freq = FALSE, ylim = c(0, 0.2))
curve(dnorm(x, mean=mean(data), sd=sd(data)), col = 2, lty = 2, lwd = 2, add = TRUE)
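The curve in Example 4 is a normal density with the sample mean and standard deviation, which suits data known to be roughly normal. When that is not known, a kernel density estimate from density() can be overlaid instead. The sketch below is one possible way to do it; the seed and the variable names d and h are assumptions.

```r
set.seed(1)  # assumed seed, only so that the sketch is reproducible
data <- rnorm(n = 1000, mean = 24.2, sd = 2.2)

# Kernel density estimate of the sample
d <- density(data)

# Compute the bars first so that ylim can fit both the bars and the curve
h <- hist(data, plot = FALSE)

hist(data, freq = FALSE, ylim = c(0, max(h$density, d$y)))
lines(d, col = 4, lwd = 2)  # overlay the estimated density curve
```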
Example 5 – using hist - frequency histogram + colors
# Prepare the data
data <- rnorm(n = 1000, mean = 24.2, sd = 2.2)
# Draw the histogram
colors <- c("red", "yellow", "green", "violet", "orange", "blue", "pink", "cyan")
hist(data, right = FALSE, col = colors, main = "Data Distribution", xlab = "x data")
Example 6 – using hist - frequency histogram + labels
# Prepare the data
data <- rnorm(n = 1000, mean = 24.2, sd = 2.2)
# Draw the histogram
colors <- c("red", "yellow", "green", "violet", "orange", "blue", "pink", "cyan")
h <- hist(data, right = FALSE, col = colors, main = "Data Distribution", xlab = "x data")
text(h$mids,h$counts,labels=h$counts, adj=c(0.5, -0.5))
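The labels argument described in the parameter list gives a similar result without a separate text() call. The following sketch assumes a seed and a 10% headroom factor for the y axis so that the label on the tallest bar is not clipped; both are illustrative choices.

```r
set.seed(1)  # assumed seed, only so that the sketch is reproducible
data <- rnorm(n = 1000, mean = 24.2, sd = 2.2)

# Compute the counts first so the y axis can leave room for the labels
h <- hist(data, right = FALSE, plot = FALSE)

# labels = TRUE draws the count on top of each bar
hist(data, right = FALSE, labels = TRUE,
     ylim = c(0, max(h$counts) * 1.1),
     col = "skyblue", main = "Data Distribution", xlab = "x data")
```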
Example 7 – using hist - two density histograms
# Prepare the data
data1 <- rnorm(n = 1000, mean = 24.2, sd = 2.2)
data2 <- rnorm(n = 1000, mean = 34.2, sd = 2.2)
# Draw the histograms
hist(data1, freq = FALSE, ylim = c(0, 0.20), xlim = c(15, 45), col = 'skyblue', main = "Histogram of Data", xlab = "data")
hist(data2, freq = FALSE, ylim = c(0, 0.20), add = TRUE, col = 'red')
Example 8 – using hist - two density histograms + transparent color
# Prepare the data
data1 <- rnorm(n = 1000, mean = 24.2, sd = 2.2)
data2 <- rnorm(n = 1000, mean = 34.2, sd = 2.2)
# Draw the histograms
hist(data1, freq = FALSE, ylim = c(0, 0.20), xlim = c(15, 45), border = TRUE, col = 'skyblue', main = "Histogram of Data", xlab = "data")
hist(data2, freq = FALSE, ylim = c(0, 0.20), add = TRUE, border = TRUE, col = rgb(0, 1, 0, 0.5))
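When two histograms are overlaid, each call to hist chooses its own breakpoints, so the bars of the two samples need not line up. One possible refinement, sketched here under assumptions, is to compute a single set of breakpoints for both samples and make both fill colours semi-transparent with adjustcolor(). The seed, the choice of about 30 cells, and the names shared_breaks, col1, and col2 are illustrative.

```r
set.seed(1)  # assumed seed, only so that the sketch is reproducible
data1 <- rnorm(n = 1000, mean = 24.2, sd = 2.2)
data2 <- rnorm(n = 1000, mean = 34.2, sd = 2.2)

# One set of breakpoints covering both samples, so the bars align
shared_breaks <- pretty(range(c(data1, data2)), n = 30)

# Semi-transparent versions of named colours
col1 <- adjustcolor("skyblue", alpha.f = 0.5)
col2 <- adjustcolor("red", alpha.f = 0.5)

h1 <- hist(data1, breaks = shared_breaks, plot = FALSE)
h2 <- hist(data2, breaks = shared_breaks, plot = FALSE)

# Draw both on the density scale; ylim is derived from the data
plot(h1, freq = FALSE, col = col1,
     ylim = c(0, max(h1$density, h2$density)),
     main = "Histogram of Data", xlab = "data")
plot(h2, freq = FALSE, col = col2, add = TRUE)
```

Because both objects share the same breaks, the default x range of the first plot already covers the second sample.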
plot
Examples
Example 9 – using plot - two frequency histograms
# Prepare the data
data1 <- rnorm(n = 1000, mean = 24.2, sd = 2.2)
data2 <- rnorm(n = 1000, mean = 34.2, sd = 2.2)
p1 <- hist(data1, plot = FALSE)
p2 <- hist(data2, plot = FALSE)
# Draw the histograms
plot(0,0,type="n",xlim=c(15,45),ylim=c(0,200),xlab="x",ylab="freq",main="Two histograms")
plot(p1,col="green",density=10,angle=135,add=TRUE)
plot(p2,col="blue",density=10,angle=45,add=TRUE)
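Example 9 hard-codes the axis limits (15 to 45 and 0 to 200), which fits this particular sample. As a hedged alternative, the limits can be derived from the two histogram objects so the plot still fits if the data change. The seed and the names xlim and ylim below are assumptions made for this sketch.

```r
set.seed(1)  # assumed seed, only so that the sketch is reproducible
data1 <- rnorm(n = 1000, mean = 24.2, sd = 2.2)
data2 <- rnorm(n = 1000, mean = 34.2, sd = 2.2)
p1 <- hist(data1, plot = FALSE)
p2 <- hist(data2, plot = FALSE)

# Axis limits taken from the histogram objects instead of fixed numbers
xlim <- range(p1$breaks, p2$breaks)
ylim <- c(0, max(p1$counts, p2$counts))

# The first plot sets up the axes, the second is added on top
plot(p1, col = "green", density = 10, angle = 135,
     xlim = xlim, ylim = ylim, xlab = "x", main = "Two histograms")
plot(p2, col = "blue", density = 10, angle = 45, add = TRUE)
```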
Example 10 – using plot - two density curves
# Prepare the data
data1 <- rnorm(n = 1000, mean = 24.2, sd = 2.2)
data2 <- rnorm(n = 1000, mean = 34.2, sd = 2.2)
# Compute the kernel density estimates
densdata1 <- density(data1)
densdata2 <- density(data2)
# Axis limits that cover both curves
xlim <- range(densdata2$x, densdata1$x)
ylim <- range(0, densdata2$y, densdata1$y)
# Pick semi-transparent fill colours
data1Col <- rgb(1, 0, 0, 0.2)
data2Col <- rgb(0, 0, 1, 0.2)
# Draw the first curve with a background grid
plot(densdata1, xlim = xlim, ylim = ylim, xlab = 'data',
main = 'Distribution of data1 and data2',
panel.first = grid())
# Fill the area under both curves
polygon(densdata1, density = -1, col = data1Col)
polygon(densdata2, density = -1, col = data2Col)
# Legend
legend('topleft', c('data1', 'data2'),
fill = c(data1Col, data2Col), bty = 'n',
border = NA)
ggplot2
Examples
Example 11 – using ggplot2 - density curve
Installation
install.packages("ggplot2")
library(ggplot2)
# Prepare the data
data <- rnorm(n = 1000, mean = 24.2, sd = 2.2)
# Density curve
ggplot(data=NULL, aes(x=data)) + geom_density()
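Example 11 draws only the density curve. To get the ggplot2 counterpart of a density histogram with a curve on top, geom_histogram can be placed on the density scale and combined with geom_density. This is a sketch under assumptions: it relies on after_stat(), which needs ggplot2 3.3.0 or later, and the data frame df, its column value, the seed, and the choice of 30 bins are illustrative.

```r
library(ggplot2)

set.seed(1)  # assumed seed, only so that the sketch is reproducible
df <- data.frame(value = rnorm(n = 1000, mean = 24.2, sd = 2.2))

p <- ggplot(df, aes(x = value)) +
  # Bars on the density scale instead of raw counts
  geom_histogram(aes(y = after_stat(density)), bins = 30,
                 fill = "skyblue", colour = "black") +
  # Kernel density estimate drawn over the bars
  geom_density(colour = "red")

print(p)
```

Putting the data in a data frame, rather than referring to a free-standing vector inside aes(), keeps the plot self-contained.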
Example 12 – using ggplot2 - two density curves
library(ggplot2)
# Prepare the data
data1 <- data.frame( length = rnorm(n = 1000, mean = 24.2, sd = 2.2) )
data2 <- data.frame( length = rnorm(n = 1000, mean = 34.2, sd = 2.2) )
data1$veg <- 'A'
data2$veg <- 'B'
vegLengths <- rbind(data1, data2)
ggplot(vegLengths, aes(length, fill = veg)) + geom_density(alpha = 0.2)
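The same long-format data frame can also be drawn as two overlapping histograms rather than density curves. In the sketch below, position = "identity" keeps the two groups from being stacked and alpha makes the overlap visible; the seed, the 40 bins, and the alpha value are assumptions chosen for illustration.

```r
library(ggplot2)

set.seed(1)  # assumed seed, only so that the sketch is reproducible
vegLengths <- rbind(
  data.frame(length = rnorm(n = 1000, mean = 24.2, sd = 2.2), veg = "A"),
  data.frame(length = rnorm(n = 1000, mean = 34.2, sd = 2.2), veg = "B")
)

p <- ggplot(vegLengths, aes(x = length, fill = veg)) +
  # position = "identity" overlays the groups instead of stacking them
  geom_histogram(position = "identity", alpha = 0.5, bins = 40)

print(p)
```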