Drawing bar charts with the R ggplot2 package

For work, I need to draw a line chart with ggplot2. Once the project is over, I plan to study plotting with the ggplot2 package properly. For now, I am reposting two articles that helped me a lot.

Introduction

The plots produced by ggplot2 are of undoubted quality, but its plotting grammar is still a bit hard for novices. The book ggplot2: Data Analysis and Graphic Art introduces the basic philosophy and operations of the package, but I personally feel its examples are not rich enough, so my own use of the package stayed at a half-baked level. Then one day I came across a book of worked examples for ggplot2 and was overjoyed. The English edition is called R Graphics Cookbook, and the Chinese edition is called R Data Visualization Manual.

Basic operation

First, I want to show how to draw a bar chart with ggplot2. The basic usage is as follows:

install.packages('gcookbook')  # datasets from the book R Graphics Cookbook
library(ggplot2)
library(gcookbook)
ggplot(pg_mean, aes(x=group, y=weight)) + geom_bar(stat="identity")

Figure 1 Bar chart
The ggplot() function creates the plot object, aes() maps variables to the x and y axes (x is usually a categorical variable), and geom_bar() adds the bar layer.
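As a side note, ggplot2 2.2.0 and later also provide geom_col(), which is a shorthand for geom_bar(stat="identity"). The sketch below is only an illustration: it uses a small made-up data frame called toy (not pg_mean) so that it runs without gcookbook, and it reads the bar heights back with layer_data() to show that both calls give the same result.

```r
library(ggplot2)

# Hypothetical data standing in for pg_mean: one row per group (values made up)
toy <- data.frame(group = c("ctrl", "trt1", "trt2"),
                  weight = c(5.0, 4.7, 5.5))

# Both calls take the bar height directly from the weight column
p_bar <- ggplot(toy, aes(x = group, y = weight)) + geom_bar(stat = "identity")
p_col <- ggplot(toy, aes(x = group, y = weight)) + geom_col()

# layer_data() returns the values ggplot2 computed for a layer
heights_bar <- layer_data(p_bar)$y
heights_col <- layer_data(p_col)$y
```

Printing p_bar or p_col draws the chart; the two should look identical.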

If x is numeric rather than categorical, the bar chart looks a bit different. When x is treated as a continuous variable, the x axis covers every value between the minimum and the maximum, so a value that is missing from the data leaves an empty slot. To get exactly one bar per observed value, convert the numeric variable to a factor. See the code for examples.

BOD  # BOD dataset; the Time variable has no value 6
# Time demand
#    1    8.3
#    2   10.3
#    3   19.0
#    4   16.0
#    5   15.6
#    7   19.8
# Map Time directly, without converting it to a factor
ggplot(BOD, aes(x=Time, y=demand)) + geom_bar(stat="identity")
# Convert Time to a factor first
ggplot(BOD, aes(x=factor(Time), y=demand)) + geom_bar(stat="identity")

Figure 2
Figure 3
The figures show clearly the difference between mapping a continuous variable directly to x and converting it to a factor first.
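To see where the difference comes from, one option is to inspect the x positions ggplot2 computes for the bars. The following is a minimal sketch, assuming a data frame named toy that simply copies the BOD values listed above (the name and lower-case column names are my own); layer_data() is used to read the positions back.

```r
library(ggplot2)

# Copy of the BOD values shown above; note there is no row for time = 6
toy <- data.frame(time = c(1, 2, 3, 4, 5, 7),
                  demand = c(8.3, 10.3, 19.0, 16.0, 15.6, 19.8))

# Continuous x: bars sit at their numeric values, so position 6 stays empty
p_cont <- ggplot(toy, aes(x = time, y = demand)) + geom_bar(stat = "identity")

# Discrete x: one slot per factor level, so the bars for 5 and 7 are adjacent
p_fact <- ggplot(toy, aes(x = factor(time), y = demand)) + geom_bar(stat = "identity")

time_levels <- levels(factor(toy$time))     # the six observed values, without "6"
x_cont <- as.numeric(layer_data(p_cont)$x)  # positions on the continuous axis
x_fact <- as.numeric(layer_data(p_fact)$x)  # slot indices on the discrete axis
```

Comparing x_cont with x_fact shows the gap at 6 in the first case and evenly spaced slots in the second.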

Adjust color

By default the bars are filled with dark grey and have no outline. We can change both the fill color and the border color of the bars.

ggplot(pg_mean, aes(x=group, y=weight)) +
  geom_bar(stat="identity", fill="lightblue", colour="black")

Figure 4

Grouped bar chart

Sometimes we want to compare two different kinds of objects within the same data. For this we can use a grouped bar chart. We use the cabbage_exp data set, which contains data for two different cultivars.

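cabbage_exp  # the data set
# Cultivar Date Weight
# c39      d16  3.18
# c39      d20  2.80
# c39      d21  2.74
# c52      d16  2.26
# c52      d20  3.11
# c52      d21  1.47
# stat="identity" uses the y values as bar heights; with stat="count"
# (called "bin" in old ggplot2 versions) the height would be a row count instead.
# position="dodge" places the bars of each group side by side instead of stacking them.
ggplot(cabbage_exp, aes(x=Date, y=Weight, fill=Cultivar)) +
  geom_bar(stat="identity", position="dodge")
# Without position="dodge" the bars are stacked
ggplot(cabbage_exp, aes(x=Date, y=Weight, fill=Cultivar)) +
  geom_bar(stat="identity")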

Figure 5
Figure 12 Stacked grouping
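The difference between the two layouts can also be checked numerically. The sketch below is an illustration only: toy is a made-up data frame whose column names mirror cabbage_exp but whose values are invented, and layer_data() is used to read the top of the tallest bar in each layout.

```r
library(ggplot2)

# Hypothetical two-group data (column names mirror cabbage_exp, values are made up)
toy <- data.frame(Date = rep(c("d16", "d20"), times = 2),
                  Cultivar = rep(c("a", "b"), each = 2),
                  Weight = c(3, 2, 1, 4))

base <- ggplot(toy, aes(x = Date, y = Weight, fill = Cultivar))
p_dodge <- base + geom_bar(stat = "identity", position = "dodge")
p_stack <- base + geom_bar(stat = "identity")  # position = "stack" is the default

# Top of the tallest bar in each layout
top_dodge <- max(layer_data(p_dodge)$ymax)  # largest single Weight
top_stack <- max(layer_data(p_stack)$ymax)  # largest per-Date total
```

With dodging, every bar starts at zero, so the tallest bar equals the largest single value. With stacking, the bars of one Date sit on top of each other, so the tallest column equals the largest per-Date sum.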

Frequency bar chart

A frequency bar chart does not need a y variable. The bar heights are computed automatically from how often each value of x occurs.

# stat="count" is the default for geom_bar() (it was stat="bin" in old ggplot2 versions), so it can be omitted
ggplot(diamonds, aes(x=cut)) + geom_bar()

Figure 6
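As a quick check of what the counting does, the bar heights can be compared with an ordinary frequency table. This is a small sketch using the diamonds data set that ships with ggplot2; the variable names bar_counts and tab_counts are my own.

```r
library(ggplot2)

# With no y aesthetic, geom_bar() counts the rows for each level of cut
p <- ggplot(diamonds, aes(x = cut)) + geom_bar()

# The computed bar heights should match a plain frequency table of the same column
bar_counts <- layer_data(p)$count
tab_counts <- as.vector(table(diamonds$cut))
```

If the two vectors agree, the chart is showing the same numbers that table(diamonds$cut) would print.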

Color positive and negative bars differently

csub <- subset(climate, Source=="Berkeley" & Year >= 1900)
csub$pos <- csub$Anomaly10y >= 0  # pos is a logical variable, TRUE or FALSE
# position="identity" avoids the warning that stacking is not well defined for negative values
ggplot(csub, aes(x=Year, y=Anomaly10y, fill=pos)) +
  geom_bar(stat="identity", position="identity")

Figure 7
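The example above leaves the two fill colors to the default palette. If specific colors are wanted, one possible approach is scale_fill_manual(). The sketch below assumes a made-up data frame toy in place of csub, and the two hex colors are arbitrary choices; guide="none" hides the legend in recent ggplot2 versions (older versions used guide=FALSE).

```r
library(ggplot2)

# Hypothetical anomaly values standing in for csub (made up for illustration)
toy <- data.frame(Year = 2001:2004,
                  Anomaly = c(-0.2, 0.1, 0.3, -0.1))
toy$pos <- toy$Anomaly >= 0

p <- ggplot(toy, aes(x = Year, y = Anomaly, fill = pos)) +
  geom_bar(stat = "identity", position = "identity", colour = "black") +
  # Pick the two fill colors explicitly; guide = "none" hides the legend
  scale_fill_manual(values = c("FALSE" = "#CCEEFF", "TRUE" = "#FFDDDD"),
                    guide = "none")

built <- layer_data(p)  # computed layer data, including the fill actually used
```

Naming the entries of values by "FALSE" and "TRUE" ties each color to a value of pos rather than to the order in which the values appear.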

Adjust bar width and spacing

Use the width argument of geom_bar() to adjust the bar width. The default value is 0.9. We can change the value to make the bars wider or narrower.

ggplot(pg_mean, aes(x=group, y=weight)) + geom_bar(stat="identity", width=0.5)  # narrow bars
ggplot(pg_mean, aes(x=group, y=weight)) + geom_bar(stat="identity", width=1)  # wide bars

Figure 8 Narrow bars
Figure 9 Wide bars
In a grouped bar chart there is, by default, no gap between the bars within a group. To change the distance between those bars, set position=position_dodge() with a dodge width.

# The larger the dodge value, the further apart the bars within a group
ggplot(cabbage_exp, aes(x=Date, y=Weight, fill=Cultivar)) +
  geom_bar(stat="identity", width=0.5, position=position_dodge(0.6))

Figure 10
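How width and position_dodge() interact can be worked out from the computed bar positions. The following is a sketch under my own assumptions: toy is a made-up data frame with two groups per Date, and the expected numbers follow from my reading of the dodge arithmetic (with two groups, width=0.5 gives bars 0.25 wide and position_dodge(0.6) puts their centres 0.3 apart, leaving a small gap).

```r
library(ggplot2)

# Hypothetical two-group data (column names mirror cabbage_exp, values are made up)
toy <- data.frame(Date = rep(c("d16", "d20"), times = 2),
                  Cultivar = rep(c("a", "b"), each = 2),
                  Weight = c(3, 2, 1, 4))

p <- ggplot(toy, aes(x = Date, y = Weight, fill = Cultivar)) +
  geom_bar(stat = "identity", width = 0.5, position = position_dodge(0.6))

# Look at the two bars drawn at the first Date
d <- layer_data(p)
d$x <- as.numeric(d$x)
first <- d[round(d$x) == 1, ]
first <- first[order(first$x), ]
centre_gap <- diff(first$x)                 # distance between the two bar centres
bar_width <- first$xmax[1] - first$xmin[1]  # width of one dodged bar
```

When centre_gap is larger than bar_width, a gap appears between the bars of a group; when the two are equal, the bars touch.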

Add data labels to the chart

You can use geom_text() to add labels or data values to the bars. Setting vjust places the label either above the top of the bar or inside it.

# Labels above the bars
ggplot(cabbage_exp, aes(x=interaction(Date, Cultivar), y=Weight)) +
  geom_bar(stat="identity") +
  geom_text(aes(label=Weight), vjust=-0.2)
# Labels inside the bars
ggplot(cabbage_exp, aes(x=interaction(Date, Cultivar), y=Weight)) +
  geom_bar(stat="identity") +
  geom_text(aes(label=Weight), vjust=1.5, colour="white")

Figure 13 Labels shown above the bars
Labels shown inside the bars
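The two examples above avoid dodging by putting interaction(Date, Cultivar) on the x axis. For a dodged grouped bar chart, a common approach is to give geom_text() the same position_dodge() as the bars so that each label lands on its own bar. The sketch below assumes a made-up data frame toy; the dodge width of 0.9 is chosen to match the default bar width.

```r
library(ggplot2)

# Hypothetical two-group data (column names mirror cabbage_exp, values are made up)
toy <- data.frame(Date = rep(c("d16", "d20"), times = 2),
                  Cultivar = rep(c("a", "b"), each = 2),
                  Weight = c(3, 2, 1, 4))

# Use the same dodge for bars and labels so the label positions follow the bars
dodge <- position_dodge(width = 0.9)
p <- ggplot(toy, aes(x = Date, y = Weight, fill = Cultivar)) +
  geom_bar(stat = "identity", position = dodge) +
  geom_text(aes(label = Weight), vjust = 1.5, colour = "white", position = dodge)

bars <- layer_data(p, 1)    # first layer: the bars
labels <- layer_data(p, 2)  # second layer: the text labels
```

Without the position argument in geom_text(), both labels of a Date would be drawn at the centre of the group instead of on the individual bars.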
That covers the basics for now. For more advanced topics or examples, refer to the R Data Visualization Manual, or make more use of help().


Origin blog.csdn.net/weixin_41792162/article/details/108324022