R language data visualization, ggplot2 basics 3: adding geometric objects and statistical transformations of data

Add geometric objects

In the last lecture, we introduced how to create a scatter plot. In this lecture, we introduce how to create other types of graphs and how to combine multiple geometric objects in one graph. We again use the mpg data set, which ships with ggplot2 (part of the tidyverse), as an example:

library(ggplot2)  # needed for the plotting code below; also provides mpg
ggplot2::mpg

In this data set, we focus on two variables: displ, the engine displacement in liters, and hwy, the highway fuel efficiency in miles per gallon. To show the relationship between these two variables, we can draw a smooth curve, here using LOESS as the smoothing method:

ggplot(data = mpg) +
  geom_smooth(method = "loess", mapping = aes(x = displ, y = hwy),
              formula = y ~ x)
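
To see roughly what the smoothing layer computes, the same kind of LOESS fit can be run by hand with stats::loess. This is only a sketch: it relies on the loess() defaults (span = 0.75, degree = 2), and the names fit and grid are ours, not from the lecture.

```r
library(ggplot2)  # provides the mpg data set

# Fit the LOESS model directly (loess() defaults: span = 0.75, degree = 2)
fit <- loess(hwy ~ displ, data = mpg)

# Predict along an evenly spaced grid of engine sizes within the observed range
grid <- data.frame(displ = seq(min(mpg$displ), max(mpg$displ), length.out = 80))
grid$hwy <- predict(fit, newdata = grid)

# Draw the hand-computed curve on top of the raw points
ggplot(data = mpg, mapping = aes(x = displ, y = hwy)) +
  geom_point() +
  geom_line(data = grid, colour = "blue")
```

The blue line should follow the same shape as the curve drawn by geom_smooth, without the confidence band.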


Of course, the smoothing method can be changed. For example, method = "lm" fits a straight regression line instead:

ggplot(data = mpg) +
  geom_smooth(method = "lm", mapping = aes(x = displ, y = hwy),
              formula = y ~ x)
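
The straight line drawn with method = "lm" comes from an ordinary linear regression of hwy on displ. As a minimal sketch (the name fit_lm is ours), the same model can be fitted directly to inspect its intercept and slope:

```r
library(ggplot2)  # provides the mpg data set

# Ordinary least squares fit of highway mileage on engine displacement
fit_lm <- lm(hwy ~ displ, data = mpg)

# Intercept and slope of the fitted straight line
coef(fit_lm)

# The slope is negative: larger engines tend to have lower highway mileage
coef(fit_lm)[["displ"]] < 0
```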

We can add the trend line as a new layer on top of the scatter plot drawn in the previous lecture, that is, overlay the trend line on the original data:

Layered Grammar (for a data set with fewer than 1,000 observations, such as mpg, the default smoothing method is LOESS)

ggplot() +
  layer(data = mpg, mapping = aes(x = displ, y = hwy),
        geom = "point", stat = "identity", position = "identity") +
  layer(data = mpg, mapping = aes(x = displ, y = hwy),
        geom = "smooth", stat = "smooth", position = "identity") +
  scale_y_continuous() +
  scale_x_continuous() +
  coord_cartesian()


The layered-grammar code for the two layers can be written more compactly with the geom_ functions (the gray area around the trend line is the 95% confidence interval):

ggplot(data = mpg) +
  geom_point(mapping = aes(x = displ, y = hwy)) +
  geom_smooth(method = "loess", mapping = aes(x = displ, y = hwy),
              formula = y ~ x)

Even this code is still not minimal. Both layers use the same data and the same aesthetic mapping, so these can be declared once in ggplot() and inherited by every layer, and LOESS is already the default smoothing method for a data set of this size. The code can therefore be reduced to a single line:

Minimal Code:

ggplot(data = mpg, mapping = aes(x = displ, y = hwy)) + geom_point() + geom_smooth()

If you want to add further features, such as coloring the points by vehicle class, you can add the extra mapping directly to the relevant layer of the minimal code:

ggplot(data = mpg, mapping = aes(x = displ, y = hwy)) +
  geom_point(mapping = aes(color = class)) +
  geom_smooth()
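
Where the color mapping is placed matters. In the code above it is local to geom_point, so only the points are colored and a single trend line is fitted. As a sketch of the alternative (the name p_global is ours, and method = "lm" with se = FALSE is chosen here only to keep the per-class fits simple), moving color = class into ggplot() makes every layer inherit it, so one line is fitted per class:

```r
library(ggplot2)  # provides the mpg data set

# color is mapped globally, so geom_smooth also splits the data by class
p_global <- ggplot(data = mpg,
                   mapping = aes(x = displ, y = hwy, color = class)) +
  geom_point() +
  geom_smooth(method = "lm", formula = y ~ x, se = FALSE)

p_global

# Number of separate groups seen by the smoothing layer (layer 2)
length(unique(layer_data(p_global, 2)$group))
```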


Statistical transformation of data

In this part, we use the diamonds data set as an example.

ggplot2::diamonds

If we want to show how many diamonds there are of each cut grade, we can use a bar chart. The Minimal Code for drawing this bar chart is:

ggplot(data = diamonds) +
  geom_bar(mapping = aes(x = cut))

The function used here to create the bar geometric object is geom_bar. It is worth looking at what happens inside this function when the chart is created. geom_bar takes data = diamonds as input together with the mapping we specify, mapping = aes(x = cut), and counts the number of rows in the diamonds data set for each value of cut. The counting is done by stat_count. This step is the statistical transformation of the data, and the chart is then drawn from the transformed results. Written in Layered Grammar, with the scale and coord code omitted, this is:

ggplot() +
  layer(data = diamonds, mapping = aes(x = cut),
        geom = "bar", stat = "count", position = "identity")
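
The counting step can also be reproduced by hand. The sketch below (the names cut_counts, n and p are ours) tabulates the cuts with table(), plots the precomputed counts with stat = "identity" so that geom_bar does not count again, and then compares them with the values that stat_count computes inside the plot:

```r
library(ggplot2)  # provides the diamonds data set

# Count the rows for each cut by hand
cut_counts <- as.data.frame(table(cut = diamonds$cut), responseName = "n")

# Plot the precomputed counts; stat = "identity" uses the y values as given
ggplot(data = cut_counts) +
  geom_bar(mapping = aes(x = cut, y = n), stat = "identity")

# Values computed by stat_count inside geom_bar, for comparison
p <- ggplot(data = diamonds) +
  geom_bar(mapping = aes(x = cut))
layer_data(p)$count
cut_counts$n
```

Both vectors should hold the same counts, one per cut, in the order of the factor levels.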

Although Minimal Code is more efficient in day-to-day work, Layered Grammar is more helpful while learning, because it makes the logic of how ggplot2 builds a plot explicit.

We can also show proportions (relative frequencies) instead of counts:

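# after_stat(prop) is the current spelling of the older ..prop.. notation (ggplot2 >= 3.3.0)
ggplot(data = diamonds) +
  geom_bar(mapping = aes(x = cut, y = after_stat(prop), group = 1))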
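
The group = 1 argument is what makes the proportions meaningful. The following sketch (the names p_each and p_all are ours; after_stat(prop) is the current spelling of ..prop.. and assumes ggplot2 3.3.0 or later) compares the computed proportions with and without it:

```r
library(ggplot2)  # provides the diamonds data set

# Without group = 1, each cut is its own group, so every proportion is 1
p_each <- ggplot(data = diamonds) +
  geom_bar(mapping = aes(x = cut, y = after_stat(prop)))
layer_data(p_each)$prop

# With group = 1, proportions are taken over the whole data set and sum to 1
p_all <- ggplot(data = diamonds) +
  geom_bar(mapping = aes(x = cut, y = after_stat(prop), group = 1))
sum(layer_data(p_all)$prop)
```

Without group = 1 all bars would have the same height, because the proportion is computed within each cut separately.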


Origin blog.csdn.net/weixin_44207974/article/details/112856606