R data visualization, ggplot2 basics 2: from a single-layer scatter plot to facets

Single-layer scatter plot

In this lesson, we start with the simplest scatter plot and introduce the basics of using ggplot2. First, install and load the tidyverse package:

install.packages("tidyverse")  # only needed once per R installation
library(tidyverse)

As an example, we use the mpg dataset, which ships with ggplot2 (part of the tidyverse):

> ggplot2::mpg
# A tibble: 234 x 11
   manufacturer model displ  year   cyl trans drv     cty
   <chr>        <chr> <dbl> <int> <int> <chr> <chr> <int>
 1 audi         a4      1.8  1999     4 auto~ f        18
 2 audi         a4      1.8  1999     4 manu~ f        21
 3 audi         a4      2    2008     4 manu~ f        20
 4 audi         a4      2    2008     4 auto~ f        21
 5 audi         a4      2.8  1999     6 auto~ f        16
 6 audi         a4      2.8  1999     6 manu~ f        18
 7 audi         a4      3.1  2008     6 auto~ f        18
 8 audi         a4 q~   1.8  1999     4 manu~ 4        18
 9 audi         a4 q~   1.8  1999     4 auto~ 4        16
10 audi         a4 q~   2    2008     4 manu~ 4        20
# ... with 224 more rows, and 3 more variables:
#   hwy <int>, fl <chr>, class <chr>

In this dataset, we focus on two variables: displ, the engine displacement in liters, and hwy, the highway fuel efficiency in miles per gallon. To show the relationship between them, we first draw a simple scatter plot:

ggplot(data = mpg)+
  geom_point(mapping = aes(x = displ, y = hwy))

Because we only want a scatter plot, the code above is written in minimal form rather than in the full layered grammar introduced in the previous lesson. Written out in the layered grammar, the same plot is:

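ggplot() +
  # Layer: data, aesthetic mapping, geom, stat, and position adjustment
  layer(data = mpg, mapping = aes(x = displ, y = hwy),
        geom = "point", stat = "identity", position = "identity") +
  # Scales for the y and x axes
  scale_y_continuous() +
  scale_x_continuous() +
  # Coordinate system
  coord_cartesian()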

The output of this code is the same as the image produced by the minimal code, but it spells out every component of the layered grammar. The call to ggplot() starts a new plot object built with the grammar of graphics. The first step is to add a layer, which consists of data, an aesthetic mapping, a geometric object, a statistical transformation, and a position adjustment. The second step is to specify the scales and the coordinate system. Comparing the layered grammar with the minimal code helps build intuition for which components can be omitted and what the minimal requirements are. The remaining code samples use the minimal form.
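One way to check that the two forms are equivalent is to compare the data each one passes to the point geom. The following is a minimal sketch, assuming ggplot2 is installed. The names p_minimal, p_layered, d_minimal, and d_layered are introduced here for illustration only, and layer_data() is the ggplot2 helper that returns the computed data of a layer.

```r
library(ggplot2)

# Minimal form: defaults fill in stat, position, scales, and coordinates.
p_minimal <- ggplot(data = mpg) +
  geom_point(mapping = aes(x = displ, y = hwy))

# Layered form: every component is spelled out.
p_layered <- ggplot() +
  layer(data = mpg, mapping = aes(x = displ, y = hwy),
        geom = "point", stat = "identity", position = "identity") +
  scale_y_continuous() +
  scale_x_continuous() +
  coord_cartesian()

# Extract the computed data of the first (and only) layer of each plot.
d_minimal <- layer_data(p_minimal, 1)
d_layered <- layer_data(p_layered, 1)

# Compare the plotted coordinates point for point.
same_points <- isTRUE(all.equal(d_minimal[, c("x", "y")],
                                d_layered[, c("x", "y")]))
print(same_points)
```

If same_points is TRUE, both specifications place the 234 points at the same coordinates, which is what the identical images suggest.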

Overall, this scatter plot shows a downward trend, but several points in the middle right of the plot sit above it and would likely produce a nonlinear pattern in the residuals of a linear fit. We would like another variable to explain this, so we use color to distinguish the points by vehicle class (the class variable, for example compact or suv):

ggplot(data = mpg)+
  geom_point(mapping = aes(x = displ, y = hwy, color = class))

In this way, we can compare the relationship between engine size and fuel efficiency by vehicle class.
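To see which classes the unusual points belong to without reading them off the plot, one can filter the data directly. This is a rough sketch: the cut-offs displ > 5 and hwy > 20 are illustrative assumptions chosen to isolate the upper-right region, not values from the text, and outliers is a name introduced here.

```r
library(ggplot2)  # provides the mpg dataset

# Illustrative cut-offs (an assumption, not from the text) for the
# points that sit above the trend on the right of the plot.
outliers <- subset(mpg, displ > 5 & hwy > 20)

# Count the selected cars by class.
print(table(outliers$class))
```

The resulting table shows which classes account for the points that the color mapping makes stand out.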

You can also try changing color = class to alpha = class, shape = class, or size = class in the aesthetic mapping. These three mappings use transparency, point shape, and point size, respectively, to represent the vehicle classes.
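One caveat with shape = class: the default shape palette in ggplot2 provides only six shapes, while mpg has seven classes, so one class is left unplotted and a warning is issued. Below is a minimal sketch of one possible workaround, assuming the shape codes 0 to 6 (chosen here purely for illustration). The names n_classes and p_shape are hypothetical.

```r
library(ggplot2)

# Number of distinct vehicle classes in the data.
n_classes <- length(unique(mpg$class))

# Supply one shape code per class so that no class is left unplotted.
p_shape <- ggplot(data = mpg) +
  geom_point(mapping = aes(x = displ, y = hwy, shape = class)) +
  scale_shape_manual(values = 0:(n_classes - 1))

print(p_shape)
```

Color usually remains the easier aesthetic to read for this many categories; the manual shape scale is only one option.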

Single-layer scatter plot with facets

If we would rather not overlay all vehicle classes in one plot, and instead show the relationship between engine size and fuel efficiency for each class in its own subplot, we need to create facets.

Minimal Code:

ggplot(data = mpg)+
  geom_point(mapping = aes(x = displ, y = hwy))+
  facet_wrap(~class,nrow = 2)

Layered Grammar:

ggplot()+
  layer(data = mpg,mapping = aes(x = displ, y = hwy),
        geom = "point",stat = "identity",position="identity")+
  facet_wrap(~class,nrow = 2)+
  scale_y_continuous()+
  scale_x_continuous()+
  coord_cartesian()

facet_wrap(~class, nrow = 2) creates one subplot per vehicle class and arranges the subplots in two rows.
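The panel layout can be anticipated from the data before plotting. The sketch below assumes only ggplot2; the names n_panels, n_cols, points_per_panel, and p_wrap are introduced here for illustration.

```r
library(ggplot2)

# One panel per distinct value of the faceting variable.
n_panels <- length(unique(mpg$class))

# With nrow = 2, the panels are spread over ceiling(n_panels / 2) columns.
n_cols <- ceiling(n_panels / 2)

# Number of points that land in each panel.
points_per_panel <- table(mpg$class)
print(points_per_panel)

# The faceted plot itself, as in the minimal code above.
p_wrap <- ggplot(data = mpg) +
  geom_point(mapping = aes(x = displ, y = hwy)) +
  facet_wrap(~class, nrow = 2)
```

Because the number of classes is odd, the second row has one panel fewer than the first.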

You can also facet on two variables, for example by replacing facet_wrap(~class, nrow = 2) with facet_grid(drv ~ cyl). This lays the subplots out in a grid, with one row per value of drv and one column per value of cyl.

If you want to drop the rows and keep only the columns, use facet_grid(. ~ cyl) instead.
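A cross-tabulation of the two faceting variables shows the shape of the grid and which panels will be empty. This is a minimal sketch; counts and p_grid are names introduced here, and the rows = vars(...) and cols = vars(...) arguments are used as an equivalent spelling of the formula drv ~ cyl.

```r
library(ggplot2)

# Cross-tabulate the two faceting variables: rows are drv, columns are cyl.
counts <- table(drv = mpg$drv, cyl = mpg$cyl)
print(counts)

# Same grid as facet_grid(drv ~ cyl), written with explicit rows and cols.
p_grid <- ggplot(data = mpg) +
  geom_point(mapping = aes(x = displ, y = hwy)) +
  facet_grid(rows = vars(drv), cols = vars(cyl))
```

Each zero in the table corresponds to a panel that facet_grid draws without any points, since the grid includes every combination of the two variables.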

Origin blog.csdn.net/weixin_44207974/article/details/112856550