(Reprint) One way to understand GROUP BY and aggregate functions

A note up front: I have been using group by for a long time, but when I woke up this morning it suddenly seemed very strange to me. Why can the select list contain only a grouped column or an aggregate function of some column? And how should group by on multiple fields be understood? In the end I worked it out, so I am simply writing it down here. Experts can skip this.

=========Start of text ===========

  Let's first look at Table 1 below; the table is named test:

 

Table 1

  Execute the following SQL statement:

SELECT name FROM test
GROUP BY name

  The result is easy to predict: it is Table 2 below.

 

Table 2
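Because the figure for Table 1 is not reproduced here, the query can be tried on a small stand-in table. The sketch below uses Python's built-in sqlite3 module with an in-memory database. Only the rows <1 aa 2> and <2 aa 3> are quoted in this article; the other rows and the helper name `distinct_names` are made up for illustration.

```python
import sqlite3

# Stand-in for Table 1. Only the first two rows are quoted in the article;
# the remaining rows are hypothetical sample data.
ROWS = [
    (1, "aa", 2),
    (2, "aa", 3),
    (3, "bb", 4),
    (4, "bb", 5),
    (5, "bb", 5),
    (6, "cc", 6),
]

def distinct_names(rows):
    """Run SELECT name FROM test GROUP BY name on an in-memory table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE test (id INTEGER, name TEXT, number INTEGER)")
    conn.executemany("INSERT INTO test VALUES (?, ?, ?)", rows)
    # ORDER BY is added only to make the output order deterministic.
    result = conn.execute(
        "SELECT name FROM test GROUP BY name ORDER BY name"
    ).fetchall()
    conn.close()
    return [name for (name,) in result]

print(distinct_names(ROWS))  # one entry per distinct name
```

Each distinct name appears exactly once in the output, no matter how many rows share it.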

  However, to better understand "group by" on multiple columns and aggregate functions, I suggest imagining an intermediate table on the way from Table 1 to Table 2: virtual table 3. Here is how to think about the execution of the SQL statement above:

1. FROM test: after this clause is executed, the result is the same as Table 1, that is, the original table.

2. FROM test GROUP BY name: after this clause is executed, imagine that virtual table 3 is generated, as shown in the figure below. It is built as follows: group by name looks at the name column and merges all rows with the same name value into one row. For example, for the name value aa, the two rows <1 aa 2> and <2 aa 3> are merged into one row, and all of their id values and all of their number values are written into a single cell per column.

 

3. Next, execute the select statement against virtual table 3:

(1) If you execute select *, the returned result would be virtual table 3 itself. But some cells in the id and number columns contain multiple values, and a relational database does not allow multiple values in one cell, so in most databases executing select * reports an error.

(2) Now look at the name column. Each cell holds only one value, so selecting name is no problem. Why does each cell in the name column hold only one value? Because name is the column we grouped by.

(3) So what can be done with the id and number cells that hold multiple values? The answer is aggregate functions, which take multiple values as input and output a single value, such as count(id) and sum(number). The input to each aggregate function is one multi-value cell.

(4) For example, if we execute select name, sum(number) from test group by name, then sum is applied to each cell in the number column of virtual table 3. For the row whose name is aa, sum computes 2+3 and returns 5. The final result is as follows:

(5) How to understand group by on multiple fields: for group by name, number, we can treat name and number as one combined field and group by that combined value, as shown below.

(6) After that, select and aggregate functions work as before. If you execute select name, sum(id) from test group by name, number, the result is as follows:
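The mental model in the steps above can be sketched in plain Python. This only illustrates the imagined virtual table; it is not how a database engine actually executes group by. The helper names `group_rows` and `aggregate` are hypothetical, and every row other than <1 aa 2> and <2 aa 3> is made-up sample data.

```python
# Stand-in for Table 1 as (id, name, number) tuples. Only the first two rows
# are quoted in the article; the rest are hypothetical sample data.
ROWS = [
    (1, "aa", 2),
    (2, "aa", 3),
    (3, "bb", 4),
    (4, "bb", 5),
    (5, "bb", 5),
    (6, "cc", 6),
]
COLUMNS = ("id", "name", "number")

def group_rows(rows, keys):
    """Build the imagined virtual table: one entry per distinct key value,
    with every column collected into a multi-value cell (a list)."""
    key_idx = [COLUMNS.index(k) for k in keys]
    virtual = {}
    for row in rows:
        key = tuple(row[i] for i in key_idx)
        cells = virtual.setdefault(key, {c: [] for c in COLUMNS})
        for col, value in zip(COLUMNS, row):
            cells[col].append(value)
    return virtual

def aggregate(virtual, func, column):
    """Collapse each multi-value cell of `column` to one value with `func`."""
    return {key: func(cells[column]) for key, cells in virtual.items()}

# group by name: the id and number cells hold several values per group.
by_name = group_rows(ROWS, ["name"])
print(by_name[("aa",)])

# select name, sum(number) from test group by name
print(aggregate(by_name, sum, "number"))

# select name, count(id) from test group by name (no NULLs in this data)
print(aggregate(by_name, len, "id"))

# group by name, number: the pair (name, number) acts as a single key.
by_name_number = group_rows(ROWS, ["name", "number"])
print(aggregate(by_name_number, sum, "id"))
```

In this sketch the grouping key is always a tuple, so grouping by one column and grouping by several columns go through the same code path. That matches the idea of treating name and number as one combined field.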

 

Reprinted from: http://www.cnblogs.com/wiseblog/articles/4475936.html

 
