Why columns of the original table cannot be referenced directly after GROUP BY

Restrictions on SELECT columns after GROUP BY

  The SQL standard stipulates that when a table is queried with aggregation, only three kinds of content may be written in the SELECT clause: the aggregation keys specified in the GROUP BY clause, aggregate functions (SUM, AVG, etc.), and constants. Let's look at an example.

  We have a student class table (tbl_student_class) with the following data:

DROP TABLE IF EXISTS tbl_student_class;
CREATE TABLE tbl_student_class (
  id int(8) unsigned NOT NULL AUTO_INCREMENT COMMENT 'auto-increment primary key',
  sno varchar(12) NOT NULL COMMENT 'student number',
  cno varchar(5) NOT NULL COMMENT 'class number',
  cname varchar(20) NOT NULL COMMENT 'class name',
  PRIMARY KEY (id)
) COMMENT='Student class table';

-- ----------------------------
-- Records of tbl_student_class
-- ----------------------------
INSERT INTO tbl_student_class VALUES ('1', '20190607001', '0607', 'Film Class 7');
INSERT INTO tbl_student_class VALUES ('2', '20190607002', '0607', 'Film Class 7');
INSERT INTO tbl_student_class VALUES ('3', '20190608003', '0608', 'Film Class 8');
INSERT INTO tbl_student_class VALUES ('4', '20190608004', '0608', 'Film Class 8');
INSERT INTO tbl_student_class VALUES ('5', '20190609005', '0609', 'Film Class 9');
INSERT INTO tbl_student_class VALUES ('6', '20190609006', '0609', 'Film Class 9');

  We want to count the number of students in each class (class number, class name) and find the largest student number in each class. How should the query be written? I think most of us can write it:

SELECT cno,cname,count(sno),MAX(sno) 
FROM tbl_student_class
GROUP BY cno,cname;

  But some people will think: cno and cname always correspond one to one, so once cno is determined, cname is determined too. Couldn't the SQL be written like this instead?

SELECT cno,cname,count(sno),MAX(sno) 
FROM tbl_student_class
GROUP BY cno;

  Executing it raises an error:

[Err] 1055 - Expression #2 of SELECT list is not in GROUP BY clause and contains nonaggregated column 'test.tbl_student_class.cname' which is not functionally dependent on columns in GROUP BY clause; this is incompatible with sql_mode=only_full_group_by

  The message says: the second expression of the SELECT list (cname) is not in the GROUP BY clause, is not wrapped in an aggregate function, and is not functionally dependent on the GROUP BY columns; this is incompatible with sql_mode=only_full_group_by.

  Why can't a column of the original table that is not in the GROUP BY clause be referenced directly after GROUP BY? No rush, let's go through it step by step.

SQL mode

  The MySQL server can operate in different SQL modes, and can apply these modes differently for different clients, depending on the value of the sql_mode system variable. A DBA can set the global SQL mode to match the site's server operating requirements, and each application can set its session SQL mode to its own requirements. Modes affect the SQL syntax MySQL supports and the data validation checks it performs, which makes it easier to use MySQL in different environments and together with other database servers. For more details, see the official documentation: Server SQL Modes. The content differs slightly between MySQL versions (including the default value), so make sure the documentation you read matches your own MySQL version.

  SQL modes fall mainly into two categories, syntax support and data checking. The commonly used ones are as follows:

  Syntax support

    ONLY_FULL_GROUP_BY

      For GROUP BY aggregate operations, if a nonaggregated column in the SELECT list, HAVING condition or ORDER BY list is neither named in the GROUP BY clause nor functionally dependent on the GROUP BY columns, the SQL is rejected as invalid

    ANSI_QUOTES

      With ANSI_QUOTES enabled, double quotes can no longer be used to quote strings, because they are interpreted as identifier quotes, with the same effect as the backtick (`). After setting it, update t set f1 = "" ... reports an error such as Unknown column '' in 'field list'
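      A minimal sketch of the ANSI_QUOTES behaviour described above, run against tbl_student_class. It assumes a throwaway session: the SET statement replaces all other session modes, which is for illustration only.

```sql
-- Illustration only: this overwrites every other mode for the current session.
SET SESSION sql_mode = 'ANSI_QUOTES';

-- Double quotes now delimit identifiers, just like backticks.
SELECT "cno", `cname` FROM tbl_student_class;

-- String literals therefore have to use single quotes.
SELECT sno FROM tbl_student_class WHERE cno = '0607';

-- This would fail: "0607" is parsed as a column name, not a string.
-- SELECT sno FROM tbl_student_class WHERE cno = "0607";
```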

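    PIPES_AS_CONCAT

      Treats || as the string concatenation operator (equivalent to CONCAT()) rather than as a synonym for OR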
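      A short sketch of what PIPES_AS_CONCAT changes. As with any session-level SET used for a demo, the statements below replace the session's other modes, so they are meant for a scratch session only.

```sql
-- Without PIPES_AS_CONCAT, || is a synonym for logical OR.
SET SESSION sql_mode = '';
SELECT 'a' || 'b' AS result;   -- both strings convert to 0, so the OR yields 0

-- With PIPES_AS_CONCAT, || concatenates strings, like CONCAT('a', 'b').
SET SESSION sql_mode = 'PIPES_AS_CONCAT';
SELECT 'a' || 'b' AS result;   -- yields 'ab'
```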

    NO_TABLE_OPTIONS

      When SHOW CREATE TABLE is used, MySQL-specific table options such as ENGINE are not included in the output. This is worth considering when using mysqldump to migrate across different kinds of databases

    NO_AUTO_CREATE_USER

      Literally: do not automatically create users. When granting privileges to a MySQL user, we are used to writing GRANT ... ON ... TO dbuser and letting it create the user along the way. With this option set, the behaviour is similar to Oracle: the user must be created first and granted privileges afterwards

  Data checking
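      The create-first, grant-second order can be sketched as follows. The account name, host and password are placeholders invented for this example; test is the schema name that appears in the error message earlier.

```sql
-- Step 1: create the account explicitly (name and password are placeholders).
CREATE USER 'dbuser'@'localhost' IDENTIFIED BY 'change_me';

-- Step 2: grant privileges to the account that now exists.
GRANT SELECT ON test.* TO 'dbuser'@'localhost';
```

      Note that NO_AUTO_CREATE_USER itself was removed in MySQL 8.0, where GRANT never creates accounts, so the two-step form is the only one that works there.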

    NO_ZERO_DATE

      Treats the date '0000-00-00' as invalid; the exact effect depends on whether strict mode is also set:

      1. If strict mode is also set, '0000-00-00' is rejected with an error. With INSERT IGNORE or UPDATE IGNORE, however, '0000-00-00' is still allowed and only a warning is shown;

      2. In non-strict mode with NO_ZERO_DATE set, the effect is similar: '0000-00-00' is allowed but a warning is shown. If NO_ZERO_DATE is not set, there is no warning and the value is treated as perfectly legal;

      3. NO_ZERO_IN_DATE is similar to the above, except that it controls whether the month and day parts may be 0, i.e., whether 2010-01-00 is legal;
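      The combinations above can be tried with a small sketch. The table tbl_date_demo is hypothetical, and the outcomes in the comments follow the behaviour documented for MySQL 5.7; other versions may differ.

```sql
-- Hypothetical scratch table for the experiment.
CREATE TABLE tbl_date_demo (d DATE NOT NULL);

-- No modes at all: the zero date is accepted silently.
SET SESSION sql_mode = '';
INSERT INTO tbl_date_demo VALUES ('0000-00-00');

-- NO_ZERO_DATE without strict mode: accepted, but with a warning.
SET SESSION sql_mode = 'NO_ZERO_DATE';
INSERT INTO tbl_date_demo VALUES ('0000-00-00');
SHOW WARNINGS;

-- NO_ZERO_DATE together with strict mode: the plain INSERT is rejected.
SET SESSION sql_mode = 'STRICT_TRANS_TABLES,NO_ZERO_DATE';
-- INSERT INTO tbl_date_demo VALUES ('0000-00-00');   -- error
INSERT IGNORE INTO tbl_date_demo VALUES ('0000-00-00'); -- accepted with a warning
```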

    NO_ENGINE_SUBSTITUTION

      Controls what happens when ALTER TABLE or CREATE TABLE specifies an ENGINE and the requested storage engine is disabled or not compiled in. With NO_ENGINE_SUBSTITUTION enabled, an error is thrown directly; when it is not set, CREATE substitutes the default storage engine, ALTER makes no change, and a warning is issued

    STRICT_TRANS_TABLES

      Setting it enables strict mode. Note that STRICT_TRANS_TABLES is not a combination of several policies; it refers solely to how missing or invalid values in INSERT and UPDATE are handled:

      1. Passing '' to an int column, as mentioned above, is illegal in strict mode; in non-strict mode the value becomes 0 and a warning is generated;

      2. Out of range: in non-strict mode the value is clipped to the nearest boundary value and inserted; in strict mode it is an error;

      3. A new row being inserted contains no value for a NOT NULL column whose definition has no explicit DEFAULT clause, so the column's value is missing;
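      A small sketch of the first two cases, using a hypothetical table tbl_strict_demo on the default InnoDB engine. The comments describe the behaviour I would expect on MySQL 5.7; treat them as a guide rather than captured output.

```sql
-- Hypothetical scratch table; TINYINT holds -128..127.
CREATE TABLE tbl_strict_demo (n TINYINT NOT NULL);

-- Non-strict mode: invalid values are adjusted and a warning is raised.
SET SESSION sql_mode = '';
INSERT INTO tbl_strict_demo VALUES (300);  -- stored as 127 (clipped to the boundary)
INSERT INTO tbl_strict_demo VALUES ('');   -- stored as 0
SHOW WARNINGS;

-- Strict mode: the same statements are rejected instead of adjusted.
SET SESSION sql_mode = 'STRICT_TRANS_TABLES';
-- INSERT INTO tbl_strict_demo VALUES (300);  -- error: out of range
-- INSERT INTO tbl_strict_demo VALUES ('');   -- error: incorrect integer value

SELECT n FROM tbl_strict_demo;  -- only the two rows from the non-strict inserts
```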

  The default mode

    If we do not modify the configuration file, MySQL uses its own default mode; the default differs between versions

-- View the MySQL version
SELECT VERSION();

-- View sql_mode
SELECT @@sql_mode;

     We can see that the default mode of 5.7.21 includes:

ONLY_FULL_GROUP_BY,STRICT_TRANS_TABLES,NO_ZERO_IN_DATE,NO_ZERO_DATE,ERROR_FOR_DIVISION_BY_ZERO,NO_AUTO_CREATE_USER,NO_ENGINE_SUBSTITUTION

    The first one, ONLY_FULL_GROUP_BY, imposes this constraint: in an aggregate query, the SELECT list cannot directly include columns that are not in the GROUP BY clause. What happens if we remove this mode (going from "strict mode" to "relaxed mode")?
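    One common way to drop the mode for the current session only is sketched below. It rewrites the session's mode string with REPLACE and leaves the global setting and the configuration file untouched; whether that is the right scope for your setup is an assumption of this example.

```sql
-- Inspect the session value before the change.
SELECT @@SESSION.sql_mode;

-- Remove ONLY_FULL_GROUP_BY from the session's mode list, keeping the rest.
SET SESSION sql_mode = (SELECT REPLACE(@@SESSION.sql_mode, 'ONLY_FULL_GROUP_BY', ''));

-- Confirm that the mode is gone.
SELECT @@SESSION.sql_mode;
```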

    We find that the SQL that raised the error above now runs:

-- Executes in relaxed mode
SELECT cno,cname,count(sno),MAX(sno) 
FROM tbl_student_class
GROUP BY cno;

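    It executes normally, but this configuration is generally not recommended: production environments usually run in "strict mode", not "relaxed mode". In this example the result is correct in both modes, but only because cno and cname correspond one to one. If they did not, the value of cname returned in "relaxed mode" would be indeterminate, which leads to problems that are hard to track down; try it if you are interested. So why does the ONLY_FULL_GROUP_BY mode exist? Let's read on.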
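    If the goal is simply to return cname while grouping by cno, there are ways to do it without leaving strict mode. The sketch below shows two; ANY_VALUE() is a MySQL function available since 5.7, and both variants rely on the assumption that cname really is the same for every row of a class.

```sql
-- Option 1: state explicitly that any value from the group will do.
SELECT cno, ANY_VALUE(cname) AS cname, COUNT(sno), MAX(sno)
FROM tbl_student_class
GROUP BY cno;

-- Option 2: turn cname into a group-level value with an aggregate function.
SELECT cno, MAX(cname) AS cname, COUNT(sno), MAX(sno)
FROM tbl_student_class
GROUP BY cno;
```

    Unlike switching off ONLY_FULL_GROUP_BY, both forms make the choice visible in the query itself.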

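  Order is a concept used to distinguish the levels of sets or predicates. In predicate logic, predicates are classified by the order of their input. Predicates such as = or BETWEEN, whose input is a single row, are called "first-order predicates", while predicates such as EXISTS, whose input is a set of rows, are called "second-order predicates" (the input of HAVING is also a set, but HAVING is not a predicate). By extension, a third-order predicate takes "a set of sets" as input and a fourth-order predicate takes "a set of sets of sets", but SQL never goes to third order or above, so there is no need to worry about it. Put simply, it looks like the figure below.

  Having mentioned order, we have to talk about set theory. Set theory is the foundation of the SQL language, and because of this SQL is also called a set-oriented language. Only by thinking in terms of sets can one appreciate how powerful SQL is. The figure above should make this visible; we won't go deeper here, and those interested can consult the relevant material.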
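  To make the first-order and second-order distinction concrete, here is a small sketch against tbl_student_class. The aliases c and s and the cut-off student numbers are chosen only for this example.

```sql
-- First-order predicate: BETWEEN is evaluated against one row at a time.
SELECT sno
FROM tbl_student_class
WHERE sno BETWEEN '20190607001' AND '20190607999';

-- Second-order predicate: EXISTS takes a set of rows (the subquery result)
-- and only asks whether that set is empty or not.
SELECT DISTINCT c.cno, c.cname
FROM tbl_student_class c
WHERE EXISTS (
  SELECT *
  FROM tbl_student_class s
  WHERE s.cno = c.cno
    AND s.sno > '20190608000'
);
```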

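Why columns of the original table can no longer be referenced after aggregation

  Many people know about the restriction on aggregate queries, but few correctly understand why the constraint exists. The cname column in tbl_student_class stores each student's class information, but note that cname here is only an attribute of each student, not an attribute of the group. GROUP BY is an aggregation operation whose operands are groups made up of several students, so the attributes of a group can only be statistical ones such as an average or a sum, as shown in the figure below.

  Asking for the cname of each student is fine, but asking for the cname of a group made up of several students is meaningless. For a group, only questions such as "how many students are there in total?" or "what is the largest student number?" make sense. Forcing an attribute of individuals onto a group is purely a category error. GROUP BY partitions the elements into subsets; after GROUP BY aggregation, the object SQL operates on changes from the order-0 "row" to the order-1 "set of rows", and at that point the attributes of rows can no longer be used. The SQL world is in fact a strictly layered hierarchy: applying the attributes of a lower-order concept to a higher-order concept would throw it into disorder, and this is not allowed. By now I believe everyone understands why columns of the original table cannot be referenced after aggregation.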
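  Whether cname happens to be constant within each group is itself a question about the group, so it can be asked with an aggregate. The following sketch lists any class number that maps to more than one class name; distinct_names is an alias chosen for this example.

```sql
-- COUNT(DISTINCT cname) is a property of the set of rows, not of a single row.
SELECT cno, COUNT(DISTINCT cname) AS distinct_names
FROM tbl_student_class
GROUP BY cno
HAVING COUNT(DISTINCT cname) > 1;
```

  With the sample data, where every class number has exactly one class name, this should return no rows.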

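A singleton set is still a set

  Modern set theory regards a singleton set as a perfectly normal set. Like the empty set, the singleton set is defined mainly to keep the theory complete. For SQL, which is based on set theory, it is therefore also necessary to strictly distinguish an element from a singleton set. There is a very conspicuous difference in level between the element a and the set {a}.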

a ≠ {a}

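  This difference in level corresponds to the difference between the WHERE clause and the HAVING clause in SQL. The WHERE clause handles order-0 objects, "rows", while the HAVING clause handles order-1 objects, "sets".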
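  A minimal sketch that uses both clauses in one query against tbl_student_class; the cut-off student number and the threshold of two students are arbitrary values picked for illustration.

```sql
SELECT cno, COUNT(*) AS students
FROM tbl_student_class
WHERE sno >= '20190608000'   -- order 0: tested against each row before grouping
GROUP BY cno
HAVING COUNT(*) >= 2;        -- order 1: tested against each group of rows
```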

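Summary

  1. SQL strictly distinguishes levels, both the levels of predicate logic (EXISTS) and the levels of set theory (GROUP BY);

  2. Once levels are distinguished, attributes that apply to individuals no longer apply to groups, which is why the SELECT clause of an aggregate query cannot directly reference columns of the original table;

  3. Generally speaking, the attributes of a singleton set are the same as those of its only element. A set containing just one element may not seem worth treating as a set at all, but to keep the theory complete we must still strictly distinguish an element from a singleton set;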

References

  《SQL基础教程》 (SQL Basics Tutorial)

  《SQL进阶教程》 (SQL Advanced Tutorial)


Origin www.linuxidc.com/Linux/2019-09/160762.htm