【SQL Statement Review】Chapter 1-2

SQL review

Learning Objective: Review SQL Statements

Learning address: https://linklearner.com/learn/detail/70

Chapter 1 Getting to Know Databases

A database is a collection of data, stored in large quantities and organized by computer so that it can be accessed efficiently. Such a collection is called a database (DB). The computer system used to manage a database is called a database management system (DBMS).

1.1 Types of DBMS

DBMSs are classified mainly by the format in which they store data (the type of database). There are currently five main types.

  • Hierarchical Database (HDB)

  • Relational Database (RDB)

    • Oracle Database: Oracle's RDBMS
    • SQL Server: Microsoft RDBMS
    • DB2: IBM's RDBMS
    • PostgreSQL: an open source RDBMS
    • MySQL: an open source RDBMS

    The above are five representative RDBMSs, which are characterized by two-dimensional tables composed of rows and columns to manage data. This type of DBMS is called a relational database management system (Relational Database Management System, RDBMS).

  • Object Oriented Database (OODB)

  • XML Database (XML Database, XMLDB)

  • Key-Value Store (KVS), for example: Redis (MongoDB, often listed here, is more precisely a document store)

This course will introduce you to a database management system using the SQL language, that is, a relational database management system (RDBMS).

1.2 Common system structure of an RDBMS

When an RDBMS is used, the most common system structure is the client/server (C/S) structure.

[Figure: client/server system structure]

1.3 Getting to Know SQL

[Figure: example of a table]

Data in a database is stored in tables whose structure resembles the rows and columns of an Excel sheet. In a database, a row is called a record and represents one data entry; a column is called a field and represents one data item stored in the table.

According to the different types of instructions given to RDBMS, SQL statements can be divided into the following three categories.

  • DDL: the Data Definition Language is used to create or delete the objects that hold data, such as databases and the tables in them. DDL contains the following instructions.
    • CREATE: Create objects such as databases and tables
    • DROP: Delete objects such as databases and tables
    • ALTER: Modify the structure of objects such as databases and tables
  • DML: the Data Manipulation Language is used to query or change the records in a table. DML contains the following instructions.
    • SELECT: Query data in a table
    • INSERT: Insert new data into a table
    • UPDATE: Update data in a table
    • DELETE: Delete data from a table
  • DCL: the Data Control Language is used to confirm or cancel changes made to the data in the database. It can also set whether an RDBMS user is allowed to operate on the objects in the database (tables and so on). DCL contains the following instructions.
    • COMMIT: Confirm changes made to data in the database
    • ROLLBACK: Cancel changes made to data in the database
    • GRANT: Grant a user permission to perform operations
    • REVOKE: Revoke a user's permission to perform operations

Roughly 90% of the SQL statements used in practice are DML, so this course also focuses on DML.

1.4 Basic writing rules of SQL

  • SQL statements must end with a semicolon ( ; )
  • SQL keywords are not case sensitive, but the data inserted into a table is case sensitive
  • Words must be separated by half-width spaces or line breaks

Full-width spaces cannot be used as word separators; using them causes errors or unexpected results.

1.5 Basic operations of the database

Create a database:

CREATE DATABASE <database name>;

Create the database used in this course:

CREATE DATABASE shop;

Select the database to use:

USE <database name>;
USE shop;

Table creation:

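CREATE TABLE <table name>
(<column name 1> <data type> <constraints for this column>,
 <column name 2> <data type> <constraints for this column>,
 <column name 3> <data type> <constraints for this column>,
 <column name 4> <data type> <constraints for this column>,
 ...
 <column name N> <data type> <constraints for this column>
);

Create the table used in this course: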

CREATE TABLE product
(product_id CHAR(4) NOT NULL,
product_name varchar(100) NOT NULL,
product_type varchar(32) NOT NULL,
sale_price integer ,
purchase_price integer,
regist_date date,
primary key(product_id)
 );


Naming rules

  • Only half-width English letters, numbers, and underscores (_) can be used as database, table, and column names

  • The name must start with a half-width English letter

Table 1-3 Correspondence between commodity table and product table column names

Specification of data type

Every column of a table must be given a data type, and a column cannot store data that does not match its type.

The four most basic data types

  • INTEGER type

It is used to specify the data type (numeric type) of the column that stores integers, and cannot store decimals.

  • CHAR type

Used to store fixed-length strings. When a stored string is shorter than the maximum length, it is padded with half-width spaces. Because this padding can waste storage space, CHAR is used less often than VARCHAR.

  • VARCHAR type

Used to store variable-length strings. Unlike a fixed-length string, a variable-length string is not padded with half-width spaces when it is shorter than the maximum length.

  • DATE type

Used to specify the data type (date type) of the column that stores the date (year, month, and day).

Constraints

Besides the data type, a constraint restricts or adds conditions to the data stored in a column.

NOT NULL is the non-null constraint: the column must always contain a value.

PRIMARY KEY is the primary key constraint: the column's values are unique, so a specific row can be retrieved through this column.
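
The effect of the two constraints can be tried out with a small script. This is only a sketch: it uses Python's built-in sqlite3 module as a stand-in for MySQL so that it runs without a server (an assumption of this example, not something the course uses), and the helper try_insert is a hypothetical name. Error messages in MySQL are worded differently, but the rows are rejected for the same reasons.

```python
import sqlite3


def try_insert(conn, row):
    """Insert one row into product; return None on success or the error text."""
    try:
        conn.execute("INSERT INTO product VALUES (?, ?, ?, ?, ?, ?)", row)
        return None
    except sqlite3.IntegrityError as exc:
        return str(exc)


# In-memory SQLite database standing in for the course's MySQL "shop" database.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE product (
        product_id     CHAR(4)      NOT NULL,
        product_name   VARCHAR(100) NOT NULL,
        product_type   VARCHAR(32)  NOT NULL,
        sale_price     INTEGER,
        purchase_price INTEGER,
        regist_date    DATE,
        PRIMARY KEY (product_id)
    )
""")

# A valid row is accepted.
ok = try_insert(conn, ("0001", "T恤衫", "衣服", 1000, 500, "2009-09-20"))
# Same product_id again: rejected by the PRIMARY KEY constraint.
duplicate_key = try_insert(conn, ("0001", "打孔器", "办公用品", 500, 320, "2009-09-11"))
# product_name is NULL: rejected by the NOT NULL constraint.
missing_name = try_insert(conn, ("0002", None, "办公用品", 500, 320, "2009-09-11"))
# regist_date has no NOT NULL constraint, so NULL is accepted there.
nullable_ok = try_insert(conn, ("0003", "运动T恤", "衣服", 4000, 2800, None))

print(ok, duplicate_key, missing_name, nullable_ok, sep="\n")
```

Only the first and last rows end up in the table; the other two are refused before anything is stored.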

Table data insertion

INSERT INTO <table name> (column1, column2, column3, ...) VALUES (value1, value2, value3, ...);

When performing a full-column INSERT on a table, the column list after the table name can be omitted. At this time, the value of the VALUES clause will be assigned to each column in order from left to right by default.

In this example:

INSERT INTO product VALUES('0001', 'T恤衫', '衣服', 1000, 500, '2009-09-20');
INSERT INTO product VALUES('0002', '打孔器', '办公用品', 500, 320, '2009-09-11');
INSERT INTO product VALUES('0003', '运动T恤', '衣服', 4000, 2800, NULL);
INSERT INTO product VALUES('0004', '菜刀', '厨房用具', 3000, 2800, '2009-09-20');
INSERT INTO product VALUES('0005', '高压锅', '厨房用具', 6800, 5000, '2009-01-15');
INSERT INTO product VALUES('0006', '叉子', '厨房用具', 500, NULL, '2009-09-20');
INSERT INTO product VALUES('0007', '擦菜板', '厨房用具', 880, 790, '2008-04-28');
INSERT INTO product VALUES('0008', '圆珠笔', '办公用品', 100, NULL, '2009-11-11');

Multiple rows can also be inserted with a single statement, separating the rows with commas:

INSERT INTO product VALUES ('0001', 'T恤衫', '衣服', 1000, 500, '2009-09-20'),
                           ('0002', '打孔器', '办公用品', 500, 320, '2009-09-11'),
                           ('0003', '运动T恤', '衣服', 4000, 2800, NULL),
                           ('0004', '菜刀', '厨房用具', 3000, 2800, '2009-09-20'),
                           ('0005', '高压锅', '厨房用具', 6800, 5000, '2009-01-15'),
                           ('0006', '叉子', '厨房用具', 500, NULL, '2009-09-20'),
                           ('0007', '擦菜板', '厨房用具', 880, 790, '2008-04-28'),
                           ('0008', '圆珠笔', '办公用品', 100, NULL, '2009-11-11');

Data can also be copied from another table using the INSERT...SELECT statement.

-- Copy the data in the product table into the productcopy table
insert into productcopy(product_id,product_name,product_type,sale_price,purchase_price,regist_date)
select product_id,product_name,product_type,sale_price,purchase_price,regist_date from product;
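
Both the multi-row INSERT and INSERT...SELECT can be checked end to end with the sketch below. It assumes Python's built-in sqlite3 module in place of MySQL, and it trims the product table to three columns to stay short; the productcopy table is created here with the same layout because INSERT...SELECT needs the target table to exist already.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE product (
        product_id   CHAR(4)      NOT NULL PRIMARY KEY,
        product_name VARCHAR(100) NOT NULL,
        sale_price   INTEGER
    );
    -- The target table must exist before INSERT ... SELECT runs.
    CREATE TABLE productcopy (
        product_id   CHAR(4)      NOT NULL PRIMARY KEY,
        product_name VARCHAR(100) NOT NULL,
        sale_price   INTEGER
    );
    -- One INSERT statement, several comma-separated rows.
    INSERT INTO product VALUES
        ('0001', 'T恤衫', 1000),
        ('0002', '打孔器', 500),
        ('0003', '运动T恤', 4000);
    -- Copy every row of product into productcopy.
    INSERT INTO productcopy (product_id, product_name, sale_price)
    SELECT product_id, product_name, sale_price FROM product;
""")

query = "SELECT product_id, product_name, sale_price FROM {} ORDER BY product_id"
original = conn.execute(query.format("product")).fetchall()
copied = conn.execute(query.format("productcopy")).fetchall()
print(copied)
```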

table deletion and update

Syntax to drop a table:

DROP TABLE <table name>;

Delete the product table

Note that a dropped table cannot be restored; it can only be re-created and its data re-inserted, so be very careful when running DROP TABLE.

DROP TABLE product;

ALTER TABLE statement to add a column

ALTER TABLE <table name> ADD COLUMN <column definition>;

Add a product_name_pinyin column that can store variable-length strings of up to 100 characters

ALTER TABLE product ADD COLUMN product_name_pinyin VARCHAR(100);

Delete the product_name_pinyin column

ALTER TABLE product DROP COLUMN product_name_pinyin;

Delete specific rows in a table (syntax)

DELETE FROM product WHERE COLUMN_NAME='xxx';

Like DROP TABLE, an ALTER TABLE statement cannot be undone after it has been executed. A column added by mistake can be removed with another ALTER TABLE statement, or the table can be dropped and re-created.

【Extended content】

Clear table content

TRUNCATE TABLE <table name>;

Advantage: compared with DROP / DELETE, TRUNCATE is the fastest way to clear the data in a table.

data update

Basic syntax:

UPDATE <table name>
   SET <column name> = <expression> [, <column name 2> = <expression 2> ...]
 WHERE <condition>   -- optional, but very important
 ORDER BY <clause>   -- optional
 LIMIT <clause>;     -- optional

When using UPDATE, remember to add a WHERE condition; otherwise every row in the table is modified.

-- Change the registration date of every row
UPDATE product
   SET regist_date = '2009-10-10';
-- Change the unit price of only some products
UPDATE product
   SET sale_price = sale_price * 10
 WHERE product_type = '厨房用具';

UPDATE can also be used to set a column to NULL (commonly known as NULL clearing). To do so, simply write NULL as the value on the right side of the assignment.

-- Set the registration date of product 0008 (圆珠笔, ballpoint pen) to NULL
UPDATE product
   SET regist_date = NULL
 WHERE product_id = '0008'; 

Like the INSERT statement, the UPDATE statement can use NULL as a value. However, only columns without a NOT NULL or primary key constraint can be cleared to NULL. Updating a column that has either constraint to NULL raises an error, just as with INSERT.
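
The points above (UPDATE with and without WHERE, NULL clearing, and the NOT NULL restriction) can be verified with a short script. It is a sketch under stated assumptions: Python's built-in sqlite3 module replaces MySQL, and the table is reduced to four columns and four rows of the course data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE product (product_id CHAR(4) NOT NULL PRIMARY KEY, "
    "product_type VARCHAR(32) NOT NULL, sale_price INTEGER, regist_date DATE)"
)
conn.executemany(
    "INSERT INTO product VALUES (?, ?, ?, ?)",
    [("0001", "衣服", 1000, "2009-09-20"),
     ("0004", "厨房用具", 3000, "2009-09-20"),
     ("0005", "厨房用具", 6800, "2009-01-15"),
     ("0008", "办公用品", 100, "2009-11-11")],
)

# With WHERE: only the kitchenware rows are changed.
with_where = conn.execute(
    "UPDATE product SET sale_price = sale_price * 10 WHERE product_type = '厨房用具'"
).rowcount

# Without WHERE: every row in the table is changed.
without_where = conn.execute("UPDATE product SET regist_date = '2009-10-10'").rowcount

# NULL clearing is allowed because regist_date has no NOT NULL constraint.
conn.execute("UPDATE product SET regist_date = NULL WHERE product_id = '0008'")

# product_type is NOT NULL, so clearing it is rejected.
try:
    conn.execute("UPDATE product SET product_type = NULL WHERE product_id = '0008'")
    null_rejected = False
except sqlite3.IntegrityError:
    null_rejected = True

rows = conn.execute(
    "SELECT product_id, sale_price, regist_date FROM product ORDER BY product_id"
).fetchall()
print(with_where, without_where, null_rejected)
print(rows)
```

The rowcount values make the difference visible: two rows are touched when the WHERE clause is present, all four when it is missing.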

Index

An index works like the lookup pages of a Chinese dictionary: with a catalog sorted by pinyin, stroke count, or radical, the desired character can be found quickly without reading every page.

How to create an index

Indexes can be created directly when creating a table, the syntax is as follows:

CREATE TABLE mytable(
    ID INT NOT NULL,
    username VARCHAR(16) NOT NULL,
    INDEX[indexName](username(length))
);

It can also be created using the following statements:

-- Method 1
CREATE INDEX indexName ON table_name(column_name);

-- Method 2
ALTER TABLE tableName ADD INDEX indexName(column_name);

  • Index classification

    • primary key index

    The index built on the primary key is called the primary key index. A data table can only have one primary key index. The index column value does not allow null values. It is usually created together when the table is created.

    • unique index

    The index built on the UNIQUE field is called a unique index. A table can have multiple unique indexes. The index column may contain NULL, and multiple NULL values in the column do not count as duplicates.

    • normal index

    Indexes built on ordinary fields are called ordinary indexes.

    • prefix index

    A prefix index refers to an index built on the first few characters of a character type field or on the first few bytes of a binary type field, rather than building an index on the entire field. Prefix indexes can be built on columns of char, varchar, binary, and varbinary types, which can greatly reduce the storage space occupied by the index and improve the query efficiency of the index.

    • full text index

    An index that uses "word segmentation technology" to search for keywords in long texts.

    Syntax: SELECT * FROM article WHERE MATCH (col1,col2,...) AGAINST (expr [search_modifier])

    1. Before MySQL 5.6, only the MyISAM storage engine supported full-text indexes;

    2. In MySQL 5.6 and later, both the MyISAM and InnoDB storage engines support full-text indexes;

    3. A full-text index can only be built on a field whose data type is CHAR, VARCHAR, or one of the TEXT types.

    4. If possible, create the table and insert all the data first, and only then create the full-text index, rather than creating it together with the table; the former is more efficient.

    • single column index

    Indexes built on a single column are called single-column indexes.

    • Joint index (composite index, multi-column index)

    An index built on multiple columns is called a joint index, also known as a compound index or a composite index.
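
To see what an ordinary single-column index changes, the query plan can be compared before and after the index is created. The sketch below assumes Python's built-in sqlite3 module instead of MySQL, so it uses SQLite's EXPLAIN QUERY PLAN (MySQL has its own EXPLAIN with different output), and the index name idx_username and the helper plan are hypothetical. Prefix and full-text indexes are MySQL features that are not reproduced here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (id INT NOT NULL, username VARCHAR(16) NOT NULL)")
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?)",
    [(i, f"user{i:04d}") for i in range(1000)],
)


def plan(sql):
    """Return SQLite's query plan for sql as a single string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    # The last field of each plan row is the human-readable description.
    return " | ".join(row[-1] for row in rows)


query = "SELECT * FROM mytable WHERE username = 'user0500'"

plan_before = plan(query)  # no index yet: the whole table is scanned
conn.execute("CREATE INDEX idx_username ON mytable (username)")
plan_after = plan(query)   # the planner can now look the value up in the index

print(plan_before)
print(plan_after)
```

The exact wording of the plan differs between SQLite versions, so the interesting part is simply that the second plan mentions idx_username and the first does not.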

practice questions

CREATE TABLE Addressbook(
regist_no integer not null,
name VARCHAR(128) not null,
address VARCHAR(256) not null,
tel_no char(10) ,
mail_address char(20) ,
primary key(regist_no)
);

Chapter 2 Basic Query

2.1 SELECT statement basics

Select data from table

SELECT statement

Data is selected from a table with the SELECT statement, which picks out only the necessary data. Retrieving the necessary data with a SELECT statement is called querying (a query).

The basic SELECT statement contains two clauses, SELECT and FROM:

SELECT <column name>, ...
  FROM <table name>;

Among them, the SELECT clause enumerates the name of the column that you want to query from the table, and the FROM clause specifies the name of the table from which the data is selected.

Select data that meets the criteria from the table

WHERE clause

When you do not need all the data but only the rows that meet certain conditions, such as "the product type is clothing" or "the sales unit price is 1,000 yen or more", use the WHERE clause.

The SELECT statement specifies the conditions for querying data through the WHERE clause. Conditions such as "the value of a certain column is equal to this string" or "the value of a certain column is greater than this number" can be specified there. Executing a SELECT statement with such conditions returns only the records that meet them.

SELECT <column name>, ...
  FROM <table name>
 WHERE <conditional expression>;

Compare the output of the following two queries:

-- SELECT statement that picks the records whose product_type column is '衣服' (clothing)
SELECT product_name, product_type
  FROM product
 WHERE product_type = '衣服';
-- The condition column does not have to be selected (condition column and output columns differ)
SELECT product_name
  FROM product
 WHERE product_type = '衣服';

Result: the first query returns two columns, the second returns one.

Related rules

  • An asterisk (*) stands for all columns.
  • Line breaks can be used freely in SQL without affecting statement execution (but blank lines cannot be inserted).
  • A Chinese alias must be enclosed in double quotation marks (").
  • Use DISTINCT in the SELECT statement to remove duplicate rows.
  • A comment is a part of an SQL statement used to record explanations or notes. There are two kinds: single-line comments "-- " and multi-line comments "/* */".

-- To query all columns, use the asterisk (*), which stands for all columns
SELECT * FROM product;
-- The AS keyword sets an alias for a column (a Chinese alias needs double quotation marks)
SELECT product_id AS id,
       product_name AS name,
       purchase_price AS "进货单价"
  FROM product;
-- Use DISTINCT to remove duplicate data in the product_type column
SELECT DISTINCT product_type
  FROM product;

2.2 Arithmetic and comparison operators

Arithmetic operators

The four main arithmetic operators that can be used in SQL statements are:

meaning          operator
addition         +
subtraction      -
multiplication   *
division         /
Comparison operators

operator   meaning
=          equal to
<>         not equal to
>=         greater than or equal to
>          greater than
<=         less than or equal to
<          less than

-- Select the records whose sale_price column is 500
SELECT product_name, product_type
  FROM product
 WHERE sale_price = 500;

Common rules

  • Constants or expressions can be used in the SELECT clause.
  • When using comparison operators, pay attention to the order of the inequality sign and the equals sign: write >= and <=, not => or =<.
  • In principle, string data is ordered in dictionary order, which must not be confused with numerical order.
  • To select NULL records, use the IS NULL operator in the conditional expression. To select records that are not NULL, use the IS NOT NULL operator.

-- Arithmetic expressions can be used in SQL statements
SELECT product_name, sale_price, sale_price * 2 AS "sale_price x2"
  FROM product;
-- Calculated expressions can also be used in the condition of a WHERE clause
SELECT product_name, sale_price, purchase_price
  FROM product
 WHERE sale_price - purchase_price >= 500;
/* Using inequality operators on strings:
   first create the chars table and insert data,
   then select the rows greater than '2' */
-- DDL: create the table
CREATE TABLE chars
(chr CHAR(3) NOT NULL,
PRIMARY KEY(chr));
-- SELECT statement that picks the data greater than '2' ('2' is a string)
SELECT chr
  FROM chars
 WHERE chr > '2';
-- Select the records that are NULL
SELECT product_name, purchase_price
  FROM product
 WHERE purchase_price IS NULL;
-- Select the records that are not NULL
SELECT product_name, purchase_price
  FROM product
 WHERE purchase_price IS NOT NULL;
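
The chars example is easier to follow with some data in the table. The sketch below fills it with six sample strings (these values are an assumption of this example, since the notes do not list any) and runs the comparison through Python's built-in sqlite3 module, used here as a stand-in for MySQL. The second query, with CAST, is added only for contrast.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE chars (chr CHAR(3) NOT NULL, PRIMARY KEY (chr))")
# Sample values chosen for illustration only.
conn.executemany(
    "INSERT INTO chars VALUES (?)",
    [("1",), ("2",), ("3",), ("10",), ("11",), ("222",)],
)

# chr holds strings, so the comparison with '2' is made in dictionary order.
as_text = [row[0] for row in conn.execute(
    "SELECT chr FROM chars WHERE chr > '2' ORDER BY chr")]

# Casting to INTEGER switches to numerical comparison.
as_number = [row[0] for row in conn.execute(
    "SELECT chr FROM chars WHERE CAST(chr AS INTEGER) > 2 "
    "ORDER BY CAST(chr AS INTEGER)")]

print(as_text)    # ['222', '3']
print(as_number)  # ['3', '10', '11', '222']
```

As strings, '10' and '11' start with '1' and therefore sort before '2', which is why they are missing from the first result even though the numbers 10 and 11 are greater than 2.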

2.3 Logical operators

NOT operator

When you want to express "is not ...", there is, in addition to the <> operator above, another negation operator with a wider range of use: NOT.

NOT cannot be used alone; it must be combined with another query condition. For example:

Select the records whose sales unit price is greater than or equal to 1000 yen

select product_name,product_type,sale_price
  from product 
 where sale_price>=1000;

Now negate the same condition with NOT:

select product_name,product_type,sale_price
  from product 
 where not sale_price>=1000;

Negating the query condition "the sales unit price is greater than or equal to 1000 yen" (sale_price >= 1000) selects the products whose sales unit price is less than 1000 yen. In other words, NOT sale_price >= 1000 and sale_price < 1000 are equivalent.

Note that although negating a condition with NOT yields the opposite result, it is clearly less readable than stating the condition explicitly, so the operator should not be overused.

AND operator and OR operator

When you want to use multiple query conditions at the same time, you can use AND or OR operators.

AND is equivalent to "and", similar to the intersection in mathematics;

OR is equivalent to "or", similar to the union in mathematics.

[Figure: how the AND operator works]

[Figure: how the OR operator works]

Setting precedence with parentheses

How would you find the following product?

"The product type is office supplies" and "the registration date is September 11, 2009 or September 20, 2009"

The expected result is the single product 打孔器 (hole punch), but writing the conditions down as they are gives the wrong result:

-- Writing the conditions into the expression unchanged gives the wrong result
SELECT product_name, product_type, regist_date
  FROM product
 WHERE product_type = '办公用品'
   AND regist_date = '2009-09-11'
    OR regist_date = '2009-09-20';

The reason for the error is that the AND operator takes precedence over the OR operator. To have the OR operation performed first, use parentheses:

-- Use parentheses so that OR is evaluated before AND
SELECT product_name, product_type, regist_date
  FROM product
 WHERE product_type = '办公用品'
   AND ( regist_date = '2009-09-11'
        OR regist_date = '2009-09-20');
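
The two WHERE clauses can be run side by side to see exactly which rows differ. This is a minimal sketch that assumes Python's built-in sqlite3 module in place of MySQL and uses four rows of the course data; the helper names is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE product (product_name VARCHAR(100) NOT NULL, "
    "product_type VARCHAR(32) NOT NULL, regist_date DATE)"
)
conn.executemany("INSERT INTO product VALUES (?, ?, ?)", [
    ("T恤衫", "衣服", "2009-09-20"),
    ("打孔器", "办公用品", "2009-09-11"),
    ("菜刀", "厨房用具", "2009-09-20"),
    ("圆珠笔", "办公用品", "2009-11-11"),
])


def names(where):
    """Return the product names matching the given WHERE condition."""
    sql = "SELECT product_name FROM product WHERE " + where
    return [row[0] for row in conn.execute(sql)]


# AND binds tighter than OR, so this reads as
# (type = office AND date = 09-11) OR date = 09-20.
without_parens = names(
    "product_type = '办公用品' AND regist_date = '2009-09-11' "
    "OR regist_date = '2009-09-20'"
)

# Parentheses force the OR to be evaluated first.
with_parens = names(
    "product_type = '办公用品' AND (regist_date = '2009-09-11' "
    "OR regist_date = '2009-09-20')"
)

print(without_parens)
print(with_parens)
```

Without parentheses, every product registered on 2009-09-20 slips in regardless of its type; with parentheses only 打孔器 remains.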

[Figure: operator precedence]

truth table

How to understand complex conditions?

When a statement has complex conditions, its meaning can be hard to grasp. A truth table helps sort out the logical relationships.

What is a truth value?

The three operators NOT, AND, and OR described in this section are called logical operators. "Logical" here means that they operate on truth values. A truth value is a value that is either TRUE or FALSE.

For example, for the query condition purchase_price >= 3000, the record whose product_name is '运动T恤' (sports T-shirt) has a purchase_price of 2800, so the condition returns FALSE, while the record whose product_name is '高压锅' (pressure cooker) has a purchase_price of 5000, so it returns TRUE.

The AND operator returns true only if the truth values on both sides are true; otherwise it returns false.

The OR operator returns true if at least one of the truth values on its two sides is true, and returns false only if both are false.

The NOT operator simply converts true to false and false to true.

[Figure: truth tables for NOT, AND, and OR]

[Figure: truth table for the query condition P AND (Q OR R)]

Truth values involving NULL

When a comparison involves NULL, the result is neither true nor false, because the value is not known. How should that be expressed?

In this case the truth value is a third value besides true and false: UNKNOWN. Ordinary logical operations have no such third value, and languages other than SQL basically use only the two truth values true and false. Ordinary logic is therefore called two-valued logic, whereas the logic in SQL is called three-valued logic.

The AND and OR truth tables for three-valued logic are:

[Figure: AND and OR truth tables for three-valued logic]
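
Since the truth-table figures did not survive, the three-valued tables can be regenerated by asking a database engine directly. The sketch below assumes Python's built-in sqlite3 module as a stand-in for MySQL; it passes 1, 0, and None (SQL NULL) as TRUE, FALSE, and UNKNOWN, and the function names sql_and, sql_or, and sql_not are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# SQLite returns 1 for TRUE, 0 for FALSE and NULL (None in Python) for UNKNOWN.
LABEL = {1: "TRUE", 0: "FALSE", None: "UNKNOWN"}


def sql_and(p, q):
    """Evaluate p AND q inside the SQL engine."""
    return LABEL[conn.execute("SELECT ? AND ?", (p, q)).fetchone()[0]]


def sql_or(p, q):
    """Evaluate p OR q inside the SQL engine."""
    return LABEL[conn.execute("SELECT ? OR ?", (p, q)).fetchone()[0]]


def sql_not(p):
    """Evaluate NOT p inside the SQL engine."""
    return LABEL[conn.execute("SELECT NOT ?", (p,)).fetchone()[0]]


values = [1, 0, None]
print(f"{'P':8}{'Q':8}{'P AND Q':10}{'P OR Q'}")
for p in values:
    for q in values:
        print(f"{LABEL[p]:8}{LABEL[q]:8}{sql_and(p, q):10}{sql_or(p, q)}")
```

The rows worth studying are the ones containing UNKNOWN: FALSE AND UNKNOWN is FALSE and TRUE OR UNKNOWN is TRUE, because the known side already decides the outcome, while the remaining combinations stay UNKNOWN.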

practice questions

1. Write a SQL statement that selects from the product table the products whose registration date (regist_date) is after April 28, 2009. The query result should contain the two columns product_name and regist_date.

SELECT product_name, regist_date
  FROM product
 WHERE regist_date > '2009-04-28';

2. Please state the return results when executing the following 3 SELECT statements on the product table.

SELECT *
  FROM product
 WHERE purchase_price = NULL;
SELECT *
  FROM product
 WHERE purchase_price <> NULL;
SELECT *
  FROM product
 WHERE product_name > NULL;

All three SELECT statements return an empty result.

Conceptually, NULL means "a missing, unknown value", and it is treated differently from other values. Comparing anything with NULL by an arithmetic comparison operator such as =, <, or <> yields UNKNOWN rather than TRUE, so no row satisfies the condition. To test for NULL, use the IS NULL and IS NOT NULL operators instead.
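
The answer can be confirmed by counting the rows each condition matches. The sketch assumes Python's built-in sqlite3 module in place of MySQL and a four-row, two-column cut of the product table; count_where is a hypothetical helper.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE product (product_name VARCHAR(100) NOT NULL, purchase_price INTEGER)"
)
conn.executemany(
    "INSERT INTO product VALUES (?, ?)",
    [("T恤衫", 500), ("叉子", None), ("擦菜板", 790), ("圆珠笔", None)],
)


def count_where(condition):
    """Count the rows of product that satisfy the given condition."""
    sql = "SELECT COUNT(*) FROM product WHERE " + condition
    return conn.execute(sql).fetchone()[0]


conditions = (
    "purchase_price = NULL",       # always UNKNOWN
    "purchase_price <> NULL",      # always UNKNOWN
    "product_name > NULL",         # always UNKNOWN
    "purchase_price IS NULL",      # the correct test for NULL
    "purchase_price IS NOT NULL",  # the correct test for non-NULL
)
counts = {condition: count_where(condition) for condition in conditions}

for condition, count in counts.items():
    print(f"{condition:28} -> {count} row(s)")
```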

3. The SELECT statement in section 2.2 extracts from the product table the products whose sales unit price (sale_price) is 500 yen or more above the purchase unit price (purchase_price). Write two SELECT statements that get the same result.

SELECT product_name,sale_price,purchase_price
  FROM product
 WHERE not sale_price-purchase_price<500;
select product_name,sale_price,purchase_price
  from product
 where sale_price-purchase_price >= 500;

2.4 Aggregate queries on tables

aggregate function

The functions used for aggregation in SQL are called aggregate functions. The following five are the most commonly used aggregate functions:

  • SUM: Calculate the total value in a numeric column in the table
  • AVG: calculates the average value in a numeric column in a table
  • MAX: Calculate the maximum value of data in any column in the table, including text and number types
  • MIN: Calculate the minimum value of data in any column in the table, including text and number types
  • COUNT: Calculate the number of records (rows) in the table

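For example:

-- Calculate the totals of the sales unit price and the purchase unit price
SELECT SUM(sale_price), SUM(purchase_price)
  FROM product;
-- Calculate the averages of the sales unit price and the purchase unit price
SELECT AVG(sale_price), AVG(purchase_price)
  FROM product;
-- Count all rows (including rows that contain NULL)
SELECT COUNT(*)
  FROM product;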

Using DISTINCT in aggregation to remove duplicate values

When an aggregation runs over the whole table, a column may contain the same value in several rows, as the product type (product_type column) does.

In such cases applying an aggregate function directly does not give the wanted result, and DISTINCT must be used together with the function.

For example: how would you count how many product types are on sale in total?

As mentioned above, DISTINCT removes duplicate data. Adding the DISTINCT keyword inside the COUNT aggregate function, as in COUNT(DISTINCT product_type), meets this requirement.

Aggregate Function Application Rules

  • The result of COUNT depends on its argument. COUNT(*) / COUNT(1) counts all rows, including rows that contain NULL, while COUNT(<column name>) counts only the rows in which that column is not NULL.
  • Aggregate functions ignore NULL values, with the exception of COUNT(*).
  • MAX / MIN are available for both text and numeric columns, while SUM / AVG are only available for numeric columns.
  • Use the DISTINCT keyword in the argument of an aggregate function to aggregate with duplicate values removed.
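
These rules can be checked against the eight rows of the product table. The sketch assumes Python's built-in sqlite3 module as a stand-in for MySQL and keeps only the columns the aggregates need.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE product (product_id CHAR(4) NOT NULL PRIMARY KEY, "
    "product_type VARCHAR(32) NOT NULL, sale_price INTEGER, purchase_price INTEGER)"
)
conn.executemany("INSERT INTO product VALUES (?, ?, ?, ?)", [
    ("0001", "衣服", 1000, 500), ("0002", "办公用品", 500, 320),
    ("0003", "衣服", 4000, 2800), ("0004", "厨房用具", 3000, 2800),
    ("0005", "厨房用具", 6800, 5000), ("0006", "厨房用具", 500, None),
    ("0007", "厨房用具", 880, 790), ("0008", "办公用品", 100, None),
])

row = conn.execute("""
    SELECT COUNT(*),                      -- every row, NULL or not
           COUNT(purchase_price),         -- only rows where the column is not NULL
           COUNT(DISTINCT product_type),  -- duplicates removed before counting
           SUM(purchase_price),           -- NULL values are skipped
           AVG(purchase_price)            -- divides by the non-NULL row count
      FROM product
""").fetchone()
all_rows, non_null_prices, distinct_types, total, average = row

print(all_rows, non_null_prices, distinct_types, total, average)
```

Note that the average is the total divided by six, not eight: the two rows with a NULL purchase_price take no part in the calculation at all.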

2.5 Grouping tables

GROUP BY statement

So far, aggregate functions have been applied to the whole table. To summarize the data group by group (that is, to split the rows by the values of a column and aggregate each group), use GROUP BY:

SELECT <column name 1>, <column name 2>, <column name 3>, ...
  FROM <table name>
 GROUP BY <column name 1>, <column name 2>, <column name 3>, ...;

Take a look at the difference with and without the GROUP BY statement:

SELECT product_type, COUNT(*)
  FROM product
 GROUP BY product_type;


-- Without GROUP BY
SELECT product_type, COUNT(*)
  FROM product;

Note: this raises an error: In aggregated query without GROUP BY, expression #1 of SELECT list contains nonaggregated column 'shop.product.product_type'; this is incompatible with sql_mode=only_full_group_by

The query is ambiguous, so no sensible result table can be returned. To get a meaningful result, the basis for grouping must be stated, and that is what the GROUP BY clause does.

In this way, the GROUP BY clause divides the table into groups, like cutting a cake into slices. The columns specified in the GROUP BY clause are called aggregate keys or grouping columns.

When the aggregation key contains NULL

Take the purchase unit price (purchase_price) as an example of aggregation key

SELECT purchase_price, COUNT(*)
  FROM product
 GROUP BY purchase_price;

Here the NULL values are gathered into a group of their own.

Where to write GROUP BY

The clauses must be written in a strict order; if it is not followed, the SQL statement cannot be executed. The order of the clauses that have appeared so far is:

  1. SELECT → 2. FROM → 3. WHERE → 4. GROUP BY

The first three clauses select and filter the data, and GROUP BY then processes the filtered data.

Using GROUP BY together with a WHERE clause

SELECT purchase_price, COUNT(*)
  FROM product
 WHERE product_type = '衣服'
 GROUP BY purchase_price;

Common mistakes

  1. Writing columns other than the aggregate key in the SELECT clause. When an aggregate function such as COUNT is used, any column name that appears in the SELECT clause must be a column specified in the GROUP BY clause (that is, an aggregate key).
  2. Using a column alias in the GROUP BY clause. An alias can be specified with AS in the SELECT clause, but standard SQL does not allow it in GROUP BY, because the DBMS processes the SELECT clause after the GROUP BY clause. (Some DBMSs, MySQL among them, accept it as an extension, but such statements are not portable.)
  3. Using an aggregate function in the WHERE clause. An aggregate function can only be applied once the set of rows has been determined, and WHERE is still in the process of determining that set, so this causes an error. Aggregate functions can be used in the SELECT, HAVING (discussed below), and ORDER BY clauses.

2.6 Using HAVING to get a specific group

We have learned how to aggregate by group. Now, how can we take only part of the grouped results?

After the table is grouped by GROUP BY, how can we take out only two of the groups?

WHERE does not work here, because the WHERE clause can only specify conditions on records (rows); it cannot specify conditions on groups (for example, "the group has 2 rows" or "the average value is 500").

Instead, use the HAVING clause after GROUP BY.

HAVING is written in the same way as WHERE.

Note that the HAVING clause is used together with the GROUP BY clause and restricts the grouped, aggregated results, while the WHERE clause restricts the data rows (including the grouping columns). The two have separate jobs and should not be confused.

HAVING features

The HAVING clause filters groups. It can contain constants, aggregate functions, and the column names specified in GROUP BY (the aggregate keys).

-- Constant
SELECT product_type, COUNT(*)
  FROM product
 GROUP BY product_type
HAVING COUNT(*) = 2;

Wrong form

-- Wrong (product_name is not one of the GROUP BY aggregate keys)
SELECT product_type, COUNT(*)
  FROM product
 GROUP BY product_type
HAVING product_name = '圆珠笔';
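
The division of labour between WHERE (filters rows before grouping) and HAVING (filters groups after aggregation) can be seen by running three queries over the same data. The sketch assumes Python's built-in sqlite3 module in place of MySQL; group_counts is a hypothetical helper, and the sale_price >= 1000 condition is chosen here only for contrast.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE product (product_id CHAR(4) NOT NULL PRIMARY KEY, "
    "product_type VARCHAR(32) NOT NULL, sale_price INTEGER)"
)
conn.executemany("INSERT INTO product VALUES (?, ?, ?)", [
    ("0001", "衣服", 1000), ("0002", "办公用品", 500),
    ("0003", "衣服", 4000), ("0004", "厨房用具", 3000),
    ("0005", "厨房用具", 6800), ("0006", "厨房用具", 500),
    ("0007", "厨房用具", 880), ("0008", "办公用品", 100),
])


def group_counts(sql):
    """Run a query returning (key, count) pairs and collect them in a dict."""
    return dict(conn.execute(sql).fetchall())


# Plain grouping: one result row per product type.
all_groups = group_counts(
    "SELECT product_type, COUNT(*) FROM product GROUP BY product_type")

# WHERE removes rows first, then the remaining rows are grouped.
rows_filtered = group_counts(
    "SELECT product_type, COUNT(*) FROM product "
    "WHERE sale_price >= 1000 GROUP BY product_type")

# HAVING removes whole groups after the counts have been computed.
groups_filtered = group_counts(
    "SELECT product_type, COUNT(*) FROM product "
    "GROUP BY product_type HAVING COUNT(*) = 2")

print(all_groups)
print(rows_filtered)
print(groups_filtered)
```

With WHERE, the 办公用品 group disappears because none of its rows survive the price filter; with HAVING, the 厨房用具 group disappears because its count is 4 rather than 2.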

2.7 Sorting query results

ORDER BY

Some scenarios need sorted results. In the Olympic Games, for example, the organizing committee ranks athletes by score in descending order to decide the gold, silver, and bronze medals. By default, the rows returned by an SQL statement come in no guaranteed order; to sort them, use the ORDER BY clause.

SELECT <column name 1>, <column name 2>, <column name 3>, ...
  FROM <table name>
 ORDER BY <sort key column 1> [ASC | DESC], <sort key column 2> [ASC | DESC], ...;

ASC sorts in ascending order and DESC in descending order. Ascending is the default, so ASC can be omitted.

The following code returns the query results sorted by sales price in descending order:

-- Descending order
SELECT product_id, product_name, sale_price, purchase_price
  FROM product
 ORDER BY sale_price DESC;

If several sort keys are needed, write each sort column with its ASC or DESC option in the ORDER BY clause, in order of priority:

-- Multiple sort keys
SELECT product_id, product_name, sale_price, purchase_price
  FROM product
 ORDER BY sale_price, product_id;

Note in particular: NULL cannot be compared with comparison operators, that is, it cannot be compared with text, numbers, dates, and so on. When the sort column contains NULL values, those rows are therefore gathered at the beginning or the end of the query result.

-- When the sort column contains NULL, the NULL rows are gathered at the beginning or the end
SELECT product_id, product_name, sale_price, purchase_price
  FROM product
 ORDER BY purchase_price;
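
Where the NULL rows land depends on the DBMS and on the sort direction. The sketch below shows the behaviour of SQLite through Python's built-in sqlite3 module, used here as a stand-in; other systems may place NULL at the opposite end, so the result should be read as an illustration rather than a rule. The helper prices is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE product (product_id CHAR(4) NOT NULL PRIMARY KEY, "
    "purchase_price INTEGER)"
)
conn.executemany(
    "INSERT INTO product VALUES (?, ?)",
    [("0001", 500), ("0002", 320), ("0006", None), ("0007", 790)],
)


def prices(order_by):
    """Return purchase_price values sorted by the given ORDER BY expression."""
    sql = "SELECT purchase_price FROM product ORDER BY " + order_by
    return [row[0] for row in conn.execute(sql)]


ascending = prices("purchase_price")        # SQLite puts NULL first here
descending = prices("purchase_price DESC")  # and last here

print(ascending)
print(descending)
```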

Aliases used in the ORDER BY clause

As mentioned above in GROUP BY, the alias defined in the SELECT clause cannot be used in the GROUP BY clause, but the alias can be used in the ORDER BY clause. Why not in GROUP BY but in ORDER BY?

This is because the logical execution order of a SELECT statement in SQL, including the HAVING clause, is:

FROM → WHERE → GROUP BY → HAVING → SELECT → ORDER BY

The execution sequence of SELECT is after the GROUP BY clause and before the ORDER BY clause.

By the time the ORDER BY clause is executed, the alias set in the SELECT clause is already known, but it is not yet known when the GROUP BY clause is executed. That is why an alias can be used in the ORDER BY clause but not in the GROUP BY clause.
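A minimal sketch of sorting by a SELECT alias follows. It assumes Python's sqlite3 module as a stand-in engine, and the table rows and the alias name total_price are hypothetical. Only the ORDER BY side is shown, because some engines accept aliases in GROUP BY as an extension.

```python
import sqlite3

# In-memory SQLite database with hypothetical rows (assumption: not MySQL).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE product (product_type TEXT, sale_price INTEGER)")
conn.executemany(
    "INSERT INTO product VALUES (?, ?)",
    [
        ("clothes", 1000),
        ("office", 500),
        ("clothes", 4000),
        ("kitchen", 3000),
        ("office", 100),
    ],
)

# total_price is defined in SELECT and is already known when ORDER BY runs.
alias_sorted_rows = conn.execute(
    "SELECT product_type, SUM(sale_price) AS total_price "
    "FROM product "
    "GROUP BY product_type "
    "ORDER BY total_price DESC"
).fetchall()

print(alias_sorted_rows)
```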

ORDER BY encounters NULL

In MySQL, NULL values are considered lower than any non-NULL value, so when the order is ASC (ascending) the NULL values appear first, and when the order is DESC (descending) they appear last.

If you want the rows containing NULL to appear first or last regardless of the sort direction, special handling is required.

Build the example table using the following code:

CREATE TABLE user (
    id INT NOT NULL AUTO_INCREMENT,
    name VARCHAR(50),
    date_login DATE,
    PRIMARY KEY (id)
);

INSERT INTO user(name, date_login) VALUES
(NULL,    '2017-03-12'), 
('john',   NULL), 
('david', '2016-12-24'), 
('zayne', '2017-03-02');

Since NULL sorts lower than any non-NULL value (it can be thought of as 0 or -∞), this default behavior needs special handling to achieve the desired result.

Generally, there are two requirements:

  • Sort the NULL values last, with all non-NULL values in ascending order.

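For numeric or date types, you can put a minus sign before the sort field and sort in descending order. Negation reverses the order of the non-NULL values (-1, -2, -3, ..., -∞), while NULL stays NULL and therefore sorts last under DESC.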

SELECT *
  FROM user
  ORDER BY -date_login DESC;
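The negation trick can be checked with a small sketch. It assumes Python's sqlite3 module and a hypothetical result table with an integer score column. A numeric column is used instead of date_login because SQLite stores dates as text and does not convert them to numbers the way MySQL does.

```python
import sqlite3

# In-memory SQLite database; table and column names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE result (name TEXT, score INTEGER)")
conn.executemany(
    "INSERT INTO result VALUES (?, ?)",
    [("a", 30), ("b", None), ("c", 10), ("d", 20)],
)

# -score reverses the numeric order; -NULL is still NULL, which sorts last under DESC.
scores_nulls_last = [
    row[0] for row in conn.execute("SELECT score FROM result ORDER BY -score DESC")
]

print(scores_nulls_last)
```

The non-NULL scores come out in ascending order and the NULL row is pushed to the end.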

For strings, including numeric strings, this method may not give the expected order; use the IS NULL comparison operator instead. The ISNULL() function is equivalent to the IS NULL comparison operator.

-- IS NULL
SELECT * FROM user 
 ORDER BY name IS NULL ASC,name ASC;
 
-- ISNULL()
SELECT * FROM user 
 ORDER BY ISNULL(name) ASC,name ASC;

The statements above first sort by ISNULL(name) in ascending order. ISNULL(name) is true (1) only when the name column is NULL, so those rows are sorted last; name ASC then sorts the non-NULL values in ascending order.

You can also use the COALESCE function to meet this requirement:

SELECT * FROM user 
 ORDER BY COALESCE(name, 'zzzzz') ASC;
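The two approaches can be compared side by side in a small sketch. It assumes Python's sqlite3 module as the engine and reuses the user table rows from above. SQLite has no ISNULL() function, so only the IS NULL form is shown.

```python
import sqlite3

# In-memory SQLite database loaded with the sample rows from the user table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE user (id INTEGER PRIMARY KEY, name TEXT, date_login TEXT)"
)
conn.executemany(
    "INSERT INTO user (name, date_login) VALUES (?, ?)",
    [
        (None, "2017-03-12"),
        ("john", None),
        ("david", "2016-12-24"),
        ("zayne", "2017-03-02"),
    ],
)

# name IS NULL evaluates to 0 for real names and 1 for NULL, so NULL rows sort last.
names_is_null = [
    row[0]
    for row in conn.execute("SELECT name FROM user ORDER BY name IS NULL ASC, name ASC")
]

# COALESCE swaps NULL for a sentinel string that sorts after the real names.
names_coalesce = [
    row[0]
    for row in conn.execute(
        "SELECT name FROM user ORDER BY COALESCE(name, 'zzzzz') ASC"
    )
]

print(names_is_null)
print(names_coalesce)
```

The COALESCE form depends on the sentinel sorting after every real value; a name that sorts after 'zzzzz' would end up behind the NULL rows, so the IS NULL form is the more robust of the two.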

  • Sort the NULL values first, with all non-NULL values in descending order.

For numeric or date types, you can put a minus sign before the sort field and sort in ascending order (-∞, ..., -3, -2, -1).

SELECT * FROM user 
 ORDER BY -date_login ASC;

For strings, including numeric strings, this method may not give the expected order; use the IS NOT NULL comparison operator instead. The expression !ISNULL() is equivalent to the IS NOT NULL comparison operator.

-- IS NOT NULL
SELECT * FROM user 
 ORDER BY name IS NOT NULL ASC,name DESC;

-- !ISNULL()
SELECT * FROM user 
 ORDER BY !ISNULL(name) ASC,name DESC;

The statements above first sort by !ISNULL(name) in ascending order. !ISNULL(name) is true (1) only when the name column is not NULL, so those rows are sorted after the NULL rows; name DESC then sorts the non-NULL values in descending order.

You can also use the COALESCE function to meet this requirement:

SELECT * FROM user 
 ORDER BY COALESCE(name, 'zzzzz') DESC;
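The NULL-first, descending case can be sketched the same way. This again assumes Python's sqlite3 module and the sample rows of the user table; the !ISNULL() form is MySQL-specific, so the sketch uses IS NOT NULL.

```python
import sqlite3

# In-memory SQLite database loaded with the sample rows from the user table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE user (id INTEGER PRIMARY KEY, name TEXT, date_login TEXT)"
)
conn.executemany(
    "INSERT INTO user (name, date_login) VALUES (?, ?)",
    [
        (None, "2017-03-12"),
        ("john", None),
        ("david", "2016-12-24"),
        ("zayne", "2017-03-02"),
    ],
)

# name IS NOT NULL evaluates to 0 for NULL and 1 otherwise, so NULL rows sort first.
names_nulls_first = [
    row[0]
    for row in conn.execute(
        "SELECT name FROM user ORDER BY name IS NOT NULL ASC, name DESC"
    )
]

# With DESC, the sentinel 'zzzzz' sorts before the real names.
names_coalesce_desc = [
    row[0]
    for row in conn.execute(
        "SELECT name FROM user ORDER BY COALESCE(name, 'zzzzz') DESC"
    )
]

print(names_nulls_first)
print(names_coalesce_desc)
```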

Practice questions

1. Please point out all the syntax errors in the following SELECT statement.

SELECT product_id, SUM(product_name)
-- This SELECT statement contains errors.
  FROM product 
 GROUP BY product_type 
 WHERE regist_date > '2009-09-01';

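The most obvious error is the clause order: WHERE filters rows and GROUP BY processes groups, so WHERE must not come after GROUP BY. The execution order should be FROM → WHERE → GROUP BY → SELECT. In addition, product_id appears in the SELECT clause but is not part of the GROUP BY key, and SUM is applied to the character column product_name, although SUM is meant for numeric columns.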
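As a rough check of the clause order, the sketch below runs a misordered and a reordered statement. It assumes Python's sqlite3 module as the engine, hypothetical product rows, and one possible correction (grouping by product_type and summing sale_price), which is not necessarily the intended answer.

```python
import sqlite3

# In-memory SQLite database with hypothetical rows (assumption: not MySQL).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE product (product_type TEXT, sale_price INTEGER, regist_date TEXT)"
)
conn.executemany(
    "INSERT INTO product VALUES (?, ?, ?)",
    [
        ("clothes", 1000, "2009-09-20"),
        ("office", 500, "2009-09-11"),
        ("clothes", 4000, "2008-04-28"),
        ("office", 100, "2009-11-11"),
    ],
)

# WHERE written after GROUP BY, as in the exercise: the parser rejects the statement.
try:
    conn.execute(
        "SELECT product_type, SUM(sale_price) FROM product "
        "GROUP BY product_type WHERE regist_date > '2009-09-01'"
    )
    wrong_order_rejected = False
except sqlite3.OperationalError:
    wrong_order_rejected = True

# One possible correction: filter rows first, then group them.
corrected_rows = conn.execute(
    "SELECT product_type, SUM(sale_price) FROM product "
    "WHERE regist_date > '2009-09-01' "
    "GROUP BY product_type ORDER BY product_type"
).fetchall()

print(wrong_order_rejected)
print(corrected_rows)
```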

2. Please write a SELECT statement to find the product types whose total sale price (the sale_price column) is greater than 1.5 times the total purchase price (the purchase_price column). The execution result is as follows.

product_type | sum  | sum 
-------------+------+------
衣服         | 5000 | 3300
办公用品     |  600 |  320

SELECT product_type, SUM(sale_price), SUM(purchase_price)
  FROM product
 GROUP BY product_type
HAVING SUM(sale_price) > SUM(purchase_price) * 1.5;

3. Earlier, we used a SELECT statement to select all the records in the product table. At the time we used the ORDER BY clause to specify the sort order, but I can't remember how it was specified. Work out the contents of the ORDER BY clause from the following execution result.

[Figure: execution result]

The dates need to appear in ascending order, which can be achieved by adding a minus sign to regist_date and then sorting in descending order.

Origin blog.csdn.net/weixin_44195690/article/details/131924795