SQL language
Whether you are a junior or an intermediate data analyst, SQL is a must-have skill for the job, and you should be able to write SQL by hand. To avoid forgetting what I learn, I am making SQL the first topic in my series of knowledge summaries.
Note: square brackets [] mark optional parts of the syntax.
1. Data Definition Language (DDL)
DDL covers creating, modifying, and deleting database objects (tables, views, indexes, etc.). The common commands are create, alter, and drop:
1.1 create
Function: used to create a database/table
Syntax:
create database/table name;
create table [if not exists] name (
`column_name` column_type [attributes] [index] [comment],
...
`column_name` column_type [attributes] [index] [comment]
)[table type][character set][comment];
Example:
## Create the database school
create database if not exists school;
## Create the table student
create table if not exists student(
id int(4) not null auto_increment comment 'student ID',
name varchar(30) not null comment 'student name',
gender varchar(2) not null comment 'gender',
class int(4) not null comment 'class',
primary key(id)
)engine=innodb default charset=utf8;
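The statements above are MySQL syntax. As a minimal sketch that runs without a MySQL server, the same table can be created with Python's built-in sqlite3 module. Using SQLite is an assumption made for portability: it has no create database, engine, or charset clauses, and `integer primary key autoincrement` stands in for MySQL's `int not null auto_increment`.

```python
import sqlite3

# In-memory SQLite database; it stands in for the MySQL "school" database.
conn = sqlite3.connect(":memory:")

# SQLite dialect of the student table (no engine/charset, no column comments).
conn.execute("""
    create table if not exists student (
        id     integer primary key autoincrement,  -- student ID
        name   varchar(30) not null,               -- student name
        gender varchar(2)  not null,               -- gender
        class  int         not null                -- class
    )
""")

# pragma table_info returns one row per column: (cid, name, type, notnull, default, pk).
columns = [row[1] for row in conn.execute("pragma table_info(student)")]
print(columns)
```

Running the create statement a second time is harmless because of `if not exists`.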
Data types
Data type | Description |
---|---|
integer(size) int(size) smallint(size) tinyint(size) | Hold whole numbers only. The number in parentheses sets the maximum number of digits (in MySQL it is only a display width and does not limit the stored range). |
decimal(size,d) numeric(size,d) | Hold numbers with a fractional part. "size" is the maximum total number of digits; "d" is the maximum number of digits to the right of the decimal point. |
char(size) | Holds a fixed-length character string (letters, numbers, and special characters). The number in parentheses is the length of the string. |
varchar(size) | Holds a variable-length character string (letters, numbers, and special characters). The number in parentheses is the maximum length of the string. |
date | Holds a date (year, month, day) |
Constraints
Constraint type | Description |
---|---|
not null | The column cannot store NULL values |
unique | Every row must have a distinct value in the column |
primary key | Unique identifier for each row; combines NOT NULL and UNIQUE. A column (or a combination of two or more columns) uniquely identifies each record, which makes a specific record easier and faster to find |
foreign key | Enforces referential integrity: values in one table must match values in another table. Example: foreign key (stu_id) references student (id) |
check | Values in the column must satisfy the given condition. Example: check(age>0) |
default | The value used when no value is assigned to the column |
auto_increment | Automatically incrementing value; starts at 1 with a step of 1 by default, and the initial value can be changed |
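To see the constraints reject bad data, here is a small sketch using Python's sqlite3 module (an assumption: SQLite instead of MySQL, and SQLite only enforces foreign keys after `pragma foreign_keys = on`). The `email` column and the `score` table are hypothetical names added for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("pragma foreign_keys = on")  # SQLite-specific: enable foreign key checks

conn.executescript("""
    create table student (
        id    integer primary key,
        name  varchar(30) not null,
        email varchar(50) unique,               -- hypothetical column
        age   int default 18 check (age > 0)
    );
    create table score (                        -- hypothetical table
        stu_id int not null,
        points int,
        foreign key (stu_id) references student (id)
    );
""")
conn.execute("insert into student (id, name, email) values (1, 'Ann', 'ann@example.com')")


def violates(sql):
    """Return True if the statement is rejected by a constraint."""
    try:
        conn.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True


results = {
    "not null": violates("insert into student (id, name) values (2, null)"),
    "unique": violates("insert into student (id, name, email) values (3, 'Bob', 'ann@example.com')"),
    "primary key": violates("insert into student (id, name) values (1, 'Cid')"),
    "check": violates("insert into student (id, name, age) values (4, 'Dan', -5)"),
    "foreign key": violates("insert into score (stu_id, points) values (99, 80)"),
}
# The first insert did not set age, so the default value applies.
default_age = conn.execute("select age from student where id = 1").fetchone()[0]
print(results, default_age)
```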
1.2 alter
Function: adds, modifies, or deletes columns in an existing table; it can also rename the table
Syntax:
alter table table_name add/modify/change/rename/drop name;
-- rename: change the table name
alter table table_name rename [to] new_table_name;
alter table student rename to student2021;
-- add: add one or more columns
alter table table_name add column_name datatype;
alter table student add(
age int(2) not null comment 'age',
address varchar(30) not null comment 'address'
);
-- modify: change a column's type and constraints
alter table table_name modify column_name datatype...;
alter table student modify name varchar(10) default 'unknown';
-- change: rename a column
alter table table_name change old_name new_name datatype...; -- the new column needs a complete definition
alter table student change name stu_name char(4);
-- drop: delete a column; drop removes the structure together with the constraints and indexes that depend on it, and cannot be rolled back after execution
alter table table_name drop column column_name;
alter table student drop column age;
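The rename and add forms can be tried locally with Python's sqlite3 module. This is only a sketch under the assumption that SQLite is an acceptable stand-in: SQLite adds one column per statement and has no modify or change clause, so those MySQL forms are not shown here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table student (id integer primary key, name varchar(30) not null)")
conn.execute("insert into student (id, name) values (1, 'Ann')")

# rename: change the table name
conn.execute("alter table student rename to student2021")

# add: one column per statement in SQLite; existing rows receive the default value
conn.execute("alter table student2021 add column age int not null default 0")
conn.execute("alter table student2021 add column address varchar(30) not null default ''")

columns = [row[1] for row in conn.execute("pragma table_info(student2021)")]
row = conn.execute("select id, name, age, address from student2021").fetchone()
print(columns, row)
```

The existing row keeps its data and picks up the defaults for the two new columns.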
1.3 truncate
Function: removes all rows from the table while leaving the table structure, constraints, and indexes unchanged. Like drop, it cannot be rolled back once executed.
Syntax:
truncate [table] table_name;
truncate student;
2. Data Manipulation Language (DML)
DML operates on records. The common commands insert, update, and delete add, modify, and remove records respectively.
2.1 insert
Function: adds records to the table
Syntax:
insert into table_name values (value1, value2...);
insert into table_name (column1, column2, ...) values (value1, value2...);
insert into student (id, name) values (10, '张三');
2.2 update
Function: modifies data in the table
Syntax:
update table_name set column_name = new_value where column_name2 = some_value;
update student set name = '小李', age = 12 where id = 10;
2.3 delete
Function: deletes rows from the table
Syntax:
delete from table_name where column_name = some_value;
delete from student where id = 10;
Note:
delete can remove some or all of the rows and keeps the table structure. truncate always removes all rows and also keeps the table structure, but it is faster.
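The three DML commands can be exercised end to end with Python's sqlite3 module. This is a sketch under stated assumptions: SQLite replaces MySQL, the rows other than 张三 are made up for illustration, and because SQLite has no truncate, a delete without where is used to empty the table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table student (id integer primary key, name varchar(30), age int)")

# insert: "?" placeholders let the driver pass values safely instead of string formatting
conn.executemany(
    "insert into student (id, name, age) values (?, ?, ?)",
    [(10, "张三", 11), (11, "小王", 12), (12, "小赵", 13)],  # last two rows are hypothetical
)

# update: only the row matched by the where clause changes
updated = conn.execute(
    "update student set name = ?, age = ? where id = ?", ("小李", 12, 10)
).rowcount

# delete with where removes part of the data
deleted = conn.execute("delete from student where id = ?", (10,)).rowcount

# delete without where removes every remaining row but keeps the table itself
conn.execute("delete from student")
remaining = conn.execute("select count(*) from student").fetchone()[0]
conn.commit()
print(updated, deleted, remaining)
```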
3. Data Query Language (DQL)
Used to query records in database tables
3.1 Syntax
select * from table_name where condition;
select id, name from student where id = 10;
3.2 Common clauses and functions
Clause / function | Description |
---|---|
from | Selects the table to read from |
where | Filters for records that meet the given conditions |
group by | Groups the result set by one or more columns; usually combined with aggregate functions |
having | Filters the groups after grouping; usually combined with aggregate functions, because where cannot be used with aggregate functions |
order by | Sorts the result set by the given columns; ascending (asc) is the default, desc sorts in descending order |
limit n[,m] | limit n returns the first n records; limit n, m skips the first n records and returns the next m |
sum() avg() count() min() max() | Common aggregate functions |
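The clauses in this table can be combined in one query. The sketch below uses Python's sqlite3 module with a made-up `score` column and sample rows (both are assumptions for illustration); SQLite accepts the same `limit n, m` form as MySQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table student (id int, name text, class int, score int)")
conn.executemany(
    "insert into student values (?, ?, ?, ?)",
    [(1, "A", 1, 90), (2, "B", 1, 70), (3, "C", 2, 60), (4, "D", 2, 50), (5, "E", 3, 85)],
)

# Average score per class, keeping only classes whose average is at least 60,
# sorted from the highest average to the lowest.
class_avg = conn.execute("""
    select class, avg(score) as avg_score
    from student
    group by class
    having avg(score) >= 60
    order by avg_score desc
""").fetchall()

# limit 1, 2: skip the first row, then return the next two rows
page = conn.execute("select id from student order by score desc limit 1, 2").fetchall()
print(class_avg, page)
```

Class 2 averages 55, so having removes it; where could not express that filter because it runs before the groups exist.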
3.3 Advanced operators and functions
Operator / function | Description |
---|---|
like | Searches a column for the specified pattern in the WHERE clause |
in | Allows multiple values to be specified in the WHERE clause |
_ or % wildcard | _ matches exactly one character; % matches any sequence of zero or more characters |
between A and B | Selects values in the range between the two values. The values can be numbers, text, or dates. Both A and B are included |
inner join on | Intersection: only the rows from both tables that satisfy the condition |
left join on | Keeps every row of the left table; columns from the right table are null where there is no match |
right join on | Keeps every row of the right table; columns from the left table are null where there is no match |
full join on | Keeps every row that appears in either the left or the right table (MySQL does not support full join; a union of a left join and a right join gives the same result) |
union | Stacks results vertically: merges the result sets of two or more SELECT statements and removes duplicate rows; can also be used to convert columns to rows (a difficult, frequently tested topic) |
upper() | Converts the value of the field to uppercase |
lower() | Converts the value of the field to lowercase |
mid() | Extracts a substring, including the start position. mid(column_name,start[,length]) |
len() | Returns the length of the field (in MySQL the function is length() or char_length()) |
round() | Rounds a numeric field to the specified number of decimal places. round(column_name,decimals) |
now() | Returns the current system date and time |
format() | Formats how the field is displayed. format(column_name,format), for example format(Now(),'YYYY-MM-DD') (in MySQL, dates are formatted with date_format()) |
case when then end | Conditional expression; can also be used to convert rows to columns (a difficult, frequently tested topic). case column_name when value then result [else default] end |
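A few of these operators are easier to remember after running them once. The sketch below uses Python's sqlite3 module with hypothetical `student` and `score` tables and sample rows (all assumptions for illustration). It checks that between includes both endpoints, contrasts inner join with left join, and uses case when to pivot rows into columns.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table student (id int, name text);
    create table score (stu_id int, subject text, points int);
    insert into student values (1, 'Ann'), (2, 'Bob'), (3, 'Cid');
    insert into score values (1, 'math', 90), (1, 'english', 80), (2, 'math', 60);
""")

# between is inclusive on both ends: 60 and 80 are both returned
between_rows = conn.execute(
    "select points from score where points between 60 and 80 order by points"
).fetchall()

# like with %: names that start with 'A'
like_rows = conn.execute("select name from student where name like 'A%'").fetchall()

# inner join keeps only students that have a score row
inner_names = conn.execute("""
    select distinct s.name from student s
    inner join score c on c.stu_id = s.id
    order by s.name
""").fetchall()

# left join keeps every student; missing matches show up as NULL (None in Python)
left_rows = conn.execute("""
    select s.name, c.points from student s
    left join score c on c.stu_id = s.id and c.subject = 'math'
    order by s.id
""").fetchall()

# case when pivots one row per subject into one column per subject
pivot = conn.execute("""
    select s.name,
           max(case when c.subject = 'math' then c.points end) as math,
           max(case when c.subject = 'english' then c.points end) as english
    from student s
    inner join score c on c.stu_id = s.id
    group by s.id, s.name
    order by s.id
""").fetchall()
print(between_rows, like_rows, inner_names, left_rows, pivot)
```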
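3.4 Window functions
Function type | Window function | Description |
---|---|---|
Ranking functions | row_number() | Numbers the rows sequentially in sort order, with no ties. row_number() over ( [ partition by group_column ] order by sort_column desc/asc) |
Ranking functions | rank() | Ranks the rows; equal scores share the same rank, and the ranks after a tie are skipped. rank() over ( [ partition by group_column ] order by sort_column desc/asc) |
Ranking functions | dense_rank() | Always returns consecutive rank values, even after ties. dense_rank() over ( [ partition by group_column ] order by sort_column desc/asc) |
Offset functions | lag() | Returns the value of columns_name from the row offset rows above the current row. lag(columns_name [, offset, default_values ] ) over ( [ partition by group_column ] order by sort_column desc/asc) |
Offset functions | lead() | Returns the value of columns_name from the row offset rows below the current row. lead(columns_name [, offset, default_values ] ) over ( [ partition by group_column ] order by sort_column desc/asc) |
First/last functions | first_value() | Returns the value of columns_name from the first row of the ordered row set. first_value(columns_name) over ( [ partition by group_column ] order by sort_column desc/asc) |
First/last functions | last_value() | Returns the value of columns_name from the last row of the ordered row set. last_value(columns_name) over ( [ partition by group_column ] order by sort_column desc/asc) |
Distribution functions | percent_rank() | Returns the percentage rank of each row for a column or combination of columns; starts at 0, and tied rows share a value. percent_rank() over ( [ partition by group_column ] order by sort_column desc/asc) |
Distribution functions | cume_dist() | Computes the cumulative distribution: the number of rows whose value is less than or equal to the current row's value, divided by the total number of rows; tied rows share a value. cume_dist() over ( [ partition by group_column ] order by sort_column desc/asc) |
Other functions | nth_value() | Returns the value from the Nth row of the result set. nth_value(columns_name, n) over ( [ partition by group_column ] order by sort_column desc/asc) |
Other functions | ntile() | Splits the ordered rows into n groups. ntile(n) over ( [ partition by group_column ] order by sort_column desc/asc) |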
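The difference between the three ranking functions, and the direction of lag and lead, is clearest on data with a tie. This sketch uses Python's sqlite3 module and assumes the bundled SQLite is version 3.25 or later, which is when window functions were added. The `score` table and its rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table score (name text, points int)")
conn.executemany(
    "insert into score values (?, ?)",
    [("A", 90), ("B", 90), ("C", 80), ("D", 70)],  # A and B are tied
)

# name is added as a tie-breaker where a deterministic row order is needed
rows = conn.execute("""
    select name,
           points,
           row_number() over (order by points desc, name) as rn,
           rank()       over (order by points desc)       as rk,
           dense_rank() over (order by points desc)       as drk,
           lag(points, 1)  over (order by points desc, name) as prev_points,
           lead(points, 1) over (order by points desc, name) as next_points
    from score
    order by points desc, name
""").fetchall()

for row in rows:
    print(row)
```

After the tie, rank() jumps from 1 to 3 while dense_rank() continues with 2, and lag/lead return NULL (None) where no previous or next row exists.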
3.5 Execution order of a select statement:
from -> where -> group by -> having -> select -> order by -> limit
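One way to internalize this order is to replay a query by hand, one clause at a time. The sketch below runs a query in SQLite through Python's sqlite3 module and then repeats the same steps in plain Python in the logical order listed above. The table, data, and thresholds are hypothetical, and the Python trace illustrates the logical order only, not how a database engine actually executes the query.

```python
import sqlite3
from itertools import groupby

# (id, class, score) sample rows, made up for illustration
data = [(1, 1, 90), (2, 1, 70), (3, 2, 60), (4, 2, 40), (5, 3, 85), (6, 3, 30)]

conn = sqlite3.connect(":memory:")
conn.execute("create table student (id int, class int, score int)")
conn.executemany("insert into student values (?, ?, ?)", data)

sql_result = conn.execute("""
    select class, count(*) as n
    from student
    where score >= 50
    group by class
    having count(*) >= 2
    order by n desc, class
    limit 1
""").fetchall()

# The same query traced step by step, in logical execution order.
rows = list(data)                                            # from
rows = [r for r in rows if r[2] >= 50]                       # where
rows.sort(key=lambda r: r[1])                                # groupby needs sorted input
groups = {k: list(g) for k, g in groupby(rows, key=lambda r: r[1])}  # group by
groups = {k: g for k, g in groups.items() if len(g) >= 2}    # having
selected = [(k, len(g)) for k, g in groups.items()]          # select
selected.sort(key=lambda t: (-t[1], t[0]))                   # order by
manual_result = selected[:1]                                 # limit
print(sql_result, manual_result)
```

Classes 2 and 3 each start with two students, but where removes one low score from each before grouping, so only class 1 still has two rows when having is applied.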
4. Closing remarks
This post is a summary of the basics; the next article will walk through hands-on practice.