1. Review of the Previous Lesson

https://blog.csdn.net/SparkOnYarn/article/details/104888513
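Use show processlist; to view the threads currently running in MySQL. In production, the vast majority of statements are select statements. Table design should follow industry standards and earlier experience (for example, whether a table needs an auto-increment column). Keep in mind that a large number of tables on one machine can slow queries down. Every table should include a modification-time column when it is created. When inserting data, column names and values must correspond one to one. An update without a where condition updates every row in the table.
For the MySQL instance used by CDH, the binlog must be enabled. In production MySQL runs in a master-slave architecture: master -> writes, slave -> reads.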

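2. MySQL Syntax, Part 2

2.1 Basic SQL Syntax

1. Conditional queries: find the rows where age is under 20. The comparison operators that apply to age are >, >=, <, <=, =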

select * from ruozedata.rzdata where age < 20;

2. Find the rows whose name is ruoze:

select * from ruozedata.rzdata where name = 'ruoze';

3. Find the rows whose name is not ruoze:

select * from ruozedata.rzdata where name != 'ruoze';

4. Multiple conditions: name is ruoze and age is greater than 20:

select * from ruozedata.rzdata where age > 20 and name = 'ruoze';

5. Name is ruoze or age is greater than 20:

select * from ruozedata.rzdata where age > 20 or name = 'ruoze';

6. Names that contain the letter o:

select * from ruozedata.rzdata where name like '%o%';

7. Names that start with j (j comes first, followed by any characters):

select * from ruozedata.rzdata where name like 'j%';

8. Names that end with e:

select * from ruozedata.rzdata where name like '%e';

9. Names whose third letter is o (put two placeholders in front; _ matches exactly one character):

select * from ruozedata.rzdata where name like '__o%';

10. Names whose third letter is not o:

select * from ruozedata.rzdata where name not like '__o%';
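
The like patterns above can be tried out without a MySQL server. The following is a minimal sketch, assuming SQLite (through Python's sqlite3 module) as a stand-in for MySQL; the four sample rows and the helper names() are made up for illustration and are not the course's rzdata table.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; the rows are made-up sample data.
conn = sqlite3.connect(":memory:")
conn.execute("create table rzdata (name text, age int)")
conn.executemany(
    "insert into rzdata values (?, ?)",
    [("ruoze", 18), ("jepson", 25), ("john", 30), ("amy", 22)],
)

def names(pattern):
    """Return the names matching a LIKE pattern, sorted alphabetically."""
    rows = conn.execute(
        "select name from rzdata where name like ? order by name", (pattern,)
    )
    return [r[0] for r in rows]

contains_o = names("%o%")     # 'o' anywhere in the name
starts_with_j = names("j%")   # first letter is 'j'
ends_with_e = names("%e")     # last letter is 'e'
third_is_o = names("__o%")    # two single-character placeholders, then 'o'
print(contains_o, starts_with_j, ends_with_e, third_is_o)
```

Only ruoze matches '__o%', because % matches any number of characters while each _ matches exactly one.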

2.2 union Queries, join, and Sum/Count Statistics in MySQL

1. Data preparation:

1. Create table a
create table a (
	id int(5) not null auto_increment primary key,
	name varchar(20),
	age int(3)
);
insert into a(name,age) values ('john',24);
insert into a(name,age) values ('sail',30);
insert into a(name,age) values ('fox',40);

Create table b
create table b (
	id int(5) not null auto_increment primary key,
	name varchar(20),
	city varchar(40)
);
insert into b(name,city) values ('john','suzhou');
insert into b(name,city) values ('sail','beijing');
insert into b(name,city) values ('ron','yindu');

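2. union queries

1. union: combines both result sets and removes duplicate rows:
select id,name from a
union
select id,name from b;

// Result:
1	john
2	sail
3	fox
3	ron

2. union all: keeps every row, including duplicates:
select id,name from a
union all
select id,name from b;

// Result:
1	john
2	sail
3	fox
1	john
2	sail
3	ron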
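
The difference between the two can be reproduced with a short script. This is a sketch only, assuming SQLite (through Python's sqlite3 module) in place of MySQL, loaded with the rows that appear in the results above; integer primary key plays the role of auto_increment here.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; rows are the ones shown in the results above.
conn = sqlite3.connect(":memory:")
conn.executescript("""
create table a (id integer primary key, name text, age int);
create table b (id integer primary key, name text, city text);
insert into a(name, age) values ('john', 24), ('sail', 30), ('fox', 40);
insert into b(name, city) values ('john', 'suzhou'), ('sail', 'beijing'), ('ron', 'yindu');
""")

# union removes rows that are identical in every selected column
union_rows = conn.execute(
    "select id, name from a union select id, name from b order by id, name"
).fetchall()

# union all simply appends the second result set to the first
union_all_rows = conn.execute(
    "select id, name from a union all select id, name from b"
).fetchall()

print(union_rows)
print(len(union_all_rows))
```

Rows (1, john) and (2, sail) exist in both tables, so union returns 4 rows while union all returns all 6.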

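3. left join

The left table is the driving table: every one of its rows is kept and matched against the right table, and rows without a match get NULL in the right table's columns:

select a.*,b.* from a
left join b
on a.name=b.name;

// Result: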
id	name	age	id	name	city
1	john	24	1	john	suzhou
2	sail	30	2	sail	beijing
3	fox	40	[NULL]	[NULL]	[NULL]

right join

The right table is the driving table: every one of its rows is kept and matched against the left table:

select a.*,b.* from a
right join b
on a.name=b.name;

// Result:
id	name	age	id	name	city
1	john	24	1	john	suzhou
2	sail	30	2	sail	beijing
[NULL]	[NULL]	[NULL]	3	ron	yindu

4. inner join: join on name; only rows that exist in both tables are shown, the rest are dropped:

select a.*,b.* from a
inner join b
on a.name=b.name;

// Result:
id	name	age	id	name	city
1	john	24	1	john	suzhou
2	sail	30	2	sail	beijing
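
The three joins can be compared side by side in a small script. This is a sketch under the assumption that SQLite (through Python's sqlite3 module) behaves like MySQL for these queries; the rows are the ones shown in the results above, and the helper run() is hypothetical. Since older SQLite versions lack right join, it is written here as a left join with the tables swapped, which returns the same rows.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; rows are the ones shown in the results above.
conn = sqlite3.connect(":memory:")
conn.executescript("""
create table a (id integer primary key, name text, age int);
create table b (id integer primary key, name text, city text);
insert into a(name, age) values ('john', 24), ('sail', 30), ('fox', 40);
insert into b(name, city) values ('john', 'suzhou'), ('sail', 'beijing'), ('ron', 'yindu');
""")

def run(sql):
    """Execute a query and return all rows as a list of tuples."""
    return conn.execute(sql).fetchall()

# Every row of a is kept; fox has no match in b, so city is NULL (None in Python)
left_rows = run(
    "select a.name, a.age, b.city from a left join b on a.name = b.name order by a.id"
)
# Only names present in both tables survive
inner_rows = run(
    "select a.name, a.age, b.city from a inner join b on a.name = b.name order by a.id"
)
# Equivalent of "a right join b": every row of b is kept
right_rows = run(
    "select a.name, a.age, b.city from b left join a on a.name = b.name order by b.id"
)
print(left_rows, inner_rows, right_rows, sep="\n")
```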


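5. Sum, count, distinct, and subqueries

1. Sum of ages
select sum(age) from ruozedata.a;

2. Number of rows with a non-NULL age
select count(age) from ruozedata.a;

3. count(*) and count(0) both count every row (the 0 is a constant, not a reference to the first column), while count(name) counts only the rows where name is not NULL. On large tables, use count(*) sparingly.

4. When computing something from age, first decide whether you need the total or the number of rows: sum or count.

5. Counting name values without duplicates:
select count(distinct name) from b;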
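
How the count variants treat NULL and duplicates is easy to check on a tiny table. The sketch below assumes SQLite (through Python's sqlite3 module) as a stand-in for MySQL; the table t and its four rows are made up for illustration.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; table t is made-up sample data
# with one duplicated name, one NULL name, and one NULL age.
conn = sqlite3.connect(":memory:")
conn.executescript("""
create table t (id integer primary key, name text, age int);
insert into t(name, age) values ('john', 24), ('john', 26), ('sail', null), (null, 18);
""")

row = conn.execute("""
select count(*),              -- all rows
       count(0),              -- constant expression: also all rows
       count(name),           -- rows where name is not NULL
       count(age),            -- rows where age is not NULL
       count(distinct name),  -- different non-NULL names
       sum(age)               -- NULL ages are skipped
from t
""").fetchone()
print(row)
```

count(0) returns the same value as count(*), because a constant is never NULL; only count(column) skips NULL values.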

6. Sum of ages per name, for rows with age < 30. The two statements below are equivalent:

select name,sum(age) from ( select * from a where age < 30) as t group by name;
select name,sum(age) from a where age < 30 group by name;

7. Names whose total age is below 30

First form: select name,sum(age) from a group by name having sum(age) <30;
Second form, using a subquery:

select * from (
select name,sum(age) as sumage from a 
group by name
) as t 
where t.sumage < 30;

8. Sorting by age, descending and ascending:

select * from ruozedata.a order by age desc limit 2; -- descending, first 2 rows only
select * from ruozedata.a order by age asc; -- ascending

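Summary:

where, group by, having, order by, and limit must be written in a fixed order:
where always comes first, followed by group by and having; order by is second to last, and limit is always last.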
```
  select XXX from a 
  where XXX
  group by XXX
  having XXX
  order by XXX
  limit 2;
```
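
The clause order also reflects what each clause filters: where filters rows before grouping, having filters groups after aggregation. Here is a minimal sketch, assuming SQLite (through Python's sqlite3 module) in place of MySQL and a made-up table a in which john appears twice.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; the rows are made-up sample data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
create table a (id integer primary key, name text, age int);
insert into a(name, age) values ('john', 24), ('john', 26), ('sail', 18), ('fox', 40);
""")

# where removes single rows (fox, 40) before the groups are built
where_rows = conn.execute(
    "select name, sum(age) from a where age < 30 group by name order by name"
).fetchall()

# having removes whole groups after sum(age) has been computed
having_rows = conn.execute(
    "select name, sum(age) from a group by name having sum(age) < 30 order by name"
).fetchall()

# All clauses together, in the required order
full_rows = conn.execute("""
select name, sum(age) as total from a
where age < 40
group by name
having sum(age) > 20
order by total desc
limit 1
""").fetchall()
print(where_rows, having_rows, full_rows)
```

With where, john is kept with a total of 50 because each of his rows is under 30; with having, john is dropped because his total is not.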

3. SQL Practice

use ruozedata;

create table dept (
deptno numeric(2),
dname varchar(14),
loc varchar(13)
);

insert into dept values (10, 'ACCOUNTING', 'NEW YORK');
insert into dept values (20, 'RESEARCH', 'DALLAS');
insert into dept values (30, 'SALES', 'CHICAGO');
insert into dept values (40, 'OPERATIONS', 'BOSTON');

create table salgrade (
grade numeric,
losal numeric,
hisal numeric
);

insert into salgrade values (1, 700, 1200);
insert into salgrade values (2, 1201, 1400);
insert into salgrade values (3, 1401, 2000);
insert into salgrade values (4, 2001, 3000);
insert into salgrade values (5, 3001, 9999);

create table emp (
empno numeric(4) not null,
ename varchar(10),
job varchar(9),
mgr numeric(4),
hiredate datetime,
sal numeric(7, 2),
comm numeric(7, 2),
deptno numeric(2)
);

insert into emp values (7369, 'SMITH', 'CLERK', 7902, '1980-12-17', 800, null, 20);
insert into emp values (7499, 'ALLEN', 'SALESMAN', 7698, '1981-02-20', 1600, 300, 30);
insert into emp values (7521, 'WARD', 'SALESMAN', 7698, '1981-02-22', 1250, 500, 30);
insert into emp values (7566, 'JONES', 'MANAGER', 7839, '1981-04-02', 2975, null, 20);
insert into emp values (7654, 'MARTIN', 'SALESMAN', 7698, '1981-09-28', 1250, 1400, 30);
insert into emp values (7698, 'BLAKE', 'MANAGER', 7839, '1981-05-01', 2850, null, 30);
insert into emp values (7782, 'CLARK', 'MANAGER', 7839, '1981-06-09', 2450, null, 10);
insert into emp values (7788, 'SCOTT', 'ANALYST', 7566, '1982-12-09', 3000, null, 20);
insert into emp values (7839, 'KING', 'PRESIDENT', null, '1981-11-17', 5000, null, 10);
insert into emp values (7844, 'TURNER', 'SALESMAN', 7698, '1981-09-08', 1500, 0, 30);
insert into emp values (7876, 'ADAMS', 'CLERK', 7788, '1983-01-12', 1100, null, 20);
insert into emp values (7900, 'JAMES', 'CLERK', 7698, '1981-12-03', 950, null, 30);
insert into emp values (7902, 'FORD', 'ANALYST', 7566, '1981-12-03', 3000, null, 20);
insert into emp values (7934, 'MILLER', 'CLERK', 7782, '1982-01-23', 1300, null, 10);

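# Find the employee number and name of every employee in department 30:
select empno,ename from emp where deptno=30;

# Find the details of all managers in department 10 and all salesmen in department 20:
select * from emp where (deptno=10 and job = 'MANAGER') or (deptno=20 and job='SALESMAN');

# List all employees sorted by salary in descending order; ties are sorted by hire date:
select * from emp order by sal desc,hiredate asc;

# For employees whose salary plus commission exceeds 1500, list each job and the number of employees doing it:
select * from emp where (sal+comm) > 1500;
select * from emp;
# Problem: NULL plus any number is NULL, so rows with a NULL comm are dropped by the filter above
select *, sal+comm, sal+ifnull(comm,0) from emp;

select job,count(job) from emp
where (sal+ifnull(comm,0)) > 1500
group by job;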
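
The NULL problem can be seen on four of the emp rows. This is a sketch, assuming SQLite (through Python's sqlite3 module) as a stand-in for MySQL; ifnull exists in both, and the table is cut down to the columns the filter needs.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; four rows copied from the emp data above.
conn = sqlite3.connect(":memory:")
conn.execute("create table emp (ename text, job text, sal numeric, comm numeric)")
conn.executemany(
    "insert into emp values (?, ?, ?, ?)",
    [
        ("SMITH", "CLERK", 800, None),
        ("ALLEN", "SALESMAN", 1600, 300),
        ("WARD", "SALESMAN", 1250, 500),
        ("JONES", "MANAGER", 2975, None),
    ],
)

# NULL + number is NULL, and NULL > 1500 is never true
smith_total = conn.execute(
    "select sal + comm from emp where ename = 'SMITH'"
).fetchone()[0]

naive = [r[0] for r in conn.execute(
    "select ename from emp where sal + comm > 1500 order by ename"
)]
fixed = [r[0] for r in conn.execute(
    "select ename from emp where sal + ifnull(comm, 0) > 1500 order by ename"
)]
print(smith_total, naive, fixed)
```

JONES earns 2975 but disappears from the naive query because comm is NULL; wrapping comm in ifnull(comm, 0) brings the row back.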

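For illustration, having can also filter groups on an aggregate of other columns; this version keeps only the jobs whose lowest salary plus commission exceeds 1500:

select job,count(job) from emp
group by job
having min(sal+ifnull(comm,0)) > 1500;

select * from emp where job='salesman';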

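# Summary: every non-aggregated column in the select list must also appear in group by:
select xxx,yyy,count(job) from emp
group by xxx,yyy;

# List the names of the employees who work in the sales department, assuming the department number is not known:
select ename from emp where deptno =
(select deptno from dept where dname='SALES');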

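# List the highest salary, lowest salary, and headcount for each job:
select
job,max(sal+ifnull(comm,0)),min(sal+ifnull(comm,0)),count(empno)
from emp
group by job;

# For all employees paid more than the company average, list employee number, name, department name, manager, salary, and salary grade:

select avg(sal+ifnull(comm,0)) from emp;

The employee number, name, and salary come straight from emp; the department name is looked up through deptno, and the salary grade requires joining another table: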

Step 1:

select
empno,
ename,
deptno,
mgr,
sal+ifnull(comm,0)
from emp
where (sal+ifnull(comm,0)) > (select avg(sal+ifnull(comm,0)) from emp);

Step 2: adds one column compared with step 1, the department name:

select
e.empno,
e.ename,
e.deptno,d.dname,
e.mgr,
e.sal+ifnull(e.comm,0)
from emp as e
left join dept as d on e.deptno=d.deptno
where (e.sal+ifnull(e.comm,0)) > (select avg(sal+ifnull(comm,0)) from emp);

Step 3: to look up the manager, join the emp table to itself:

select
e.empno,
e.ename,
e.deptno,d.dname,
e.mgr,m.ename as mgr_name,
e.sal+ifnull(e.comm,0)
from emp as e
left join dept as d on e.deptno=d.deptno
left join emp m on e.mgr=m.empno
where (e.sal+ifnull(e.comm,0)) > (select avg(sal+ifnull(comm,0)) from emp);

Step 4: add the salary grade from the salgrade table:

select
e.empno,
e.ename,
e.deptno,d.dname,
e.mgr,m.ename as mgr_name,
e.sal+ifnull(e.comm,0),
s.grade
from emp as e
left join dept as d on e.deptno=d.deptno
left join emp m on e.mgr=m.empno
left join salgrade s on (e.sal+ifnull(e.comm,0)) between s.losal and s.hisal
where (e.sal+ifnull(e.comm,0)) > (select avg(sal+ifnull(comm,0)) from emp);
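
The self-join and the range join on salgrade are the two less familiar pieces of this query. The sketch below isolates them, assuming SQLite (through Python's sqlite3 module) as a stand-in for MySQL and using four emp rows plus the salgrade rows from above; the alias pay is introduced for readability only.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; rows copied from the emp and salgrade data above.
conn = sqlite3.connect(":memory:")
conn.executescript("""
create table emp (empno int, ename text, mgr int, sal numeric, comm numeric);
create table salgrade (grade int, losal numeric, hisal numeric);
insert into emp values
  (7499, 'ALLEN', 7698, 1600, 300),
  (7566, 'JONES', 7839, 2975, null),
  (7698, 'BLAKE', 7839, 2850, null),
  (7839, 'KING',  null, 5000, null);
insert into salgrade values
  (1, 700, 1200), (2, 1201, 1400), (3, 1401, 2000), (4, 2001, 3000), (5, 3001, 9999);
""")

rows = conn.execute("""
select e.ename,
       m.ename as mgr_name,                 -- manager name from the self-join
       e.sal + ifnull(e.comm, 0) as pay,
       s.grade
from emp e
left join emp m on e.mgr = m.empno          -- emp joined to itself
left join salgrade s                        -- range (non-equi) join
  on e.sal + ifnull(e.comm, 0) between s.losal and s.hisal
order by e.empno
""").fetchall()
for r in rows:
    print(r)
```

KING has no manager, so the left join leaves mgr_name as NULL instead of dropping the row. Because the grade ranges are whole numbers (1200 then 1201), a fractional pay such as 1200.50 would match no grade and come back with a NULL grade.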

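List the name, salary, and department name of every employee who is paid more than all employees working in department 30:

select sal+ifnull(comm,0) from emp where deptno=30;

# all is required here: > all means greater than every value returned by the subquery, while > any means greater than at least one of them
select
e.ename,e.sal+ifnull(e.comm,0),
d.deptno,d.dname
from emp e,dept d
where e.deptno=d.deptno
and e.sal+ifnull(e.comm,0) > all(select sal+ifnull(comm,0) from emp where deptno=30);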

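4. Homework for This Lesson

1. The various SQL where conditions
2. SQL group by
3. The three kinds of SQL join
4. The 9 SQL statements shared in 04txt
5. Write up the tips shared in this lesson
6. The supplementary materials folder
7. Bonus SQL video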

Spark on yarn


剑指数据仓库-SQL02