Data Analysis MySQL Learning

Reference Course: Data Analysis by Senior Brother Dai

Original notes in Mubu outline format: Brother Dai's data analysis enlightenment course: SQL basic syntax + operating principles + cloud database construction.opml, extraction code: jb27

Basic syntax

Syntax structure: select--from--where--group by--having--order by--limit

Running order: from--where--group by--having--select--order by--limit

select

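select field_name from table_name             #field_name determines which fields are displayed; table_name specifies the data source of the query
                                              #separate select and from with spaces; a single statement needs no trailing semicolon, but with several statements it is best to end each with one
select * from table_name                      #query all fields (all columns)
select name as country_name, population pop from table_name  #add aliases to fields; as is optional
select distinct field_name, field_name from table_name  #with several fields, distinct must come right after select and cannot be placed between fields
                                              #distinct removes duplicate rows formed by the combination of these fields
select name, gdp, population, gdp/population gdp_per_capita from world  #simple calculations are allowed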

where

select field_name from table_name where expression      #expression has the form: field_name operator value; it limits the result to rows that meet the condition
select name,population from world where name='Germany'  #query the population of Germany
SELECT name,gdp FROM world WHERE gdp between 2550010 and 255001000
select name,population from world where name in ('Sweden', 'Norway', 'Denmark')

Note that is null is used to query for null values (null), which are not equal to 0 or to an empty string.

select field_name from world where field_name like 'wildcard + characters'
select name,population from world where name like '_t%' #query the name and population of countries whose second character is t
  • Besides using comparison operators in the expression of the where clause, you can use the like operator together with wildcards for fuzzy queries. A wildcard matches part of a value, and like filters the data against that pattern. The commonly used wildcards are % and _: % matches any number of characters (zero, one, or more), while _ matches exactly one character.
  • between includes both boundaries. If you don't want to include a boundary, add a != condition to exclude it.
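
The where variants above can be tried out with a small sketch. It runs the SQL through Python's built-in sqlite3 module as a stand-in for MySQL (an assumption made only so the example is self-contained); the rows in world, including the country Atlantis, are invented for illustration.

```python
import sqlite3

# Hypothetical sample data: all figures are invented for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("create table world (name text, population integer, gdp integer)")
conn.executemany(
    "insert into world values (?, ?, ?)",
    [
        ("Germany", 80, 3000),
        ("Sweden", 10, 500),
        ("Italy", 60, 2000),
        ("Spain", 45, 1200),
        ("Atlantis", 5, None),  # gdp is unknown, so it is stored as null
    ],
)


def names(sql):
    """Run a query and return the first column of every row as a list."""
    return [row[0] for row in conn.execute(sql)]


# like: _ matches exactly one character, % matches zero or more characters.
second_letter_t = names("select name from world where name like '_t%' order by name")

# between includes both boundaries (500 and 2000 both qualify).
inclusive = names("select name from world where gdp between 500 and 2000 order by name")

# Extra != conditions exclude the boundaries.
exclusive = names(
    "select name from world "
    "where gdp between 500 and 2000 and gdp != 500 and gdp != 2000"
)

# in: match any value from a list.
nordic = names("select name from world where name in ('Sweden', 'Norway', 'Denmark')")

# is null finds missing values; comparing with = null never matches a row.
missing_gdp = names("select name from world where gdp is null")
equals_null = names("select name from world where gdp = null")

print(second_letter_t)  # ['Atlantis', 'Italy']
print(inclusive)        # ['Italy', 'Spain', 'Sweden']
print(exclusive)        # ['Spain']
print(missing_gdp, equals_null)  # ['Atlantis'] []
```

The last two queries show why is null is needed: gdp = null evaluates to null rather than true, so no row passes the filter.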

order by

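select field_name from table_name
where expression
order by field_name asc|desc   #sets the order in which the result set is displayed; desc is descending, asc is ascending, and ascending is the default when neither is written

SELECT winner, yr, subject FROM nobel WHERE winner like 'Sir%' order by yr desc, winner asc

order by subject in ('chemistry','physics') , subject, winner #subject in (...) is 1 for values inside the parentheses and 0 otherwise, so it can move those rows to the front or the back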
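
A minimal sketch of the last trick, again using Python's sqlite3 module as a stand-in for MySQL. The nobel rows and winner names are placeholders, and the subject values are capitalized to match the data because SQLite compares strings case-sensitively here (an assumption of this sketch, not of the course).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table nobel (yr integer, subject text, winner text)")
# Placeholder rows: the winner names are invented.
conn.executemany(
    "insert into nobel values (?, ?, ?)",
    [
        (1984, "Physics", "Winner P"),
        (1984, "Literature", "Winner L"),
        (1984, "Chemistry", "Winner C"),
        (1984, "Medicine", "Winner M"),
    ],
)

# The in (...) expression is 0 or 1, and 0 sorts before 1 in ascending order,
# so Chemistry and Physics are pushed to the end of the result.
listed_last = [
    row[0]
    for row in conn.execute(
        "select subject from nobel "
        "order by subject in ('Chemistry', 'Physics'), subject, winner"
    )
]

# Adding desc to the same expression moves them to the front instead.
listed_first = [
    row[0]
    for row in conn.execute(
        "select subject from nobel "
        "order by subject in ('Chemistry', 'Physics') desc, subject, winner"
    )
]

print(listed_last)   # ['Literature', 'Medicine', 'Chemistry', 'Physics']
print(listed_first)  # ['Chemistry', 'Physics', 'Literature', 'Medicine']
```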

limit

limit [offset x,] row_count n  #limits the number of rows displayed; the first row has offset 0, and n rows are returned starting from row x+1
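
The offset rule is easy to check on a table holding the numbers 1 to 10. This is a sketch using Python's sqlite3 module, which accepts the same limit offset, row_count form as MySQL; the table t is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table t (n integer)")
conn.executemany("insert into t values (?)", [(i,) for i in range(1, 11)])

# Without an offset, limit returns the first n rows.
first_three = [row[0] for row in conn.execute("select n from t order by n limit 3")]

# limit 4, 3: skip 4 rows (offset 4), then return 3 rows, i.e. rows 5 to 7.
rows_5_to_7 = [row[0] for row in conn.execute("select n from t order by n limit 4, 3")]

print(first_three)  # [1, 2, 3]
print(rows_5_to_7)  # [5, 6, 7]
```

Combining limit with order by matters: without a sort, which rows count as "the first n" is not guaranteed.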

Aggregation functions and group by

AVG()  SUM()  COUNT()  MAX()  MIN()
group by field1, field2

Aggregation functions are suitable for obtaining summary information of data, such as the number of rows in a certain field, the average value of a certain field, the maximum and minimum numbers in a certain field, etc.

having

having expression  #limits the grouped and aggregated rows to those that meet the condition; this clause filters the data after group by has grouped it
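
The difference between where (filters rows before grouping) and having (filters groups after aggregation) can be sketched as follows. The SQL runs through Python's sqlite3 module as a stand-in for MySQL; the continent column and all population figures are assumptions made up for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table world (name text, continent text, population integer)")
# Hypothetical rows: the figures are rough, invented values.
conn.executemany(
    "insert into world values (?, ?, ?)",
    [
        ("Germany", "Europe", 80),
        ("France", "Europe", 65),
        ("Sweden", "Europe", 10),
        ("China", "Asia", 1400),
        ("Japan", "Asia", 125),
        ("Brazil", "South America", 210),
        ("Fiji", "Oceania", 1),
    ],
)

summary = conn.execute(
    """
    select continent, count(*) as countries, sum(population) as total_pop
    from world
    where population >= 50      -- row filter: drops Sweden and Fiji
    group by continent
    having count(*) >= 2        -- group filter: drops South America (1 row left)
    order by total_pop desc
    """
).fetchall()

print(summary)  # [('Asia', 2, 1525), ('Europe', 2, 145)]
```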

Summary:

standard syntax

select field_name
from table_name
[where expression]
[group by field_name]
[having expression]
[order by field_name asc|desc]
[limit [offset,] row_count]

working process:

from--where--group by--having--select--order by--limit

  1. Execute the from clause to retrieve a copy of the table from the database.
  2. Execute the where clause to keep only the rows of that copy that meet the condition.
  3. Execute the group by clause to partition the filtered rows by the specified fields, with one group per distinct value of those fields. This is equivalent to creating a pivot table in Excel and adding row labels.
  4. Execute the having clause to keep only the groups that meet the condition.
  5. Execute the select clause to compute the fields to be displayed.
  6. Execute the order by clause to sort the result.
  7. Execute the limit clause to limit the number of sorted rows displayed.

Common functions

  1. Math functions

    round(x,y): rounding function

    • The round function rounds x to y decimal places.

    • When y is negative, x is rounded at the corresponding digit to the left of the decimal point, and every digit to the right of that position becomes 0.

    • For example: round(3.15,1) returns 3.2, round(14.15,-1) returns 10

  2. String functions

    concat(s1,s2,...): string concatenation function

    • The concat function returns the string produced by concatenating the arguments s1, s2, and so on.

    • When any argument is null, null is returned.

    • For example: concat('My',' ','SQL') returns My SQL, concat('My',null,'SQL') returns null

    replace(s,s1,s2): replacement function

    • The replace function replaces every occurrence of s1 in s with the string s2.

    • For example: replace('MySQLMySQL','SQL','sql') returns MysqlMysql

    left(s,n), right(s,n) & substring(s,n,len): functions that extract part of a string

    • The left function returns the leftmost n characters of string s.

    • The right function returns the rightmost n characters of string s.

    • The substring function returns a substring of s that is len characters long, starting from the nth character. n can also be negative, in which case the substring starts from the nth character counting from the end. If len is omitted, the substring runs from the nth character to the last character.

    • For example: left('abcdefg',3) returns abc, right('abcdefg',3) returns efg, substring('abcdefg',2,3) returns bcd, substring('abcdefg',-2,3) returns fg, substring('abcdefg',2) returns bcdefg

  3. Data type conversion functions

    cast(x as type): function that converts data types

    • The cast function converts a value x of one type to a value of another type.

    • The type parameter can be char(n), date, time, datetime, decimal, and so on; x is converted to the corresponding data type.

  4. Date and time functions

    year(date), month(date), day(date): functions that get the year, month, and day

    • date can be a date consisting of year, month and day, or a date and time consisting of year, month, day, hour, minute and second.

    • year(date) returns the year of date, month(date) returns the month, and day(date) returns the day of the month.

    • For example: year('2021-08-03') returns 2021, month('2021-08-03') returns 8, day('2021-08-03') returns 3

    date_add(date,interval expr type) & date_sub(date,interval expr type): add to or subtract from a specified starting time

    • date specifies the starting time.

    • date can be a date consisting of year, month and day, or a date and time consisting of year, month, day, hour, minute and second.

    • expr specifies the time interval to add to or subtract from the starting time.

    • type indicates how expr is interpreted. Commonly used values are second, minute, hour, day, week, month, quarter, and year.

    • The date_add function adds the interval to the starting time, and the date_sub function subtracts it.

    • For example: date_add('2021-08-03 23:59:59',interval 1 second) returns 2021-08-04 00:00:00, date_sub('2021-08-03 23:59:59',interval 2 month) returns 2021-06-03 23:59:59

    datediff(date1,date2): calculates the number of days between two dates

    • The datediff function calculates date1 - date2 in days. Only the date parts take part in the calculation; the time parts are ignored.

    • For example: datediff('2021-06-08','2021-06-01') returns 7, datediff('2021-06-08 23:59:59','2021-06-01 21:00:00') returns 7, datediff('2021-06-01','2021-06-08') returns -7

    date_format(date,format): formats a date and time; the standard date format is '%Y-%m-%d'

  5. Conditional functions

    if(expr,v1,v2)

    • If the expression expr is true, v1 is returned; otherwise v2 is returned.

    • For example: if(1<2,'Y','N') returns Y, if(1>2,'Y','N') returns N

    case when

    • case expr when v1 then r1 [when v2 then r2] …[else rn] end

      • For example: case 2 when 1 then 'one' when 2 then 'two' else 'more' end returns two

      • The value after case is 2, which equals the value after when in the second branch, so two is returned.

    • case when v1 then r1 [when v2 then r2]…[else rn] end

      • For example: case when 1<0 then 'T' else 'F' end returns F

      • The result of 1<0 is false, so the value after else, F, is returned.
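
Both case forms can be checked with a short sketch. It uses Python's sqlite3 module as a stand-in for MySQL (case when behaves the same way in both); the three-row world table and its size labels are made up to show a typical bucketing use.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Simple form: compare one expression against each when value in turn.
simple = conn.execute(
    "select case 2 when 1 then 'one' when 2 then 'two' else 'more' end"
).fetchone()[0]

# Searched form: evaluate each when condition; the first true one wins.
searched = conn.execute(
    "select case when 1 < 0 then 'T' else 'F' end"
).fetchone()[0]

# Typical use: turn a number into a label (hypothetical data and thresholds).
conn.execute("create table world (name text, population integer)")
conn.executemany(
    "insert into world values (?, ?)",
    [("A", 5), ("B", 50), ("C", 500)],
)
sizes = conn.execute(
    """
    select name,
           case when population >= 100 then 'large'
                when population >= 10  then 'medium'
                else 'small'
           end as size
    from world
    order by name
    """
).fetchall()

print(simple, searched)  # two F
print(sizes)             # [('A', 'small'), ('B', 'medium'), ('C', 'large')]
```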

Advanced statements

window function

  • Standard syntax:

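    window_function over([partition by field_name] [order by field_name asc|desc])

    Window functions can be written only in the select list or the order by clause. Both clauses inside over() are optional: partition by specifies how the rows are partitioned, and order by specifies how the rows are sorted within each partition.

  • Ranking window functions

    • rank() over(): ranking with gaps; 99 99 90 89 corresponds to 1 1 3 4
    • dense_rank() over(): ranking without gaps; 99 99 90 89 corresponds to 1 1 2 3
    • row_number() over(): consecutive numbering; 99 99 90 89 corresponds to 1 2 3 4
  • Offset window functions

    • lag(field_name, offset[, default]) over(): value from the row that is offset rows before the current row
    • lead(field_name, offset[, default]) over(): value from the row that is offset rows after the current row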
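
The ranking example with the scores 99 99 90 89 can be reproduced with the sketch below. It assumes Python's sqlite3 module backed by SQLite 3.25 or later (the first version with window functions) as a stand-in for MySQL 8; the scores and sales tables are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table scores (name text, score integer)")
conn.executemany(
    "insert into scores values (?, ?)",
    [("a", 99), ("b", 99), ("c", 90), ("d", 89)],
)

# The three ranking functions side by side, over the same ordering.
ranking = conn.execute(
    """
    select score,
           rank()       over (order by score desc) as rk,
           dense_rank() over (order by score desc) as drk,
           row_number() over (order by score desc) as rn
    from scores
    order by rn
    """
).fetchall()

# lag/lead: look one row back or ahead; 0 is the default when no such row exists.
conn.execute("create table sales (month integer, amount integer)")
conn.executemany("insert into sales values (?, ?)", [(1, 100), (2, 120), (3, 90)])
offsets = conn.execute(
    """
    select month, amount,
           lag(amount, 1, 0)  over (order by month) as prev_amount,
           lead(amount, 1, 0) over (order by month) as next_amount
    from sales
    order by month
    """
).fetchall()

print(ranking)  # [(99, 1, 1, 1), (99, 1, 1, 2), (90, 3, 2, 3), (89, 4, 3, 4)]
print(offsets)  # [(1, 100, 0, 120), (2, 120, 100, 90), (3, 90, 120, 0)]
```

The aliases are rk, drk, and rn rather than rank, because rank is a reserved word in MySQL 8.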

table join

  • An inner join keeps only the rows that have a match in both tables

    • select field_name

    • from table1 inner join table2 on table1.field_name = table2.field_name

    • Note that inner can be omitted: a bare join defaults to an inner join.

  • A left join keeps all rows from the left table; fields from the right table are null where there is no match

    • select field_name

    • from table1 left join table2 on table1.field_name = table2.field_name

  • A right join keeps all rows from the right table; fields from the left table are null where there is no match

    • select field_name

    • from table1 right join table2 on table1.field_name = table2.field_name
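
A small sketch of how the join types differ, using Python's sqlite3 module as a stand-in for MySQL. The students and scores tables are hypothetical. Because older SQLite versions lack right join, the right-join case is written as the equivalent left join with the two tables swapped.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table students (id integer, name text)")
conn.execute("create table scores (student_id integer, score integer)")
conn.executemany("insert into students values (?, ?)", [(1, "Ann"), (2, "Bob"), (3, "Cid")])
# Student 3 has no score, and the score for student_id 4 has no student.
conn.executemany("insert into scores values (?, ?)", [(1, 90), (2, 80), (4, 70)])

# Inner join: only rows with a match in both tables.
inner = conn.execute(
    """
    select s.name, sc.score
    from students s inner join scores sc on s.id = sc.student_id
    order by s.name
    """
).fetchall()

# Left join: every student is kept; the missing score becomes null (None).
left = conn.execute(
    """
    select s.name, sc.score
    from students s left join scores sc on s.id = sc.student_id
    order by s.name
    """
).fetchall()

# "students right join scores" keeps every score; it is written here as
# "scores left join students", which returns the same rows.
right_equivalent = conn.execute(
    """
    select s.name, sc.score
    from scores sc left join students s on s.id = sc.student_id
    order by sc.score desc
    """
).fetchall()

print(inner)             # [('Ann', 90), ('Bob', 80)]
print(left)              # [('Ann', 90), ('Bob', 80), ('Cid', None)]
print(right_equivalent)  # [('Ann', 90), ('Bob', 80), (None, 70)]
```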

subquery

A subquery is itself a complete query statement that is wrapped in parentheses () and nested inside the main query. Subqueries can be nested at multiple levels. They are most commonly used in the from and where clauses.

  • A subquery in the where clause is used when the query condition cannot be expressed in one step: one query has to run first, and the condition is then evaluated against its result.
  • A subquery in the from clause is a piece of query code whose result serves as the data source of the main query; it must be given an alias.
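
Both placements can be sketched on a small table. The SQL runs through Python's sqlite3 module as a stand-in for MySQL; the population figures are invented, and the alias big is a name chosen for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table world (name text, population integer)")
# Hypothetical figures, for illustration only.
conn.executemany(
    "insert into world values (?, ?)",
    [("Germany", 80), ("France", 65), ("Japan", 125), ("China", 1400)],
)

# where subquery: first look up Germany's population, then compare against it.
larger_than_germany = [
    row[0]
    for row in conn.execute(
        """
        select name from world
        where population > (select population from world where name = 'Germany')
        order by name
        """
    )
]

# from subquery: the inner result acts as a table and carries the alias big.
count_and_min = conn.execute(
    """
    select count(*), min(big.population)
    from (select name, population from world where population > 70) as big
    """
).fetchone()

print(larger_than_germany)  # ['China', 'Japan']
print(count_and_min)        # (3, 80)
```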

Cloud database construction

  1. Purchase an Alibaba Cloud SQL database product.
  2. Configure the cloud database account, database, and whitelist.
  3. Install DataGrip to connect to the database.
  4. Install Sublime Text to store and open sql files.

Connecting Excel to the database

  • Install the MySQL driver
  • Check whether Excel is 32-bit or 64-bit
  • Configure ODBC
  • Use ODBC in Excel to get data from MySQL
  • Create charts based on the data obtained from the database

Connecting Tableau to the database

  • Choose to connect to a server, select MySQL, and fill in the database parameters
  • You can drag and drop tables from the database or write custom SQL

Origin blog.csdn.net/vision666/article/details/124299457