SQL: Conditional Functions, Date Functions, Text Functions, Window Functions

After taking a few days off, I am balancing work and rest and continuing to review and brush up on SQL.

1. Conditional function

1. Topic: Operations now wants to split users into two age groups, under 25 and 25 and above, and count the users in each group (a null age is also counted as under 25).

user_profile

 Desired result:

 Involved knowledge:

You need the case function, a branching function that returns one of several possible results based on a conditional expression. It can be used anywhere an expression is allowed, but it cannot be executed on its own as a standalone statement.

simple case function

Evaluates the test expression and compares it, from top to bottom in the order written, with the simple expression of each when clause. It returns the result of the first when clause whose simple expression equals the test expression. If no expression equals the test expression, it returns the result of the else clause, or NULL if no else clause is specified.

searched case function

Evaluates the Boolean expression of each when clause, from top to bottom in the order written, and returns the result that corresponds to the first Boolean expression that evaluates to true. If none evaluates to true, it returns the result of the else clause, or NULL if no else clause is specified.
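
To see the two forms side by side, here is a minimal sketch that runs both through Python's built-in sqlite3 module as a stand-in for MySQL. The three sample rows and the gender_code column are made up for illustration; only the case syntax itself comes from the description above.

```python
import sqlite3

# In-memory database with hypothetical sample rows (not the real exercise data).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE user_profile (device_id INTEGER, gender TEXT, age INTEGER)")
conn.executemany(
    "INSERT INTO user_profile VALUES (?, ?, ?)",
    [(1, "male", 21), (2, "female", 30), (3, "other", None)],
)

rows = conn.execute(
    """
    SELECT device_id,
           -- simple case: gender is compared with each when value by equality;
           -- there is no else, so an unmatched value yields NULL
           CASE gender WHEN 'male' THEN 'M' WHEN 'female' THEN 'F' END AS gender_code,
           -- searched case: each when is a Boolean condition, first true one wins
           CASE WHEN age < 25 OR age IS NULL THEN 'under 25'
                ELSE '25 and above' END AS age_cut
    FROM user_profile
    ORDER BY device_id
    """
).fetchall()

for row in rows:
    print(row)
```

The third row shows both edge cases: the unmatched gender comes back as NULL (None in Python), and the NULL age lands in the under 25 group only because of the explicit age IS NULL test.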

-- Option 1: searched case
SELECT CASE WHEN age < 25 OR age IS NULL THEN 'under 25'
            WHEN age >= 25 THEN '25 and above'
            END age_cut, COUNT(*) number
FROM user_profile
GROUP BY age_cut;

-- Option 2: if(); a NULL age fails age >= 25, so it falls into 'under 25'
select
    if(age >= 25, '25 and above', 'under 25') AS age_cut,
    count(*) as number
from
    user_profile
group by
    age_cut;

2. Date function

1. Question: Operations now wants the number of questions users practiced on each day of August 2021. Please retrieve the corresponding data.

question_practice_detail

Desired result:

 

 Involved knowledge:

Since dates are involved, you can use the day(), month(), and year() functions directly. To count the questions practiced on each day of August, group the rows by date. Because the month and year are fixed, filter on them in the where clause.

select
    day(date) day,
    count(question_id) question_cnt
from
    question_practice_detail
where
    month(date) = 8
    and year(date) = 2021
group by
    date
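
The per-day count can be tried locally with a small sketch. This one uses SQLite through Python's sqlite3 module, and SQLite has no day(), month(), or year(), so strftime is swapped in for them here; the sample rows are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE question_practice_detail (device_id INTEGER, question_id INTEGER, date TEXT)"
)
# Hypothetical sample rows, with dates stored as ISO-8601 text.
conn.executemany(
    "INSERT INTO question_practice_detail VALUES (?, ?, ?)",
    [
        (1, 111, "2021-08-13"),
        (2, 112, "2021-08-13"),
        (3, 113, "2021-08-15"),
        (1, 114, "2021-07-31"),  # outside August, filtered out
        (2, 115, "2020-08-13"),  # wrong year, filtered out
    ],
)

rows = conn.execute(
    """
    SELECT CAST(strftime('%d', date) AS INTEGER) AS day,  -- stands in for day(date)
           COUNT(question_id) AS question_cnt
    FROM question_practice_detail
    WHERE strftime('%Y', date) = '2021'                   -- stands in for year(date) = 2021
      AND strftime('%m', date) = '08'                     -- stands in for month(date) = 8
    GROUP BY day
    ORDER BY day
    """
).fetchall()

print(rows)
```

Filtering on both year and month matters: the 2020-08-13 row would otherwise be counted as an August practice.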

3. Text function

1. Topic: Count the number of people of each gender

user_submit

Desired result:

 Involved knowledge:

You can use substring_index(str,delim,count) 

        str: the string to process

        delim: delimiter

        count: which occurrence of the delimiter to cut at

If count is positive, counting from left to right, the function returns everything to the left of the count-th delimiter. If count is negative, counting from right to left, it returns everything to the right of the count-th delimiter.

Example: str=www.baidu.com

substring_index(str, '.', 1)

        Result: www

substring_index(str, '.', -2)

        Result: baidu.com
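
The counting rule is easy to check with a small Python re-implementation. This is only an illustrative sketch of the behavior described above, not MySQL's actual code; it assumes a non-empty delimiter, and the profile string in the last call is a made-up sample.

```python
def substring_index(s: str, delim: str, count: int) -> str:
    """Mimic substring_index(str, delim, count) for a non-empty delimiter."""
    if count == 0:
        # Assumption: MySQL returns an empty string when count is 0.
        return ""
    parts = s.split(delim)
    if count > 0:
        # Everything to the left of the count-th delimiter, counted from the left.
        return delim.join(parts[:count])
    # Everything to the right of the |count|-th delimiter, counted from the right.
    return delim.join(parts[count:])


print(substring_index("www.baidu.com", ".", 1))
print(substring_index("www.baidu.com", ".", -2))
print(substring_index("180cm,75kg,27,male", ",", -1))
```

If count is larger than the number of delimiters, the slices simply cover the whole list, so the full string comes back unchanged.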

select
    substring_index (profile, ',', -1) gender,
    count(*) number
from
    user_submit
group by
    gender

Use substring_index to extract the last field, gender, then group by gender and count the rows in each group.

Involved knowledge:

Alternatively, you can use the like operator for fuzzy matching, where % is a wildcard that matches any sequence of characters, and then use if to pick the value. If the profile field ends with female, the gender is female; otherwise it is male. Name the result gender, then group by gender and count the rows in each group, because the number of people of each gender is what we need.

select
    if (profile like '%female', 'female', 'male') gender,
    count(*) number
from
    user_submit
group by
    gender

4. Window function

Topic: Operations now wants to study the students with the lowest gpa at each school. Please retrieve the lowest gpa at each school.

Desired result:

First, you can get the lowest gpa of each school with the min function and group by.

Solution 1: Since the device_id is also required, query the table a second time and use where with in to keep only the rows whose (university, gpa) pair appears in the per-school minimums.

select
    device_id,
    university,
    gpa
from
    user_profile
where
    (university, gpa) in (
        select
            university,
            min(gpa)
        from
            user_profile
        group by
            university
    )
order by
    university

Solution 2:

Involved knowledge:

Ranking within a group calls for an advanced SQL feature, the window function. Window functions are also called OLAP functions.

The basic syntax of window functions:

<window function> over (partition by <column to group by>
                order by <column to sort by>)

Two kinds of functions can be used in the <window function> position:

1. Dedicated window functions: rank, dense_rank, row_number

2. Aggregate functions: sum, avg, max, min, etc.

Because window functions operate on the result of the where and group by clauses, in principle they can only be written in the select clause.

        partition by groups the rows of the table

        order by sorts the rows within each group

The group by clause already does grouping, so why do we need window functions?

        After group by groups and aggregates, the number of rows changes: one row per category. partition by does not reduce the number of rows in the original table.
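
The row-count difference can be seen with a minimal sketch, again using Python's sqlite3 as a stand-in (window functions need SQLite 3.25 or newer). The four rows and the school names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE user_profile (device_id INTEGER, university TEXT, gpa REAL)")
# Hypothetical sample rows: two schools, two students each.
conn.executemany(
    "INSERT INTO user_profile VALUES (?, ?, ?)",
    [(1, "Univ A", 3.2), (2, "Univ A", 3.6), (3, "Univ B", 3.8), (4, "Univ B", 4.0)],
)

# group by collapses each school into a single row.
grouped = conn.execute(
    """
    SELECT university, MIN(gpa) AS min_gpa
    FROM user_profile
    GROUP BY university
    ORDER BY university
    """
).fetchall()

# partition by keeps every row and attaches the per-school minimum to each one.
windowed = conn.execute(
    """
    SELECT device_id, university, gpa,
           MIN(gpa) OVER (PARTITION BY university) AS min_gpa
    FROM user_profile
    ORDER BY device_id
    """
).fetchall()

print(len(grouped), grouped)
print(len(windowed), windowed)
```

Four input rows become two with group by but stay four with the window function, which is why device_id is still available in the windowed result.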

Other window functions:

        What is the difference between rank, dense_rank, and row_number?

select *,
   rank() over (order by 成绩 desc) as ranking,
   dense_rank() over (order by 成绩 desc) as dese_rank,
   row_number() over (order by 成绩 desc) as row_num
from 班级表

rank: 5th, 5th, 5th, 8th. Tied rows share a rank and also take up the positions of the ranks that follow, so the next rank skips ahead.

dense_rank: 5th, 5th, 5th, 6th. Tied rows share a rank, but the ranks that follow are not taken up.

row_number: 5th, 6th, 7th, 8th. Ties are not considered; every row gets its own number.
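
A small sketch makes the three numbering schemes concrete. It runs on SQLite 3.25 or newer through Python's sqlite3; the class_scores table and its five scores are invented for the example, with a three-way tie at 90.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE class_scores (student TEXT, score INTEGER)")
# Hypothetical scores with a three-way tie at 90.
conn.executemany(
    "INSERT INTO class_scores VALUES (?, ?)",
    [("a", 100), ("b", 90), ("c", 90), ("d", 90), ("e", 80)],
)

rows = conn.execute(
    """
    SELECT score,
           RANK()       OVER (ORDER BY score DESC) AS ranking,
           DENSE_RANK() OVER (ORDER BY score DESC) AS dense_ranking,
           ROW_NUMBER() OVER (ORDER BY score DESC) AS row_num
    FROM class_scores
    ORDER BY row_num
    """
).fetchall()

for row in rows:
    print(row)
```

After the tie, rank jumps from 2 to 5, dense_rank continues with 3, and row_number never repeats. Which tied student gets which row_number is not defined unless the order by includes a tie-breaker.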

Answer:

First use row_number to number the rows, partitioned by school and ordered by gpa, then use where to keep the required rank.

select
    *,
    row_number() over (
        partition by
            university
        order by
            gpa
    ) as rn
from
    user_profile

Since the task requires the final result to be ordered by school, we add order by at the end. Because we want the last place and sorting defaults to ascending order, we filter on rn = 1.

select
    device_id,
    university,
    gpa
from
    (
        select
            *,
            row_number() over (
                partition by
                    university
                order by
                    gpa
            ) as rn
        from
            user_profile
    ) as univ_min
where
    rn = 1
order by
    university;
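
One difference between the two solutions shows up when several students share the lowest gpa: the in subquery keeps all of them, while row_number keeps exactly one row per school. The sketch below, run on SQLite 3.25 or newer through Python's sqlite3 with made-up rows, compares row_number with rank in the same query shape; the lowest helper is hypothetical and exists only for this comparison.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE user_profile (device_id INTEGER, university TEXT, gpa REAL)")
# Hypothetical rows: devices 1 and 2 tie for the lowest gpa at Univ A.
conn.executemany(
    "INSERT INTO user_profile VALUES (?, ?, ?)",
    [(1, "Univ A", 3.2), (2, "Univ A", 3.2), (3, "Univ A", 3.9), (4, "Univ B", 3.5)],
)


def lowest(window_function):
    # window_function is only ever one of the fixed strings passed below.
    sql = f"""
        SELECT device_id, university, gpa
        FROM (
            SELECT *,
                   {window_function}() OVER (PARTITION BY university ORDER BY gpa) AS rn
            FROM user_profile
        ) AS univ_min
        WHERE rn = 1
        ORDER BY university, device_id
    """
    return conn.execute(sql).fetchall()


by_row_number = lowest("ROW_NUMBER")  # one row per school, ties broken arbitrarily
by_rank = lowest("RANK")              # every student tied for the lowest gpa

print(by_row_number)
print(by_rank)
```

If the task only asks for one lowest-gpa row per school, row_number is enough; if every tied student should be listed, rank (or the in subquery from Solution 1) is the safer choice.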

 


Origin blog.csdn.net/weixiwo/article/details/130136315