How to become a qualified CRUD engineer? (Book giveaway at the end of the article)

It was the summer of 1970.

Edgar Frank Codd, a senior researcher at IBM's San Jose Research Laboratory, published a paper in Communications of the ACM titled "A Relational Model of Data for Large Shared Data Banks", which introduced the relational data model. Today, relational databases based on this model are still the primary way businesses store and process data. It is fair to say that the vast majority of IT systems revolve around creating, reading, updating, and deleting (CRUD) data in a database.

The mainstream relational databases today include MySQL, Oracle, Microsoft SQL Server, PostgreSQL, and SQLite. Although these database management systems are implemented differently, they all use SQL (Structured Query Language) as the standard language for accessing and manipulating data.

1. SQL

In 1974, Donald D. Chamberlin and Raymond F. Boyce, also from IBM, developed the first version of SQL based on the relational model: SEQUEL (Structured English Query Language). SEQUEL was designed for System R, IBM's original quasi-relational database management system.

In 1986, the American National Standards Institute (ANSI) published the first SQL standard, and the International Organization for Standardization (ISO) adopted it as the "Database Language SQL" standard in 1987. After revisions in 1989, 1992, 1996, 1999, 2003, 2006, 2008, 2011, 2016, and 2019, today's SQL standard spans thousands of pages and contains a large number of features. Here are some key milestones in the development of SQL:

[Figure: key milestones in the development of SQL]

SQL was one of the first commercial languages built on the relational model, and one of the most successful. It is the standard language for accessing and operating relational databases, and virtually all relational databases use SQL statements for data access and control. Many big data platforms (including Flink, Spark, and Hive) also provide SQL support.

SQL statements are very close to natural language (English). We only need to learn a few simple English keywords, such as SELECT, INSERT, UPDATE, and DELETE, to perform most database operations. For example, the following is a simple query statement:

SELECT emp_id, emp_name, salary
FROM employee
WHERE salary >= 10000
ORDER BY emp_id;

Even beginners who have never learned SQL can understand this statement as long as they know the meaning of a few English words. It finds the employees in the employee table whose monthly salary (salary) is greater than or equal to 10000, returns each employee's ID (emp_id), name (emp_name), and monthly salary (salary), and sorts the results by employee ID.
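
To try the statement without installing a database server, here is a minimal sketch that runs it on an in-memory SQLite database through Python's standard sqlite3 module. The three employee rows are made up for illustration; the article does not show the contents of the employee table.

```python
import sqlite3

# In-memory database, so nothing is written to disk.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee (emp_id INTEGER PRIMARY KEY, emp_name TEXT, salary NUMERIC)"
)

# Hypothetical sample rows (not from the article).
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?)",
    [(3, "Carol", 12000), (1, "Alice", 15000), (2, "Bob", 8000)],
)

# The query from the article, unchanged.
rows = conn.execute(
    """
    SELECT emp_id, emp_name, salary
    FROM employee
    WHERE salary >= 10000
    ORDER BY emp_id
    """
).fetchall()

for row in rows:
    print(row)
```

With these sample rows, only Alice and Carol pass the salary filter, and they come back in emp_id order.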

As you can see, SQL statements are simple and intuitive and consist of plain English words, because the language was designed from the start with non-technical users in mind. There are only a few main SQL statements, and in many cases a single SELECT statement is all that is required.

Perhaps because SQL is so simple and easy to use, many people think it is nothing more than CRUD. In fact, as early as 1999 SQL supported common table expressions (the WITH clause) and recursive queries, user-defined types, and many online analytical processing (OLAP) features. Later revisions added window functions, the MERGE statement, the XML data type, JSON document storage (SQL/JSON), complex event and stream data processing (the MATCH_RECOGNIZE clause), and multidimensional arrays (SQL/MDA). The latest SQL standard work is defining graph storage capabilities (SQL/PGQ).

2. Common table expressions

We take common table expressions (the WITH clause) as an example and show how to use SQL statements to analyze friend relationships in social networks (WeChat, Facebook, etc.). Here is a simple friend relationship network:

[Figure: a simple friend relationship network]

In the following case study, the t_user table stores user information:


user_id|user_name
-------|---------
      1|刘一       
      2|陈二       
      3|张三
...

Here user_id is the user ID and user_name is the user name.

Friend relationships are stored in the t_friend table, and each friendship is stored as two records, one in each direction. For example:


user_id|friend_id
-------|---------
      1|        2
      2|        1
      4|        1
...

Here user_id is the user ID and friend_id is the user ID of the friend.

We first look at how to find mutual friends. The following statement finds the mutual friends of "Zhang San" and "Li Si" (user IDs 3 and 4):


WITH f1(friend_id) AS (
  SELECT f.friend_id
  FROM t_user u
  JOIN t_friend f ON (u.user_id = f.friend_id AND f.user_id = 3)
),
f2(friend_id) AS (
  SELECT f.friend_id
  FROM t_user u
  JOIN t_friend f ON (u.user_id = f.friend_id AND f.user_id = 4)
)
SELECT u.user_id AS "好友编号", u.user_name AS "好友姓名"
FROM t_user u
JOIN f1 ON (u.user_id = f1.friend_id)
JOIN f2 ON (u.user_id = f2.friend_id);

The query defines two CTEs: f1 holds the friends of "Zhang San" and f2 holds the friends of "Li Si". The main query joins these two result sets to return their mutual friends. The column aliases mean "friend ID" (好友编号) and "friend name" (好友姓名). The query returns the following result:


好友编号|好友姓名
-------|-------
    1|刘一
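
The article only shows the first few rows of t_user and t_friend, so the sketch below uses a data set that we reconstructed from the query results printed in this article. It is an assumption, not the author's published test data, and the name of user 5 ("王五") is a guess because that user never appears by name. The CTEs are also simplified slightly: they read t_friend directly instead of joining t_user first, which returns the same friend IDs. The code runs on SQLite through Python's sqlite3 module.

```python
import sqlite3

# Assumed data set, reconstructed from the results shown in the article.
# The name of user 5 is a guess; it does not appear in the article.
USERS = [(1, "刘一"), (2, "陈二"), (3, "张三"), (4, "李四"),
         (5, "王五"), (6, "赵六"), (7, "孙七"), (8, "周八")]
EDGES = [(1, 2), (1, 3), (1, 4), (1, 7), (1, 8), (2, 3),
         (2, 5), (3, 4), (4, 6), (5, 8), (7, 8)]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t_user (user_id INTEGER PRIMARY KEY, user_name TEXT)")
conn.execute(
    "CREATE TABLE t_friend (user_id INTEGER, friend_id INTEGER, "
    "PRIMARY KEY (user_id, friend_id))"
)
conn.executemany("INSERT INTO t_user VALUES (?, ?)", USERS)
# Each friendship is stored twice, once per direction.
conn.executemany(
    "INSERT INTO t_friend VALUES (?, ?)", EDGES + [(b, a) for a, b in EDGES]
)

# Mutual friends of user 3 ("Zhang San") and user 4 ("Li Si").
mutual = conn.execute(
    """
    WITH f1(friend_id) AS (SELECT friend_id FROM t_friend WHERE user_id = ?),
         f2(friend_id) AS (SELECT friend_id FROM t_friend WHERE user_id = ?)
    SELECT u.user_id, u.user_name
    FROM t_user u
    JOIN f1 ON u.user_id = f1.friend_id
    JOIN f2 ON u.user_id = f2.friend_id
    ORDER BY u.user_id
    """,
    (3, 4),
).fetchall()

print(mutual)
```

On this reconstructed data the only mutual friend is user 1, which matches the result above. Passing the two user IDs as parameters makes it easy to check other pairs.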

Social apps usually provide a friend recommendation feature. On the one hand, an app may read the user's phone address book and recommend people who are registered in the system but are not yet the user's friends. On the other hand, the system can recommend users who are not yet friends with the user but share mutual friends with them.

For example, the following statement returns users who can be recommended to "Chen Er":

WITH friend(id) AS (
  SELECT f.friend_id
  FROM t_user u
  JOIN t_friend f ON (u.user_id = f.friend_id AND f.user_id = 2)
),
fof(id) AS (
  SELECT f.friend_id
  FROM t_user u
  JOIN t_friend f ON (u.user_id = f.friend_id)
  JOIN friend ON (f.user_id = friend.id AND f.friend_id != 2)
)
SELECT u.user_id AS "用户编号", u.user_name AS "用户姓名",
       count(*) AS "共同好友"
FROM t_user u
JOIN fof ON (u.user_id = fof.id)
WHERE fof.id NOT IN (SELECT id FROM friend)
GROUP BY u.user_id, u.user_name;

The query defines two CTEs: friend holds the friends of "Chen Er", and fof holds the friends of those friends (excluding "Chen Er" himself). The main query uses the WHERE condition to exclude users in fof who are already friends of "Chen Er", and counts how many mutual friends each recommended user shares with "Chen Er". The column aliases mean "user ID" (用户编号), "user name" (用户姓名), and "mutual friends" (共同好友). The query returns the following result:

用户编号|用户姓名|共同好友
-------|--------|-------
      4|李四    |   2
      7|孙七    |   1
      8|周八    |   2

Based on the query results, we can recommend 3 people "Chen Er" might know and tell him how many mutual friends he shares with each of them.
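
The recommendation query can be tried the same way. The sketch below is an illustration under assumptions: the friendship data is reconstructed from the results printed in this article (it is not the author's published data set, and the name of user 5 is a guess), the column aliases are replaced with English names, and the user ID is passed as a named parameter.

```python
import sqlite3

# Assumed data set, reconstructed from the results shown in the article.
# The name of user 5 is a guess; it does not appear in the article.
USERS = [(1, "刘一"), (2, "陈二"), (3, "张三"), (4, "李四"),
         (5, "王五"), (6, "赵六"), (7, "孙七"), (8, "周八")]
EDGES = [(1, 2), (1, 3), (1, 4), (1, 7), (1, 8), (2, 3),
         (2, 5), (3, 4), (4, 6), (5, 8), (7, 8)]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t_user (user_id INTEGER PRIMARY KEY, user_name TEXT)")
conn.execute(
    "CREATE TABLE t_friend (user_id INTEGER, friend_id INTEGER, "
    "PRIMARY KEY (user_id, friend_id))"
)
conn.executemany("INSERT INTO t_user VALUES (?, ?)", USERS)
# Each friendship is stored twice, once per direction.
conn.executemany(
    "INSERT INTO t_friend VALUES (?, ?)", EDGES + [(b, a) for a, b in EDGES]
)

# Users to recommend to user 2 ("Chen Er"): friends of friends who are
# not already friends, with the number of mutual friends for each.
recommended = conn.execute(
    """
    WITH friend(id) AS (
      SELECT friend_id FROM t_friend WHERE user_id = :uid
    ),
    fof(id) AS (
      SELECT f.friend_id
      FROM t_friend f
      JOIN friend ON f.user_id = friend.id
      WHERE f.friend_id != :uid
    )
    SELECT u.user_id, u.user_name, count(*) AS mutual_friends
    FROM t_user u
    JOIN fof ON u.user_id = fof.id
    WHERE fof.id NOT IN (SELECT id FROM friend)
    GROUP BY u.user_id, u.user_name
    ORDER BY u.user_id
    """,
    {"uid": 2},
).fetchall()

for row in recommended:
    print(row)
```

Each row in fof corresponds to one path through a mutual friend, which is why count(*) gives the number of mutual friends per recommended user.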

In sociology, the theory of Six Degrees of Separation holds that any two people on earth can be connected by a chain of at most six relationships. In 2011, Facebook calculated that the average distance between any two of its 721 million monthly active users was 4.74.

Let's take "Zhao Liu" and "Sun Qi" (user IDs 6 and 7) as an example and find the friendship chains between them:


-- MySQL
WITH RECURSIVE relation(uid, fid, hops, path) AS (
  SELECT user_id, friend_id, 0, CONCAT(',', user_id , ',', friend_id)
  FROM t_friend
  WHERE user_id = 6
  UNION ALL
  SELECT r.uid, f.friend_id, hops+1, CONCAT(r.path, ',', f.friend_id)
  FROM relation r
  JOIN t_friend f
  ON (r.fid = f.user_id)
  AND (INSTR(r.path, CONCAT(',',f.friend_id,',')) = 0)
  AND hops < 6
)
SELECT uid, fid, hops, substr(path, 2) AS path
FROM relation
WHERE fid = 7
ORDER BY hops;

Here relation is a recursive CTE. The initial (anchor) statement finds the friends of "Zhao Liu", the first recursion returns the friends of those friends, and so on. The hops column counts the intermediaries in a chain, and the condition hops < 6 stops the recursion once a chain has 6 of them. The path column stores the relationship chain as a comma-separated list, and the INSTR function prevents loops such as A->B->A.

The results returned by the query are as follows.


uid|fid|hops|path           
---|---|----|---------------
  6|   7|   2|6,4,1,7        
  6|   7|   3|6,4,1,8,7      
  6|   7|   3|6,4,3,1,7      
  6|   7|   4|6,4,3,1,8,7    
  6|   7|   4|6,4,3,2,1,7    
  6|   7|   5|6,4,1,2,5,8,7  
  6|   7|   5|6,4,3,2,1,8,7  
  6|   7|   5|6,4,3,2,5,8,7  
  6|   7|   6|6,4,1,3,2,5,8,7
  6|   7|   6|6,4,3,1,2,5,8,7
  6|   7|   6|6,4,3,2,5,8,1,7

The shortest chain between "Zhao Liu" and "Sun Qi" runs through "Li Si" and "Liu Yi".
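
The statement above uses MySQL's CONCAT and INSTR functions. As a rough sketch of how the same recursion could look on SQLite, the version below uses the || operator for concatenation and SQLite's instr function. It also keeps a trailing comma on path so that every ID in the chain is wrapped in commas, which is a small variation of ours rather than the article's original. The friendship data is an assumption reconstructed from the results printed in this article.

```python
import sqlite3

# Assumed friendships, reconstructed from the results shown in the article.
EDGES = [(1, 2), (1, 3), (1, 4), (1, 7), (1, 8), (2, 3),
         (2, 5), (3, 4), (4, 6), (5, 8), (7, 8)]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE t_friend (user_id INTEGER, friend_id INTEGER, "
    "PRIMARY KEY (user_id, friend_id))"
)
# Each friendship is stored twice, once per direction.
conn.executemany(
    "INSERT INTO t_friend VALUES (?, ?)", EDGES + [(b, a) for a, b in EDGES]
)

# All loop-free chains from user 6 ("Zhao Liu") to user 7 ("Sun Qi").
chains = conn.execute(
    """
    WITH RECURSIVE relation(uid, fid, hops, path) AS (
      -- Anchor: direct friends of the start user, zero intermediaries.
      SELECT user_id, friend_id, 0, ',' || user_id || ',' || friend_id || ','
      FROM t_friend
      WHERE user_id = :start
      UNION ALL
      -- Recursive step: extend each chain by one friend not yet on the path.
      SELECT r.uid, f.friend_id, r.hops + 1, r.path || f.friend_id || ','
      FROM relation r
      JOIN t_friend f ON r.fid = f.user_id
      WHERE instr(r.path, ',' || f.friend_id || ',') = 0
        AND r.hops < 6
    )
    SELECT hops, trim(path, ',') AS path
    FROM relation
    WHERE fid = :target
    ORDER BY hops, path
    """,
    {"start": 6, "target": 7},
).fetchall()

for hops, path in chains:
    print(hops, path)
```

Because the rows are ordered by hops, the first row is the shortest chain. On the reconstructed data it is 6,4,1,7 with 2 intermediaries, the same as in the result table above.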

We can also calculate the average of the minimum number of intermediaries between any two users:

-- MySQL
WITH RECURSIVE relation(uid, fid, hops, path) AS (
  SELECT user_id, friend_id, 0, CONCAT(',', user_id , ',', friend_id)
  FROM t_friend
  UNION ALL
  SELECT r.uid, f.friend_id, hops+1, CONCAT(r.path, ',', f.friend_id)
  FROM relation r
  JOIN t_friend f
  ON (r.fid = f.user_id)
  AND (INSTR(r.path, CONCAT(',',f.friend_id,',')) = 0)
)
SELECT AVG(min_hops)
FROM (
  SELECT uid, fid, MIN(hops) min_hops
  FROM relation
  GROUP BY uid, fid
) mh;

The results returned by the query are as follows.


avg(min_hops)
-------------
       0.8214

The test data set we use is small, so on average only about 0.8 intermediaries separate any two people.
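
As a cross-check, here is a sketch of the same calculation on SQLite, again using friendship data that we reconstructed from the results in this article (an assumption, not the author's published data set). The recursion has no hop limit, so it enumerates every loop-free chain between every pair of users.

```python
import sqlite3

# Assumed friendships, reconstructed from the results shown in the article.
EDGES = [(1, 2), (1, 3), (1, 4), (1, 7), (1, 8), (2, 3),
         (2, 5), (3, 4), (4, 6), (5, 8), (7, 8)]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE t_friend (user_id INTEGER, friend_id INTEGER, "
    "PRIMARY KEY (user_id, friend_id))"
)
# Each friendship is stored twice, once per direction.
conn.executemany(
    "INSERT INTO t_friend VALUES (?, ?)", EDGES + [(b, a) for a, b in EDGES]
)

avg_min_hops = conn.execute(
    """
    WITH RECURSIVE relation(uid, fid, hops, path) AS (
      -- Anchor: every direct friendship, zero intermediaries.
      SELECT user_id, friend_id, 0, ',' || user_id || ',' || friend_id || ','
      FROM t_friend
      UNION ALL
      -- Recursive step: extend each chain by one friend not yet on the path.
      SELECT r.uid, f.friend_id, r.hops + 1, r.path || f.friend_id || ','
      FROM relation r
      JOIN t_friend f ON r.fid = f.user_id
      WHERE instr(r.path, ',' || f.friend_id || ',') = 0
    )
    -- Shortest chain per ordered pair of users, then the average.
    SELECT AVG(min_hops)
    FROM (
      SELECT uid, fid, MIN(hops) AS min_hops
      FROM relation
      GROUP BY uid, fid
    ) mh
    """
).fetchone()[0]

print(round(avg_min_hops, 4))
```

Enumerating all loop-free chains is fine for 8 users, but the number of chains grows very quickly with the size of the network, so on real data a hop limit like the hops < 6 condition in the earlier query would be advisable.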

In addition to friend relationships, common table expressions can also be used to analyze follower relationships in apps such as Weibo and Zhihu. Other common use cases include generating sequences of numbers, traversing organization charts, and finding subway or flight transfer routes.
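
Generating a sequence of numbers is the smallest of these use cases. A minimal sketch on SQLite (the CTE name seq and the upper bound 10 are arbitrary choices for illustration):

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# A recursive CTE needs no tables: start at 1 and keep adding 1
# until the WHERE condition stops the recursion.
numbers = [
    n
    for (n,) in conn.execute(
        """
        WITH RECURSIVE seq(n) AS (
          SELECT 1
          UNION ALL
          SELECT n + 1 FROM seq WHERE n < 10
        )
        SELECT n FROM seq
        """
    )
]

print(numbers)
```

This produces the integers 1 through 10. The same pattern, an anchor row plus a step that stops on a condition, underlies the friend chain queries above.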

3. SQL programming ideas

If you think SQL is simply CRUD, that view is 40 years out of date. Although SQL was developed on the basis of the relational model, after decades of development it is no longer limited to that model. The book "SQL Programming Ideas" was written to help readers understand and learn the modern SQL language and its programming ideas, rather than only the simple features of traditional SQL.

Drawing on the author's more than ten years of work experience and knowledge sharing, this book comprehensively covers SQL from basic queries to advanced analysis and from database design to query optimization. It follows the new SQL:2019 standard, keeps up with industry trends, helps readers learn the latest SQL skills, and describes the implementations in 5 mainstream databases and the differences between them. Finally, the book introduces the new standard's support for document storage (JSON), row pattern recognition (MATCH_RECOGNIZE), multidimensional arrays (SQL/MDA), and graph storage (SQL/PGQ).

This book is suitable for IT practitioners who process data in their daily work, including SQL beginners, intermediate and senior engineers with some foundation, and even experts who are already proficient in a particular database product.

You can buy it directly on Jingdong (JD.com).

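The author of this book is a friend of mine, and since the book was published only recently, the author gave me several copies. I don't need that many, so I'm giving them away to my followers for free.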

Free shipping, 4 copies given away for free!

Giveaway rules:

Leave a comment at the end of the article saying why you want this book. There is no word limit and no likes are required, although longer comments and more likes can raise your chances of winning. I will randomly pick 4 of my followers and send each of them a copy. You must be following me; if you are not, there is nothing I can do.

To hear about the results sooner, you can join my community QQ group: 1093169351.

Origin blog.csdn.net/qq_17623363/article/details/121235483