Case study: the difference between MySQL inner join and left join in update statements, and why the small result set should drive the large result set

Scenario description

Take a scenario as an example:

Document A: Downstream sub-table (small data magnitude)
Document B: Downstream main table (small data magnitude)
Document C: Midstream sub-table (small data magnitude)
Document D: Midstream main table (small data magnitude)
Document E: Upstream sub-table (small data magnitude)
Document F: Upstream main table (larger data magnitude than other tables)

Requirement: copy the value of a field in document F into a field in document A. All the tables from A to F can be related through indexed id columns, but the joins must follow the order from A to F.

The connections between these tables look like this:

a join b on a.id = b.id
b join c on b.id = c.mainId
c join d on c.id = d.tableId
d join e on d.id = e.tid
e join f on e.tid = f.code
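
Written as one statement, the chain above could look like the following sketch. It assumes the tables are literally named a through f and uses only the join columns listed above; the column aliases are made up for illustration.

```sql
-- Follow the chain from a to f in the required order.
-- Only rows of a that reach f through every link are returned.
SELECT a.id   AS a_id,
       f.code AS f_code
FROM a
JOIN b ON a.id = b.id
JOIN c ON b.id = c.mainId
JOIN d ON c.id = d.tableId
JOIN e ON d.id = e.tid
JOIN f ON e.tid = f.code;
```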

The difference between inner join and left join

When we write the update statement, we definitely want to use join to connect the tables. But is it better to use inner join or left join?

  • left join:
    select a.*,b.* from a left join b on a.id = b.id joins the two tables. If a row in table a has no row in table b that satisfies the on condition a.id = b.id, the row from table a is still returned. Because table b has no matching data for it, the b.* columns of that row are filled with null.

  • inner join:
    select a.*,b.* from a inner join b on a.id = b.id joins the two tables. If a row in table a has no row in table b that satisfies the on condition a.id = b.id, that row of table a is left out of the result.

Given these definitions, left join is common in select statements: even when only a few rows of table a have a match, left join still returns every row of table a.
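
A small sketch of the two definitions with sample data. The tables demo_a and demo_b and the label column are hypothetical names chosen for this illustration, not part of the scenario.

```sql
-- Hypothetical demo tables: id 2 exists in demo_a but not in demo_b.
CREATE TABLE demo_a (id INT PRIMARY KEY, label VARCHAR(10));
CREATE TABLE demo_b (id INT PRIMARY KEY, label VARCHAR(10));

INSERT INTO demo_a VALUES (1, 'a1'), (2, 'a2'), (3, 'a3');
INSERT INTO demo_b VALUES (1, 'b1'), (3, 'b3');

-- left join: all three rows of demo_a come back;
-- the demo_b columns are NULL for id 2.
--   (1, 'a1', 1, 'b1')
--   (2, 'a2', NULL, NULL)
--   (3, 'a3', 3, 'b3')
SELECT a.*, b.*
FROM demo_a a
LEFT JOIN demo_b b ON a.id = b.id
ORDER BY a.id;

-- inner join: the unmatched row (id 2) is dropped.
--   (1, 'a1', 1, 'b1')
--   (3, 'a3', 3, 'b3')
SELECT a.*, b.*
FROM demo_a a
INNER JOIN demo_b b ON a.id = b.id
ORDER BY a.id;
```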

Understanding "small result set drives large result set" from an index perspective

Whether you use left join or inner join, the small result set should drive the large result set. When table a joins table b, the table that is read first is the driving table, and each of its rows is looked up in the other table.

Let's look at the SQL from the previous example:

select a.*,b.* from a left join b on a.id = b.id

Assume table a holds 1 million rows and table b holds 100 rows. Written this way, the big table drives the small table. Count the lookups:

To join the two tables on the on condition, MySQL matches rows through the B+ tree index. It takes each of the 1 million rows of table a in turn and searches the B+ tree of table b for a match. Assuming one search of the B+ tree over table b's 100 entries costs 2 lookups, the total is 1 million * 2 = 2 million lookups.

If a small table drives a large table:

select a.*,b.* from b left join a on a.id = b.id

Now MySQL takes each of the 100 rows of table b in turn and searches the B+ tree of table a for a match. Assuming one search of the B+ tree over table a's 1 million entries costs 3 lookups, the total is 100 * 3 = 300 lookups.

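From the perspective of index matching, letting the small result set drive the large result set is far more efficient: 300 lookups instead of 2 million in this example. With left join we need to consciously put the small table on the left and the big table on the right. Keep in mind that swapping the sides of a left join also changes which table keeps its unmatched rows, so this only applies when either order gives the result you need.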

With inner join, however, the MySQL optimizer is free to reorder the tables and will generally choose the smaller one as the driving table, so the order in which you write them usually makes no difference to efficiency. With left join the left table is read before the right table, so this reordering is not done for you and you need to pay attention to it yourself!
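
One way to check which table MySQL actually picks as the driving table is EXPLAIN. The sketch below assumes the tables a and b from the example; the plan you see depends on your own data and indexes, so treat it as a way to verify rather than a guaranteed result.

```sql
-- EXPLAIN lists the tables in the order MySQL reads them,
-- so the first row of its output is the driving table.
EXPLAIN
SELECT a.*, b.*
FROM a
INNER JOIN b ON a.id = b.id;

-- STRAIGHT_JOIN behaves like an inner join, except that the left
-- table is always read before the right table. Here b is forced
-- to be the driving table, which is handy for comparing plans.
EXPLAIN
SELECT a.*, b.*
FROM b
STRAIGHT_JOIN a ON a.id = b.id;
```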

The update statement often uses inner join instead of left join.

For example, the following SQL:

(Goal: update a field in table a with a field from table f)

update a 
inner join b on a.id = b.id
inner join c on b.id = c.mainId
inner join d on c.id = d.tableId
inner join e on d.id = e.tid
inner join f on e.tid = f.code
set a.Demand_orgid = f.req_org_id
where xxx = xxx;

As a rule, use inner join in update statements.

Look at the SQL above and suppose every association used left join instead. The field being updated belongs to table a. If a row of table a fails to match at any link on the way to table f, all the columns from table f are filled with null for that row. The update then writes that null into table a, so every row of table a that cannot reach table f ends up with its field set to null!

But if you use inner join to update the fields of table a with the fields of table f, then as soon as any link fails to match, the rows of table a that cannot reach table f drop out of the join result, which means they are not updated.
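
The difference can be reproduced with a minimal sketch. To keep it short, the six-table chain is collapsed into two hypothetical tables, demo_a and demo_f, which reuse the column names Demand_orgid and req_org_id from the update above; the sample values are made up.

```sql
-- Hypothetical two-table version of the chain: id 2 has no match in demo_f.
CREATE TABLE demo_a (id INT PRIMARY KEY, Demand_orgid INT);
CREATE TABLE demo_f (code INT PRIMARY KEY, req_org_id INT);

INSERT INTO demo_a VALUES (1, 10), (2, 20);
INSERT INTO demo_f VALUES (1, 99);

-- left join: the unmatched row is still part of the join result,
-- so its field is overwritten with NULL.
UPDATE demo_a a
LEFT JOIN demo_f f ON a.id = f.code
SET a.Demand_orgid = f.req_org_id
WHERE a.id IN (1, 2);
-- demo_a is now (1, 99), (2, NULL)

-- Restore the original values before trying the inner join.
UPDATE demo_a SET Demand_orgid = id * 10 WHERE id IN (1, 2);

-- inner join: the unmatched row drops out of the join result
-- and keeps its old value.
UPDATE demo_a a
INNER JOIN demo_f f ON a.id = f.code
SET a.Demand_orgid = f.req_org_id
WHERE a.id IN (1, 2);
-- demo_a is now (1, 99), (2, 20)
```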

Think about it: if the row cannot even be matched, what is there to update it with? Null? For these reasons, inner join is what the requirement actually calls for.
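
If you want to see in advance which rows are affected by this choice, a query along the following lines may help. It is only a sketch against the table and column names used in this article: it lists the rows of table a that have no complete chain through to table f, which are the rows an inner join update leaves alone and a left join update would set to null.

```sql
-- Rows of a for which no complete path a -> b -> c -> d -> e -> f exists.
SELECT a.id, a.Demand_orgid
FROM a
WHERE NOT EXISTS (
    SELECT 1
    FROM b
    INNER JOIN c ON b.id = c.mainId
    INNER JOIN d ON c.id = d.tableId
    INNER JOIN e ON d.id = e.tid
    INNER JOIN f ON e.tid = f.code
    WHERE b.id = a.id
);
```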

Besides, with left join you have to think about the relative sizes of the tables, which is larger and which is smaller, so that the small result set drives the large result set. With inner join you generally do not need to worry about this, because the MySQL optimizer chooses the join order itself and will usually put the small table first and the big table after it.

Origin blog.csdn.net/weixin_44757863/article/details/129621584