Climbing the ladder: database index

The official MySQL definition: an index is a sorted data structure that helps MySQL retrieve data efficiently.
Data structure website: www.cs.usfca.edu

1. Index data structure: red-black tree, Hash, B+ tree

Index data structure:

  • Binary tree
    • Disadvantages: When keys arrive in sorted order, every node piles up on one side, and the binary tree degenerates into a singly linked list.
  • Red black tree
    • Concept: a self-balancing binary search tree; when one side of the tree grows too long, it rebalances automatically.
    • Disadvantages: Each node holds only one key, so with a large amount of data the tree becomes tall, inserts slow down, and a lookup needs more disk read/write IOs.
  • Hash
    • Concept: a hashed storage structure; entries are scattered across the disk in no particular order.
    • Disadvantages: Equality lookups are fast, but range searches (greater than, less than) perform poorly because the entries are not stored in key order.
  • B-Tree
    • Concept:
      • All leaf nodes have the same depth, and there are no pointers between leaf nodes
      • Index elements are not duplicated
      • The elements in a node are sorted in ascending order from left to right
  • B+Tree
    • Non-leaf nodes store no data, only index keys (redundant copies), so each node can hold more keys
    • Leaf nodes contain all index fields
    • The leaf nodes are connected with pointers to improve the performance of interval access
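
The degeneration described for the plain binary tree is easy to reproduce. The following is a minimal Python sketch, not MySQL code; the names `Node`, `insert`, `height`, and `build` are illustrative. It inserts already-sorted keys (as an auto-increment column would produce) into an unbalanced binary search tree and compares the resulting height with the best possible height for the same number of keys.

```python
import math


class Node:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None


def insert(root, key):
    """Insert key into an unbalanced binary search tree; return the root."""
    new = Node(key)
    if root is None:
        return new
    cur = root
    while True:
        if key < cur.key:
            if cur.left is None:
                cur.left = new
                return root
            cur = cur.left
        else:
            if cur.right is None:
                cur.right = new
                return root
            cur = cur.right


def height(root):
    """Number of levels in the tree, counted level by level."""
    levels = 0
    level = [root] if root is not None else []
    while level:
        levels += 1
        level = [c for n in level for c in (n.left, n.right) if c is not None]
    return levels


def build(keys):
    root = None
    for k in keys:
        root = insert(root, k)
    return root


n = 100
sorted_height = height(build(range(1, n + 1)))   # every key goes to the right
balanced_height = math.ceil(math.log2(n + 1))    # best case for n keys
```

With 100 sorted keys the unbalanced tree is 100 levels tall, while a balanced binary tree would need 7. A red-black tree keeps the height near the balanced figure, and a B-Tree or B+Tree reduces it further by storing many keys per node.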

2. Index concept, database storage engine

MySQL index:

The first level of the index (the root node) is resident: it is kept directly in memory (RAM).

SQL to query the size of a single index node (page):

show global status like 'Innodb_page_size';
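
The page size explains why a B+Tree stays shallow. Below is a rough back-of-the-envelope sketch in Python. The 16 KB page size is the InnoDB default; the key size, pointer size, and average row size are assumptions chosen for illustration, and page headers and fill factor are ignored.

```python
PAGE_SIZE = 16 * 1024   # InnoDB default page size in bytes
KEY_BYTES = 8           # assumption: BIGINT primary key
POINTER_BYTES = 6       # assumption: size of a child-page pointer
ROW_BYTES = 1024        # assumption: average size of a full row

# How many (key, pointer) pairs fit in one non-leaf page.
keys_per_internal_page = PAGE_SIZE // (KEY_BYTES + POINTER_BYTES)

# How many full rows fit in one leaf page of a clustered index.
rows_per_leaf_page = PAGE_SIZE // ROW_BYTES


def max_rows(levels):
    """Approximate row capacity of a B+Tree with the given number of levels."""
    return keys_per_internal_page ** (levels - 1) * rows_per_leaf_page


two_level = max_rows(2)
three_level = max_rows(3)
```

Under these assumptions a non-leaf page holds 1170 keys, so a three-level tree addresses about 21.9 million rows, and a lookup touches only three pages.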

MySQL storage engine:

MyISAM storage engine (non-clustered index)

Table indexes and table data are stored separately: the index is stored in the .MYI file, and the data in the .MYD file.

The leaf nodes of the index store the physical address of the corresponding data row, which is why this is called a non-clustered index.

InnoDB index implementation (clustered index):
  • The table data file itself is an index structure file organized as a B+Tree (the index and data are in the same file)

  • Clustered index-leaf nodes contain complete data records

  • InnoDB tables should have a primary key, and an integer auto-increment primary key is recommended
    (because a B+Tree keeps keys in ascending order from left to right, integers are cheap to compare and auto-increment values are always appended at the right end, which makes inserting and finding data easier)

  • Non-primary-key (secondary) index: the leaf nodes store the primary key value (for consistency and to save storage space)

If an InnoDB table is created without a primary key, InnoDB uses the first unique index whose columns are all NOT NULL as the primary key; if no such index exists, it creates a hidden row ID column to serve as the primary key.
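
The relationship between the clustered index and a secondary index can be modelled in a few lines. This is a simplified Python sketch, not InnoDB's actual layout: a dict stands in for the clustered B+Tree, a sorted list stands in for the secondary index, and the table contents and the name `find_by_name` are made up for illustration.

```python
import bisect

# Clustered index: primary key -> complete row (leaf nodes hold the data).
clustered = {
    1: {"id": 1, "name": "alice", "age": 30},
    2: {"id": 2, "name": "bob", "age": 25},
    3: {"id": 3, "name": "carol", "age": 41},
}

# Secondary index on name: sorted (name, primary key) pairs.
# The leaf entries hold the primary key, not the row itself.
secondary_name = sorted((row["name"], pk) for pk, row in clustered.items())


def find_by_name(name):
    """Look up rows by name: secondary index first, then the clustered index."""
    i = bisect.bisect_left(secondary_name, (name,))
    rows = []
    while i < len(secondary_name) and secondary_name[i][0] == name:
        pk = secondary_name[i][1]      # step 1: primary key from the secondary index
        rows.append(clustered[pk])     # step 2: full row from the clustered index
        i += 1
    return rows


bob_rows = find_by_name("bob")
```

Because the secondary index stores only the primary key, a lookup by name takes two steps, but the row data exists in one place only.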

3. The underlying data structure of the joint index and the optimization principle of the leftmost prefix

Joint index (composite index):

Underlying storage structure: with multiple fields, entries are sorted by the first field; where the first field is equal, they are sorted by the second field, and so on, so that the B+Tree is in ascending order from left to right at each level.

The leftmost prefix optimization is used for the joint index:

When querying: if the condition does not include the first field of the composite index, the index will not be used. Looked at on their own, the second and third fields are out of order across the whole index, so they cannot be searched through it.
In the original figure's example, a query condition that does not include the first field (value "10002") will not use the joint index.
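
The leftmost prefix principle follows directly from the sort order. Here is a small Python sketch in which a sorted list of tuples stands in for the joint index; the column names (`emp_no`, `title`, `from_date`) and the sample rows are assumptions for illustration, not data from the original figure.

```python
import bisect

# Hypothetical joint index on (emp_no, title, from_date), kept in sorted order.
index = sorted([
    (10001, "Senior Engineer", "1986-06-26"),
    (10002, "Staff", "1996-08-03"),
    (10003, "Senior Engineer", "1995-12-03"),
    (10004, "Engineer", "1986-12-01"),
    (10004, "Senior Engineer", "1995-12-01"),
])


def seek_first_column(emp_no):
    """Binary search works because entries are ordered by emp_no first."""
    lo = bisect.bisect_left(index, (emp_no,))
    hi = bisect.bisect_left(index, (emp_no + 1,))
    return index[lo:hi]


def scan_second_column(title):
    """Without emp_no, titles are not globally ordered: every entry is checked."""
    return [entry for entry in index if entry[1] == title]


titles_in_index_order = [entry[1] for entry in index]
titles_are_sorted = titles_in_index_order == sorted(titles_in_index_order)
```

Here `titles_are_sorted` is false: the second column is ordered only within a group of equal first-column values, which is why a condition on it alone cannot seek into the index.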

4. MySQL index optimization military regulations

Excerpted from material found online.

(1) Core military regulations
(1) Do not do computation in the database: CPU-heavy calculations must be moved to the business layer
(2) Control single-table data volume: keep a single table within 10 million rows
(3) Control the number of columns: keep the number of fields within 20
(4) Balance normalization and redundancy: to improve efficiency, sacrifice normal forms and store redundant data
(5) Reject the 3 Bs: reject big SQL, big transactions, and big batches

(2) Field military regulations

(6) Make good use of numeric types
tinyint (1 byte)
smallint (2 bytes)
mediumint (3 bytes)
int (4 bytes)
bigint (8 bytes)
bad case: int(1)/int(11) (the number in parentheses is only a display width; both still take 4 bytes)

(7) Convert characters to numbers
Use int unsigned instead of char(15) to store IPv4 addresses
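
As a sketch of the conversion, the Python standard library's `ipaddress` module can turn a dotted IPv4 string into a 32-bit integer and back; the helper names `ip_to_int` and `int_to_ip` are illustrative. On the MySQL side, the built-in functions INET_ATON() and INET_NTOA() perform the same conversion.

```python
import ipaddress


def ip_to_int(ip):
    """Convert a dotted IPv4 string to its 32-bit integer value."""
    return int(ipaddress.IPv4Address(ip))


def int_to_ip(value):
    """Convert a 32-bit integer back to a dotted IPv4 string."""
    return str(ipaddress.IPv4Address(value))


as_int = ip_to_int("192.168.1.1")
round_trip = int_to_ip(as_int)
largest = ip_to_int("255.255.255.255")
```

The largest address maps to 4294967295, which does not fit in a signed 4-byte int, so the column needs to be unsigned.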

(8) Prefer enum or set,
for example: sex enum('F','M')

(9) Avoid the use of NULL fields.
NULL fields are difficult to query and optimize.
Indexes on NULL fields need additional space.
Composite indexes on NULL fields are ineffective.
Bad case:
name char(32) default null
age int not null
Good case:
age int not null default 0

(10) Use text/blob sparingly
varchar performs much better than text.
If blobs are really unavoidable, split them out into a separate table.

(11) Do not store pictures in the database: do you need to explain?

(3) Index military regulations

(12) Use indexes cautiously and reasonably.
Indexes improve queries and slow down updates.
More indexes are not always better (leave out the ones you can do without, but add the ones you need).
Columns where one value covers too many records, such as "gender", are not suitable for indexing.

(13) A prefix index must be built for character fields

(14) Do not do column operations on indexed columns
Bad case:
select id from t where age + 1 = 10;
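
The effect of operating on an indexed column can be observed with any engine that exposes its query plan. The sketch below uses SQLite through Python's built-in `sqlite3` module purely because it runs without a server; it is a stand-in, and MySQL's EXPLAIN output looks different. The table `t`, the index name `idx_age`, and the sample data are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, age INTEGER)")
conn.execute("CREATE INDEX idx_age ON t (age)")
conn.executemany("INSERT INTO t (age) VALUES (?)",
                 [(i % 90,) for i in range(1000)])


def plan(sql):
    """Return the human-readable detail column of EXPLAIN QUERY PLAN."""
    return [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]


def ids(sql):
    return sorted(row[0] for row in conn.execute(sql))


bad_sql = "SELECT id FROM t WHERE age + 1 = 10"   # expression on the column
good_sql = "SELECT id FROM t WHERE age = 9"       # constant moved to the right

bad_plan = plan(bad_sql)
good_plan = plan(good_sql)
same_rows = ids(bad_sql) == ids(good_sql)
```

Both queries return the same rows, but SQLite reports a scan for the first and an index search for the second. Moving the arithmetic to the constant side keeps the bare column comparable against the index.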

(15) InnoDB primary keys should use auto-increment columns (SK: the blogger does not agree)
The primary key builds the clustered index.
Primary keys should not be modified.
Strings should not be used as primary keys.
If no primary key is specified, InnoDB uses a unique, non-null index instead.

(16) Do not use foreign keys
Let the application enforce the constraints instead.

(4) SQL military regulations

(17) Keep SQL statements as simple as possible.
One SQL statement can run on only one CPU.
Split large statements into small ones to reduce lock time.
One large SQL statement can block the entire database.

(18) Simple transactions

Keep transactions as short as possible.
Bad case:
uploading a picture inside a transaction

(19) Avoid the use of trig/func
Do not use triggers or functions;
let client programs do the work instead.

(20) Do not use select *
It consumes CPU, IO, memory, and bandwidth.
Such programs are not extensible.

(21) Rewrite OR as IN()
The efficiency of OR is at the n level.
The efficiency of IN is at the log(n) level.
The number of values in IN is recommended to be kept within 200.
select id from t where phone='159' or phone='136';
=>
select id from t where phone in ('159', '136');

(22) Rewrite OR as UNION
MySQL's index merge is not very smart.
select id from t where phone = '159' or name = 'john';
=>
select id from t where phone = '159'
union
select id from t where name = 'john';

(23) Avoid negative queries and leading % wildcards

(24) Use count(*) with caution

(25) Same as above

(26) Efficient paging with limit
The larger the limit offset, the lower the efficiency.
select id from t limit 10000, 10;
=>
select id from t where id > 10000 limit 10;
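
The rewrite above is usually called keyset (or seek) pagination. A minimal sketch using Python's built-in `sqlite3` module as a stand-in for MySQL follows; the table contents and the helper names `page_by_offset` and `page_after` are assumptions for illustration, and an explicit `ORDER BY id` is added so that both forms return rows in a defined order.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY)")
conn.executemany("INSERT INTO t (id) VALUES (?)",
                 [(i,) for i in range(1, 20001)])


def page_by_offset(offset, size):
    """Offset paging: the engine must step over `offset` rows first."""
    sql = "SELECT id FROM t ORDER BY id LIMIT ? OFFSET ?"
    return [r[0] for r in conn.execute(sql, (size, offset))]


def page_after(last_id, size):
    """Keyset paging: seek straight to the first id after the last one seen."""
    sql = "SELECT id FROM t WHERE id > ? ORDER BY id LIMIT ?"
    return [r[0] for r in conn.execute(sql, (last_id, size))]


offset_page = page_by_offset(10000, 10)
keyset_page = page_after(10000, 10)

# Typical usage: carry the last id of one page into the next request.
first = page_after(0, 10)
second = page_after(first[-1], 10)
```

The two forms return the same page here only because the ids are contiguous from 1. In general the caller should pass the last id of the previous page, as in the usage lines, rather than derive it from the page number.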

(27) Use union all instead of union
union has deduplication overhead

(28) Use less join

(29) Use group by
It groups and automatically sorts.

(30) Please use the same type comparison

(31) Use load data to import data
load data is about 20 times faster than insert.

(32) Break up batch updates

(33) Performance analysis tools
show profile;
mysqlsla;
mysqldumpslow;
explain;
show slow log;
show processlist;
show query_response_time(percona)


Origin blog.csdn.net/qq845484236/article/details/107753982