MySQL database indexes, transactions, and storage engines

One, MySQL indexes

(1) The concept of an index

An index is a sorted structure that stores each indexed value together with the location of the row that contains it, much as a pointer in C refers to the memory address of a data record. (In MyISAM that location is the physical address of the row; in an InnoDB secondary index it is the row's primary key value.)
With an index, the database can locate a row without scanning the entire table: it first finds the row's location in the index and then reads the row itself, which speeds up queries.
An index is like the table of contents of a book: you find the content you need by looking up its page number in the table of contents.
An index is built by sorting the values of one or more columns of a table.
The purpose of an index is to speed up searching and sorting of the records in the table.
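
As a minimal sketch of the lookup described above (MySQL syntax; the table demo_users, its columns, and the index name city_index are hypothetical names invented for illustration), comparing the two EXPLAIN results shows whether the server scans the table or goes through the index. The plan values mentioned in the comments are what one would typically expect, not guaranteed output; the actual plan depends on the server version and the data.

```sql
-- Hypothetical table used only for illustration.
CREATE TABLE demo_users (
    id   INT NOT NULL AUTO_INCREMENT,
    name VARCHAR(50) NOT NULL,
    city VARCHAR(50) NOT NULL,
    PRIMARY KEY (id)
);

-- Without an index on city, MySQL has to examine every row
-- (EXPLAIN typically reports type = ALL).
EXPLAIN SELECT * FROM demo_users WHERE city = 'Nanjing';

-- Add an index on the column used in the WHERE clause.
CREATE INDEX city_index ON demo_users (city);

-- The same query can now look the value up in the index
-- (EXPLAIN typically reports type = ref and key = city_index).
EXPLAIN SELECT * FROM demo_users WHERE city = 'Nanjing';
```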

(2) The role of indexes

With appropriate indexes in place, the database can use fast lookup techniques to speed up queries considerably. This is the main reason for creating indexes.
When a table is large or a query involves several tables, an index can speed up the query by orders of magnitude.
Indexes reduce the I/O cost of the database, and they also reduce its sorting cost.
A unique index guarantees that the indexed values are unique across the rows of the table.
Indexes speed up joins between tables.
Indexes can greatly reduce the time spent on grouping and sorting.

Side effects of indexing:

Indexes require additional disk space.
In the MyISAM engine, the index file and the data file are separate, and the index file stores the addresses of the data records.
In the InnoDB engine, the table data file is itself an index file (the rows are stored in the primary key index).
Inserting and modifying data takes longer, because the indexes must be updated as well.

(3) Principles of index creation

Indexes can speed up database queries, but creating one is not appropriate in every situation.
An index consumes system resources: when an index exists, the database first searches the index and then locates the data row. An index that is used improperly adds to the load on the database.

Guidelines for creating indexes:

Primary keys and foreign keys should be indexed. A primary key is unique and a foreign key refers to the primary key of the related table, so both are used to locate rows quickly in queries.
As a rule of thumb, tables with more than about 300 rows should have indexes. Without one, every query has to traverse the whole table, which seriously hurts database performance.
Tables that are often joined to other tables should have an index on the join column.
Columns with poor uniqueness (few distinct values) are not suitable for indexing.
Columns that are updated very frequently are not suitable for indexing.
Columns that often appear in WHERE clauses, especially in large tables, should be indexed.
Indexes should be built on highly selective columns.
Indexes should be built on small columns. Do not index large text columns or other very long columns.
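
One common way to judge whether a column is selective enough is to divide its number of distinct values by the number of rows. The sketch below assumes a hypothetical table demo_orders with made-up sample rows; none of these names come from the original text. A ratio close to 1 suggests a good index candidate, while a low ratio (such as a status flag) suggests the index would help little.

```sql
-- Hypothetical table and data, for illustration only.
CREATE TABLE demo_orders (
    id      INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
    user_id INT NOT NULL,
    status  VARCHAR(10) NOT NULL
);

INSERT INTO demo_orders (user_id, status) VALUES
    (1, 'paid'), (2, 'paid'), (3, 'unpaid'), (4, 'paid');

-- Selectivity = distinct values / total rows.
-- With the sample rows, user_id gives 4/4 and status gives 2/4.
SELECT COUNT(DISTINCT user_id) / COUNT(*) AS user_id_selectivity,
       COUNT(DISTINCT status)  / COUNT(*) AS status_selectivity
FROM demo_orders;
```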

(4) Classification and creation of indexes

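Ordinary index: the most basic index type, with no restrictions such as uniqueness

Create the index directly:
CREATE INDEX index_name ON table_name (column_name[(length)]);

# (column_name(length)):
length is optional, here and below. If length is omitted, the whole column value is used for the index.
If length is given, only the first length characters of the column are indexed, which helps reduce the size of the index file.

# Index names should preferably end with "_index".

Create by altering the table:
ALTER TABLE table_name ADD INDEX index_name (column_name);

Specify the index when creating the table:
CREATE TABLE table_name (column1 data_type, column2 data_type[, ...], INDEX index_name (column_name));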
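
A concrete instance of the three forms may help. The table demo_articles and all index names below are hypothetical, chosen only to follow the "_index" naming suggestion; the sketch also shows a prefix length on a long column.

```sql
-- Form 1: declare the index together with the table.
CREATE TABLE demo_articles (
    id     INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
    title  VARCHAR(200) NOT NULL,
    author VARCHAR(50)  NOT NULL,
    INDEX author_index (author)
);

-- Form 2: CREATE INDEX with a prefix length.
-- Only the first 20 characters of title are stored in the index.
CREATE INDEX title_index ON demo_articles (title(20));

-- Form 3: ALTER TABLE ... ADD INDEX.
ALTER TABLE demo_articles ADD INDEX author_title_index (author, title(20));

-- The Sub_part column of the output shows 20 for the prefixed column.
SHOW INDEX FROM demo_articles;
```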

Unique index: similar to an ordinary index, except that every value in the indexed column must be unique

A unique index allows NULL values (unlike a primary key). For a composite unique index, the combination of column values must be unique.
Adding a unique key automatically creates a unique index.

Create a unique index directly:
CREATE UNIQUE INDEX index_name ON table_name (column_name);

Create by altering the table:
ALTER TABLE table_name ADD UNIQUE index_name (column_name);

Specify when creating the table:
CREATE TABLE table_name (column1 data_type, column2 data_type[, ...], UNIQUE index_name (column_name));
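
The difference from a primary key can be seen in a small sketch. The table demo_members and its index name are hypothetical: duplicate non-NULL values are rejected, while several NULL values are accepted because NULLs are not treated as equal to each other.

```sql
-- Hypothetical table with a unique index on a nullable column.
CREATE TABLE demo_members (
    id    INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
    email VARCHAR(100) NULL,
    UNIQUE email_index (email)
);

INSERT INTO demo_members (email) VALUES ('a@example.com');  -- succeeds
INSERT INTO demo_members (email) VALUES (NULL);             -- succeeds
INSERT INTO demo_members (email) VALUES (NULL);             -- succeeds: NULLs do not count as duplicates
INSERT INTO demo_members (email) VALUES ('a@example.com');  -- fails with a duplicate-entry error (1062)
```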

Primary key index: a special unique index that must be declared as PRIMARY KEY

Specify when creating the table:
CREATE TABLE table_name ([...], PRIMARY KEY (column_name));

Create by altering the table:
ALTER TABLE table_name ADD PRIMARY KEY (column_name);

Composite index (single-column and multi-column indexes): an index can be created on a single column or on several columns together

A composite index follows the leftmost-prefix rule. The index is sorted by its first column, then by the second, and so on, so a query can use it only when the WHERE conditions cover a leftmost prefix of the indexed columns: column 1 alone, columns 1 and 2, or columns 1, 2 and 3.
The order in which the conditions are written in the WHERE clause does not matter, because the optimizer rearranges them; what matters is which columns are constrained. A query that filters only on column 2 or column 3 generally cannot use the index for a lookup.

CREATE TABLE table_name (column1 data_type, column2 data_type, column3 data_type, INDEX index_name (column1, column2, column3));

select * from table_name where column1='...' AND column2='...' AND column3='...';
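
The leftmost-prefix behavior can be checked with EXPLAIN. This is a sketch with a hypothetical table demo_staff and a hypothetical three-column index; the comments describe what the optimizer typically does, and the real plan may differ by version and data. What decides index use is which columns are constrained, not the order in which the conditions are written.

```sql
-- Hypothetical table with a composite index on (dept, job_title, hire_year).
CREATE TABLE demo_staff (
    id        INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
    name      VARCHAR(50) NOT NULL,
    dept      VARCHAR(30) NOT NULL,
    job_title VARCHAR(30) NOT NULL,
    hire_year INT NOT NULL,
    INDEX dept_title_year_index (dept, job_title, hire_year)
);

-- Uses the index: the condition covers the leftmost column.
EXPLAIN SELECT * FROM demo_staff WHERE dept = 'sales';

-- Also uses the index: both leading columns are constrained,
-- even though they are written in the opposite order.
EXPLAIN SELECT * FROM demo_staff WHERE job_title = 'manager' AND dept = 'sales';

-- Typically a full table scan: the leading column dept is not constrained.
EXPLAIN SELECT * FROM demo_staff WHERE job_title = 'manager';
```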

Full-text index (FULLTEXT): suited to fuzzy keyword searches, for example retrieving text information from the body of an article

Before MySQL 5.6, FULLTEXT indexes could be used only with the MyISAM engine. From version 5.6 on, the InnoDB engine also supports them.
Full-text indexes can be created on columns of type CHAR, VARCHAR, or TEXT. A table may have more than one full-text index.

Create the index directly:
CREATE FULLTEXT INDEX index_name ON table_name (column_name);

Create by altering the table:
ALTER TABLE table_name ADD FULLTEXT index_name (column_name);

Specify the index when creating the table:
CREATE TABLE table_name (column1 data_type[, ...], FULLTEXT index_name (column_name));

# The data type can be CHAR, VARCHAR, or TEXT

Query using the full-text index:
SELECT * FROM table_name WHERE MATCH(column_name) AGAINST('search text');
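
A small end-to-end sketch, assuming MySQL 5.6 or later so that InnoDB accepts the FULLTEXT index. The table demo_posts, its rows, and the index name are hypothetical. Note that the column list in MATCH() has to be the same as the column list of the full-text index.

```sql
-- Hypothetical table with one full-text index covering two columns.
CREATE TABLE demo_posts (
    id    INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
    title VARCHAR(200),
    body  TEXT,
    FULLTEXT title_body_index (title, body)
) ENGINE=InnoDB;

INSERT INTO demo_posts (title, body) VALUES
    ('Index basics', 'An index speeds up searching and sorting records'),
    ('Transaction basics', 'A transaction groups several statements into one unit'),
    ('Storage engines', 'MyISAM and InnoDB store table data differently');

-- Natural-language search over both indexed columns.
SELECT id, title
FROM demo_posts
WHERE MATCH(title, body) AGAINST('index');

-- Boolean mode: rows must contain "basics" and must not contain "transaction".
SELECT id, title
FROM demo_posts
WHERE MATCH(title, body) AGAINST('+basics -transaction' IN BOOLEAN MODE);
```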

(5) View the index

show index from table_name;
show keys from table_name;

The meaning of each field:

Field name	Meaning
Table	The name of the table
Non_unique	0 if the index cannot contain duplicate values; 1 if it can
Key_name	The name of the index
Seq_in_index	The position of the column within the index, starting from 1
Column_name	The name of the column
Collation	How the column is sorted in the index: 'A' (ascending) or NULL (not sorted)
Cardinality	An estimate of the number of unique values in the index
Sub_part	The number of indexed characters if only a prefix of the column is indexed; NULL if the entire column is indexed
Packed	How the key is packed (compressed); NULL if it is not
Null	YES if the column may contain NULL values; otherwise empty
Index_type	The index method used (BTREE, FULLTEXT, HASH, RTREE)
Comment	Remarks

(6) Delete the index

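Drop the index directly:
DROP INDEX index_name ON table_name;

Drop the index by altering the table:
ALTER TABLE table_name DROP INDEX index_name;

Drop the primary key index:
ALTER TABLE table_name DROP PRIMARY KEY;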

Two, MySQL transactions

(1) The concept of transactions

A transaction is a mechanism: a sequence of operations containing a group of database commands. The commands are submitted to the system, or withdrawn, together as a single unit, so either all of them are executed or none of them is.
A transaction is an indivisible logical unit of work. When the database system handles concurrent operations, the transaction is the smallest unit of control.
Transactions suit systems in which many users operate on the database at the same time, such as banking, insurance, and securities trading systems.
Because a transaction is applied as a whole, it keeps the data consistent.

(2) ACID characteristics of the transaction

ACID refers to the four properties that transactions must have in a reliable database management system (DBMS): Atomicity, Consistency, Isolation, and Durability.

Atomicity: a transaction is an indivisible unit of work; the operations in it either all take effect or none do

A transaction is a complete operation, and its elements cannot be separated.
All elements of the transaction must be committed or rolled back as a whole.
If any element of the transaction fails, the entire transaction fails.

Consistency: the integrity constraints of the database hold both before the transaction begins and after it ends

Before the transaction begins, the data stored in the database is in a consistent state.
While the transaction is in progress, the data may be temporarily inconsistent.
When the transaction completes successfully, the data must again be in a known consistent state.

Isolation: in a concurrent environment, when different transactions operate on the same data at the same time, each transaction has its own complete view of the data

Concurrent transactions that modify data are isolated from one another: a transaction must be independent and should not depend on or affect other transactions.
A transaction sees data either in the state it had before another transaction modified it or in the state after that transaction finished, never in an intermediate state.

Durability: once a transaction has completed, its changes are stored permanently in the database and will not be rolled back

The result of a committed transaction is permanent, whether or not the system later fails.
Once the transaction is committed, its effects are retained in the database permanently.

Concurrent transactions can interfere with one another in several ways:

Dirty read: a transaction reads data that another transaction has not yet committed and that may still be rolled back.
Non-repeatable read: two identical queries within one transaction return different data, because another transaction committed a modification between the two queries.
Phantom read: one transaction modifies every row in a table while another transaction inserts a new row into the same table. The user of the first transaction then finds a row that was not modified, as if it were a phantom.
Lost update: two transactions, A and B, read the same record. A modifies the record first, and then B modifies it without knowing about A's change. When B commits, its result overwrites A's.

MySQL transaction isolation levels:

read uncommitted: uncommitted data can be read; does not prevent dirty reads
read committed: only committed data can be read; prevents dirty reads
repeatable read: prevents dirty reads and non-repeatable reads (the MySQL default)
serializable: prevents dirty reads, non-repeatable reads, and phantom reads; comparable to locking the table

The default transaction isolation level in MySQL is repeatable read,
while in Oracle and SQL Server it is read committed.

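Query the global transaction isolation level:
show global variables like '%isolation%';
SELECT @@global.tx_isolation;    # removed in MySQL 8.0; use @@global.transaction_isolation there

Query the session transaction isolation level:
show session variables like '%isolation%';
SELECT @@session.tx_isolation;   # @@session.transaction_isolation in MySQL 8.0
SELECT @@tx_isolation;           # @@transaction_isolation in MySQL 8.0

Set the global transaction isolation level:
set global transaction isolation level read committed;

Set the session transaction isolation level:
set session transaction isolation level read committed;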
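
The effect of the isolation level can be observed with two connections. The following is a sketch assuming an InnoDB table; the table demo_accounts and its row are hypothetical, and the statements marked "Session A" and "Session B" are meant to be typed into two separate mysql clients in the order shown.

```sql
-- Setup (run once, from either session).
CREATE TABLE demo_accounts (
    id      INT NOT NULL PRIMARY KEY,
    balance INT NOT NULL
) ENGINE=InnoDB;
INSERT INTO demo_accounts VALUES (1, 100);

-- Session A
SET SESSION TRANSACTION ISOLATION LEVEL READ COMMITTED;
START TRANSACTION;
SELECT balance FROM demo_accounts WHERE id = 1;   -- reads 100

-- Session B (second connection, autocommit enabled)
UPDATE demo_accounts SET balance = 50 WHERE id = 1;

-- Session A, still inside the same transaction
SELECT balance FROM demo_accounts WHERE id = 1;   -- reads 50: a non-repeatable read
COMMIT;
```

If session A uses REPEATABLE READ instead, its second SELECT keeps returning the value from the first read until the transaction ends.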

Summary: in transaction management, atomicity is the foundation, isolation is the means, consistency is the goal, and durability is the result.

(3) Transaction control statement

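BEGIN or START TRANSACTION: explicitly starts a transaction.

COMMIT or COMMIT WORK: commits the transaction and makes all of its changes to the database permanent.

ROLLBACK or ROLLBACK WORK: ends the transaction and undoes all of its uncommitted changes.

SAVEPOINT S1: creates a rollback point within the transaction. A transaction can contain several savepoints; "S1" is the name of the rollback point.

ROLLBACK TO [SAVEPOINT] S1: rolls the transaction back to the named savepoint.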
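
Put together, the statements above can be used as in the following sketch. It assumes an InnoDB table; the table demo_wallets, its rows, and the savepoint name s1 are hypothetical. Rolling back to the savepoint undoes only the work done after it, and the final COMMIT makes the remaining changes permanent.

```sql
-- Hypothetical table: two wallets with 100 each.
CREATE TABLE demo_wallets (
    id      INT NOT NULL PRIMARY KEY,
    balance INT NOT NULL
) ENGINE=InnoDB;
INSERT INTO demo_wallets VALUES (1, 100), (2, 100);

START TRANSACTION;
UPDATE demo_wallets SET balance = balance - 30 WHERE id = 1;   -- wallet 1 becomes 70
SAVEPOINT s1;
UPDATE demo_wallets SET balance = balance + 300 WHERE id = 2;  -- mistaken amount
ROLLBACK TO SAVEPOINT s1;                                      -- undoes only the second update
UPDATE demo_wallets SET balance = balance + 30 WHERE id = 2;   -- wallet 2 becomes 130
COMMIT;

-- Wallet 1 now holds 70 and wallet 2 holds 130.
SELECT id, balance FROM demo_wallets ORDER BY id;
```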

(4) Controlling transactions with SET

SET AUTOCOMMIT=0;						# disable autocommit
SET AUTOCOMMIT=1;						# enable autocommit (the MySQL default is 1)
SHOW VARIABLES LIKE 'AUTOCOMMIT';		# show the current AUTOCOMMIT value

If autocommit is turned off:

All statements issued on the current connection are treated as one transaction, which does not end until you enter rollback or commit. Until it ends, other MySQL connections cannot see any of the changes made in the current session (unless they use the read uncommitted level).

If autocommit is turned on:

MySQL treats each SQL statement as its own transaction and commits it automatically.
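
The two modes can be compared in a single session. This sketch assumes an InnoDB table; the table demo_notes and its rows are hypothetical.

```sql
-- Hypothetical table for the comparison.
CREATE TABLE demo_notes (
    id   INT NOT NULL PRIMARY KEY,
    note VARCHAR(50)
) ENGINE=InnoDB;

-- Autocommit off: the insert stays uncommitted until COMMIT or ROLLBACK.
SET AUTOCOMMIT = 0;
INSERT INTO demo_notes VALUES (1, 'first');
ROLLBACK;                             -- the insert is undone
SELECT COUNT(*) FROM demo_notes;      -- returns 0

-- Autocommit on: each statement is committed as soon as it finishes.
SET AUTOCOMMIT = 1;
INSERT INTO demo_notes VALUES (2, 'second');
ROLLBACK;                             -- nothing left to undo
SELECT COUNT(*) FROM demo_notes;      -- returns 1
```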

Three, MySQL storage engines

(1) MyISAM

MyISAM tables support 3 different storage formats:

Static (fixed-length) tables
Static tables are the default storage format. All columns in a static table have a fixed length, so every record has the same length. This format is very fast to store and read, easy to cache, and easy to recover after a failure; its disadvantage is that it usually takes up more space than a dynamic table.

Dynamic tables
Dynamic tables contain variable-length columns, so records do not have a fixed length. This format takes up less space, but frequent updates and deletions fragment the table, so you need to run the OPTIMIZE TABLE statement or the myisamchk -r command periodically to restore performance, and recovery after a failure is relatively difficult.

Compressed tables
Compressed tables are created with the myisampack tool and take up very little space. Because each record is compressed individually, the access overhead is very small.

Production scenarios suited to MyISAM:

Businesses that do not need transaction support
Workloads that are mostly reads or mostly writes
Not suitable for workloads in which both reads and writes are frequent
Businesses with relatively low concurrent read and write access
Businesses with relatively few data modifications
Services that do not require strong data consistency
Servers with relatively limited hardware resources

(2) InnoDB

InnoDB features:

Whether reads and writes block each other depends on the transaction isolation level
Caches both indexes and data very efficiently
Table data is stored clustered with the primary key
Supports partitioning and tablespaces, similar to the Oracle database
Supports foreign key constraints; full-text indexes are supported from MySQL 5.6 on, but not in earlier versions

InnoDB applicable production scenarios:

Servers with relatively ample hardware resources

InnoDB uses row-level locking, but a statement that cannot use an index has to scan the whole table and locks every row it scans, which in effect locks the table, for example: update table_name set a=1 where user like '%lic%';

InnoDB does not store the number of rows in a table. For select count(*) from table_name; InnoDB has to scan the table to count the rows, whereas MyISAM simply reads the stored row count.
Note that MyISAM also has to scan the table when the count(*) statement has a where condition. For an AUTO_INCREMENT column, InnoDB requires an index in which that column is the first (or only) column, whereas in a MyISAM table the column can be a later part of a composite index together with other columns.
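
The AUTO_INCREMENT difference can be illustrated as follows. The table names and columns are hypothetical. In the MyISAM table the auto-increment column is the second column of the composite primary key; for InnoDB the same key alone is rejected, so the sketch adds a separate index that starts with the auto-increment column.

```sql
-- MyISAM: the AUTO_INCREMENT column may be a later column of a composite key.
-- Numbering then restarts for each distinct value of grp.
CREATE TABLE demo_animals_myisam (
    grp  VARCHAR(10) NOT NULL,
    id   INT NOT NULL AUTO_INCREMENT,
    name VARCHAR(30) NOT NULL,
    PRIMARY KEY (grp, id)
) ENGINE=MyISAM;

-- InnoDB: with only PRIMARY KEY (grp, id) this definition fails (error 1075),
-- because id is not the first column of any index.
-- Adding an index that starts with id makes it valid.
CREATE TABLE demo_animals_innodb (
    grp  VARCHAR(10) NOT NULL,
    id   INT NOT NULL AUTO_INCREMENT,
    name VARCHAR(30) NOT NULL,
    PRIMARY KEY (grp, id),
    INDEX id_index (id)
) ENGINE=InnoDB;
```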

(3) Storage engine operation commands

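# View the storage engines supported by the system
show engines;

# View the storage engine used by a table
Method 1:
show table status from database_name where name='table_name'\G

Method 2:
use database_name;
show create table table_name;

# Change the storage engine
1. Change it with alter table
use database_name;
alter table table_name engine=MyISAM;

2. Set the default storage engine in the /etc/my.cnf configuration file and restart the service
vim /etc/my.cnf

[mysqld]
default-storage-engine=INNODB

systemctl restart mysql.service

Note: this method affects only tables created after the configuration file has been changed and the mysql service restarted; existing tables are not changed.

3. Specify the storage engine when creating the table with create table
use database_name;
create table table_name (column1 data_type, ...) engine=MyISAM;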
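
As an additional way to check the result, the engine of a table can also be read from information_schema. This is a sketch with a hypothetical table demo_engine_test; it assumes a default database has already been selected with use.

```sql
-- Hypothetical table created with an explicit engine.
CREATE TABLE demo_engine_test (
    id INT NOT NULL PRIMARY KEY
) ENGINE=MyISAM;

-- Read the engine of the table from the data dictionary.
SELECT TABLE_NAME, ENGINE
FROM information_schema.TABLES
WHERE TABLE_SCHEMA = DATABASE()
  AND TABLE_NAME = 'demo_engine_test';

-- Convert the table and check again; ENGINE should now show InnoDB.
ALTER TABLE demo_engine_test ENGINE=InnoDB;

SELECT TABLE_NAME, ENGINE
FROM information_schema.TABLES
WHERE TABLE_SCHEMA = DATABASE()
  AND TABLE_NAME = 'demo_engine_test';
```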