MySQL principles

Reprinted: http://www.linuxidc.com/Linux/2014-04/99721.htm

1. MySQL foundation

MySQL is an open source relational database management system, originally developed by the Swedish company MySQL AB. MySQL 3.23 first came to administrators' attention in 2001 and has been widely used since. In 2008, Sun acquired MySQL AB. The first release after that acquisition, MySQL 5.1, introduced partitioning, row-based replication, and a plugin API, and removed the BerkeleyDB engine. Around the same time, Oracle, which had acquired Innobase Oy in 2005, released the InnoDB plugin, which later developed into the well-known InnoDB engine. In 2010, Oracle acquired Sun, which brought MySQL under Oracle as well. Oracle's first release after that acquisition, MySQL 5.5, mainly improved performance, scalability, replication, partitioning, and Windows support. At the time of writing, the current version is 5.7.

MySQL differs somewhat from other databases: its architecture works well in many different scenarios. The main reason is its storage engine architecture. The pluggable storage engine design separates query processing and other server tasks from data storage and retrieval, so a suitable storage engine can be chosen according to the actual needs of the business.

2. MySQL logical architecture



1. The top layer consists of client and connection services, including local socket communication and most client/server communication over protocols such as TCP/IP. It mainly handles connection processing, authorization and authentication, and related security measures. This layer introduces the concept of a thread pool, which provides threads for clients that have been authenticated and granted access. SSL-based secure connections can also be implemented at this layer. For each client that connects, the server also verifies which operations it is permitted to perform.

2. The second layer provides most of the core services, such as the SQL interface, the query cache, SQL parsing and optimization, and the execution of built-in functions. All functionality that spans storage engines, such as stored procedures and functions, is also implemented here. In this layer the server parses the query into an internal parse tree and optimizes it, for example by choosing the order in which tables are read and whether to use an index, and finally generates the corresponding execution plan. For a select statement, the server also checks the query cache. If the cache is large enough, this can greatly improve performance in read-heavy environments.

3. The storage engine layer is responsible for actually storing and retrieving data in MySQL. The server communicates with the storage engines through an API. Different storage engines offer different features, so an engine can be chosen to match actual needs.

4. The data storage layer stores the data on a file system running on the raw device and handles the interaction with the storage engine.

3. The concept of concurrency control and locks

When multiple operations need to modify the same data at the same time, problems such as dirty reads can occur unless access is coordinated. The database therefore needs good concurrency control, which in MySQL is implemented jointly by the server and the storage engines.

The most common solution to concurrency problems is a locking mechanism. Functionally, locks are divided into shared locks and exclusive locks, commonly called read locks and write locks. A select statement can take a read lock, which still allows other select operations to proceed: because no data changes in the process, this improves the database's throughput. Updating data requires a write lock, which blocks all other operations on the locked data and so avoids dirty reads and phantom reads. Locks also have a granularity, either table locks or row locks, which lock a whole table or individual rows respectively during data operations. These characteristics differ between storage engines.
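
The lock types described above can be requested explicitly. The following is a minimal sketch, not taken from the original article; the `accounts` table and its columns are hypothetical names used only for illustration.

```sql
-- Hypothetical table, assumed only for this example.
CREATE TABLE accounts (
    id      INT PRIMARY KEY,
    balance DECIMAL(10, 2) NOT NULL
) ENGINE = InnoDB;

-- Table-level locks:
LOCK TABLES accounts READ;    -- shared: other sessions can still read the table
SELECT balance FROM accounts WHERE id = 1;
UNLOCK TABLES;

LOCK TABLES accounts WRITE;   -- exclusive: other sessions wait for reads and writes
UPDATE accounts SET balance = balance - 10 WHERE id = 1;
UNLOCK TABLES;

-- Row-level locks (InnoDB), held until the transaction ends:
START TRANSACTION;
SELECT balance FROM accounts WHERE id = 1 LOCK IN SHARE MODE;  -- shared lock on the row
SELECT balance FROM accounts WHERE id = 1 FOR UPDATE;          -- exclusive lock on the row
COMMIT;
```

Which granularity is actually available depends on the engine: a MyISAM table supports only the table-level form.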

Most transactional storage engines in MySQL do not use a simple row-locking mechanism. For performance reasons, they generally combine row-level locking with multi-version concurrency control (MVCC), an approach also used by other mainstream relational databases such as Oracle. MVCC works by keeping a snapshot of the data as it existed at a point in time, which ensures that the data each transaction sees is consistent. For implementation details, see the third edition of "High Performance MySQL".
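
The snapshot behaviour can be observed with two client sessions. This is a sketch that assumes InnoDB and a hypothetical table `accounts(id, balance)` containing a row with `id = 1`; none of these names come from the original article.

```sql
-- Session A: open a transaction and take a snapshot
SET SESSION TRANSACTION ISOLATION LEVEL REPEATABLE READ;
START TRANSACTION WITH CONSISTENT SNAPSHOT;
SELECT balance FROM accounts WHERE id = 1;   -- reads from A's snapshot

-- Session B (autocommit on), while A's transaction is still open
UPDATE accounts SET balance = balance + 100 WHERE id = 1;  -- not blocked by A's plain read

-- Session A again
SELECT balance FROM accounts WHERE id = 1;   -- same value as A's first read
COMMIT;
SELECT balance FROM accounts WHERE id = 1;   -- a fresh snapshot now includes B's change
```

Neither session waits for the other here, which is the performance benefit MVCC offers over plain read locks.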

4. Transaction

1. Put simply, a transaction is a group of SQL statements that is treated atomically. The group can be understood as a single unit of work: either all of the statements take effect or none of them do. In MySQL, the following commands operate on transactions:

    start transaction;
    select ...
    update ...
    insert ...
    commit;

Note: Autocommit is enabled by default in MySQL, so unless a transaction is started explicitly, each statement is committed as its own transaction.
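
The autocommit setting can be inspected and changed per session. The original listing is missing from this copy of the article, so the statements below are a sketch of how this is commonly done, not the author's listing. The `accounts` table is a hypothetical InnoDB table.

```sql
SHOW VARIABLES LIKE 'autocommit';   -- ON by default

SET autocommit = 0;                 -- disable for the current session only
SET autocommit = 1;                 -- re-enable

-- An explicit transaction can be undone instead of committed
-- (accounts is a hypothetical table in a transactional engine):
START TRANSACTION;
UPDATE accounts SET balance = balance - 10 WHERE id = 1;
ROLLBACK;                           -- the update is discarded
```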



2. Transactions have the ACID properties:

Atomicity: All operations in a transaction either commit successfully together, or they all fail and are rolled back.

Consistency: The database always transitions from one consistent state to another.

Isolation: Changes made by one transaction are not visible to other transactions until they are committed.

Durability: Once a transaction commits, its modifications are permanently stored in the database.

3. Transaction isolation levels: The SQL standard defines four isolation levels:

READ UNCOMMITTED: Changes made in a transaction are visible to other transactions even before they are committed.

READ COMMITTED: A transaction sees only changes that other transactions have already committed. As a result, the same query run twice within one transaction may return different results.

REPEATABLE READ: A transaction sees changes committed by other transactions only after it has itself committed. This solves the problem of two identical queries in one transaction returning different results.

SERIALIZABLE: Transactions are effectively executed one after another: one transaction must commit before a conflicting transaction can proceed.

4. In MySQL, the following statements can be used to query and temporarily modify the isolation level:
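
The original listing is missing from this copy of the article. As a sketch, statements of the following kind are commonly used for this purpose. The variable is named `tx_isolation` in MySQL 5.x and `transaction_isolation` in later versions, so the pattern below is written to match either.

```sql
-- Query the current isolation level (matches tx_isolation and transaction_isolation)
SHOW VARIABLES LIKE '%isolation';

-- Change it for the current session only; the setting is lost when the session ends
SET SESSION TRANSACTION ISOLATION LEVEL READ COMMITTED;

-- Change it for the next transaction only
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
```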


5. Deadlock: Two or more transactions each hold a lock on a resource and request a lock on a resource held by the other, creating a circular wait. Some MySQL storage engines can detect such circular dependencies and return an error. InnoDB resolves a deadlock by rolling back the transaction that holds the fewest exclusive row locks.
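
A deadlock of this kind can be reproduced with two sessions that lock the same two rows in opposite order. This is an illustrative sketch that assumes a hypothetical InnoDB table `accounts(id, balance)` with rows 1 and 2.

```sql
-- Session A
START TRANSACTION;
UPDATE accounts SET balance = balance - 10 WHERE id = 1;   -- A locks row 1

-- Session B
START TRANSACTION;
UPDATE accounts SET balance = balance - 10 WHERE id = 2;   -- B locks row 2

-- Session A
UPDATE accounts SET balance = balance + 10 WHERE id = 2;   -- waits for B

-- Session B
UPDATE accounts SET balance = balance + 10 WHERE id = 1;   -- waits for A: circular wait
-- InnoDB detects the cycle and rolls one transaction back. That session receives:
-- ERROR 1213 (40001): Deadlock found when trying to get lock; try restarting transaction

-- Details of the most recent deadlock appear in the LATEST DETECTED DEADLOCK section of:
SHOW ENGINE INNODB STATUS;
```

The session whose transaction was rolled back can simply retry it. Accessing rows in a consistent order in every transaction avoids this particular cycle.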

5. MySQL storage engine and application scheme

1. MySQL uses a pluggable storage engine architecture, so different tables can use different storage engines according to their needs. The SHOW TABLE STATUS command displays status information for the tables in a database. Taking the user table as an example, the output contains the following fields:
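
The original command listing is missing from this copy of the article. A command of the following form produces the fields described here. That the `user` table is the one in the `mysql` system schema is an assumption.

```sql
-- Status of one table; the mysql.user table is assumed here
SHOW TABLE STATUS FROM mysql LIKE 'user';

-- In the mysql command-line client, ending the statement with \G instead of ;
-- prints one field per line, which is easier to read:
-- SHOW TABLE STATUS FROM mysql LIKE 'user'\G
```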



Name: The table name

Engine: The storage engine; in this example, the table uses MyISAM

Row_format: The row format. For MyISAM it is one of Dynamic, Fixed, or Compressed, meaning respectively that the table contains variable-length columns, that all columns are fixed-length, or that the table is compressed.

Rows: Display the number of rows in the table

Avg_row_length: Average row length (bytes)

Data_length: Data length (bytes)

Max_data_length: Maximum stored data length (bytes)

Data_free: Allocated but unused space, including space freed by deleting data

Collation: The table's default collation

Checksum: If enabled, the live checksum of the entire table

Create_options: Any other options specified when the table was created

Comment: Additional comment information, whose content varies with the storage engine

2. Storage engine introduction:

InnoDB engine:

1. Stores data in a tablespace, which consists of a series of data files managed by InnoDB;

2. Supports storing each table's data and indexes in a separate file (innodb_file_per_table);

3. Supports transactions, uses MVCC for concurrency control, implements all four standard transaction isolation levels, and supports foreign keys;

4. Builds tables on a clustered index, which gives high performance for primary key lookups;

5. Data files are platform-independent, so data can be migrated between platforms with different architectures;

6. Supports true hot backups through tools such as XtraBackup;

7. Includes internal optimizations such as predictive read-ahead and automatically building hash indexes in memory.

MyISAM engine:

1. The default engine in MySQL 5.1; does not support transactions or row-level locks;

2. Provides a large number of features such as full-text indexing, spatial functions, compression, and delayed updates;

3. Crash recovery after a database failure is poor;

4. Still well suited to read-only data and to data that can tolerate repair after a failure;

5. Also suitable for log servers, where the workload consists only of inserts and reads;

6. Does not store a table in a single file: data and indexes are stored in two separate files (.MYD and .MYI);

7. Locks the entire table instead of individual rows, so it is not suitable for write-heavy scenarios;

8. Supports index caching but not data caching.

Archive engine:

1. Only supports insert and select operations;

2. Buffers all writes and stores the data compressed; supports row-level locks but not transactions;

3. Suitable for high-speed inserts with compressed storage, which reduces I/O; well suited to logging and archive servers.

Blackhole engine:

1. Implements no storage mechanism: inserted data is discarded, but the statements are still written to the binary log;

2. Used in replication setups with special requirements.

CSV engine:

1. Can treat CSV files as tables, so stored data can be exported and opened directly in tools such as Excel;

2. Useful as a data exchange mechanism, and frequently used for that purpose.

Memory engine:

1. Keeps data in memory, so reads and writes need no disk I/O;

2. Fast, but the data is not retained after a restart; generally used for temporary tables.

Federated Engine:

A storage engine that can access data on remote servers by establishing a connection to the remote server.

Mrg_MyISAM engine:

Merges multiple MyISAM tables into one virtual table. It does not store data itself; the data remains in the underlying MyISAM tables.
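
As a minimal sketch of how such a merged table is defined, the statements below create one over two identically structured MyISAM tables. The table and column names are hypothetical and used only for illustration.

```sql
-- Two MyISAM tables with identical structure (hypothetical names)
CREATE TABLE log_2013 (id INT NOT NULL, msg VARCHAR(100)) ENGINE = MyISAM;
CREATE TABLE log_2014 (id INT NOT NULL, msg VARCHAR(100)) ENGINE = MyISAM;

-- The merge table stores no rows itself; it presents both tables as one
CREATE TABLE log_all (id INT NOT NULL, msg VARCHAR(100))
    ENGINE = MERGE UNION = (log_2013, log_2014) INSERT_METHOD = LAST;

SELECT COUNT(*) FROM log_all;              -- counts rows from both underlying tables
INSERT INTO log_all VALUES (1, 'hello');   -- INSERT_METHOD = LAST routes this to log_2014
```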

NDB Cluster Engine:

Dedicated to MySQL Cluster.
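
Which of these engines a particular server actually provides can be checked directly. This is a short sketch using standard statements; the output depends on the server build and configuration.

```sql
-- List all storage engines known to the server, and which one is the default
SHOW ENGINES;

-- The same information as a queryable table
SELECT engine, support, transactions
FROM information_schema.engines;
```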

3. Third-party storage engine:

1. OLTP class:

XtraDB: An improved version of InnoDB.

PBXT: Similar to InnoDB, but provides engine-level replication and foreign key constraints, and has an architecture suited to SSD storage.

TokuDB (open source): supports the fractal tree index structure and supports the analysis of massive data.

2. Columnar storage engine: MySQL is row-oriented storage by default.

Infobright: supports tens of terabytes of data, and is designed for data analysis and data warehousing. Data is highly compressed.

InfiniDB: Can run distributed queries across a cluster of machines. A commercial version exists, but there are no typical use cases.

3. Community storage engine:

Aria: Solves MyISAM's crash recovery problem and adds data caching.

Groonga: Full-text indexing engine.

OQGraph: Developed by Open Query to support graph operations, such as finding the shortest path between two nodes.

SphinxSE: This engine provides the SQL interface to the Sphinx full-text index search server.

Spider: supports sharding and can implement parallel query based on sharding.

VPForMySQL: supports vertical partitioning.

4. Storage engine selection reference factors

1. Whether there are transaction requirements

If you need transaction support, InnoDB or XtraDB is the best choice. If the workload consists mainly of select and insert operations, as is typical of logging applications, MyISAM is more suitable.

2. Backup operation requirements

If the server can be shut down for backup, this factor can be ignored. If online hot backup is required, the InnoDB engine is a good choice.

3. Failure recovery requirements

InnoDB is recommended in scenarios with strict recovery requirements, because MyISAM data is relatively likely to be corrupted and is relatively slow to recover.

4. Performance requirements

Some business requirements can be met only by specific storage engines. For example, before MySQL 5.7, geospatial indexes were supported only by the MyISAM engine. Administrators therefore need to weigh such trade-offs against the requirements of the application architecture. All things considered, InnoDB should be the default recommendation.
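
Before choosing or changing engines, it helps to see which engine each existing table uses. One way to list this, offered as a sketch, is to query information_schema. The filter on system schemas is an illustrative choice.

```sql
-- List user tables that still use MyISAM, with their approximate row counts
SELECT table_schema, table_name, engine, table_rows
FROM information_schema.tables
WHERE engine = 'MyISAM'
  AND table_schema NOT IN ('mysql', 'information_schema', 'performance_schema');
```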

5. Table engine conversion method

1. Direct modification
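
The original listing is missing from this copy of the article. The usual statement for changing a table's engine in place is shown below as a sketch; `mytable` is a hypothetical table name.

```sql
-- Convert a table in place (mytable is a hypothetical name)
ALTER TABLE mytable ENGINE = InnoDB;
```

This rebuilds the table, which can take a long time and use a lot of I/O on large tables.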


2. Backup modification

Use the mysqldump backup tool to export the data, then edit the storage engine option in the create table statement of the dump file. Remember to change the table name as well, because two tables with the same name cannot exist in the same database.

3. Create and insert
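
The original listing is missing from this copy of the article. A common form of this approach, shown as a sketch, creates an empty copy of the table with the new engine and then copies the rows across. The table names, the `id` column, and the range bounds are hypothetical.

```sql
-- Create an empty table with the same structure, then switch its engine
CREATE TABLE innodb_table LIKE myisam_table;
ALTER TABLE innodb_table ENGINE = InnoDB;

-- Copy all rows in one statement (fine for small tables)
INSERT INTO innodb_table SELECT * FROM myisam_table;

-- For large tables, copy in chunks instead of the single INSERT above,
-- committing each chunk so that no single transaction grows too large
-- (assumes a numeric key column named id):
START TRANSACTION;
INSERT INTO innodb_table SELECT * FROM myisam_table WHERE id BETWEEN 1 AND 100000;
COMMIT;
```

The original table is left untouched until the copy has been verified, which makes this safer than converting in place.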

