[MySQL] MySQL learning tutorial (1): Analysis of the logical architecture of MySQL

A MySQL installation consists of the "mysqld" server process, client applications for local or remote connections, and some locally installed non-client utility programs. Users connect to the MySQL server through a client application such as mysql to issue data requests.

When the MySQL client communicates with the server, the client and server can use different operating systems. For example, the client uses Windows and the server uses Linux. The client connects to the server through the TCP/IP protocol.

MySQL is a single-process, multi-threaded server. The name of the MySQL process is "mysqld". It manages database access on disk and in memory, supports multiple storage engines, supports both transactional and non-transactional tables, and can optimize memory use.

1. MySQL logical architecture

The components of MySQL include the following:

  1. Connection pool component
  2. Management services and tools component
  3. SQL interface component
  4. Query analyzer component
  5. Query optimizer component
  6. Cache component
  7. Plug-in storage engines
  8. Physical files

MySQL storage engines are table-based, not database-based.
A distinguishing feature of MySQL is its plug-in table storage engine.

Its structure diagram is as follows:

[Figure: MySQL logical architecture diagram]

The internal structure of MySQL is divided into four layers:

  • Connection layer: handles client connections (connection processing, authentication and authorization)
  • Service layer: performs SQL analysis and optimization
  • Engine layer: responsible for data storage and retrieval
  • Storage layer: stores data on the device's file system and interacts with the storage engine

2. Connection layer

2.1 Overview

The connection layer assigns a thread to each connection, which is used to control the execution of the query. After the connection is authenticated with username/password, the connection can send SQL queries.

The connection layer accepts application connections over TCP/IP, Unix sockets, shared memory, and named pipes. The connection protocol is implemented by client libraries and drivers, and its speed varies depending on local settings. Of these protocols, only TCP/IP can carry messages across a network; the others support local use only (the client and server must be on the same host).

In addition to remote connections across networks, TCP/IP can also be used for local connections. With TCP/IP, the host is identified by an IP address or DNS name and the service by a port number; the default MySQL port is 3306. On Unix, when the host name is "localhost", the MySQL client assumes a local connection and communicates through the Unix socket file; when the IP address "127.0.0.1" is used, it communicates over TCP/IP.
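
The host-name rule above can be sketched as a small decision function. This is only a toy model of the default behavior on Unix, not the client library's actual code; the function name choose_transport and the host db.example.com are made up for illustration.

```python
def choose_transport(host: str) -> str:
    """Toy model of the transport a Unix MySQL client picks for a host name."""
    if host == "localhost":
        # "localhost" is special-cased: connect through the socket file.
        return "unix_socket"
    # IP addresses (including 127.0.0.1) and other host names use TCP/IP.
    return "tcp"


for host in ("localhost", "127.0.0.1", "db.example.com"):
    print(f"{host} -> {choose_transport(host)}")
```

The practical consequence is that connecting to "localhost" and to "127.0.0.1" can behave differently even though both refer to the same machine.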

A Unix socket is a form of inter-process communication that provides a two-way communication link between two processes on the same machine. The server creates a socket file, and the client connects through it. For example:

mysql -S /var/lib/mysql/mysql.sock -uroot -p

On Windows, clients can connect through shared memory and named pipes. With shared memory, the server creates a shared memory block that the client process uses to communicate with the server.

Named pipes on Windows work much like Unix sockets. The server creates a named pipe, and the client establishes a connection with the server through it.

Connection thread

The server creates a connection thread for each active client connection, and all statements executed through that client use this thread. When the client disconnects, the server destroys the thread. Each time the server creates or destroys a thread, it must also allocate or release the dedicated memory structures for that client connection, so frequently establishing and tearing down connections has a performance impact on the system.

MySQL provides a thread pool plug-in in the Enterprise Edition. This plug-in manages threads in groups. Each thread group aims to run only one short-running statement at any point in time, can create an additional thread for long-running statements, and can prioritize statements based on the transactions they belong to.

2.2 Communication methods

Common communication mechanisms:

  • Full duplex: data can be sent and received at the same time, as in a phone call
  • Half duplex: at any given moment data is either being sent or being received, never both, as with early walkie-talkies
  • Simplex: data can only be sent or only be received, as on a one-way street

The MySQL client/server communication protocol is "half-duplex":

  • At any moment, either the server is sending data to the client or the client is sending data to the server. The two cannot happen at the same time.
  • Once one end starts sending a message, the other end must receive the entire message before it can respond. A message therefore cannot, and does not need to, be cut into small pieces that are sent independently, and there is no flow control.

Transmission process:

  • Client ==> Server: The client sends the query request to the server in a single data packet, so when the query statement is very long, the max_allowed_packet parameter needs to be set accordingly. Note that if the query is too large, the server refuses to receive more data and raises an error.
  • Server ==> Client: The server's response usually contains a lot of data and consists of multiple data packets. The client must receive the entire result in full; it cannot simply take the first few results and then ask the server to stop sending.
  • In actual development, it is a very good habit to keep queries as simple as possible, return only the necessary data, and reduce the size and number of data packets exchanged.
  • This is also one of the reasons why we try to avoid SELECT * and add LIMIT clauses to queries.
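
As a rough illustration of the client-to-server rule, the sketch below rejects a query whose payload exceeds a size limit. It is a simplified model: the real protocol adds packet headers, and PacketTooLargeError, send_query and the limit of 1024 bytes are hypothetical names and values chosen for the example.

```python
class PacketTooLargeError(Exception):
    """Hypothetical stand-in for the server's packet-too-large error."""


def send_query(sql: str, max_allowed_packet: int) -> int:
    """Return the payload size in bytes, or raise if it exceeds the limit."""
    payload = sql.encode("utf-8")
    if len(payload) > max_allowed_packet:
        raise PacketTooLargeError(
            f"query is {len(payload)} bytes, limit is {max_allowed_packet}"
        )
    return len(payload)


limit = 1024  # illustrative value, not a MySQL default
print(send_query("SELECT id FROM t LIMIT 10", limit))
try:
    send_query("SELECT '" + "x" * 2000 + "'", limit)
except PacketTooLargeError as err:
    print("rejected:", err)
```

Because the whole query travels as one unit, the check happens on the full statement; there is no way to send "part of" an oversized query.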

2.3 Permission verification

Four tables in the mysql system database control permissions: user, db, tables_priv, and columns_priv.

The verification process of mysql permission table is:

[Figure: verification process of the MySQL permission tables]
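
Since the original diagram is not available here, the following is a minimal sketch that assumes the usual coarse-to-fine order: user (global), then db, then tables_priv, then columns_priv. The data layout, the function find_grant, and the sample accounts are all hypothetical; real grant tables have many more columns.

```python
def find_grant(grants, user, priv, db, table, column):
    """Return the name of the first grant table that allows `priv`, else None."""
    checks = [
        ("user", user),                               # global level
        ("db", (user, db)),                           # database level
        ("tables_priv", (user, db, table)),           # table level
        ("columns_priv", (user, db, table, column)),  # column level
    ]
    for table_name, key in checks:
        if priv in grants.get(table_name, {}).get(key, set()):
            return table_name
    return None


# Hypothetical grants: "app" may SELECT anything in database "shop",
# "report" may SELECT only the column shop.orders.total.
grants = {
    "db": {("app", "shop"): {"SELECT"}},
    "columns_priv": {("report", "shop", "orders", "total"): {"SELECT"}},
}
print(find_grant(grants, "app", "SELECT", "shop", "orders", "total"))
print(find_grant(grants, "report", "SELECT", "shop", "orders", "total"))
print(find_grant(grants, "report", "SELECT", "shop", "orders", "id"))
```

The check stops at the first level that grants the privilege, so a broad grant makes the finer-grained tables irrelevant for that request.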

MySQL permission classification:

  • Global management permissions: apply to the entire MySQL instance
  • Database-level permissions: apply to a specified database or to all databases
  • Database object-level permissions: apply to specified database objects (tables, views, etc.) or to all database objects


2.4 Query connection status

Every MySQL connection (that is, every thread) is in some state at any given time, which indicates what MySQL is currently doing. There are many ways to view the current state; the simplest is the following:

SHOW FULL PROCESSLIST

The execution results are as follows:

[Figure: sample output of SHOW FULL PROCESSLIST]

The Command column shows the status of each thread:

[Figure: table of Command column values]

If a connection is problematic, terminate it with kill {id}, where {id} is the value in the Id column of the process list.
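
What counts as a problematic connection is not defined above. As one possible reading, the sketch below treats queries that have been running longer than a threshold as candidates and builds the matching KILL statements. The rows are made-up sample data that reuse the Id, Command and Time columns of the process list; the threshold is an assumption.

```python
def kill_statements(rows, max_seconds):
    """Build KILL statements for queries running longer than max_seconds."""
    return [
        f"KILL {row['Id']};"
        for row in rows
        if row["Command"] == "Query" and row["Time"] > max_seconds
    ]


# Made-up rows using the Id, Command and Time columns of the processlist.
rows = [
    {"Id": 11, "Command": "Sleep", "Time": 900},
    {"Id": 12, "Command": "Query", "Time": 4},
    {"Id": 13, "Command": "Query", "Time": 750},
]
print(kill_statements(rows, max_seconds=600))
```

Generating the statements rather than executing them leaves room for a human to review the list before anything is killed.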

2.5 Connection type

  • Long connection: defined relative to a short connection. Multiple data packets can be sent one after another over the same connection. While the connection is kept open, if no data packets are being sent, both parties need to send link detection (keep-alive) packets.
  • Short connection: a connection is established when the two parties have data to exchange and is closed once the data has been sent, so each connection carries only one business request.

The problem with long connections: as long connections accumulate, memory usage grows, because MySQL allocates memory to manage connection objects while it runs and does not release it until the connection is closed. If connections keep accumulating, memory usage can become excessive and the process may be forcibly killed by the operating system, which shows up as MySQL restarting.

Solutions:

1. Disconnect long connections regularly. Disconnect after a period of time, or after executing a large query that takes up a lot of memory, so that the memory is released.

2. MySQL 5.7+ provides mysql_reset_connection to reinitialize the connection resources. There is no need to reconnect, and the connection is restored to the state it was in when it was first created.


3. Service layer

The service layer implements most of the core service functions:

  • SQL interface: receives the user's SQL commands and returns the results the user asked for, including results served from the query cache. For example, a select ... from statement goes through the SQL interface.
  • All cross-storage-engine functions are also implemented in this layer, such as stored procedures and functions.
  • At this layer, the server parses the query and builds the corresponding internal parse tree, then optimizes it, for example by determining the order in which tables are queried and whether to use indexes, and finally generates the corresponding execution operations. For select statements, the server also checks the internal query cache. If the cache is large enough, it can greatly improve system performance in read-heavy environments.

3.1 Cache & Buffer: query cache

The main function of the MySQL query cache is to improve query efficiency. It was removed in MySQL 8.0.

3.1.1 Storage form

  • The cache is stored in the form of a hash table of key and value. The key is a specific SQL statement and the value is a collection of results.
  • The cache is stored in a reference table and referenced by a hash value. This hash value includes the following factors, namely the query itself, the database currently being queried, the version of the client protocol, and other information that may affect the returned results.
  • If the query can find the key in the cache, then the value corresponding to the key will be returned directly to the client. If there is no hit, subsequent operations in the parsing, optimization and execution stages will need to be performed, and it will also be cached after execution. (Any differences in characters, such as spaces, comments, etc., will cause cache misses.)
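
The lookup described above can be modeled with an ordinary dictionary. This is a deliberately reduced sketch: the key here is only the database name plus the exact SQL text, whereas the text above lists more factors, and run_query and fake_execute are hypothetical helpers.

```python
cache = {}  # maps (database, exact SQL text) -> cached result


def run_query(database, sql, execute):
    """Return (result, hit) using the exact SQL text as part of the key."""
    key = (database, sql)
    if key in cache:
        return cache[key], True
    result = execute(sql)  # parse, optimize and execute on a miss
    cache[key] = result    # store the result for the next lookup
    return result, False


def fake_execute(sql):
    return ["row"]  # placeholder for a real result set


print(run_query("shop", "SELECT * FROM t", fake_execute)[1])   # first run
print(run_query("shop", "SELECT * FROM t", fake_execute)[1])   # same text
print(run_query("shop", "SELECT *  FROM t", fake_execute)[1])  # extra space
```

The third call differs from the first only by one space, yet it produces a different key, which is why even cosmetic differences in the SQL text cause a miss.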

3.1.2 Why was it deleted?

  1. Whenever data in a table is written or updated, all cache entries for that table are invalidated.
  2. For databases under heavy update pressure, the hit rate of the query cache is very low, unless the workload relies on a static table that is updated only once in a long while.
  3. Every query must first check whether it hits the cache, which wastes computing resources.
  4. If a query is cacheable and MySQL finds after execution that it is not yet in the query cache, it stores the result there, which causes additional system overhead.
  5. If the query cache is large or fragmented, these operations may cause a lot of system overhead.
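
The first point can be sketched with a small class that remembers which tables each cached statement read. This is an illustrative model, not MySQL's data structure; QueryCache and its method names are invented for the example.

```python
class QueryCache:
    def __init__(self):
        self.results = {}   # SQL text -> cached result
        self.by_table = {}  # table name -> set of SQL texts that read it

    def store(self, sql, tables, result):
        self.results[sql] = result
        for table in tables:
            self.by_table.setdefault(table, set()).add(sql)

    def invalidate(self, table):
        """Drop every cached result that read `table`; return how many."""
        removed = 0
        for sql in self.by_table.pop(table, set()):
            if sql in self.results:
                del self.results[sql]
                removed += 1
        return removed


qc = QueryCache()
qc.store("SELECT * FROM orders WHERE id = 1", ["orders"], ["o1"])
qc.store("SELECT * FROM orders WHERE id = 2", ["orders"], ["o2"])
qc.store("SELECT * FROM users", ["users"], ["u1"])
print(qc.invalidate("orders"))  # a single write to orders
print(sorted(qc.results))
```

One write to orders discards both cached orders queries, including the one whose row was not touched, which is why write-heavy tables see a low hit rate.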

3.1.3 Situations that will not be cached

  1. Queries containing non-deterministic functions such as NOW() and CURRENT_DATE()
  2. Queries containing any user-defined functions, stored functions, user variables, or temporary tables
  3. Queries on system tables in the mysql database or on tables with column-level privileges
  4. For the InnoDB engine, when a statement in a transaction modifies a table, no query on that table can be cached until the transaction is committed. Long-running transactions therefore greatly reduce the cache hit rate.
  5. Queries that specify SQL_NO_CACHE are not cached.
  6. Results larger than the value of query_cache_limit are not cached.

3.1.4 Query cache execution status

show status like 'Qcache%'

[Figure: output of show status like 'Qcache%']

Cache execution status diagram:

[Figure: query cache execution status diagram]

MySQL also lets you use the query cache on demand. Set the parameter query_cache_type to DEMAND so that SQL statements do not use the query cache by default.

For statements that should use the query cache, specify SQL_CACHE explicitly, as in the following statement:

select SQL_CACHE * from T where ID = 10
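
The effect of query_cache_type can be summarized in a small function. This is a sketch of the documented OFF / ON / DEMAND behavior in MySQL versions that still have the query cache; the function name uses_query_cache is hypothetical, and detecting the hints by splitting on whitespace is a simplification.

```python
def uses_query_cache(sql, query_cache_type):
    """Decide whether a statement is eligible for the query cache."""
    words = sql.upper().split()
    if query_cache_type == "OFF":
        return False
    if query_cache_type == "ON":
        # Cache everything except statements that opt out.
        return "SQL_NO_CACHE" not in words
    if query_cache_type == "DEMAND":
        # Cache only statements that opt in.
        return "SQL_CACHE" in words
    raise ValueError(f"unknown query_cache_type: {query_cache_type}")


print(uses_query_cache("select SQL_CACHE * from T where ID = 10", "DEMAND"))
print(uses_query_cache("select * from T where ID = 10", "DEMAND"))
```

With DEMAND, caching becomes an explicit choice per statement, which suits workloads where only a few queries on rarely updated tables benefit from it.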

3.2 Parser (analyzer)

When an SQL command reaches the parser, it is validated and parsed.

  1. The SQL statement is analyzed lexically and syntactically and decomposed into a data structure. It is then classified by operation type and forwarded to the appropriate subsequent steps. All later handling of the statement is based on this structure.
  2. If an error is encountered during decomposition, the SQL statement is invalid.

Analyzer execution process:

In the analyzer, keywords such as select, from, and where are extracted and matched against the grammar rules. MySQL automatically distinguishes keywords from non-keywords and identifies the fields and custom expressions the user refers to. Some checks are also done at this stage, for example whether the user table exists in the current database. Likewise, if the userId field does not exist in the user table, an error is reported: unknown column in field list.
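
The two checks mentioned above (does the table exist, does the column exist) can be sketched with a tiny validator. It handles only the simplest "select columns from table" shape with a regular expression, which is far from a real SQL parser; SCHEMA, SqlError and validate are hypothetical names, and the messages only imitate the style of MySQL's errors.

```python
import re

SCHEMA = {"user": {"id", "name"}}  # hypothetical table -> known columns

SELECT_RE = re.compile(
    r"\s*select\s+(.+?)\s+from\s+(\w+)\s*(?:where\s+.+)?;?\s*",
    re.IGNORECASE,
)


class SqlError(Exception):
    pass


def validate(sql):
    """Return (table, columns) or raise SqlError for an invalid statement."""
    match = SELECT_RE.fullmatch(sql)
    if match is None:
        raise SqlError("You have an error in your SQL syntax")
    columns = [c.strip() for c in match.group(1).split(",")]
    table = match.group(2).lower()
    if table not in SCHEMA:
        raise SqlError(f"Table '{table}' doesn't exist")
    for column in columns:
        if column != "*" and column.lower() not in SCHEMA[table]:
            raise SqlError(f"Unknown column '{column}' in 'field list'")
    return table, columns


print(validate("select id, name from user where id = 1"))
try:
    validate("select userId from user")
except SqlError as err:
    print("error:", err)
```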

3.3 Optimizer (query optimizer)

Before an SQL statement is executed, the query optimizer optimizes it.

The optimizer takes the query (SQL statement) requested by the client together with statistical information from the database, analyzes it with a series of algorithms, derives an optimal strategy, and tells the subsequent steps how to obtain the result of the query.

For example, when a table has multiple indexes, it decides which index to use; when a statement joins multiple tables, it decides the join order of the tables.

In short, the optimizer matches suitable indexes and selects the best execution plan. The chosen plan can be inspected with explain.
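
To make the two decisions concrete, here is a toy cost model: pick the index that is expected to read the fewest rows, and join the smallest tables first. The real optimizer uses a much richer cost model; the functions, index names and row estimates below are all invented for illustration.

```python
def choose_access_path(table_rows, index_estimates):
    """Pick the index expected to read the fewest rows (None = full scan)."""
    best_index, best_rows = None, table_rows
    for index, rows in index_estimates.items():
        if rows < best_rows:
            best_index, best_rows = index, rows
    return best_index, best_rows


def join_order(table_sizes):
    """Toy heuristic: join the smallest estimated tables first."""
    return sorted(table_sizes, key=table_sizes.get)


print(choose_access_path(100_000, {"idx_status": 40_000, "idx_user_id": 12}))
print(join_order({"orders": 500_000, "users": 2_000, "countries": 200}))
```

Both functions depend entirely on the row estimates they are given, which mirrors why stale statistics can lead the optimizer to a poor plan.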

4. Engine layer

The storage engine is what actually stores and retrieves data in MySQL. The server communicates with the storage engine through an API. Different storage engines offer different features, so we can choose one according to our actual needs.

The most important feature that distinguishes MySQL from other databases is the plug-in table storage engine. The MySQL plug-in storage engine architecture provides standard management and service support.

A storage engine applies to tables rather than to databases.
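
The idea of a common API with interchangeable engines chosen per table can be sketched as follows. The interface is hypothetical and is not MySQL's real storage engine API; the two engines are trivial stand-ins that only differ in how they keep rows.

```python
from abc import ABC, abstractmethod


class StorageEngine(ABC):
    """Hypothetical engine interface; not MySQL's real handler API."""

    @abstractmethod
    def insert(self, row): ...

    @abstractmethod
    def rows(self): ...


class ListEngine(StorageEngine):
    """Keeps rows in insertion order."""

    def __init__(self):
        self._rows = []

    def insert(self, row):
        self._rows.append(row)

    def rows(self):
        return list(self._rows)


class KeyedEngine(StorageEngine):
    """Keeps rows ordered by their "id" key."""

    def __init__(self):
        self._rows = {}

    def insert(self, row):
        self._rows[row["id"]] = row

    def rows(self):
        return [self._rows[key] for key in sorted(self._rows)]


class Table:
    def __init__(self, name, engine):
        self.name, self.engine = name, engine


# Two tables in the same database, each with its own engine.
logs = Table("logs", ListEngine())
users = Table("users", KeyedEngine())
for table in (logs, users):
    table.engine.insert({"id": 2})
    table.engine.insert({"id": 1})
    print(table.name, table.engine.rows())
```

The calling code is identical for both tables; only the engine behind each table changes how the rows are stored and returned.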

5. Storage layer

The storage layer mainly stores data on the file system that runs on the raw device and handles the interaction with the storage engine.

6. MySQL storage engines

6.1 InnoDB storage engine

Starting with MySQL 5.5.8, InnoDB is the default storage engine:

  • Supports transactions
  • Row-lock design; supports foreign keys
  • InnoDB puts all data in a logical tablespace, which InnoDB itself manages like a black box.
  • InnoDB achieves high concurrency through multi-version concurrency control (MVCC) and implements the 4 isolation levels of the SQL standard.
  • InnoDB provides high-performance and high-availability features such as insert buffer, doublewrite, adaptive hash index, and read-ahead.
  • Data is stored in clustered form, ordered by primary key. If no primary key is defined and there is no suitable unique index, a 6-byte ROWID is generated for the table and used as the primary key.

6.2 MyISAM storage engine

MyISAM was the default storage engine before MySQL 5.5.8:

  • Does not support transactions; uses a table-lock design
  • Supports full-text indexing
  • Its buffer caches only index files, not data files.
  • A MyISAM table consists of MYD and MYI files. The MYD file stores the data and the MYI file stores the indexes.

6.3 NDB storage engine

NDB is a cluster storage engine. Its characteristic is that all data is stored in memory, so the primary key search speed is extremely fast.

6.4 Memory storage engine

The Memory storage engine (formerly known as the HEAP storage engine) keeps table data in memory. If the database restarts or crashes, the data in the table is lost. It is suitable for temporary tables that hold temporary data, as well as for dimension tables in data warehouses.

It supports only table locks, has poor concurrency performance, and does not support the TEXT and BLOB types.

6.5 Federated storage engine

[Figure: description of the Federated storage engine]

7. Summary

Comparative item | InnoDB | MyISAM
Foreign keys | Supported | Not supported
Transactions | Supported | Not supported
Row and table locks | Row locks. Only the affected row is locked during an operation, and other rows are not affected. Suitable for high concurrency | Table locks. Operating on one row locks the entire table. Not suitable for high concurrency
Cache | Caches indexes and real data; higher memory requirements | Caches only indexes
Tablespace | Large | Small
Focus | Transactions | Performance

File suffix of storage engine:

[Figure: file suffixes used by each storage engine]


Origin blog.csdn.net/sco5282/article/details/132626455