MySQL Technical Insider: InnoDB Storage Engine, study notes. Chapter 1: MySQL architecture and storage engines

MySQL runs on almost all operating systems. Although platforms differ in their low-level implementations (threads, for example), MySQL keeps its physical architecture largely consistent across them.

Terminology:
1. Database: a collection of physical operating system files or other file types. MySQL database files can end in frm, myd, myi, or ibd. With NDB, the database files may not be operating system files at all, but files held in memory.
2. Database instance: made up of the database background processes/threads and a shared memory area, which the running background processes/threads all share. The instance is what actually operates on the database files.

Instances and databases in MySQL usually have a one-to-one correspondence, but in the case of a cluster, a database can be used by multiple instances.

MySQL is a single-process, multi-threaded database, similar to SQL Server but unlike Oracle's multi-process architecture (although the Windows version of Oracle is also single-process and multi-threaded).

Check whether the MySQL process is started:

ps -ef | grep mysqld

When starting an instance, MySQL reads the configuration file and starts the database instance according to its parameters. This is similar to Oracle's parameter file (spfile), except that Oracle reports that the parameter file cannot be found and fails to start when the file is missing. If MySQL has no configuration file, the instance starts with the default parameter values set at compile time.

Check where MySQL reads the configuration file:

mysql --help | grep my.cnf

Run it:
[Figure: command output listing the configuration file paths]
MySQL reads the configuration files in the order they appear in the command output, from first to last. If the same parameter appears in several configuration files, the value from the last file read takes effect.
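
The "last occurrence wins" rule can be illustrated with a small sketch. It does not use MySQL's own parser; it uses Python's standard configparser as a stand-in, and the two file contents (etc_my_cnf, home_my_cnf) and the helper name merge_option_files are hypothetical.

```python
import configparser

def merge_option_files(file_texts):
    """Merge option-file contents in read order; later values override earlier ones."""
    parser = configparser.ConfigParser(allow_no_value=True, interpolation=None)
    for text in file_texts:
        # Each call merges into what was read before, so a repeated
        # option keeps the value from the last file that sets it.
        parser.read_string(text)
    return {section: dict(parser[section]) for section in parser.sections()}

# Hypothetical contents of two option files, in the order they are read.
etc_my_cnf = "[mysqld]\nport = 3306\ndatadir = /usr/local/mysql/data\n"
home_my_cnf = "[mysqld]\nport = 3307\n"

merged = merge_option_files([etc_my_cnf, home_my_cnf])
print(merged["mysqld"]["port"])     # 3307
print(merged["mysqld"]["datadir"])  # /usr/local/mysql/data
```

The port comes from the file read last, while datadir, set only in the first file, is kept.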

Under Linux, the configuration file is generally placed in /etc/my.cnf. Under Windows, the configuration file suffix can be either .cnf or .ini, and running mysql --help also shows where the configuration file is read from.

There is a datadir parameter in the configuration file, which specifies the path of the database. Under Linux, this parameter defaults to /usr/local/mysql/data. Check the current datadir path:

SHOW VARIABLES LIKE 'datadir'\G

\G displays each result row vertically, one column per line. It also terminates the statement, so no trailing semicolon is needed.

The owner and permissions of datadir must be set correctly: the directory owner is the mysql user, and only the mysql user and group may access it.
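
As a rough illustration of that requirement, the sketch below checks whether users outside the owner and group have any access to a directory. It assumes a POSIX system, runs against a throwaway temporary directory rather than a real datadir, and the helper names are hypothetical.

```python
import os
import stat
import tempfile

def others_have_no_access(path):
    """Return True if the 'other' permission bits (r/w/x) are all cleared."""
    mode = os.stat(path).st_mode
    return (mode & stat.S_IRWXO) == 0

def describe(path):
    """Return (owner uid, group gid, octal permission string) for a path."""
    info = os.stat(path)
    return info.st_uid, info.st_gid, oct(stat.S_IMODE(info.st_mode))

# Demonstration on a temporary directory, not on a real datadir.
with tempfile.TemporaryDirectory() as demo_dir:
    os.chmod(demo_dir, 0o750)   # owner rwx, group r-x, others none
    restricted = others_have_no_access(demo_dir)
    os.chmod(demo_dir, 0o755)   # others can now read and traverse
    open_to_all = not others_have_no_access(demo_dir)

print(restricted, open_to_all)
```

On a real server the same check would be pointed at the path reported by the datadir variable, and describe() would show whether the owner uid belongs to the mysql user.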

For MySQL, a database is a collection of data organized in accordance with a certain data model and stored in secondary storage (such as hard disks, CDs); a database instance is an application program, and any user operations on database data are completed through the database instance.

[Figure: MySQL architecture diagram]
As the figure shows, MySQL is composed of the following parts:
1. Connection pool component.
2. Management services and tool components.
3. SQL interface components.
4. Query analyzer component.
5. Optimizer components.
6. Cache component.
7. Plug-in storage engine.
8. Physical files.

The storage engine is based on tables rather than databases.

MySQL's pluggable storage engine architecture provides a set of standard management and service support. These standards are independent of any particular storage engine; the storage engine itself is only the implementation of the underlying physical structure.

Each storage engine has its own characteristics, and different storage engine tables are established according to specific applications. For developers, storage engines are transparent, but it is good for developers to understand the differences between different storage engines.

MySQL is open source. You can write your own storage engine based on MySQL's predefined storage engine interface or modify the source code for a certain unsatisfactory storage engine.

Storage engines are divided into official storage engines and third-party storage engines. InnoDB started as a third-party storage engine. It has been acquired by Oracle and is now the most widely used storage engine in OLTP (online transaction processing) applications.

InnoDB supports transactions, mainly for OLTP applications. It features row lock design, supports foreign keys, and supports non-locking reads similar to Oracle.

InnoDB puts data in a logical table space. Starting from MySQL 4.1, it can put each InnoDB storage engine table into a separate ibd file. Similar to Oracle, InnoDB can use raw devices to create its tablespace.

InnoDB uses multi-version concurrency control (MVCC) to achieve high concurrency and implements the four isolation levels of the SQL standard, defaulting to REPEATABLE READ. Unlike standard SQL, at the REPEATABLE READ isolation level InnoDB uses the next-key locking algorithm to avoid phantom reads to a certain extent. InnoDB also provides high-performance features such as the insert buffer, doublewrite, the adaptive hash index, and read-ahead.
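
The idea behind MVCC and non-locking reads can be sketched in a few lines. This is a deliberately simplified model, not InnoDB's actual implementation: it assumes transactions commit in id order, and the class VersionedRow and its methods are hypothetical names.

```python
class VersionedRow:
    """A row that keeps every committed version, oldest first."""

    def __init__(self):
        self._versions = []  # list of (trx_id, value), ascending trx_id

    def write(self, trx_id, value):
        # A writer appends a new version instead of overwriting in place.
        self._versions.append((trx_id, value))

    def read(self, snapshot_trx_id):
        # A reader walks back to the newest version its snapshot may see,
        # without taking any lock on the row.
        for trx_id, value in reversed(self._versions):
            if trx_id <= snapshot_trx_id:
                return value
        return None  # the row did not exist yet for this snapshot

row = VersionedRow()
row.write(trx_id=10, value="v1")
snapshot = 10                       # a reader takes its snapshot here
row.write(trx_id=20, value="v2")    # a later writer adds a new version
old_view = row.read(snapshot)
new_view = row.read(20)
print(old_view, new_view)           # v1 v2
```

The reader that took its snapshot before transaction 20 keeps seeing "v1" on every read, which is the repeatable-read behavior described above; next-key locking is not modeled here.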

InnoDB stores table data in a clustered fashion, similar to Oracle's index-organized table (IOT): each table's rows are stored in primary key order. If no primary key is explicitly specified when the table is defined, InnoDB generates a six-byte ROWID for each row and uses it as the primary key.
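
A toy model can make the two points concrete: rows come back in primary key order regardless of insertion order, and a hidden counter that fits in six bytes stands in when no key is defined. This is only a sketch under those assumptions; ClusteredTable and its methods are hypothetical and say nothing about InnoDB's real page layout.

```python
import bisect
import itertools

class ClusteredTable:
    """Rows kept in key order; a hidden six-byte id is used when there is no primary key."""

    def __init__(self, has_primary_key=True):
        self.has_primary_key = has_primary_key
        self._keys = []                         # keys in ascending order
        self._rows = {}                         # key -> row payload
        self._hidden_ids = itertools.count(1)   # stand-in for the hidden row id

    def insert(self, row, key=None):
        if not self.has_primary_key:
            key = next(self._hidden_ids)
            key.to_bytes(6, "big")  # raises OverflowError beyond six bytes
        elif key is None:
            raise ValueError("a primary key value is required")
        elif key in self._rows:
            raise KeyError(f"duplicate primary key: {key}")
        bisect.insort(self._keys, key)  # keep physical order equal to key order
        self._rows[key] = row
        return key

    def scan(self):
        """Return all rows in key order."""
        return [(key, self._rows[key]) for key in self._keys]

films = ClusteredTable()
for film_id in (30, 10, 20):
    films.insert({"title": f"film-{film_id}"}, key=film_id)
ordered_ids = [key for key, _ in films.scan()]
print(ordered_ids)  # [10, 20, 30]

log = ClusteredTable(has_primary_key=False)
first_hidden = log.insert({"msg": "a"})
second_hidden = log.insert({"msg": "b"})
```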

MyISAM is a storage engine officially provided by MySQL. It does not support transactions, supports table locks and full-text indexes, and is fast for OLAP (online analytical processing) workloads.

A MyISAM table is composed of MYD and MYI files: MYD stores the data and MYI stores the indexes. The data file can be further compressed with the myisampack tool, which uses Huffman encoding; a table compressed this way is read-only. The data file can also be decompressed again (with myisamchk --unpack).
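
To show what Huffman encoding does, here is a generic textbook sketch in Python. It is not myisampack's actual file format or algorithm variant; the function names and the sample bytes are made up for illustration.

```python
import heapq
from collections import Counter

def build_huffman_codes(data):
    """Return {byte value: bit string}; frequent bytes get shorter codes."""
    freq = Counter(data)
    if not freq:
        return {}
    if len(freq) == 1:
        return {next(iter(freq)): "0"}
    # Heap entries: (weight, unique tie breaker, {symbol: code so far}).
    heap = [(w, i, {sym: ""}) for i, (sym, w) in enumerate(freq.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        w1, _, left = heapq.heappop(heap)
        w2, _, right = heapq.heappop(heap)
        # Merging two subtrees prefixes their codes with 0 and 1.
        merged = {s: "0" + c for s, c in left.items()}
        merged.update({s: "1" + c for s, c in right.items()})
        heapq.heappush(heap, (w1 + w2, tie, merged))
        tie += 1
    return heap[0][2]

def encode(data, codes):
    return "".join(codes[b] for b in data)

def decode(bits, codes):
    reverse = {code: sym for sym, code in codes.items()}
    out, current = bytearray(), ""
    for bit in bits:
        current += bit
        if current in reverse:  # codes are prefix-free, so the first hit is right
            out.append(reverse[current])
            current = ""
    return bytes(out)

sample = b"aaaaaaaabbbc"
codes = build_huffman_codes(sample)
bits = encode(sample, codes)
print(len(bits), "bits instead of", 8 * len(sample))
```

The most frequent byte receives the shortest code, which is why skewed data compresses well.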

Before MySQL 5.0, MyISAM supported tables of up to 4 GB by default. Supporting larger MyISAM tables required specifying the MAX_ROWS and AVG_ROW_LENGTH attributes. Starting from MySQL 5.0, MyISAM supports 256 TB of data in a single table by default.

For MyISAM tables, MySQL caches only the index files and leaves data file caching to the operating system. This differs from most databases, which cache data themselves using an LRU (Least Recently Used) algorithm. Before MySQL 5.1.23, the index buffer could be set to at most 4 GB on both 32-bit and 64-bit systems. Later versions support index buffers larger than 4 GB on 64-bit systems.
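
For readers unfamiliar with LRU, the following minimal sketch shows the eviction rule. It is a generic illustration built on Python's OrderedDict, not the buffer management of any particular database; LRUCache and the page names are hypothetical.

```python
from collections import OrderedDict

class LRUCache:
    """Fixed-capacity cache that evicts the least recently used entry."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._pages = OrderedDict()  # oldest entry first, newest last

    def get(self, page_id):
        if page_id not in self._pages:
            return None
        self._pages.move_to_end(page_id)  # mark as most recently used
        return self._pages[page_id]

    def put(self, page_id, page):
        self._pages[page_id] = page
        self._pages.move_to_end(page_id)
        if len(self._pages) > self.capacity:
            self._pages.popitem(last=False)  # drop the least recently used

cache = LRUCache(capacity=2)
cache.put(1, "page-1")
cache.put(2, "page-2")
cache.get(1)            # page 1 becomes the most recently used
cache.put(3, "page-3")  # capacity exceeded, page 2 is evicted
print(cache.get(2), cache.get(1))  # None page-1
```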

MySQL AB acquired the NDB cluster engine from Sony Ericsson; it is the Cluster engine in the figure above. It resembles Oracle's RAC cluster, but where Oracle RAC is a shared-everything architecture, NDB is a shared-nothing cluster architecture, which can provide a higher level of availability. NDB keeps all data in memory (from MySQL 5.1, non-indexed data can be stored on disk), so primary key lookups are extremely fast. Adding NDB data storage nodes improves database performance linearly.

NDB's JOIN operations are performed at the MySQL database layer, not at the storage engine layer. Complex joins therefore incur heavy network overhead, and such queries are very slow.

The Memory storage engine (previously known as the HEAP storage engine) stores table data in memory, so the data is lost if the database restarts or crashes. It suits temporary tables that hold transient data and dimension tables in a data warehouse (for example, a movie table stores only actor ids rather than full actor details; the actor table is the dimension table). By default it uses hash indexes instead of B+ tree indexes.
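
The practical difference between a hash index and an ordered index such as a B+ tree can be sketched as follows. A Python dict plays the role of the hash index and a sorted key list stands in for the B+ tree; the sample rows and the range_lookup helper are hypothetical.

```python
import bisect

rows = [(3, "Carol"), (1, "Alice"), (4, "Dave"), (2, "Bob")]

# Hash-style index: key -> row, a single probe for an equality lookup.
hash_index = {key: name for key, name in rows}

# Ordered index (standing in for a B+ tree): keys kept sorted.
sorted_keys = sorted(key for key, _ in rows)

def range_lookup(low, high):
    """Return rows with low <= key <= high using the ordered keys."""
    start = bisect.bisect_left(sorted_keys, low)
    end = bisect.bisect_right(sorted_keys, high)
    return [(key, hash_index[key]) for key in sorted_keys[start:end]]

print(hash_index[3])       # Carol
print(range_lookup(2, 3))  # [(2, 'Bob'), (3, 'Carol')]
```

Equality lookups suit the hash structure, while range scans need the ordering that only the sorted structure provides.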

The Memory engine is very fast, but it only supports table locks, has poor concurrency, and does not support TEXT and BLOB column types. It also stores variable-length fields (varchar) as fixed-length fields (char), which wastes memory (a fix for this exists).

MySQL uses the Memory storage engine as a temporary table to store the intermediate result set of the query. If the intermediate result set is larger than the capacity setting of the Memory table or the intermediate result contains a TEXT or BLOB column type field, MySQL will convert it to a MyISAM table and store it on disk. MyISAM tables do not cache data files, so the performance of temporary tables will be low at this time.

The Archive storage engine only supports INSERT and SELECT operations. Starting from MySQL 5.1, it supports indexes. It uses the zlib algorithm to compress data rows and stores them. The compression ratio is generally up to 1:10, which is suitable for data archiving. The Archive engine uses row locks to implement highly concurrent insert operations, but it is not a transaction-safe storage engine. The design goal is to provide high-speed insert and compression functions.
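
The effect of zlib on repetitive row data can be seen with Python's standard zlib module. This is only an illustration of the compression library, not of the Archive engine's storage format, and the log-like sample rows are invented.

```python
import zlib

# Hypothetical archive-style rows: repetitive, log-like records.
rows = [f"{i},2020-12-29,user{i % 10},login ok".encode() for i in range(1000)]
raw = b"\n".join(rows)

packed = zlib.compress(raw)         # compress the rows as one block
restored = zlib.decompress(packed)  # decompression returns the original bytes

ratio = len(raw) / len(packed)
print(f"{len(raw)} bytes -> {len(packed)} bytes (about {ratio:.1f}x smaller)")
```

The ratio depends entirely on how repetitive the data is, so the figure printed here should not be read as typical for real tables.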

The Federated storage engine stores no data itself; it points to a table on a remote MySQL database server. It is similar to SQL Server's linked servers and Oracle's transparent gateway, but the Federated engine only supports MySQL database tables, not tables in heterogeneous databases.

The Maria storage engine is newly developed, with the design goal of replacing MyISAM as MySQL's default storage engine. Its developer is one of the founders of MySQL, and it can be regarded as a successor to MyISAM. It caches both data and index files, uses a row-lock design, provides MVCC, offers transactional and non-transactional options, and handles BLOB types with better performance.

Many storage engines do not support transactions. Database theory textbooks state that the biggest difference between a database and a traditional file system is transaction support, but MySQL takes the view that not every application needs transactions, so it also offers engines without them.

View the storage engines supported by MySQL:

SHOW ENGINES;

You can also query the ENGINES table in the information_schema schema.

For the same amount of data, table sizes rank as follows: InnoDB > MyISAM > Archive.

Connecting to MySQL means that a connecting process communicates with the database instance.

When connecting to MySQL via TCP/IP, the client and the MySQL instance are generally on different servers:

mysql -h192.168.0.101 -u david -p

The above example indicates that a TCP/IP connection request is initiated to the MySQL instance with the Host IP of 192.168.0.101.

When a client connects to a MySQL instance via TCP/IP, MySQL first checks a privilege view to determine whether the requesting client IP is allowed to connect. This view is the user table in the mysql database:
[Figure: contents of the mysql.user table]
As the table shows, the user david is allowed to connect to this instance from any IP address and does not need a password. The table also shows the root user's access privileges for each network segment.

If two processes that need to communicate are on the same Windows server, they can use named pipes. The default local connection after installing SQL Server also uses named pipes. To use named pipes with MySQL, enable the --enable-named-pipe option in the configuration file. Since MySQL 4.1, MySQL also provides a shared memory connection method: add --shared-memory to the configuration file, and have the client connect with the --protocol=memory option.

Under Linux and Unix, Unix domain sockets can be used. A Unix domain socket is not a network protocol, so it can only be used when the MySQL client and the database instance are on the same server. The path of the socket file can be specified in the configuration file, for example --socket=/tmp/mysql.sock. After starting the database instance, check the Unix domain socket file:

SHOW VARIABLES LIKE 'socket';

After knowing the path of the domain socket file, you can connect in this way:

mysql -udavid -S /tmp/mysql.sock
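
To see why a Unix domain socket only works on a single host, the sketch below has a client and a tiny echo server talk through a socket file. It involves no MySQL code at all, assumes a Linux or Unix system, and uses a hypothetical socket path inside a temporary directory.

```python
import os
import socket
import tempfile
import threading

def serve_once(server_sock):
    """Accept one connection and echo back what the client sent."""
    conn, _ = server_sock.accept()
    with conn:
        data = conn.recv(1024)
        conn.sendall(b"echo: " + data)

with tempfile.TemporaryDirectory() as tmp:
    # The socket is addressed by a filesystem path, not by host and port.
    sock_path = os.path.join(tmp, "demo.sock")
    server = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    server.bind(sock_path)
    server.listen(1)
    worker = threading.Thread(target=serve_once, args=(server,))
    worker.start()

    client = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    client.connect(sock_path)  # plays the role of the -S option's path
    client.sendall(b"ping")
    reply = client.recv(1024)
    client.close()
    worker.join()
    server.close()

print(reply)  # b'echo: ping'
```

Because the address is a path on the local filesystem, a client on another machine has nothing to connect to, which is why TCP/IP is needed for remote connections.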

Origin blog.csdn.net/tus00000/article/details/111933999