System Libraries in MySQL

1.1. Introduction to System Library

MySQL ships with several system databases. They hold information the server needs while it runs, along with runtime status information. Let's look at each in turn.

performance_schema

This database mainly stores status information collected while the MySQL server runs, and can be regarded as performance monitoring for the server. It includes statistics on which statements were executed recently, how long each stage of execution took, memory usage, and similar information.

information_schema

This database holds information about all other databases maintained by the MySQL server, such as their tables, views, triggers, columns, and indexes. This descriptive information is called metadata.

sys

This database combines information_schema and performance_schema through views, so that programmers can more easily read performance information about the MySQL server.

mysql

It mainly stores MySQL user accounts and privileges, stored procedure and event definitions, some logs generated at runtime, help information, time zone information, and so on.

1.2. performance_schema

1.2.1. What is performance_schema

MySQL's performance_schema is a feature that runs at a lower level and is used to monitor resource consumption and resource waiting during the running of MySQL Server. It has the following characteristics.

Running at a lower level: The collected things are relatively low-level, such as disk files, table I/O, table locks, and so on.

• performance_schema provides a way to check the internal execution of the Server in real time while the database is running. The tables in the performance_schema database use the performance_schema storage engine. The database mainly focuses on performance-related data during the operation of the database.

• performance_schema monitors the internal execution of the server by monitoring server events. An "event" is anything the server does internally that takes time, together with the time it consumed. This information shows where server resources are being spent. In general, an event can be a function call, a wait on the operating system, a stage of SQL statement execution (such as parsing or sorting), or an entire SQL statement or group of statements. Collecting events makes it easy to obtain information about the synchronization calls that the server and its storage engines make on resources such as disk files, table I/O, and table locks.

• The tables for current events, historical events, and event summaries record how many times an event occurred and how long it took, which can then be used to analyze the activity of a specific thread or a specific object (such as a mutex or a file).

• The performance_schema storage engine collects event data through "instrumentation points" in the server source code. Unlike features such as replication or the event scheduler, performance_schema has no separate thread of its own.

• The collected event data is stored in tables in the performance_schema database. These tables can be queried with SELECT, and some of them can be updated with SQL statements (for example, the configuration tables whose names start with "setup_"). Note that changes to the configuration tables take effect immediately and affect data collection.

• The data in performance_schema tables is kept in memory and is not persisted to disk. Once the server restarts, the data is lost (everything under performance_schema, including changes made to the configuration tables).

1.2.2. Using performance_schema

After the introduction above, you should have a clearer picture of what performance_schema is. Let's now look at how to use it.

1.2.3. Check whether the current database version supports performance_schema

performance_schema is implemented as a storage engine. If it is available, it appears in the INFORMATION_SCHEMA.ENGINES table and in the output of the show engines statement with a Support value of YES, as shown below.

select * from INFORMATION_SCHEMA.ENGINES;
show engines;

When the Support value for performance_schema is YES, the current database version supports performance_schema. But does support mean it is actually in use? Not necessarily: performance_schema is disabled by default in MySQL 5.5 and is enabled by default only from MySQL 5.6.6 onward (including MySQL 5.7 and later).

After mysqld starts, use the following statement to check whether performance_schema is enabled. A value of ON indicates that performance_schema was initialized successfully and is ready for use; a value of OFF indicates that it is disabled or that an error occurred during initialization, in which case check the error log.

show variables like 'performance_schema';

(If you want to explicitly enable or disable performance_schema, set performance_schema=ON|OFF in my.cnf. Note: this is a read-only parameter, so it must be set before the instance starts.)

Now, you can find out which tables exist by querying the metadata related to the performance_schema storage engine in the INFORMATION_SCHEMA.TABLES table, or using the show tables statement under the performance_schema library.

Use the show tables statement to query which performance_schema engine tables exist.
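Both approaches can be sketched as follows. This is a minimal example that assumes only that the server has performance_schema available; the column names are those of INFORMATION_SCHEMA.TABLES.

```sql
-- List every table implemented by the performance_schema storage engine.
SELECT table_name
FROM information_schema.tables
WHERE engine = 'PERFORMANCE_SCHEMA'
ORDER BY table_name;

-- Equivalent listing from the schema itself.
SHOW TABLES FROM performance_schema;

-- Count the tables; the result depends on the MySQL version.
SELECT COUNT(*) AS ps_table_count
FROM information_schema.tables
WHERE table_schema = 'performance_schema';
```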

In the version used here, there are 87 tables under the performance_schema library; the exact number varies between MySQL versions.

So what data are these tables used to store? How do we use them to query data? Let's first look at how these tables are classified.

1.2.4. Classification of performance_schema tables

The tables under the performance_schema library can be grouped along different monitoring dimensions, for example by database object, by event type, or first by event type and then further by account, host, program, thread, user, and so on.

The following describes the tables that record performance event data grouped by event type.

• Statement event record tables: tables that record statement event information, including events_statements_current (current statement events), events_statements_history (historical statement events), events_statements_history_long (long statement event history), and a number of aggregated summary tables. The summary tables are further subdivided by account, host, program, thread, user, and global scope.

show tables like 'events_statement%';

• Wait event record tables: tables that record wait events, similar to the statement event record tables.

show tables like 'events_wait%';

• Stage event record tables: tables that record statement execution stage events, similar to the statement event record tables.

show tables like 'events_stage%';

• Transaction event record tables: tables that record transaction-related events, similar to the statement event record tables.

show tables like 'events_transaction%';

• Tables that monitor file system layer calls:

show tables like '%file%';

• Tables to monitor memory usage:

show tables like '%memory%';

• Configuration table for dynamically configuring performance_schema:

show tables like '%setup%';

Now we have an overview of how the main tables in performance_schema are classified, but how do we use these tables to obtain performance event data?

1.2.5. Simple configuration and use of performance_schema

After the database has been initialized and started, not all instruments and consumers are enabled, so not all events are collected by default. Instruments are the collection items listed in the setup_instruments configuration table; each has a switch field set to YES or NO. Consumers are the corresponding items in the setup_consumers table that control which tables store event data: YES means the corresponding table saves performance data, and NO means it does not.

The event you want to observe may not be enabled, in which case you need to turn it on. The following two statements enable the corresponding instruments and consumers, using wait events as the example.

To turn on collection of wait events, modify the corresponding items in the setup_instruments configuration table.

update setup_instruments set enabled='yes',timed='yes' where name like 'wait%';

To turn on storage of wait events, modify the corresponding items in the setup_consumers configuration table.

update setup_consumers set enabled='yes' where name like '%wait%';

After configuration, we can check what the server is currently doing by querying the events_waits_current table. It contains at most one row per thread, showing the most recent monitored event for that thread (that is, what the thread is doing).

Each thread keeps only one record in a _current table, and once the thread ends, its event information is removed from the table. The _history table records the events each thread has completed, but by default keeps only the 10 most recent events per thread; older ones are overwritten. The _history_long table records events from all threads, but by default holds 10,000 rows in total; older ones are overwritten.

The summary table provides summary information for all events. The tables in this group summarize event data in different ways (for example: by user, by host, by thread, etc.).
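The three kinds of wait event tables described above can be queried as in the following sketch. It assumes the wait instruments and consumers have been enabled as shown earlier; TIMER_WAIT and SUM_TIMER_WAIT are reported in picoseconds.

```sql
-- What each thread is waiting on right now (at most one row per thread).
SELECT thread_id, event_name, source, timer_wait, object_name
FROM performance_schema.events_waits_current;

-- The most recent completed wait events of each thread.
SELECT thread_id, event_id, event_name, timer_wait
FROM performance_schema.events_waits_history
ORDER BY thread_id, event_id;

-- Aggregate view: the ten wait events with the largest total wait time.
SELECT event_name, count_star, sum_timer_wait
FROM performance_schema.events_waits_summary_global_by_event_name
WHERE count_star > 0
ORDER BY sum_timer_wait DESC
LIMIT 10;
```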

1.2.6. View recently failed SQL statements

Suppose application code that operates on the database (for example, through a Java ORM framework) reports a syntax error, but the code has no facility for logging the text of the SQL statement. Can the exact statement text be viewed at the MySQL layer, to see whether it contains a typo? Most people first think of checking the error log. Unfortunately, the error log does not record SQL syntax errors.

In fact, the statement event record tables of performance_schema record detailed information about the execution of each statement, for example the events_statements_* tables and the events_statements_summary_by_digest table. The events_statements_* tables record the full error information for each statement, while events_statements_summary_by_digest records only statistics about errors that occurred during execution, not the specific error type (for example, it does not tell you that the error was a syntax error). Let's see how to use these tables to find the failing statement.

First, we simulate a SQL statement with a syntax error, then use the events_statements_history_long table or the events_statements_history table to look it up:
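One way to simulate the error is sketched below. The statement is deliberately invalid and is an illustration, not the statement used by the original author; it assumes the events_statements_history consumer is enabled so that the failure is recorded.

```sql
-- Deliberately malformed statement: the table name is missing after FROM,
-- so the server should reject it with error 1064 (a syntax error).
SELECT * FROM;
```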

Then, query the records with error number 1064 in the events_statements_history table:

select * from events_statements_history where mysql_errno=1064\G

If you don't know the error number, you can query the statement records whose error count is not 0 and look at their SQL_TEXT and MESSAGE_TEXT fields (the record whose message reports a syntax error is the statement you want).
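A minimal sketch of that lookup, together with the digest-level statistics mentioned earlier. The column names are those of the MySQL 5.7 tables; a trailing \G can be used instead of the semicolon in the mysql client for vertical output.

```sql
-- Statements recorded with at least one error, newest first within each thread.
SELECT thread_id, event_id, sql_text, mysql_errno,
       returned_sqlstate, message_text, errors
FROM performance_schema.events_statements_history
WHERE errors > 0
ORDER BY thread_id, event_id DESC;

-- Digest-level statistics: how often each normalized statement failed.
-- This shows error counts only, not the error type.
SELECT schema_name, digest_text, count_star, sum_errors, last_seen
FROM performance_schema.events_statements_summary_by_digest
WHERE sum_errors > 0
ORDER BY sum_errors DESC;
```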

1.2.7. View the latest transaction execution information

The slow query log shows the total execution time of a statement, but if large transactions are rolled back or terminated abnormally during execution, the slow query log cannot help. In that case, we can use the events_transactions_* tables of performance_schema to view transaction-related records. These tables record in detail whether a transaction was rolled back, is still active (a transaction left uncommitted for a long time also counts as active), or was committed.

First, transaction events must be configured and enabled; they are not enabled by default.

update setup_instruments set enabled='yes',timed='yes' where name like 'transaction%';
update setup_consumers set enabled='yes' where name like '%transaction%';

Now we open a new session (Session 2) to execute the transaction and simulate a transaction rollback.
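Session 2 could look like the following sketch. The demo.account table and its columns are hypothetical and serve only to give the transaction some work; any table you can modify would do.

```sql
-- Session 2: open an explicit transaction and leave it uncommitted.
START TRANSACTION;

-- Hypothetical table and columns, used only for illustration.
UPDATE demo.account SET balance = balance - 100 WHERE id = 1;
```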

Query active transactions. Active transactions represent transaction events currently being executed and need to be queried from the events_transactions_current table.
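A minimal sketch of that query, run from the first session and selecting only a few of the available columns:

```sql
-- Transactions that have been started but neither committed nor rolled back.
SELECT thread_id, event_id, event_name, state, trx_id,
       access_mode, isolation_level, autocommit
FROM performance_schema.events_transactions_current
WHERE state = 'ACTIVE';
```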

The result should contain one record representing the currently active transaction event.

Roll back the transaction in session 2:

Query the current table of transaction events (events_transactions_current) and the history table of transaction events (events_transactions_history):

Both tables now contain a row for the transaction event: the thread that executed the transaction (thread ID 30 in this example) is shown with a transaction state of ROLLED BACK.
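The rollback and the two lookups can be sketched as follows, assuming the transaction instruments and consumers were enabled as shown earlier:

```sql
-- Session 2: abandon the open transaction.
ROLLBACK;

-- Session 1: inspect the current and history tables of transaction events.
SELECT thread_id, event_id, state, timer_wait
FROM performance_schema.events_transactions_current;

SELECT thread_id, event_id, state, timer_wait
FROM performance_schema.events_transactions_history;
```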

But when we close session 2, the records in the current table of transaction events (events_transactions_current) disappear.

To find the record after that, query the events_transactions_history_long table.
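A sketch of that query, assuming the events_transactions_history_long consumer was enabled before the transaction ran:

```sql
-- Rolled-back transactions that are still retained after their session closed.
SELECT thread_id, event_id, state, trx_id, timer_wait
FROM performance_schema.events_transactions_history_long
WHERE state = 'ROLLED BACK'
ORDER BY timer_start DESC;
```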

1.2.8. Summary

Of course, performance_schema can do more than what is covered above. For example, it can also show the execution stages and progress of SQL statements and, for MySQL replication setups, the details of replication errors.

For details, please refer to the official website: MySQL :: MySQL 5.7 Reference Manual :: 25 MySQL Performance Schema

1.3. sys system library

1.3.1. Notes on using sys

The sys system library supports MySQL 5.6 or higher versions, but does not support MySQL 5.5.x and lower versions.

The sys system library is mainly intended for DBAs troubleshooting specific problems, and the queries it runs have some impact on performance.

Because the sys system library provides views as an alternative to querying performance_schema directly, performance_schema must be enabled (set the performance_schema system parameter to ON) for most functions of the sys system library to work.

At the same time, to fully access the sys system library, the user must have sufficient privileges on sys itself and on the databases its views read from, such as performance_schema.

Some features of performance_schema must also be enabled if you want to use all the features of the sys system library. For example:

Enable all wait instruments:

CALL sys.ps_setup_enable_instrument('wait');

Enable the current table for all event types:

CALL sys.ps_setup_enable_consumer('current');

Note: The default configuration of performance_schema satisfies most of the data collection needs of the sys system library. Enabling every feature has a performance cost, so it is best to enable only the configuration you need.

1.3.2. Using the sys system library

If you use the USE statement to switch the default database, you can directly use the views under the sys system library to query, just like querying the tables under a certain library. You can also use db_name.view_name, db_name.procedure_name, db_name.func_name, etc. to access objects in the sys system library without specifying a default database (this is called a name-qualified object reference).

There are many views under the sys system library, which aggregate and present the performance_schema tables in various ways. Most of these views come in pairs: the two views have the same name, but one of them is prefixed with x$. For example:

host_summary_by_file_io and x$host_summary_by_file_io

Both represent file I/O performance data summarized by host. The two views read the same data sources, but the view without the x$ prefix displays values after unit conversion (milliseconds, seconds, minutes, hours, days, and so on), while the view with the x$ prefix displays the raw data (in picoseconds).
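The difference is easy to see by running the same query against both views. This is a minimal sketch; the backticks around the x$ name are optional and used here only for clarity.

```sql
-- Formatted values: latency is rendered as human-readable text.
SELECT host, ios, io_latency
FROM sys.host_summary_by_file_io;

-- Raw values: latency is a number in picoseconds, convenient for sorting
-- or further arithmetic.
SELECT host, ios, io_latency
FROM sys.`x$host_summary_by_file_io`
ORDER BY io_latency DESC;
```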

1.3.3. Check where the slow SQL statement is slow

If a statement repeatedly shows up in the slow query log and the cause cannot be found in the table structure, index structure, or statistics, we can play the trump card of the sys system library: the sys.session view, combined with the wait events of performance_schema, to find out what is wrong. What does the session view do? It shows the process list of current user sessions, so you can see what each session is doing. Note that this view is only available in MySQL 5.7.9 and later.

First, you need to enable functions related to waiting events:

call sys.ps_setup_enable_instrument('wait');
call sys.ps_setup_enable_consumer('wait');

Then simulate it:

Execute in a session

select sleep(30);

In another session, query in the sys library:

select * from session where command='query' and conn_id !=connection_id()\G

To query per-table statistics on rows inserted, deleted, updated, and fetched, together with I/O latency:

select * from schema_table_statistics_with_buffer\G

1.3.4. Summary

In addition, sys can be used to view hot data in the InnoDB buffer pool, check for transaction lock waits, find unused or redundant indexes, see which statements use full table scans, and so on.
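A few of these checks are sketched below. The view names are those of the sys schema bundled with MySQL 5.7; availability in other versions is an assumption you should verify.

```sql
-- Indexes that have not been used since the server started.
SELECT * FROM sys.schema_unused_indexes;

-- Indexes made redundant by another index on the same table.
SELECT * FROM sys.schema_redundant_indexes;

-- Normalized statements that performed full table scans.
SELECT * FROM sys.statements_with_full_table_scans LIMIT 10;

-- Current InnoDB lock waits: who is waiting and who is blocking.
SELECT * FROM sys.innodb_lock_waits;

-- Buffer pool usage broken down by table (can be expensive on a busy server).
SELECT * FROM sys.innodb_buffer_stats_by_table LIMIT 10;
```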

For details, please refer to the official website: MySQL :: MySQL 5.7 Reference Manual :: 26 MySQL sys Schema

1.4. information_schema

1.4.1. What is information_schema

information_schema provides access to database metadata, statistical information, and information about MySQL Server (for example: database name or table name, field data type and access rights, etc.). The information stored in this library can also be called MySQL's data dictionary or system catalog.

There is an independent information_schema in each MySQL instance, which is used to store the basic information of all other databases in the MySQL instance. The information_schema library contains multiple read-only tables (non-persistent tables), so there are no corresponding associated files in the data directory on the disk, and triggers cannot be set for these tables. Although you can use the USE statement to set the default database as information_schema when querying, all tables under this database are read-only, and data change operations such as INSERT, UPDATE, and DELETE cannot be performed.

Query operations for tables under information_schema can replace some SHOW query statements (for example: SHOW DATABASES, SHOW TABLES, etc.).
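For instance, the following sketch shows information_schema queries that return comparable results to two common SHOW statements:

```sql
-- Comparable to SHOW DATABASES.
SELECT schema_name
FROM information_schema.schemata;

-- Comparable to SHOW TABLES FROM mysql, with extra columns
-- that SHOW TABLES does not provide.
SELECT table_name, engine, table_rows
FROM information_schema.tables
WHERE table_schema = 'mysql';
```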

Note: The number of tables and how they are stored depend on the MySQL version. There are 59 tables in MySQL 5.6 and 61 tables in MySQL 5.7.

In MySQL 8.0, the data dictionary tables under this schema (including some temporary tables that used the Memory engine) were migrated to the mysql schema. They are hidden there and cannot be accessed directly; they must be accessed through the information_schema tables of the same name.

All tables under information_schema use the Memory or InnoDB storage engine and are temporary rather than persistent tables, so their data is lost when the database restarts. Among MySQL's four system libraries, information_schema is the only one with no corresponding directories or files on the file system.

1.4.2. information_schema table classification

Statistics dictionary tables of the server layer

(1)COLUMNS

• Provides information about the columns (fields) of tables.

(2)KEY_COLUMN_USAGE

• Shows which index columns have constraints.

• The table covers primary key, unique index, and foreign key constraints, recording, for example, the schema, table, and column a constraint is defined on and the schema, table, and column it references. It overlaps with the TABLE_CONSTRAINTS table: TABLE_CONSTRAINTS does not record the schema, table, and column referenced by a constraint, but it does record the constraint type, which KEY_COLUMN_USAGE does not.

(3)REFERENTIAL_CONSTRAINTS

• Provides information about foreign key constraints.

(4)STATISTICS

• Provides information about indexes, with one row for each column of each index.

(5)TABLE_CONSTRAINTS

• Provides information about the constraints defined on tables.

(6)FILES

• Provides information about MySQL tablespace data files.

(7)ENGINES

• Provides information about the storage engines supported by MySQL Server.

(8)TABLESPACES

• Provides information about active tablespaces (mainly the tablespaces of the NDB storage engine).

• Note: This table does not provide tablespace information about the InnoDB storage engine. For metadata information about InnoDB tablespaces, please query the INNODB_SYS_TABLESPACES and INNODB_SYS_DATAFILES tables. In addition, starting from MySQL 5.7.8, the INFORMATION_SCHEMA.FILES table also provides metadata information for querying InnoDB tablespaces.

(9)SCHEMATA

• Provides the list of databases in MySQL Server; one schema represents one database.
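Several of the tables in this group can be combined to describe a single table. The sketch below uses the lijin.order_exp table that appears later in this document; substitute your own schema and table names.

```sql
-- Columns of one table, in definition order.
SELECT column_name, column_type, is_nullable, column_key
FROM information_schema.columns
WHERE table_schema = 'lijin' AND table_name = 'order_exp'
ORDER BY ordinal_position;

-- Constraint type (from TABLE_CONSTRAINTS) combined with the constrained
-- and referenced columns (from KEY_COLUMN_USAGE).
SELECT tc.constraint_name, tc.constraint_type, k.column_name,
       k.referenced_table_name, k.referenced_column_name
FROM information_schema.table_constraints AS tc
JOIN information_schema.key_column_usage AS k
  ON  k.constraint_schema = tc.constraint_schema
  AND k.constraint_name   = tc.constraint_name
  AND k.table_name        = tc.table_name
WHERE tc.table_schema = 'lijin' AND tc.table_name = 'order_exp';

-- Index layout: one row per indexed column.
SELECT index_name, seq_in_index, column_name, non_unique
FROM information_schema.statistics
WHERE table_schema = 'lijin' AND table_name = 'order_exp'
ORDER BY index_name, seq_in_index;
```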

Table-level object dictionary tables of the server layer

(1)VIEWS

• Provides information about the views in a database. The account querying this table needs the SHOW VIEW privilege.

(2)TRIGGERS

• Provides information about the triggers in a database.

(3)TABLES

• Provides basic information about the tables in a database.

(4)ROUTINES

• Provides information about stored procedures and stored functions (excluding user-defined functions). The information corresponds to what is recorded in mysql.proc (if that table has entries).

(5)PARTITIONS

• Provides information about table partitions.

(6)EVENTS

• Provides information about scheduled events.

(7)PARAMETERS

• Provides information about the parameters of stored procedures and functions, and about the return values of stored functions. The parameter information is similar to the content of the param_list column in the mysql.proc table.

Mixed information dictionary tables of the server layer

(1)GLOBAL_STATUS, GLOBAL_VARIABLES, SESSION_STATUS, SESSION_VARIABLES

• Provide global and session-level status variable and system variable information.

(2)OPTIMIZER_TRACE

• Provides information produced by the optimizer trace function.

• Tracing is disabled by default; use the optimizer_trace system variable to enable it. When enabled, each session can trace only its own statements and cannot see those of other sessions, and by default each session keeps only the last traced SQL statement.

(3)PLUGINS

• Provides information about the plugins supported by MySQL Server.

(4)PROCESSLIST

• Provides status information about running threads.

(5)PROFILING

• Provides statement profiling information. Its content corresponds to the output of the SHOW PROFILES and SHOW PROFILE statements. The table records profiling information only when the session variable profiling=1.

• Note: Starting from MySQL 5.7.2, this table is deprecated. It will be removed in a future MySQL version; use Performance Schema instead.

(6)CHARACTER_SETS

• Lists the character sets supported by MySQL Server.

(7)COLLATIONS

• Lists the collations supported by MySQL Server.

(8)COLLATION_CHARACTER_SET_APPLICABILITY

• Shows which character set each collation applies to. The result set is equivalent to the first two columns of the output of SHOW COLLATION. In practice this table is of limited use.

(9)COLUMN_PRIVILEGES

• Provides column-level privilege information. The content comes from the mysql.columns_priv table (it has content only after privileges have been granted on individual columns of a table).

(10)SCHEMA_PRIVILEGES

• Provides database-level privilege information, with one row per privilege. The information comes from the mysql.db table.

(11)TABLE_PRIVILEGES

• Provides table-level privilege information. The content comes from the mysql.tables_priv table.

(12)USER_PRIVILEGES

• Provides global privilege information. The information comes from the mysql.user table.
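As an illustration of one table in this group, the following sketch enables the optimizer trace for the current session, runs a statement, and reads the trace back from OPTIMIZER_TRACE. The traced SELECT is only a placeholder; replace it with the statement you want to analyze.

```sql
-- Enable tracing for the current session only.
SET optimizer_trace = 'enabled=on';

-- Placeholder statement to be traced.
SELECT COUNT(*) FROM information_schema.schemata;

-- Read the trace of the last traced statement.
SELECT query, trace
FROM information_schema.optimizer_trace;

-- Turn tracing off again.
SET optimizer_trace = 'enabled=off';
```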

System dictionary tables of the InnoDB layer

(1)INNODB_SYS_DATAFILES

• Provides metadata about all types of InnoDB tablespace files (the internal tablespace ID and the path of the tablespace file), including file-per-table tablespaces, general tablespaces, the system tablespace, temporary tablespaces, and undo tablespaces (if separate undo tablespaces are enabled).

• The information in this table is equivalent to the information in the SYS_DATAFILES table inside the InnoDB data dictionary.

(2)INNODB_SYS_VIRTUAL

• Provides metadata about InnoDB virtual generated columns and the columns they are based on, equivalent to the SYS_VIRTUAL table inside the InnoDB data dictionary. Each row describes one column on which a virtual generated column is based.

(3)INNODB_SYS_INDEXES

• Provides metadata about InnoDB indexes, equivalent to the SYS_INDEXES table inside the InnoDB data dictionary.

(4)INNODB_SYS_TABLES

• Provides metadata about InnoDB tables, equivalent to the SYS_TABLES table inside the InnoDB data dictionary.

(5)INNODB_SYS_FIELDS

• Provides metadata about InnoDB index key columns (fields), equivalent to the SYS_FIELDS table inside the InnoDB data dictionary.

(6)INNODB_SYS_TABLESPACES

• Provides metadata about InnoDB file-per-table and general tablespaces (including full-text index tablespaces), equivalent to the SYS_TABLESPACES table inside the InnoDB data dictionary.

(7)INNODB_SYS_FOREIGN_COLS

• Provides status information about InnoDB foreign key columns, equivalent to the SYS_FOREIGN_COLS table inside the InnoDB data dictionary.

(8)INNODB_SYS_COLUMNS

• Provides metadata about InnoDB table columns, equivalent to the SYS_COLUMNS table inside the InnoDB data dictionary.

(9)INNODB_SYS_FOREIGN

• Provides metadata about InnoDB foreign keys, equivalent to the SYS_FOREIGN table inside the InnoDB data dictionary.

(10)INNODB_SYS_TABLESTATS

• Provides a view for querying lower-level state information about InnoDB tables. The MySQL optimizer uses these statistics data to calculate and determine which index to use when querying InnoDB tables. This information is held in data structures in memory that do not correspond to data stored on disk. There is no corresponding system table in InnoDB.

Lock, transaction, and statistics dictionary tables of the InnoDB layer

(1)INNODB_LOCKS

• Provides information about each lock that an InnoDB transaction has requested but not yet obtained because another transaction is blocking it, and about each lock that is blocking another transaction. Locks that nobody is waiting for do not appear here; for example, when only one transaction is running, the locks it holds cannot be seen in this table. The content can be used to diagnose lock contention under high concurrency.

(2)INNODB_TRX

• Provides information about every transaction (excluding read-only transactions) currently executing in the InnoDB engine, including whether the transaction is waiting for a lock, when it started, and the text of the SQL statement it is executing (if any).

(3)INNODB_BUFFER_PAGE_LRU

• Provides information about the pages in the buffer pool. Unlike the INNODB_BUFFER_PAGE table, the INNODB_BUFFER_PAGE_LRU table shows how pages in the InnoDB buffer pool are ordered in the LRU list, which determines which pages are evicted from the buffer pool when it becomes full.

(4)INNODB_LOCK_WAITS

• Provides lock wait information for InnoDB transactions. An empty table means there are no lock waits; each row represents one lock wait relationship, which consists of the transaction that is waiting (that is, requesting the lock) together with the lock it is waiting for, and the transaction that holds the requested lock together with the lock it holds.

(5)INNODB_TEMP_TABLE_INFO

• Provides information about user-created InnoDB temporary tables that are currently active in the InnoDB instance (only for connections that are still open; temporary tables belonging to closed connections are dropped automatically). It does not cover the internal InnoDB temporary tables used by the query optimizer. The table is created on first query.

(6)INNODB_BUFFER_PAGE

• Provides information about the pages in the buffer pool.

(7)INNODB_METRICS

• Provides more detailed InnoDB performance information, supplementing the InnoDB-related data in performance_schema. It can be used to check the overall health of InnoDB and to diagnose performance bottlenecks, resource shortages, and application problems.

(8)INNODB_BUFFER_POOL_STATS

• Provides status information about the InnoDB buffer pool. The information is similar to the buffer pool statistics printed by the SHOW ENGINE INNODB STATUS statement, and some InnoDB buffer pool status variables report the same values.
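The lock and transaction tables in this group are typically joined to find out which transaction is blocking which. The following is a sketch for MySQL 5.7, where INNODB_LOCK_WAITS and INNODB_TRX are both available in information_schema; it returns no rows when nothing is waiting.

```sql
-- For every lock wait, show the waiting transaction and the blocking one.
SELECT r.trx_id              AS waiting_trx_id,
       r.trx_mysql_thread_id AS waiting_thread,
       r.trx_query           AS waiting_query,
       b.trx_id              AS blocking_trx_id,
       b.trx_mysql_thread_id AS blocking_thread,
       b.trx_query           AS blocking_query
FROM information_schema.innodb_lock_waits AS w
JOIN information_schema.innodb_trx AS b ON b.trx_id = w.blocking_trx_id
JOIN information_schema.innodb_trx AS r ON r.trx_id = w.requesting_trx_id;
```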

The full-text index dictionary table of the InnoDB layer

(1)INNODB_FT_CONFIG

(2)INNODB_FT_BEING_DELETED

(3)INNODB_FT_DELETED

(4)INNODB_FT_DEFAULT_STOPWORD

(5)INNODB_FT_INDEX_TABLE

Compression-related dictionary tables of the InnoDB layer

(1)INNODB_CMP和INNODB_CMP_RESET

• The data in these two tables contains operational status information related to compressed InnoDB table pages. The data recorded in the table provides a reference for measuring the effectiveness of InnoDB table compression in the database.

(2)INNODB_CMP_PER_INDEX and INNODB_CMP_PER_INDEX_RESET

• These two tables record status information about compression operations on InnoDB compressed table data and indexes. Statistics are kept separately for each combination of database, table, and index, which provides a reference for evaluating the compression performance and usefulness for a specific table.

(3) INNODB_CMPMEM and INNODB_CMPMEM_RESET

• These two tables record the status information of the compressed pages in the InnoDB buffer pool, providing a reference for measuring the effectiveness of InnoDB table compression in the database.

1.4.3. information_schema application

View index column information

The INNODB_SYS_FIELDS table provides metadata about InnoDB index columns (fields). It is equivalent to the SYS_FIELDS table in the InnoDB data dictionary.

The INNODB_SYS_INDEXES table provides metadata about InnoDB indexes. It is equivalent to the SYS_INDEXES table in the InnoDB data dictionary.

The INNODB_SYS_TABLES table provides metadata about InnoDB tables. It is equivalent to the SYS_TABLES table in the InnoDB data dictionary.

Assume that you need to look up the index names, index composition, and index column order of the InnoDB table order_exp in the lijin database. You can use the following SQL statement:

SELECT
    t.NAME AS d_t_name,
    i.NAME AS i_name,
    i.TYPE AS i_type,
    i.N_FIELDS AS i_column_numbers,
    f.NAME AS i_column_name,
    f.POS AS i_position
FROM
    information_schema.INNODB_SYS_TABLES AS t
JOIN information_schema.INNODB_SYS_INDEXES AS i ON t.TABLE_ID = i.TABLE_ID
LEFT JOIN information_schema.INNODB_SYS_FIELDS AS f ON i.INDEX_ID = f.INDEX_ID
WHERE
    t.NAME = 'lijin/order_exp';

The columns in the result are self-explanatory. The only one that needs further explanation is i_type (INNODB_SYS_INDEXES.type), a numeric ID that represents the index type:

0 = non-unique secondary index

1 = automatically generated clustered index (GEN_CLUST_INDEX)

2 = unique secondary index

3 = clustered index (primary key)

32 = full-text index

64 = spatial index

128 = secondary index on a virtual generated column
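The numeric codes are easier to read when decoded in the query itself. The following is a minimal sketch, assuming MySQL 5.7 table names (in MySQL 8.0 the INNODB_SYS_* tables were renamed to INNODB_TABLES, INNODB_INDEXES, and so on) and the example table lijin/order_exp. The labels are illustrative wording, not values returned by the server.

```sql
-- Sketch: decode INNODB_SYS_INDEXES.TYPE into a readable label.
-- Assumes MySQL 5.7 and the example table lijin/order_exp.
SELECT
    i.NAME AS index_name,
    i.TYPE AS type_id,
    CASE i.TYPE
        WHEN 0   THEN 'non-unique secondary index'
        WHEN 1   THEN 'automatically generated clustered index'
        WHEN 2   THEN 'unique secondary index'
        WHEN 3   THEN 'clustered index (primary key)'
        WHEN 32  THEN 'full-text index'
        WHEN 64  THEN 'spatial index'
        WHEN 128 THEN 'secondary index on a virtual generated column'
        ELSE 'other'                       -- any code not listed above
    END AS type_label
FROM information_schema.INNODB_SYS_TABLES AS t
JOIN information_schema.INNODB_SYS_INDEXES AS i
  ON t.TABLE_ID = i.TABLE_ID
WHERE t.NAME = 'lijin/order_exp';
```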

1.5.mysql system library in MySQL

1.5.1. Permission system table

Because privilege management is mainly the responsibility of the DBA, a general understanding of the tables in this part is enough. The MySQL access privilege system tables are stored in the mysql system database and mainly include the following tables.

• user: Contains user accounts, global privileges, and other non-privilege columns (security configuration fields and resource control fields).

• db: Database-level privilege table. The privileges recorded in this table determine whether a user can use them on all objects (tables or stored procedures) in the database to which access is granted.

• tables_priv: table-level privilege table.

• columns_priv: table of privileges at the column level.

• procs_priv: Stored procedure and function privilege table.

• proxies_priv: Proxy user privilege table.

hint:

To change the contents of the privilege tables, use account management statements (such as CREATE USER, GRANT, and REVOKE) to modify them indirectly. Modifying the privilege tables directly with DML statements is not recommended.

(After a GRANT or REVOKE statement is executed, it changes the relevant records in the privilege tables and also updates the in-memory objects that hold user privileges. A DML statement that modifies a privilege table directly changes only the rows in the table; you then need to execute flush privileges; to refresh the in-memory objects that hold user privileges.)
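As an illustration of the recommended approach, the sketch below manages a privilege with account management statements only and then reads the privilege table without writing to it. The account name report_user and its password are hypothetical, and the example assumes that the table lijin.order_exp exists.

```sql
-- Hypothetical account and password; adjust to your environment.
CREATE USER 'report_user'@'localhost' IDENTIFIED BY 'Change_me_123!';

-- Grant a table-level privilege. This writes a row to mysql.tables_priv
-- and refreshes the in-memory privilege objects in one step.
GRANT SELECT ON lijin.order_exp TO 'report_user'@'localhost';

-- Inspect the result through the supported interface ...
SHOW GRANTS FOR 'report_user'@'localhost';

-- ... or read (not write) the underlying system table.
SELECT Host, Db, User, Table_name, Table_priv
FROM mysql.tables_priv
WHERE User = 'report_user';

-- Take the privilege back and remove the account again.
REVOKE SELECT ON lijin.order_exp FROM 'report_user'@'localhost';
DROP USER 'report_user'@'localhost';
```

No flush privileges; is needed at any point here, because none of the statements modifies a privilege table directly.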

1.5.2. Statistics table

The persistent statistics feature stores the statistics held in memory on disk so that they can be read back quickly after the database restarts, without being recalculated. This allows the query optimizer to use the persisted statistics to choose execution plans accurately. Without persistent statistics, the in-memory statistics are lost when the database restarts and must be recalculated the next time the table is accessed. Because the recalculated estimates may differ, the query plan may change, which in turn changes query performance.

How to enable the persistence function of statistics? When innodb_stats_persistent = ON, the persistent function of statistical information is enabled globally, which is enabled by default.

show variables like 'innodb_stats_persistent';

If you want to turn off the persistent statistics function of a certain table separately, you can modify it through the ALTER TABLE tbl_name STATS_PERSISTENT = 0 statement.

1.5.2.1.innodb_table_stats

The innodb_table_stats table provides statistics about table data.

select * from innodb_table_stats where table_name = 'order_exp'\G

• database_name: database name.

• table_name: table name, partition name, or subpartition name.

• last_update: Indicates the last time InnoDB updated the statistics row.

• n_rows: Estimated number of data record rows in the table.

• clustered_index_size: The size of the primary key index, estimated in pages.

• sum_of_other_index_sizes: The total size of other (non-primary key) indexes, estimated in pages.

1.5.2.2.innodb_index_stats

The innodb_index_stats table provides statistics about indexes.

select * from innodb_index_stats where table_name = 'order_exp';

The table fields have the following meanings.

• database_name: database name.

• table_name: table name, partition table name, sub-partition table name.

• index_name: Index name.

• last_update: Indicates the last time InnoDB updated the statistics row.

• stat_name: Statistical information name, and its corresponding statistical information value is stored in the stat_value field.

• stat_value: The value of the statistic named in the stat_name field.

• sample_size: The number of sample pages for the statistics estimate provided in the stat_value field.

• stat_description: The description of the statistic specified in the stat_name field.

It can be seen from the query data of the table:

• The stat_name field has the following statistical values.

■ size: When the stat_name field is the size value, the stat_value field value indicates the total number of pages in the index.

■ n_leaf_pages: When the stat_name field is n_leaf_pages value, the stat_value field value indicates the number of index leaf pages.

■ n_diff_pfxNN: NN stands for a number (such as 01 or 02). When the stat_name field is n_diff_pfxNN, the stat_value field indicates the number of unique values in the first NN columns of the index, counted from the first column in the index definition. For example, when NN is 01, stat_value is the number of unique values in the first column of the index; when NN is 02, it is the number of unique combinations of the first and second columns, and so on. In addition, when stat_name = n_diff_pfxNN, the stat_description field shows a comma-separated list of the index columns covered by the statistic.

• In the rows whose index_name is PRIMARY, the stat_description value is "id". This shows that the statistics of the primary key index cover only the columns explicitly specified when the primary key was created.

• In the rows whose index_name is u_idx_day_status, the stat_description value is "insert_time, order_status, expire_time". This shows that the statistics of a unique index cover only the columns explicitly specified when the unique index was created.

• In the rows whose index_name is idx_order_no, the stat_description value is "order_no,id". This shows that the statistics of an ordinary index (a non-unique secondary index) cover the explicitly defined columns plus the primary key columns.

Note that terms used above, such as leaf pages and the first index column, are explained in the index chapter and are not elaborated here.
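To see how the n_diff_pfxNN values relate to the table size, the two statistics tables can be joined. This is only a sketch, assuming the example table lijin.order_exp. The selectivity column is a name introduced here for illustration, and since both inputs are estimates the ratio is approximate and can slightly exceed 1.

```sql
-- Sketch: list the distinct-value estimates of every index of order_exp
-- next to the estimated row count of the table.
SELECT
    s.index_name,
    s.stat_name,
    s.stat_description AS covered_columns,
    s.stat_value       AS distinct_values,
    t.n_rows,
    -- Approximate share of distinct values; NULLIF avoids division by zero.
    ROUND(s.stat_value / NULLIF(t.n_rows, 0), 4) AS selectivity
FROM mysql.innodb_index_stats AS s
JOIN mysql.innodb_table_stats AS t
  ON  t.database_name = s.database_name
  AND t.table_name    = s.table_name
WHERE s.database_name = 'lijin'
  AND s.table_name    = 'order_exp'
  AND s.stat_name LIKE 'n_diff_pfx%'   -- keep only the n_diff_pfxNN rows
ORDER BY s.index_name, s.stat_name;
```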

1.5.3. Log table

MySQL's log system includes the general query log, the slow query log, the error log (records error messages when the server starts, runs, and stops), the binary log (a logical log that records data changes during server operation), the relay log (records the source server's data changes fetched by the replica's I/O thread), and the DDL log (records metadata changes when DDL statements are executed; in MySQL 5.7 it can only be written to a file, while MySQL 8.0 supports writing it to the innodb_ddl_log table). In MySQL 5.7, only the general query log and the slow query log can be written to tables (they can also be written to files): setting log_output=TABLE saves them to the mysql.general_log and mysql.slow_log tables. All other log types can only be written to files in MySQL 5.7.

1.5.3.1. general_log

The general_log table records the SQL statements that clients execute, and is used to check which SQL statements a client has run on the server.

The general query log is disabled by default:

show variables like 'general_log';

Turn it on:

set global log_output='TABLE'; -- 'TABLE,FILE' means to output to the table and file at the same time 
set global general_log=on; 
show variables like 'general_log';

After executing any query, check the table:

select * from mysql.general_log\G
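The raw output can be long. A narrower query such as the sketch below, which assumes the table-based general log was enabled as shown above, returns only the most recent statements. The CONVERT call is there because the argument column is stored as a binary type in MySQL 5.7 and later, which some clients display as hexadecimal.

```sql
-- Sketch: show the ten most recent statements in the table-based general log.
SELECT
    event_time,
    user_host,
    command_type,
    CONVERT(argument USING utf8mb4) AS statement_text
FROM mysql.general_log
ORDER BY event_time DESC
LIMIT 10;

-- The general log grows quickly, so switch it off after the investigation.
SET GLOBAL general_log = OFF;
```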

1.5.3.2. slow_log

The slow_log table records SQL statements whose execution time exceeds the value of long_query_time. It can also record statements that do not use indexes (requires log_queries_not_using_indexes=ON) and administrative statements (requires log_slow_admin_statements=ON).

show variables like 'log_queries_not_using_indexes';
show variables like 'log_slow_admin_statements';

Turn them on:

set global log_queries_not_using_indexes=on;
set global log_slow_admin_statements=on;
show variables like 'log_queries_not_using_indexes';
show variables like 'log_slow_admin_statements';

We already know that the slow query log helps locate SQL statements that may have problems, so that they can be optimized. However, it is off by default, and we need to turn it on manually.

show VARIABLES like 'slow_query_log';

set GLOBAL slow_query_log=1;

A value of 1 turns it on, and 0 turns it off.

But how slow is slow? A threshold can be set in MySQL, and all SQL statements whose running time exceeds this value are recorded in the slow query log. The long_query_time parameter is this threshold. The default value is 10, representing 10 seconds.

show VARIABLES like '%long_query_time%';

You can also change the threshold:

set global long_query_time=0;

The default is 10 seconds; it is set to 0 here for the convenience of the demonstration.

Then we test it by running any SQL statement and checking the table:

select * from mysql.slow_log\G
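A more controlled test is sketched below. It assumes that slow_query_log is ON, that log_output includes TABLE, and that the account is allowed to change long_query_time for its own session. The 0.5 second threshold and the SLEEP(1) call are illustrative choices.

```sql
-- Lower the threshold for this session only, so other sessions are unaffected.
SET SESSION long_query_time = 0.5;

-- A statement that deliberately runs longer than the threshold.
SELECT SLEEP(1);

-- Read the newest entries; sql_text is converted for readable output.
SELECT
    start_time,
    query_time,
    lock_time,
    rows_sent,
    rows_examined,
    CONVERT(sql_text USING utf8mb4) AS statement_text
FROM mysql.slow_log
ORDER BY start_time DESC
LIMIT 5;

-- If the global threshold was set to 0 for the demonstration, restore it.
SET GLOBAL long_query_time = 10;
```

Note that a value set with SET GLOBAL applies only to connections opened afterwards, which is why the sketch uses SET SESSION for the test itself.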

1.5.4. Statistics in InnoDB

We often used statistical data when discussing query cost. For example, SHOW TABLE STATUS shows statistics about a table, and SHOW INDEX shows statistics about an index. Where do these statistics come from, and how are they collected?

1.5.4.1 Statistical data storage method

InnoDB provides two ways of storing statistical data:

Persistent statistics, which are stored on disk, that is, these statistics are still there after the server is restarted.

Non-persistent statistics, which are stored in memory, are cleared when the server is shut down and are collected again in certain scenarios after the server restarts.

MySQL provides us with the system variable innodb_stats_persistent to control which method is used to store statistical data. Before MySQL 5.6.6, the value of innodb_stats_persistent was OFF by default, which means that InnoDB statistics were stored in memory by default. In later versions, the value of innodb_stats_persistent was ON by default, that is, statistics were stored on disk by default.

SHOW VARIABLES LIKE 'innodb_stats_persistent';

Recent MySQL versions rarely use memory-based non-persistent statistics, so we will not delve into them.

InnoDB collects and stores statistics per table by default. That is, we can store the statistics of some tables (and of their indexes) on disk and the statistics of other tables in memory. How is this done? We can specify how a table's statistics are stored by setting the STATS_PERSISTENT attribute when creating or modifying the table:

CREATE TABLE table_name (...) Engine=InnoDB, STATS_PERSISTENT = (1|0);

ALTER TABLE table_name Engine=InnoDB, STATS_PERSISTENT = (1|0);

When STATS_PERSISTENT=1, it means that we want to permanently store the statistical data of the table on disk, and when STATS_PERSISTENT=0, it means that we want to temporarily store the statistical data of the table in memory. If we do not specify the STATS_PERSISTENT attribute when creating a table, the value of the system variable innodb_stats_persistent is used as the value of the attribute by default.
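A concrete round trip might look like the following sketch. The table lijin.stats_demo is a hypothetical throwaway table used only for illustration.

```sql
-- Hypothetical demo table whose statistics are persisted on disk.
CREATE TABLE lijin.stats_demo (
    id   INT PRIMARY KEY,
    note VARCHAR(50)
) ENGINE=InnoDB, STATS_PERSISTENT = 1;

-- Switch this one table to in-memory (non-persistent) statistics.
ALTER TABLE lijin.stats_demo STATS_PERSISTENT = 0;

-- The table option appears in the table definition.
SHOW CREATE TABLE lijin.stats_demo;

-- Clean up the demo table.
DROP TABLE lijin.stats_demo;
```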

1.5.4.2 Disk-based persistent statistics

When we choose to store the statistical data of a table and the table index on the disk, we actually store these statistical data in two tables:

SHOW TABLES FROM mysql LIKE 'innodb%';

As you can see, these two tables are located under the mysql system database, where:

innodb_table_stats stores statistical data about the table, and each record corresponds to the statistical data of a table.

innodb_index_stats stores statistical data about the index, and each record corresponds to the statistical data of a statistical item of an index.

innodb_table_stats

Take a look directly at what each column in the innodb_table_stats table does:

• database_name: database name.

• table_name: table name.

• last_update: The last update time of this record.

• n_rows: The number of records in the table.

• clustered_index_size: The number of pages occupied by the clustered index of the table.

• sum_of_other_index_sizes: The number of pages occupied by the other indexes of the table.

Let's look directly at the contents of this table:

SELECT * FROM mysql.innodb_table_stats;

The values of several important statistics items are as follows:

The value of n_rows is 10350, indicating that there are about 10350 records in the order_exp table. Note that this data is an estimated value.

The value of clustered_index_size is 97, indicating that the clustered index of the order_exp table occupies 97 pages, and this value is also an estimated value.

The value of sum_of_other_index_sizes is 81, indicating that the other indexes of the order_exp table occupy a total of 81 pages, and this value is also an estimated value.

Collection of n_rows statistical items

InnoDB counts how many rows there are in a table like this:

Using a certain algorithm (not purely random), InnoDB selects several leaf pages, counts the primary key records in each of them, and computes the average number of records per page. Multiplying that average by the total number of leaf pages gives the n_rows value of the table.

It can be seen that the accuracy of n_rows depends on the number of pages sampled. MySQL uses the system variable innodb_stats_persistent_sample_pages to control how many pages are sampled when persistent statistics are calculated. The larger the value, the more accurate the calculated n_rows, but the longer the collection takes; the smaller the value, the less accurate n_rows is, but the less time the collection takes. In practice we need to weigh the two. The default value of this system variable is 20.
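How far the estimate is from reality can be checked by comparing it with an exact count. This is a minimal sketch assuming the example table lijin.order_exp. The exact count has to scan an index, so it may be slow on large tables.

```sql
-- Estimated row count maintained by InnoDB.
SELECT n_rows, last_update
FROM mysql.innodb_table_stats
WHERE database_name = 'lijin'
  AND table_name    = 'order_exp';

-- Exact row count for comparison.
SELECT COUNT(*) AS exact_rows
FROM lijin.order_exp;

-- Number of pages sampled by default when persistent statistics are computed.
SHOW VARIABLES LIKE 'innodb_stats_persistent_sample_pages';
```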

Because InnoDB collects and stores statistics per table by default, we can also set the number of sampled pages for an individual table by specifying the STATS_SAMPLE_PAGES attribute when creating or modifying the table:

CREATE TABLE table_name (...) Engine=InnoDB, STATS_SAMPLE_PAGES = number_of_sampled_pages;

ALTER TABLE table_name Engine=InnoDB, STATS_SAMPLE_PAGES = number_of_sampled_pages;

If we do not specify the STATS_SAMPLE_PAGES attribute in the statement that creates the table, the value of the system variable innodb_stats_persistent_sample_pages will be used as the value of the attribute by default.
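The effect of a per-table setting can be observed in the sample_size column of innodb_index_stats. The sketch below assumes the example table lijin.order_exp, and the value 40 is an arbitrary illustration. Because it changes a table option, it is best tried on a test copy of the table.

```sql
-- Ask InnoDB to sample 40 pages per index column for this table only.
ALTER TABLE lijin.order_exp STATS_SAMPLE_PAGES = 40;

-- Recalculate the statistics so that the new setting is applied.
ANALYZE TABLE lijin.order_exp;

-- sample_size shows how many pages were sampled for each n_diff_pfxNN item.
SELECT index_name, stat_name, stat_value, sample_size
FROM mysql.innodb_index_stats
WHERE database_name = 'lijin'
  AND table_name    = 'order_exp'
  AND stat_name LIKE 'n_diff_pfx%'
ORDER BY index_name, stat_name;
```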

The collection of clustered_index_size and sum_of_other_index_sizes statistical items involves very specific knowledge of the InnoDB table space and the details of storing page data, so we will not go into details.

innodb_index_stats

Take a look directly at what each column in the innodb_index_stats table does:

desc mysql.innodb_index_stats;

The fields have the following meanings.

• database_name: database name.

• table_name: table name.

• index_name: index name.

• last_update: The last update time of this record.

• stat_name: The name of the statistical item.

• stat_value: The value of the corresponding statistical item.

• sample_size: The number of pages sampled for generating the statistics.

• stat_description: The description of the corresponding statistical item.

Each record in the innodb_index_stats table represents a statistical item of an index. Maybe this will make you a little confused about what this statistical item refers to. Don't worry, let's take a look at the index statistics of the order_exp table:

SELECT * FROM mysql.innodb_index_stats WHERE table_name = 'order_exp';

First check the index_name column, which indicates which index the record belongs to. From the results, we can see that the PRIMARY index (that is, the primary key) occupies 3 records, and the idx_expire_time index occupies 6 records.

For records with the same index_name, stat_name is the name of a statistical item of that index, stat_value is the value of that item, and stat_description describes what the item means. Let's take a look at the statistical items of an index:

n_leaf_pages: Indicates how many pages the leaf nodes of the index occupy.

size: Indicates how many pages the index occupies in total.

n_diff_pfxNN: Indicates how many unique values the corresponding index columns have. The NN in the name looks a bit strange. What does it mean?

In fact, NN stands for a number such as 01, 02, or 03. For example, for u_idx_day_status:

n_diff_pfx01 indicates how many unique values the single column insert_time has.

n_diff_pfx02 indicates how many unique combinations the two columns insert_time and order_status have.

n_diff_pfx03 indicates how many unique combinations the three columns insert_time, order_status, and expire_time have.

n_diff_pfx04 indicates how many unique combinations the four columns insert_time, order_status, expire_time, and id have. This item exists only when the primary key column id is appended to the index columns.

For ordinary secondary indexes, the index column values are not guaranteed to be unique. For example, for idx_order_no, many records may share the same order_no value. In that case, only by appending the primary key value to the index columns can secondary index records with identical index column values be told apart.

The primary key and unique secondary indexes do not have this problem. They guarantee that the index column values are not repeated, so there is no need to count unique values with the primary key value appended to the index columns. Compare u_idx_day_status (unique) with idx_order_no (non-unique) in the result.

When calculating the number of unique values contained in some index columns, it is necessary to sample some leaf node pages, and the sample_size column indicates the number of sampled pages.

For a joint index with multiple columns, the number of pages sampled is: innodb_stats_persistent_sample_pages × number of index columns.

When the number of pages to be sampled is greater than the number of leaf pages in the index, all leaf pages are scanned directly to count the unique values in the index columns. This is why the sample_size values of different indexes in the query result may differ.

Update statistics regularly

As we continue to add, delete, and modify the table, the data in the table is also changing, and the statistics in the innodb_table_stats and innodb_index_stats tables are also changing. MySQL provides the following two ways to update statistics:

Turn on innodb_stats_auto_recalc.

The system variable innodb_stats_auto_recalc determines whether the server automatically recalculates statistics. Its default value is ON, that is, the feature is enabled by default. Each table maintains a counter that records the number of rows inserted, deleted, and modified. If the number of changed records exceeds 10% of the table size and automatic recalculation is turned on, the server recalculates the statistics and updates the innodb_table_stats and innodb_index_stats tables. The recalculation happens asynchronously, however: even after the number of changed records exceeds 10%, it does not occur immediately and may be delayed by a few seconds.

As mentioned, InnoDB collects and stores statistics per table by default, so we can also control automatic recalculation for an individual table by specifying the STATS_AUTO_RECALC attribute when creating or modifying the table:

CREATE TABLE table_name (...) Engine=InnoDB, STATS_AUTO_RECALC = (1|0);

ALTER TABLE table_name Engine=InnoDB, STATS_AUTO_RECALC = (1|0);

When STATS_AUTO_RECALC=1, it means that we want the table to automatically recalculate the statistics. When STATS_AUTO_RECALC=0, it means that we don't want the table to automatically recalculate the statistics. If we do not specify the STATS_AUTO_RECALC attribute when creating the table, the value of the system variable innodb_stats_auto_recalc is used as the value of the attribute by default.

Manually call the ANALYZE TABLE statement to update statistics

If the value of the innodb_stats_auto_recalc system variable is OFF, we can also manually call the ANALYZE TABLE statement to recalculate the statistics. For example, we can update the statistics on the order_exp table like this:

ANALYZE TABLE order_exp;

The ANALYZE TABLE statement will immediately recalculate the statistical data, that is, the process is synchronous. This process may be particularly slow when there are many indexes in the table or there are too many sampled pages. It is best to run it when the business is not very busy.
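Because the statement is synchronous, its effect is visible immediately in the last_update column. The following sketch assumes the example table lijin.order_exp with persistent statistics enabled.

```sql
-- Statistics before the manual recalculation.
SELECT n_rows, last_update
FROM mysql.innodb_table_stats
WHERE database_name = 'lijin' AND table_name = 'order_exp';

-- Recalculate now; the statement returns only after the work is done.
ANALYZE TABLE lijin.order_exp;

-- last_update should now carry the time of the ANALYZE TABLE run.
SELECT n_rows, last_update
FROM mysql.innodb_table_stats
WHERE database_name = 'lijin' AND table_name = 'order_exp';
```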

Manually update the innodb_table_stats and innodb_index_stats tables

In fact, the innodb_table_stats and innodb_index_stats tables behave like ordinary tables, and we can insert, delete, update, and query their rows. This also means that we can manually change the statistics of a table or an index. For example, if we want to change the statistics about the number of rows in the order_exp table, we can do this:

Step 1: Update the innodb_table_stats table, for example by changing the n_rows value of the row for order_exp with an UPDATE statement.

Step 2: Let the MySQL query optimizer reload the data we have changed. Updating innodb_table_stats only modifies the rows of that table, so run the following command to make the optimizer pick up the change:

FLUSH TABLE order_exp;
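Put together, the two steps might look like the sketch below. The value 1 for n_rows is a hypothetical number chosen only to make the change visible, and the example assumes the table lives in the lijin database. This is meant for experiments on a test server, not for production use.

```sql
-- Step 1: overwrite the persisted row-count estimate (hypothetical value).
UPDATE mysql.innodb_table_stats
SET n_rows = 1
WHERE database_name = 'lijin'
  AND table_name    = 'order_exp';

-- Step 2: make the server reload the statistics for the table.
FLUSH TABLE lijin.order_exp;

-- Check the row estimate the server now reports for the table.
SHOW TABLE STATUS FROM lijin LIKE 'order_exp';

-- Recalculate real statistics afterwards to undo the manual change.
ANALYZE TABLE lijin.order_exp;
```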

Origin blog.csdn.net/m0_70299172/article/details/130549905