MySQL server monitoring

Monitoring is an important part of any system. The database is a core component of most systems, and its stability largely determines the stability of the system as a whole, so monitoring the database is particularly important. Common open source monitoring software includes Nagios and Zabbix. These tools either provide database monitoring plug-ins or let users write their own monitoring scripts and hook them in as plug-ins. They support a wide range of scripting languages, so you can pick the monitoring software you prefer and write the monitoring scripts in the language you are used to.

This chapter focuses on two questions: what should we monitor in a MySQL database, and how do we monitor it?

Once you know this, you can develop and deploy your own MySQL monitoring scripts regardless of which monitoring software you use.

For MySQL, the most basic monitoring should include the following:

Database availability monitoring (connect to the database over the network and determine whether it can serve external requests);
database performance monitoring (QPS, TPS, and the number of concurrent threads);
master-slave replication monitoring (replication link status, replication delay, and periodic checks that the data on the master and the slave is consistent);
server resource monitoring (disk space, since MySQL becomes unavailable if either the data directory or the log directory fills up, plus CPU usage, memory usage, swap usage, network usage, and I/O).
1. Database availability monitoring
First, let's look at how to confirm that the database can be reached over the network. Being able to connect to MySQL locally on the server through the socket file does not mean that you can connect to it over TCP/IP. To confirm that the database can be reached over the network, one of the following approaches is usually used:

Scheme 1: Run mysqladmin -umonitor_user -p -h ip ping on a remote server to verify that the monitored server accepts connections;
Scheme 2: Run telnet ip db_port to confirm manually that the monitored server can be reached;
Scheme 3: Have a program establish a database connection to verify that the monitored server can be reached over the network. This is the best approach.
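
The telnet check in Scheme 2 can be automated with a few lines of Python. This is only a minimal sketch: the function name can_connect and the default port 3306 are assumptions for illustration, and it only proves that the TCP port accepts connections. A full Scheme 3 check would log in with a real MySQL driver instead.

```python
import socket


def can_connect(host, port=3306, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened.

    This mirrors "telnet ip db_port": it checks network reachability only,
    not whether MySQL accepts a login or answers queries.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False
```

A monitoring plug-in could call can_connect("ip", db_port) on a schedule and report a failure when it returns False.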
Being able to connect to the database does not mean that the database is usable, so we also need to confirm that it can be read from and written to.

How do we confirm that the database can be read and written?

Periodically check that the read_only parameter is off on the master;
create a monitoring table and modify its data;
if you only need to check that the database is readable, run a simple query such as select @@version;

The number of threads that can connect to MySQL is limited. How do we monitor the number of connections to the database?

show variables like 'max_connections'; -- the maximum number of connections MySQL accepts
show global status like 'Threads_connected'; -- the status variable Threads_connected records the current number of connections to the database

For example, raise an alarm when Threads_connected / max_connections > 0.8.
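
The threshold rule can be sketched as a small helper. The function names and the idea of passing the two values in as plain numbers are assumptions for illustration; a real script would first read them with the two show statements above.

```python
def connection_usage(threads_connected, max_connections):
    """Return the fraction of the connection limit currently in use."""
    if max_connections <= 0:
        raise ValueError("max_connections must be positive")
    return threads_connected / max_connections


def should_alarm(threads_connected, max_connections, threshold=0.8):
    """Return True when usage is strictly above the threshold (0.8 in the text)."""
    return connection_usage(threads_connected, max_connections) > threshold
```

With max_connections at 151, should_alarm(121, 151) is True and should_alarm(120, 151) is False.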

2. Database performance monitoring
Performance monitoring differs from availability monitoring in that it is more concerned with how database performance changes over time. When developing performance monitoring scripts, make sure to record the database status collected during monitoring so that it can be used later to analyze performance trends.

For performance monitoring, the metrics people care about most are probably QPS and TPS.

QPS = (Queries2 - Queries1) / (Uptime_since_flush_status2 - Uptime_since_flush_status1)

TPS = ((Com_insert2 + Com_update2 + Com_delete2) - (Com_insert1 + Com_update1 + Com_delete1)) /
(Uptime_since_flush_status2 - Uptime_since_flush_status1)

Here the suffixes 1 and 2 denote the values from two consecutive samples. The values are obtained with:

show global status like 'Queries';
show global status like 'Uptime_since_flush_status';
show global status like 'Com_insert';
show global status like 'Com_update';
show global status like 'Com_delete';
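
The two formulas can be turned into a small function that takes two samples of these status values. This is a sketch under assumptions: the function name qps_tps and the dictionary layout (one dict per sample, keyed by status variable name) are made up for illustration, and values are converted with int() because show status returns them as strings.

```python
def qps_tps(sample1, sample2):
    """Compute (QPS, TPS) from two samples of SHOW GLOBAL STATUS values.

    Each sample is a dict keyed by status variable name; sample2 must be
    taken later than sample1.
    """
    elapsed = (int(sample2["Uptime_since_flush_status"])
               - int(sample1["Uptime_since_flush_status"]))
    if elapsed <= 0:
        raise ValueError("sample2 must be taken after sample1")

    def writes(sample):
        # Insert, update and delete statements counted so far.
        return (int(sample["Com_insert"]) + int(sample["Com_update"])
                + int(sample["Com_delete"]))

    qps = (int(sample2["Queries"]) - int(sample1["Queries"])) / elapsed
    tps = (writes(sample2) - writes(sample1)) / elapsed
    return qps, tps
```

Storing each sample together with its collection time also gives the history needed for the trend analysis mentioned earlier.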

How do we monitor the number of concurrent requests to the database?

Typically, database performance decreases as the number of concurrently processed requests increases. The number of concurrent requests therefore usually needs to be analyzed together with CPU usage and other metrics.

The current number of concurrent requests can be obtained with show global status like 'Threads_running'. It is usually much smaller than the number of threads connected to the database at the same time.

Normally the number of concurrent requests is very stable. If it suddenly increases over some period, check whether something is wrong with the database; heavy blocking, for example, is very likely to cause this.
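
One simple way to notice such a sudden increase is to compare the newest Threads_running sample with the average of the recent ones. The sketch below is only one possible heuristic: the function name is_spike, the factor of 3 and the minimum of 5 samples are all assumed values, not recommendations from this article.

```python
from statistics import mean


def is_spike(history, current, factor=3.0, min_samples=5):
    """Return True if current is more than factor times the recent average.

    history is a list of earlier Threads_running samples. With fewer than
    min_samples entries there is no reliable baseline, so no spike is reported.
    """
    if len(history) < min_samples:
        return False
    return current > mean(history) * factor
```

A spike reported this way is a prompt to look at CPU usage and blocking, not a diagnosis in itself.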

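How do we monitor blocking in InnoDB?

Find SQL whose blocking time exceeds 20 seconds (the information_schema.innodb_lock_waits table used here exists in MySQL 5.x and was removed in MySQL 8.0):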

SELECT b.trx_mysql_thread_id AS 'blocked thread'
, b.trx_query AS 'blocked SQL'
, c.trx_mysql_thread_id AS 'blocking thread'
, c.trx_query AS 'blocking SQL'
, (UNIX_TIMESTAMP() - UNIX_TIMESTAMP(c.trx_started)) AS 'blocking time'
FROM information_schema.innodb_lock_waits a
JOIN information_schema.innodb_trx b ON a.requesting_trx_id = b.trx_id
JOIN information_schema.innodb_trx c ON a.blocking_trx_id = c.trx_id
WHERE (UNIX_TIMESTAMP() - UNIX_TIMESTAMP(c.trx_started)) > 20;

3. Master-slave replication monitoring
When the database architecture relies heavily on MySQL's master-slave replication, monitoring replication becomes essential.

How do we monitor the state of the master-slave replication link?

Monitoring master-slave replication relies on one basic command: show slave status.
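
As a sketch of what a script might check in that output: the Slave_IO_Running and Slave_SQL_Running fields report whether the two replication threads are running, and Last_IO_Error and Last_SQL_Error carry the most recent error text. The function name replication_problems and the idea of passing one result row in as a dict are assumptions for illustration.

```python
def replication_problems(slave_status):
    """Return a list of problems found in one SHOW SLAVE STATUS row.

    slave_status is a dict of field name to value. An empty list means
    both replication threads are running and no error is recorded.
    """
    problems = []
    for thread in ("Slave_IO_Running", "Slave_SQL_Running"):
        state = slave_status.get(thread)
        if state != "Yes":
            problems.append(f"{thread} is {state!r}, expected 'Yes'")
    for field in ("Last_IO_Error", "Last_SQL_Error"):
        message = slave_status.get(field)
        if message:
            problems.append(f"{field}: {message}")
    return problems
```

Any non-empty result means the replication link needs attention before the delay is even worth measuring.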

How do we monitor master-slave replication delay?

There is always some delay between the servers taking part in master-slave replication. Normally it is very small, well under one second, so the impact on applications is minor, especially for applications that are not sensitive to replication delay. If for some reason a large delay builds up between master and slave, it will affect normal use of the application, so replication delay needs to be monitored.

If the delay between the servers keeps growing, we need to investigate, find the cause, and fix it. Under normal circumstances, replication delay can be monitored with the following method:

Use the information returned by the show slave status command:

Master_SSL_Cipher:
Master_SSL_Key:
Seconds_Behind_Master: 0
Master_SSL_Verify_Server_Cert: No
Last_IO_Errno: 0
Last_IO_Error:
Last_SQL_Errno: 0

Seconds_Behind_Master is the master-slave replication delay in seconds. It is easy to obtain, but the result is not always accurate, because the value is derived from the difference between the timestamps of the binlog events that have already been transferred to the slave and the time at which the slave replays them. Several situations can make it misleading. For example, when a network problem occurs, a large amount of binlog on the master may not yet have been transferred to the slave, while everything that has been transferred has already been replayed. In that case there is a considerable delay between master and slave, but show slave status does not reveal it.

So to measure the delay more accurately, we need another method:

This method requires a multithreaded program that checks the status of the master and slave servers at the same time. Run show master status on the master to get its current binary log file name and offset:

mysql> show master status \G
*************************** 1. row ***************************
File: mysql-bin.001099
Position: 302055050

Run show slave status on the slave to get the binary log file name and offset that the slave has received from the master:

Master_Log_File: mysql-bin.001099
Read_Master_Log_Pos: 301855050

Also from show slave status on the slave, get the offset in the master's binary log up to which the slave has finished executing:

Exec_Master_Log_Pos: 301855050
Relay_Log_Space: 301855350

By comparing these three pieces of information, we can tell whether there is a large delay between master and slave. If the file names (File and Master_Log_File) and the offsets (Position and Read_Master_Log_Pos) are the same, the slave has received everything the master has written; if Exec_Master_Log_Pos also matches, the slave has applied all of it and there is currently no delay.
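
The comparison can be sketched as a function over the two result rows. The function name replication_gap and the dict layout are assumptions for illustration. It also reads Relay_Master_Log_File, a show slave status field not shown in the excerpts above that names the master binlog file Exec_Master_Log_Pos refers to.

```python
def replication_gap(master_status, slave_status):
    """Return (bytes_not_received, bytes_not_applied).

    master_status is one SHOW MASTER STATUS row, slave_status one
    SHOW SLAVE STATUS row, both as dicts. An entry is None when the two
    positions are in different binlog files, because offsets from
    different files cannot be subtracted.
    """
    def gap(file_a, pos_a, file_b, pos_b):
        if file_a != file_b:
            return None
        return int(pos_a) - int(pos_b)

    not_received = gap(master_status["File"], master_status["Position"],
                       slave_status["Master_Log_File"],
                       slave_status["Read_Master_Log_Pos"])
    not_applied = gap(slave_status["Master_Log_File"],
                      slave_status["Read_Master_Log_Pos"],
                      slave_status["Relay_Master_Log_File"],
                      slave_status["Exec_Master_Log_Pos"])
    return not_received, not_applied
```

With the sample values above, the slave has applied everything it received but is 200000 bytes of binlog behind the master, a delay that Seconds_Behind_Master: 0 would hide.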

After every repair of master-slave replication, the consistency of the data on the master and the slave should be checked. So how do we verify that the data is consistent?

This is where pt-table-checksum, from the MySQL toolset released by Percona, comes in:

pt-table-checksum u=db_user,p='db_password' \
--databases mysql \
--replicate test.checksums

The --databases parameter specifies the databases to check, and the --replicate parameter specifies that a checksums table is created in the test database and that the check results are written to it. Note that this command only needs to be run on the master; it automatically discovers all slaves of that master and checks the specified databases on every one of them.

The account it uses can be created as follows:

GRANT SELECT,PROCESS,SUPER,REPLICATION SLAVE ON *.* TO 'db_user'@'ip' IDENTIFIED BY 'db_passwor


Origin www.cnblogs.com/shixiuxian/p/11223352.html