PostgreSQL lock wait troubleshooting

Overview

Databases commonly use locks and MVCC to ensure transaction consistency and improve concurrency. Locating and troubleshooting lock problems is therefore an essential skill for database operations staff. This article shows how to troubleshoot lock blocking problems in PostgreSQL.

1. PostgreSQL locks

For an introduction to PostgreSQL lock modes, the official documentation is the recommended reading.

2. Deadlock problem

If business logic is not designed carefully, it can cause serious lock waits or deadlocks. PostgreSQL detects deadlocks automatically and resolves them by aborting one of the transactions involved, so that the others can proceed. Consider the following case:

postgres=# select * from pgbench_accounts where aid in (3, 4, 5);
 aid | bid | abalance |                                        filler                                        
-----+-----+----------+--------------------------------------------------------------------------------------
   3 |   1 |        0 |                                                                                     
   4 |   1 |        0 |                                                                                     
   5 |   1 |        0 |                                                                                     
(3 rows)

 Transaction 1                                           | Transaction 2
---------------------------------------------------------+--------------------------------------------------------------------
 begin;                                                  | begin;
 update pgbench_accounts set abalance = 1 where aid = 3; | update pgbench_accounts set abalance = 1 where aid = 4;
 -                                                       | update pgbench_accounts set abalance = 1 where aid = 3; -- blocked
 update pgbench_accounts set abalance = 1 where aid = 4; | (update executed successfully)
 commit; -- rolled back automatically                    | commit;

ERROR: deadlock detected
DETAIL: Process 127589 waits for ShareLock on transaction 702; blocked by process 127570.
Process 127570 waits for ShareLock on transaction 701; blocked by process 127589.
HINT: See server log for query details.
CONTEXT: while updating tuple (0,4) in relation "pgbench_accounts"
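
The deadlock arises because the two sides update rows 3 and 4 in opposite order. One common way to avoid it is to have every transaction lock the rows it needs in the same order before updating them. The sketch below is only an assumption about how the business logic could be restructured, not something the case above prescribes:

```sql
begin;
-- Lock both rows up front, in ascending aid order. A second transaction that
-- runs the same statement waits here instead of deadlocking later.
select aid
  from pgbench_accounts
 where aid in (3, 4)
 order by aid
   for update;
update pgbench_accounts set abalance = 1 where aid = 3;
update pgbench_accounts set abalance = 1 where aid = 4;
commit;
```

With this pattern the second transaction is blocked on the first row until the first one commits, so a wait remains but the cycle cannot form.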

PostgreSQL has an automatic deadlock detection mechanism. As in MySQL, one of the transactions involved is rolled back automatically once a deadlock is detected. The check runs only after a session has waited for a lock for longer than deadlock_timeout, which defaults to 1s:

select current_setting('deadlock_timeout');

In MySQL, deadlock information can be obtained through show engine innodb status. In PostgreSQL, lock wait and deadlock details are written to the server log when the log_lock_waits parameter is enabled:

2023-07-03 16:39:26.037 CST [127589] LOG: process 127589 detected deadlock while waiting for ShareLock on transaction 704 after 1000.332 ms
2023-07-03 16:39:26.037 CST [127589] DETAIL: Process holding the lock: 14138. Wait queue: .
2023-07-03 16:39:26.037 CST [127589] CONTEXT: while updating tuple (0,4) in relation "pgbench_accounts"
2023-07-03 16:39:26.037 CST [127589] STATEMENT: update pgbench_accounts set abalance = 1 where aid = 4;
2023-07-03 16:39:26.037 CST [127589] ERROR: deadlock detected
2023-07-03 16:39:26.037 CST [127589] DETAIL: Process 127589 waits for ShareLock on transaction 704; blocked by process 14138.
Process 14138 waits for ShareLock on transaction 703; blocked by process 127589.
Process 127589: update pgbench_accounts set abalance = 1 where aid = 4;
Process 14138: update pgbench_accounts set abalance = 1 where aid = 3;
2023-07-03 16:39:26.037 CST [127589] HINT: See server log for query details.
2023-07-03 16:39:26.037 CST [127589] CONTEXT: while updating tuple (0,4) in relation "pgbench_accounts"
2023-07-03 16:39:26.037 CST [127589] STATEMENT: update pgbench_accounts set abalance = 1 where aid = 4;
2023-07-03 16:39:26.038 CST [14138] LOG: process 14138 acquired ShareLock on transaction 703 after 25470.277 ms
2023-07-03 16:39:26.038 CST [14138] CONTEXT: while updating tuple (0,3) in relation "pgbench_accounts"
2023-07-03 16:39:26.038 CST [14138] STATEMENT: update pgbench_accounts set abalance = 1 where aid = 3;

3. Lock problem monitoring

3.1 pg_stat_activity

This view shows the sessions connected to the database, much like processlist in MySQL. Combined with pg_blocking_pids(), it can also be used to observe lock waits:

select pid, 
	   pg_blocking_pids(pid),
	   wait_event_type,wait_event,
	   query 
from pg_stat_activity;
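
 Session 1                                   | Session 2
---------------------------------------------+----------------------------------------------------------
 begin;                                      |
 delete from pgbench_accounts where aid = 3; |
                                             | select * from pgbench_accounts where aid = 3 for update;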

At this time, Session 2 is blocked by Session 1. Use pg_stat_activity to query:

  pid   | pg_blocking_pids | wait_event_type |     wait_event      |                          query                           
--------+------------------+-----------------+---------------------+----------------------------------------------------------
 127589 | {}               | Client          | ClientRead          | delete from pgbench_accounts where aid = 3;
  14138 | {127589}         | Lock            | transactionid       | select * from pgbench_accounts where aid = 3 for update;
 129038 | {}               |                 |                     | select pid,                                             +
        |                  |                 |                     |    pg_blocking_pids(pid),                               +
        |                  |                 |                     |    wait_event_type,wait_event,                          +
        |                  |                 |                     |    query                                                +
        |                  |                 |                     | from pg_stat_activity;

You can clearly see that the select...for update statement is blocked by the session with pid = 127589.
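
On a busy server the full session list is noisy. The following sketch narrows the output to waiting sessions and puts each blocker's state and last statement next to the statement it is blocking. The column aliases are illustrative, and pg_blocking_pids() requires PostgreSQL 9.6 or later:

```sql
-- One row per (waiting session, blocking session) pair.
select w.pid                 as waiting_pid,
       now() - w.query_start as waiting_query_runtime,
       w.query               as waiting_query,
       b.pid                 as blocking_pid,
       b.state               as blocking_state,
       b.query               as blocking_query
  from pg_stat_activity w
 cross join lateral unnest(pg_blocking_pids(w.pid)) as blk(pid)
  join pg_stat_activity b on b.pid = blk.pid
 order by w.query_start;
```

For a blocker in the idle in transaction state, the query column holds the last statement it ran, which is not necessarily the one that took the lock.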

3.2 Lock blocking view

Because this SQL is very long, it is best to create it as a view once and query the view afterwards:

create view v_locks_monitor as   
with    
t_wait as    
(    
  select a.mode,a.locktype,a.database,a.relation,a.page,a.tuple,a.classid,a.granted,   
  a.objid,a.objsubid,a.pid,a.virtualtransaction,a.virtualxid,a.transactionid,a.fastpath,    
  b.state,b.query,b.xact_start,b.query_start,b.usename,b.datname,b.client_addr,b.client_port,b.application_name   
    from pg_locks a,pg_stat_activity b where a.pid=b.pid and not a.granted   
),   
t_run as   
(   
  select a.mode,a.locktype,a.database,a.relation,a.page,a.tuple,a.classid,a.granted,   
  a.objid,a.objsubid,a.pid,a.virtualtransaction,a.virtualxid,a.transactionid,a.fastpath,   
  b.state,b.query,b.xact_start,b.query_start,b.usename,b.datname,b.client_addr,b.client_port,b.application_name   
    from pg_locks a,pg_stat_activity b where a.pid=b.pid and a.granted   
),   
t_overlap as   
(   
  select r.* from t_wait w join t_run r on   
  (   
    r.locktype is not distinct from w.locktype and   
    r.database is not distinct from w.database and   
    r.relation is not distinct from w.relation and   
    r.page is not distinct from w.page and   
    r.tuple is not distinct from w.tuple and   
    r.virtualxid is not distinct from w.virtualxid and   
    r.transactionid is not distinct from w.transactionid and   
    r.classid is not distinct from w.classid and   
    r.objid is not distinct from w.objid and   
    r.objsubid is not distinct from w.objsubid and   
    r.pid <> w.pid   
  )    
),    
t_unionall as    
(    
  select r.* from t_overlap r    
  union all    
  select w.* from t_wait w    
)    
select locktype,datname,relation::regclass,page,tuple,virtualxid,transactionid::text,classid::regclass,objid,objsubid,   
string_agg(   
'Pid: '||case when pid is null then 'NULL' else pid::text end||chr(10)||   
'Lock_Granted: '||case when granted is null then 'NULL' else granted::text end||' , Mode: '||case when mode is null then 'NULL' else mode::text end||' , FastPath: '||case when fastpath is null then 'NULL' else fastpath::text end||' , VirtualTransaction: '||case when virtualtransaction is null then 'NULL' else virtualtransaction::text end||' , Session_State: '||case when state is null then 'NULL' else state::text end||chr(10)||   
'Username: '||case when usename is null then 'NULL' else usename::text end||' , Database: '||case when datname is null then 'NULL' else datname::text end||' , Client_Addr: '||case when client_addr is null then 'NULL' else client_addr::text end||' , Client_Port: '||case when client_port is null then 'NULL' else client_port::text end||' , Application_Name: '||case when application_name is null then 'NULL' else application_name::text end||chr(10)||    
'Xact_Start: '||case when xact_start is null then 'NULL' else xact_start::text end||' , Query_Start: '||case when query_start is null then 'NULL' else query_start::text end||' , Xact_Elapse: '||case when (now()-xact_start) is null then 'NULL' else (now()-xact_start)::text end||' , Query_Elapse: '||case when (now()-query_start) is null then 'NULL' else (now()-query_start)::text end||chr(10)||    
'SQL (Current SQL in Transaction): '||chr(10)||  
case when query is null then 'NULL' else query::text end,    
chr(10)||'--------'||chr(10)    
order by    
  (  case mode    
    when 'INVALID' then 0   
    when 'AccessShareLock' then 1   
    when 'RowShareLock' then 2   
    when 'RowExclusiveLock' then 3   
    when 'ShareUpdateExclusiveLock' then 4   
    when 'ShareLock' then 5   
    when 'ShareRowExclusiveLock' then 6   
    when 'ExclusiveLock' then 7   
    when 'AccessExclusiveLock' then 8   
    else 0   
  end  ) desc,   
  (case when granted then 0 else 1 end)  
) as lock_conflict  
from t_unionall   
group by   
locktype,datname,relation,page,tuple,virtualxid,transactionid::text,classid,objid,objsubid ;

Using the same lock wait scenario constructed in section 3.1, query the view to see the effect:

select * from v_locks_monitor;
-[ RECORD 1 ]-+------------------------------------------------------------------------------------------------------------------------------------------------------
locktype      | transactionid
datname       | postgres
relation      | 
page          | 
tuple         | 
virtualxid    | 
transactionid | 705
classid       | 
objid         | 
objsubid      | 
lock_conflict | Pid: 127589                                                                                                                                          +
              | Lock_Granted: true , Mode: ExclusiveLock , FastPath: false , VirtualTransaction: 3/382737 , Session_State: idle in transaction                       +
              | Username: postgres , Database: postgres , Client_Addr: NULL , Client_Port: -1 , Application_Name: psql                                               +
              | Xact_Start: 2023-07-03 16:51:06.570461+08 , Query_Start: 2023-07-03 16:51:27.61686+08 , Xact_Elapse: 00:13:38.834827 , Query_Elapse: 00:13:17.788428 +
              | SQL (Current SQL in Transaction):                                                                                                                    +
              | delete from pgbench_accounts where aid = 3;                                                                                                          +
              | --------                                                                                                                                             +
              | Pid: 14138                                                                                                                                           +
              | Lock_Granted: false , Mode: ShareLock , FastPath: false , VirtualTransaction: 4/43636 , Session_State: active                                        +
              | Username: postgres , Database: postgres , Client_Addr: NULL , Client_Port: -1 , Application_Name: psql                                               +
              | Xact_Start: 2023-07-03 16:56:05.622813+08 , Query_Start: 2023-07-03 16:56:05.622813+08 , Xact_Elapse: 00:08:39.782475 , Query_Elapse: 00:08:39.782475+
              | SQL (Current SQL in Transaction):                                                                                                                    +
              | select * from pgbench_accounts where aid = 3 for update;

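Within each lock object, the view lists entries by lock mode, strongest first, with granted locks ahead of waiting ones. To clear the blockage quickly, terminate the PID that holds the strongest granted lock (127589 here):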

select pg_terminate_backend(127589);
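
Instead of copying the PID by hand, the blocker can be looked up from the waiting session. This is a sketch that assumes the waiting PID 14138 from the example and terminates only blockers that are sitting idle inside a transaction:

```sql
-- Terminate every session that blocks PID 14138 and is idle in a transaction.
select pid,
       pg_terminate_backend(pid) as terminated
  from pg_stat_activity
 where pid = any (pg_blocking_pids(14138))
   and state = 'idle in transaction';
```

When the blocker is still executing a statement, pg_cancel_backend(pid) is the gentler option: it cancels the running statement and leaves the connection open.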

4. Lock related parameters

4.1 deadlock_timeout

The default is 1s. PostgreSQL runs deadlock detection only after a session has waited for a lock for longer than this value.

4.2 log_lock_waits

Off by default. When it is on, a message is written to the log whenever a session waits longer than deadlock_timeout to acquire a lock.
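
A minimal sketch of turning the parameter on for the whole instance without a restart. It assumes a superuser connection; the same setting could instead be placed in postgresql.conf:

```sql
-- Persist the setting in postgresql.auto.conf.
alter system set log_lock_waits = on;
-- Reload the configuration so the change takes effect.
select pg_reload_conf();
-- Check the value the server is now using.
show log_lock_waits;
```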

4.3 lock_timeout

Lock wait timeout. The default is 0, which disables it. When set to a value greater than 0 (in milliseconds if no unit is given), a statement that has waited that long without obtaining the lock it needs is aborted.
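
As an illustration, the timeout can be limited to a single transaction with set local. The 3s value is an arbitrary choice for this sketch:

```sql
begin;
-- set local confines the setting to the current transaction.
set local lock_timeout = '3s';
select * from pgbench_accounts where aid = 3 for update;
-- If the row lock is not obtained within 3 seconds, the statement fails with:
-- ERROR:  canceling statement due to lock timeout
rollback;
```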

4.4 idle_in_transaction_session_timeout

How long a session may stay idle inside an open transaction before it is terminated automatically. The unit is milliseconds. The default is 0, which disables the timeout. Terminating the session releases all locks it holds.
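
A sketch of two ways to apply the timeout. The 5min value and the role name app_user are hypothetical and are not taken from the environment described above:

```sql
-- Session level: applies to the current connection only.
set idle_in_transaction_session_timeout = '5min';

-- Role level: becomes the default for new sessions of the hypothetical role app_user.
alter role app_user set idle_in_transaction_session_timeout = '5min';
```

Setting it per role or per database keeps the limit away from maintenance sessions that may legitimately hold a transaction open for longer.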

Origin blog.csdn.net/qq_42768234/article/details/131516728