PostgreSQL INSERT (DML) execution process analysis

 


Table of contents

Foreword

Overall Process

Call Stack

Execution Interface Description

Detailed Process Decomposition

ExecInsert for inserts into ordinary tables

Tuple insertion via table_tuple_insert (heapam_tuple_insert)

The heap_insert process

Brief description of the index insertion process

End


Foreword

This article analyzes and explains the PostgreSQL 15 source code; the demonstrations were run on CentOS 8.


Overall Process

(1) If the target is a partitioned table, find the partition the row belongs to;

(2) If the target table has indexes, open them;

(3) Fire BEFORE ROW INSERT triggers;

(4) Fire INSTEAD OF ROW INSERT triggers;

(5) If the target is a foreign table (FDW, foreign-data wrapper), hand the insert to the FDW interface, which is handled separately;

(6) Check the tuple against the table's constraints and insert it into a buffer page;

(7) If the insert results from an UPDATE that changed the partition key of a partitioned table and moved the tuple to a new partition, handle it as a delete from the old partition plus an insert into the new one;

(8) Fire AFTER ROW INSERT triggers;

(9) Evaluate WITH CHECK OPTION conditions;

(10) Build the RETURNING result.

Call Stack

ExecInsert(ModifyTableState * mtstate, ResultRelInfo * resultRelInfo, TupleTableSlot * slot, TupleTableSlot * planSlot, EState * estate, _Bool canSetTag)

ExecModifyTable(PlanState * pstate)

ExecProcNode(PlanState * node)

ExecutePlan(_Bool execute_once, DestReceiver * dest, ScanDirection direction, uint64 numberTuples, CmdType operation, _Bool use_parallel_mode, PlanState * planstate, EState * estate)

standard_ExecutorRun(QueryDesc * queryDesc, ScanDirection direction, uint64 count, _Bool execute_once)

ProcessQuery(PlannedStmt * plan, const char * sourceText, ParamListInfo params, QueryEnvironment * queryEnv, DestReceiver * dest, QueryCompletion * qc)

PortalRunMulti(Portal portal, _Bool isTopLevel, _Bool setHoldSnapshot, DestReceiver * dest, DestReceiver * altdest, QueryCompletion * qc)

PortalRun(Portal portal, long count, _Bool isTopLevel, _Bool run_once, DestReceiver * dest, DestReceiver * altdest, QueryCompletion * qc)

exec_simple_query(const char * query_string)

PostgresMain(const char * dbname, const char * username)

BackendRun()

BackendStartup()

ServerLoop()

PostmasterMain(int argc, char ** argv)

main(int argc, char ** argv)

The stack above lists frames from the innermost (ExecInsert) to the outermost (main). After the SQL is parsed and the execution plan is generated, PortalRun starts the execution phase, and ExecutePlan eventually runs the plan node by node. Plan nodes come in different types, such as merge and sort (see the execution plan articles in this column), and each type has its own execution routine. For INSERT, the work is done mainly by ExecInsert, which is called from ExecModifyTable.
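The per-node-type dispatch described above can be pictured with a small stand-alone model. This is only a sketch: every `Toy*` type and `toy_*` function is hypothetical and merely mirrors the roles of ExecutePlan, ExecProcNode, and ExecModifyTable; it is not PostgreSQL code.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only: these types are NOT PostgreSQL's real structs. */
typedef struct ToyPlanState ToyPlanState;
typedef int (*ToyExecProc)(ToyPlanState *node);

struct ToyPlanState {
    ToyExecProc exec_proc;  /* execution callback chosen by node type */
    int rows_left;          /* rows this toy node still has to process */
    int rows_done;          /* rows processed so far */
};

/* Plays the role of ExecModifyTable: handle one row per call. */
static int toy_exec_modify_table(ToyPlanState *node)
{
    if (node->rows_left == 0)
        return 0;           /* nothing more to do */
    node->rows_left--;
    node->rows_done++;      /* this is where ExecInsert would run */
    return 1;
}

/* Plays the role of ExecProcNode: dispatch through the callback. */
static int toy_exec_proc_node(ToyPlanState *node)
{
    assert(node != NULL && node->exec_proc != NULL);
    return node->exec_proc(node);
}

/* Plays the role of ExecutePlan: pull until the node is exhausted. */
static int toy_execute_plan(ToyPlanState *top)
{
    int calls = 0;
    while (toy_exec_proc_node(top))
        calls++;
    return calls;
}
```

A caller would initialize a `ToyPlanState` with `toy_exec_modify_table` as its callback and pass it to `toy_execute_plan`; other node types would simply install a different callback.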

Execution Interface Description

static TupleTableSlot *

ExecInsert(ModifyTableContext *context,

           ResultRelInfo *resultRelInfo,

           TupleTableSlot *slot,

           bool canSetTag,

           TupleTableSlot **inserted_tuple,

           ResultRelInfo **insert_destrel)

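slot: holds the tuple to be inserted;

inserted_tuple: output parameter; receives the tuple after a successful insertion;

insert_destrel: output parameter; receives the relation the new tuple was actually inserted into, which may be a partition rather than the table named in the statement;

Return value: the RETURNING result if there is one, otherwise NULL. NULL is also returned when no row is inserted, for example when a BEFORE ROW trigger suppresses the insert or ON CONFLICT DO NOTHING applies; genuine failures are reported as errors rather than through the return value.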

Detailed Process Decomposition

PostgreSQL handles an insert in layers; two of them are involved here:

The first is the executor layer, whose entry point is ExecInsert.

The second is the table access method layer. The executor calls table_tuple_insert, which dispatches to the access method of the target table; for heap tables this is heapam_tuple_insert in the heapam layer. This indirection is the extension point that allows PostgreSQL to support multiple table types.
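The two-layer split can be illustrated with a table of function pointers. The sketch below is a simplified model under stated assumptions: `ToyTableAm`, `ToyRelation`, and the `toy_*` functions are hypothetical names, and the real access method interface carries many more callbacks than the single one shown.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only: a drastically reduced stand-in for a table access method. */
typedef struct ToyRelation ToyRelation;

typedef struct ToyTableAm {
    const char *name;
    void (*tuple_insert)(ToyRelation *rel, int value);
} ToyTableAm;

struct ToyRelation {
    const ToyTableAm *am;   /* which access method stores this table */
    int values[8];
    int nvalues;
};

/* Heap-specific implementation (the role heapam_tuple_insert plays). */
static void toy_heap_tuple_insert(ToyRelation *rel, int value)
{
    assert(rel->nvalues < 8);
    rel->values[rel->nvalues++] = value;
}

static const ToyTableAm toy_heap_am = { "toy_heap", toy_heap_tuple_insert };

/* Executor-facing wrapper (the role table_tuple_insert plays):
 * it knows nothing about the storage format, it only dispatches. */
static void toy_table_tuple_insert(ToyRelation *rel, int value)
{
    rel->am->tuple_insert(rel, value);
}
```

Adding another table type in this model means defining another `ToyTableAm` instance; the executor-side wrapper stays unchanged.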

  • ExecInsert for inserts into ordinary tables:

(1) If the table has generated columns, compute their values;

(2) Determine which kind of WITH CHECK OPTION applies and evaluate it;

(3) Check the table's constraints;

(4) Check the partition constraint to confirm that the tuple satisfies the partition's conditions;

(5) If a conflict-handling strategy (ON CONFLICT) is specified, check for conflicts and handle them;

(6) If there is no conflict-handling strategy, insert the tuple and then its index entries in the normal way;
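The ordering of these steps can be sketched with a toy table. Everything here is hypothetical: the `ToyRow` layout, the `CHECK (qty > 0)` rule, and the status codes are assumptions made for illustration, and real PostgreSQL raises an error on a constraint violation instead of returning a code. Steps (2) and (4) are omitted, and in the plain path uniqueness is left to the index insertion step, which this sketch does not model.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: none of these names exist in PostgreSQL. */
typedef struct ToyRow {
    int id;
    int price;
    int qty;
    int total;              /* generated column: price * qty */
} ToyRow;

typedef enum ToyResult {
    TOY_INSERTED,
    TOY_CONFLICT_SKIPPED,   /* ON CONFLICT DO NOTHING applied */
    TOY_CHECK_VIOLATION     /* real code raises an error instead */
} ToyResult;

typedef struct ToyTable {
    ToyRow rows[16];
    int nrows;
    bool on_conflict_do_nothing;
} ToyTable;

static bool toy_id_exists(const ToyTable *t, int id)
{
    for (int i = 0; i < t->nrows; i++)
        if (t->rows[i].id == id)
            return true;
    return false;
}

static ToyResult toy_exec_insert(ToyTable *t, ToyRow row)
{
    /* (1) compute generated columns */
    row.total = row.price * row.qty;

    /* (3) check constraints; the toy table has CHECK (qty > 0) */
    if (row.qty <= 0)
        return TOY_CHECK_VIOLATION;

    /* (5) a conflict strategy is present: look for a conflicting row first */
    if (t->on_conflict_do_nothing && toy_id_exists(t, row.id))
        return TOY_CONFLICT_SKIPPED;

    /* (6) normal path: store the row (index entries would follow) */
    assert(t->nrows < 16);
    t->rows[t->nrows++] = row;
    return TOY_INSERTED;
}
```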

  • Tuple insertion via table_tuple_insert (heapam_tuple_insert):

(1) Materialize the data held in the slot into a heap tuple to be inserted;

(2) Get the table OID;

(3) Call heap_insert, which finds free space, places the tuple on a page, writes WAL, and so on;

(4) Copy the TID of the inserted tuple back into the slot so that the index entries can be inserted later;

(5) If memory was allocated for the tuple, free it here;
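The five steps above amount to a short hand-off between the slot and the heap. The following is a minimal sketch with hypothetical `Toy*` types; `toy_heap_insert` is a stub that only assigns a TID, standing in for the real heap_insert.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Toy model only: simplified stand-ins for a TID, a heap tuple, and a slot. */
typedef struct ToyTid { unsigned block; unsigned offset; } ToyTid;

typedef struct ToyTuple {
    ToyTid   self;          /* where the tuple ended up */
    unsigned table_oid;
    int      data;
} ToyTuple;

typedef struct ToySlot {
    int      data;          /* the row the executor wants to insert */
    unsigned table_oid;
    ToyTid   tid;           /* filled in after the insert */
    bool     tid_valid;
} ToySlot;

typedef struct ToyRel {
    unsigned oid;
    unsigned next_offset;   /* next free item position on block 0 */
} ToyRel;

/* Stub for heap_insert: just assigns the next TID on block 0. */
static void toy_heap_insert(ToyRel *rel, ToyTuple *tup)
{
    tup->self.block = 0;
    tup->self.offset = rel->next_offset++;
}

static void toy_tuple_insert(ToyRel *rel, ToySlot *slot)
{
    /* (1) materialize the slot contents into a tuple */
    ToyTuple *tup = malloc(sizeof *tup);
    assert(tup != NULL);
    tup->data = slot->data;

    /* (2) record the table OID on slot and tuple */
    slot->table_oid = rel->oid;
    tup->table_oid = slot->table_oid;

    /* (3) do the physical insert */
    toy_heap_insert(rel, tup);

    /* (4) copy the TID back so index entries can point at the tuple */
    slot->tid = tup->self;
    slot->tid_valid = true;

    /* (5) release the temporary tuple */
    free(tup);
}
```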

  • The heap_insert process:

(1) Prepare the tuple: fill in header fields such as infomask, xmin, and xmax on the assembled tuple; if the tuple can be toasted and exceeds the TOAST threshold, build the toasted version of the tuple;

(2) Find a buffer with enough free space for the tuple; if the page is marked all-visible, also pin the corresponding visibility map (vm) page (see the article in this column on finding free space);

(3) Check for serializable-isolation conflicts;

(4) Enter the critical section and place the tuple on the page in the buffer;

(5) Update visibility information: if the page was all-visible, clear that flag and the visibility map bit;

(6) Mark the buffer dirty;

(7) Write the WAL record, then leave the critical section;

(8) Release the buffer and the visibility map buffer (vmbuffer);

(9) If the tuple belongs to a system catalog, register a cache invalidation;

(10) Copy t_self (the tuple's TID) back to the caller's tuple and free the private copy of the tuple if one was made;
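The core of the heap_insert sequence (find space, enter the critical section, place the tuple, update visibility, dirty the buffer, log, leave the critical section) can be modeled in a few lines. This is a sketch under strong simplifications: the page layout, the in-memory WAL array, and all `Toy*` / `toy_*` names are hypothetical, and where the toy gives up on a full heap the real code would extend the relation.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define TOY_PAGE_SIZE 64
#define TOY_NPAGES    2
#define TOY_MAX_WAL   8

/* Toy model only: a page is a byte array plus a few flags. */
typedef struct ToyPage {
    char data[TOY_PAGE_SIZE];
    int  used;              /* bytes already occupied */
    int  nitems;            /* tuples stored on the page */
    bool dirty;
    bool all_visible;
} ToyPage;

typedef struct ToyWalRecord { int block; int offset; int len; } ToyWalRecord;

typedef struct ToyHeap {
    ToyPage      pages[TOY_NPAGES];
    ToyWalRecord wal[TOY_MAX_WAL];
    int          nwal;
    int          crit_depth; /* > 0 while inside a critical section */
} ToyHeap;

/* Step (2): first page with enough free space, or -1 if none. */
static int toy_get_page_for_tuple(const ToyHeap *h, int len)
{
    for (int b = 0; b < TOY_NPAGES; b++)
        if (TOY_PAGE_SIZE - h->pages[b].used >= len)
            return b;
    return -1;
}

/* Returns true and reports the new tuple's position on success. */
static bool toy_heap_insert(ToyHeap *h, const char *payload, int len,
                            int *block, int *offset)
{
    int b = toy_get_page_for_tuple(h, len);
    if (b < 0)
        return false;       /* the real code would extend the relation */
    ToyPage *p = &h->pages[b];

    h->crit_depth++;                        /* (4) enter critical section */
    memcpy(p->data + p->used, payload, (size_t) len);
    p->used += len;
    p->nitems++;
    p->all_visible = false;                 /* (5) page no longer all-visible */
    p->dirty = true;                        /* (6) mark buffer dirty */
    assert(h->nwal < TOY_MAX_WAL);
    h->wal[h->nwal++] = (ToyWalRecord){ b, p->nitems, len };  /* (7) log */
    h->crit_depth--;                        /* (7) leave critical section */

    *block = b;
    *offset = p->nitems;
    return true;
}
```

Keeping the page change, the dirty mark, and the log record inside one critical section reflects the idea that these three must succeed or fail together.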

  • Brief description of the index insertion process:

(1) After the tuple is inserted successfully, obtain its TID;

(2) Call ExecInsertIndexTuples to insert the index tuples; this call belongs to the executor layer;

(3) A table can have several indexes; the list of indexes comes from the system catalog, and an entry is inserted into each index separately;

(4) For each index, call the index access method layer: index_insert internally calls indexRelation->rd_indam->aminsert, a function pointer supplied by the index's access method. For details on indexes, see the other articles in this column.
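The per-index loop and the aminsert indirection can be sketched as follows. All `Toy*` types and `toy_*` functions are hypothetical; the toy access method simply appends entries, whereas a real one (for example B-tree) maintains its own on-disk structure.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: an index maps one column's value to a tuple's TID. */
typedef struct ToyTid { unsigned block; unsigned offset; } ToyTid;
typedef struct ToyIndex ToyIndex;

typedef struct ToyIndexAm {
    bool (*aminsert)(ToyIndex *idx, int key, ToyTid tid);
} ToyIndexAm;

struct ToyIndex {
    const ToyIndexAm *am;   /* access method callbacks for this index */
    int    column;          /* which column of the row is indexed */
    int    keys[8];
    ToyTid tids[8];
    int    nentries;
};

/* A trivial access method: append the entry, no ordering at all. */
static bool toy_append_aminsert(ToyIndex *idx, int key, ToyTid tid)
{
    if (idx->nentries >= 8)
        return false;
    idx->keys[idx->nentries] = key;
    idx->tids[idx->nentries] = tid;
    idx->nentries++;
    return true;
}

static const ToyIndexAm toy_append_am = { toy_append_aminsert };

/* Plays the role of index_insert: dispatch to the access method. */
static bool toy_index_insert(ToyIndex *idx, int key, ToyTid tid)
{
    return idx->am->aminsert(idx, key, tid);
}

/* Plays the role of ExecInsertIndexTuples: one entry per index.
 * Returns how many indexes accepted the new entry. */
static int toy_insert_index_tuples(ToyIndex **indexes, int nindexes,
                                   const int *row, ToyTid tid)
{
    int done = 0;
    for (int i = 0; i < nindexes; i++) {
        int key = row[indexes[i]->column];   /* form the index key */
        if (toy_index_insert(indexes[i], key, tid))
            done++;
    }
    return done;
}
```

Every index entry carries the same TID, which is why the heap insert has to finish and report the tuple's position before any index is touched.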


End

Author email: [email protected]
If there are any mistakes or omissions, please point them out so that we can learn from each other.

Note: Do not reprint without consent!

Origin blog.csdn.net/senllang/article/details/130472467