PostgreSQL kernel source code analysis: how indexes speed up queries

 

  • Column content: postgresql kernel source code analysis
  • Personal homepage: senllang's homepage
  • Motto: As heaven moves with vigor, the gentleman strives ceaselessly for self-improvement.

Article directory

foreword

Ways to speed up queries

The content of the index record

How do indexes correspond to data

Index and data creation order

end



foreword

This article analyzes the PostgreSQL 15 source code; the demonstrations were run on CentOS 8.


Ways to speed up queries

Broadly speaking, speeding up a query comes down to reducing the amount of data that has to be retrieved. How can that be done?

One way is to optimize the execution plan so that less data is processed at the logical level. The other concerns physical access: every tuple has to be located in the table file, so the question is how to find its position directly instead of scanning the entire file.

Indexes mainly address the second problem. An index does not store complete rows, only the indexed key values and the location of each tuple, so it is usually much smaller than the table and can be loaded into memory quickly, which greatly reduces disk I/O during a query.
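To make the contrast concrete, here is a minimal sketch comparing a full scan with a lookup through a small sorted structure. It is not PostgreSQL code: `ToyIndexEntry`, `scan_table` and `lookup_index` are hypothetical names, and a sorted array with binary search stands in for a real btree.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical index entry: a key plus the position of the row in the table. */
typedef struct {
    int    key;
    size_t row_pos;
} ToyIndexEntry;

/* Full scan: examine rows one by one until the key is found.
 * *steps receives the number of rows examined. Returns the row position or -1. */
long scan_table(const int *rows, size_t nrows, int key, size_t *steps)
{
    assert(steps != NULL);
    *steps = 0;
    for (size_t i = 0; i < nrows; i++) {
        (*steps)++;
        if (rows[i] == key)
            return (long) i;
    }
    return -1;
}

/* Index lookup: binary search over entries sorted by key.
 * *steps receives the number of entries examined. Returns the row position or -1. */
long lookup_index(const ToyIndexEntry *idx, size_t n, int key, size_t *steps)
{
    size_t lo = 0, hi = n;

    assert(steps != NULL);
    *steps = 0;
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;

        (*steps)++;
        if (idx[mid].key == key)
            return (long) idx[mid].row_pos;
        if (idx[mid].key < key)
            lo = mid + 1;
        else
            hi = mid;
    }
    return -1;
}
```

With eight unsorted rows and the key stored in the last one, `scan_table` examines all eight rows, while `lookup_index` needs at most four comparisons on the sorted entries. The gap widens as the table grows.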

Let's look at how an index does this.


The content of the index record

Different index types store different record contents. The following takes the btree index as an example.

Start with the code for the insert operation (in ExecInsert):

/* insert the tuple normally */
table_tuple_insert(resultRelationDesc, slot,
                   estate->es_output_cid,
                   0, NULL);

/* insert index entries for tuple */
if (resultRelInfo->ri_NumIndices > 0)
    recheckIndexes = ExecInsertIndexTuples(resultRelInfo,
                                           slot, estate, false,
                                           false, NULL, NIL);

After the tuple has been written into a data block, heapam_tuple_insert records the tuple's TID, which is its physical location, in the slot:


 

/* Perform the insertion, and copy the resulting ItemPointer */
heap_insert(relation, tuple, cid, options, bistate);
ItemPointerCopy(&tuple->t_self, &slot->tts_tid);
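The TID copied here identifies a tuple by a block number plus a line-pointer offset inside that block. The sketch below illustrates that layout with simplified stand-ins; the `Toy...` types and `toy_tid_*` functions are hypothetical and only mirror the general shape of PostgreSQL's ItemPointerData (a block number split into two 16-bit halves, followed by a 16-bit offset).

```c
#include <assert.h>
#include <stdint.h>

/* Simplified stand-in for a block identifier: a 32-bit block number
 * stored as two 16-bit halves. */
typedef struct {
    uint16_t bi_hi;     /* high 16 bits of the block number */
    uint16_t bi_lo;     /* low 16 bits of the block number */
} ToyBlockId;

/* Simplified stand-in for a tuple identifier (TID). */
typedef struct {
    ToyBlockId ip_blkid;    /* which block of the table file */
    uint16_t   ip_posid;    /* 1-based line-pointer offset within the block */
} ToyItemPointer;

/* Fill in a TID from a block number and an offset. */
void toy_tid_set(ToyItemPointer *tid, uint32_t block, uint16_t offset)
{
    assert(offset >= 1);    /* offset 0 is treated as invalid in this sketch */
    tid->ip_blkid.bi_hi = (uint16_t) (block >> 16);
    tid->ip_blkid.bi_lo = (uint16_t) (block & 0xFFFF);
    tid->ip_posid = offset;
}

/* Reassemble the 32-bit block number. */
uint32_t toy_tid_block(const ToyItemPointer *tid)
{
    return ((uint32_t) tid->ip_blkid.bi_hi << 16) | tid->ip_blkid.bi_lo;
}

/* Return the offset within the block. */
uint16_t toy_tid_offset(const ToyItemPointer *tid)
{
    return tid->ip_posid;
}

/* Copy a TID, analogous in spirit to the copy into the slot above. */
void toy_tid_copy(const ToyItemPointer *from, ToyItemPointer *to)
{
    *to = *from;
}
```

Because a TID is only a few bytes, carrying it in every index entry costs little compared with storing the row itself.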

ExecInsertIndexTuples then passes slot->tts_tid to the index, where it becomes part of the index record:

ItemPointer tupleid = &slot->tts_tid;

The index record is then inserted with the following call:

satisfiesConstraint =
    index_insert(indexRelation,     /* index relation */
                 values,            /* array of index Datums */
                 isnull,            /* null flags */
                 tupleid,           /* tid of heap tuple */
                 heapRelation,      /* heap relation */
                 checkUnique,       /* type of uniqueness check to do */
                 indexUnchanged,    /* UPDATE without logical change? */
                 indexInfo);        /* index AM may need this */

The index tuple is built using the following structure. Its fixed header has two parts: the TID of the heap tuple, and t_info, which holds the flag bits and the tuple length. The indexed key values follow the header.

typedef struct IndexTupleData
{
    ItemPointerData t_tid;      /* reference TID to heap tuple */

    /* ---------------
     * t_info is laid out in the following fashion:
     *
     * 15th (high) bit: has nulls
     * 14th bit: has var-width attributes
     * 13th bit: AM-defined meaning
     * 12-0 bit: size of tuple
     * ---------------
     */

    unsigned short t_info;      /* various info about tuple */

} IndexTupleData;               /* MORE DATA FOLLOWS AT END OF STRUCT */

typedef IndexTupleData *IndexTuple;
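The bit layout described in the t_info comment can be packed and unpacked with a few masks. The sketch below is a standalone illustration: the `TOY_` macros and `toy_` functions are hypothetical names whose values are taken directly from the comment above, not PostgreSQL's own definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Masks derived from the t_info layout comment. */
#define TOY_INDEX_SIZE_MASK   0x1FFF    /* bits 12-0: size of tuple */
#define TOY_INDEX_AM_RESERVED 0x2000    /* bit 13: AM-defined meaning */
#define TOY_INDEX_VAR_MASK    0x4000    /* bit 14: has var-width attributes */
#define TOY_INDEX_NULL_MASK   0x8000    /* bit 15: has nulls */

/* Pack a tuple size and two flags into a 16-bit t_info value. */
uint16_t toy_make_info(size_t size, bool has_nulls, bool has_varwidth)
{
    uint16_t info;

    assert(size <= TOY_INDEX_SIZE_MASK);    /* only 13 bits are available */
    info = (uint16_t) size;
    if (has_nulls)
        info |= TOY_INDEX_NULL_MASK;
    if (has_varwidth)
        info |= TOY_INDEX_VAR_MASK;
    return info;
}

/* Extract the tuple size in bytes. */
size_t toy_info_size(uint16_t t_info)
{
    return (size_t) (t_info & TOY_INDEX_SIZE_MASK);
}

/* True if the tuple contains at least one NULL attribute. */
bool toy_info_has_nulls(uint16_t t_info)
{
    return (t_info & TOY_INDEX_NULL_MASK) != 0;
}

/* True if the tuple contains variable-width attributes. */
bool toy_info_has_varwidth(uint16_t t_info)
{
    return (t_info & TOY_INDEX_VAR_MASK) != 0;
}
```

Since only 13 bits hold the size, an index tuple described this way can be at most 8191 bytes long.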


How do indexes correspond to data

  • The first case is HOT (heap-only tuple).

In principle, every tuple version has its own index entry. PostgreSQL optimizes this for updates: if the update changes no indexed column and the new version fits on the same page as the old one, no new index entry is added. A lookup finds the old version through the index and then follows the version chain to the latest tuple. A tuple version reached this way, with no index entry of its own, is a heap-only tuple.

  • When the new version is stored on a different page (or an indexed column changes), a new index entry is added, which leads to index bloat.
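The chain-following idea can be sketched with a toy page. This is a simplified model under stated assumptions: `ToyPage`, `toy_page_add`, `toy_hot_update` and `toy_chain_latest` are hypothetical, and visibility checks, pruning and everything else PostgreSQL does on a real page are left out.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_PAGE_SLOTS 8
#define TOY_NO_NEXT    0    /* offsets are 1-based, so 0 marks the end of a chain */

/* One tuple version on the page. */
typedef struct {
    int      value;
    unsigned next;  /* offset of the newer version on this page, or TOY_NO_NEXT */
} ToyTuple;

/* A single page; slot 0 is unused so that offsets start at 1. */
typedef struct {
    ToyTuple slots[TOY_PAGE_SLOTS + 1];
    unsigned nused;
} ToyPage;

/* Append a tuple version; returns its offset, or 0 if the page is full. */
unsigned toy_page_add(ToyPage *page, int value)
{
    unsigned off;

    if (page->nused >= TOY_PAGE_SLOTS)
        return 0;
    off = ++page->nused;
    page->slots[off].value = value;
    page->slots[off].next = TOY_NO_NEXT;
    return off;
}

/* HOT-style update: place the new version on the same page and link it
 * from the old one. No index entry is created. Returns 0 when the page
 * has no room, in which case a caller would fall back to a regular
 * update with a new index entry. */
unsigned toy_hot_update(ToyPage *page, unsigned old_off, int new_value)
{
    unsigned new_off = toy_page_add(page, new_value);

    if (new_off != 0)
        page->slots[old_off].next = new_off;
    return new_off;
}

/* Start at the offset stored in the index and walk to the latest version. */
unsigned toy_chain_latest(const ToyPage *page, unsigned root_off)
{
    unsigned off = root_off;

    while (page->slots[off].next != TOY_NO_NEXT)
        off = page->slots[off].next;
    return off;
}
```

In this model the index keeps pointing at the first version (the chain root) no matter how many same-page updates follow, which is why such updates add nothing to the index.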


Index and data creation order

The code makes the order clear: the tuple is first inserted into a data block; once table_tuple_insert has completed and the tuple's location is known, the index entries are inserted.

/* insert the tuple normally */
table_tuple_insert(resultRelationDesc, slot,
                   estate->es_output_cid,
                   0, NULL);

/* insert index entries for tuple */
if (resultRelInfo->ri_NumIndices > 0)
    recheckIndexes = ExecInsertIndexTuples(resultRelInfo,
                                           slot, estate, false,
                                           false, NULL, NIL);
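The same two-step order can be shown with a small self-contained model. It is only a sketch: `ToyTable`, `toy_heap_insert`, `toy_index_insert` and `toy_insert` are hypothetical, the array position stands in for the TID, and a sorted array stands in for the btree.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_CAPACITY 16

/* Index entry: key plus the heap position (the TID stand-in). */
typedef struct {
    int    key;
    size_t tid;
} ToyEntry;

/* A table with one heap array and one index sorted by key. */
typedef struct {
    int      heap[TOY_CAPACITY];
    size_t   nheap;
    ToyEntry index[TOY_CAPACITY];
    size_t   nindex;
} ToyTable;

/* Step 1: append the row to the heap and return where it landed. */
size_t toy_heap_insert(ToyTable *t, int row)
{
    assert(t->nheap < TOY_CAPACITY);
    t->heap[t->nheap] = row;
    return t->nheap++;
}

/* Step 2: add (key, tid) to the index, keeping entries sorted by key. */
void toy_index_insert(ToyTable *t, int key, size_t tid)
{
    size_t pos = t->nindex;

    assert(t->nindex < TOY_CAPACITY);
    while (pos > 0 && t->index[pos - 1].key > key) {
        t->index[pos] = t->index[pos - 1];  /* shift larger keys right */
        pos--;
    }
    t->index[pos].key = key;
    t->index[pos].tid = tid;
    t->nindex++;
}

/* Heap first, index second: the index entry needs the position
 * that only exists after the heap insert has finished. */
size_t toy_insert(ToyTable *t, int row)
{
    size_t tid = toy_heap_insert(t, row);

    toy_index_insert(t, row, tid);
    return tid;
}
```

After inserting 30, 10 and 20 in that order, the heap keeps them in arrival order while the index lists 10, 20, 30, each entry pointing back to its heap position.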


end

Author email: [email protected]
If there are any mistakes or omissions, please point them out so that we can learn from each other.

Note: Do not reprint without consent!

Origin blog.csdn.net/senllang/article/details/129077636