A must-read for database performance optimization: the Hash index design of the AntDB-M database

Indexing is one of the key technologies for speeding up database access, and how indexes are designed and used greatly affects database performance. AntDB-M supports both Hash and BTree index types. This article explains the design of the Hash index and offers some usage suggestions.

1. Related concepts

  • bucket

    A container for locating index records. Each element in the container records the unique number of a table record. If the element is empty, it means that there is no table record at the current index position.

  • bucket size

    The number of unique record numbers that can be directly stored in a bucket.

  • bucket subscript

    The bucket subscript is obtained by hashing the index column value and taking the hash value modulo the bucket size (see the short sketch after this list).

  • index record linked list

    Because of Hash collisions, different table records may hash to the same bucket subscript. All index records that share a bucket subscript are stored in a doubly linked list, the index record linked list, and the corresponding bucket element holds the head of that list.
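
To make the bucket-subscript calculation concrete, here is a minimal C++ sketch; the function and variable names are illustrative and are not AntDB-M internals:

```cpp
// Minimal sketch of the bucket-subscript calculation; names are illustrative,
// not AntDB-M internals.
#include <cstdint>
#include <functional>
#include <string>

uint64_t bucket_subscript(const std::string& index_key, uint64_t bucket_size) {
    // Hash the index column value, then take it modulo the bucket size.
    uint64_t hash_value = std::hash<std::string>{}(index_key);
    return hash_value % bucket_size;
}
```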

2. Memory structure

AntDB-M's Hash index memory structure design is very simple and efficient. Each Hash index consists of two parts: a bucket and a linked list of index records.

  • bucket

    The bucket is a one-dimensional array. Each element of the array holds the unique number of a table record; an index record's unique number is the same as that of its table record.

Figure 1: Bucket

  • index record linked list

    The AntDB-M index record linked list stores index records that share the same hash value. Its memory structure has three layers: 1) primary address; 2) secondary address; 3) linked list nodes. This organization is similar to that of table data (refer to "Memory Structure of AntDB-M Design"), so a record in the list can be located quickly by its unique number.

    The nodes of the index record linked list are composed of three parts: the previous record number, the next record number, and transaction information (only for unique indexes). The record number stored in the bucket is the record number of the head node of the linked list.

Figure 2: Index record linked list

As can be seen from the above, the entire Hash index structure involves only record numbers and transaction information. The structure is very lightweight, and its memory can be expanded on demand without consuming too much.
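
The following C++ sketch illustrates the two-part structure described above under simplifying assumptions; the field names and types are illustrative, and AntDB-M's actual layout will differ:

```cpp
// Simplified sketch of the two-part structure described above; field names and
// types are illustrative, not AntDB-M's actual layout.
#include <cstdint>
#include <vector>

constexpr uint32_t INVALID_RECORD = UINT32_MAX;  // an empty bucket element: no record here

// One node of the index record linked list, addressed by the record's unique number.
struct IndexNode {
    uint32_t prev = INVALID_RECORD;  // previous record number in the list
    uint32_t next = INVALID_RECORD;  // next record number in the list
    uint64_t txn  = 0;               // transaction info, used only by unique indexes
};

struct HashIndex {
    std::vector<uint32_t>  bucket;   // one-dimensional array: each element is a list-head record number
    std::vector<IndexNode> nodes;    // node space, expanded in step with the table record space
};
```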

3. Index processing

3.1 Persistence

An AntDB-M index contains two kinds of data: index metadata and index records. During data persistence, only the index metadata is persisted.

3.2 Initialization

When the AntDB-M database service starts, it first loads the table data records, then loads the index metadata, and finally initializes the index records in memory according to the index metadata.
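
A rough outline of this startup sequence might look like the following; the function names are placeholders, not AntDB-M APIs:

```cpp
// Rough outline of the startup sequence described above; the function names are
// placeholders, not AntDB-M APIs.
void load_table_records();      // assumed helper: load table data records from storage
void load_index_metadata();     // assumed helper: load the persisted index metadata
void rebuild_index_records();   // assumed helper: re-insert every table record into its indexes

void start_service() {
    load_table_records();       // 1) table data records first
    load_index_metadata();      // 2) then the index metadata (only metadata was persisted, see 3.1)
    rebuild_index_records();    // 3) finally rebuild the index records in memory from the metadata
}
```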

3.3 Insert index records

When a record is inserted into a table, the table record is inserted first, and then the index record is inserted. New records are added to the head of the index linked list to improve insertion speed.

  • ordinary index

Figure 3: Ordinary index insertion

For ordinary indexes, there is no need to check record uniqueness, so concurrent inserts do not affect each other and can proceed directly. Newly inserted index records take effect immediately, that is, they can be seen by queries from other transactions right away, but the visibility of the table records themselves is controlled by the transaction information and locks on those records. On commit, an ordinary index needs no further processing; on rollback, the index record is removed from the index record linked list (see the sketch at the end of this subsection).

  • unique index

Figure 4: Unique index insertion

For unique indexes, a unique key conflict check is required. If a record with the same key already exists in the index record linked list under the same bucket subscript and its transaction has not yet committed (the transaction field on the index record has a value), the insert blocks and waits for that transaction to commit or roll back before the conflict can be determined. When the inserting transaction commits, the transaction information on its index record is cleared; if it rolls back, the index record is deleted. Both commit and rollback notify any transactions blocked in the wait.
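
The following sketch, reusing the illustrative HashIndex/IndexNode structures from Section 2, shows the general shape of head insertion together with the unique-key conflict check. same_key, wait_for_commit_or_rollback and throw_duplicate_key_error are assumed helpers; latching and the re-check after waiting are omitted:

```cpp
// Sketch of the insert flow (head insertion plus unique-key conflict check),
// based on the illustrative structures from Section 2.
bool same_key(uint32_t existing_rec, uint32_t new_rec);    // assumed: compare index column values
void wait_for_commit_or_rollback(uint32_t existing_rec);   // assumed: block until the owner txn ends
void throw_duplicate_key_error();                          // assumed: report a unique-key conflict

void insert_index_record(HashIndex& idx, uint32_t rec_no, uint64_t hash_value,
                         uint64_t txn_id, bool unique) {
    uint64_t slot = hash_value % idx.bucket.size();

    if (unique) {
        // Scan the list under this bucket subscript for a record with the same key.
        for (uint32_t cur = idx.bucket[slot]; cur != INVALID_RECORD; cur = idx.nodes[cur].next) {
            if (!same_key(cur, rec_no)) continue;
            if (idx.nodes[cur].txn != 0)
                wait_for_commit_or_rollback(cur);   // uncommitted owner: block (the check would then be repeated)
            else
                throw_duplicate_key_error();        // committed duplicate: unique-key conflict
        }
    }

    // Insert at the head of the list so insertion stays cheap.
    uint32_t head = idx.bucket[slot];
    idx.nodes[rec_no].prev = INVALID_RECORD;
    idx.nodes[rec_no].next = head;
    idx.nodes[rec_no].txn  = unique ? txn_id : 0;   // only unique indexes carry transaction info
    if (head != INVALID_RECORD) idx.nodes[head].prev = rec_no;
    idx.bucket[slot] = rec_no;
}
```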

3.4 Index query

When querying through a hash index, the hash value calculated from the index column is first taken modulo the bucket size to obtain the bucket subscript. The index record linked list under that subscript is then scanned, the record numbers of the matching records are obtained, and those numbers are placed into the query context allocated for the current transaction for use by subsequent query traversal. Since the index record and the data record share the same record number, the data record can be located immediately from it. The main cost of the whole process is the hash calculation and the comparison of index record items, so performance is very high.
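
A minimal sketch of this lookup path, again based on the illustrative structures from Section 2; key_matches is an assumed helper that compares an index record item against the search key, and the vector stands in for the per-transaction query context:

```cpp
// Minimal sketch of the hash lookup path described above.
#include <string>
#include <vector>

bool key_matches(uint32_t rec_no, const std::string& key);   // assumed helper

std::vector<uint32_t> hash_lookup(const HashIndex& idx, uint64_t hash_value,
                                  const std::string& key) {
    std::vector<uint32_t> matches;                     // stands in for the per-transaction query context
    uint64_t slot = hash_value % idx.bucket.size();    // bucket subscript
    for (uint32_t cur = idx.bucket[slot]; cur != INVALID_RECORD; cur = idx.nodes[cur].next) {
        if (key_matches(cur, key))
            matches.push_back(cur);                    // the same number locates the table record directly
    }
    return matches;
}
```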

3.5 Delete index records

Figure 5: Ordinary index deletion

Figure 6: Unique index deletion

When a record is deleted from a table, the table data record is deleted first (the deletion flag is set and the transaction information updated), and then the transaction on the unique index record is updated to the current transaction. Ordinary indexes carry no transaction information, so nothing is done for them. The purpose of updating the transaction on the index record is to detect uniqueness conflicts when multiple transactions run concurrently. Because the current transaction already holds the mutex on the record, updating the index record does not affect its consistency and can be done directly.

When the transaction commits, the index records are not deleted immediately and can still be found through the index during searches. The visibility of the table records is still controlled by the transaction and lock information on the table records (i.e., MVCC). A separate cleaning thread actually deletes the data records and index records once the data records are no longer accessed by any transaction.

When the transaction rolls back, only the transaction information on the unique index record needs to be cleared.

For unique indexes, both transaction commit and rollback notify any transactions blocked in the wait.
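
The delete handling described above might be sketched as follows; mark_table_record_deleted and notify_waiting_transactions are assumed helpers, and the background cleaning thread that eventually unlinks the index record is not shown:

```cpp
// Sketch of the delete handling described above, with assumed helpers.
void mark_table_record_deleted(uint32_t rec_no, uint64_t txn_id);   // assumed helper
void notify_waiting_transactions(uint32_t rec_no);                  // assumed helper

void delete_record(HashIndex& idx, uint32_t rec_no, uint64_t txn_id, bool unique) {
    mark_table_record_deleted(rec_no, txn_id);  // set the delete flag and transaction info on the table record
    if (unique)
        idx.nodes[rec_no].txn = txn_id;         // unique index only: record the deleting transaction
    // Ordinary indexes carry no transaction info, so nothing else to do here.
}

void on_delete_commit(HashIndex& idx, uint32_t rec_no, bool unique) {
    // The index record is not unlinked here: visibility is decided by MVCC on the
    // table record, and a separate cleaning thread removes the entry later.
    if (unique) notify_waiting_transactions(rec_no);
}

void on_delete_rollback(HashIndex& idx, uint32_t rec_no, bool unique) {
    if (unique) {
        idx.nodes[rec_no].txn = 0;              // only the transaction info is cleared
        notify_waiting_transactions(rec_no);
    }
}
```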

3.6 Update index records

Updating a table record updates the index only when an index column is involved. In that case, the modification of the table record is converted into deleting the old record first and then inserting the new record, and the index operations are correspondingly converted into the index record insertion and deletion described above. Transaction commit and rollback are also handled in the same way as for index record insertion and deletion above.
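
Under the same assumptions as the earlier sketches, this conversion could be expressed simply as:

```cpp
// Sketch: an update that touches an index column is handled as
// "delete the old record, then insert the new one".
void update_indexed_record(HashIndex& idx, uint32_t old_rec, uint32_t new_rec,
                           uint64_t new_hash, uint64_t txn_id, bool unique) {
    delete_record(idx, old_rec, txn_id, unique);                  // reuse the delete path from 3.5
    insert_index_record(idx, new_rec, new_hash, txn_id, unique);  // reuse the insert path from 3.3
}
```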

3.7 Node linked list expansion

The size of the index node linked list space (the number of records it can hold) matches the size of the table record space. When new records exceed the table space and it needs to be expanded, the index node linked list is expanded at the same time.

3.8 Bucket reconstruction (rehash)

When an index is created, the initial bucket size can be specified with the index attribute block_size. If it is not specified, the current number of records in the table is used, with a minimum of 100,000. A periodic check (every 5 minutes by default, configurable) determines whether the number of table records exceeds the bucket size; if so, the bucket is enlarged and the Hash index is rebuilt.

  • new bucket size

The new bucket size is the smallest prime number larger than the current bucket size (see the sketch at the end of this subsection).

  • bucket access lock

Each thread is assigned a bucket access node when it first accesses the index; the current bucket address and a mutex are set on that node. When the thread starts to access the index, it first acquires the lock on its access node and releases it after the access finishes. This lock ensures that bucket resources are not released by a rehash while the index is being accessed.

  • asynchronous rebuild

Bucket reconstruction is an asynchronous process and does not affect index queries or updates while it runs. The current migration position is recorded to decide whether the old bucket or the new bucket is accessed. Once data migration is complete and there is no remaining access to the old bucket (no thread holds its bucket access lock), the old bucket's resources are released.
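
The "smallest prime larger than the current bucket size" rule can be illustrated with a small self-contained sketch; the migration cursor and bucket access locks are omitted:

```cpp
// Self-contained sketch of choosing the new bucket size during rehash.
#include <cstdint>

static bool is_prime(uint64_t n) {
    if (n < 2) return false;
    for (uint64_t d = 2; d * d <= n; ++d)
        if (n % d == 0) return false;
    return true;
}

uint64_t next_bucket_size(uint64_t current_size) {
    uint64_t candidate = current_size + 1;
    while (!is_prime(candidate)) ++candidate;   // smallest prime strictly larger than the current size
    return candidate;
}
```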

3.9 Indexes and locks

To avoid having too many locks consume system resources while still maintaining a high degree of concurrency, each Hash index in AntDB-M is allocated 131 locks, and the lock for the current index record is obtained by taking the bucket subscript modulo 131. Any access to the index (query or modification) acquires this lock first.
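
A minimal sketch of this lock striping scheme, with illustrative names:

```cpp
// Sketch of lock striping: 131 mutexes per index, chosen by bucket subscript modulo 131.
#include <array>
#include <cstdint>
#include <mutex>

constexpr size_t LOCK_COUNT = 131;

struct IndexLocks {
    std::array<std::mutex, LOCK_COUNT> stripes;

    std::mutex& lock_for(uint64_t bucket_subscript) {
        return stripes[bucket_subscript % LOCK_COUNT];  // many buckets map onto the 131 locks
    }
};

// usage: std::lock_guard<std::mutex> guard(locks.lock_for(slot));
```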

4. Index features

4.1 Prefix index

A prefix index uses specified columns, or specified prefix lengths of those columns, as the index key.

This feature applies to the following two situations:

1. The index column data is too long.

2. The data in the index column is suitable for specific business query scenarios.

AntDB-M can also specify a prefix length for each column, not just for the last column, which provides very flexible indexing capability for the business and helps customers implement their business more efficiently.
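
As an illustration only (not AntDB-M code), hashing per-column prefixes of a composite key might look like this:

```cpp
// Illustration of hashing per-column prefixes of a composite key,
// as the per-column prefix-length feature allows.
#include <algorithm>
#include <functional>
#include <string>
#include <vector>

uint64_t prefix_hash(const std::vector<std::string>& columns,
                     const std::vector<size_t>& prefix_lens) {
    std::string combined;
    for (size_t i = 0; i < columns.size(); ++i) {
        size_t len = std::min(prefix_lens[i], columns[i].size());
        combined += columns[i].substr(0, len);   // only the configured prefix of each column is hashed
    }
    return std::hash<std::string>{}(combined);
}
```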

4.2 Data distribution

For each Hash index, the number of records that share the same Hash value at each position is counted.

This statistic is mainly used to judge whether the current data suffers from heavy Hash collisions. It makes it easy for AntDB-M operations and maintenance staff to quickly determine whether a Hash index is suitable and whether it needs further optimization.

5. Restrictions

5.1 Range queries are not supported

For Hash indexes, because index records are not sorted, range queries are not supported.

Note: a range query on an index column will traverse the whole table.

5.2 Fuzzy queries are not supported

AntDB-M supports fixed-length left-prefix matching. However, because the Hash calculation itself requires exact values, fuzzy matching is not supported.

5.3 Index columns should not have too many identical values

If many rows have the same value in an index column, their records all end up in the same index record linked list, and every query must traverse and filter all the records in that list; the more such data there is, the lower the performance.

6. Summary

It is precisely the simple design of the Hash index that allows AntDB-M to provide efficient indexing with little memory and computation, improving database access performance while greatly reducing system resource overhead. However, the Hash index also has some usage restrictions, which stem mainly from its own characteristics. Understanding them helps business model designers choose the appropriate index type and maximize database performance.

About the AntDB database

The AntDB database project began in 2008. Running on carriers' core systems, it provides online services for more than 1 billion users across 24 provinces in China. It offers product features such as high performance, elastic scaling, and high reliability, handling millions of core telecom transactions per second at peak, and has kept those systems running continuously and stably for nearly ten years. It has been successfully commercialized in telecommunications, finance, transportation, energy, the Internet of Things, and other industries.
