CMU 15-445 (Fall 2022) Walkthrough: Project 1 - Buffer Pool

Guide

Project #1 - Buffer Pool | CMU 15-445/645 :: Intro to Database Systems (Fall 2022)

Task #1: Extendible Hash Table

First of all, you need to understand the concept of an extendible hash table; the following article is a good reference:

Extendible Hashing (Dynamic approach to DBMS) - GeeksforGeeks

The following blog post is also great:

Make a database: 2022 CMU15-445 Project1 Buffer Pool Manager - Zhihu (zhihu.com)

That article uses integers as an example to describe the expansion process of an extendible hash table in detail.

Its characteristics are shown in the figure below:

(figure: an extendible hash table, with directory entries pointing to buckets)

As the figure shows, the extendible hash table works like this: the directory entries point to the buckets that hold the actual data. Each directory entry has its own index, and the hash function maps a key to one of those indices (a key's index may change when an expansion occurs), thereby routing the data to the corresponding bucket.

The number of directory entries equals 2 ^ global_depth; global_depth can also be understood as the number of hash bits used to index the directory. local_depth is the number of low hash bits a particular bucket actually distinguishes; when a full bucket's local_depth equals global_depth, a split forces the directory to double. (The number of key-value pairs a bucket can hold is bucket_size, which is independent of local_depth.)

In the code provided by BusTub, a bucket's underlying data structure is a std::list whose elements are std::pair<K, V>. First, you can try to complete the three functions of the Bucket class itself, Find, Insert, and Remove, which operate only on that list.
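
A minimal sketch of those three Bucket functions, assuming the starter code's member std::list<std::pair<K, V>> list_ and its provided IsFull() helper (one possible implementation, not the official one):

template <typename K, typename V>
auto ExtendibleHashTable<K, V>::Bucket::Find(const K &key, V &value) -> bool {
  for (const auto &[k, v] : list_) {
    if (k == key) {
      value = v;
      return true;
    }
  }
  return false;
}

template <typename K, typename V>
auto ExtendibleHashTable<K, V>::Bucket::Remove(const K &key) -> bool {
  for (auto it = list_.begin(); it != list_.end(); ++it) {
    if (it->first == key) {
      list_.erase(it);
      return true;
    }
  }
  return false;
}

template <typename K, typename V>
auto ExtendibleHashTable<K, V>::Bucket::Insert(const K &key, const V &value) -> bool {
  for (auto &[k, v] : list_) {
    if (k == key) {  // key already present: update the value in place
      v = value;
      return true;
    }
  }
  if (IsFull()) {  // a full bucket is the caller's cue to split
    return false;
  }
  list_.emplace_back(key, value);
  return true;
}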

The functions that need to be completed are listed below:

  • Constructor of the extendible hash table:

    (figure: the constructor declaration from the starter code)

    All member variables of this class are shown below:

    // TODO(student): You may add additional private members and helper functions and remove the ones
    // you don't need.
    
    int global_depth_;    // The global depth of the directory
    size_t bucket_size_;  // The size of a bucket
    int num_buckets_;     // The number of buckets in the hash table
    mutable std::mutex latch_;
    std::vector<std::shared_ptr<Bucket>> dir_;  // The directory of the hash table
    

    Given the default parameters, the hash table starts with global_depth = 0 and num_buckets = 1, so the natural approach is to initialize a single bucket in the constructor with capacity bucket_size and initial local_depth = global_depth = 0.
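
    A minimal sketch following that reasoning (assuming the starter signature takes only bucket_size):

    template <typename K, typename V>
    ExtendibleHashTable<K, V>::ExtendibleHashTable(size_t bucket_size)
        : global_depth_(0), bucket_size_(bucket_size), num_buckets_(1) {
      // One directory entry pointing at a single empty bucket with local depth 0.
      dir_.emplace_back(std::make_shared<Bucket>(bucket_size, 0));
    }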

  • Lookup function of the extendible hash table:

    /**
     *
     * TODO(P1): Add implementation
     *
     * @brief Find the value associated with the given key in the bucket.
     * @param key The key to be searched.
     * @param[out] value The value associated with the key.
     * @return True if the key is found, false otherwise.
     */
    auto Find(const K &key, V &value) -> bool;
    

    Compute the directory index from the key, then directly call the corresponding bucket's Find.
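
    Under a single table-wide lock this is essentially a two-liner (a sketch; the same shape works for Remove below):

    template <typename K, typename V>
    auto ExtendibleHashTable<K, V>::Find(const K &key, V &value) -> bool {
      std::scoped_lock<std::mutex> lock(latch_);
      return dir_[IndexOf(key)]->Find(key, value);
    }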

  • Removal function of the extendible hash table:

    /**
     *
     * TODO(P1): Add implementation
     *
     * @brief Given the key, remove the corresponding key-value pair in the bucket.
     * @param key The key to be deleted.
     * @return True if the key exists, false otherwise.
     */
    auto Remove(const K &key) -> bool;
    

    Compute the directory index from the key, then directly call the corresponding bucket's Remove; the implementation mirrors Find above.

  • Insertion function of the extendible hash table:

    /**
     *
     * TODO(P1): Add implementation
     *
     * @brief Insert the given key-value pair into the bucket.
     *      1. If a key already exists, the value should be updated.
     *      2. If the bucket is full, do nothing and return false.
     * @param key The key to be inserted.
     * @param value The value to be inserted.
     * @return True if the key-value pair is inserted, false otherwise.
     */
    auto Insert(const K &key, const V &value) -> bool;
    

    This is probably the most complex function. As per the instructions, you need to consider:

    • The bucket is not full:

      • The key already exists: update its value
      • The key does not exist: insert it directly
    • The bucket is full, in which case you need to perform the following steps:

      • If the bucket's local_depth equals global_depth, increment global_depth and double the directory (this doubles the total number of directory entries, i.e., the directory's capacity);

        If local_depth < global_depth, proceed straight to the next two steps;

      • Increment the local_depth of the bucket in question;

      • Split the bucket, redistributing both the directory pointers to it and the key-value pairs it holds

    You must also understand that insert is essentially a recursive process: whenever a split happens, the key-value pair has to be re-inserted into its new target bucket, repeating until the insert succeeds. A loop-based sketch follows.
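
    One possible shape of the table-level insert, written as a loop rather than literal recursion (it pairs with the second version of the split function further below; Bucket::IncrementDepth comes from the starter code):

    template <typename K, typename V>
    auto ExtendibleHashTable<K, V>::Insert(const K &key, const V &value) -> void {
      std::scoped_lock<std::mutex> lock(latch_);
      // Every failed attempt grows the table, so this loop terminates.
      while (!dir_[IndexOf(key)]->Insert(key, value)) {
        auto bucket = dir_[IndexOf(key)];
        if (bucket->GetDepth() == global_depth_) {
          // local_depth == global_depth: double the directory. Each new entry
          // aliases the old entry sharing its low global_depth_ bits, so every
          // existing key still lands in its original bucket.
          global_depth_++;
          size_t old_size = dir_.size();
          dir_.reserve(old_size * 2);
          for (size_t i = 0; i < old_size; i++) {
            dir_.push_back(dir_[i]);
          }
        }
        // Now local_depth < global_depth: raise the local depth and split.
        bucket->IncrementDepth();
        RedistributeBucket(bucket);
      }
    }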

    Because the extendible hash table computes the directory index from the key, the following starter code computes that index:

    template <typename K, typename V>
    auto ExtendibleHashTable<K, V>::IndexOf(const K &key) -> size_t {
      int mask = (1 << global_depth_) - 1;
      return std::hash<K>()(key) & mask;
    }
    

    As you can see, the mask is built so that its low global_depth_ bits are all 1 and everything above them is 0. For example, initially global_depth = 0 gives mask = 0(b); after the first split caused by a full bucket, global_depth = 1 gives mask = 01(b).

    The & mask step performs a bitwise AND between the computed hash value and the mask, confining the result to the mask's range. Since the mask's low global_depth_ bits are all 1, the AND keeps the low global_depth_ bits of the hash, discards the higher bits, and yields the final index.

    If K = int, the hash function returns the integer value itself (look up the behavior for other types yourself). Assume global_depth = 1 and key = 3 = 011(b); then index = 11 & 01 = 1. If key = 2 = 10(b), index = 10 & 01 = 0, consistent with the example in the article linked above. In other words, the low global_depth bits are what distinguish different keys.

    Another thing to consider is how to grow dir. The key constraint is that after dir is extended, every existing key must still map to its original bucket (hence the aliasing trick in the sketch above). Here is an example from the course:

    (figure: course example of directory doubling)

  • Bucket-redistribution function of the extendible hash table:

    According to the documentation, whether this function is needed depends on your specific implementation.

    /**
    * @brief Redistribute the kv pairs in a full bucket.
    * @param bucket The bucket to be redistributed.
    */
    auto RedistributeBucket(std::shared_ptr<Bucket> bucket) -> void;
    

    There are two ways to handle it. The first keeps the original bucket and creates one new bucket. Note that when redistributing the original bucket's elements, any element that no longer hashes to the original bucket must be deleted from it;

    The other way is to create two brand-new buckets, in which case you never have to worry about duplicated key-value pairs.

    The first version of the split function:

    // Split: by the time this runs, dir has already been doubled.
    // Since buckets are shared_ptr, dir currently holds duplicate entries pointing
    // at the same bucket (two matching entries are enough to perform the split).
    // Split the original bucket in two and re-home its [key, value] pairs.
    template <typename K, typename V>
    auto ExtendibleHashTable<K, V>::RedistributeBucket(std::shared_ptr<Bucket> bucket) -> void {
      auto new_bucket = std::make_shared<Bucket>(bucket_size_, bucket->GetDepth());
      int mask = (1 << bucket->GetDepth()) - 1;
      // Find the right slots in dir for the new bucket
      size_t old_bucket_index;
      bool find = false;
      for (size_t i = 0; i < dir_.size(); ++i) {
        if (!find && dir_[i] == bucket) {
          find = true;
          old_bucket_index = i;
        } else if (find && dir_[i] == bucket) {
          if ((old_bucket_index & mask) != (i & mask)) {
            // This check is essential: after several doublings, the same low
            // mask bits can appear more than once across dir
            dir_[i] = new_bucket;
          }
        }
      }

      // Redistribute the contents of the old bucket.
      // Another pitfall: my approach leaves the original bucket in place.
      // If you create two new buckets instead, redistribution reduces to
      // hashing each key and ANDing with the mask.
      std::list<std::pair<K, V>> list = bucket->GetItems();
      for (const auto &it : list) {
        if (dir_[IndexOf(it.first)] == bucket) {
          continue;
        }
        std::pair<K, V> temp = it;
        bucket->Remove(it.first);
        InsertHelper(temp.first, temp.second);
      }
      num_buckets_++;
    }
    

    The second version of the split function:

    template <typename K, typename V>
    auto ExtendibleHashTable<K, V>::RedistributeBucket(std::shared_ptr<Bucket> bucket) -> void {
      auto bucket_0 = std::make_shared<Bucket>(bucket_size_, bucket->GetDepth());
      auto bucket_1 = std::make_shared<Bucket>(bucket_size_, bucket->GetDepth());
    
      int mask = 1 << (bucket->GetDepth() - 1);
      // Redistribute the elements of the original bucket
      for (const auto &item : bucket->GetItems()) {
        size_t hash_key = std::hash<K>()(item.first);
        if ((hash_key & mask) == 0) {
          bucket_0->Insert(item.first, item.second);
        } else {
          bucket_1->Insert(item.first, item.second);
        }
      }
    
      for (size_t i = 0; i < dir_.size(); ++i) {
        if (dir_[i] == bucket) {
          if ((i & mask) == 0) {
            dir_[i] = bucket_0;
          } else {
            dir_[i] = bucket_1;
          }
        }
      }
      num_buckets_++;
    }
    

Locking

Regarding locking, this article uses std::scoped_lock, introduced in C++17. Put simply, compared with C++11's std::lock_guard it is also RAII-based, but it can accept zero or more mutexes. Both are somewhat less flexible than std::unique_lock, which supports deferred locking and timed lock acquisition.

However, the starter code uses scoped_lock wherever locking is needed, and only ever passes in a single mutex. I still don't quite understand why it is done this way.

std::scoped_lock - cppreference.com

https://chat.openai.com/share/bd8dcbf6-5c8b-4e58-920f-646cde8b6b3b

If the entire table is locked, only one thread can operate on the hash table at any moment, and performance will certainly be relatively poor.

In theory

1. The table should be protected by a read-write lock: concurrent reads can proceed in parallel, but while a thread is modifying the table, all other threads are blocked;

2. Within a bucket, multiple threads can read (i.e., Find) concurrently, but if a thread is modifying the bucket's contents, other threads that want to touch the same bucket are blocked.

There are still many details to explore here; the overall idea is somewhat similar to the intention-lock mechanism.
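
As a sketch of the table-level half of that idea, std::shared_mutex could replace the plain std::mutex (note this deviates from the starter code; rw_latch_ is my own name):

#include <mutex>         // std::unique_lock
#include <shared_mutex>  // std::shared_mutex, std::shared_lock (C++17)

// Inside ExtendibleHashTable: readers share the latch, writers own it exclusively.
mutable std::shared_mutex rw_latch_;

auto Find(const K &key, V &value) -> bool {
  std::shared_lock<std::shared_mutex> lock(rw_latch_);  // many readers in parallel
  return dir_[IndexOf(key)]->Find(key, value);
}

auto Insert(const K &key, const V &value) -> void {
  std::unique_lock<std::shared_mutex> lock(rw_latch_);  // one writer, blocks readers
  // ... directory-doubling / split / insert logic as before ...
}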

Task #2: LRU-K Replacement Policy

First, you can try LeetCode 146 (LRU Cache) to get a quick feel for how LRU works.

This task implements the LRU-K algorithm, where K refers to an access count. Plain LRU evicts the element that has gone unused the longest. LRU-K preferentially evicts elements that have not yet reached K accesses; otherwise, among elements with at least K accesses, it evicts in LRU fashion.

The project spec explains that backward k-distance is the difference between the current timestamp and the timestamp of the k-th most recent access. A frame with fewer than k historical accesses is given a backward k-distance of +inf. When multiple evictable frames all have a backward k-distance of +inf, the frame with the earliest overall timestamp is evicted. This is also described in the original paper, "The LRU-K Page Replacement Algorithm for Database Disk Buffering".

(figure: excerpt from the LRU-K paper)

One thing to note: the maximum size of the LRUKReplacer matches the buffer pool size, i.e., every frame is potentially a candidate for eviction; but the replacer's actual size is the number of currently evictable frames, and it only grows when a frame is marked evictable.

For implementing LRU-K itself, the key distinguishing feature of a frame is whether its access count has reached k. On that basis, all frames can be divided into two groups:

  • History queue: stores frames whose access count has not yet reached k (these are evicted preferentially)
  • Cache queue: stores frames whose access count has reached k

Borrowing from the LRU algorithm, both queues can use the hash + linked-list data structure; the difference is that elements must be able to move from one list to the other.

In addition, two hash tables are needed: one records how many times each frame has been accessed, and the other records whether each frame is currently evictable.

For the two queues, the guidance says: the cache queue evicts with the LRU algorithm, while the history queue evicts "the element whose access is oldest relative to the current timestamp". In principle you record a timestamp per access, and the history queue behaves like first-in-first-out; the emphasis is on the oldest access. Suppose k = 3 and there are two frames A and B with access timestamps {2, 3} and {1, 4} respectively. The LRU algorithm would evict A (last used at time 3, earlier than B's 4), but going by the oldest access, B should be evicted (first used at time 1).

Since I stick with the hash + linked-list structure, much like LRU, frames with fewer than k accesses need no repositioning when accessed again, and eviction simply takes them from the tail of the queue.
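
Concretely, the members might look like this (every identifier below is my own illustrative naming, not the starter code's):

// One possible member layout for LRUKReplacer (requires <list>, <mutex>, <unordered_map>).
std::list<frame_id_t> history_list_;  // access count < k; new frames at the front (FIFO)
std::list<frame_id_t> cache_list_;    // access count >= k; recently used at the front (LRU)
// frame_id -> its iterator in whichever list the frame currently lives in
std::unordered_map<frame_id_t, std::list<frame_id_t>::iterator> entries_;
std::unordered_map<frame_id_t, size_t> access_count_;  // accesses seen per frame
std::unordered_map<frame_id_t, bool> evictable_;       // eviction flag per frame
size_t curr_size_{0};   // number of evictable frames (what Size() returns)
size_t replacer_size_;  // capacity: equals the buffer pool size
size_t k_;
std::mutex latch_;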

The functions to be implemented are as follows:

  • Evicting a frame:

    /**
    * @brief Find the frame with largest backward k-distance and evict that frame. Only frames that are marked as 'evictable' are candidates for eviction.
    *
    * A frame with less than k historical references is given +inf as its backward k-distance.
    * If multiple frames have inf backward k-distance, then evict the frame with the earliest
    * timestamp overall.
    *
    * Successful eviction of a frame should decrement the size of replacer and remove the frame's
    * access history.
    *
    * @param[out] frame_id id of frame that is evicted.
    * @return true if a frame is evicted successfully, false if no frames can be evicted.
    */
    auto Evict(frame_id_t *frame_id) -> bool;
    

    The first precondition for eviction is that the frame is marked evictable; then eviction from history_list uses FIFO while eviction from cache_list uses LRU, which must stay consistent with how accesses are recorded (see the sketch below).
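
    A sketch built on the members above (the reverse scan works because new frames enter each list at the front, so the back holds the oldest entry):

    auto LRUKReplacer::Evict(frame_id_t *frame_id) -> bool {
      std::scoped_lock<std::mutex> lock(latch_);
      // All frames in history_list_ have +inf backward k-distance; the back of
      // the list is the earliest-seen frame, so scan from the back (FIFO).
      for (auto it = history_list_.rbegin(); it != history_list_.rend(); ++it) {
        if (evictable_[*it]) {
          *frame_id = *it;
          history_list_.erase(std::next(it).base());  // reverse -> forward iterator
          entries_.erase(*frame_id);
          access_count_.erase(*frame_id);
          evictable_.erase(*frame_id);
          curr_size_--;
          return true;
        }
      }
      // Otherwise the back of cache_list_ is the LRU frame among the >= k group.
      for (auto it = cache_list_.rbegin(); it != cache_list_.rend(); ++it) {
        if (evictable_[*it]) {
          *frame_id = *it;
          cache_list_.erase(std::next(it).base());
          entries_.erase(*frame_id);
          access_count_.erase(*frame_id);
          evictable_.erase(*frame_id);
          curr_size_--;
          return true;
        }
      }
      return false;  // nothing is evictable
    }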

  • Recording a frame access:

    /**
    * @brief Record the event that the given frame id is accessed at current timestamp.
    * Create a new entry for access history if frame id has not been seen before.
    *
    * If frame id is invalid (ie. larger than replacer_size_), throw an exception. You can
    * also use BUSTUB_ASSERT to abort the process if frame id is invalid.
    *
    * @param frame_id id of frame that received a new access.
    */
    void RecordAccess(frame_id_t frame_id);
    

    There are four cases here:

    • First access: record the frame in history_list, placing it at the front of the queue per the first-in-first-out principle;
    • The access count has not yet reached k: following FIFO, no repositioning is required;
    • The access count has just reached k: move the element from history_list into cache_list;
    • The access count exceeds k: following LRU, move the element to the front of cache_list;
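
    In code, the four cases map almost one-to-one (a sketch over the members above; BUSTUB_ASSERT is the starter macro mentioned in the comment):

    void LRUKReplacer::RecordAccess(frame_id_t frame_id) {
      std::scoped_lock<std::mutex> lock(latch_);
      BUSTUB_ASSERT(static_cast<size_t>(frame_id) < replacer_size_, "invalid frame id");
      size_t count = ++access_count_[frame_id];
      if (count == 1) {
        // Case 1: first access -- enters the history queue at the front.
        evictable_[frame_id] = false;  // non-evictable until SetEvictable says otherwise
        history_list_.push_front(frame_id);
        entries_[frame_id] = history_list_.begin();
      }
      if (count == k_) {
        // Case 3: just reached k -- graduate from history queue to cache queue.
        history_list_.erase(entries_[frame_id]);
        cache_list_.push_front(frame_id);
        entries_[frame_id] = cache_list_.begin();
      } else if (count > k_) {
        // Case 4: beyond k -- classic LRU move-to-front inside the cache queue.
        cache_list_.erase(entries_[frame_id]);
        cache_list_.push_front(frame_id);
        entries_[frame_id] = cache_list_.begin();
      }
      // Case 2 (1 < count < k): FIFO position is fixed, nothing to do.
    }
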
  • Setting whether a frame is evictable:

    /**
    * @brief Toggle whether a frame is evictable or non-evictable. This function also
    * controls replacer's size. Note that size is equal to number of evictable entries.
    *
    * If a frame was previously evictable and is to be set to non-evictable, then size should
    * decrement. If a frame was previously non-evictable and is to be set to evictable,
    * then size should increment.
    *
    * If frame id is invalid, throw an exception or abort the process.
    *
    * For other scenarios, this function should terminate without modifying anything.
    *
    * @param frame_id id of frame whose 'evictable' status will be modified
    * @param set_evictable whether the given frame is evictable or not
    */
    void SetEvictable(frame_id_t frame_id, bool set_evictable);
    

    Update the frame's evictable flag (and, accordingly, the replacer's size) based on its currently recorded flag and the value of set_evictable.

  • Removing all access records of a frame:

    /**
    * @brief Remove an evictable frame from replacer, along with its access history.
    * This function should also decrement replacer's size if removal is successful.
    *
    * Note that this is different from evicting a frame, which always remove the frame
    * with largest backward k-distance. This function removes specified frame id,
    * no matter what its backward k-distance is.
    *
    * If Remove is called on a non-evictable frame, throw an exception or abort the
    * process.
    *
    * If specified frame is not found, directly return from this function.
    *
    * @param frame_id id of frame to be removed
    */
    void Remove(frame_id_t frame_id);
    

    As long as the conditions hold, i.e., frame_id < capacity and the frame is evictable, remove it from history_list or cache_list according to its access count. Don't forget to also clear its access count and reset its evictable flag.

    I hit a small pitfall here. The recommended approach is to throw std::exception(), but to get more detail at runtime I used throw std::invalid_argument() when checking both conditions (frame_id < capacity, and the frame is evictable). By design, though, these two errors are not on the same level: the former should crash the program, while the latter does not necessarily have to.

  • Reporting the replacer's size, i.e., the number of evictable frames:

    /**
    * @brief Return replacer's size, which tracks the number of evictable frames.
    *
    * @return size_t
    */
    auto Size() -> size_t;
    

    There is nothing much to say about this item. At most, you need to pay attention to concurrency issues.

Regarding locking, I took the simple, blunt approach of a std::scoped_lock<std::mutex> lock(latch_) at the top of each function, i.e., one big lock.

Task #3: Buffer Pool Manager Instance

Anyone who has studied databases probably knows about the buffer pool. If not, you can read the following article:

Uncovering the Buffer Pool | Xiaolin coding (xiaolincoding.com)

In short, BufferPoolManagerInstance is responsible for fetching database pages from the DiskManager and caching them in memory. When memory is insufficient, or when it is explicitly told to clear buffer contents, it writes dirty pages back to disk.

According to the guide, we do not need to implement the functions of DiskManager; we can focus entirely on the design of the buffer pool.

Buffer Pool Manager Instance needs the previously completed ExtendibleHashTable and LRUKReplacer: the hash table maintains the page_id to frame_id mapping, and the LRU-K replacer's job is to track accesses to each frame.

The variables in the class are as follows:

/** Number of pages in the buffer pool. */
const size_t pool_size_;
/** The next page id to be allocated  */
std::atomic<page_id_t> next_page_id_ = 0;
/** Bucket size for the extendible hash table */
const size_t bucket_size_ = 4;

/** Array of buffer pool pages. */
Page *pages_;
/** Pointer to the disk manager. */
DiskManager *disk_manager_ __attribute__((__unused__));
/** Pointer to the log manager. Please ignore this for P1. */
LogManager *log_manager_ __attribute__((__unused__));
/** Page table for keeping track of buffer pool pages. */
ExtendibleHashTable<page_id_t, frame_id_t> *page_table_;
/** Replacer to find unpinned pages for replacement. */
LRUKReplacer *replacer_;
/** List of free frames that don't have any pages on them. */
std::list<frame_id_t> free_list_;
/** This latch protects shared data structures. We recommend updating this comment to describe what it protects. */
std::mutex latch_;

One thing to make clear about the pages_ array: since the member variable is Page *pages_, i.e., a pointer to the array's first element, you can index it in the usual way as pages_[id]. Throughout the implementation, frame_id serves as this index to identify a particular Page;

For the extendible hash table page_table_, K is page_id_t and V is frame_id_t; the actual Page objects are still managed through the pages_ array.

Initialization of a BufferPoolManagerInstance object is implemented as follows:

BufferPoolManagerInstance::BufferPoolManagerInstance(size_t pool_size, DiskManager *disk_manager, size_t replacer_k,
                                                     LogManager *log_manager)
    : pool_size_(pool_size), disk_manager_(disk_manager), log_manager_(log_manager) {
  // we allocate a consecutive memory space for the buffer pool
  pages_ = new Page[pool_size_];
  page_table_ = new ExtendibleHashTable<page_id_t, frame_id_t>(bucket_size_);
  replacer_ = new LRUKReplacer(pool_size, replacer_k);

  // Initially, every page is in the free list.
  for (size_t i = 0; i < pool_size_; ++i) {
    free_list_.emplace_back(static_cast<int>(i));
  }
}

As you can see, pages_ and replacer_ both have size pool_size. While previewing related material I came across the following picture on Zhihu, which abstracts the whole system nicely; note that a frame contains a page.

(figure: the buffer pool's frames, each holding a page)

In free_list_, each frame_id entry indicates a free frame; indexing pages_ by frame_id gives the page currently occupying that frame, including whether it is dirty. What replacer_ manages is likewise frames, with frame_id keying each frame's access count and related bookkeeping.

page_table_ maintains the page_id-to-frame_id mapping, because frame_id is invisible outside the buffer pool manager: external components such as disk_manager_ operate on specific pages only. So to learn whether a page is in the buffer pool, and in which frame, you go through page_table_.
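
In miniature, the lookup chain looks like this (a sketch of how the pieces connect, not a complete function):

frame_id_t frame_id;
if (page_table_->Find(page_id, frame_id)) {  // is the page resident?
  Page *page = &pages_[frame_id];            // frame_id indexes the pages_ array
  replacer_->RecordAccess(frame_id);         // tell the replacer about the access
  // ... pin, read or write page->GetData(), unpin ...
}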

The functions that need to be implemented are as follows:

  • Fetching a Page by page_id:

    /**
    * @brief Fetch the requested page from the buffer pool. Return nullptr if page_id needs to be fetched from the disk but all frames are currently in use and not evictable (in another word, pinned).
    *
    * First search for page_id in the buffer pool. If not found, pick a replacement frame from either the free list or the replacer (always find from the free list first), read the page from disk by calling disk_manager_->ReadPage(), and replace the old page in the frame. Similar to NewPgImp(), if the old page is dirty, you need to write it back to disk and update the metadata of the new page
    *
    * In addition, remember to disable eviction and record the access history of the frame like you did for NewPgImp().
    *
    * @param page_id id of page to be fetched
    * @return nullptr if page_id cannot be fetched, otherwise pointer to the requested page
    */
    auto FetchPgImp(page_id_t page_id) -> Page * override;
    

    First look up page_id in the buffer pool; if it is found, record the access, pin the page, and return it directly.

    If it is not found, take a frame to hold the page: first try the free list, and if it is empty, evict a frame via the replacer, just as in NewPgImp(). If the victim page is dirty, flush it back to disk first. Then read the requested page from disk into the frame. Don't forget this last step; my first test run failed precisely because I had forgotten it.

    If the free list is empty and no page can be evicted, return nullptr.
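
    One possible flow (a sketch; writing page_id_ / pin_count_ / is_dirty_ directly relies on BufferPoolManagerInstance being declared a friend of Page in the starter code):

    auto BufferPoolManagerInstance::FetchPgImp(page_id_t page_id) -> Page * {
      std::scoped_lock<std::mutex> lock(latch_);
      frame_id_t frame_id;
      // Hit: the page is already resident -- pin it and record the access.
      if (page_table_->Find(page_id, frame_id)) {
        pages_[frame_id].pin_count_++;
        replacer_->RecordAccess(frame_id);
        replacer_->SetEvictable(frame_id, false);
        return &pages_[frame_id];
      }
      // Miss: free list first, then eviction.
      if (!free_list_.empty()) {
        frame_id = free_list_.front();
        free_list_.pop_front();
      } else if (replacer_->Evict(&frame_id)) {
        Page *old = &pages_[frame_id];
        if (old->IsDirty()) {
          disk_manager_->WritePage(old->GetPageId(), old->GetData());
        }
        page_table_->Remove(old->GetPageId());
      } else {
        return nullptr;  // every frame is pinned
      }
      // Load the requested page into the chosen frame -- the easily forgotten
      // step mentioned above.
      Page *page = &pages_[frame_id];
      page->page_id_ = page_id;
      page->pin_count_ = 1;
      page->is_dirty_ = false;
      disk_manager_->ReadPage(page_id, page->GetData());
      page_table_->Insert(page_id, frame_id);
      replacer_->RecordAccess(frame_id);
      replacer_->SetEvictable(frame_id, false);
      return page;
    }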

  • Unpinning a page:

    /**
    * @brief Unpin the target page from the buffer pool. If page_id is not in the buffer pool or its pin count is already 0, return false.
    *
    * Decrement the pin count of a page. If the pin count reaches 0, the frame should be evictable by the replacer. Also, set the dirty flag on the page to indicate if the page was modified.
    *
    * @param page_id id of page to be unpinned
    * @param is_dirty true if the page should be marked as dirty, false otherwise
    * @return false if the page is not in the page table or its pin count is <= 0 before this call, true otherwise
    */
    auto UnpinPgImp(page_id_t page_id, bool is_dirty) -> bool override;
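
    This one is short but hides a classic pitfall: the dirty flag may only be set here, never cleared. A sketch (direct pin_count_ / is_dirty_ access again assumes the Page friendship):

    auto BufferPoolManagerInstance::UnpinPgImp(page_id_t page_id, bool is_dirty) -> bool {
      std::scoped_lock<std::mutex> lock(latch_);
      frame_id_t frame_id;
      if (!page_table_->Find(page_id, frame_id) || pages_[frame_id].pin_count_ <= 0) {
        return false;
      }
      Page *page = &pages_[frame_id];
      page->is_dirty_ = page->is_dirty_ || is_dirty;  // set, never clear
      if (--page->pin_count_ == 0) {
        replacer_->SetEvictable(frame_id, true);
      }
      return true;
    }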
    
  • Flushing a page:

    /**
    * @brief Flush the target page to disk.
    *
    * Use the DiskManager::WritePage() method to flush a page to disk, REGARDLESS of the dirty flag.
    * Unset the dirty flag of the page after flushing.
    *
    * @param page_id id of page to be flushed, cannot be INVALID_PAGE_ID
    * @return false if the page could not be found in the page table, true otherwise
    */
    auto FlushPgImp(page_id_t page_id) -> bool override;
    

    If the page is not in memory it does not need flushing; if it is, call disk_manager_->WritePage() to write it from memory to disk.

    As the instructions state, the flush happens regardless of whether the page is dirty.
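
    A sketch of the whole function:

    auto BufferPoolManagerInstance::FlushPgImp(page_id_t page_id) -> bool {
      std::scoped_lock<std::mutex> lock(latch_);
      frame_id_t frame_id;
      if (page_id == INVALID_PAGE_ID || !page_table_->Find(page_id, frame_id)) {
        return false;
      }
      disk_manager_->WritePage(page_id, pages_[frame_id].GetData());
      pages_[frame_id].is_dirty_ = false;  // spec: unset the dirty flag after flushing
      return true;
    }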

  • Creating a new page:

    /**
    * @brief Create a new page in the buffer pool. Set page_id to the new page's id, or nullptr if all frames are currently in use and not evictable (in another word, pinned).
    *
    * You should pick the replacement frame from either the free list or the replacer (always find from the free list first), and then call the AllocatePage() method to get a new page id. If the replacement frame has a dirty page, you should write it back to the disk first. You also need to reset the memory and metadata for the new page.
    *
    * Remember to "Pin" the frame by calling replacer.SetEvictable(frame_id, false)
    * so that the replacer wouldn't evict the frame before the buffer pool manager "Unpin"s it.
    * Also, remember to record the access history of the frame in the replacer for the lru-k algorithm to work.
    *
    * @param[out] page_id id of created page
    * @return nullptr if no new pages could be created, otherwise pointer to new page
    */
    auto NewPgImp(page_id_t *page_id) -> Page * override;
    

    If the buffer pool is currently full and none of its pages can be evicted, return nullptr directly.

    If free_list_ contains a free frame, the new page can be created in that frame;

    If free_list_ has no idle frame but replacer_ has a frame marked evictable (the page table translates between page and frame), a page can be evicted per the LRU-K algorithm and the new page inserted in its place.

    When evicting, you must check whether the victim page is dirty: if so, call the corresponding disk_manager_ method to write it to disk; if not, it can be dropped directly (just keep the related bookkeeping variables in sync).
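
    Putting those branches together (a sketch; the frame-acquisition part mirrors FetchPgImp, and AllocatePage() / ResetMemory() come from the starter code):

    auto BufferPoolManagerInstance::NewPgImp(page_id_t *page_id) -> Page * {
      std::scoped_lock<std::mutex> lock(latch_);
      frame_id_t frame_id;
      if (!free_list_.empty()) {
        frame_id = free_list_.front();
        free_list_.pop_front();
      } else if (replacer_->Evict(&frame_id)) {
        Page *old = &pages_[frame_id];
        if (old->IsDirty()) {
          disk_manager_->WritePage(old->GetPageId(), old->GetData());
        }
        page_table_->Remove(old->GetPageId());
      } else {
        return nullptr;  // all frames pinned: no page can be created
      }
      *page_id = AllocatePage();
      Page *page = &pages_[frame_id];
      page->ResetMemory();  // zero the data area for the fresh page
      page->page_id_ = *page_id;
      page->pin_count_ = 1;
      page->is_dirty_ = false;
      page_table_->Insert(*page_id, frame_id);
      replacer_->RecordAccess(frame_id);
      replacer_->SetEvictable(frame_id, false);
      return page;
    }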

  • Deleting a page:

    /**
    * @brief Delete a page from the buffer pool. If page_id is not in the buffer pool, do nothing and return true. If the page is pinned and cannot be deleted, return false immediately.
    *
    * After deleting the page from the page table, stop tracking the frame in the replacer and add the frame back to the free list. Also, reset the page's memory and metadata. Finally, you should call DeallocatePage() to imitate freeing the page on the disk.
    *
    * @param page_id id of page to be deleted
    * @return false if the page exists but could not be deleted, true if the page didn't exist or deletion succeeded
    */
    auto DeletePgImp(page_id_t page_id) -> bool override;
    

    First search page_table_; if the page is not found, there is nothing to delete. If it is found, check the page's pin count: greater than 0 means other threads are using it and it cannot be deleted. Otherwise, remove the page's information from replacer_, pages_, and page_table_, and return its frame to free_list_.
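
    A sketch (DeallocatePage() and INVALID_PAGE_ID come from the starter code):

    auto BufferPoolManagerInstance::DeletePgImp(page_id_t page_id) -> bool {
      std::scoped_lock<std::mutex> lock(latch_);
      frame_id_t frame_id;
      if (!page_table_->Find(page_id, frame_id)) {
        return true;  // not resident: nothing to delete
      }
      Page *page = &pages_[frame_id];
      if (page->pin_count_ > 0) {
        return false;  // still in use elsewhere
      }
      page_table_->Remove(page_id);
      replacer_->Remove(frame_id);
      free_list_.push_back(frame_id);
      page->ResetMemory();
      page->page_id_ = INVALID_PAGE_ID;
      page->is_dirty_ = false;
      DeallocatePage(page_id);
      return true;
    }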

  • Flushing all pages to disk:

    /**
    * TODO(P1): Add implementation
    *
    * @brief Flush all the pages in the buffer pool to disk.
    */
    void FlushAllPgsImp() override;
    

    Just iterate over pages_ and call FlushPgImp on each one.

A single submission for this project takes a long time to test, about five or six minutes, so it is best to check everything carefully before submitting.

SUMMARY & NOTES

As of June 17, 2023, when I submitted, a total of 543 people had submitted and passed the tests, and the slowest run took as much as 20s ("as much as" because the fastest was under 1s; what kind of magic is the first-place 0.6s...).

I first submitted from my Ubuntu virtual machine and ranked 340 at 3.8s+. I then submitted from my local Windows environment and the rank slipped to 357 at 3.9s+. A second submission there went back to rank 340, so the run-to-run fluctuation is about 0.1s.

After a rough analysis, two things could improve the timing:

  • Use finer-grained locks
  • Change the traversal method (though I think this only yields a noticeable improvement on large data sets)

My understanding of concurrent scenarios is still not very deep; I'll optimize again after further study, aiming for under 1s.

Origin: blog.csdn.net/qq_41205665/article/details/131259227