Understanding the SkipList (skip list) in one article

Usage scenarios of ConcurrentSkipListMap

Now that we have ConcurrentHashMap, do we still need other concurrent containers?

Yes. ConcurrentHashMap cannot guarantee any ordering of the data we store.

Suppose we have the following requirement. An education system receives math scores uploaded by a huge number of students, and each upload has the form <key: score, value: name>. The system then needs to aggregate and sort all of this data. ConcurrentHashMap cannot help here, and this is where ConcurrentSkipListMap comes in.
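As a minimal sketch of this scenario, the class below has several threads upload <score, name> pairs into a ConcurrentSkipListMap and then reads them back in sorted order. The class name ScoreBoard, the student names and the scores are made up for illustration, and it assumes scores are unique (a duplicate score would overwrite the earlier name).

```java
import java.util.Map;
import java.util.concurrent.ConcurrentSkipListMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

class ScoreBoard {
    // Hypothetical helper: uploads four made-up scores from a small thread pool.
    static ConcurrentSkipListMap<Integer, String> collect() {
        ConcurrentSkipListMap<Integer, String> scores = new ConcurrentSkipListMap<>();
        String[] names = {"Ann", "Bob", "Cid", "Dee"};
        int[] marks = {72, 95, 60, 88};
        ExecutorService pool = Executors.newFixedThreadPool(4);
        for (int i = 0; i < names.length; i++) {
            final int idx = i;
            // put() is thread-safe; no external locking is needed.
            pool.submit(() -> scores.put(marks[idx], names[idx]));
        }
        pool.shutdown();
        try {
            pool.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return scores;
    }

    public static void main(String[] args) {
        ConcurrentSkipListMap<Integer, String> scores = collect();
        // Iteration follows ascending key order, whatever the insertion order was.
        for (Map.Entry<Integer, String> e : scores.entrySet()) {
            System.out.println(e.getKey() + " -> " + e.getValue());
        }
        System.out.println("highest: " + scores.lastEntry());
    }
}
```

Sorted views such as headMap, tailMap and descendingMap are also available on the same map if the system needs ranges or a reversed ranking.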

Of course, besides the thread-safe, ordered Map (ConcurrentSkipListMap), there is also a thread-safe, ordered Set: ConcurrentSkipListSet. Both are implemented on top of a SkipList (skip list).

Design Ideas of ConcurrentSkipListMap

Let's take the designer's point of view. If you had to design this ordered concurrent collection, which basic data structures could you use?

The linked list idea

If the underlying structure is a sorted linked list, order can be maintained.

But a singly linked list has some drawbacks. For example:

  1. The lock granularity is too coarse: we want to store data concurrently, so in a straightforward implementation every insert or delete has to lock the entire linked list.
  2. Query efficiency: to find one element, we can only traverse the linked list from the beginning, which is O(n) and naturally much slower.
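The two drawbacks can be seen in a small sketch. LockedSortedList is a hypothetical class written for this article, not a JDK type: every method takes the same lock, and a lookup has to walk node by node.

```java
class LockedSortedList {
    private static final class Node {
        final int value;
        Node next;
        Node(int value, Node next) {
            this.value = value;
            this.next = next;
        }
    }

    // Sentinel head node, so inserting at the front needs no special case.
    private final Node head = new Node(Integer.MIN_VALUE, null);

    // One lock for the whole list: concurrent writers queue up behind each other.
    public synchronized void insert(int value) {
        Node prev = head;
        while (prev.next != null && prev.next.value < value) {
            prev = prev.next;
        }
        prev.next = new Node(value, prev.next);
    }

    // Walks from the first node, so the cost grows linearly with the list length.
    // Returns the number of nodes visited, or -1 if the value is absent.
    public synchronized int stepsToFind(int value) {
        int steps = 0;
        for (Node n = head.next; n != null && n.value <= value; n = n.next) {
            steps++;
            if (n.value == value) {
                return steps;
            }
        }
        return -1;
    }
}
```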

The balanced tree idea

A balanced tree also keeps the data ordered, and compared with the linked list its query efficiency is much better (from O(n) to O(log n)). However, the balanced tree still has two problems:

  1. The lock granularity is too coarse: we want to store data concurrently, so in a straightforward implementation every insert or delete has to lock the entire tree.
  2. Adding a new element is likely to trigger rotations and recoloring across multiple nodes, which is expensive.
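For comparison, here is what the "lock the whole tree" approach looks like with standard JDK classes. TreeMap is a red-black tree and is not thread-safe by itself; Collections.synchronizedSortedMap wraps it so that every call goes through a single mutex. The class name LockedTreeDemo and the sample data are made up for illustration.

```java
import java.util.Collections;
import java.util.SortedMap;
import java.util.TreeMap;

class LockedTreeDemo {
    static SortedMap<Integer, String> build() {
        // Every put/get on the wrapper synchronizes on one lock,
        // so threads cannot modify different parts of the tree in parallel.
        SortedMap<Integer, String> scores =
                Collections.synchronizedSortedMap(new TreeMap<>());
        scores.put(88, "Dee");
        scores.put(60, "Cid");
        scores.put(95, "Bob");
        return scores;
    }

    public static void main(String[] args) {
        SortedMap<Integer, String> scores = build();
        System.out.println(scores.firstKey() + " .. " + scores.lastKey());
    }
}
```

This is correct and ordered, but under heavy write contention all threads serialize on that one lock.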

Summary: problems with the existing basic data structures

From an implementation point of view, quite a few data structures can implement this ordered Map container, but they share two problems:

  1. In a multi-threaded environment, there is too much lock contention;
  2. Maintaining order always costs efficiency, either in queries or in insertions and deletions.

With these two problems in mind, the JDK developers turned to a different data structure: the skip list.

The concept of the skip list

First of all, the query efficiency of a skip list can match that of a balanced binary tree: O(log n) on average.

What? That fast? Let's take a look at its data structure.

image.png

This is the essence of the skip list: multiple linked lists are maintained at the same time, and those linked lists are arranged in levels.
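A rough way to picture the levels is to build them from a sorted array. The sketch below is only an illustration with assumed sample keys: level 0 holds every key and each higher level keeps every second key of the level beneath it. Practical skip lists usually choose each node's height at random instead of using such a regular layout.

```java
import java.util.ArrayList;
import java.util.List;

class SkipLevels {
    // Returns the levels bottom-up: index 0 is the full list, the last index is the top level.
    static List<List<Integer>> build(int[] sortedKeys) {
        List<List<Integer>> levels = new ArrayList<>();
        List<Integer> current = new ArrayList<>();
        for (int k : sortedKeys) {
            current.add(k);
        }
        levels.add(current);
        // Keep adding a sparser level until the top one has at most two keys.
        while (current.size() > 2) {
            List<Integer> upper = new ArrayList<>();
            for (int i = 0; i < current.size(); i += 2) {
                upper.add(current.get(i));
            }
            levels.add(upper);
            current = upper;
        }
        return levels;
    }

    public static void main(String[] args) {
        List<List<Integer>> levels = build(new int[] {10, 20, 30, 40, 50, 60, 70, 90});
        for (int i = levels.size() - 1; i >= 0; i--) {
            System.out.println("level " + i + ": " + levels.get(i));
        }
    }
}
```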

Using a skip list

Suppose we want to find 90 in the figure above. The process looks like this:

image.png

Every linked list in the skip list is sorted. A search starts from the top-level linked list. As soon as the target is found to be greater than the current node's value but smaller than the next node's value, the search drops down to the next lower level and continues from there. In other words, the search skips over elements as it goes.

Without the index levels, the query would walk the first-level index (the bottom list) directly and would need to traverse 6 nodes to find 90.

With the skip list, we only need to traverse the 5 purple nodes in the figure to find 90.

Of course, the difference in this example is not dramatic. But as the first-level index grows longer, we get a fourth-level index, a fifth-level index, a sixth-level index and so on, and use those higher levels to locate a position in the first-level index. The improvement then becomes much more obvious.
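The search procedure described above can be sketched as follows. This is a single-threaded illustration, not the JDK implementation: the class SkipSearchDemo, its deterministic way of assigning node heights and the 16 sample keys are all assumptions made for the demo. It counts how many forward moves a top-down search needs compared with a plain walk along the bottom list.

```java
import java.util.Arrays;

class SkipSearchDemo {
    static final class Node {
        final int key;
        final Node[] next; // next[i] is this node's successor on level i
        Node(int key, int height) {
            this.key = key;
            this.next = new Node[height];
        }
    }

    private final Node head;
    private final int maxHeight;

    // Builds the list from sorted keys. The key at 0-based position i gets height
    // 1 + (number of trailing zero bits of i + 1), capped at maxHeight.
    SkipSearchDemo(int[] sortedKeys, int maxHeight) {
        this.maxHeight = maxHeight;
        this.head = new Node(Integer.MIN_VALUE, maxHeight);
        Node[] last = new Node[maxHeight]; // last node linked so far on each level
        Arrays.fill(last, head);
        for (int i = 0; i < sortedKeys.length; i++) {
            int height = Math.min(maxHeight, 1 + Integer.numberOfTrailingZeros(i + 1));
            Node node = new Node(sortedKeys[i], height);
            for (int level = 0; level < height; level++) {
                last[level].next[level] = node;
                last[level] = node;
            }
        }
    }

    // Top-down search: move right while the next key is not past the target,
    // otherwise drop one level. Returns the forward moves used, or -1 if absent.
    int skipHops(int target) {
        Node cur = head;
        int hops = 0;
        for (int level = maxHeight - 1; level >= 0; level--) {
            while (cur.next[level] != null && cur.next[level].key <= target) {
                cur = cur.next[level];
                hops++;
            }
            if (cur != head && cur.key == target) {
                return hops;
            }
        }
        return -1;
    }

    // Plain walk along the bottom list, for comparison.
    int linearHops(int target) {
        int hops = 0;
        for (Node n = head.next[0]; n != null && n.key <= target; n = n.next[0]) {
            hops++;
            if (n.key == target) {
                return hops;
            }
        }
        return -1;
    }

    // Sample data: the keys 10, 20, ..., 160 with up to 5 levels.
    static SkipSearchDemo sample() {
        int[] keys = new int[16];
        for (int i = 0; i < keys.length; i++) {
            keys[i] = (i + 1) * 10;
        }
        return new SkipSearchDemo(keys, 5);
    }

    public static void main(String[] args) {
        SkipSearchDemo list = sample();
        System.out.println("linear hops to 150: " + list.linearHops(150));
        System.out.println("skip hops to 150:   " + list.skipHops(150));
    }
}
```

With these 16 sample keys, finding 150 takes 15 moves along the bottom list and 4 moves through the levels, and the gap widens as the list grows.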

How does the skip list solve our problems?

Recall the two problems we ran into before:

  1. In a multi-threaded environment, there is too much lock contention;
  2. Maintaining order always costs efficiency, either in queries or in insertions and deletions.

Inserting into or deleting from a balanced tree can easily trigger a global rebalancing of the tree, whereas inserting into or deleting from a skip list only changes a local part of the data structure.

The benefit is this: under high concurrency, a balanced tree needs a global lock to keep the whole tree thread-safe, while a skip list can be implemented lock-free (built on CAS operations). This gives better performance in a highly concurrent environment. In terms of query performance, the time complexity of a skip list is O(log n) on average. That is why, among its concurrent data structures, the JDK uses a skip list to implement an ordered Map.
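To make "local change plus CAS" concrete, here is a deliberately simplified sketch: a sorted singly linked list that supports lock-free insertion only. CasSortedList is a hypothetical class for this article. The real ConcurrentSkipListMap also has to deal with deletion and with the index levels, both of which are left out here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicReference;

class CasSortedList {
    private static final class Node {
        final int value;
        final AtomicReference<Node> next;
        Node(int value, Node next) {
            this.value = value;
            this.next = new AtomicReference<>(next);
        }
    }

    private final Node head = new Node(Integer.MIN_VALUE, null);

    // Insert-only sketch. Returns false if the value is already present.
    public boolean insert(int value) {
        while (true) {
            Node prev = head;
            Node curr = prev.next.get();
            while (curr != null && curr.value < value) {
                prev = curr;
                curr = curr.next.get();
            }
            if (curr != null && curr.value == value) {
                return false;
            }
            Node node = new Node(value, curr);
            // Only the single link prev -> curr is touched, and only if no other
            // thread changed it in the meantime. On failure, search again and retry.
            if (prev.next.compareAndSet(curr, node)) {
                return true;
            }
        }
    }

    public List<Integer> toList() {
        List<Integer> out = new ArrayList<>();
        for (Node n = head.next.get(); n != null; n = n.next.get()) {
            out.add(n.value);
        }
        return out;
    }

    // Four threads insert interleaved values 0..999 without any lock.
    static List<Integer> runDemo() {
        CasSortedList list = new CasSortedList();
        Thread[] workers = new Thread[4];
        for (int t = 0; t < workers.length; t++) {
            final int offset = t;
            workers[t] = new Thread(() -> {
                for (int i = 0; i < 250; i++) {
                    list.insert(i * 4 + offset);
                }
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return list.toList();
    }

    public static void main(String[] args) {
        List<Integer> result = runDemo();
        System.out.println("size = " + result.size() + ", first = " + result.get(0));
    }
}
```

A thread that loses the CAS race simply retries; no thread ever blocks waiting for another, which is the property the text calls lock free.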

Advantages of the skip list

  1. It can be made lock-free in a concurrent environment, which improves concurrency.
  2. Query efficiency: with multiple linked lists maintained, a search can jump from the upper indexes straight to the element's position in the first-level index.
  3. Inserting data only requires locating the insertion position through the skip list's own search mechanism and then updating the pointers of the neighboring nodes at that position.
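Point 3 can be sketched in a few lines. SimpleSkipList below is a hypothetical, single-threaded skip list written for this article: insert first searches top-down while remembering the last node visited on each level, then splices the new node in by rewiring only those neighbors. The node height is chosen by coin flips, and the cap of 16 levels is an arbitrary assumption.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

class SimpleSkipList {
    private static final int MAX_HEIGHT = 16; // assumed cap, chosen for the sketch

    private static final class Node {
        final int key;
        final Node[] next;
        Node(int key, int height) {
            this.key = key;
            this.next = new Node[height];
        }
    }

    private final Node head = new Node(Integer.MIN_VALUE, MAX_HEIGHT);
    private final Random random;

    SimpleSkipList(long seed) {
        this.random = new Random(seed);
    }

    // Flip coins: each extra level is added with probability 1/2.
    private int randomHeight() {
        int h = 1;
        while (h < MAX_HEIGHT && random.nextBoolean()) {
            h++;
        }
        return h;
    }

    public boolean insert(int key) {
        Node[] update = new Node[MAX_HEIGHT]; // last node before key on each level
        Node cur = head;
        for (int level = MAX_HEIGHT - 1; level >= 0; level--) {
            while (cur.next[level] != null && cur.next[level].key < key) {
                cur = cur.next[level];
            }
            update[level] = cur;
        }
        Node found = cur.next[0];
        if (found != null && found.key == key) {
            return false; // already present
        }
        Node node = new Node(key, randomHeight());
        // Splice in: only the predecessors recorded in update[] are modified.
        for (int level = 0; level < node.next.length; level++) {
            node.next[level] = update[level].next[level];
            update[level].next[level] = node;
        }
        return true;
    }

    public boolean contains(int key) {
        Node cur = head;
        for (int level = MAX_HEIGHT - 1; level >= 0; level--) {
            while (cur.next[level] != null && cur.next[level].key < key) {
                cur = cur.next[level];
            }
        }
        Node candidate = cur.next[0];
        return candidate != null && candidate.key == key;
    }

    public List<Integer> keys() {
        List<Integer> out = new ArrayList<>();
        for (Node n = head.next[0]; n != null; n = n.next[0]) {
            out.add(n.key);
        }
        return out;
    }
}
```

No rotation or global restructuring happens on insert, which is the contrast with the balanced tree drawn earlier.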

Disadvantages of the skip list

In fact, we have been hinting at the drawback all along: the extra linked lists we have to maintain take up quite a bit of space. The design philosophy of the skip list is therefore to trade space for time. It is hard not to notice that, now that memory is as cheap as cabbage, more and more designs trade memory for faster execution. The open source projects Redis and Lucene both use skip lists.
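To put a rough number on the space cost, assume each element is promoted from one level to the next independently with probability p (an assumption for this estimate; the article does not state how levels are assigned). The expected number of levels an element appears on is then a geometric series:

```latex
\mathbb{E}[\text{levels per element}] = \sum_{i=0}^{\infty} p^{i} = \frac{1}{1-p},
\qquad
p = \tfrac{1}{2} \;\Rightarrow\; \mathbb{E}[\text{total nodes for } n \text{ elements}] = 2n .
```

So under this assumption the index levels roughly double the number of nodes compared with a plain linked list, in exchange for O(log n) expected search time.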

 


Origin blog.csdn.net/weixin_47184173/article/details/115258741