The Principle and Implementation of Skip Lists

1. What is a skip list

2. Implementing a skip list


1. What is a skip list

    In a singly linked list, even when the stored data is sorted, the only way to find an element is to traverse the list from the head. The search is therefore slow: its time complexity is O(n).

    

    To speed up the search, we can build an index on top of the linked list: every second node is promoted to an upper level, and we call that upper level an index.

    

    Suppose we want to find node 8. We first traverse the index level. When we reach the node with value 7 there, we see that the next index node is 9, so node 8 must lie between those two nodes. We then descend to the linked-list level and continue from 7 until we reach node 8. In the plain singly linked list we would have visited 8 nodes to find node 8, whereas with the first-level index we visit only five. This example shows that adding one level of index reduces the number of nodes visited during a search, which means the search is faster. In the same way, we can add a second level of index on top of the first.
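
    The count in this example (8 nodes without the index, five with it) can be reproduced with a small sketch. It assumes the list holds the keys 1 to 10 in a std::vector and that the index consists of every second key (1, 3, 5, 7, 9); the function names linearVisits and indexedVisits are made up for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Count the nodes visited when scanning the sorted base list from the front.
int linearVisits(const std::vector<int>& base, int target) {
    int visits = 0;
    for (int key : base) {
        ++visits;                  // step onto this node
        if (key >= target) break;  // found it, or passed the place where it would be
    }
    return visits;
}

// Count the nodes visited when a one-level index (every second key) is scanned first.
int indexedVisits(const std::vector<int>& base, int target) {
    assert(!base.empty());
    std::size_t pos = 0;           // position in base of the index node we stand on
    int visits = 1;                // we start on the first index node, base[0]
    // Move along the index while the next index node does not overshoot the target.
    while (pos + 2 < base.size() && base[pos + 2] <= target) {
        pos += 2;
        ++visits;
    }
    // Descend to the base list and walk forward from that index node.
    while (base[pos] < target && pos + 1 < base.size()) {
        ++pos;
        ++visits;
    }
    return visits;
}
```

    For the keys 1 to 10 and target 8, linearVisits steps on 1 through 8, while indexedVisits steps on the index nodes 1, 3, 5, 7 and then on node 8 in the base list.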

    

    With a second-level index the search becomes faster again. The example holds very little data; when the data set is large, we can add more index levels, and the improvement in search efficiency becomes significant.

    

    A linked list combined with such a multi-level index is a skip list. Each node holds a pointer to the next node and may also hold several pointers to nodes further along, so that unneeded nodes can be skipped, which speeds up operations such as search and deletion. How many forward pointers a node holds is decided by a random number generator when the node is created. This is where the word "probabilistic" in the paper "Skip Lists: A Probabilistic Alternative to Balanced Trees" comes from: the number of forward pointers in a node is generated randomly. A skip list relies on probabilistic balancing rather than strictly enforced balancing, so its insertion and deletion algorithms are simpler than those of traditional balanced trees and efficient in practice.
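
    The random choice of the number of forward pointers can be sketched as follows. This is only an illustration: the function name randomLevel and the use of std::mt19937 with std::bernoulli_distribution are assumptions of this sketch, while the implementation in section 2 uses rand() with a cutoff value instead.

```cpp
#include <cassert>
#include <random>

// Draw a level for a new node: start at level 0 and keep promoting with
// probability p, never going above maxLevel.
int randomLevel(std::mt19937& rng, double p, int maxLevel) {
    assert(p > 0.0 && p < 1.0 && maxLevel >= 0);
    std::bernoulli_distribution promote(p);  // true with probability p
    int level = 0;
    while (level < maxLevel && promote(rng))
        ++level;
    return level;
}
```

    With p = 0.5, roughly half of the nodes reach level 1 or higher, about a quarter reach level 2 or higher, and so on, which is what keeps the expected search cost logarithmic without any rebalancing.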

2. Implementing a skip list

2.1 Define the data structure

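#include <cmath>      // ceil, logf
#include <cstdlib>    // rand, RAND_MAX
#include <iostream>   // cout, endl
using namespace std;

struct skipNode {
	int key;
	int value;
	int level;	       // number of forward pointers allocated in next
	skipNode** next;   // array of skipNode*, one forward pointer per level
	skipNode(int k, int v, int size) : key(k), value(v), level(size)
	{
		next = new skipNode*[size];
	}
	~skipNode()
	{
		delete[] next;   // release the pointer array, not the nodes it points to
	}
};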

class skipList {
public:
	skipList(int, int, float prob = 0.5, int maxNum = 10000);
	~skipList();

	skipNode* find(const int) const;
	void erase(const int);
	void insert(const int k, const int v);

protected:
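	// pick a random level for a new node
	int getNewLevel()const;
	// locate the node at or just after the given key, recording predecessors in last
	skipNode* search(const int) const;	

private:
	float cutOff;	// threshold used when generating a new level
	int levels;		// highest level currently in use
	int maxLevel;	// upper bound on the level
	int maxKey;		// upper bound on key values
	int dataSize;   // number of nodes

	skipNode* headerNode;
	skipNode* tailNode;
	skipNode** last;	// the list is singly linked, so search stores the predecessor on each level here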
};

2.2 Initialize the skip list

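/* prob    = promotion probability: on average one node in every 1/prob is promoted a level
   headKey = key stored in the head node; also used as maxKey, the largest key insert() accepts
   tailKey = key stored in the tail node; must be greater than every key that is inserted
   maxNum  = maximum number of nodes the skip list is expected to hold
*/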
skipList::skipList(int headKey, int tailKey, float prob, int maxNum)
{
	cutOff = prob * RAND_MAX;
	maxLevel = (int)ceil(logf((float)maxNum) / logf(1 / prob));
	levels = 0;
	maxKey = headKey;
	dataSize = 0;

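	// create the head and tail sentinel nodes
	headerNode = new skipNode(headKey, 0, maxLevel + 1);
	tailNode = new skipNode(tailKey, 0, 0);

	// predecessors recorded by the most recent search, one per level
	last = new skipNode*[maxLevel + 1];

	// on every level the head initially points straight at the tail
	for (int i = 0; i <= maxLevel; i++)
		headerNode->next[i] = tailNode;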
}

skipList::~skipList()
{
	skipNode* node;
	// level 0 contains every node, so walking it frees them all
	while (headerNode != tailNode) 
    {
		node = headerNode->next[0];
		delete headerNode;
		headerNode = node;
	}
	delete tailNode;
	delete[] last;
}

/* Pick a new level at random: keep promoting while rand() is at most cutOff,
   and never go above maxLevel so that no level is wasted. */
int skipList::getNewLevel() const
{
	int lev = 0;
	while (lev < maxLevel && rand() <= cutOff)
		lev++;
	return lev;
}

skipNode* skipList::search(const int theKey) const
{
	if (theKey > maxKey)
		return NULL;
	skipNode* node = headerNode;
	for (int i = levels; i >= 0; i--)
	{
		while (node->next[i]->key < theKey)
			node = node->next[i];
		/* the list is singly linked, so remember the predecessor on this level for insert and erase */
		last[i] = node;
	}
	return node->next[0];
}
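
    The constructor above caps the level at ceil(log(maxNum) / log(1 / prob)). As a quick check of that formula, here is a small sketch; levelCap is a hypothetical helper name used only for this illustration and is not part of the class.

```cpp
#include <cassert>
#include <cmath>

// Same formula as the constructor: ceil(log(maxNum) / log(1 / prob)).
int levelCap(int maxNum, float prob) {
    assert(maxNum > 1 && prob > 0.0f && prob < 1.0f);
    float wanted = std::log(static_cast<float>(maxNum)) / std::log(1.0f / prob);
    return static_cast<int>(std::ceil(wanted));
}
```

    For the defaults maxNum = 10000 and prob = 0.5 this gives 14, so the head node is allocated 15 forward pointers (levels 0 to 14). With prob = 0.25 the cap drops to 7, because each level then thins the list by a factor of four.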

2.3 Find

    A search checks whether the given key is present in the skip list. If it is, the node holding that key (and its value) is returned; otherwise NULL is returned.

skipNode * skipList::find(const int theKey) const
{
	if (theKey > maxKey)   // same bound as insert(), so a key equal to maxKey can still be found
		return NULL;

	skipNode* node = headerNode;
	for (int i = levels; i >= 0; i--)
		while (node->next[i]->key < theKey)
			node = node->next[i];

	if (node->next[0]->key == theKey)
		return node->next[0];

	return NULL;
}

2.4 Insert

    Insertion consists of the following steps:

  • Find the position where the node must be inserted;
  • Allocate a new node;
  • Adjust the pointers.

void skipList::insert(const int k, const int v)
{
	if (k > maxKey) {
		std::cout << "key = " << k << " must not be greater than maxKey = " << maxKey << std::endl;
		return;
	}

	skipNode* node = search(k);

	/* the key already exists: overwrite its value */
	if (node->key == k)
	{
		node->value = v;
		return;
	}

	/* the key is absent and node is the successor of the new node; pick a level for it */
	int newLevel = getNewLevel();
	if (newLevel > levels)
	{
		newLevel = ++levels;
		last[newLevel] = headerNode;
	}

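	/* levels are indexed 0..newLevel, so the node needs newLevel + 1 forward pointers */
	skipNode* newNode = new skipNode(k, v, newLevel + 1);
	/* last[] holds the predecessor on each level; splice the new node in on every level */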
	for (int i = 0; i <= newLevel; i++)
	{
		newNode->next[i] = last[i]->next[i];
		last[i]->next[i] = newNode;
	}

	dataSize++;
	return;
}
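
    The pointer adjustment in the loop above is the ordinary "insert after a predecessor" step of a singly linked list, repeated once per level. A minimal single-level sketch follows; Node and spliceAfter are hypothetical names used only here.

```cpp
#include <cassert>

struct Node {
    int key;
    Node* next;
};

// Insert newNode directly after prev. The order matters: the new node must
// take over prev's old successor before prev is redirected to the new node.
void spliceAfter(Node* prev, Node* newNode) {
    assert(prev != nullptr && newNode != nullptr);
    newNode->next = prev->next;
    prev->next = newNode;
}
```

    If the two assignments were swapped, prev->next would already point at the new node when it is read, and the rest of the list would be lost.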

2.5 Delete

    Deletion is similar to insertion and consists of the following 3 steps:

  • Find the node to be deleted;
  • Adjust the pointers so that the node is unlinked on every level;
  • Delete the node.

void skipList::erase(const int theKey)
{
	if (theKey > maxKey)
		return;

	skipNode* node = search(theKey);
	if (node->key != theKey)
		return;

	for (int i = 0; i <= levels && last[i]->next[i] == node; i++)
		last[i]->next[i] = node->next[i];

	/* drop levels that no longer contain any node */
	while (levels > 0 && headerNode->next[levels] == tailNode)
		levels--;

	delete node;
	dataSize--;
}

2.6 Main function

int main()
{
	skipList list(999, 1000);
	list.insert(1, 100);
	list.insert(3, 300);
	list.insert(2, 200);
	list.insert(7, 700);
	list.insert(6, 600);
	list.insert(4, 400);
	list.insert(5, 500);

	skipNode* node = list.find(3);   // points to the node with key 3 and value 300
	list.erase(3);
	node = list.find(3);             // NULL, because key 3 has been removed
}
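
    For comparison, the same find / insert / erase logic can be written more compactly with std::vector holding the forward pointers. This is only a sketch under its own assumptions, not the implementation above: MiniSkipList and all of its members are made-up names, levels end in nullptr instead of a tail node, the height is capped at a fixed 16, and the random generator uses a fixed seed so that runs are repeatable.

```cpp
#include <cassert>
#include <random>
#include <vector>

class MiniSkipList {
    struct Node {
        int key, value;
        std::vector<Node*> next;   // next[i] = successor on level i
        Node(int k, int v, int height) : key(k), value(v), next(height, nullptr) {}
    };
    static constexpr int kMaxHeight = 16;
    Node head_{0, 0, kMaxHeight};  // sentinel; its key is never compared
    int height_ = 1;               // number of levels currently in use
    std::mt19937 rng_{12345};      // fixed seed (an assumption of this sketch)

    int randomHeight() {
        int h = 1;
        while (h < kMaxHeight && (rng_() & 1u)) ++h;   // promote with probability 1/2
        return h;
    }
    // Fill prev[i] with the last node on level i whose key is smaller than key,
    // and return the node that follows it on level 0 (nullptr at the end).
    Node* locate(int key, std::vector<Node*>& prev) {
        Node* node = &head_;
        for (int i = height_ - 1; i >= 0; --i) {
            while (node->next[i] && node->next[i]->key < key) node = node->next[i];
            prev[i] = node;
        }
        return node->next[0];
    }

public:
    MiniSkipList() = default;
    MiniSkipList(const MiniSkipList&) = delete;
    MiniSkipList& operator=(const MiniSkipList&) = delete;
    ~MiniSkipList() {
        for (Node* n = head_.next[0]; n != nullptr;) {
            Node* after = n->next[0];
            delete n;
            n = after;
        }
    }

    void insert(int key, int value) {
        std::vector<Node*> prev(kMaxHeight, &head_);   // unused levels default to the head
        Node* hit = locate(key, prev);
        if (hit && hit->key == key) { hit->value = value; return; }
        int h = randomHeight();
        assert(h >= 1 && h <= kMaxHeight);
        if (h > height_) height_ = h;
        Node* fresh = new Node(key, value, h);
        for (int i = 0; i < h; ++i) {                  // splice in on every level
            fresh->next[i] = prev[i]->next[i];
            prev[i]->next[i] = fresh;
        }
    }

    // Returns a pointer to the stored value, or nullptr when the key is absent.
    int* find(int key) {
        std::vector<Node*> prev(kMaxHeight, &head_);
        Node* hit = locate(key, prev);
        return (hit && hit->key == key) ? &hit->value : nullptr;
    }

    bool erase(int key) {
        std::vector<Node*> prev(kMaxHeight, &head_);
        Node* hit = locate(key, prev);
        if (!hit || hit->key != key) return false;
        for (int i = 0; i < static_cast<int>(hit->next.size()); ++i)
            prev[i]->next[i] = hit->next[i];           // unlink on every level
        delete hit;
        while (height_ > 1 && head_.next[height_ - 1] == nullptr) --height_;
        return true;
    }
};
```

    Used like the main function above, inserting the keys 1, 3, 2, 7, 6, 4, 5 with values key * 100, find(3) yields a pointer to 300 before erase(3) and nullptr afterwards.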

 

Origin blog.csdn.net/MOU_IT/article/details/113831820