Algorithm study notes: Trie tree (dictionary tree)

1. Overview

A Trie, also called a dictionary tree, is an efficient data structure for processing strings.

Its job is to answer quickly, for a whole collection of strings, questions such as whether a given string is a prefix of another string (or, if the strings are stored reversed, a suffix).

2. Details

2.1 The concept of Trie tree

A Trie is first of all a tree. For example, the tree below is a Trie.

[Figure: a Trie storing the strings ab, abd, ac and bd]

This tree stores four strings: ab, abd, ac and bd.

Looking at the diagram, we can see that a Trie has the following properties:

  1. Each string corresponds to a path from the root node down to some node.
  2. The root node is a virtual node that stores no letter.
  3. A red node indicates that its letter is the end of a string.

With these three properties, every string can be inserted into the Trie in exactly one way, without ambiguity.

So which operations does a Trie support, and how are they implemented?

2.2 Trie tree operation

All the following strings, unless otherwise specified, only contain lowercase letters.

2.2.1 Trie tree storage

We need a structure to store the Trie tree.

Each node of the Trie needs two fields, ch[] and flag: ch[] holds the index of each child of the current node, and flag records whether the current node is the end of a word.

In particular, node 0 is the super root and stores no letter. Because no node ever has the root as a child, a child index of 0 can also mean "no child".

We also need an init function to reset a node.

Code:

struct node
{
	int ch[26], flag;// ch[i]: index of the child for letter 'a' + i (0 = no child); flag: end-of-word marker
	void init()
	{
		for (int i = 0; i <= 25; ++i) ch[i] = 0;
		flag = 0;
	}
};
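
The later snippets use tree, cnt and root without declaring them. A minimal sketch of what those declarations could look like is given here. The pool size MAXN and the helpers Reset and NewNode are assumptions made for illustration and do not appear in the original code.

```cpp
#include <cassert>

const int MAXN = 100005;  // assumed upper bound on the number of nodes

struct node
{
	int ch[26], flag;
	void init()
	{
		for (int i = 0; i <= 25; ++i) ch[i] = 0;
		flag = 0;
	}
};

node tree[MAXN];     // node pool; tree[0] is the super root
const int root = 0;  // index of the super root
int cnt = 0;         // index of the last node handed out

// Shrink the Trie back to an empty super root, e.g. between test cases.
void Reset()
{
	cnt = 0;
	tree[root].init();
}

// Hand out a fresh, initialized node and return its index (hypothetical helper).
int NewNode()
{
	++cnt;
	assert(cnt < MAXN);  // the pool must be large enough
	tree[cnt].init();
	return cnt;
}
```

Initializing a node only when it is handed out means a reset costs time proportional to the nodes actually used, not to the whole pool.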

2.2.2 Trie tree insertion-Insert

Let's see how to insert into the Trie, using the same picture as before.

[Figure: the Trie before insertion]

Suppose we now want to insert the string acd.

First look at the first letter, a. The root already has a child a, so we go down along a.

[Figure: moving from the root down to a]

Next we see that c is also there, so we go down again.

[Figure: moving from a down to c]

Then we look at d. It is not there, so we create a new node d as a child of c.

[Figure: the new node d attached below c]

The string is now used up, so we mark d as the end of a word. The final Trie looks like this.

[Figure: the Trie after inserting acd]

The code for inserting into the Trie is as follows:

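// Requires <cstring> for strlen.
void Insert(int k)// k is the index of the string to insert; strings are stored in char arrays, indexed from 1
{
	int len = strlen(a[k] + 1), p = root;
	for (int i = 1; i <= len; ++i)
	{
		int q = a[k][i] - 'a';
		if (!tree[p].ch[q])
		{
			++cnt;// allocate a new node
			tree[cnt].init();// initialize before use; especially useful with multiple test cases (seems faster than memset)
			tree[p].ch[q] = cnt;
		}
		p = tree[p].ch[q];
	}
	tree[p].flag = 1;// mark this node as the end of a word
}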

Note that if the same string may be inserted more than once and the number of occurrences matters, flag should be incremented (+1) instead of being set to 1.
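
The counting variant can be sketched as follows. This is an illustration under assumptions, not the original code: it takes std::string arguments and grows a std::vector of nodes instead of using a fixed pool, and Count is a hypothetical helper name.

```cpp
#include <cassert>
#include <string>
#include <vector>

struct node
{
	int ch[26] = {};  // child indices; 0 means "no child"
	int flag = 0;     // how many times a string ending here was inserted
};

std::vector<node> tree(1);  // tree[0] is the super root

// Insert s (lowercase letters only), counting repeated insertions.
void Insert(const std::string &s)
{
	int p = 0;
	for (char c : s)
	{
		assert(c >= 'a' && c <= 'z');
		int q = c - 'a';
		if (!tree[p].ch[q])
		{
			tree.push_back(node());  // new nodes start out zeroed
			tree[p].ch[q] = (int)tree.size() - 1;
		}
		p = tree[p].ch[q];
	}
	++tree[p].flag;  // +1 instead of =1
}

// Return how many times s was inserted (0 if never).
int Count(const std::string &s)
{
	int p = 0;
	for (char c : s)
	{
		assert(c >= 'a' && c <= 'z');
		p = tree[p].ch[c - 'a'];
		if (!p) return 0;  // the path does not exist
	}
	return tree[p].flag;
}
```

After inserting ab twice and abd once, Count("ab") returns 2, Count("abd") returns 1, and Count("a") returns 0 because a is only a prefix.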

2.2.3 Trie tree query-Find

Now for querying the Trie.

We use the same Trie as before. Suppose we want to check whether acd and bda exist. In a real program the queries are processed one at a time; for convenience, I will trace both together here.

First, a and b are both children of the root, so we go down.

[Figure: both queries move one level down from the root]

Next we look at c and d. Both are present, so we continue down:

[Figure: both queries move down a second level]

Finally we look at d and a. d is present but a is not, so we can conclude that bda does not exist, while the first pointer P_1 goes down.

[Figure: the first query reaches d; the second query fails at a]

The first string has now been fully traversed, and the flag of the node at P_1 is 1 (this check is very important, because a path can be found without being an inserted string), so acd exists.

The query operation code is as follows:

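// Requires <cstring> for strlen.
bool Find(int k)// k is the index of the query string in b, stored like a: char arrays indexed from 1
{
	int len = strlen(b[k] + 1), p = root;
	for (int i = 1; i <= len; ++i)
	{
		int q = b[k][i] - 'a';
		if (!tree[p].ch[q]) return 0;// the node does not exist
		p = tree[p].ch[q];
	}
	if (!tree[p].flag) return 0;// the path exists, but it is not an inserted string
	return 1;
}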
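
To try the walkthrough end to end, insertion and query can be put together in one self-contained sketch. It is an adaptation, not the original code: it assumes std::string arguments and a std::vector of nodes in place of the global arrays a, b and tree.

```cpp
#include <cassert>
#include <string>
#include <vector>

struct node
{
	int ch[26] = {};  // child indices; 0 means "no child"
	int flag = 0;     // 1 if a string ends at this node
};

std::vector<node> tree(1);  // tree[0] is the super root

// Insert s (lowercase letters only) into the Trie.
void Insert(const std::string &s)
{
	int p = 0;
	for (char c : s)
	{
		assert(c >= 'a' && c <= 'z');
		int q = c - 'a';
		if (!tree[p].ch[q])
		{
			tree.push_back(node());  // create the missing child
			tree[p].ch[q] = (int)tree.size() - 1;
		}
		p = tree[p].ch[q];
	}
	tree[p].flag = 1;  // mark the end of a word
}

// Return true only if s itself was inserted, not merely a prefix of something.
bool Find(const std::string &s)
{
	int p = 0;
	for (char c : s)
	{
		assert(c >= 'a' && c <= 'z');
		int q = c - 'a';
		if (!tree[p].ch[q]) return false;  // the node does not exist
		p = tree[p].ch[q];
	}
	return tree[p].flag != 0;  // the path exists; is it a stored string?
}
```

With ab, abd, ac, bd and acd inserted, Find("acd") is true, Find("bda") is false because d has no child a, and Find("a") is false because the node for a is not marked as the end of a word.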

2.3 The scope of application of Trie tree

A Trie suits problems with multiple pattern strings and multiple text strings that ask, for example, whether one string is a prefix of another. Note that a Trie cannot directly perform general string matching; that takes some extra machinery, which is what the AC (Aho-Corasick) automaton provides. Note that it is an AC automaton, not a machine that automatically gets your solution Accepted.
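
The two typical prefix questions can be sketched as follows. The function names StartsWith and HasStoredPrefix are hypothetical, and the sketch assumes std::string arguments, a std::vector of nodes, and a Trie that only ever grows (no deletion).

```cpp
#include <cassert>
#include <string>
#include <vector>

struct node
{
	int ch[26] = {};  // child indices; 0 means "no child"
	int flag = 0;     // 1 if a string ends at this node
};

std::vector<node> tree(1);  // tree[0] is the super root

// Insert s (lowercase letters only) into the Trie.
void Insert(const std::string &s)
{
	int p = 0;
	for (char c : s)
	{
		assert(c >= 'a' && c <= 'z');
		int q = c - 'a';
		if (!tree[p].ch[q])
		{
			tree.push_back(node());
			tree[p].ch[q] = (int)tree.size() - 1;
		}
		p = tree[p].ch[q];
	}
	tree[p].flag = 1;
}

// Is the non-empty string prefix a prefix of some inserted string?
// Without deletion, every existing path leads on to a marked node,
// so it is enough that the path for prefix exists.
bool StartsWith(const std::string &prefix)
{
	int p = 0;
	for (char c : prefix)
	{
		assert(c >= 'a' && c <= 'z');
		p = tree[p].ch[c - 'a'];
		if (!p) return false;
	}
	return true;
}

// Is some inserted string a prefix of s (s itself included)?
bool HasStoredPrefix(const std::string &s)
{
	int p = 0;
	for (char c : s)
	{
		assert(c >= 'a' && c <= 'z');
		p = tree[p].ch[c - 'a'];
		if (!p) return false;            // fell off the Trie
		if (tree[p].flag) return true;   // a stored string ends here
	}
	return false;
}
```

With ab, abd, ac and bd inserted, StartsWith("a") is true and StartsWith("c") is false, while HasStoredPrefix("bda") is true (because of bd) and HasStoredPrefix("b") is false.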

3. Summary

The Trie is intuitive: both of the operations covered here, insertion and query, are simple walks down the tree from the root.


Origin blog.csdn.net/BWzhuzehao/article/details/113183927