Trie, with a worked example: Xor Sum, HDU - 4825 (template)

First, a description of the trie (dictionary tree):
A trie, also known as a dictionary tree or key tree, is a tree structure and a variant of the hash tree. It is typically used to count and sort large numbers of strings (though not only strings), and text search engines often use it for word-frequency statistics. Its advantage is that it minimizes unnecessary string comparisons, so queries can be more efficient than with a hash table.
The core idea of a trie is to trade space for time: it uses the common prefixes of strings to cut the cost of a query.
It has three basic properties:
1. The root node holds no character; every node other than the root holds exactly one character.
2. Concatenating the characters along the path from the root to a node gives the string that node represents.
3. The children of any one node all hold different characters.
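The three properties can be seen in a small node structure. This is only a minimal sketch: the lowercase alphabet 'a'..'z' and the names `TrieNode`, `insertWord`, and `containsWord` are assumptions made for illustration, not something the text prescribes.

```cpp
#include <array>
#include <memory>
#include <string>

// Sketch of a trie over lowercase letters 'a'..'z' (the alphabet is an assumption).
struct TrieNode {
    // Property 3: one slot per letter, so the children of a node all differ.
    std::array<std::unique_ptr<TrieNode>, 26> child;
    // True when the path from the root to this node spells a stored word.
    bool isWord = false;
};

// Property 1: the root itself carries no character; each step below it adds one.
void insertWord(TrieNode& root, const std::string& word) {
    TrieNode* cur = &root;
    for (char ch : word) {
        std::unique_ptr<TrieNode>& next = cur->child[ch - 'a'];
        if (!next) next = std::make_unique<TrieNode>();
        cur = next.get();
    }
    cur->isWord = true;
}

// Property 2: following the characters of `word` from the root leads to its node.
bool containsWord(const TrieNode& root, const std::string& word) {
    const TrieNode* cur = &root;
    for (char ch : word) {
        cur = cur->child[ch - 'a'].get();
        if (!cur) return false;
    }
    return cur->isWord;
}
```

After inserting "abc" and "abd", both words are found, while "ab" is not: its node exists as a shared prefix but was never marked as a word.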
 
Second, the time complexity of a trie:
Problem:
A text file stores English words. Given some words, determine whether each one appears in the file and, if it does, how many times.
(The brute-force approach is not discussed here.)
Solution one:
Store all the words in a map. Each lookup then takes O(log n) key comparisons, where n is the number of stored words.
Solution two:
Let n be the total length of all the strings; building the trie then takes O(n) time. Looking up a string of length k takes
O(k), which can be treated as O(1) when word lengths are bounded by a small constant.
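The two solutions can be put side by side in a short sketch. It assumes lowercase words only; `countWithMap` and `CountingTrie` are hypothetical names, and the trie is stored in flat vectors rather than linked nodes.

```cpp
#include <array>
#include <map>
#include <string>
#include <vector>

// Solution one: look the word up in an ordered map of word -> frequency.
int countWithMap(const std::map<std::string, int>& freq, const std::string& word) {
    auto it = freq.find(word);
    return it == freq.end() ? 0 : it->second;
}

// Solution two: a trie whose nodes carry an occurrence counter.
struct CountingTrie {
    std::vector<std::array<int, 26>> next;  // next[v][c] = child index, 0 = no child
    std::vector<int> count;                 // count[v] = times the word ending at v was added

    CountingTrie() : next(1, std::array<int, 26>{}), count(1, 0) {}  // node 0 is the root

    // One step per character: O(length of word).
    void add(const std::string& word) {
        int v = 0;
        for (char ch : word) {
            int c = ch - 'a';
            if (next[v][c] == 0) {
                next[v][c] = static_cast<int>(next.size());
                next.push_back(std::array<int, 26>{});
                count.push_back(0);
            }
            v = next[v][c];
        }
        ++count[v];
    }

    // Also O(length of word), independent of how many words are stored.
    int occurrences(const std::string& word) const {
        int v = 0;
        for (char ch : word) {
            v = next[v][ch - 'a'];
            if (v == 0) return 0;  // the path breaks off: the word was never added
        }
        return count[v];
    }
};
```

Index 0 can double as "no child" because the root is never anyone's child.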
 
Third, building a trie:

Here is an example that circulates widely online:
    Problem: You are given 100,000 words, each at most 10 letters long. For each word, determine whether it has appeared before; if it has, report the position of its first occurrence.
    Analysis: Hashing can certainly solve this problem, but this article focuses on the trie, which is more useful in some respects. For example, if we need to ask whether a prefix of a word has appeared, hashing handles that poorly, while a trie makes it very simple.
    Back to the example. The most naive approach is to compare each word with every word before it, which is O(n^2) and clearly unacceptable for 100,000 words. So let us change perspective. Suppose the query word is abcd. Among the earlier words, those beginning with b, c, d, f, and so on obviously need not be considered; it is enough to look among the words beginning with a. Likewise, among the words beginning with a, only those whose second letter is b matter. Narrowing the range step by step like this, the tree model gradually becomes clear.
    Suppose we have the seven words b, abc, abd, bcd, abcd, efg, and hii. The tree we build looks like this:

[Figure: the trie built from these seven words; nodes where a word ends are marked in red]

As the figure shows, the path from the root to any node spells out a string. If the node is marked in red, that string is a stored word; otherwise it is not.

So for a given word, I only need to walk down from the root to its node and check whether that node is marked in red to know whether the word has appeared before. Marking the node red is equivalent to inserting the word.

This way the query and the insertion are done in the same walk (pay attention to how the two are combined), and the time spent is just the word length, which is at most 10 in this example.

The number of possible nodes at depth i of the trie is on the order of 26^i, so to save space we allocate nodes dynamically with linked pointers, or simulate dynamic allocation with arrays. The number of nodes is then at most word length × number of words.
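The combined query-and-insert walk, with arrays simulating dynamic allocation, might look like the sketch below. The function name `firstOccurrences`, the lowercase-only alphabet, and the 1-based positions are assumptions made for illustration.

```cpp
#include <array>
#include <cstddef>
#include <string>
#include <vector>

// For each word, returns the 1-based position at which that word first appeared.
// A word seen for the first time gets its own position.
std::vector<int> firstOccurrences(const std::vector<std::string>& words) {
    // Nodes are created on demand and addressed by index; node 0 is the root.
    std::vector<std::array<int, 26>> next(1, std::array<int, 26>{});
    std::vector<int> firstSeen(1, 0);  // 0 = node not marked ("not red")
    std::vector<int> result;
    result.reserve(words.size());

    for (std::size_t i = 0; i < words.size(); ++i) {
        int v = 0;
        for (char ch : words[i]) {
            int c = ch - 'a';
            if (next[v][c] == 0) {  // the path does not exist yet: create the node
                next[v][c] = static_cast<int>(next.size());
                next.push_back(std::array<int, 26>{});
                firstSeen.push_back(0);
            }
            v = next[v][c];
        }
        // Query and insert in the same walk: mark the node only if it is unmarked.
        if (firstSeen[v] == 0) firstSeen[v] = static_cast<int>(i) + 1;
        result.push_back(firstSeen[v]);
    }
    return result;
}
```

For the words b, abc, abd, bcd, abcd, efg, hii followed by a second abc, the last entry of the result is 2, the position of the first abc.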

 

Fourth, applications of the trie:

Besides the problem described above, a trie can help with problems such as the following (excerpted from the article "A Collection of Massive Data Processing Interview Questions, with a Detailed Look at Bit-map"):

    • 3. A file is 1 GB in size, with one word per line. No word exceeds 16 bytes, and the memory limit is 1 MB. Return the 100 most frequent words.
    • 9. There are 10 million strings, some of them duplicates. Remove all the duplicates so that only distinct strings remain. How would you design and implement this?
    • 10. A text file has about one million lines, one word per line. Find the 10 most frequent words; describe your approach and analyze its time complexity.
    • 13. Finding popular queries: a search engine writes every query string its users submit to a log file, and each query string is 1 to 255 bytes long. Suppose there are currently 10 million records. The query strings repeat heavily: although the total is 10 million, there are no more than 3 million after duplicates are removed. The more often a query string repeats, the more users searched for it and the more popular it is. Find the 10 most popular query strings, using no more than 1 GB of memory.
      (1) Describe your approach to this problem;
      (2) Give the main processing steps, the algorithm, and the complexity of the algorithm.

Different problems can be solved by changing the information that each trie node maintains.
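As one illustration of changing what a node maintains, the sketch below keeps a frequency counter in each node and then collects the k most frequent words. It is not the reference answer to the interview questions above: it assumes lowercase words that all fit in memory, and `FreqTrie`, `add`, `collect`, and `topK` are hypothetical names.

```cpp
#include <algorithm>
#include <array>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

struct FreqTrie {
    std::vector<std::array<int, 26>> next;  // next[v][c] = child index, 0 = no child
    std::vector<int> count;                 // count[v] = frequency of the word ending at v

    FreqTrie() : next(1, std::array<int, 26>{}), count(1, 0) {}

    void add(const std::string& word) {
        int v = 0;
        for (char ch : word) {
            int c = ch - 'a';
            if (next[v][c] == 0) {
                next[v][c] = static_cast<int>(next.size());
                next.push_back(std::array<int, 26>{});
                count.push_back(0);
            }
            v = next[v][c];
        }
        ++count[v];
    }

    // Depth-first walk that rebuilds each stored word from the path to its node.
    void collect(int v, std::string& prefix,
                 std::vector<std::pair<int, std::string>>& out) const {
        if (count[v] > 0) out.emplace_back(count[v], prefix);
        for (int c = 0; c < 26; ++c) {
            if (next[v][c] != 0) {
                prefix.push_back(static_cast<char>('a' + c));
                collect(next[v][c], prefix, out);
                prefix.pop_back();
            }
        }
    }

    // Returns up to k (frequency, word) pairs, most frequent first;
    // ties are broken alphabetically.
    std::vector<std::pair<int, std::string>> topK(std::size_t k) const {
        std::vector<std::pair<int, std::string>> all;
        std::string prefix;
        collect(0, prefix, all);
        k = std::min(k, all.size());
        std::partial_sort(all.begin(), all.begin() + static_cast<std::ptrdiff_t>(k), all.end(),
                          [](const std::pair<int, std::string>& a,
                             const std::pair<int, std::string>& b) {
                              if (a.first != b.first) return a.first > b.first;
                              return a.second < b.second;
                          });
        all.resize(k);
        return all;
    }
};
```

Deduplication (question 9) falls out of the same structure: each distinct string ends at exactly one node, so `collect` lists every string once no matter how often it was added.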

 

Fifth, a worked example

Zeus and Prometheus play a game. Prometheus gives Zeus a set containing N positive integers and then asks Zeus M queries. Each query contains a positive integer S, and Zeus must find a positive integer K in the set such that K XOR S is as large as possible. Because Prometheus wants Zeus to see the greatness of humanity, he has agreed that Zeus may ask humans for help. Can you prove humanity's intelligence?

Input

The input contains several test cases, each consisting of several lines.
The first line of the input is an integer T (T < 10), the number of test cases.
The first line of each test case contains two positive integers N and M (1 <= N, M <= 100000). The next line contains N positive integers, the set given to Zeus. Then come M lines, each with one positive integer S, a query from Prometheus. No integer exceeds 2^32.

Output

For each test case, first output a line "Case #?:", where the question mark is replaced by the number of the current test case, counting from 1.
For each query, output the positive integer K that maximizes K XOR S.

Sample Input

2
3 2
3 4 5
1
5
4 1
4 6 5 6
3

Sample Output

Case #1:
4
3
Case #2:
4

Solution:

First build a trie from the binary representations of the N input numbers. Then, for each of the M queries x, walk the trie from the highest bit to the lowest, at each step trying to move to the child whose bit is the opposite of x's bit at that position.

The comments in the code explain the details.

 

Code:

#include <cstdio>

typedef struct Trie* TrieNode;  // note: TrieNode is a pointer type, so every variable declared as TrieNode is a pointer to Trie

struct Trie
{
    long long val;     // the number stored at the end of a path
    TrieNode next[2];  // children for bit 0 and bit 1
    Trie()
    {
        val = 0;
        next[0] = next[1] = NULL;
    }
};

// Values can be as large as 2^32, which needs bit 32, so they are held in
// long long (a 32-bit int would overflow) and the walk starts at bit 32.
const int HIGH_BIT = 32;

void inserts(TrieNode root, long long x)  // insert the bits of x from the highest to the lowest
{
    TrieNode p = root;
    for (int i = HIGH_BIT; i >= 0; i--)
    {
        int t = (x >> i) & 1;
        if (p->next[t] == NULL) p->next[t] = new Trie();
        p = p->next[t];
    }
    p->val = x;
}

long long query(TrieNode root, long long x)
{
    TrieNode p = root;
    for (int i = HIGH_BIT; i >= 0; i--)
    {
        // (x >> i) & 1 is bit i of x. We want the XOR to be as large as possible,
        // and XOR gives 1 exactly when the two bits differ ("same is 0, different is 1").
        // Because we search from the high bits down, and binary 10000 is larger than
        // binary 01111, we prefer the child with the opposite bit and fall back to
        // the child with the same bit only when the opposite one does not exist.
        int t = ((x >> i) & 1) ^ 1;
        if (p->next[t] == NULL) t = (x >> i) & 1;
        if (p->next[t]) p = p->next[t];
        else return -1;  // no child at all (empty trie), so stop
    }
    return p->val;
}

void Del(TrieNode root)  // recursively delete every node of the tree
{
    for (int i = 0; i < 2; ++i)
    {
        if (root->next[i]) Del(root->next[i]);
    }
    delete root;
}

int main()
{
    int t, n, m, p = 0;
    scanf("%d", &t);
    while (t--)
    {
        printf("Case #%d:\n", ++p);
        TrieNode root = new Trie();
        scanf("%d %d", &n, &m);
        for (int i = 0; i < n; ++i)
        {
            long long v;
            scanf("%lld", &v);
            inserts(root, v);
        }
        for (int i = 0; i < m; ++i)
        {
            long long s;
            scanf("%lld", &s);
            printf("%lld\n", query(root, s));
        }
        Del(root);
    }
    return 0;
}
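
One way to gain confidence in a trie solution like this is to compare its answers with a direct scan on small inputs. The helper below is a hypothetical brute-force reference, not part of the original solution; it takes O(N) per query, so it is only meant for checking, not for the full limits.

```cpp
#include <vector>

// Hypothetical checker: scans the whole set and returns the element K that
// maximizes K XOR s. Assumes `values` is non-empty.
long long bestPartnerBruteForce(const std::vector<long long>& values, long long s) {
    long long best = values[0];
    for (long long v : values) {
        if ((v ^ s) > (best ^ s)) best = v;
    }
    return best;
}
```

On the sample, the set 3 4 5 with S = 1 gives 4 (1 XOR 4 = 5, against 2 for K = 3 and 4 for K = 5), which matches the first line of the expected output.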

 

 


Origin www.cnblogs.com/kongbursi-2292702937/p/12001338.html