How to use a trie to design and implement a Google-style input prompt function

Source | Search Technology
Editor | Xiaobai

Both Google and Baidu support the input prompt function to help you enter the content you want quickly and accurately.

For example, entering "五一" (May 1st) brings up prompts such as "五一劳动节" (May 1st Labor Day).

So how can we implement an input prompt function like Google's?

Analyze the functional requirements of input prompts

When the user types a leading string A, we want to prompt all highly relevant words that have A as a prefix. This is prefix matching. A trie, also called a prefix tree, is a tree structure used for searching and sorting strings, which makes it well suited to implementing input prompts.

Let's take Python 3 as an example and use a trie to build an input prompt service.
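To make the requirement concrete, here is a minimal brute-force sketch of prefix matching over a small in-memory word list. The function name `prefix_matches` and the sample words are illustrative assumptions, not part of the original article. It scans every word on every request, which is the cost a trie avoids by walking only the path spelled out by the prefix.

```python
def prefix_matches(words, prefix):
    """Return every word that starts with prefix, keeping the input order."""
    return [w for w in words if w.startswith(prefix)]


# Illustrative sample data (an assumption for this sketch).
words = ["五一", "五一劳动节", "五一假期", "五花肉", "五行"]
print(prefix_matches(words, "五一"))
```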

# Python 3 program to demonstrate the auto-complete
# feature using a trie data structure.
# Note: This is a basic implementation of a trie
# and not the most optimized one.

class TrieNode:
    def __init__(self):
        # Initialise one node of the trie.
        self.children = {}   # maps a character to its child TrieNode
        self.last = False    # True if a key ends at this node


class Trie:
    def __init__(self):
        # Initialise the trie structure.
        self.root = TrieNode()
        self.word_list = []

    def formTrie(self, keys):
        # Build the trie from the given strings, extending
        # the existing structure as required.
        for key in keys:
            self.insert(key)

    def insert(self, key):
        # Insert a key into the trie if it does not exist already.
        # If the key is a prefix of an existing key, just mark
        # its final node as the end of a word.
        node = self.root
        for a in key:
            if a not in node.children:
                node.children[a] = TrieNode()
            node = node.children[a]
        node.last = True

    def search(self, key):
        # Search the trie for a full match of the given key.
        # Return True on success, otherwise False.
        node = self.root
        for a in key:
            if a not in node.children:
                return False
            node = node.children[a]
        return node.last

    def suggestionsRec(self, node, word):
        # Recursively traverse the trie below node and
        # collect every complete word.
        if node.last:
            self.word_list.append(word)
        for a, n in node.children.items():
            self.suggestionsRec(n, word + a)

    def printAutoSuggestions(self, key):
        # Print all words in the trie that share the prefix key.
        # Return 0 if nothing has this prefix, -1 if the prefix is
        # itself a word with no longer words below it, else 1.
        node = self.root
        for a in key:
            if a not in node.children:
                return 0
            node = node.children[a]

        if node.last and not node.children:
            return -1

        self.word_list = []  # reset so repeated calls do not accumulate
        self.suggestionsRec(node, key)
        for s in self.word_list:
            print(s)
        return 1


# Driver code
# Keys to form the trie structure.
keys = ["五一", "五一劳动节", "五一放假安排", "五一劳动节图片",
        "五一劳动节图片 2020", "五一劳动节快乐", "五一晚会", "五一假期",
        "五一快乐", "五一节快乐", "五花肉", "五行", "五行相生"]
key = "五一"  # key for autocomplete suggestions

# Create the trie object.
t = Trie()

# Build the trie structure from the given set of strings.
t.formTrie(keys)

# Autocomplete the given key using our trie structure.
comp = t.printAutoSuggestions(key)

if comp == -1:
    print("No other strings found with this prefix\n")
elif comp == 0:
    print("No string found with this prefix\n")

# This code is contributed by amurdia

Running it with the input "五一" (May 1st) prints every stored word that starts with that prefix.

The prompts work, but their order differs somewhat from Google's. What should we do?

Generally, the data source for building input prompts is the log of query words entered by users. The number of times each query word was entered is counted, so that users can be prompted according to the popularity of each word.

Now we add the query counts from the log to simulate Google's prompt order.
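As a minimal sketch of the counting step, assuming the log has already been reduced to a plain list with one entry per submitted query (the `query_log` contents below are made up for illustration), `collections.Counter` can tally the frequencies and rank them:

```python
from collections import Counter

# Hypothetical raw log: one entry per query a user submitted.
query_log = ["五一劳动节", "五一假期", "五一劳动节", "五一", "五一劳动节", "五一假期"]

# Count how many times each query word appears in the log.
query_counts = Counter(query_log)

# most_common() returns (word, count) pairs, highest count first.
ranked = query_counts.most_common()
print(ranked)
```

A real query log would need cleaning and normalisation before counting; that step is outside the scope of this sketch.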

Examples of query words and numbers in the log library are as follows:

五一劳动节 10
五一劳动节图片 9
五一假期 8
五一劳动节快乐 7
五一放假安排 6
五一晚会 5
五一 4
五一快乐 3
五一劳动节图片 2020 2
五一节快乐 1

Adjust the prompt code to support query counts:

# Python 3 program to demonstrate the auto-complete
# feature using a trie data structure.
# Note: This is a basic implementation of a trie
# and not the most optimized one.
import operator


class TrieNode:
    def __init__(self):
        # Initialise one node of the trie.
        self.children = {}   # maps a character to its child TrieNode
        self.last = False    # True if a key ends at this node


class Trie:
    def __init__(self):
        # Initialise the trie structure.
        self.root = TrieNode()
        self.word_list = {}  # word -> query count (was a list before)

    def formTrie(self, keys):
        # Build the trie from the given strings, extending
        # the existing structure as required.
        for key in keys:
            self.insert(key)

    def insert(self, key):
        # Insert a key into the trie if it does not exist already.
        # Each key has the form "word,count", so the count is
        # stored as extra characters at the end of the path.
        node = self.root
        for a in key:
            if a not in node.children:
                node.children[a] = TrieNode()
            node = node.children[a]
        node.last = True

    def search(self, key):
        # Search the trie for a full match of the given key.
        # Return True on success, otherwise False.
        node = self.root
        for a in key:
            if a not in node.children:
                return False
            node = node.children[a]
        return node.last

    def suggestionsRec(self, node, word):
        # Recursively traverse the trie below node and collect
        # every complete "word,count" entry into word_list.
        if node.last:
            ll = word.split(',')
            if len(ll) >= 2:
                self.word_list[ll[0]] = int(ll[1])
            else:
                self.word_list[ll[0]] = 0
        for a, n in node.children.items():
            self.suggestionsRec(n, word + a)

    def printAutoSuggestions(self, key):
        # Print all words in the trie that share the prefix key,
        # ordered by query count from highest to lowest.
        # Return 0 if nothing has this prefix, -1 if the prefix is
        # itself a stored key with nothing below it, else 1.
        node = self.root
        for a in key:
            if a not in node.children:
                return 0
            node = node.children[a]

        if node.last and not node.children:
            return -1

        self.word_list = {}  # reset so repeated calls do not accumulate
        self.suggestionsRec(node, key)

        # Sort by query count, descending.
        sorted_d = dict(sorted(self.word_list.items(),
                               key=operator.itemgetter(1), reverse=True))
        for s in sorted_d.keys():
            print(s)
        return 1


# Driver code
# Keys to form the trie structure, each in the form "word,count".
keys = ["五一,4", "五一劳动节,10", "五一放假安排,6", "五一劳动节图片,9",
        "五一劳动节图片 2020,2", "五一劳动节快乐,7", "五一晚会,5", "五一假期,8",
        "五一快乐,3", "五一节快乐,1", "五花肉,0", "五行,0", "五行相生,0"]
key = "五一"  # key for autocomplete suggestions

# Create the trie object.
t = Trie()

# Build the trie structure from the given set of strings.
t.formTrie(keys)

# Autocomplete the given key using our trie structure.
comp = t.printAutoSuggestions(key)

if comp == -1:
    print("No other strings found with this prefix\n")
elif comp == 0:
    print("No string found with this prefix\n")

# This code is contributed by amurdia
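The code above carries each count inside the key string as "word,count", which would break if a query word itself contained a comma. One possible alternative, shown here only as a sketch, is to store the count on the node where a word ends and return the top k results. The names `CountedTrieNode`, `CountedTrie`, `suggest`, and the parameter `k` are assumptions introduced for this illustration and do not appear in the original code.

```python
import heapq


class CountedTrieNode:
    def __init__(self):
        self.children = {}   # character -> CountedTrieNode
        self.count = None    # query count if a word ends here, else None


class CountedTrie:
    def __init__(self):
        self.root = CountedTrieNode()

    def insert(self, word, count):
        # Walk or create the path for word, then record its count.
        node = self.root
        for ch in word:
            node = node.children.setdefault(ch, CountedTrieNode())
        node.count = count

    def suggest(self, prefix, k=10):
        """Return up to k words starting with prefix, most frequent first."""
        node = self.root
        for ch in prefix:
            node = node.children.get(ch)
            if node is None:
                return []
        # Iterative depth-first traversal of the subtree under the prefix.
        found = []
        stack = [(node, prefix)]
        while stack:
            cur, word = stack.pop()
            if cur.count is not None:
                found.append((cur.count, word))
            for ch, child in cur.children.items():
                stack.append((child, word + ch))
        top = heapq.nlargest(k, found, key=lambda item: item[0])
        return [word for _, word in top]


trie = CountedTrie()
for word, count in [("五一", 4), ("五一劳动节", 10), ("五一假期", 8), ("五花肉", 0)]:
    trie.insert(word, count)
print(trie.suggest("五一", k=2))
```

Returning a list instead of printing makes the result easier to reuse in a service, and limiting the output to k entries matches how a prompt box shows only a handful of candidates.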

The prompts are now ordered by query count, which gives the same order as Google:

To sum up:

The above uses a trie to implement a Google-style input prompt function. Besides the trie, are there other methods? Are there other index structures used in search that can implement the input prompt function well?
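As one possible answer, offered only as a sketch and not as a method from the original article, a sorted list with binary search can also serve prefix lookups. All words sharing a prefix sit next to each other in sorted order, so two `bisect` calls find the range. The function name `suggest_sorted` is hypothetical, and the upper bound assumes that no word contains the code point U+10FFFF.

```python
import bisect


def suggest_sorted(sorted_words, prefix):
    """Return all words with the given prefix from an already sorted list."""
    lo = bisect.bisect_left(sorted_words, prefix)
    # Assumption: no word contains U+10FFFF, the highest code point,
    # so prefix + that character sorts after every word with the prefix.
    hi = bisect.bisect_left(sorted_words, prefix + "\U0010ffff")
    return sorted_words[lo:hi]


words = sorted(["五一", "五一劳动节", "五一假期", "五花肉", "五行"])
print(suggest_sorted(words, "五一"))
```

This keeps the data in a flat list, which is compact, but inserting new words into a sorted list is slower than inserting into a trie, and ranking by popularity still needs a separate sort on the matched range.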

Origin blog.csdn.net/FL63Zv9Zou86950w/article/details/112855409