Solr: the inverted index

Inverted index:

  Without an index, every search would force the search engine to go through each page to find the pages that contain the specified keywords. That workload is enormous, for two main reasons:

  1. The number of web pages on the Internet is very large;

  2. Checking whether a page contains the specified keywords is not a simple operation: it requires walking through every character of the page.

To establish a mapping between keywords and the pages that contain them, the inverted index was introduced. Simply put, an inverted index reverses the usual direction of lookup: instead of going from each source page to the keywords it contains, the index goes from each keyword to the source pages that contain it.
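As a rough illustration of that keyword-to-pages mapping, here is a minimal sketch in Python. The `pages` data and the `build_inverted_index` helper are made up for this example, and splitting on whitespace is an assumption standing in for real text analysis; this is not how Solr itself is implemented.

```python
from collections import defaultdict

def build_inverted_index(pages):
    """Map each keyword to the sorted list of page ids that contain it."""
    index = defaultdict(set)
    for page_id, text in pages.items():
        # Naive tokenization (assumption): lowercase and split on whitespace.
        for word in text.lower().split():
            index[word].add(page_id)
    return {word: sorted(ids) for word, ids in index.items()}

# Hypothetical pages, keyed by page id.
pages = {
    1: "solr builds an inverted index",
    2: "an index maps keywords to pages",
    3: "solr is a search server",
}

inverted_index = build_inverted_index(pages)
print(inverted_index["solr"])   # [1, 3]
print(inverted_index["index"])  # [1, 2]
```

Each page is read once when the index is built; after that, finding the pages for a keyword no longer requires reading any page text.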

   To search for a keyword A, the engine first looks up A in the index table of the inverted index, then follows that entry to the pages that contain A. Because the keywords in the index table are kept in sorted order, A can be located with a binary search. Combined with distributed data, server clusters, and multi-threading, this makes lookups highly efficient, so finding the pages that contain a given keyword becomes very simple.
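The binary search over a sorted index table can be sketched as follows, using Python's standard `bisect` module. The `terms` and `postings` lists are hypothetical sample data, and two parallel lists are an assumption chosen for clarity, not a description of Solr's on-disk term dictionary.

```python
from bisect import bisect_left

# Hypothetical index table: keywords kept in sorted order, with a
# parallel list holding the page ids for each keyword.
terms = ["index", "inverted", "search", "solr"]
postings = [[1, 2], [1], [3], [1, 3]]

def lookup(keyword):
    """Binary-search the sorted terms and return the matching page ids."""
    i = bisect_left(terms, keyword)
    if i < len(terms) and terms[i] == keyword:
        return postings[i]
    return []  # keyword is not in the index

print(lookup("solr"))     # [1, 3]
print(lookup("missing"))  # []
```

With the terms sorted, the number of comparisons grows with the logarithm of the number of distinct keywords rather than with the number of pages.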

  Assume a database contains one million records, of which 10 match the search criteria. With an inverted index, you can quickly find the keywords and navigate directly to the ten records that contain them; without one, you would have to traverse all one million records. The difference in efficiency is easy to imagine.
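The scenario above can be imitated with a small sketch. The record contents, the keyword `solr`, and the scaled-down record count `N` are all assumptions for illustration; the point is only that both approaches return the same ten records, while the full scan touches every record and the indexed lookup does not.

```python
from collections import defaultdict

N = 100_000  # scaled down from the one million records in the text (assumption)
MATCH_IDS = set(range(0, N, N // 10))  # exactly 10 records contain the keyword

# Hypothetical records: only the matching ones mention "solr".
records = ["solr tutorial" if i in MATCH_IDS else "other text" for i in range(N)]

def full_scan(keyword):
    """Without an index: inspect every record on every search."""
    return [i for i, text in enumerate(records) if keyword in text.split()]

# Build the inverted index once, up front.
index = defaultdict(list)
for i, text in enumerate(records):
    for word in set(text.split()):
        index[word].append(i)

def indexed_lookup(keyword):
    """With an index: jump straight to the matching record ids."""
    return index.get(keyword, [])

assert full_scan("solr") == indexed_lookup("solr")
print(len(indexed_lookup("solr")))  # 10
```

The index costs one pass over the data to build, and that cost is then shared by every later search.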

Origin www.cnblogs.com/qingmuchuanqi48/p/10939161.html