Inverted and Forward Indexes

Search relies mainly on three files: diskdata, docinfo, and forwardindex.

The diskdata file is the inverted index (term to list of matching docs). The posting lists of the query terms are intersected to find the candidate docs.
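As a rough illustration of the intersection step, here is a minimal sketch. It assumes each posting list is a sorted list of integer doc ids and that the inverted index is a plain dict; the names intersect_postings and search are hypothetical and not taken from any real engine.

```python
def intersect_postings(a, b):
    """Intersect two sorted lists of doc ids with a two-pointer merge."""
    result = []
    i = j = 0
    while i < len(a) and j < len(b):
        if a[i] == b[j]:
            result.append(a[i])
            i += 1
            j += 1
        elif a[i] < b[j]:
            i += 1
        else:
            j += 1
    return result


def search(inverted, terms):
    """Return the doc ids that contain every term.

    `inverted` is assumed to map term -> sorted list of doc ids.
    """
    postings = [inverted.get(term, []) for term in terms]
    if not postings:
        return []
    # Start from the shortest list so intermediate results stay small.
    postings.sort(key=len)
    result = list(postings[0])
    for other in postings[1:]:
        result = intersect_postings(result, other)
        if not result:
            break  # nothing can match any more
    return result
```

Starting from the shortest posting list is a common choice because the result can never be longer than the smallest input.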

docinfo holds the general attributes of each doc (pagerank, publish_time, color signature, size, etc.). It is used for pruning before sorting, that is, simple filtering of the candidates based on the request.
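The pruning step might look something like the sketch below. The DocInfo fields mirror the attributes listed above, but the record layout, the prune function, and its min_pagerank and published_after parameters are assumptions made for illustration only.

```python
from dataclasses import dataclass


@dataclass
class DocInfo:
    # Hypothetical in-memory record; the real docinfo layout is not specified.
    pagerank: float
    publish_time: int  # assumed to be a Unix timestamp
    size: int


def prune(candidates, docinfo, min_pagerank=0.0, published_after=0):
    """Drop candidates that fail cheap attribute checks from the request."""
    kept = []
    for doc_id in candidates:
        info = docinfo.get(doc_id)
        if info is None:
            continue  # no attributes known, so the doc cannot be checked
        if info.pagerank < min_pagerank:
            continue
        if info.publish_time < published_after:
            continue
        kept.append(doc_id)
    return kept
```

Because every check is a single comparison on data already in memory, this filter can run over all candidates before the more expensive ranking stage.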

forwardindex is the forward index, that is, the per-doc record, which is used for the heavier ranking computation.

At sorting time, the docinfo and forwardindex information for each doc is merged to generate a rank_doc.
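To make the merge concrete, here is a small sketch of building rank_doc records. Everything beyond the name rank_doc is an assumption: docinfo is taken to be a dict of attribute dicts, forwardindex a dict from doc id to the doc's list of terms, and the scoring formula (pagerank plus the number of query-term hits) is a placeholder, not the actual ranking function.

```python
from dataclasses import dataclass


@dataclass
class RankDoc:
    doc_id: int
    pagerank: float  # copied from docinfo
    term_hits: int   # derived from forwardindex
    score: float


def build_rank_docs(doc_ids, docinfo, forwardindex, query_terms):
    """Merge docinfo and forwardindex data per doc, then sort by score."""
    wanted = set(query_terms)
    rank_docs = []
    for doc_id in doc_ids:
        pagerank = docinfo[doc_id]["pagerank"]
        # Count how many tokens of the doc are query terms.
        hits = sum(1 for term in forwardindex[doc_id] if term in wanted)
        score = pagerank + hits  # placeholder formula, assumed for the sketch
        rank_docs.append(RankDoc(doc_id, pagerank, hits, score))
    rank_docs.sort(key=lambda doc: doc.score, reverse=True)
    return rank_docs
```

The point of the combined record is that the sorter only touches one object per doc instead of looking up two separate structures for each comparison.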

Of course, frequently used data is best kept in memory. Here, the docinfo and forwardindex files are loaded into memory, while diskdata stays on the hard disk (possibly with its leading entry section kept in memory) and is read on demand. Which data to keep in memory should be decided according to the usage scenario.
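One way to picture this split is the sketch below. It assumes the "entry section" is a small table mapping each term to the offset and length of its posting list, and that posting lists are stored as little-endian 32-bit doc ids; the file layout and the names write_diskdata and read_postings are invented for the example.

```python
import os
import struct
import tempfile


def write_diskdata(path, inverted):
    """Write posting lists back to back and return the entry table.

    The returned dict maps term -> (byte offset, doc count) and is small
    enough to keep in memory.
    """
    entries = {}
    with open(path, "wb") as f:
        for term, doc_ids in inverted.items():
            entries[term] = (f.tell(), len(doc_ids))
            f.write(struct.pack(f"<{len(doc_ids)}I", *doc_ids))
    return entries


def read_postings(path, entries, term):
    """Fetch one posting list from disk using the in-memory entry table."""
    if term not in entries:
        return []
    offset, count = entries[term]
    with open(path, "rb") as f:
        f.seek(offset)
        data = f.read(4 * count)  # 4 bytes per doc id
    return list(struct.unpack(f"<{count}I", data))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "diskdata")
        entries = write_diskdata(path, {"red": [1, 3, 5], "shoe": [3, 4]})
        print(read_postings(path, entries, "shoe"))
```

With this layout, a query costs one seek and one read per term, while the large posting data never has to fit in memory.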
