Elasticsearch: the inverted index

As mentioned earlier, the core of the Elasticsearch search engine is the inverted index. Each field maintains its own inverted index (unless indexing is explicitly turned off for that field). The inverted index is made up of the following parts:

  • Term dictionary (Term Dictionary): records every term that appears across all documents, together with the information that links each term to its posting list. Because it holds a large amount of data, it is generally implemented with a B+ tree or a similar sorted structure (Lucene, which Elasticsearch is built on, uses a block-tree term dictionary with an FST-based term index);
  • Posting list (Posting List): records the set of documents that contain a given term from the dictionary. It is made up of postings (Posting), and each posting contains:
  1. Document id: used to fetch the original document data
  2. Term frequency (TF, Term Frequency): the number of times the term occurs in the document, used as one of the inputs to relevance scoring
  3. Position (Position): the term's position among the tokens of the original document, used for phrase searches
  4. Offset (Offset): the start and end positions of the term in the original document data, used for example to highlight matches in query results
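
The structure described above can be sketched in a few lines of Python. This is only an illustration, not how Elasticsearch or Lucene implement it: the names `Posting` and `build_inverted_index` are hypothetical, the tokenizer is assumed to be a simple lowercase split on word characters rather than a real analyzer, and a sorted dict stands in for the term dictionary.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Posting:
    """One entry of a posting list (hypothetical name, for illustration)."""
    doc_id: int                                     # 1. document id
    term_freq: int = 0                              # 2. term frequency (TF)
    positions: list = field(default_factory=list)   # 3. token positions
    offsets: list = field(default_factory=list)     # 4. (start, end) char offsets


def build_inverted_index(docs):
    """Build term -> posting list from a {doc_id: text} mapping.

    Assumption: tokens are lowercased runs of word characters.
    """
    raw = {}  # term -> {doc_id: Posting}
    for doc_id, text in docs.items():
        for position, match in enumerate(re.finditer(r"\w+", text)):
            term = match.group().lower()
            posting = raw.setdefault(term, {}).setdefault(doc_id, Posting(doc_id))
            posting.term_freq += 1
            posting.positions.append(position)
            posting.offsets.append((match.start(), match.end()))
    # Sorted terms stand in for the term dictionary; each value is a
    # posting list ordered by document id.
    return {
        term: [raw[term][doc_id] for doc_id in sorted(raw[term])]
        for term in sorted(raw)
    }


docs = {
    1: "Elasticsearch is a search engine",
    2: "A search engine uses an inverted index for search",
}
index = build_inverted_index(docs)

# Look up a term, then walk its posting list.
for posting in index["search"]:
    print(posting.doc_id, posting.term_freq, posting.positions, posting.offsets)
```

The offsets are what make highlighting possible: slicing the original text with a stored `(start, end)` pair returns exactly the matched term, without re-tokenizing the document.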


Origin blog.csdn.net/fanrenxiang/article/details/85274287