Shards and Segments in Elasticsearch

Shard (sharding)
A shard is a Lucene index instance and is, by itself, a complete search engine. An index can consist of a single shard, but in general it is split into multiple shards, which can be placed on different nodes to spread the indexing and search load.
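To make the idea of splitting an index across shards concrete, here is a minimal sketch of how a document might be assigned to a shard. It assumes the simplified rule "hash of the routing key modulo the number of primary shards". The names `pick_shard` and `distribute` are hypothetical, and `zlib.crc32` is only a stand-in for the hash function Elasticsearch uses internally.

```python
import zlib


def pick_shard(routing_key: str, number_of_primary_shards: int) -> int:
    """Map a routing key (for example a document ID) to a shard number."""
    # Assumption: crc32 stands in for Elasticsearch's internal hash function.
    hashed = zlib.crc32(routing_key.encode("utf-8"))
    return hashed % number_of_primary_shards


def distribute(doc_ids, number_of_primary_shards):
    """Group document IDs by the shard they would be routed to."""
    shards = {n: [] for n in range(number_of_primary_shards)}
    for doc_id in doc_ids:
        shards[pick_shard(doc_id, number_of_primary_shards)].append(doc_id)
    return shards
```

Because the shard number depends on the shard count, a scheme like this only works while the number of primary shards stays fixed, which is one reason the shard count is chosen when the index is created.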

Segment
Each shard in Elasticsearch contains multiple segments, and each segment is an inverted index. When a query runs, every segment in the shard is searched, and the per-segment results are merged into the final result that is returned.
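The following toy example illustrates the idea of one inverted index per segment, with a query that visits every segment and merges the hits. It is a sketch only: `build_segment` and `search_shard` are hypothetical helpers, the tokenizer is a plain whitespace split, and real Lucene segments are far more elaborate.

```python
from collections import defaultdict


def build_segment(docs):
    """Build one segment: an inverted index mapping term -> set of doc IDs."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return dict(index)


def search_shard(segments, term):
    """Query every segment and merge the per-segment hits into one result."""
    hits = set()
    for segment in segments:
        hits |= segment.get(term.lower(), set())
    return sorted(hits)


segments = [
    build_segment({1: "quick brown fox", 2: "lazy dog"}),
    build_segment({3: "quick red fox"}),
]
print(search_shard(segments, "quick"))  # [1, 3]
```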
When a document is indexed, Elasticsearch writes it to an in-memory buffer and, for safety, also appends it to the translog. At a regular, configurable interval, the buffered data is written out as a new small segment in the filesystem cache, and a refresh opens that segment so that the documents just written become searchable.
Although the newly written segment is searchable, it has not yet been persisted to disk, so it could still be lost, for example if the node crashes.
Elasticsearch therefore performs a flush operation, which persists the segments to disk and clears the translog, because the data is now safely on disk and the log entries are no longer needed.
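The write path described above (buffer plus translog, then refresh, then flush) can be sketched as a small in-memory model. This is an illustration under simplifying assumptions, not how Elasticsearch is implemented: the class `ToyShard` and its attribute names are hypothetical, and "disk" is just another Python list.

```python
class ToyShard:
    """Toy model of the write path; not Elasticsearch's actual implementation."""

    def __init__(self):
        self.buffer = []            # in-memory indexing buffer, not searchable
        self.translog = []          # operation log kept for crash recovery
        self.cached_segments = []   # searchable, but only in the filesystem cache
        self.durable_segments = []  # searchable and persisted to disk

    def index(self, doc):
        # Every write goes to both the buffer and the translog.
        self.buffer.append(doc)
        self.translog.append(doc)

    def refresh(self):
        # Turn the buffer into a new segment so its documents become searchable.
        if self.buffer:
            self.cached_segments.append(tuple(self.buffer))
            self.buffer = []

    def flush(self):
        # Persist all segments, then drop the translog entries they cover.
        self.refresh()
        self.durable_segments.extend(self.cached_segments)
        self.cached_segments = []
        self.translog = []

    def searchable_docs(self):
        # Documents in any segment are visible; buffered ones are not.
        segments = self.durable_segments + self.cached_segments
        return [doc for segment in segments for doc in segment]
```

In this model a document is invisible to search until `refresh()` runs, and the translog keeps a copy of it until `flush()` has moved its segment to the durable list.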
As the indexed data keeps growing, the number of segments grows with it, and query performance may degrade because every segment has to be searched. Elasticsearch therefore runs segment merges in the background: many small segments are combined into a larger segment, and the small segments are then deleted.
Segments are immutable. When a document is updated, the old version is only marked as deleted and a new document is written. The records marked as deleted are physically removed later, when the segments that contain them are merged.
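A small sketch may help show how immutable segments, delete markers, and merging fit together. It assumes a very simplified representation: each segment is a dict that is never modified after creation, and deletions are tracked in a separate set of `(segment number, document ID)` pairs. The function names `update`, `live_docs`, and `merge` are hypothetical.

```python
def update(segments, deleted, doc_id, new_text):
    """Update = mark the old copy as deleted, then write a new segment."""
    for seg_no, segment in enumerate(segments):
        if doc_id in segment:
            deleted.add((seg_no, doc_id))  # delete marker; segment is untouched
    segments.append({doc_id: new_text})


def live_docs(segments, deleted):
    """What a search sees: every document not marked as deleted."""
    return {
        doc_id: text
        for seg_no, segment in enumerate(segments)
        for doc_id, text in segment.items()
        if (seg_no, doc_id) not in deleted
    }


def merge(segments, deleted):
    """Merge all segments into one, physically dropping deleted documents."""
    return [live_docs(segments, deleted)], set()
```

For example, starting from `segments = [{1: "old", 2: "keep"}]` and `deleted = set()`, calling `update(segments, deleted, 1, "new")` leaves the first segment unchanged and appends a second one; `merge(segments, deleted)` then returns a single segment containing only the live versions of documents 1 and 2.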

Reference: http://stackoverflow.com/questions/15426441/understanding-segments-in-elasticsearch
