3. In-depth explanation of Elasticsearch index storage

1. Can an ES index be stored in memory?

(1) Index storage as described for the early ES1.X versions:

Original address: 
https://www.elastic.co/guide/en/elasticsearch/reference/1.4/index-modules-store.html 
The ES1.X store module controls how index data is stored: an index can be kept in memory or on disk. According to the 1.X documentation, in-memory storage gives better performance but limits the index size to the amount of physical memory available.
In practice, the in-memory option did not improve performance compared to mmap-based storage, so the memory store type was removed in ES2.X.

(2) Index storage in the latest ES2.X version (as of 2016-08-08):

Original address: 
https://www.elastic.co/guide/en/elasticsearch/reference/current/index-modules-store.html 
The memory option has been removed in ES2.X.
ES2.X instead uses the mmap system call to map index files into memory.
The ES experts also recommend using the default value, default_fs, for index storage.


2. ES2.X file system storage type

Filesystem-based storage is the default index storage method, and it comes in several implementations, or storage types. By default, ES picks the best implementation for the operating system: mmapfs on Windows 64bit systems, simplefs on Windows 32bit systems, and default_fs (hybrid niofs and mmapfs) on everything else.
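The selection rule above can be sketched as a small shell function. This is only an illustration of the rule as stated in this article; the function name default_store_type and its two arguments are hypothetical and are not part of ES.

```shell
# Hypothetical helper: mirrors the default selection rule described above.
# $1 = operating system name (e.g. "Windows", "Linux")
# $2 = bitness of the system (32 or 64)
default_store_type() {
  case "$1" in
    Windows*)
      if [ "$2" = "64" ]; then
        echo "mmapfs"
      else
        echo "simplefs"
      fi
      ;;
    *)
      # Every other system falls back to the hybrid niofs/mmapfs type.
      echo "default_fs"
      ;;
  esac
}

default_store_type Linux 64   # prints: default_fs
```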

The storage type can be set in the configuration file elasticsearch.yml:

index.store.type: niofs

You can also specify it when creating an index:

curl -XPUT localhost:9200/my_index -d '{
  "settings": {
    "index.store.type": "niofs"
  }
}'
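To avoid typos in the type name when creating indexes from a script, the request body can be built by a small helper. This is a minimal sketch: the function store_settings_body is hypothetical, and it accepts only the four type names discussed in this article.

```shell
# Hypothetical helper: print the settings body for a given store type,
# rejecting names that are not among the types covered in this article.
store_settings_body() {
  case "$1" in
    simplefs|niofs|mmapfs|default_fs)
      printf '{"settings":{"index.store.type":"%s"}}\n' "$1"
      ;;
    *)
      echo "unknown store type: $1" >&2
      return 1
      ;;
  esac
}

# Usage (assumes a node listening on localhost:9200, as in the example above):
# curl -XPUT localhost:9200/my_index -d "$(store_settings_body niofs)"
```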

Here are all the different storage types supported:

(1) Simple FS (Simple File System)

The simplefs type is a straightforward implementation of file system storage (mapped to Lucene SimpleFSDirectory) that uses a random access file. It performs poorly under concurrency, because multiple threads become a bottleneck. When you need index persistence, it is usually better to use niofs.

(2) NIO FS (NIO file system)

The niofs type stores the shard index on the file system using NIO (mapped to Lucene NIOFSDirectory). It allows multiple threads to read from the same file concurrently. It is not recommended on Windows because of a bug in the Sun Java implementation.

(3) MMap FS (Memory Mapped File System)

The mmapfs type stores the shard index on the file system (mapped to Lucene MMapDirectory) by mapping files into memory (mmap).
Memory mapping reserves a portion of the process's virtual address space equal to the size of the file being mapped. Before using this type, make sure you have enough virtual address space available.
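Because each mapped file takes up address space equal to its size, a rough estimate of the space needed is the total size of the files that would be mapped. The sketch below sums the file sizes in one directory; the function mapped_bytes and the path in the usage comment are hypothetical, and the real layout of an index on disk is not assumed here.

```shell
# Hypothetical helper: sum the sizes (in bytes) of the regular files
# directly inside directory $1. This approximates the virtual address
# space needed if every one of those files were memory-mapped.
mapped_bytes() {
  total=0
  for f in "$1"/*; do
    [ -f "$f" ] || continue        # skip subdirectories and unmatched globs
    size=$(wc -c < "$f")
    total=$((total + size))
  done
  echo "$total"
}

# Usage (the path is a placeholder, not a real ES default):
# mapped_bytes /path/to/shard/index
```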

Virtual memory settings under Linux:

 # sysctl -w vm.max_map_count=262144

To make the change permanent, update the vm.max_map_count setting in /etc/sysctl.conf and reload it:

 # echo "vm.max_map_count=262144" >> /etc/sysctl.conf && sysctl -p
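Before changing anything, it can help to check whether the current limit is already high enough. The sketch below reads the value from /proc/sys/vm/max_map_count, the Linux file behind the sysctl key, and compares it with the 262144 used above; the function name map_count_ok is hypothetical.

```shell
# Value used in the sysctl commands above.
REQUIRED=262144

# Hypothetical helper: succeeds when the current limit ($1) is at least
# the required minimum ($2).
map_count_ok() {
  [ "$1" -ge "$2" ]
}

# Only run the check where the proc file exists (i.e. on Linux).
if [ -r /proc/sys/vm/max_map_count ]; then
  current=$(cat /proc/sys/vm/max_map_count)
  if map_count_ok "$current" "$REQUIRED"; then
    echo "vm.max_map_count=$current is sufficient"
  else
    echo "vm.max_map_count=$current is below $REQUIRED"
  fi
fi
```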

(4) Hybrid MMap / NIO FS (default, default_fs)

The default_fs type is a hybrid of NIO FS and MMap FS that chooses the best file system for each type of file. Currently, only the Lucene term dictionary and doc values files are memory-mapped, which reduces the impact on the operating system. All other files are opened using Lucene NIOFSDirectory. The virtual address space setting described under MMap FS above may also apply if your term dictionaries are large.
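The per-file choice can be pictured as a switch on the file extension. The sketch below assumes that .tim (term dictionary) and .dvd (doc values) are the extensions that get memory-mapped; that mapping is an assumption for illustration, and the function directory_for_file is hypothetical rather than anything ES exposes.

```shell
# Hypothetical helper: report which directory implementation the hybrid
# store would use for a given Lucene file name.
# Assumption: .tim (term dictionary) and .dvd (doc values) are mmapped.
directory_for_file() {
  case "${1##*.}" in
    tim|dvd) echo "mmapfs" ;;   # memory-mapped files
    *)       echo "niofs"  ;;   # everything else goes through NIO
  esac
}

directory_for_file _0.tim   # prints: mmapfs
directory_for_file _0.fdt   # prints: niofs
```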

3. ES index storage summary

In one sentence:
In version 2.X, users do not need to care whether the index is stored in memory or on the hard disk. Use the default storage setting to achieve the best performance.
The default is mmapfs on Windows 64bit systems, simplefs on Windows 32bit systems, and default_fs (hybrid niofs and mmapfs) on everything else, such as Linux systems.

4. Detailed discussion

https://discuss.elastic.co/t/how-to-set-elasticsearch-index-store/57556/2
