1. Common settings
```
{
  "settings": {
    "number_of_shards": "3",    // number of primary shards
    "number_of_replicas": "1",  // number of replicas per primary shard
    "refresh_interval": "5s"    // index-buffer refresh interval
  }
}
```
The most commonly used index settings in ES are the number of primary shards, the number of replicas, and the refresh interval. For how refresh works internally, see the article on Elasticsearch's inverted index and document-indexing principles.
Parameter | Description |
---|---|
index.number_of_replicas | Number of replicas per primary shard; default 1 |
index.number_of_shards | Number of primary shards; can only be set at index creation and cannot be changed afterwards |
index.auto_expand_replicas | Automatically adjust the number of replicas based on the number of available nodes; default false |
index.refresh_interval | Refresh frequency; default 1s, set to -1 to disable refresh |
index.max_result_window | Maximum value of from + size for a search; default 10000 |
index.blocks.read_only | true makes the index and its metadata read-only; false allows writes and metadata changes |
index.blocks.read | true disables read operations against the index |
index.blocks.write | true disables write operations against the index |
index.blocks.metadata | true disables reading and writing of index metadata |
Setting refresh_interval to -1 disables refresh entirely. This is typically useful when adding documents in bulk, for example during a data migration; remember to restore the interval once the import finishes.
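The bulk-load pattern above can be sketched as two settings bodies, one to disable refresh before the import and one to restore it afterwards. These are the payloads you would PUT to the index's `_settings` endpoint (for example with the official client's `indices.put_settings`); the HTTP call itself is omitted so the sketch stays self-contained.

```python
import json

# Body sent before a bulk import: disable refresh entirely.
disable_refresh = {"index": {"refresh_interval": "-1"}}

# Body sent after the import: restore a normal refresh interval
# (5s here, matching the example settings above).
restore_refresh = {"index": {"refresh_interval": "5s"}}

print(json.dumps(disable_refresh))
print(json.dumps(restore_refresh))
```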
2. Translog-related settings
```
{
  "settings": {
    "translog": {
      "flush_threshold_size": "2gb", // flush when the translog reaches 2 GB
      "sync_interval": "30s",        // fsync every 30 seconds
      "durability": "async"          // fsync asynchronously
    }
  }
}
```
The translog settings mainly control how the transaction log is persisted to disk. If you need to guarantee that acknowledged writes are never lost, keep the synchronous durability mode (`"durability": "request"`, the default); `async` trades safety for throughput.
Parameter | Description |
---|---|
index.translog.flush_threshold_ops | Flush after this many operations; unlimited by default |
index.translog.flush_threshold_size | Flush when the translog reaches this size; default 512mb |
index.translog.flush_threshold_period | Flush at least once within this period; default 30m |
index.translog.interval | How often the translog size is checked; default 5s |
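The durability trade-off described above can be made concrete: with `"durability": "async"`, an acknowledged write is only as safe as the last background fsync, so the worst-case window of lost writes equals `sync_interval`. A minimal sketch (the `loss_window_seconds` helper is hypothetical, just to illustrate the relationship):

```python
# Async translog settings from the example above.
translog_settings = {
    "index": {
        "translog": {
            "durability": "async",   # fsync happens in the background...
            "sync_interval": "30s",  # ...every 30 seconds
            "flush_threshold_size": "2gb",
        }
    }
}

def loss_window_seconds(settings: dict) -> int:
    """Worst-case seconds of acknowledged-but-not-yet-fsynced writes
    under async durability (hypothetical helper for illustration)."""
    interval = settings["index"]["translog"]["sync_interval"]
    return int(interval.rstrip("s"))

print(loss_window_seconds(translog_settings))  # → 30
```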
3. Analysis-related settings
```
{
  "settings": {
    "analysis": {
      "char_filter": {}, // character filters
      "tokenizer": {},   // tokenizers
      "filter": {},      // token filters
      "analyzer": {},    // analyzers
      "normalizer": {}   // normalizers
    }
  }
}
```
When ES indexes a document, one important step is analysis: character filtering, tokenization, normalization, token filtering and so on. All of these components are configured under analysis.
For a full analysis configuration, see the example in the next section; for details on each individual component, see the article referenced above.
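The pipeline order matters: character filters rewrite the raw text first, the tokenizer then splits it, and token filters transform the resulting tokens. A toy, pure-Python imitation of that chain (this is only an illustration of the data flow, not Elasticsearch's implementation) mimicking a `mapping` char filter, a whitespace-style tokenizer, and the `lowercase` and `stop` token filters:

```python
import re

def char_filter(text: str) -> str:
    # Like a "mapping" char filter with "&=>and".
    return text.replace("&", "and")

def tokenize(text: str) -> list[str]:
    # Like the whitespace tokenizer: split on runs of whitespace.
    return re.findall(r"\S+", text)

def token_filters(tokens: list[str]) -> list[str]:
    stopwords = {"the", "a"}
    lowered = [t.lower() for t in tokens]              # "lowercase" filter
    return [t for t in lowered if t not in stopwords]  # "stop" filter

def analyze(text: str) -> list[str]:
    # char filter -> tokenizer -> token filters, in that order.
    return token_filters(tokenize(char_filter(text)))

print(analyze("The Cat & the Hat"))  # → ['cat', 'and', 'hat']
```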
4. Complete configuration example
```json
{
  "settings": {
    "number_of_shards": "3",
    "number_of_replicas": "1",
    "refresh_interval": "5s",
    "translog": {
      "flush_threshold_size": "256mb",
      "sync_interval": "30s",
      "durability": "async"
    },
    "analysis": {
      "analyzer": {
        "my_analyzer": {
          "char_filter": ["html_strip", "&_to_and", "replace_dot"],
          "filter": ["lowercase", "filter_stop_one", "filter_stop_two"],
          "tokenizer": "my_tokenizer",
          "type": "custom"
        }
      },
      "char_filter": {
        "&_to_and": {
          "mappings": ["&=>and"],
          "type": "mapping"
        },
        "replace_dot": {
          "pattern": "\\.",
          "replacement": " ",
          "type": "pattern_replace"
        }
      },
      "filter": {
        "filter_stop_one": {
          "stopwords": "_spanish_",
          "type": "stop"
        },
        "filter_stop_two": {
          "stopwords": ["the", "a"],
          "type": "stop"
        }
      },
      "normalizer": {
        "my_normalizer": {
          "char_filter": [],
          "filter": ["lowercase", "asciifolding"],
          "type": "custom"
        }
      },
      "tokenizer": {
        "my_tokenizer": {
          "type": "standard",
          "max_token_length": 5
        }
      }
    }
  }
}
```
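Once an index has been created with these settings, the custom analyzer can be verified with the `_analyze` API. A minimal sketch of the request body you would POST to `/<index>/_analyze` (the HTTP call is left out so the snippet stays self-contained; the analyzer name matches the example above):

```python
import json

# Request body for POST /<index>/_analyze: run "my_analyzer" over a
# sample string and inspect the tokens Elasticsearch returns.
analyze_request = {
    "analyzer": "my_analyzer",
    "text": "Quick & brown.foxes",
}

print(json.dumps(analyze_request, indent=2))
```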