Reindexing does not copy the settings of the source index. Before running _reindex, you should configure the target index yourself, including its mappings, number of shards, number of replicas, and so on.
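As a rough sketch of that preparation step, the target index could be created like this before reindexing. The shard and replica counts and the "field1" keyword mapping are placeholder assumptions, not values taken from the source index; the mapping type "doc" is the type that appears in the responses later in these notes.

```console
PUT test-copy
{
  "settings": {
    "number_of_shards": 5,
    "number_of_replicas": 1
  },
  "mappings": {
    "doc": {
      "properties": {
        "field1": { "type": "keyword" }
      }
    }
  }
}
```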
A first example:
POST _reindex
{
  "source": {
    "index": "test"
  },
  "dest": {
    "index": "test-copy"
  }
}
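_reindex works from a snapshot of the source index. To handle version conflicts, you can set the version_type attribute on the target index (dest). It accepts two options, "internal" and "external". With "internal" (also the behaviour when version_type is omitted), documents are written to the target regardless of version, overwriting any document with the same type and ID. With "external", the version from the source is preserved: missing documents are created, and an existing document is updated only if its version in the target is older than its version in the source.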
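A minimal sketch of where version_type goes in the request, reusing the test and test-copy indexes from the first example; "external" is chosen here only for illustration.

```console
POST _reindex
{
  "source": {
    "index": "test"
  },
  "dest": {
    "index": "test-copy",
    "version_type": "external"
  }
}
```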
If you set op_type to "create" in the parameters of the target index, _reindex only creates documents that do not yet exist in the target index. Every document that already exists causes a version conflict. By default, version conflicts abort the _reindex process and are reported under failures; if you set conflicts to "proceed", _reindex carries on and only counts the conflicting documents. The difference between the two is as follows.
Request parameters:
POST _reindex
{
  "source": {
    "index": "test"
  },
  "dest": {
    "index": "test-copy",
    "op_type": "create"
  }
}
The response is as follows:
{
  "took": 2,
  "timed_out": false,
  "total": 2,
  "updated": 0,
  "created": 0,
  "deleted": 0,
  "batches": 1,
  "version_conflicts": 2,
  "noops": 0,
  "retries": {
    "bulk": 0,
    "search": 0
  },
  "throttled_millis": 0,
  "requests_per_second": -1,
  "throttled_until_millis": 0,
  "failures": [
    {
      "index": "test-copy",
      "type": "doc",
      "id": "2",
      "cause": {
        "type": "version_conflict_engine_exception",
        "reason": "[doc][2]: version conflict, document already exists (current version [1])",
        "index_uuid": "8b78uPjKRmuH_2cqSiPKIA",
        "shard": "2",
        "index": "test-copy"
      },
      "status": 409
    },
    {
      "index": "test-copy",
      "type": "doc",
      "id": "1",
      "cause": {
        "type": "version_conflict_engine_exception",
        "reason": "[doc][1]: version conflict, document already exists (current version [1])",
        "index_uuid": "8b78uPjKRmuH_2cqSiPKIA",
        "shard": "3",
        "index": "test-copy"
      },
      "status": 409
    }
  ]
}
Request parameters with conflicts set to "proceed":
POST _reindex
{
  "conflicts": "proceed",
  "source": {
    "index": "test"
  },
  "dest": {
    "index": "test-copy",
    "op_type": "create"
  }
}
The response is as follows:
{
  "took": 5,
  "timed_out": false,
  "total": 3,
  "updated": 0,
  "created": 0,
  "deleted": 0,
  "batches": 1,
  "version_conflicts": 3,
  "noops": 0,
  "retries": {
    "bulk": 0,
    "search": 0
  },
  "throttled_millis": 0,
  "requests_per_second": -1,
  "throttled_until_millis": 0,
  "failures": []
}
Multiple source indexes can be specified, for example "index": ["source_index_1", "source_index_2"]. You can also limit the number of documents copied from the source index with the top-level size, use query and sort in source, and restrict the copied fields with _source:
POST _reindex
{
  "size": 1,
  "source": {
    "index": "test",
    "sort": {
      "date": "desc"
    },
    "query": {
      "match": {
        "test": "data"
      }
    },
    "_source": ["field1", "field2"]
  },
  "dest": {...}
}
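The multi-index form mentioned above can be sketched the same way. The source index names are the ones used in the text; the target name "combined_index" is a hypothetical placeholder.

```console
POST _reindex
{
  "source": {
    "index": ["source_index_1", "source_index_2"]
  },
  "dest": {
    "index": "combined_index"
  }
}
```

If documents in the two source indexes share the same ID, they would end up competing for the same document in the target, so this form is best suited to sources whose IDs do not collide.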
_reindex supports scripts that modify documents as they are copied.
For example, if the source documents have a field named "flag" that should be called "tag" in the target documents, you can run the following request:
POST _reindex
{
  "source": {
    "index": "test"
  },
  "dest": {
    "index": "test2"
  },
  "script": {
    "source": "ctx._source.tag = ctx._source.remove(\"flag\")"
  }
}
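One caveat with that script: for a document that has no "flag" field, remove returns null, so "tag" would be set to null. A hedged variant, assuming that behaviour is unwanted, guards the rename with containsKey so that such documents are copied unchanged:

```console
POST _reindex
{
  "source": {
    "index": "test"
  },
  "dest": {
    "index": "test2"
  },
  "script": {
    "source": "if (ctx._source.containsKey(\"flag\")) { ctx._source.tag = ctx._source.remove(\"flag\") }"
  }
}
```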
Reindexing from a remote Elasticsearch cluster:
POST _reindex
{
  "source": {
    "remote": {
      "host": "http://otherhost:9200",
      "username": "user",
      "password": "pass"
    },
    "index": "source",
    "query": {
      "match": {
        "test": "data"
      }
    }
  },
  "dest": {
    "index": "dest"
  }
}
Remote hosts must be explicitly whitelisted in elasticsearch.yml with the reindex.remote.whitelist setting:
reindex.remote.whitelist: ["first-host:9200", "second-host:9200"]
Reindexing from a remote cluster uses an on-heap buffer that defaults to a maximum size of 100 MB. If the documents in the source index are large, use a smaller batch size by setting size inside source. This is not the same as the top-level size mentioned earlier, which limits the total number of documents copied.
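As a rough sketch of tuning the batch size for a remote source, size can be placed inside source. The value 10 is an arbitrary, deliberately small assumption, and the host and index names simply reuse the ones from the remote example above.

```console
POST _reindex
{
  "source": {
    "remote": {
      "host": "http://otherhost:9200"
    },
    "index": "source",
    "size": 10
  },
  "dest": {
    "index": "dest"
  }
}
```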
You can specify socket_timeout and connect_timeout inside the remote block; if they are not specified, both default to 30 seconds.
POST _reindex
{
  "source": {
    "remote": {
      "host": "http://otherhost:9200",
      "socket_timeout": "1m",
      "connect_timeout": "10s"
    },
    "index": "source"
  },
  "dest": {
    "index": "dest"
  }
}
For more functions, see the official documentation: https://www.elastic.co/guide/en/elasticsearch/reference/6.1/docs-reindex.html