Elasticsearch performance optimization summary 04

How does Elasticsearch implement master election?

  • Elasticsearch's master election is handled by the ZenDiscovery module, which consists of two parts: Ping (the RPC by which nodes discover each other) and Unicast (the unicast module, which holds the list of hosts that need to be pinged);
  • All master-eligible nodes (node.master: true) are sorted by nodeId in dictionary order; in each election, every node ranks the nodes it knows about in this order and votes for the first node (position 0), tentatively treating it as the master.
  • If a node's vote count reaches a quorum (n/2 + 1 of the master-eligible nodes) and the node also votes for itself, that node becomes the master. Otherwise the election is re-run until these conditions are met.
  • Note: the master node's responsibilities include management of the cluster, nodes, and indexes, but not document-level management; a data node can have its HTTP function turned off. (A minimal sketch of the election logic follows below.)
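
A rough Python sketch of the election logic described above, assuming a simplified vote-collection step; this is an illustration, not Elasticsearch source code:

```python
def elect_master(master_eligible_ids, votes):
    """votes: mapping of voter nodeId -> candidate nodeId, gathered via ping."""
    candidate = sorted(master_eligible_ids)[0]   # first node in dictionary order
    quorum = len(master_eligible_ids) // 2 + 1   # n/2 + 1
    received = sum(1 for c in votes.values() if c == candidate)
    # the candidate must reach quorum and must also have voted for itself
    if received >= quorum and votes.get(candidate) == candidate:
        return candidate
    return None  # no master elected; re-run the election

# nodeA wins: all three master-eligible nodes vote for the lowest nodeId
print(elect_master(["nodeA", "nodeB", "nodeC"],
                   {"nodeA": "nodeA", "nodeB": "nodeA", "nodeC": "nodeA"}))
```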

Describe in detail how Elasticsearch indexes a document.

  • The coordinating node computes the target shard from the document ID by default (custom routing is also supported):
shard = hash(document_id) % (num_of_primary_shards)
  • When the node holding the target shard receives the request from the coordinating node, it writes the request to the Memory Buffer, which is then periodically (every 1 second by default) written to the Filesystem Cache; this step, from Memory Buffer to Filesystem Cache, is called refresh;
  • Of course, in some cases the data in the Memory Buffer or Filesystem Cache can be lost, so ES guarantees data reliability through the translog mechanism: after a request is received, it is also written to the translog; once the data in the Filesystem Cache has been flushed to disk, the translog is cleared. This step is called flush;
  • During flush, the memory buffer is cleared, its contents are written to a new segment, the segments are fsynced to disk, a new commit point is created, the old translog is deleted, and a new translog is started.
  • A flush is triggered either by a timer (every 30 minutes by default) or when the translog grows too large (512 MB by default); the sketch below shows these settings being adjusted.
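
A minimal sketch, assuming a local cluster and a hypothetical index named my_index, of adjusting the two real index settings that correspond to the refresh and flush behavior above:

```python
import requests

# Both are dynamically updatable index settings; the values shown here
# are simply the documented defaults.
requests.put(
    "http://localhost:9200/my_index/_settings",
    json={
        "index": {
            "refresh_interval": "1s",                  # Memory Buffer -> Filesystem Cache cadence
            "translog.flush_threshold_size": "512mb",  # translog size that triggers a flush
        }
    },
)
```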

 

Which settings can optimize an Elasticsearch deployment on Linux?

  • A machine with 64 GB of memory is ideal, but 32 GB and 16 GB machines are also very common. Less than 8 GB is counterproductive.
  • If you have to choose between faster CPUs and more cores, choose more cores. The extra concurrency that multiple cores provide matters far more than a slightly faster clock frequency.
  • If you can afford SSDs, they far outperform any rotating media; SSD-backed nodes see improvements in both query and indexing performance. If you can afford them, SSDs are a good choice.
  • Avoid clusters that span multiple data centers, even data centers in close proximity. Absolutely avoid clusters that span large geographic distances.
  • Make sure the JVM running your application server is exactly the same as the JVM running Elasticsearch; in several places Elasticsearch uses Java's native serialization.
  • By setting gateway.recover_after_nodes, gateway.expected_nodes, and gateway.recover_after_time you can avoid excessive shard swapping when the cluster restarts, which can cut data recovery time from hours down to seconds.
  • Elasticsearch is configured to use unicast discovery by default, to prevent nodes from joining a cluster unintentionally. Only nodes running on the same machine form a cluster automatically. Prefer unicast over multicast.
  • Do not arbitrarily change the garbage collector (CMS by default) or the sizes of the thread pools.
  • Give half (or less) of your memory to the ES heap, set through the ES_HEAP_SIZE environment variable, but never more than 32 GB! Leave the rest for Lucene's filesystem cache.
  • Swapping memory to disk is fatal to server performance: if memory swaps, a 100-microsecond operation can become one that takes 10 milliseconds. Now add up all those 10-millisecond delays, and it is easy to see how terrible swapping is for performance.
  • Lucene uses a large number of files, and Elasticsearch also uses a large number of sockets for communication between nodes and with HTTP clients. All of this requires enough file descriptors. Raise your file descriptor limit to a large value, such as 64,000; the sketch below shows how to verify what each node actually received.
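
A minimal sketch, assuming a local cluster, that checks the file descriptor limit each node actually received, using the real _nodes/stats/process API:

```python
import requests

stats = requests.get("http://localhost:9200/_nodes/stats/process").json()
for node_id, node in stats["nodes"].items():
    # max_file_descriptors should be a large value such as 64000
    print(node["name"], node["process"]["max_file_descriptors"])
```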

 

Addendum: tuning methods for the indexing stage

  • Use bulk requests and size them appropriately: 5-15 MB of data per bulk request is a good starting point.
  • Storage: use SSDs.
  • Segments and merging: Elasticsearch's default merge throttle is 20 MB/s, which is a good setting for mechanical disks. If you are on SSDs, consider raising it to 100-200 MB/s. If you are doing a bulk import and do not care about search at all, you can turn merge throttling off entirely. You can also raise index.translog.flush_threshold_size from its default of 512 MB to something larger, such as 1 GB, so that larger segments can accumulate in the translog before a flush is triggered.
  • If your search results do not need near-real-time accuracy, consider changing index.refresh_interval on each index to 30s.
  • If you are doing a large bulk import, consider turning replicas off by setting index.number_of_replicas: 0. (A bulk request sketch follows below.)
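
A minimal sketch of the real _bulk endpoint, assuming a local cluster; the index name my_index and the generated documents are illustrative:

```python
import json
import requests

docs = [{"title": f"post {i}"} for i in range(1000)]

# _bulk takes NDJSON: one action line, then one source line per document
lines = []
for i, doc in enumerate(docs):
    lines.append(json.dumps({"index": {"_index": "my_index", "_id": i}}))
    lines.append(json.dumps(doc))
body = "\n".join(lines) + "\n"   # the body must end with a newline

resp = requests.post(
    "http://localhost:9200/_bulk",
    data=body.encode("utf-8"),
    headers={"Content-Type": "application/x-ndjson"},
)
print(resp.json()["errors"])  # False if every action succeeded
```

In practice, size each batch to roughly 5-15 MB of payload rather than a fixed document count.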

 

How does Elasticsearch implement aggregations over large amounts of data (on the order of hundreds of millions of documents)?

  • Elasticsearch provides the cardinality metric, its first approximate aggregation. It returns the cardinality of a field, i.e. the number of distinct (unique) values of that field. It is based on the HLL (HyperLogLog) algorithm: HLL first hashes the input, then makes a probabilistic estimate of the cardinality from the bits of the hash. Its characteristics: configurable accuracy that controls memory usage (more accuracy = more memory); very high accuracy on small data sets; and a configuration parameter that fixes the amount of memory used for deduplication. Whether there are thousands or billions of unique values, memory usage depends only on the accuracy you configure. (A usage sketch follows below.)
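
A minimal sketch of the real cardinality aggregation, assuming a hypothetical index my_index with a user_id field; precision_threshold is the real parameter that trades memory for accuracy, as described above:

```python
import requests

query = {
    "size": 0,  # we only want the aggregation, not the hits
    "aggs": {
        "distinct_users": {
            "cardinality": {"field": "user_id", "precision_threshold": 10000}
        }
    },
}
resp = requests.post("http://localhost:9200/my_index/_search", json=query)
print(resp.json()["aggregations"]["distinct_users"]["value"])
```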

 

Under concurrency, how does Elasticsearch ensure consistent reads and writes?

  • Optimistic concurrency control via version numbers can ensure that a new version is never overwritten by an old one, leaving the handling of specific conflicts to the application layer (see the sketch after this list);
  • For write operations, the consistency level supports quorum/one/all and defaults to quorum, i.e. a write operation is allowed only when a majority of the shards are available. But even when a majority is available, a write to a replica can still fail, for example because of the network; in that case the replica is considered failed and the shard is rebuilt on a different node.
  • For read operations, replication can be set to sync (the default), which makes the operation return only after both the primary shard and the replica shards have completed; if replication is set to async, you can also set the request parameter _preference to primary so that the search queries the primary shard, ensuring the document is the latest version.
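
A minimal sketch of optimistic concurrency control using external version numbers (a real Elasticsearch feature); the index, id, and payloads are assumptions:

```python
import requests

url = "http://localhost:9200/my_index/_doc/1"

# write version 5 of the document
requests.put(url, params={"version": 5, "version_type": "external"},
             json={"stock": 41})

# a stale writer replaying an older version is rejected, so an old
# version can never overwrite a newer one
resp = requests.put(url, params={"version": 4, "version_type": "external"},
                    json={"stock": 42})
print(resp.status_code)  # 409 Conflict
```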

 

1.1 Design-phase tuning
1) Based on incremental business needs, create indexes from date-based templates and roll them over with the rollover API (see the sketch after this list);
2) Use aliases for index management;
3) Run a force_merge on the index at a scheduled off-peak time (e.g. early morning) to free up space;
4) Adopt a hot/cold separation mechanism: store hot data on SSDs to improve retrieval efficiency, and periodically shrink cold data to reduce its storage;
5) Use curator for index lifecycle management;
6) Enable analysis only for the fields that need it, and choose analyzers sensibly;
7) In the Mapping phase, plan the attributes of each field thoroughly: whether it needs to be indexed for retrieval, whether it needs to be stored, and so on.
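
A minimal sketch of the real _rollover API, assuming a hypothetical write alias blog_write; the rollover conditions are illustrative:

```python
import requests

resp = requests.post(
    "http://localhost:9200/blog_write/_rollover",
    json={"conditions": {"max_age": "1d", "max_docs": 50000000}},
)
# the response reports whether a new index was created and which conditions matched
print(resp.json())
```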


1.2 Write tuning
1) Set the number of replicas to 0 before writing;
2) Set refresh_interval to -1 before writing, disabling the refresh mechanism;
3) During writing: use bulk batch writes;
4) Restore the number of replicas and the refresh interval after writing (a toggle sketch follows below);
5) Use auto-generated IDs wherever possible.
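
A minimal sketch that toggles the settings from steps 1), 2), and 4) above; the index name and the restored values are assumptions:

```python
import requests

def set_bulk_mode(index, enabled):
    settings = (
        {"number_of_replicas": 0, "refresh_interval": "-1"}       # before the bulk load
        if enabled
        else {"number_of_replicas": 1, "refresh_interval": "1s"}  # after the bulk load
    )
    requests.put(f"http://localhost:9200/{index}/_settings",
                 json={"index": settings})

set_bulk_mode("my_index", True)
# ... bulk writes go here ...
set_bulk_mode("my_index", False)
```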


1.3 Query tuning
1) Disable wildcard queries;
2) Disable batched terms queries (the kind with hundreds of terms);
3) Make full use of the inverted index mechanism: use the keyword type wherever possible;
4) When the data volume is large, narrow the index down by time range first, then retrieve;
5) Set up a sensible routing mechanism (see the sketch after this list).
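
A minimal sketch of points 3) and 5): an exact term query against a hypothetical keyword field, routed to a single shard via the real routing parameter:

```python
import requests

resp = requests.post(
    "http://localhost:9200/my_index/_search",
    params={"routing": "user_123"},  # only the shard for this routing value is searched
    json={"query": {"term": {"status": "published"}}},
)
print(resp.json()["hits"]["total"])
```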
 

The underlying implementation of the inverted index is based on the FST (Finite State Transducer) data structure.
Lucene has made heavy use of FSTs since version 4+. An FST has two advantages:

1) Small footprint. By reusing the prefixes and suffixes of the words in the dictionary, it compresses the storage space;
2) Fast queries: O(len(str)) query time complexity. (The sketch below illustrates this with a plain trie.)
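
A minimal trie sketch, a simpler cousin of the FST that shares prefixes only, just to illustrate the O(len(str)) lookup above; a real FST also shares suffixes and maps terms to outputs:

```python
def build_trie(words):
    root = {}
    for word in words:
        node = root
        for ch in word:
            node = node.setdefault(ch, {})  # shared prefixes are stored once
        node["$"] = True                    # end-of-word marker
    return root

def contains(trie, word):
    node = trie
    for ch in word:          # one step per character: O(len(word))
        if ch not in node:
            return False
        node = node[ch]
    return "$" in node

trie = build_trie(["search", "seam", "seat"])   # the prefix "sea" is stored once
print(contains(trie, "seat"), contains(trie, "sea"))  # True False
```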
 

3. An Elasticsearch index has more and more data: what do you do, and how do you tune the deployment?
Interviewer's intent: to probe operations and maintenance ability at large data volumes.
Answer: Plan the index data well in advance; as the saying goes, "design first, code later". This effectively prevents a sudden surge of data from exceeding the cluster's processing capacity and affecting online retrieval or other business.
How to tune? As Question 1 said; refining it here:

3.1 Dynamic index level
Create indexes based on template + time + the rollover API. For example, define in the design phase that the blog index template has the format blog_index_timestamp, with incremental data every day.

The advantage of this: a surge in data volume cannot make any single index extremely large, approaching the per-shard limit of 2^31 - 1 documents, with index storage reaching TB+ or more.

Once a single index gets that large, storage and other risks crop up with it, so think ahead and avoid the problem as early as possible. (A template sketch follows below.)
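
A minimal sketch, using the legacy _template API (real, though since superseded by composable index templates), of a template matching daily blog_index_* indices; the settings values are assumptions:

```python
import requests

requests.put(
    "http://localhost:9200/_template/blog_template",
    json={
        "index_patterns": ["blog_index_*"],
        "settings": {"number_of_shards": 3, "number_of_replicas": 1},
    },
)
```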

3.2 Storage level
Separate hot and cold data storage: hot data (for example, the last 3 days' or last week's data), with everything else as cold data.
For cold data that will receive no new writes, consider periodic shrink and force_merge compaction operations to save storage space and improve retrieval efficiency; the sketch below shows the force_merge step.
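
A minimal sketch of the real _forcemerge API against a hypothetical cold index; merging down to a single segment suits read-only data, as described above:

```python
import requests

requests.post(
    "http://localhost:9200/blog_index_20200101/_forcemerge",
    params={"max_num_segments": 1},
)
```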

3.3 Deployment level
If no planning was done beforehand, this is the contingency strategy.
Combine ES's inherent support for dynamic scaling and add machines dynamically to relieve pressure on the cluster. Note: if the master nodes and the like were planned reasonably beforehand, the new nodes can be added dynamically without restarting the cluster.
 

Lucene internal structure


 

Describe in detail the Elasticsearch update and delete document processes.
(1) Deletes and updates are also write operations, but documents in Elasticsearch are immutable, so they cannot be deleted, nor modified in place to show the change;

(2) Each segment on disk has a corresponding .del file. When a delete request arrives, the document is not really deleted but is marked as deleted in the .del file. The document can still match queries, but it is filtered out of the results. When segments are merged, documents marked as deleted in the .del file are not written into the new segment.

(3) When a new document is created, Elasticsearch assigns it a version number. When an update is performed, the old version of the document is marked as deleted in the .del file, and the new version is indexed into a new segment. The old version can still match queries, but it is filtered out of the results. (The sketch below shows the version number advancing.)
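
A minimal sketch (hypothetical index and id): indexing the same id twice demonstrates the update-as-reindex behavior above, with the second write marking version 1 as deleted and indexing version 2 into a new segment:

```python
import requests

url = "http://localhost:9200/my_index/_doc/1"
v1 = requests.put(url, json={"title": "old"}).json()["_version"]
v2 = requests.put(url, json={"title": "new"}).json()["_version"]
print(v1, v2)  # 1 2
```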
 

2. Describe in detail the Elasticsearch search process.
(1) Search is executed as a two-phase process that we call Query Then Fetch;

(2) In the initial query phase, the query is broadcast to a copy (primary or replica) of every shard in the index. Each shard executes the search locally and builds a priority queue of matching documents of size from + size.

PS: a search queries the Filesystem Cache, but some data is still in the Memory Buffer, which is why search is near-real-time.

(3) Each shard returns the IDs and sort values of all the documents in its priority queue to the coordinating node, which merges these values into its own priority queue to produce a globally sorted list of results.

(4) Next comes the fetch phase: the coordinating node identifies which documents need to be fetched and submits multiple GET requests to the relevant shards. Each shard loads and, if necessary, enriches the documents, then returns them to the coordinating node. Once all the documents have been fetched, the coordinating node returns the results to the client.

(5) Supplement: Query Then Fetch scores document relevance using only the data of each individual shard, which can be inaccurate when the document count is small. DFS Query Then Fetch adds a pre-query phase that asks each shard for term and document frequencies; its scores are more accurate, but its performance is worse. (The sketch below shows how to request it.)
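
A minimal sketch (hypothetical index and field) of requesting the more accurate scoring mode via the real search_type parameter:

```python
import requests

resp = requests.post(
    "http://localhost:9200/my_index/_search",
    params={"search_type": "dfs_query_then_fetch"},
    json={"query": {"match": {"title": "elasticsearch"}}, "from": 0, "size": 10},
)
print(resp.json()["hits"]["hits"])
```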
 

 
