Elasticsearch heap memory

Reprinted from: https://www.lbbniu.com/6148.html

1. What is the heap?

The heap is the largest block of memory managed by the JVM, used mainly to store object instances.
In Java, the heap is divided into two distinct areas:

the Young Generation (Young), and

the Old Generation (Old).

The Young Generation is further divided into three regions:

Eden,

From Survivor,

To Survivor.

The JVM divides the heap this way to manage it more effectively, in particular memory allocation and reclamation.
2. What is the role of heap memory?

The heap is created when the virtual machine starts.

The sole purpose of the heap is to hold object instances; all object instances and arrays are allocated on the heap.

The heap is the area managed by the garbage collector and is therefore also called the "GC heap". It uses a generational garbage collection algorithm, which is why the heap is divided into the Young and Old Generations.

The advantage is that heap size can be allocated dynamically, and object lifetimes need not be known by the compiler in advance, because memory is allocated dynamically at run time and Java's garbage collector automatically reclaims data that is no longer in use.

The disadvantage is that dynamic allocation at run time makes memory access slower. When the heap is full and cannot be extended, a java.lang.OutOfMemoryError: Java heap space exception is thrown; resolving this falls under JVM tuning.
3. How is heap memory configured?

By default (5.X and later), Elasticsearch sets both the minimum and maximum JVM heap size to 2 GB.

Earlier versions defaulted to 1 GB, which the official documentation says is obviously not enough.

When moving to production, it is essential to configure a heap size large enough to guarantee Elasticsearch's capacity and performance.

Elasticsearch allocates the entire heap specified in jvm.options via the Xms (minimum heap size) and Xmx (maximum heap size) settings.

For example:

Option 1:
Set the heap in the jvm.options configuration file:

-Xms2g
-Xmx2g

Option 2:
Set an environment variable.
Comment out the Xms and Xmx entries in jvm.options and set the values via ES_JAVA_OPTS instead:

ES_JAVA_OPTS="-Xms2g -Xmx2g" ./bin/elasticsearch 
ES_JAVA_OPTS="-Xms4000m -Xmx4000m" ./bin/elasticsearch

4. What determines the heap size?

The heap size depends on the amount of memory available on the server.
5. Heap memory configuration recommendations

Set the minimum heap size (Xms) and the maximum heap size (Xmx) equal to each other.

The more heap available to Elasticsearch, the more memory is available for caching. But be aware that too much heap can expose you to long garbage collection pauses.

Set Xmx to no more than 50% of physical memory, to ensure enough physical memory is left for the kernel's file system cache.

Do not set Xmx above the JVM's 32 GB cutoff.

Recommended size: the minimum of half the host's memory and 31 GB.
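The sizing rule above can be sketched as a small shell helper (a hypothetical function, not part of Elasticsearch; the half-of-RAM rule and the 31 GB cap come from this article):

```shell
# Hypothetical helper: recommended Elasticsearch heap in MB for a host,
# following the article's rule: min(half of physical RAM, 31 GB).
recommended_heap_mb() {
  local ram_mb=$1
  local half=$(( ram_mb / 2 ))
  local cap=$(( 31 * 1024 ))   # 31 GB ceiling, staying under the compressed-oops limit
  if [ "$half" -lt "$cap" ]; then
    echo "$half"
  else
    echo "$cap"
  fi
}

recommended_heap_mb 16384    # 16 GB host  -> 8192  (8 GB heap)
recommended_heap_mb 131072   # 128 GB host -> 31744 (31 GB heap)
```

The result can then be used for the -Xms/-Xmx values in jvm.options or ES_JAVA_OPTS.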

6. Why should the heap not exceed half of the physical machine's memory?

The heap is absolutely important to Elasticsearch: many in-memory data structures use it to provide fast operations. But there is another very important memory consumer: Lucene.

Lucene is designed to leverage the underlying operating system for caching in-memory data structures. Lucene segments are stored in individual files. Because segments are immutable, these files never change, which makes them very cache-friendly: the underlying OS will happily keep hot segments resident in memory for faster access. These segments include the inverted index (used for full-text search) and doc values (used for aggregations).

Lucene's performance depends on this interaction with the OS. If you give all available memory to the Elasticsearch heap, there will be nothing left over for Lucene, and performance will suffer badly.

The standard recommendation is to give 50% of the available memory to the Elasticsearch heap and leave the other 50% free. It will not sit idle; Lucene will happily devour whatever is left for the file system cache.

If you do not aggregate on analyzed string fields (e.g. you do not need fielddata), consider lowering the heap even further. The smaller the heap, the better performance you can expect from both Elasticsearch (faster GCs) and Lucene (more memory for caching).
7. Why should the heap not exceed 32 GB?

In Java, all objects are allocated on the heap and referenced by pointers. Ordinary object pointers (oops) point at these objects and are traditionally the CPU's native word size: 32 bits or 64 bits, depending on the processor.

On a 32-bit system this means the maximum heap size is 4 GB. On 64-bit systems the heap can be much larger, but 64-bit pointers carry overhead simply because they are bigger and waste more space. Worse than the wasted space, larger pointers also consume more bandwidth when moving values between main memory and the various caches (LLC, L1, and so on).

Java uses a technique called compressed oops to get around this problem. Instead of pointing at exact byte locations in memory, the pointers reference object offsets. This means a 32-bit pointer can address four billion objects, rather than four billion bytes. Ultimately, the heap can grow to about 32 GB of physical size while still using 32-bit pointers.
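The ~32 GB figure follows from simple arithmetic: HotSpot aligns objects on 8-byte boundaries by default, so a 32-bit compressed pointer stores an 8-byte-scaled offset rather than a raw byte address:

```shell
# With compressed oops, a 32-bit reference addresses object offsets,
# each scaled by the JVM's default 8-byte object alignment.
alignment=8
addressable_bytes=$(( (1 << 32) * alignment ))
echo "$addressable_bytes bytes"                          # 34359738368 bytes
echo "$(( addressable_bytes / 1024 / 1024 / 1024 )) GB"  # = 32 GB
```

To check whether a given heap size still uses compressed oops on a real JVM, one can run `java -Xmx31g -XX:+PrintFlagsFinal -version | grep UseCompressedOops`.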

Once you cross that magical ~32 GB boundary, the pointers switch back to ordinary object pointers. Each pointer gets larger, more CPU-to-memory bandwidth is consumed, and you effectively lose memory. In fact, it takes a heap of roughly 40-50 GB before you regain the same effective memory as a heap just under 32 GB using compressed oops.

In summary: even if you have memory to spare, try to avoid crossing the 32 GB heap boundary.
Otherwise you waste memory, reduce CPU performance, and leave the GC struggling with an enormous heap.
8. What if my machine has a huge amount of memory?

Suppose I have a machine with 1 TB of RAM!

32 GB is an important baseline. So what do you do when your machine has far more memory than that? Super-servers with 512-768 GB of RAM are becoming increasingly common.

First, we recommend avoiding the use of such a large machine.

But if you already have these machines, you have three practical options:

  1. Are you doing mostly full-text search?

Consider giving Elasticsearch 4-32 GB and leaving the rest to Lucene via the operating system's file system cache. All that memory will cache segments, making full-text search blazingly fast.

  2. Are you doing a lot of sorting/aggregation?

Mostly on numerics, dates, geo points, and not_analyzed strings? You're in luck: your aggregations will be done on memory-friendly doc values!

Give Elasticsearch somewhere from 4-32 GB of memory and leave the rest for the operating system to cache doc values in memory.

  3. Are you doing a lot of sorting/aggregation on analyzed strings (e.g. for word tags or SigTerms)?

    Unfortunately, this means you need fielddata, which means you need heap space.

    Consider running two or more nodes on the one machine instead of a single node with a huge amount of RAM.

    Even so, stick to the 50% rule.

Summary for memory-rich machines:

So if your machine has 128 GB of RAM, run two nodes, each with a heap under 32 GB. That means less than 64 GB goes to the heaps, and Lucene gets the remaining 64+ GB.

If you choose this option, set cluster.routing.allocation.same_shard.host: true in your configuration. This prevents a primary and its replica shards from sharing the same physical machine (which would eliminate the high-availability benefit of replicas).
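As a sketch (node names, heap sizes, and data paths are illustrative assumptions, not from the article), two nodes on one 128 GB host might be started like this, each with its own sub-32 GB heap and the same-host safeguard enabled:

```shell
# Illustrative only: two Elasticsearch nodes sharing one large physical host.
# Each node gets its own sub-32 GB heap and its own data path;
# same_shard.host keeps a primary and its replica off the same machine.
ES_JAVA_OPTS="-Xms31g -Xmx31g" ./bin/elasticsearch \
  -E node.name=node-1 \
  -E path.data=/var/lib/elasticsearch/node-1 \
  -E cluster.routing.allocation.same_shard.host=true &

ES_JAVA_OPTS="-Xms31g -Xmx31g" ./bin/elasticsearch \
  -E node.name=node-2 \
  -E path.data=/var/lib/elasticsearch/node-2 \
  -E cluster.routing.allocation.same_shard.host=true &
```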

9. Heap memory optimization suggestions

Method 1: the best option is to completely disable swap on the system.
This can be done temporarily with:

sudo swapoff -a

To permanently disable it, you may need to edit your /etc/fstab.

Method 2: reduce how aggressively the operating system swaps memory.

If completely disabling swap is not an option, you can try lowering swappiness. This value controls how aggressively the operating system tries to swap memory. It prevents swapping under normal circumstances but still allows the OS to swap in emergency memory situations.

On most Linux systems, this value is configured using sysctl:

vm.swappiness = 1

Note the value is 1 rather than 0, because on some kernel versions a swappiness of 0 can invoke the OOM killer.
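The vm.swappiness line above belongs in /etc/sysctl.conf to persist across reboots; as a sketch, both the immediate and the persistent forms look like this (standard sysctl usage, requires root):

```shell
# Apply immediately (lost on reboot):
sudo sysctl -w vm.swappiness=1

# Persist across reboots:
echo 'vm.swappiness = 1' | sudo tee -a /etc/sysctl.conf
```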

Method 3: use mlockall to lock the JVM's memory and prevent the operating system from swapping it out.

Finally, if neither of the above is feasible, enable mlockall. This lets the JVM lock its memory so the operating system cannot swap it out. In your elasticsearch.yml, set:

bootstrap.mlockall: true

(In Elasticsearch 5.0 and later this setting was renamed bootstrap.memory_lock.)

10. Caveats

Modifying JVM-related settings is easy, but it readily produces opaque effects that are hard to measure and can ultimately degrade your cluster into a slow, unstable mess.

When debugging a cluster, the first step is usually to remove all custom configuration. About half the time, that alone restores stability and performance.

11. A newer insight

wood @ Ctrip:
In fact, the memory allocated to ES has another magic limit at 26 GB.

Staying under it ensures that zero-based compressed oops are enabled, which gives the best performance.
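Whether a given heap size still gets zero-based compressed oops can be checked with HotSpot's diagnostic flags (real JVM flags; the exact log wording varies by JVM version):

```shell
# Prints the compressed-oops mode for a 26 GB heap; look for "Zero based"
# in the output. Above the zero-based threshold the JVM falls back to
# heap-base addressing, which costs a little performance.
java -Xmx26g -XX:+UnlockDiagnosticVMOptions -XX:+PrintCompressedOopsMode -version
```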

Reference: https://elasticsearch.cn/question/3995
https://www.elastic.co/blog/a-heap-of-trouble
12. Summary

This article consolidates the official documentation with articles on the underlying principles; its main purpose is to organize my understanding.

Remember: take the minimum of half the host's memory and 31 GB.

13. References

Basics: http://t.cn/RH4DDYu

Settings: http://t.cn/RmKbO1i

Recommendations: http://t.cn/RmKbjsF

Caveats: http://t.cn/RmKbHp5

Heap: http://t.cn/RmKbRji


Origin www.cnblogs.com/sanduzxcvbnm/p/12095809.html