jvm8: Tuning in practice

JVM parameters

  1. General JVM parameters

-server

If neither -server nor -client is configured, the JVM selects a mode automatically according to the hardware configuration of the machine it runs on. Server mode starts more slowly, but its runtime performance is optimized, which makes it suitable for JVMs running on the server side.

-client

Client mode starts relatively quickly, but its runtime performance is not optimized as it is in server mode. It is suitable for development and testing on personal PCs.

-Xmx

Set the maximum size of the Java heap; the default is 1/4 of the machine's physical memory. This value determines how much Java heap memory is available. If it is too small, the application runs into OOM (Out Of Memory) errors when it needs a lot of memory for caches or temporary objects. If it is too large, less memory is left for the permanent generation (PermSize), which can cause a different kind of Out Of Memory error. The right value should be determined by analyzing and measuring the running application; if in doubt, keep the default configuration.

-Xms

Set the initial size of the Java heap; the default is 1/64 of the machine's physical memory. The value should be chosen according to the resources the application consumes at startup. If it is too small, the application slows down; if it is too large, memory is wasted.

-XX:PermSize

Set the initial size of the permanent memory area. Its full name is Permanent Generation space (PermGen), the area of memory that stores loaded classes. PermGen space is generally not cleaned up while the program is running, so an application that loads a large number of classes is likely to hit a PermGen space error. This error is common when a web server precompiles JSPs. It also occurs when a web application uses a large number of third-party JARs whose classes exceed the JVM's default PermSize (4M).

-XX:MaxPermSize

Set the maximum size of the permanent memory area.

-Xmn

Set the size of the young generation directly. The total memory available to the JVM = young generation size + old generation size + permanent generation size. The permanent generation generally has a fixed size of 64M, so with the total fixed, enlarging the young generation shrinks the old generation. This value has a large impact on system performance; Sun officially recommends 3/8 of the entire heap.

According to Sun's recommended ratio, the young generation for a 2048M heap should be 2048*3/8=768M.

-XX:NewRatio

Control the default size of the young generation as a ratio. For example, -XX:NewRatio=3 sets the ratio of the young generation to the old generation to 1:3; in other words, Eden plus the Survivor spaces take up 1/4 of the entire heap.

In the system actually measured, -XX:NewRatio=2 and -Xmx2048M are set, so the young and old generations are split 1:2: the young generation is 682M and the old generation is 1365M. The JVM monitoring results of the actual system are:

Memory pool name: Tenured Gen

The amount of memory that the Java virtual machine initially requested from the operating system: 3,538,944 bytes

The amount of memory that the operating system has currently committed to the Java virtual machine: 1,431,699,456 bytes

The maximum amount of memory that the Java virtual machine can obtain from the operating system: 1,431,699,456 bytes. Please note that this amount of memory is not guaranteed to be available.

The amount of memory used by the Java virtual machine at this time: 1,408,650,472 bytes

That is, the old generation is 1,431,699,456 bytes = 1365M (of which 1,408,650,472 bytes, about 1343M, are in use), which confirms the calculation above.
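The same four figures (initial, committed, maximum, used) can also be read from inside the JVM through the standard java.lang.management API. The following is only a minimal sketch; the class name MemoryPoolReport is made up for illustration, and the pool names it prints depend on the collector in use (for example "Tenured Gen" with the serial collector, "PS Old Gen" or "CMS Old Gen" with others).

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryUsage;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: prints the figures JConsole shows for each memory pool. */
final class MemoryPoolReport {

    /** Builds one line per memory pool: initial, committed, maximum and used bytes. */
    static List<String> report() {
        List<String> lines = new ArrayList<>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            MemoryUsage usage = pool.getUsage(); // null if the pool is no longer valid
            if (usage == null) {
                continue;
            }
            lines.add(String.format("%s: init=%d committed=%d max=%d used=%d",
                    pool.getName(), usage.getInit(), usage.getCommitted(),
                    usage.getMax(), usage.getUsed()));
        }
        return lines;
    }

    public static void main(String[] args) {
        for (String line : report()) {
            System.out.println(line);
        }
    }
}
```

A value of -1 for init or max means the figure is undefined for that pool.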

-XX:SurvivorRatio

Set the ratio of the Eden area to one Survivor area in the young generation. With a value of 4, the ratio of the two Survivor areas to the Eden area is 2:4, so one Survivor area takes up 1/6 of the entire young generation. A larger Survivor space lets short-lived objects die in the young generation as far as possible. If the Survivor space is too small, the copying collector moves objects directly into the old generation, which fills the old generation faster and causes frequent full garbage collections.

For example, with SurvivorRatio set to 3 and Xmn set to 768M, each Survivor space is 768M/5=153.6M.
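The arithmetic behind -Xmn, -XX:NewRatio and -XX:SurvivorRatio can be checked with a few lines of Java. This is a sketch of the ratios as described in this article, not of the JVM's actual sizing code, which also rounds sizes to internal alignment boundaries; the class and method names are made up for illustration.

```java
/** Hypothetical helper: generation sizes in MB derived from the ratios described above. */
final class GenerationSizing {

    /** Young generation size using the 3/8-of-heap recommendation for -Xmn. */
    static long youngByRecommendation(long heapMb) {
        return heapMb * 3 / 8;
    }

    /** Young generation size under -XX:NewRatio=n, where young:old = 1:n. */
    static long youngByNewRatio(long heapMb, int newRatio) {
        return heapMb / (newRatio + 1);
    }

    /** Old generation size under -XX:NewRatio=n. */
    static long oldByNewRatio(long heapMb, int newRatio) {
        return heapMb * newRatio / (newRatio + 1);
    }

    /** Size of ONE Survivor space under -XX:SurvivorRatio=r (Eden:Survivor = r:1, two Survivors). */
    static double survivorMb(double youngMb, int survivorRatio) {
        return youngMb / (survivorRatio + 2);
    }

    public static void main(String[] args) {
        System.out.println("Xmn for a 2048M heap: " + youngByRecommendation(2048) + "M");
        System.out.println("NewRatio=2, young:    " + youngByNewRatio(2048, 2) + "M");
        System.out.println("NewRatio=2, old:      " + oldByNewRatio(2048, 2) + "M");
        System.out.println("SurvivorRatio=3, one Survivor of 768M: " + survivorMb(768, 3) + "M");
    }
}
```

The integer division truncates, which is why the young and old sizes for NewRatio=2 add up to 2047M rather than 2048M.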

-XX:NewSize

For better performance, size the pool that holds short-lived objects so that objects in it do not survive more than one garbage collection cycle. The size of the young generation pool is determined by the NewSize and MaxNewSize parameters. -XX:NewSize sets the initial heap memory for newly created Java objects. Normally the value is a multiple of 1024 and greater than 1MB, and as a rule of thumb it is a quarter of the maximum heap size. Increase this value when the application creates a large number of short-lived objects. It should also be increased as the number of processors increases: memory can be allocated in parallel, but note that garbage collection is not performed in parallel unless a parallel collector is selected. The option works like -XX:NewRatio, except that -XX:NewRatio sets a ratio while -XX:NewSize sets an exact value.

-XX:MaxNewSize

This option sets the maximum heap memory for newly created Java objects. Normally the value is a multiple of 1024 and greater than 1MB. It works the same way as -XX:NewSize above. NewSize and MaxNewSize should generally be set to the same value.

-XX:MaxTenuringThreshold

Set the maximum age an object can reach in the young generation before it is promoted. If set to 0, young generation objects skip the Survivor areas and go directly into the old generation, which can improve efficiency for applications with many long-lived objects. If set to a larger value, young generation objects are copied between the Survivor areas several times, which lengthens the time they spend in the young generation and increases the chance that they are collected there.

For example, -XX:MaxTenuringThreshold=5 means that an object that has been copied 5 times in the Survivor areas without being collected is moved to the old generation.

-XX:GCTimeRatio

Set the share of the program's running time that may be spent on garbage collection. If this parameter is set to n, garbage collection may take up to 1/(1+n) of the running time. For example, with n=19, Java may spend 5% of the time on garbage collection: 1/(1+19)=1/20=5%.
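The formula is easy to get backwards, so here is a small sketch that converts in both directions. The class and method names are made up for illustration; only the 1/(1+n) relationship comes from the text above.

```java
/** Hypothetical helper for the -XX:GCTimeRatio=n formula 1/(1+n). */
final class GcTimeRatio {

    /** Maximum share of running time, in percent, that GC may take for a given n. */
    static double maxGcTimePercent(int n) {
        if (n < 0) {
            throw new IllegalArgumentException("n must not be negative");
        }
        return 100.0 / (1 + n);
    }

    /** The n to configure so that GC takes at most the given percentage of running time. */
    static int ratioForPercent(double percent) {
        if (percent <= 0 || percent > 100) {
            throw new IllegalArgumentException("percent must be in (0, 100]");
        }
        return (int) Math.round(100.0 / percent) - 1;
    }

    public static void main(String[] args) {
        System.out.println("GCTimeRatio=19 allows " + maxGcTimePercent(19) + "% GC time");
        System.out.println("A 1% GC budget needs GCTimeRatio=" + ratioForPercent(1.0));
    }
}
```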

-XX:TargetSurvivorRatio

The value is a percentage that controls how much of the Survivor space is allowed to be used. The default value is 50. A larger value raises the utilization of the Survivor space. When a large heap is combined with a low SurvivorRatio, raise the value to 80 to 90 to make better use of the Survivor space.

-Xss

Set the stack size of each thread, adjusted to the memory the application's threads need. With the same amount of physical memory, a smaller value allows more threads to be created. However, the operating system still limits the number of threads per process, so threads cannot be created indefinitely; in practice the limit is around 3000 to 5000. If this option is set to a large value (>2MB), system performance drops sharply, so set it with extra care: after each adjustment, observe the system's performance and keep adjusting until you reach the best result.

Since JDK 5.0 the stack size of each thread is 1M; before that it was 256K.

-Xnoclassgc

This option disables garbage collection of classes. It prevents a class from being unloaded once all references to it are gone and then having to be reloaded when it is referenced again, so this option increases memory usage. With class garbage collection disabled, performance is higher.

  2. Serial collector parameters

-XX:+UseSerialGC

Use the serial collector.

  3. Parallel collector parameters

-XX:+UseParallelGC

Select the parallel collector as the garbage collector. This setting only affects the young generation: the young generation is collected in parallel, while the old generation is still collected serially. Garbage objects are managed and reclaimed by multiple threads in parallel, which improves collection efficiency and server throughput; it is suitable for multi-processor servers.

-XX:ParallelGCThreads

Configure the number of threads used by the parallel collector, that is, how many threads perform garbage collection at the same time. It is best to set this value equal to the number of processors.

-XX:+UseParallelOldGC

Use parallel collection for the old generation as well, which improves collection efficiency. Parallel collection of the old generation is supported from JDK 6.0.

-XX:MaxGCPauseMillis

Set the maximum pause time for each young generation collection by the parallel collector. If this target cannot be met, the JVM automatically adjusts the size of the young generation to meet it.

-XX:+UseAdaptiveSizePolicy

With this option set, the parallel collector automatically chooses the size of the young generation and the corresponding Survivor ratio to meet the minimum response time or collection frequency specified for the target system. It is recommended to keep this option on when using the parallel collector.

  4. Concurrent collector parameters

-XX:+UseConcMarkSweepGC

Use the concurrent mark-sweep (CMS) GC for the old generation. The GC threads run concurrently with the application threads; the application threads are paused only during the initial-mark and remark phases. Application pauses are short, which suits highly interactive systems such as web servers. Because collection runs concurrently, it reduces application pause time, and because it is also multi-threaded, it can make effective use of the multiple processors of a multi-processor system.

-XX:+UseParNewGC

Use a parallel collector for the young generation. It is a multi-threaded young generation collector that, unlike the collector selected by -XX:+UseParallelGC, can be used together with the CMS GC.

-XX:+UseCMSCompactAtFullCollection

Enable compaction of the old generation during a full GC. It may affect performance, but it eliminates fragmentation. CMS does not move objects in memory when it collects, so fragmentation builds up very easily and can lead to insufficient memory; with this option enabled, memory is compacted during a full GC. It is a good habit to add this parameter.

-XX:+CMSIncrementalMode

Enable incremental mode. Suitable for single-CPU machines.

-XX:CMSFullGCsBeforeCompaction

Because the concurrent collector does not compact the memory space, fragmentation builds up after it has run for a while, which reduces efficiency. This value sets how many full GCs are run before the memory space is compacted.

-XX:+CMSClassUnloadingEnabled

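Let CMS collect classes in the permanent generation instead of leaving them to a full GC.

-XX:+CMSPermGenSweepingEnabled

Let CMS sweep the permanent generation instead of leaving it to a full GC.

-XX:+CMSParallelRemarkEnabled

When UseParNewGC is in use, run the remark phase in parallel to reduce the remark pause time. Written with a minus sign (-XX:-CMSParallelRemarkEnabled), the option turns parallel remark off.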

-XX:CMSInitiatingOccupancyFraction

Specifies the old generation occupancy, as a percentage, at which a concurrent collection (CMS) of the old generation is started. Choosing this value takes some care; promotion failed will not occur as long as the following formula is basically satisfied:

(Xmx-Xmn)*(100-CMSInitiatingOccupancyFraction)/100>=Xmn

In my application, Xmx is 6000 and Xmn is 500, so Xmx-Xmn is 5500 MB, which means the old generation has 5500 MB. CMSInitiatingOccupancyFraction=90 means that a concurrent collection (CMS) of the old generation starts when the old generation is 90% full. At that point the remaining 10% is 5500*10%=550 MB, so even if all objects in Xmn (the 500 MB young generation) are moved to the old generation, 550 MB is enough. As long as the formula above is satisfied, promotion failed will not occur during garbage collection.

With Xmx=2048 and Xmn=768, the value of CMSInitiatingOccupancyFraction must not exceed 40; otherwise promotion failed is likely to occur during garbage collection.
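The formula can be turned into a small check that also reports the largest fraction that still satisfies it. This is only a sketch of the rule of thumb given above, with made-up class and method names; it works in whole megabytes and whole percentages.

```java
/** Hypothetical helper for the rule (Xmx-Xmn)*(100-fraction)/100 >= Xmn. */
final class CmsOccupancy {

    /** True if the space left in the old generation at CMS start can hold the whole young generation. */
    static boolean leavesRoomForYoungGen(long xmxMb, long xmnMb, int fraction) {
        long oldMb = oldGenMb(xmxMb, xmnMb);
        // Both sides are multiplied by 100 so the comparison stays in integers.
        return oldMb * (100 - fraction) >= xmnMb * 100;
    }

    /** Largest CMSInitiatingOccupancyFraction that still satisfies the rule (0 if none does). */
    static int maxSafeFraction(long xmxMb, long xmnMb) {
        long oldMb = oldGenMb(xmxMb, xmnMb);
        // Percentage of the old generation needed to hold Xmn, rounded up.
        long neededPercent = (xmnMb * 100 + oldMb - 1) / oldMb;
        return (int) Math.max(0, 100 - neededPercent);
    }

    private static long oldGenMb(long xmxMb, long xmnMb) {
        if (xmnMb <= 0 || xmnMb >= xmxMb) {
            throw new IllegalArgumentException("expected 0 < Xmn < Xmx");
        }
        return xmxMb - xmnMb;
    }

    public static void main(String[] args) {
        System.out.println("Xmx=6000, Xmn=500: max fraction " + maxSafeFraction(6000, 500));
        System.out.println("Xmx=2048, Xmn=768: max fraction " + maxSafeFraction(2048, 768));
    }
}
```

For the two configurations discussed above this gives 90 and 40, matching the values worked out by hand.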

-XX:+UseCMSInitiatingOccupancyOnly

Tell the concurrent collector to start a collection only when the occupancy of the old generation reaches the configured initiating ratio.

-XX:SoftRefLRUPolicyMSPerMB

Compared with a virtual machine in client mode (-client option), a virtual machine in server mode (-server option) is slightly less aggressive in clearing soft references. The clearing frequency can be reduced by increasing -XX:SoftRefLRUPolicyMSPerMB. The default value is 1000, which means one second per megabyte, so soft references live longer in the server virtual machine than in the client virtual machine. The clearing frequency is controlled with the command-line parameter -XX:SoftRefLRUPolicyMSPerMB=<N>, which specifies how many milliseconds a soft reference stays alive, once its referent is no longer strongly reachable, for each megabyte of free heap space. With the default, a soft reference survives for 1 second per megabyte of free heap space after the last strong reference is reclaimed. Note that this is an approximate value, because soft references are cleared only during garbage collection, and garbage collection does not happen all the time.
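As a rough illustration of that rule, the survival time of a softly reachable object is the free heap in megabytes multiplied by the configured milliseconds per megabyte. The sketch below only reproduces this multiplication; the class and method names are made up, and the real JVM decides at garbage collection time using its own measure of free heap.

```java
/** Hypothetical helper: approximate soft reference survival time under SoftRefLRUPolicyMSPerMB. */
final class SoftRefLifetime {

    /** Milliseconds a softly reachable object may stay alive after its last strong reference is gone. */
    static long maxIdleMillis(long freeHeapMb, long msPerMb) {
        if (freeHeapMb < 0 || msPerMb < 0) {
            throw new IllegalArgumentException("arguments must not be negative");
        }
        return freeHeapMb * msPerMb;
    }

    public static void main(String[] args) {
        // Default of 1000 ms per MB with 100 MB of free heap: 100 seconds.
        System.out.println(maxIdleMillis(100, 1000) + " ms");
        // A value of 0, as used in the configurations in this article: cleared at the next collection.
        System.out.println(maxIdleMillis(100, 0) + " ms");
    }
}
```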

-XX:LargePageSizeInBytes

Set the size of large memory pages. It must not be set too large, or it will affect the size of the permanent generation.

-XX:+UseFastAccessorMethods

Fast optimization for primitive types: get and set methods are converted to native code.

-XX:+DisableExplicitGC

Disable explicit full GCs triggered from Java code, such as calls to System.gc(). It is best to add this option so that misuse of System.gc() in the code does not hurt performance.

-XX:+AggressiveHeap

Special note: (I think it is helpful for Java cache applications)

  • Tries to use a large amount of physical memory;

  • Optimizes for long-running jobs with heavy memory usage, and checks the machine's computing resources (memory, number of processors);

  • Requires at least 256MB of RAM;

  • Intended for machines with plenty of CPU and memory (in 1.4.1, improvements were shown on 4-CPU machines).

-XX:+AggressiveOpts

Speed up compilation.

-XX:+UseBiasedLocking

Improved performance of the lock mechanism.

Practice

   Testing purposes

Test the performance of the tested system when using different garbage collection schemes;

Understand the actual effect of various JVM parameters in performance tuning;

Perform an 8-hour stress test on the selected optimal solution and record the test results;

Test environment preparation

Software and hardware environment for the running of the program under test:

  • D630 4G memory + T7250 dual-core CPU + 160G hard disk;

  • Operating system: windowsXP SP3;

  • IP:11.55.15.51;

Name of the program under test:

  • XXX Bank Procurement Management System V1.1 version;

Program deployment environment:

  • Tomcat6.0.18 for windows;

  • Sun JDK1.6.13 for windows;

  • Oracle10g for windows (run separately on another 640M laptop)

The hardware and software environment in which the performance test tool runs:

  • Operating system: windowsxp sp3

  • Browser version: IE7

  • IP address: 11.55.15.141

  • Performance testing tool: loadrunner9.10

JVM monitoring tools:

  • Use Jconsole for graphical monitoring;

Determine the maximum available memory of the JVM on the machine under test:

By repeatedly running the java -XmxXXXXM -version command on the command line with different values, the maximum memory that the JVM can use on the 11.55.15.51 machine was found to be 1592M.
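Once a value for -Xmx has been chosen, it can be useful to confirm from inside the application what the JVM actually received. The following minimal sketch uses only the standard Runtime API; the class name MaxHeapReport is made up for illustration, and the reported maximum is usually slightly below the -Xmx value.

```java
/** Hypothetical helper: prints the heap limits the running JVM reports. */
final class MaxHeapReport {

    /** Converts a byte count to whole megabytes. */
    static long toMb(long bytes) {
        return bytes / (1024L * 1024L);
    }

    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        System.out.println("max heap (MB):       " + toMb(rt.maxMemory()));
        System.out.println("committed heap (MB): " + toMb(rt.totalMemory()));
        System.out.println("free heap (MB):      " + toMb(rt.freeMemory()));
    }
}
```

Running it with, for example, java -Xmx1024M MaxHeapReport shows whether the setting took effect.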

  1. Record test script

Preparation before recording: modify the checkcode.java file so that the randomly generated check code becomes a fixed check code, which lets the script run automatically, and then replace the class file in the release package with the newly compiled checkcode.class file.

Select typical business operations for script recording. The typical business operations of each system will be different. After analysis and statistics, select the part with the highest user operation frequency. The content of the script determined after analysis is: record the system login operation and select a paragraph of text as the verification point on the main interface after the login is successful.

Start the VuGen program to record and debug the script according to the definition of the script to ensure that the script can run normally.

Define test scenarios

  • Number of virtual users: 30

  • Continuous running time: 8 hours

  • Virtual user loading and unloading methods: Simultaneously

  • Performance monitoring indicators: response time, throughput, number of successful transactions

Perform preliminary performance tests

Run the test with the system default parameters, record the response time, throughput and number of successful transactions, and monitor JVM usage.

  1. Choose a tuning plan

Test data of different garbage collection methods:

Id  NewRatio  SurvivorRatio     Transaction response time (s)  Throughput   Passed transactions
1   2         25                3.139                          3016230.514  7528
2   1         25                3.161                          2975581.301  7452
3   3         25                2.814                          3334717.818  8383
4   4         25                2.659                          3505592.450  8846
5   5         25                2.860                          3270596.069  8232
6   4         15                2.499                          3765121.986  9426
7   4         5                 1.986                          4750776.581  11843
8   4         4                 1.968                          4825608.161  11947
9   4         3                 2.507                          3770420.243  9388
10  -XX:TargetSurvivorRatio=90  1.924                          4945053.874  12216
11  -Xmx1024M                   1.903                          4974137.908  12360

Rows 10 and 11 list the option that was added, in place of the two ratio values.

Concurrent collection mode, memory usage after ten minutes of running time:

Serial collection mode, running time of ten minutes:

Parallel collection mode, running time of ten minutes:

30-60: screenshots of JVM memory changes with 30 concurrent users running continuously for 60 minutes:

Two full GCs (Full GC) occurred at 11:36 and 11:56, because the PS Old Gen was already full at this time, and the JVM automatically reclaimed the memory in the Old Gen.

Based on repeated tests and combined with the business characteristics of the system under test, the following optimal solution was finally finalized for an 8-hour stress test:

JAVA_OPTS=-server

-Xms1024M

-Xmx1024M

-Xmn128M

-XX:NewSize=128M

-XX:MaxNewSize=128M

-XX:SurvivorRatio=20

-XX:MaxTenuringThreshold=10

-XX:GCTimeRatio=19

-XX:+UseParNewGC

-XX:+UseConcMarkSweepGC

-XX:+CMSClassUnloadingEnabled

-XX:+UseCMSCompactAtFullCollection

-XX:CMSFullGCsBeforeCompaction=0

-XX:-CMSParallelRemarkEnabled

-XX:CMSInitiatingOccupancyFraction=70

-XX:SoftRefLRUPolicyMSPerMB=0

-XX:PermSize=256m

-XX:MaxPermSize=256m

-Djava.awt.headless=true

  1. JVM monitoring diagram after tuning

Screenshot of 30Vusers running for 8 hours:

Analysis of test results

For the login operation of the XX bank procurement system, performance is best when the JVM's NewRatio and SurvivorRatio are both set to 4. On this basis, performance improves further after setting -XX:TargetSurvivorRatio=90 and -Xmx1024M.

Examples of performance issues

Performance symptoms

After a system was officially launched in XX province, its process was killed inexplicably after running for a period of time, and the system had to be restarted manually.

  1. Monitoring results

  2. Jmap command to view heap memory allocation and usage

./jmap -heap 31 //31 is the process number of the program

Attaching to process ID 31, please wait...

Debugger attached successfully.

Server compiler detected.

JVM version is 11.0-b12 //Display the version number of jvm

using parallel threads in the new generation. //Indicates that parallel collection is used in the young generation

using thread-local object allocation.

Concurrent Mark-Sweep GC //Enable CMS collection mode

 

Heap Configuration:

MinHeapFreeRatio = 40

MaxHeapFreeRatio = 70 //These two indicate that the usage ratio of heap memory is between 30% and 60%

MaxHeapSize = 2147483648 (2048.0MB) //The maximum heap size is 2048M

NewSize = 805306368 (768.0MB)

MaxNewSize = 805306368 (768.0MB) //The young generation size is 768M

OldSize = 1342177280 (1280.0MB) //The size of the old generation is 1280M

NewRatio = 8 //Looks contradictory (1:8), but the young generation size is set explicitly with -Xmn, so NewRatio does not take effect

SurvivorRatio = 3 //Each Survivor area accounts for one-fifth of the entire young generation

PermSize = 268435456 (256.0MB) //Persistent generation size is 256M

MaxPermSize = 268435456 (256.0MB) //Persistent generation size is 256M

 

Heap Usage:

//The size of the young generation; only one Survivor area is counted here, so 153M is missing

New Generation (Eden + 1 Survivor Space):

capacity = 644284416 (614.4375MB)

used = 362446760 (345.65616607666016MB)

free = 281837656 (268.78133392333984MB)

56.25570803810968% used

//Eden Space size is 614.44-153.56=460.88M

Eden Space:

capacity = 483262464 (460.875MB)

used = 342975440 (327.0868682861328MB)

free = 140287024 (133.7881317138672MB)

70.97084204743864% used

//Each of the two Survivor areas is about 153MB, which is consistent with the earlier calculation based on the SurvivorRatio setting.

From Space:

capacity = 161021952 (153.5625MB)

used = 19471320 (18.569297790527344MB)

free = 141550632 (134.99320220947266MB)

12.092338813530219% used

To Space:

capacity = 161021952 (153.5625MB)

used = 0 (0.0MB)

free = 161021952 (153.5625MB)

0.0% used

//The size of the old generation is 1280M, which is consistent with the result calculated according to the parameter configuration.

concurrent mark-sweep generation:

capacity = 1342177280 (1280.0MB)

used = 763110504 (727.7588882446289MB)

free = 579066776 (552.2411117553711MB)

56.85616314411163% used

//The permanent generation size is 256M, and the actual usage is less than 50%. The value can be stabilized after the system has been running for a period of time.

Perm Generation:

capacity = 268435456 (256.0MB)

used = 118994736 (113.48222351074219MB)

free = 149440720 (142.5177764892578MB)

44.32899355888367% used

  1. Top command monitoring results:

Continuous monitoring with the top command showed that CPU idle was 85.7%, 3619M of physical memory remained free, and the 8G of virtual memory was unused. It also showed that the system memory occupied by process 29003 kept growing and soon reached its maximum.

  1. Jstat command monitoring results:

The jstat command was used to check the GC status of the process with PID 29003. Memory usage of the old generation had exceeded the configured 80% warning line, which caused the system to run an FGC every one or two seconds; the number of FGCs was clearly higher than the number of YGCs. Yet the share of memory occupied in the old generation did not change noticeably after each FGC: the system kept attempting FGCs but could not effectively reclaim the memory held by these objects. This also suggests that the young generation parameters may be misconfigured, forcing most objects into the old generation where they wait for an FGC. It may also be related to the long session timeout configured for the system.
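When jstat is not available, similar collection counts can be read from inside the JVM with the standard GarbageCollectorMXBean interface. This is only a sketch and not a replacement for jstat; the class name GcCounters is made up, and the collector names printed depend on the garbage collector in use.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: one line per collector with its collection count and accumulated time. */
final class GcCounters {

    static List<String> snapshot() {
        List<String> lines = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            // Both values are -1 when the collector does not report them.
            lines.add(String.format("%s: collections=%d, time=%d ms",
                    gc.getName(), gc.getCollectionCount(), gc.getCollectionTime()));
        }
        return lines;
    }

    public static void main(String[] args) {
        for (String line : snapshot()) {
            System.out.println(line);
        }
    }
}
```

Taking two snapshots a few seconds apart and comparing the counts of the young and old collectors gives a rough view of how often each one runs.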

  1. Stack content printed by Jstack:

In the above figure, a large number of workflow thread locks are found.

In the above figure, a large number of cms thread pool management thread locks are found.

Cause Analysis

After real-time monitoring of JVM memory, the reason that memory in the old generation cannot be effectively reclaimed was found to be a large number of thread deadlocks in the stack dump. The development team should carefully review the source code of the com.zzxy.workflow package and the com.web.csm package for thread deadlock defects.
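Besides reading jstack output by hand, deadlocked threads can be detected programmatically with the standard ThreadMXBean interface. The sketch below is a generic check, not the analysis that was performed on this system; the class name DeadlockCheck is made up for illustration.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

/** Hypothetical helper: reports threads that are deadlocked on monitors or ownable synchronizers. */
final class DeadlockCheck {

    /** Number of threads currently involved in a deadlock (0 if there is none). */
    static int deadlockedThreadCount() {
        long[] ids = ManagementFactory.getThreadMXBean().findDeadlockedThreads();
        return ids == null ? 0 : ids.length; // the API returns null when no deadlock exists
    }

    public static void main(String[] args) {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        long[] ids = threads.findDeadlockedThreads();
        if (ids == null) {
            System.out.println("No deadlocked threads found");
            return;
        }
        for (ThreadInfo info : threads.getThreadInfo(ids)) {
            if (info != null) { // a thread may have terminated in the meantime
                System.out.println(info.getThreadName() + " waits for " + info.getLockName()
                        + " held by " + info.getLockOwnerName());
            }
        }
    }
}
```

The check has to run inside the JVM being examined, for example from a diagnostic servlet or a scheduled task.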

JVM settings of the system

<jvm-options>-XX:+PrintGCApplicationConcurrentTime</jvm-options>

<jvm-options>-XX:+PrintGCApplicationStoppedTime</jvm-options>

<jvm-options>-XX:+PrintGCTimeStamps</jvm-options>

<jvm-options>-XX:+PrintGCDetails</jvm-options>

<jvm-options>-Xms2048m</jvm-options>

<jvm-options>-Xmx2048m</jvm-options>

<jvm-options>-server</jvm-options>

<jvm-options>-Djava.awt.headless=true</jvm-options>

<jvm-options>-XX:PermSize=256m</jvm-options>

<jvm-options>-XX:MaxPermSize=256m</jvm-options>

<jvm-options>-XX:+DisableExplicitGC</jvm-options>

<jvm-options>-Xmn768M</jvm-options>

<jvm-options>-XX:SurvivorRatio=3</jvm-options>

<jvm-options>-Xss128K</jvm-options>

<jvm-options>-XX:TargetSurvivorRatio=80</jvm-options>

<jvm-options>-XX:MaxTenuringThreshold=5</jvm-options>

<jvm-options>-XX:+UseConcMarkSweepGC</jvm-options>

<jvm-options>-XX:+CMSClassUnloadingEnabled</jvm-options>

<jvm-options>-XX:+UseCMSCompactAtFullCollection</jvm-options>

<jvm-options>-XX:-CMSParallelRemarkEnabled</jvm-options>

Postscript

1. Performance tuning should be targeted: adjust, compare and observe according to the characteristics of the actual business system and the JVM logs recorded over a period of time.

2. Performance tuning is an endless process. It is necessary to comprehensively weigh the tuning cost and the cost of replacing hardware, and use the most economical means to achieve the best results.

3. Performance tuning includes not only the tuning of JVM, but also the tuning and optimization of server hardware configuration, operating system parameters, middleware thread pool, database connection pool, database parameters, and specific database tables, indexes, and partitions.

4. Tuning is usually prompted by a bottleneck that causes frequent service downtime. It is not simply a matter of changing a few JVM parameters without testing.

5. Using dedicated tools to find and correct performance problems in the code is a more economical and faster tuning method.

 

  1. Attachment: Typical configuration of Shede.com

$JAVA_ARGS .= " -Dresin.home=$SERVER_ROOT

-server

-Xms6000M

-Xmx6000M

-Xmn500M

-XX:PermSize=500M

-XX:MaxPermSize=500M

-XX:SurvivorRatio=65536

-XX:MaxTenuringThreshold=0

-Xnoclassgc

-XX:+DisableExplicitGC

-XX:+UseParNewGC

-XX:+UseConcMarkSweepGC

-XX:+UseCMSCompactAtFullCollection

-XX:CMSFullGCsBeforeCompaction=0

-XX:+CMSClassUnloadingEnabled

-XX:-CMSParallelRemarkEnabled

-XX:CMSInitiatingOccupancyFraction=90

-XX:SoftRefLRUPolicyMSPerMB=0

-XX:+PrintClassHistogram

-XX:+PrintGCDetails

-XX:+PrintGCTimeStamps

-XX:+PrintHeapAtGC

-Xloggc:log/gc.log ";
Description:

1. -XX:SurvivorRatio=65536 -XX:MaxTenuringThreshold=0 removes the Survivor space;

2. -Xnoclassgc disables class garbage collection, the performance will be higher;

3. -XX:+DisableExplicitGC prohibits System.gc() to prevent programmers from calling gc methods by mistake and affecting performance;

4. -XX:+UseParNewGC, the young generation adopts multi-threaded parallel collection, so that the collection is fast;


Origin blog.csdn.net/zhaofuqiangmycomm/article/details/114927455