Sizing the JSF business thread pool

1. Introduction

The JSF business thread pool is built on the JDK thread pool. By default it runs in Cached mode (20 core threads, 200 maximum threads); a Fixed mode with a fixed number of threads is also available. The request queue size can be set in both modes.
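As a rough illustration of the two modes, the sketch below builds comparable pools directly with the JDK's `ThreadPoolExecutor`. It is not JSF's actual code: the class `BizPools`, the 60-second keep-alive, and the choice of `SynchronousQueue` when no queue is configured are assumptions made for this example.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical helper, not JSF code: shows how the two modes could map onto
// java.util.concurrent.ThreadPoolExecutor.
class BizPools {

    // queueSize == 0 is taken to mean "no queue": each task is handed
    // directly to a thread (assumption for this sketch).
    static BlockingQueue<Runnable> queue(int queueSize) {
        if (queueSize > 0) {
            return new LinkedBlockingQueue<Runnable>(queueSize);
        }
        return new SynchronousQueue<Runnable>();
    }

    // Cached mode: the pool can grow from core to max threads, and idle
    // threads above the core size are reclaimed after the keep-alive time.
    static ThreadPoolExecutor cached(int core, int max, int queueSize) {
        return new ThreadPoolExecutor(core, max, 60L, TimeUnit.SECONDS, queue(queueSize));
    }

    // Fixed mode: core == max, so the pool never runs more than `threads` threads.
    static ThreadPoolExecutor fixed(int threads, int queueSize) {
        return new ThreadPoolExecutor(threads, threads, 0L, TimeUnit.MILLISECONDS, queue(queueSize));
    }

    public static void main(String[] args) {
        ThreadPoolExecutor def = cached(20, 200, 0); // default sizes quoted in the text
        ThreadPoolExecutor fix = fixed(16, 0);
        System.out.println("cached: " + def.getCorePoolSize() + "/" + def.getMaximumPoolSize());
        System.out.println("fixed:  " + fix.getCorePoolSize() + "/" + fix.getMaximumPoolSize());
        def.shutdown();
        fix.shutdown();
    }
}
```

One JDK behavior worth keeping in mind when combining the two settings: with a bounded queue and a core size smaller than the maximum, `ThreadPoolExecutor` only creates threads beyond the core size once the queue is full.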

This article load-tests a simplified scenario (a "single-service application") to provide benchmark results for sizing the JSF business thread pool and to draw some generally applicable conclusions.

The target readers are load testing engineers, developers, deployment and operations engineers, and architects who need to size the JSF business thread pool sensibly. The article does not cover other JSF server configuration items, nor the configuration of "composite service applications". Use its conclusions as a reference when designing load test cases, or as a basic method for estimating the thread pool size. Note that the final thread pool size should always be based on high-fidelity load test results.

"Single-service application" means an application that exposes only one interface, and that interface has only one method.

"Composite service application" means an application that exposes multiple interfaces, or an interface with multiple methods.

2. Test case description

This benchmark uses the USF3.0 permission system, customized into a single service provider. Only one method of the provider is tested, so it can be regarded as a "single-service application". CPU is treated as the core resource of the benchmark. To account for the JVM garbage collector, simple test data is used so that every call does the same work and young GC (YGC) is regular (a fixed number of calls triggers one YGC of 30+ ms), with no full GC (FGC) interference.

In the test design, all dependent services are provisioned so that they are never a bottleneck, which keeps service availability at 100% throughout the test. The key performance indicator is TP99: 99% of service response times must be less than 10ms.

To compare the thread pool modes, we ran multiple groups of tests in both the Cached and Fixed modes of the JSF thread pool to find the maximum load the system can sustain in each.
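To make the TP99 criterion concrete, here is a minimal sketch that computes a percentile from a set of latency samples with the nearest-rank method and checks it against the 10ms limit. The class `Percentiles` and the sample data are made up for illustration; Forcebot's own percentile calculation is not described in this article and may differ.

```java
import java.util.Arrays;

// Hypothetical helper for illustration; not part of JSF or Forcebot.
class Percentiles {

    // Nearest-rank percentile: the smallest sample such that at least p%
    // of all samples are less than or equal to it.
    static long percentile(long[] latenciesMs, double p) {
        if (latenciesMs.length == 0) {
            throw new IllegalArgumentException("no samples");
        }
        long[] sorted = latenciesMs.clone();
        Arrays.sort(sorted);
        int rank = (int) Math.ceil(p * sorted.length / 100.0);
        return sorted[Math.max(rank, 1) - 1];
    }

    // The SLA used in this article: TP99 strictly below the limit.
    static boolean meetsSla(long[] latenciesMs, long limitMs) {
        return percentile(latenciesMs, 99.0) < limitMs;
    }

    public static void main(String[] args) {
        long[] samples = new long[100];
        Arrays.fill(samples, 5L); // 98 calls at 5ms ...
        samples[98] = 9L;         // ... one slower call ...
        samples[99] = 30L;        // ... and one outlier
        System.out.println("TP99 = " + percentile(samples, 99.0) + "ms");
        System.out.println("meets SLA: " + meetsSla(samples, 10L));
    }
}
```

With these made-up samples the single 30ms outlier falls outside the 99th percentile, so the set still meets the SLA; a second outlier would not be absorbed.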

Test application: USF3.0 permission system (customized)

Test service: com.jd.susf.service.api.SusfPermissionService#findUserInfo, a service that looks up one record in Redis based on user information and returns it.

Hardware configuration: one machine, 4 cores, 8 GB RAM

Test method: Forcebot applies a stepped (ladder) load to the system, with the JSF business thread pool in Cached and Fixed modes.

SLA requirement: TP99 of service response time < 10ms

Note: We have customized the USF3.0 permission system, adjusted the configuration data of the service provider, and only retained com.jd.susf.service.api.SusfPermissionService.

3. Test results and analysis

3.1 System load of the cached thread pool

Figure: System load diagram of JSF default thread pool (cached, threads=200) under different number of concurrent users (1-200)

| Number of concurrent users | TP99 | Throughput (TPS) | CPU utilization (%) |
|---|---|---|---|
| 1~23 | <8ms | linear growth | linear growth |
| 24 | 8ms | 6553 | 99.62 |
| 25 | 11ms | 6607 | 99.83 |
| 26~79 | rapid growth | slow growth | 99+ |
| 80 | 74ms | 6928 | 99.82 |
| 81~199 | slow increase | slow decline | 99.82 |
| 200 | 99ms | 6230 | 99.94 |

Summary: The default JSF thread pool configuration carries significant risk. On this machine the system supports at most 24 concurrent users; beyond 24, the SLA can no longer be met.
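The way the 24-user limit is read off the measurements can be sketched as a small scan over the load steps: walk the steps in increasing order and stop at the first one that violates the SLA. The class `SlaCutoff` is hypothetical, and only the four rows with exact values from the table above are used.

```java
// Hypothetical helper for illustration; the data points come from the table above.
class SlaCutoff {

    // Returns the highest measured user count whose TP99 is below the limit,
    // stopping at the first violation; returns -1 if even the first step fails.
    // `users` must be sorted in increasing order and match `tp99Ms` by index.
    static int maxUsersWithinSla(int[] users, int[] tp99Ms, int limitMs) {
        int best = -1;
        for (int i = 0; i < users.length; i++) {
            if (tp99Ms[i] >= limitMs) {
                break;
            }
            best = users[i];
        }
        return best;
    }

    public static void main(String[] args) {
        int[] users = {24, 25, 80, 200};
        int[] tp99 = {8, 11, 74, 99};
        System.out.println("max users within SLA: " + maxUsersWithinSla(users, tp99, 10));
    }
}
```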

3.2 System load of fixed thread pool (queue)

Figure: System load diagram of JSF fixed thread pool (fixed+queue) under different number of concurrent users (1-50)

| Number of JSF business threads | Maximum concurrent users supported | TP50/TP90/TP99/TP999 (ms) | Throughput (TPS) | Maximum CPU utilization (%) |
|---|---|---|---|---|
| 4 | 11 | 7/8/10/18 | 1531 | 27.67 |
| 8 | 25 | 8/8/10/18 | 3113 | 46.45 |
| 16 | 50 | 8/8/10/21 | 6228 | 87.97 |
| 20 | 23 | 3/4/10/15 | 6409 | 99.92 |
| 24 | 22 | 3/4/7/15 | 6178 | 99.86 |
| 25 | 22 | 3/4/6/15 | 6182 | 98.83 |

Table: Maximum system load (maximum number of concurrent users) at which the JSF fixed business thread pool (fixed+queue) meets TP99<10ms

Summary:

① In fixed thread mode, there is an upper limit on CPU utilization.

② A queue effectively increases the number of concurrent users the system can support and also improves throughput. However, because tasks wait in the queue, response times rise across the board as the queue fills ("a rising tide lifts all boats"), which carries some risk.
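The cost of queueing can be estimated from the table with Little's law. Assuming the load generator runs a closed loop in which each virtual user sends its next request as soon as the previous one returns (an assumption about Forcebot that this article does not confirm), the mean response time is roughly the number of concurrent users divided by the throughput. The class `LittlesLaw` below is a hypothetical sketch of that arithmetic.

```java
// Hypothetical helper for illustration; rows are taken from the fixed+queue table.
class LittlesLaw {

    // Closed-loop load with zero think time: meanResponse = users / throughput.
    static double meanResponseMs(int concurrentUsers, double tps) {
        return concurrentUsers / tps * 1000.0;
    }

    public static void main(String[] args) {
        // Each row: threads, maximum concurrent users, throughput (TPS).
        int[][] rows = {{4, 11, 1531}, {8, 25, 3113}, {16, 50, 6228}};
        for (int[] r : rows) {
            System.out.printf("threads=%d users=%d mean=%.2f ms%n",
                    r[0], r[1], meanResponseMs(r[1], r[2]));
        }
    }
}
```

Under that assumption the three rows come out at about 7.2ms, 8.0ms and 8.0ms, in the same range as the TP50 values of 7 to 8ms reported in the table. This suggests that with a queue most requests, not just the slowest ones, spend time waiting.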

3.3 System load of fixed thread pool

Figure: System load when the system has the maximum number of concurrent users in JSF fixed thread pool (fixed) mode

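| Number of JSF business threads | Number of concurrent users | TP99 (ms) | Throughput (TPS) | Maximum CPU utilization (%) |
|---|---|---|---|---|
| 4 | 4 | 5 | 1063 | 20.26 |
| 8 | 8 | 5 | 2216 | 36.62 |
| 16 | 16 | 6 | 4262 | 68.56 |
| 20 | 20 | 5 | 5550 | 86.22 |
| 24 | 24 | 8 | 6711 | 99.62 |
| 25 | 25 | 16 | 6644 | 98.77 |
| 26 | 26 | 19 | 6744 | 99.93 |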

Summary: The results for the fixed thread pool (fixed) show that the number of threads must be chosen to balance full use of CPU resources against the SLA. Too few threads waste CPU resources; too many threads make it impossible to meet the SLA.
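One way to use a partial measurement when planning the next test run is a simple linear extrapolation: if a given number of busy threads drives the CPU to a certain utilization, scale up to estimate the thread count that would reach 100%. This is only a sketch under the assumption that CPU utilization grows linearly with the thread count; the class `CpuExtrapolation` is hypothetical, and the estimate is a starting point for testing, not a substitute for it.

```java
// Hypothetical helper for illustration; inputs are rows from the fixed-mode table.
class CpuExtrapolation {

    // Linear extrapolation: if `threads` busy threads drive the CPU to
    // `cpuPercent`, estimate how many threads would drive it to 100%.
    static int threadsToSaturate(int threads, double cpuPercent) {
        if (cpuPercent <= 0.0) {
            throw new IllegalArgumentException("cpuPercent must be positive");
        }
        return (int) Math.ceil(threads * 100.0 / cpuPercent);
    }

    public static void main(String[] args) {
        System.out.println("from 4 threads:  " + threadsToSaturate(4, 20.26));
        System.out.println("from 16 threads: " + threadsToSaturate(16, 68.56));
        System.out.println("from 20 threads: " + threadsToSaturate(20, 86.22));
    }
}
```

In this data set the rows for 16 and 20 threads both extrapolate to 24 threads, which is where the measured CPU utilization reached 99.62%. The 4-thread row extrapolates to only 20, so estimates taken far from saturation should be treated with more caution.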

4. Conclusion

Based on the test results and data analysis, we draw the following conclusions:

  • The default JSF thread pool configuration is risky under high concurrency: few servers hosting JSF services in production can still meet the SLA with 200 threads running. A thread pool with a maximum of 200 threads therefore puts the server at risk of being overwhelmed under high concurrency. A proper thread pool size should come from high-fidelity load testing.
  • A sufficient number of threads ensures resource (CPU) utilization: business services usually perform some IO (network, disk, etc.), so threads spend part of their execution time waiting and CPU utilization stays low. Only by adding threads, so that more threads compete for CPU time, can CPU utilization be raised. The more IO a service performs, the longer the waits and the more concurrent threads it needs. For business services with IO, load testing can start from 2N threads (N is the number of CPU cores of the server).
  • Too many threads only degrade the system's SLA: once the existing threads already use 100% of the CPU, additional threads cannot get enough CPU time, so service response time increases. Within a certain range, TP99 may still meet the SLA and throughput may even rise slightly. Beyond that range, TP99 no longer meets the requirement and throughput begins to fall.
  • A fixed number of threads caps the load the system has to bear: it keeps CPU utilization within a bounded range, protects stable operation, and safeguards TP99, but it also limits the system's concurrency. A properly sized queue can raise concurrency while TP99 stays within the SLA. However, it increases overall response time and makes it less stable, which is risky.
  • Let the CPU run at 100% load: the SLA committed to externally is usually looser than the service's actual performance, because it allows for instability in the infrastructure and in dependent services. Therefore, even when the CPU has reached 100%, the number of threads can still be increased by a certain amount without breaking the external TP99 commitment, which improves the system's concurrency. Although the system can run under such high load, further stability testing is needed to confirm its reliability.
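The 2N starting point mentioned above can be turned into a simple plan for a thread-count sweep. The sketch below is hypothetical: the class `SweepPlan`, the upper bound, and the step size are illustrative choices, not values prescribed by JSF or by this benchmark.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration; not part of JSF.
class SweepPlan {

    // Starting point suggested in the text for services with IO: 2N threads.
    static int startingThreads(int cpuCores) {
        return 2 * cpuCores;
    }

    // Thread counts to load-test, from 2N up to maxThreads in fixed steps.
    static List<Integer> candidates(int cpuCores, int maxThreads, int step) {
        if (step <= 0) {
            throw new IllegalArgumentException("step must be positive");
        }
        List<Integer> out = new ArrayList<>();
        for (int t = startingThreads(cpuCores); t <= maxThreads; t += step) {
            out.add(t);
        }
        return out;
    }

    public static void main(String[] args) {
        int cores = Runtime.getRuntime().availableProcessors();
        // Upper bound of 8N and step of N are arbitrary choices for this sketch.
        System.out.println(candidates(cores, 8 * cores, cores));
    }
}
```

For the 4-core machine used in this benchmark, `candidates(4, 24, 4)` yields 8, 12, 16, 20 and 24 threads. Each candidate would then be load-tested and checked against the TP99 requirement.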

In summary, a reasonable thread pool size must be evaluated and tested against business requirements and system resources, with reasonable headroom reserved to keep the system stable and meet user SLAs.

5. Appendix

Appendix 1: Description of statistical indicators and terminology

Number of concurrent users: The number of users who initiate requests at the same time.

TP value (50/90/99/999): Client-side TP values, in ms. Data comes from Forcebot.

Throughput (TPS): Data comes from Forcebot.

CPU utilization (%): Data comes from PFinder.

Number of JSF business threads: The number of threads in the JSF business thread pool, for example: <jsf:server id="jsf" protocol="jsf" threadpool="fixed" threads="16" />

fixed/cached: The thread pool type of the JSF business thread pool, for example: <jsf:server id="jsf" protocol="jsf" threadpool="fixed" threads="200" />

