Interface load-test performance analysis and tuning recommendations

A common Internet architecture is Spring + MyBatis + MySQL + Redis, and the company I work for is no exception. Internal application interfaces are usually called directly, programming against interfaces on both sides, either in-process or through an RPC framework such as Dubbo, and the data is usually passed as JSON, which makes it easy to standardize and convert values between systems; the Redis or Memcached cache layer generally stores strings or JSON-formatted objects. Interfaces exposed to the outside normally go through load testing so that their capacity can be estimated and future tuning can be guided. The interfaces below showed all kinds of "strange" phenomena during load testing. "Strange" means that, judged from the outside, the results did not match our usual reasoning; in essence it means we were not familiar with how our programs behave under load, and trying to explain the behaviour with our everyday mental model simply did not work. What follows is my consolidated analysis of the data after working through these load tests; the phenomena are common ones in load testing, and I believe the analysis and the improvement measures have real reference value. Details follow. (Interface names are partly omitted for security reasons, which does not affect the analysis; a notation such as 1N3T means 1 nginx plus 3 Tomcat servers, and the TPS figures before and after optimization are for comparison only, with no absolute meaning.)

 

1 Interface Name: Get a list

1.1 Load-test phenomenon: a single instance does a little over 700 TPS, with high application CPU load

  1.1.1 Analysis:

    The old framework has a long average response time and high application CPU; internally the code does a large number of bean-map-JSON conversions. Changing the number of database connections does not improve TPS.

  1.1.2 Improvements:

    Rebuild the system: replace the previous DAO layer with MyBatis, reduce the internal bean-map-JSON data conversions, and remove useless operations from the code.
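    As a rough illustration of the "fewer conversions" idea (a minimal sketch; the mapper, DTO and column names below are hypothetical, not the project's real ones): the MyBatis mapper returns a typed result directly, so the object is serialized to JSON once at the response boundary instead of being shuffled through bean-map-JSON inside the service.

        import java.util.List;
        import org.apache.ibatis.annotations.Param;
        import org.apache.ibatis.annotations.Select;

        // Hypothetical mapper: MyBatis binds result columns straight onto the DTO
        // (this assumes mapUnderscoreToCamelCase=true in the MyBatis configuration),
        // so no intermediate Map or extra JSON round trip is needed.
        public interface OrderListMapper {

            @Select("SELECT order_id, order_status_id, create_time "
                  + "FROM orders WHERE user_id = #{userId}")
            List<OrderItem> listByUser(@Param("userId") long userId);

            // Plain DTO the rows are mapped onto; it is serialized to JSON exactly once,
            // at the controller/response boundary.
            class OrderItem {
                private long orderId;
                private int orderStatusId;
                private java.util.Date createTime;

                public long getOrderId() { return orderId; }
                public void setOrderId(long v) { this.orderId = v; }
                public int getOrderStatusId() { return orderStatusId; }
                public void setOrderStatusId(int v) { this.orderStatusId = v; }
                public java.util.Date getCreateTime() { return createTime; }
                public void setCreateTime(java.util.Date v) { this.createTime = v; }
            }
        }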

  1.1.3 Results:

    After the rework, TPS rises to about 3,000, a big improvement, but the application CPU is almost maxed out under load, so there is still room for improvement.

1.2 Load-test phenomenon: high database resource utilization

  1.2.1 Analysis:

    With a single application instance, the database CPU can reach 50%; with 10 Tomcat instances at 10,000 concurrent users, the database CPU is maxed out and the load value exceeds 700, yet the database QPS is only 11,554, which is not that much. The suspicion is that some SQL is burning the CPU, perhaps a statement that does not go through an index or whose index has become ineffective.

  1.2.2 Improvements:

    Checking the SQL turned up the following statement: select count(1) from orders where order_status_id != 40. It was changed to select order_id from orders, with the program then filtering out the rows whose order_status_id is 40 and taking the count via list.size(). Even though order_status_id has an index, a != comparison cannot use the index for the lookup, which is what drove the database CPU up.
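    A minimal, self-contained sketch of that rewrite (the real statement and surrounding code are more involved; the class name and connection handling here are illustrative only): the != predicate is dropped from the SQL, and the status-40 exclusion plus the count are done in Java instead.

        import java.sql.Connection;
        import java.sql.PreparedStatement;
        import java.sql.ResultSet;
        import java.sql.SQLException;

        public class OrderStatusCounter {

            // Replaces: SELECT COUNT(1) FROM orders WHERE order_status_id != 40
            // The query no longer carries the '!=' predicate that the index cannot serve;
            // the exclusion is applied here in application code and counted in the loop.
            public static long countOrdersNotInStatus40(Connection conn) throws SQLException {
                long count = 0;
                try (PreparedStatement ps = conn.prepareStatement(
                         "SELECT order_status_id FROM orders");
                     ResultSet rs = ps.executeQuery()) {
                    while (rs.next()) {
                        if (rs.getInt(1) != 40) {   // filter moved out of the database
                            count++;
                        }
                    }
                }
                return count;
            }
        }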

  1.2.3 Results:

    In the same environment (1 nginx, 10 Tomcat, 1,000 concurrent users), TPS went from 3,000 to 3,727, a modest increase, but the database CPU dropped to 30%; the effect is obvious.

1.3 Load-test phenomenon: 1N15T gives 4,552 TPS; 10N15T gives 9,608 TPS

  1.3.1 Analysis:

    With 15 Tomcat instances at the back end and 1 nginx in front, TPS is 4,552; after hanging 10 nginx behind an LVS it only reaches 9,608. The growth is not proportional even though neither nginx nor Tomcat is under much pressure, so the clustered result is unreasonable; the nginx forwarding configuration is suspected.

  1.3.2 Improvements:

    Not improved further yet. The nginx parameters probably need adjusting; we found earlier that different nginx configurations have a large influence on the TPS of a back-end cluster.

  1.3.3 Results:

 

 

2 Interface Name: Information query

2.1 Load-test phenomenon: a single instance does about 2,000 TPS with high application CPU; database QPS is about 15,000

  2.1.1 Analysis:

    The old framework again: internally the code does a large number of BO-map-JSON conversions.

  2.1.2 Improvements:

    Remove redundant code, replace the connection-pool library, and switch to MyBatis.

  2.1.3 Results:

    TPS rose from just over 2,000 to more than 8,000, with database QPS at about 9,000. Under load the optimized application's CPU usage is still high, almost maxed out.

 

2.2 Load-test phenomenon: the database is under no pressure, and adding application instances does not change TPS

  2.2.1 Analysis:

    1N1T and 1N10T both give about 5,000 TPS; raising the concurrency only increases the number of errors. Application CPU is at about 70% and the database is under no pressure. Running ss -s on the single nginx shows its ports are exhausted: there are more than 60,000 connections in TIME_WAIT on the ports nginx uses to forward to Tomcat. nginx is the bottleneck.

  2.2.2 Improvements:

    Tune the nginx parameters: change the short-lived connections to the back end into keep-alive (long) connections.
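    The article does not show the configuration it used, so the following is only a minimal sketch of what nginx-to-Tomcat keep-alive connections typically look like (the upstream name and back-end addresses are placeholders): the keepalive directive keeps a pool of idle connections to the Tomcats, and the two proxy_* lines are needed for those connections to actually be reused.

        upstream tomcat_pool {
            server 10.0.0.11:8080;      # placeholder back-end addresses
            server 10.0.0.12:8080;
            server 10.0.0.13:8080;
            keepalive 256;              # idle keep-alive connections cached per worker
        }

        server {
            listen 80;
            location / {
                proxy_pass http://tomcat_pool;
                proxy_http_version 1.1;           # keep-alive to the upstream needs HTTP/1.1
                proxy_set_header Connection "";   # strip "Connection: close" so links are reused
            }
        }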

  2.2.3 Results:

    With 1N3T, TPS reaches 17,376; Tomcat CPU is at 84%, database QPS is 18,000 with CPU at 69%; the application's resources are now basically used to capacity.

 

 

3 Interface Name: Get the Details

3.1 Load-test phenomenon: a single instance does over 2,600 TPS, but 10 Tomcat instances only reach about 3,700

  3.1.1 Analysis:

    Adding application servers does not grow TPS noticeably, yet the load on nginx, Tomcat and the database is not high, which shows the servers themselves are not the bottleneck; so consider whether the problem is the network. Monitoring the NIC shows its packet traffic is saturated; because this interface returns a large amount of data, the bottleneck is in the network. In addition, Redis reported errors during the test; the Redis server is a virtual machine, so its service capacity may also be limited.

  3.1.2 Improvements:

    Enable gzip compression on Tomcat.
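    For reference, enabling gzip on Tomcat is a Connector setting in server.xml; a minimal sketch follows (the threshold and MIME list are illustrative; Tomcat 8.5/9 spells the attribute compressibleMimeType, older versions use compressableMimeType).

        <!-- server.xml: HTTP connector with gzip response compression enabled -->
        <Connector port="8080" protocol="HTTP/1.1"
                   connectionTimeout="20000"
                   compression="on"
                   compressionMinSize="2048"
                   compressibleMimeType="application/json,text/html,text/plain"
                   redirectPort="8443" />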

  3.1.3 Results:

    Under the same concurrency (1 nginx, 10 Tomcat, 1,000 concurrent users), TPS rose from 3,727 to 10,022, an increase of nearly 3x; the effect is significant.

 

3.2 Load-test phenomenon: nginx parameter tuning noticeably improves the TPS of the 1N10T cluster

  3.2.1 Analysis:

    After gzip is enabled on Tomcat, 1N10T reaches 10,022 TPS; we want to push it further.

  3.2.2 Improvements:

    Optimize nginx (a sketch of the corresponding directives follows the list):

    - turn off the nginx access log

    - reduce the number of nginx worker processes from 24 to 16

    - raise the nginx keepalive connection count from 256 to 2048
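    A minimal sketch of those three changes expressed as nginx.conf directives (only the values above come from the article; the upstream block and addresses are placeholders):

        worker_processes  16;            # reduced from 24

        http {
            access_log  off;             # access logging switched off for the test

            upstream tomcat_pool {
                server 10.0.0.11:8080;   # placeholder back ends
                keepalive 2048;          # raised from 256
            }
        }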

  3.2.3 Results:

    TPS rose from 10,022 to 13,270.

3.3 Load-test phenomenon: 1N5T and 1N10T give roughly the same TPS

  3.3.1 Analysis:

    1N10T gives about 10,003 TPS and 1N5T about 10,002, essentially the same. The Tomcat applications' resources are not fully used (CPU around 65%), but the database QPS has already passed 20,000, which is roughly the capacity of a single database server, so adding application instances brings no further gain and only makes response times longer.

  3.3.2 Improvements:

    The single database cannot be pushed further; either upgrade the server hardware or separate reads and writes.

  3.3.3 Results:

 

 

4 Interface Name: Promotion

4.1 Load-test phenomenon: data is read from Redis, yet TPS is only just over 1,000 and CPU pressure is high

  4.1.1 Analysis:

    This interface reads its data from Redis, yet TPS refuses to go above roughly 1,000 while CPU usage is at 80%, which indicates a large amount of serialization and deserialization inside the program; the JSON serialization library may be the problem.

  4.1.2 Improvements:

    Replace net.sf.json with Alibaba's fastjson.
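    A minimal sketch of the swap (the Promotion class and its fields are hypothetical stand-ins for the real DTO): the call sites move from net.sf.json's JSONObject.fromObject(...) to fastjson's JSON facade, which does the same serialize/deserialize work far more cheaply.

        import com.alibaba.fastjson.JSON;

        public class PromotionJsonDemo {

            // Hypothetical payload class standing in for the real promotion DTO.
            public static class Promotion {
                private long id;
                private String title;
                public long getId() { return id; }
                public void setId(long id) { this.id = id; }
                public String getTitle() { return title; }
                public void setTitle(String title) { this.title = title; }
            }

            public static void main(String[] args) {
                Promotion promo = new Promotion();
                promo.setId(42L);
                promo.setTitle("demo");

                String json = JSON.toJSONString(promo);                   // serialize
                Promotion back = JSON.parseObject(json, Promotion.class); // deserialize
                System.out.println(json + " -> " + back.getTitle());
            }
        }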

  4.1.3 Results:

    Under the same concurrency, TPS rose from just over 1,000 to more than 5,000, an increase of nearly 5x.

  

4.2 Load-test phenomenon: TPS drops sharply when there are many parameters

  4.2.1 Analysis:

    This interface fetches data from Redis according to the request parameters, with one Redis round trip per parameter. With one parameter TPS is 5,133; with five parameters it drops to 1,169. The repeated round trips hurt processing performance.

  4.2.2 Improvements:

    Change the Redis reads from get to mget, reducing the number of round trips.
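    A minimal sketch of the get-to-mget change, assuming the Jedis client (the article does not name the client; the host, port and key names are placeholders): five per-parameter round trips collapse into a single MGET command.

        import java.util.Arrays;
        import java.util.List;
        import redis.clients.jedis.Jedis;

        public class PromotionCacheDemo {

            public static void main(String[] args) {
                List<String> keys = Arrays.asList("promo:1001", "promo:1002", "promo:1003",
                                                  "promo:1004", "promo:1005");
                try (Jedis jedis = new Jedis("redis-host", 6379)) {   // placeholder address
                    // Before: one network round trip per parameter.
                    // for (String key : keys) { values.add(jedis.get(key)); }

                    // After: a single MGET round trip fetches all parameters.
                    List<String> values = jedis.mget(keys.toArray(new String[0]));
                    System.out.println(values);
                }
            }
        }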

  4.2.3 Results:

    With five parameters and a 1N3T setup the load test reaches 9,707 TPS; from this we estimate that even a single Tomcat could do three to four thousand TPS, three to four times the throughput of calling get once per parameter.

4.3 Load-test phenomenon: 1N3T gives around 10,000 TPS; adding more Tomcat instances will probably not raise it much

  4.3.1 Analysis:

    We say "probably" because, although the nginx server's CPU is not high, it is already handling more than 800k packets per second; at that point the nginx server's network traffic should be the bottleneck. (This is just a guess.)

  4.3.2 Improvements:

    Add more nginx instances to share the load, with an LVS in front of them.

  4.3.3 Results:

 

 

5 Interface Name: Tracking Interface

5.1 Load-test phenomenon: 1N10T gives a lower TPS than 1N3T

  5.1.1 Analysis:

    With 1N3T at 2,000 concurrent users, TPS is 9,849; at that point database QPS is 90,000 and CPU is 80%. After increasing to 10 Tomcat instances and 5,000 concurrent users, TPS drops to 7,813 while database QPS falls to 19,000, CPU to 75% and load to 1; the database pressure goes down instead of up as the applied load increases. We then noticed that the nginx server's NIC traffic had reached 885M: under such load the network is saturated and packets are dropped, which is why the pressure on the database side actually falls.

  5.1.2 Improvements:

    Since some of these interfaces transfer a large amount of data, be aware during load testing that the network can become the bottleneck.

  5.1.3 Results:

 

6 Interface Name: Backfill

6.1 Load-test phenomenon: TPS is below 500; database QPS is about 3,500

  6.1.1 Analysis:

    Although neither the database CPU nor the application CPU is heavily utilized, the application's TPS is low, so there is a bottleneck somewhere; the thing to focus on is whether the database is slow in handling the queries.

  6.1.2 Improvements:

    1. Replace the DBCP connection pool with HikariCP (see the sketch after this list);

    2. Reduce the amount of log output;

    3. Optimize the SQL: move some of the filter conditions into the Java code.
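    For item 1, a minimal sketch of a programmatic HikariCP setup ("hikar" in the original refers to HikariCP; the JDBC URL, credentials and pool sizing are placeholders):

        import javax.sql.DataSource;
        import com.zaxxer.hikari.HikariConfig;
        import com.zaxxer.hikari.HikariDataSource;

        public class PoolConfig {

            // Replaces the old DBCP BasicDataSource with a HikariCP pool.
            public static DataSource createDataSource() {
                HikariConfig cfg = new HikariConfig();
                cfg.setJdbcUrl("jdbc:mysql://db-host:3306/app");   // placeholder URL
                cfg.setUsername("app");                            // placeholder credentials
                cfg.setPassword("secret");
                cfg.setMaximumPoolSize(50);                        // illustrative pool size
                cfg.setConnectionTimeout(3000);                    // fail fast under load (ms)
                return new HikariDataSource(cfg);
            }
        }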

  6.1.3 Results:

    TPS grew from under 500 to more than 1,300.

7 Interface Name: Coupons

7.1 Load-test phenomenon: the cluster result is unreasonable compared with the single-instance result

  7.1.1 Analysis:

    Check whether any resource has hit a bottleneck: with a 5-Tomcat cluster, TPS is 9,952, but the database QPS is already 50,000 to 60,000 with CPU utilization at 37%, which indicates a large volume of primary-key or index lookups. A single database server can generally sustain only about 40,000 QPS, so adding more application instances to the cluster will not have much effect on TPS.

  7.1.2 Improvements:

    Consider splitting the data across multiple databases (sharding).

  7.1.3 Results:

 

8 Interface Name: Recommendation

8.1 Load-test phenomenon: the difference between nginx keep-alive (long) and short connections

  8.1.1 Analysis:

    With 18 nginx and 2 Tomcat instances, TPS is 8,100 and the application server's ports are exhausted; in general, under high concurrency, short connections from nginx easily cause this port-exhaustion problem.

  8.1.2 Improvements:

    Switch nginx to keep-alive (long) connections to the back end.

  8.1.3 Results:

    TPS grew to 10,733 and became stable, with fewer swings, but the CPU is now exhausted; with the CPU maxed out, any further tuning at this point would have to come from code optimization.

9 Interface Name: Query 2

9.1 Load-test phenomenon: 18N20T reaches only 6,842 TPS

  9.1.1 Analysis:

    With 18 nginx and 20 Tomcat instances, TPS only reaches 6,842 while application CPU utilization is 85%, indicating a CPU bottleneck. Reviewing this interface, however, shows no heavy computation, so the problem may be the logging level: the CPU is busy writing log output.

  9.1.2 Improvements:

    Change the log level from debug to info.
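    The fix itself is only a configuration change (root logger level from DEBUG to INFO in the logging configuration). A complementary code-level habit, sketched below with SLF4J (the class and the expensive dump are hypothetical), keeps costly debug formatting off the hot path even if someone turns DEBUG back on:

        import org.slf4j.Logger;
        import org.slf4j.LoggerFactory;

        public class QueryService {

            private static final Logger log = LoggerFactory.getLogger(QueryService.class);

            public String handle(String request) {
                // Guarded debug logging: the costly dump is only built when DEBUG is enabled.
                if (log.isDebugEnabled()) {
                    log.debug("raw request payload: {}", expensiveDump(request));
                }
                return "ok"; // placeholder for the real business logic
            }

            // Stand-in for an expensive serialization that would show up in a CPU profile.
            private String expensiveDump(String request) {
                return request;
            }
        }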

  9.1.3 Results:

    In the same environment, TPS grew from 6,842 to 23,592. A debug log level in the production environment is a real trap.

 

Reposted from: https://www.cnblogs.com/dimmacro/p/4849729.html
