Interface Load Testing in Practice with ab and wrk

Operational campaigns often generate a large volume of requests. To assess whether the current system can handle the estimated live traffic, we load test the interfaces involved, usually in several rounds. The main goals are to uncover hidden bugs, find points that can be optimized, and evaluate the current system load. Based on the results we then prepare capacity expansion and similar measures, so that the system supports the campaigns stably and reliably.

Load testing terminology

1. Throughput

Concept: a quantitative measure of the server's concurrent processing capability, in reqs/s. It is the number of requests processed per unit time at a given number of concurrent users. The largest number of requests that can be processed per unit time at a given number of concurrent users is called the maximum throughput. Calculation: total number of requests / time spent processing them, i.e. Requests per second = Complete requests / Time taken for tests

2. Number of concurrent connections

Concept: the number of requests the server is handling at a given moment.

3. Number of concurrent users

Concept: this is not the same as the number of concurrent connections. One user may open several sessions at the same time, and each session is a connection.

4. Average user request waiting time

Calculation: time spent processing all requests / (total number of requests / number of concurrent users), i.e. Time per request = Time taken for tests / (Complete requests / Concurrency Level)

These are the metrics we care about in load testing. Of these, throughput is the one we care about most.
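As a quick check of the two formulas above, here is a minimal sketch that recomputes them from the counters of a sample run (500 complete requests, 2.523 seconds, concurrency level 500). The function names are made up for illustration and are not part of ab. ab computes its own figures from the unrounded elapsed time, so its report can differ from these in the last digits.

```shell
#!/bin/sh
# Recompute ab's summary metrics from the raw counters.
# Function names are illustrative only; they are not provided by ab.

# requests_per_second <complete requests> <time taken in seconds>
requests_per_second() {
    awk -v n="$1" -v t="$2" 'BEGIN { printf "%.2f\n", n / t }'
}

# time_per_request_ms <complete requests> <time taken in seconds> <concurrency level>
time_per_request_ms() {
    awk -v n="$1" -v t="$2" -v c="$3" 'BEGIN { printf "%.3f\n", t * 1000 / (n / c) }'
}

requests_per_second 500 2.523      # prints 198.18
time_per_request_ms 500 2.523 500  # prints 2523.000
```

With the concurrency level equal to the number of requests, the mean time per request equals the whole test duration, which is why the second figure is so large.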

1. Load testing with ab in practice

Introduction to ab usage

Here are the common parameters and their usage:

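-n requests      total number of requests
-c concurrency   number of requests issued at a time, i.e. the concurrency level
-t timelimit     maximum number of seconds to spend on the test (a limit on the whole run, not a per-request timeout; it implies -n 50000 internally)
-p postfile      file containing the data to POST
-T content-type  Content-type header to use for POST data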

GET request: ab -n 1000 -c 10 -t 10 "testurl.com/get_user?ui…"

POST request: ab -n 1000 -c 10 -t 10 -p post.json -T "application/x-www-form-urlencoded" "testurl.com/add_user", where post.json holds the JSON parameters your interface requires:

{"name":"bob","age":12,"sex":1}

Batch load-testing script (testing multiple interfaces at the same time)

In practice we usually need to load test many interfaces, and in production those interfaces receive requests at the same time. Measuring a single interface in isolation therefore does not give a realistic TPS, and we need to write our own script to test them simultaneously. Below is a simple batch load-testing script, test.sh:

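#!/bin/sh

RESULT_DIR="/data/home/yawenxu/TestResult/"
API_URL="http://api.com/xxx" # interface under test
jsonFileArr="ff0e_00.json ff0e_02.json ff0e_05.json ff0e_06.json ff0e_07.json ff0e_08.json ff0e_10.json ff0e_11.json" # JSON parameter file that each interface needs for its POST
concurrency=${1-1} # concurrency level (ab -c)
count=${2-1}  # total number of requests (ab -n)
input_file_name=$3 # optional: test only this JSON file

# exec_single_ab <json file> <concurrency> <total requests> <url>
exec_single_ab(){
        if [ -f "$1" ];then
                # start ab in the background; its report goes to a per-interface file
                /usr/bin/ab -n "$3" -c "$2" -p "$1" -T "application/x-www-form-urlencoded" "$4" >"$RESULT_DIR$1${3}_nginx.txt" 2>&1 &
        else
                echo "$1 does not exist"
        fi
}

exec_loop_ab(){
        for item_name in $jsonFileArr
        do
                exec_single_ab "$item_name" "$concurrency" "$count" "$API_URL"
        done
}

# the -n check keeps an omitted third argument from passing the -f test
if [ -n "$input_file_name" ] && [ -f "$input_file_name" ];then
        exec_single_ab "$input_file_name" "$concurrency" "$count" "$API_URL"
else
        exec_loop_ab
fi

wait # block until every background ab run has finished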

The script load tests multiple interfaces at the same time and saves each interface's report to a file for later analysis. Running sh test.sh 10000 10000 generates the following report files in one go:

ff0e_00.json10000_nginx.txt  ff0e_05.json10000_nginx.txt  ff0e_07.json10000_nginx.txt  ff0e_10.json10000_nginx.txt
ff0e_02.json10000_nginx.txt  ff0e_06.json10000_nginx.txt  ff0e_08.json10000_nginx.txt  ff0e_11.json10000_nginx.txt

One of the report files looks like this:

This is ApacheBench, Version 2.3 <$Revision: 655654 $>
Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Licensed to The Apache Software Foundation, http://www.apache.org/

Benchmarking www.voovlive.com (be patient)
Completed 100 requests
Completed 200 requests
Completed 300 requests
Completed 400 requests
Completed 500 requests
Finished 500 requests


Server Software:        
Server Hostname:        10.228.15.46
Server Port:            3002

Document Path:          /test/get
Document Length:        248 bytes

Concurrency Level:      500
Time taken for tests:   2.523 seconds
Complete requests:      500
Failed requests:        0
Write errors:           0
Total transferred:      161500 bytes
Total POSTed:           359640
HTML transferred:       124000 bytes
Requests per second:    198.19 [#/sec] (mean)
Time per request:       2522.808 [ms] (mean)
Time per request:       5.046 [ms] (mean, across all concurrent requests)
Transfer rate:          62.52 [Kbytes/sec] received
                        139.21 kb/s sent
                        201.73 kb/s total

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0   13   4.1     14      17
Processing:   416  948 516.2    828    2485
Waiting:      415  947 516.4    827    2485
Total:        432  961 513.2    841    2497

Percentage of the requests served within a certain time (ms)
  50%    841
  66%    913
  75%    990
  80%   1213
  90%   2039
  95%   2064
  98%   2476
  99%   2484
 100%   2497 (longest request)

The fields we care about most are:

1. Concurrency Level: 500 // concurrency
2. Time taken for tests: 2.523 seconds // total time spent
3. Complete requests: 500 // completed requests
4. Failed requests: 0 // failed requests
5. Requests per second: 198.19 [#/sec] (mean) // throughput (TPS)
6. Time per request: 2522.808 [ms] (mean) // average time per request
7. Percentage of the requests served within a certain time (ms) // distribution of request times

Load testing mainly evaluates how much load the current system can take, so throughput (TPS) and the success rate are usually enough. To evaluate the performance of an individual interface, also look at items 6 and 7. We usually need to vary the parameters (concurrency and total number of requests) and repeat the test several times to get an accurate TPS.
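Because the batch script writes one report per interface, it can help to pull out just the throughput and failure count from each file. The following is a rough sketch that assumes the ab report layout shown above. summarize_ab_report is a hypothetical helper name, and the directory is the one used by test.sh.

```shell
#!/bin/sh
# summarize_ab_report <report file>
# Print completed requests, failed requests and throughput from one ab report.
summarize_ab_report() {
    awk -F': *' '
        /^Complete requests:/   { complete = $2 }
        /^Failed requests:/     { failed = $2 }
        /^Requests per second:/ { split($2, a, " "); rps = a[1] }
        END { printf "complete=%d failed=%d rps=%s\n", complete, failed, rps }
    ' "$1"
}

# Summarize every report written by test.sh (directory taken from the script above).
RESULT_DIR="/data/home/yawenxu/TestResult/"
for f in "$RESULT_DIR"*_nginx.txt; do
    # skip the unexpanded pattern when no report exists yet
    [ -f "$f" ] && printf '%s: %s\n' "$f" "$(summarize_ab_report "$f")"
done
```

A report that was cut short (see the shortcomings below) has none of these fields, so it shows up as complete=0 with an empty rps value.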

Summary of issues

ab has the following advantages:

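1. Simple and easy to use
2. Supports POST requests and accepts a JSON file as the request body (convenient for scripted batch testing)
3. Good enough when the load is modest (below 1024 concurrent requests)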

But it also has some obvious shortcomings:

1. Sustained, rate-controlled load testing is not possible: ab cannot hold a target request rate for a set duration.
2. The load it can generate is limited. ab runs as a single process and can only use one CPU core, and the per-process limit on open files is commonly 1024 by default, so it tops out at roughly 1024 concurrent connections. On a multi-core machine the remaining cores are wasted.
3. If the server rejects requests midway, the report is incomplete and gives no results for the requests that had already completed. Below is a report from a run that the server rejected midway, which is not very friendly:
This is ApacheBench, Version 2.3 <$Revision: 655654 $>
Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Licensed to The Apache Software Foundation, http://www.apache.org/

Benchmarking 100.125.176.120 (be patient)
Completed 100 requests
Completed 200 requests
Completed 300 requests
Completed 400 requests
Completed 500 requests
apr_socket_recv: Connection timed out (110)
Total of 536 requests completed
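The open-file limit mentioned in point 2 can be checked before starting a run. This is only a sketch: fits_fd_limit is a hypothetical helper, and the comparison is deliberately strict because the process also needs a few descriptors for its own input and output.

```shell
#!/bin/sh
# fits_fd_limit <requested concurrency> [limit]
# Succeeds when the requested concurrency is below the open-file limit.
# The limit defaults to this shell's soft limit as reported by `ulimit -n`.
fits_fd_limit() {
    limit=${2:-$(ulimit -n)}
    [ "$limit" = "unlimited" ] && return 0
    [ "$1" -lt "$limit" ]
}

if fits_fd_limit 500; then
    echo "500 concurrent connections fit under the open-file limit"
else
    echo "raise the limit before testing, for example with: ulimit -n 65535"
fi
```

Raising the soft limit beyond the hard limit normally requires root privileges, and a higher limit does not change the fact that ab uses a single CPU core.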

2. Load testing with wrk in practice

Once these shortcomings meant that ab could no longer meet our load-testing needs, I looked at many load-testing tools and finally settled on wrk. It is just as easy to use, has none of ab's drawbacks above, and costs only slightly more to learn. wrk can generate significant load from a single multi-core CPU: it combines a multithreaded design with scalable event notification systems such as epoll and kqueue.

Introduction to wrk usage

1. Installation

The load test is run on a machine in the test environment. Because that machine cannot access the Internet, installing tools on it is cumbersome. Why use a test-environment machine anyway? Because:

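1. My local machine runs Windows, where installation is just as troublesome.
2. Some production services are load tested by IP, which my local machine cannot reach.
3. The test-environment machines are more powerful and can generate more concurrent requests.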

Installation steps

  1. Download the source package to your local machine
  2. Upload the package from your local machine to the test-environment machine
  3. Unpack and compile it in the test environment. If compilation fails because dependency libraries are missing, install them by following the same steps.

2. Main parameters

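-t --threads      number of threads to use, which controls how fast requests are issued. The maximum is generally 2-4 times the total number of CPU cores
-c --connections  number of connections (sessions) to keep open
-d --duration     duration of the test (s)
-s --script       load a Lua script (for POST requests, put the parameters in the script)
-H --header       add a header to the request
--latency         print the response latency distribution in the report
--timeout         maximum request timeout (s); very useful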

A simple example: wrk -t4 -c1000 -d30s -T30s --latency http://www.baidu.com. This command uses 4 threads to simulate 1000 concurrent connections, runs the test for 30 seconds with a 30-second timeout, and prints latency statistics.

Batch load-testing script (testing multiple interfaces at the same time)

1. Write the Lua scripts. Because the interface takes JSON parameters, first write a Lua script that supplies them. ff0e_05.lua is as follows:

request = function()
    local headers = { }
    headers['Content-Type'] = "application/json"
    body = '{"uin":"123","lang":"en"}'
    return wrk.format('POST', nil, headers, body)
end
2. Write a simple batch load-testing script, test.sh:
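#!/bin/sh

RESULT_DIR="/data/home/yawenxu/wrkTestResult/"
API_URL="http://api.com/xxx" # interface under test
luaFileArr="ff0e_00.lua ff0e_02.lua ff0e_05.lua ff0e_06.lua ff0e_07.lua ff0e_08.lua ff0e_10.lua ff0e_11.lua" # Lua file holding the JSON parameters that each interface needs for its POST
concurrency=${1-1} # number of threads, used to control the request rate
count=${2-1}  # number of connections to keep open
continueTime=${3-1}  # test duration
input_file_name=$4 # optional: test only this Lua file

# exec_single_wrk <lua file> <threads> <connections> <url>
exec_single_wrk(){
        if [ -f "$1" ];then
                # start wrk in the background; its report goes to a per-interface file
                ./wrk -t "$2" -c "$3" -d "$continueTime" --script="$1" --latency --timeout 10 "$4" >"$RESULT_DIR$1${3}_nginx.txt" 2>&1 &
        else
                echo "$1 does not exist"
        fi
}

exec_loop_wrk(){
        for item_name in $luaFileArr
        do
                exec_single_wrk "$item_name" "$concurrency" "$count" "$API_URL"
        done
}

# the -n check keeps an omitted fourth argument from passing the -f test
if [ -n "$input_file_name" ] && [ -f "$input_file_name" ];then
        exec_single_wrk "$input_file_name" "$concurrency" "$count" "$API_URL"
else
        exec_loop_wrk
fi

wait # block until every background wrk run has finished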

The script load tests multiple interfaces at the same time and saves each interface's report to a file for later analysis. Running sh test.sh 12 5000 10 generates the report files for all interfaces in one go. One of the reports looks like this:

Running 10s test @ http://api.com/xxx
  12 threads and 5000 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency     1.27s   386.53ms   1.99s    59.87%
    Req/Sec    53.85     75.74   777.00     93.91%
  Latency Distribution
     50%    1.26s
     75%    1.61s
     90%    1.80s
     99%    1.97s
  3641 requests in 10.10s, 1.88MB read
  Socket errors: connect 0, read 0, write 0, timeout 2146
Requests/sec:    360.51
Transfer/sec:    190.82KB

The report shows a 10-second sustained test holding 5000 connections with a 10-second timeout. Timeouts are severe for this interface: of 3641 total requests, 2146 timed out. Yet the Latency Distribution shows that 99% of requests got a response within 1.97s. At this point we should ask whether the forwarding capacity of the nginx access layer is too low. To check, we can load test our service directly by IP, bypassing the nginx access layer. That test confirmed that nginx forwarding was the bottleneck, so the next step is to scale out nginx.
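The share of timed-out requests can be read straight from the report. The sketch below assumes the wrk report layout shown above. wrk_timeout_ratio is a hypothetical helper name, and dividing timeouts by the request total follows this article's reading of the two numbers.

```shell
#!/bin/sh
# wrk_timeout_ratio <report file>
# Print timeouts as a percentage of the total requests in one wrk report.
wrk_timeout_ratio() {
    awk '
        / requests in /     { total = $1 }      # e.g. "3641 requests in 10.10s, ..."
        /^ *Socket errors:/ { timeouts = $NF }  # last field is the timeout count
        END { if (total > 0) printf "%.1f\n", timeouts * 100 / total }
    ' "$1"
}

# Example with the two relevant lines from the report above.
report=$(mktemp)
printf '  3641 requests in 10.10s, 1.88MB read\n  Socket errors: connect 0, read 0, write 0, timeout 2146\n' > "$report"
wrk_timeout_ratio "$report"   # prints 58.9
rm -f "$report"
```

A ratio this high alongside a 99th percentile latency under 2 seconds is the mismatch that pointed to the access layer rather than the service itself.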

While the load-testing script runs, we also need to watch the CPU usage and system load of the production service in real time. In our case both CPU usage and load spiked noticeably. Command: top -d 1. For deeper analysis, see CPU performance analysis with perf and flame graphs.

Summary of issues

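  1. Do not load test a single interface in isolation. Simulate the real situation, in which many interfaces receive requests at the same time. Only then will system problems show up.
  2. Load testing has to take many factors into account:
1. The performance of the machine running the test: can the number of requests it sends per second reach the estimate?
2. The load-testing tool: can the number of requests it sends per second reach the estimate, and can it sustain the load?
3. When load testing a production service, consider the other services it depends on, such as the processing capacity of the access layer (nginx) itself. The access layer can be load tested too.
4. During the test, watch the production service's CPU usage and system load in real time. This can reveal bugs in the code.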

This article was first published on Juejin. If you reproduce it, please credit the original author. If you found it helpful or inspiring, you can buy me a coffee. Disclosure: all software mentioned in this article is tooling the author uses day to day, and no advertising fees were received.


Origin juejin.im/post/5da18ba5f265da5b7e23f6de