Picking a random value from a sorted set: which approach should be used?

Today I ran a small experiment. Here is how it went:

First, I created the following test data in Redis:

> zadd my_zset_999 1 35570
(integer) 1
> zadd my_zset_999 2 40617
(integer) 1
> zadd my_zset_999 3 40956
(integer) 1
> zadd my_zset_999 4 41151
(integer) 1
>
> zrange my_zset_999 0 -1 WITHSCORES
1) "35570"
2) "1"
3) "40617"
4) "2"
5) "40956"
6) "3"
7) "41151"
8) "4"
>
> zrange my_zset_999 0 -1
1) "35570"
2) "40617"
3) "40956"
4) "41151"

The test method is simple: measure the running time of the code.

$t1 = microtime(true);
// code snippet under test
$t2 = microtime(true);
$t = $t2 - $t1;
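Since a single measurement is noisy, the same idea can be wrapped in a small helper that repeats the measurement. This is only a sketch: `time_runs` is a hypothetical name that does not appear in the original test, and `usleep()` stands in for the Redis call being measured.

```php
<?php
// Hypothetical helper (not from the original post): runs a callable several
// times and returns the elapsed time of each run in seconds.
function time_runs(callable $snippet, int $runs = 10): array
{
    $timings = [];
    for ($i = 0; $i < $runs; $i++) {
        $t1 = microtime(true);
        $snippet();                 // code snippet under test
        $t2 = microtime(true);
        $timings[] = $t2 - $t1;
    }
    return $timings;
}

// Example with a placeholder workload; swap in the Redis call to be measured.
$timings = time_runs(function () {
    usleep(1000);
}, 5);

printf(
    "min %.6f s, mean %.6f s, max %.6f s\n",
    min($timings),
    array_sum($timings) / count($timings),
    max($timings)
);
```

Looking at min, mean, and max together makes it easier to tell a real difference from run-to-run fluctuation.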

Method 1
zrange key 0 -1 fetches all values
array_rand() picks a random element from the resulting array

Method 2
zcount key -inf +inf counts the elements in the set (cnt)
rand(1, cnt) generates a random number (random)
zrangebyscore key random random

Method 3: a variant of Method 2
zcard key counts the elements in the set (cnt)
rand(1, cnt) generates a random number (random)
zrangebyscore key random random

Method 4: a variant of Method 1
zrangebyscore key -inf +inf
array_rand() picks a random element from the resulting array

Methods 1 and 4 first fetch every value in the sorted set and then pick one at random on the client;
methods 2 and 3 generate a random score and fetch only the matching value from the sorted set.
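The two strategies can be sketched roughly as follows. This assumes `$redis` is a connected phpredis client (the `Redis` class from the phpredis extension); the function names are hypothetical and are not the code that was actually timed.

```php
<?php
// Sketch only. $redis is assumed to be a connected phpredis client.
// Both function names are hypothetical.

// Methods 1 and 4: fetch every member, then pick one locally.
function random_member_fetch_all($redis, string $key): ?string
{
    // Method 4 would call zRangeByScore($key, '-inf', '+inf') here instead.
    $members = $redis->zRange($key, 0, -1);
    if (empty($members)) {
        return null;
    }
    // array_rand() returns a random key of the array, used here as an index.
    return $members[array_rand($members)];
}

// Method 3: count with ZCARD, then fetch the member whose score equals
// a random number. Assumes the scores are exactly 1..N with no gaps,
// as in the test data.
function random_member_by_score($redis, string $key): ?string
{
    $cnt = $redis->zCard($key);
    if ($cnt < 1) {
        return null;
    }
    $random = (string) rand(1, $cnt);
    $members = $redis->zRangeByScore($key, $random, $random);
    return $members[0] ?? null;
}
```

Note that the score-based lookup only works while the scores stay contiguous. If members can be removed, one alternative would be to pick a random rank and call `zRange($key, $i, $i)`, which does not depend on the score values.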

The following is a comparison of the running time of each method.

Methods 2 and 3, i.e. zcount versus zcard, running time comparison:

Run     Method 2 / zcount (s)   Method 3 / zcard (s)
1st     0.0072240829467773      0.007314920425415
2nd     0.0057311058044434      0.0071389675140381
3rd     0.0065360069274902      0.0071680545806885
4th     0.0047309398651123      0.0075440406799316
5th     0.0058040618896484      0.0068428516387939
6th     0.0068061351776123      0.0073769092559814
7th     0.0070509910583496      0.0070638656616211
8th     0.008112907409668       0.0076460838317871
9th     0.0070209503173828      0.0067050457000732
10th    0.0069761276245117      0.0073142051696777

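As the table shows, zcount fluctuates more than zcard and also produced the slowest single run, so Method 2 is eliminated. This fits the documented complexity: zcard is O(1), while zcount is O(log(N)).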

Methods 1 and 3, i.e. zrange versus zrangebyscore, running time comparison:

Run     Method 1 / zrange (s)   Method 3 / zrangebyscore (s)
1st     0.0076210498809814      0.0040271282196045
2nd     0.0066070556640625      0.0056281089782715
3rd     0.0062861442565918      0.0061671733856201
4th     0.0070350170135498      0.0064809322357178
5th     0.0070219039916992      0.0068569183349609

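As the table shows, Method 3 is somewhat faster than Method 1. What if Method 1 is changed to fetch all values with zrangebyscore and then pick a random element, that is, Method 4? Here is the running time comparison of Method 4 and Method 3: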

Method 4: zrangebyscore fetches all values as an array, then one value is picked at random
Method 3: zrangebyscore fetches a single value by random number

Run     Method 4 (s)            Method 3 (s)
1st     0.0068261623382568      0.0075819492340088
2nd     0.0072751045227051      0.0073590278625488
3rd     0.0055849552154541      0.0072290897369385
4th     0.0048110485076904      0.0075399875640869
5th     0.0073840618133545      0.0075678825378418
6th     0.0072331428527832      0.0072460174560547
7th     0.007411003112793       0.0074880123138428
8th     0.0062360763549805      0.007282018661499
9th     0.0077290534973145      0.0074591636657715
10th    0.0068199634552002      0.0074419975280762

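As the table shows, Method 4 is somewhat faster than Method 3. Next, test both with the ab benchmarking tool:

# Simulate 100 concurrent users sending a total of 100 requests to one resource.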
ab -c 100 -n 100 url

The test results for Method 4 are as follows:

Server Software:        nginx/1.15.11
Server Hostname:        127.0.0.1
Server Port:            80

Document Path:          test1.php
Document Length:        38 bytes

Concurrency Level:      100
Time taken for tests:   0.520 seconds
Complete requests:      100
Failed requests:        0
Non-2xx responses:      100
Total transferred:      23400 bytes
HTML transferred:       3800 bytes
Requests per second:    192.25 [#/sec] (mean)
Time per request:       520.161 [ms] (mean)
Time per request:       5.202 [ms] (mean, across all concurrent requests)
Transfer rate:          43.93 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:       18   25   5.6     26      35
Processing:    41  219  87.1    219     359
Waiting:       41  219  87.4    219     359
Total:         60  245  92.3    246     393

Percentage of the requests served within a certain time (ms)
  50%    246
  66%    296
  75%    326
  80%    340
  90%    372
  95%    392
  98%    392
  99%    393
 100%    393 (longest request)

The test results for Method 3 are as follows:

Server Software:        nginx/1.15.11
Server Hostname:        127.0.0.1
Server Port:            80

Document Path:          /test2.php
Document Length:        38 bytes

Concurrency Level:      100
Time taken for tests:   0.526 seconds
Complete requests:      100
Failed requests:        0
Non-2xx responses:      100
Total transferred:      23400 bytes
HTML transferred:       3800 bytes
Requests per second:    189.97 [#/sec] (mean)
Time per request:       526.390 [ms] (mean)
Time per request:       5.264 [ms] (mean, across all concurrent requests)
Transfer rate:          43.41 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:       16   23   3.8     25      31
Processing:    36  216  89.5    220     372
Waiting:       36  216  89.2    220     372
Total:         54  239  92.9    245     403

Percentage of the requests served within a certain time (ms)
  50%    245
  66%    295
  75%    316
  80%    333
  90%    362
  95%    374
  98%    402
  99%    403
 100%    403 (longest request)

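Judging by Time taken for tests (0.520 s vs 0.526 s), Requests per second (192.25 vs 189.97), and the other figures, Method 4 performs slightly better than Method 3.

In other words, of the two approaches (fetch all elements and then pick one at random, or generate a random number and fetch a single element), the former is somewhat better.

Is that the end of the story? Not quite.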

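The final decision was to drop the sorted set altogether and use a list or a set instead. A sorted set (zset) needs a score for every member: inserting an element, for example, means looking up the current maximum score and adding 1.
Since the requirement is only to pick a random value from a collection of elements, a list or a set is enough.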
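A minimal sketch of what that could look like, again assuming `$redis` is a connected phpredis client; the function names are hypothetical. With a set, SRANDMEMBER picks the random member on the server; with a list, a random index can be combined with LLEN and LINDEX.

```php
<?php
// Sketch only. $redis is assumed to be a connected phpredis client.
// All function names are hypothetical.

// Set: SADD needs no score, unlike ZADD.
function add_to_pool($redis, string $key, string $value): void
{
    $redis->sAdd($key, $value);
}

// Set: SRANDMEMBER returns one random member chosen by Redis itself.
function random_from_set($redis, string $key): ?string
{
    $member = $redis->sRandMember($key);
    return is_string($member) ? $member : null;   // null when the key is empty or missing
}

// List: pick a random index in [0, LLEN - 1] and read it with LINDEX.
function random_from_list($redis, string $key): ?string
{
    $len = $redis->lLen($key);
    if ($len < 1) {
        return null;
    }
    $value = $redis->lIndex($key, rand(0, $len - 1));
    return is_string($value) ? $value : null;
}
```

Either way, only one element travels over the network, and no score has to be maintained on insert.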


Origin www.cnblogs.com/sunshineliulu/p/12399213.html