[Reprint] Pitfalls Meituan hit with Redis, part 3: Redis memory usage soars

Please credit the source when reprinting: http://carlosfu.iteye.com/blog/2254154

I. Symptom:
Memory usage on one redis-cluster shard soared, clearly much higher than on the other shards, and kept growing. In addition, memory usage on the master and on its replica was not consistent.
 
II. Possible causes:
1. A bug in redis-cluster (this should not be the case)
2. A problem with the client's hash(key), resulting in uneven key distribution. (Redis uses CRC16, so uneven distribution should not occur.)
3. An individual large key-value: for example, a collection data structure containing millions of elements (this is possible)
4. A problem with master-replica replication.
5. Other causes
 
III. Investigating the cause:
1. After checking, causes 1-4 above were ruled out.
2. Looking at the info output, one thing aroused suspicion: client_longest_output_list showed abnormal values.
3. This pointed to the interaction between server and clients: the server keeps an input buffer and an output buffer for each client, and if these grow large they also take up Redis server memory.
 
Judging from client_longest_output_list above, the output buffers were taking up a lot of memory, which means a large amount of data was being sent from the Redis server to certain clients.
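As a rough illustration of step 2, the suspicious counter can be pulled out of the info output with a small shell helper. This is only a sketch: info_field is a made-up helper name, and host and port are placeholders as elsewhere in this post. Newer Redis releases may report this counter under a different field name, so the field name is passed in as a parameter.

```shell
# Print the value of one field from INFO output read on stdin.
# info_field is a hypothetical helper name, not a redis-cli feature.
info_field() {
    # INFO lines look like "name:value" and end in CRLF, so strip the \r first.
    tr -d '\r' | awk -F: -v name="$1" '$1 == name { print $2 }'
}

# Usage, assuming a reachable server:
#   redis-cli -h host -p port info clients | info_field client_longest_output_list
```

A value that keeps climbing between samples is the hint that some client is not draining its output buffer.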
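So I used the client list command (similar to MySQL's processlist), redis-cli -h host -p port client list | grep -v "omem=0", to find the client connections whose output buffer was not 0. That turned up the culprit, monitor, and everything suddenly made sense.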
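The grep only filters. A sketch like the following could also rank the offenders by output buffer size. Here busy_output_clients is a hypothetical helper name, and it reads only the id, addr, omem and cmd fields that client list prints.

```shell
# Read CLIENT LIST output on stdin and print "omem id addr cmd" for every
# client whose output buffer is non-empty, largest buffer first.
# busy_output_clients is a hypothetical helper name.
busy_output_clients() {
    tr -d '\r' | awk '
    {
        id = addr = cmd = omem = ""
        # Each client is one line of space-separated key=value pairs.
        for (i = 1; i <= NF; i++) {
            eq = index($i, "=")
            if (eq == 0) continue
            key = substr($i, 1, eq - 1)
            val = substr($i, eq + 1)
            if (key == "id") id = val
            else if (key == "addr") addr = val
            else if (key == "cmd") cmd = val
            else if (key == "omem") omem = val
        }
        if (omem + 0 > 0) print omem, id, addr, cmd
    }' | sort -rn
}

# Usage, assuming a reachable server:
#   redis-cli -h host -p port client list | busy_output_clients
```

In this incident the first line of such a listing would have been the monitor client.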
 
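monitor works like this: it outputs every command executed on the Redis server. A Redis server's QPS is usually very high, so once monitor is running, a large backlog builds up in the server-side output buffer of the monitor client, and that backlog takes up a lot of Redis memory.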
 
 
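IV. Emergency handling and resolution:
Perform a master-replica switch (memory usage on the master and the replica was inconsistent), that is, a redis-cluster failover, and keep watching the new master for anomalies. None were observed.
Once the real cause, monitor, had been found, the process running the monitor command was shut down and memory dropped quickly.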
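If the offending monitor session cannot be closed on the client machine, one alternative (not what was done in this incident) is to disconnect it from the server side. The sketch below extracts the ids of monitor clients from client list output. monitor_client_ids is a hypothetical helper name, and the loop in the comment assumes a Redis version that supports CLIENT KILL ID.

```shell
# Read CLIENT LIST output on stdin and print the id of every client whose
# last command was monitor. monitor_client_ids is a hypothetical helper name.
monitor_client_ids() {
    tr -d '\r' | awk '
    {
        id = ""; is_monitor = 0
        for (i = 1; i <= NF; i++) {
            if ($i ~ /^id=/) id = substr($i, 4)
            if ($i == "cmd=monitor") is_monitor = 1
        }
        if (is_monitor && id != "") print id
    }'
}

# Usage, assuming a reachable server that supports CLIENT KILL ID:
#   redis-cli -h host -p port client list | monitor_client_ids |
#   while read -r id; do redis-cli -h host -p port client kill id "$id"; done
```

Killing the connection frees the output buffer, but whoever started monitor can simply reconnect, so this is a stopgap rather than prevention.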
 
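V. Prevention:
1. Why did someone run monitor in the first place? I can think of two reasons:
(1) An engineer wanted to see exactly which commands were being executed, so they ran monitor.
(2) An engineer was learning Redis. Because Redis is a managed service here, engineers only need to know how to use it, but technical people are naturally curious and eager to learn.
2. Preventive measures:
(1) Train engineers on the pitfalls and taboos of using Redis.
(2) Introduce the Redis cloud platform to engineers, and even let interested colleagues take part.
(3) Limit the client output buffer. The official documentation does not recommend this, though, and the official default configuration places no limit on the output buffer of normal clients.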
    client-output-buffer-limit normal 0 0 0
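(4) Password: Redis's password feature is weak, and it also costs one extra I/O round trip.
(5) Modify the client source code to forbid some dangerous commands (shutdown, flushall, monitor, keys *). Of course, these can still be run through redis-cli.
(6) Add rename-command configuration to rename some dangerous commands (flushall, monitor, keys *, flushdb). Anyone who really needs them can ask the Redis operations staff.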
    rename-command FLUSHALL "random-number"
    rename-command FLUSHDB "random-number"
    rename-command KEYS "random-number"
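One way to produce the random names in item (6) is to generate them rather than invent them by hand. The following is a minimal sketch, assuming /dev/urandom and od are available. random_name and rename_lines are hypothetical helper names, and the output is meant to be reviewed before it is pasted into redis.conf.

```shell
# Print 16 random bytes as 32 hex characters.
# random_name is a hypothetical helper name.
random_name() {
    od -An -N16 -tx1 /dev/urandom | tr -d ' \n'
}

# Print one rename-command line per command name given as an argument.
# rename_lines is a hypothetical helper name.
rename_lines() {
    for cmd in "$@"; do
        printf 'rename-command %s "%s"\n' "$cmd" "$(random_name)"
    done
}

# Usage:
#   rename_lines FLUSHALL FLUSHDB KEYS MONITOR
```

The generated names have to be recorded somewhere safe, since the operations staff will need them to run the renamed commands.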
 
VI. Simulation experiment:
1. Start an empty Redis (the simplest way is to run redis-server directly)
    redis-server
    Initial memory usage:
    # Memory
    used_memory:815072
    used_memory_human:795.97K
    used_memory_rss:7946240
    used_memory_peak:815912
    used_memory_peak_human:796.79K
    used_memory_lua:36864
    mem_fragmentation_ratio:9.75
    mem_allocator:jemalloc-3.6.0
    Client buffers:
    # Clients
    connected_clients:1
    client_longest_output_list:0
    client_biggest_input_buf:0
    blocked_clients:0
 
2. Start a monitor:
    redis-cli -h 127.0.0.1 -p 6379 monitor
3. Run redis-benchmark:
    redis-benchmark -h 127.0.0.1 -p 6379 -c 500 -n 200000
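4. Observe
(1) info memory: memory keeps increasing until the benchmark has finished and monitor has finished its output, but used_memory_peak_human (the historical peak) remains high afterwards (see the log in the attachment).
(2) info clients: client_longest_output_list keeps increasing and only returns to 0 once the benchmark has finished and monitor has finished its output (see the log in the attachment).
(3) redis-cli -h host -p port client list | grep "monitor": omem stays very high and only returns to 0 once the benchmark has finished and monitor has finished its output (see the log in the attachment).
Monitoring script: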
    #!/bin/bash
    # Every 100 ms, print the monitor client line plus client and memory stats.
    while true
    do
        now=$(date "+%Y-%m-%d_%H:%M:%S")
        echo "=========================${now}==============================="
        echo " #Client-Monitor"
        redis-cli -h 127.0.0.1 -p 6379 client list | grep monitor
        redis-cli -h 127.0.0.1 -p 6379 info clients
        redis-cli -h 127.0.0.1 -p 6379 info memory
        # Sleep 100 milliseconds (fractional seconds need GNU or BSD sleep).
        sleep 0.1
    done
Log excerpt (the full log file was an attachment to the original post):
    =========================2015-11-06_10:07:16===============================
     #Client-Monitor
    id=7 addr=127.0.0.1:56358 fd=6 name= age=91 idle=0 flags=O db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 obl=0 oll=4869 omem=133081288 events=rw cmd=monitor
    # Clients
    connected_clients:502
    client_longest_output_list:4869
    client_biggest_input_buf:0
    blocked_clients:0
    # Memory
    used_memory:174411224
    used_memory_human:166.33M
    used_memory_rss:161513472
    used_memory_peak:176974792
    used_memory_peak_human:168.78M
    used_memory_lua:36864
    mem_fragmentation_ratio:0.93
    mem_allocator:jemalloc-3.6.0
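To put the excerpt in perspective, the share of used_memory held by the monitor client's output buffer can be worked out from the omem and used_memory values in the log. The helper name omem_share is hypothetical, and the two numbers are copied from the excerpt.

```shell
# Percentage of used_memory taken by one client's output buffer.
# omem_share is a hypothetical helper name; both arguments are byte counts.
omem_share() {
    awk -v omem="$1" -v used="$2" 'BEGIN { printf "%.1f\n", omem * 100 / used }'
}

# omem and used_memory from the log excerpt above.
omem_share 133081288 174411224   # prints 76.3
```

In other words, roughly three quarters of the memory this otherwise empty instance was using at that moment sat in the output buffer of a single monitor client.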

Origin www.cnblogs.com/jinanxiaolaohu/p/12009719.html