Redis application scenarios and best practices

    As the most popular in-memory NoSQL database, Redis has many application scenarios, and the appropriate deployment, configuration, and usage of Redis differ between them. Based on my work experience, I have organized "best practices" for application scenarios such as queues, caching, computation, and deduplication as follows.

    All the code in this article can be found on github: https://github.com/huyanping/RedisStudy

queue

     The Redis list data structure is often used as a queue. Common commands are lpop/rpop, lpush/rpush, llen, lindex, and so on. Since Redis lists are doubly linked, a list can also be used as a stack. When using a Redis list as a queue, you need to pay attention to the following issues:

  1. Queue data generally has high reliability requirements, so it is best to use AOF persistence to ensure data is not lost.
  2. For the same reason, memory monitoring is particularly important: if the queue backs up and memory runs out, Redis can no longer serve requests.
  3. If the queue has only one consumer, it is recommended to peek at a message with lindex first and only remove it with lpop after it has been processed successfully; this provides transaction-like behavior and avoids losing a message when processing fails.
  4. If multiple consumers are consuming the queue and message processing fails, the message can be pushed back onto the queue, provided there is no ordering requirement for messages.
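    For the multi-consumer case, the RPOPLPUSH command offers a more robust alternative to simply re-pushing failed messages: it atomically moves each message to a backup list, so a crashed consumer never silently drops a message. This is the "reliable queue" pattern from the Redis documentation; the key names below are illustrative:

```
RPOPLPUSH myqueue processing    -- atomically move a message onto a backup list
-- ... process the message ...
LREM processing 1 <message>     -- acknowledge: remove it from the backup list
-- if a consumer crashes, messages left in "processing" can be re-queued
```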

    Producer and consumer throughput can be increased by batching (e.g. multi/pipelining) and by parallelism. Batching reduces network round trips and the overhead of Redis switching between requests. The benefit of parallelism is that while one client is preparing or processing data and Redis is idle, another client can be reading from Redis; this keeps Redis as busy as possible.
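    The batching idea can be sketched with a simple chunking helper (a minimal illustration, independent of any Redis client library); each batch would then be written with one RPUSH of multiple values, or inside one pipeline/MULTI block, instead of one round trip per message:

```python
def chunks(items, size):
    """Split a list of messages into batches of at most `size` elements.

    Each batch can then be sent to Redis in a single round trip
    (e.g. one RPUSH with multiple values) instead of one trip per message.
    """
    for i in range(0, len(items), size):
        yield items[i:i + size]
```

For example, `list(chunks([1, 2, 3, 4, 5], 2))` produces `[[1, 2], [3, 4], [5]]`.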

    If the queue still accumulates after the above optimizations, it is recommended to run multiple Redis instances. Since Redis uses a single-threaded model and cannot utilize multiple CPU cores, running several instances can significantly increase throughput. There are many Redis cluster solutions; a simple one is to shard on the client side with a hash algorithm. The concrete schemes are beyond the scope of this article.
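    Client-side sharding can be as simple as hashing the key to pick an instance. A hypothetical sketch in Python (the instance list and function are illustrative, not part of any particular client library):

```python
import zlib

# Hypothetical list of independent Redis instances (host, port).
INSTANCES = [("127.0.0.1", 6379), ("127.0.0.1", 6380), ("127.0.0.1", 6381)]

def shard_for(key, num_instances=len(INSTANCES)):
    """Deterministically map a queue/key name to an instance index via CRC32."""
    return zlib.crc32(key.encode("utf-8")) % num_instances
```

The same key always maps to the same instance; note that adding or removing instances remaps most keys, which is why consistent hashing is often preferred in practice.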

    The sample code for using a message queue based on Redis list is as follows:


<?php
// Producer

namespace jenner\redis\study\queue;

use Jenner\SimpleFork\Process;
use Jenner\SimpleFork\Queue\RedisQueue;

class Producer extends Process
{
    /**
     * start producer process
     */
    public function run()
    {
        $queue = new RedisQueue('127.0.0.1', 6379, 1);
        for ($i = 0; $i < 100000; $i++) {
            $queue->put(getmypid() . '-' . mt_rand(0, 1000));
        }
        $queue->close();
    }
}


<?php
// Consumer

namespace jenner\redis\study\queue;

use Jenner\SimpleFork\Process;
use Jenner\SimpleFork\Queue\RedisQueue;

class Consumer extends Process
{
    /**
     * start consumer process
     */
    public function run()
    {
        $queue = new RedisQueue('127.0.0.1', 6379, 1);
        while (true) {
            $res = $queue->get();
            if ($res !== false) {
                echo $res . PHP_EOL;
            } else {
                break;
            }
        }
    }
}

cache

The cache we are talking about here is data that may be lost or expire; a cache that must not lose data or expire (which should really be called a database) is beyond the scope of this article. The Redis hash (dictionary) structure is often used for caching. Common commands are set/get, hget/hgetall/hset, and so on. When using Redis as a cache, you need to pay attention to the following issues:

  1. Since the memory available to Redis is limited and unbounded memory growth cannot be tolerated, it is best to set a maximum memory limit with maxmemory.
  2. With maxmemory set, the LRU mechanism can be enabled together with key expiration. When Redis reaches its maximum memory, it automatically evicts keys according to the least-recently-used algorithm. There are 6 LRU strategies; please refer to: http://www.aikaiyuan.com/7089.html
  3. The persistence strategy and the failure recovery time of Redis are a trade-off. If you want to recover as quickly as possible after a failure, enable the dump (RDB) backup mechanism; but because of copy-on-write during the dump's fork, you need to keep at least roughly 1/3 of memory free (an empirical value), so you cannot allocate as much memory to Redis. If you can tolerate a long recovery time, use AOF persistence and turn off the dump mechanism, which removes the need to reserve that 1/3 of memory.
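The points above map onto a few redis.conf directives. A sketch (the values are illustrative, not recommendations):

```
maxmemory 4gb                    # cap Redis memory usage
maxmemory-policy allkeys-lru     # evict least-recently-used keys at the limit
save ""                          # disable RDB dump (if slow recovery is acceptable)
appendonly yes                   # enable AOF persistence instead
```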

A general discussion of cache usage patterns is beyond the scope of this article; please refer to: http://tech.meituan.com/avalanche-study.html

The sample code is too long to include here, so here is just the address: https://github.com/huyanping/RedisStudy/tree/master/src/spider

computation

The atomic increment/decrement commands and sorted sets provided by Redis can handle some computation tasks, such as traffic statistics. Commonly used commands are incr/decr, hincrby, zadd/zcard, and so on.

When using Redis as a computation service, you need to pay attention to the following issues:

  1. Data in computation scenarios generally has high reliability requirements, so enabling AOF persistence is recommended; whether to also enable the dump mechanism should be decided by weighing recovery time against memory utilization.
  2. Redis's single-threaded model means it cannot utilize multiple CPU cores. A Redis cluster solution is recommended here; of course, client-side hash sharding can also solve this.
  3. Send and export data in batches.

deduplication

    Redis sets and the HyperLogLog data structure can deduplicate data using only a small amount of memory, which is well suited to scenarios with large volumes of data to deduplicate. Redis's HyperLogLog needs only 12 KB of memory to count up to 2^64 distinct records. Whether to use a set or HyperLogLog depends on the volume of data: if the data to deduplicate occupies far less than 12 KB in total, a set is sufficient; beyond 12 KB, HyperLogLog is recommended. Common commands are pfadd/pfcount and sadd/scard. Note that pfadd returns a boolean indicating whether the HyperLogLog's internal state changed; you cannot obtain the number of unique records by summing pfadd results — you must call pfcount. This is a consequence of the algorithm, so just remember it; interested readers can dig into how HyperLogLog works.
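    The pfadd/pfcount behavior described above looks like this as an annotated command sequence (the key name is illustrative; pfcount is approximate by design, though exact at small cardinalities):

```
PFADD ip-hll 1.1.1.1 2.2.2.2   -- returns 1: the internal registers changed
PFADD ip-hll 1.1.1.1           -- returns 0: no change, but do not sum these results
PFCOUNT ip-hll                 -- returns the approximate distinct count
```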

Sample code for the two approaches follows:


<?php
// HyperLogLog

namespace jenner\redis\study\unique;

use jenner\redis\study\tool\Logger;

class HyperLogLog
{
    /**
     * @var \Redis
     */
    protected $redis;

    /**
     * @var array
     */
    protected $ips;

    /**
     * default hyperloglog key
     */
    const KEY = "ip-unique-hyperloglog";

    /**
     * HyperLogLog constructor.
     * @param array $ips
     */
    public function __construct(array $ips)
    {
        $this->redis = new \Redis();
        $this->redis->connect("127.0.0.1", 6379);
        $this->redis->select(3);
        $this->ips = $ips;
    }

    /**
     * start to count ips using hyperloglog
     */
    public function start()
    {
        Logger::info("unique process start");
        $this->redis->pfadd(self::KEY, $this->ips);
        Logger::info("unique done. ip count:" . $this->redis->pfcount(self::KEY));
    }
}


41

42

<?php
// Set

namespace jenner\redis\study\unique;

use jenner\redis\study\tool\Logger;

class Set
{
    /**
     * @var \Redis
     */
    protected $redis;

    /**
     * @var array
     */
    protected $ips;

    /**
     * default set key
     */
    const KEY = "ip-unique-normal";

    /**
     * Set constructor.
     * @param array $ips
     */
    public function __construct(array $ips)
    {
        $this->redis = new \Redis();
        $this->redis->connect("127.0.0.1", 6379);
        $this->redis->select(3);
        $this->ips = $ips;
    }

    /**
     * start to count ips using set
     */
    public function start()
    {
        Logger::info("unique process start");
        foreach ($this->ips as $ip) {
            $this->redis->sAdd(self::KEY, $ip);
        }
        Logger::info("unique done. ip count:" . $this->redis->sCard(self::KEY));
    }
}

publish/subscribe

With Redis's publish/subscribe mechanism, if the connection between client and server drops for some reason, any messages published in the meantime are lost; for this reason, the mechanism is rarely used in real production environments.

Sample code:


<?php
// Publisher

namespace jenner\redis\study\pubsub;

class Publisher
{
    /**
     * @var \Redis
     */
    protected $redis;

    /**
     * default pubsub key
     */
    const KEY = "pubsub-demo";

    /**
     * Publisher constructor.
     */
    public function __construct()
    {
        $this->redis = new \Redis();
        $this->redis->connect("127.0.0.1", 6379);
        $this->redis->select(4);
    }

    public function publish()
    {
        $count = 10;
        for ($i = 0; $i < $count; $i++) {
            $this->redis->publish(self::KEY, mt_rand(0, 10000));
        }
    }
}


<?php
// Subscriber

namespace jenner\redis\study\pubsub;

use jenner\redis\study\tool\Logger;

class Subscriber
{
    /**
     * default pubsub key
     */
    const KEY = "pubsub-demo";

    /**
     * @var \Redis
     */
    protected $redis;

    /**
     * Subscriber constructor.
     */
    public function __construct()
    {
        $this->redis = new \Redis();
        $this->redis->connect("127.0.0.1", 6379);
        $this->redis->select(4);
    }

    public function subscribe()
    {
        $this->redis->subscribe(array(self::KEY), function ($redis, $channel, $message) {
            Logger::info("get message[" . $message . "] from channel[" . $channel . "]");
        });
    }
}

Redis Lua applications

Lua scripting is the headline feature of Redis 2.6. By embedding a Lua environment, Redis resolved its long-standing inability to handle CAS (check-and-set) operations efficiently, and by combining multiple commands it became easy to implement patterns that were previously difficult or inefficient to implement.

Advantages: atomicity (since Redis is single-threaded, only one Lua script runs at a time) and smaller request payloads.

Application scenarios: implementing transactions, batch processing.

The following sample code implements a getAndSet command:


<?php
// redis lua

namespace jenner\redis\study\lua;

use jenner\redis\study\tool\Logger;

class Lua
{
    /**
     * @var \Redis
     */
    protected $redis;

    /**
     * Lua constructor.
     */
    public function __construct()
    {
        $this->redis = new \Redis();
        $this->redis->connect("127.0.0.1", 6379);
        $this->redis->select(2);
    }

    /**
     * @param $key
     * @param $value
     * @return bool
     */
    public function set($key, $value)
    {
        return $this->redis->set($key, $value);
    }

    /**
     * @param $key
     * @param $value
     */
    public function getAndSet($key, $value)
    {
        $lua = <<<GLOB_MARK
local value = redis.call('get', KEYS[1])
redis.call('set', KEYS[1], ARGV[1])
return value
GLOB_MARK;
        $result = $this->redis->eval($lua, array($key, $value), 1);
        Logger::info("eval script result:" . var_export($result, true));
    }

    /**
     * @return string
     */
    public function error()
    {
        return $this->redis->getLastError();
    }
}

dump failures

When Redis is using more than twice the operating system's remaining free memory, the dump persistence mechanism can cause the server to crash or appear dead. The reason: during a dump, Redis forks a child process, and under copy-on-write, whenever data in Redis is modified the operating system copies the affected memory pages of the server process for the child; how much is copied depends on how much of the data is modified. If memory runs short at this point, the operating system extends memory with swap and performance drops sharply; if swap is also exhausted, the server may crash or hang.

Solutions: set maxmemory, monitor Redis memory usage (via the Redis INFO command), and enable the LRU eviction mechanism where the scenario allows.

maxmemory failures

Failure description: maxmemory is set, memory is exhausted, and clients can no longer write.

Solutions: monitor Redis memory usage and control it according to the business scenario; if memory is genuinely insufficient, consider introducing a distributed Redis cluster solution.

Redis access vulnerability

The principle of this vulnerability is very simple; only a few commands are needed:


redis> config set dbfilename authorized_keys

redis> config set dir '/root/.ssh'

redis> set xxoo "\n\n\nyour public ssh key"

redis> save

With the commands above, your SSH public key can be written to the target Redis server, which then grants root access. The conditions for exploiting this vulnerability are fairly strict; all of the following must hold:

  1. Redis runs as root, or the user Redis runs as is known
  2. Port 6379 is not blocked by a firewall
  3. Redis has no access password
  4. The config set command has not been disabled

Given these exploitation conditions, the corresponding defenses are:

  1. Prefer binding to 127.0.0.1 (if Redis only serves the local machine), or otherwise to an internal network interface
  2. Restrict access to port 6379 with a firewall
  3. Run Redis as a non-root user
  4. Enable password authentication on Redis (a good habit to develop)
  5. Disable the config set command

In general, the second measure alone will keep you from being hacked, but we should strive to implement at least the fourth as well, for good measure.
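These defenses correspond to a few redis.conf directives; a sketch (the password is a placeholder):

```
bind 127.0.0.1                  # listen only on loopback / an internal interface
requirepass <strong-password>   # enable password authentication
rename-command CONFIG ""        # disable CONFIG entirely, including "config set"
```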

Redis auto-reconnect

RedisRetry is a Redis client wrapper that supports automatic reconnection. Project address: https://github.com/huyanping/RedisRetry

Principle: the __call magic method wraps Redis's native methods; when a RedisException occurs, the connection is automatically closed and re-established, retrying up to n times with m milliseconds between attempts.

Constants:

REDIS_RETRY_TIMES — number of retries

REDIS_RETRY_DELAY — interval between retries, in milliseconds

Usage: wherever the Redis class is used, add 'use \Jenner\RedisRetry\Redis'.
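The same retry-and-reconnect idea can be sketched in Python, with __getattr__ playing the role of PHP's __call. Note that RetryClient and its parameters are a hypothetical illustration of the pattern, not RedisRetry's actual API:

```python
import time

class RetryClient:
    """Sketch of the RedisRetry idea: wrap a client and retry failed calls.

    `retries` and `delay_ms` mirror the roles of REDIS_RETRY_TIMES and
    REDIS_RETRY_DELAY. `factory` is any zero-argument callable that builds
    a fresh client connection.
    """

    def __init__(self, factory, retries=3, delay_ms=100):
        self._factory = factory
        self._client = factory()
        self._retries = retries
        self._delay = delay_ms / 1000.0

    def __getattr__(self, name):
        # Called only for attributes not found on RetryClient itself,
        # i.e. the wrapped client's methods.
        def call(*args, **kwargs):
            last_error = None
            for _ in range(self._retries):
                try:
                    return getattr(self._client, name)(*args, **kwargs)
                except ConnectionError as e:
                    last_error = e
                    time.sleep(self._delay)
                    self._client = self._factory()  # reconnect and retry
            raise last_error
        return call
```

The wrapper is transparent to callers: `RetryClient(factory).get("key")` behaves like `client.get("key")`, except that transient connection errors trigger a reconnect instead of propagating immediately.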
