Wrapping the Redis scan command for fuzzy key queries

    /**
     * Collect every key matching $pattern by calling SCAN until the cursor returns to 0.
     */
    public function redisScan($pattern, $count = 1000){
        $redis = new \myredis\Datasource();
        $myredis = $redis::getRedis('instance1');
        $ret = [];
        $iterator = 0;
        while (true) {
            // Each call resumes from the cursor returned by the previous call.
            $result = $myredis->rawCommand('scan', $iterator, 'match', $pattern, 'count', $count);
            if ($result === false) {
                break;
            }
            $ret = array_merge($ret, $result[1]);
            $iterator = $result[0];
            // A returned cursor of 0 means the iteration is complete.
            if ($iterator == 0) {
                break;
            }
        }

        return $ret;
    }
    public function redisScanTest(){
        $res = $this->redisScan('tp3_home_watch_mobile_invite_list*');
        print_r($res);
    }
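
For comparison, the same cursor loop can be sketched in Python. This is only an illustrative sketch, not the author's code: `scan_all` is a hypothetical helper name, and `client` is assumed to be any object whose `scan(cursor=..., match=..., count=...)` method returns a `(next_cursor, keys)` pair, which is the shape redis-py's `Redis.scan` uses.

```python
def scan_all(client, pattern, count=1000):
    """Collect every key matching `pattern` by following the SCAN cursor."""
    keys = []
    cursor = 0  # cursor 0 starts a new iteration
    while True:
        # Each call resumes from the cursor returned by the previous call.
        cursor, batch = client.scan(cursor=cursor, match=pattern, count=count)
        keys.extend(batch)
        if int(cursor) == 0:  # the server returns 0 once the iteration is complete
            break
    return keys
```

With redis-py, the assumed call would look like `scan_all(redis.Redis(...), 'tp3_home_watch_mobile_invite_list*')`. redis-py also provides `scan_iter`, which wraps the same loop as a generator.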


    /**
     * Pop the tp3_home_watch_mobile_invite_list* key names off the Redis queue.
     */
    public function redisPull(){
        while (true) {
            $key = $this->redis->rpop('tp3_invite_list_key_mult'); // e.g. tp3_home_watch_mobile_invite_list_46085

            if ($key) {
                $this->redis2mysql($key);
            } else {
                // The queue is empty, so everything has been processed.
                echo "deal finish";
                return true;
            }
        }
    }
    /**
     * Push the tp3_home_watch_mobile_invite_list* key names onto a Redis queue.
     */
    public function redisPush(){
        $res = $this->redisScan('tp3_home_watch_mobile_invite_list*');

        // $res = ['tp3_home_watch_mobile_invite_list_71354'];
        foreach ($res as $k => $v) {
            $this->redis->lpush('tp3_invite_list_key_mult', $v);
        }
        echo 'success';
    }
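
The push/pull pair above is a simple work queue: LPUSH on one end, RPOP on the other, so key names are processed in the order they were pushed. A minimal Python sketch of the same pattern follows. The function names `push_keys` and `drain_queue` and the `handle` callback are hypothetical, and `client` is assumed to expose `lpush(name, value)` and `rpop(name)` the way redis-py does, with `rpop` returning `None` on an empty list.

```python
def push_keys(client, queue, keys):
    """LPUSH each key name onto the work queue; return how many were pushed."""
    pushed = 0
    for key in keys:
        client.lpush(queue, key)
        pushed += 1
    return pushed


def drain_queue(client, queue, handle):
    """RPOP key names until the queue is empty, passing each one to `handle`."""
    processed = 0
    while True:
        key = client.rpop(queue)
        if key is None:  # empty list: nothing left to process
            return processed
        handle(key)
        processed += 1
```

Here `handle` plays the role of `redis2mysql` in the PHP code, for example `drain_queue(client, 'tp3_invite_list_key_mult', save_to_mysql)` with a `save_to_mysql` function of your own.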

Redis has a classic problem: with a huge amount of data, how do you find the keys that match a certain pattern? There are two ways.
The first is the keys command, which is simple and crude. Because Redis is single-threaded, keys executes in a blocking manner, and since it is implemented as a full traversal its complexity is O(n). The more keys the Redis database holds, the higher the search cost and the longer the blocking time.
The second is the scan command, which searches for keys in a non-blocking manner. In most cases it can replace the keys command, and it offers more options.

Write 100,000 test entries in key***: value*** format as follows (note: with a pipeline that sends 10,000 commands per batch, each batch completes in seconds)

    #!/usr/bin/env python3
    # -*- coding: utf-8 -*-
    import redis


    def create_testdata():
        r = redis.StrictRedis(host='***.***.***.***', port=***, db=0, password='***')
        counter = 0
        with r.pipeline(transaction=False) as p:
            for i in range(0, 100000):
                p.set('key' + str(i), "value" + str(i))
                counter = counter + 1
                # Flush the pipeline every 10,000 commands.
                if counter == 10000:
                    p.execute()
                    counter = 0
                    print("set by pipeline loop")


    if __name__ == "__main__":
        create_testdata()

For example, which keys here start with key1111?
With the keys command, running keys key1111* returns them all at once.
Similarly, with the scan command you would run scan 0 match key1111* count 20.

The syntax of scan is: SCAN cursor [MATCH pattern] [COUNT count]. The default COUNT value is 10.

The SCAN command is a cursor-based iterator: each call must pass the cursor returned by the previous call as its cursor argument in order to continue the iteration.
Here the query is run with scan 0 match key1111* count 20. Somewhat unexpectedly, the first calls return no results. The reason lies in how the scan command works.
When scan traverses the keys, the cursor 0 starts a new iteration, key1111* is the pattern matching keys that begin with key1111, and the 20 in count 20 is not the number of matching keys to return. It limits (approximately) the number of dictionary slots the server traverses in a single call.

So what is a dictionary slot? Is it the slot of a Redis cluster? The answer is no. In fact, the output above already gives the answer.
If the "dictionary slots" mentioned above were cluster slots, then, since a cluster has 16,384 slots, traversing 16,384 slots would necessarily cover every key. Yet as shown above, after traversing 20,000 dictionary slots the cursor has still not finished the traversal, so a dictionary slot is not the same concept as a cluster slot.
Testing shows that the COUNT value needed to return every matching key in a single call depends on the number of keys stored.
If you scan with a COUNT well above the number of keys, all matching keys come back in one call. For example, with 100,000 keys, traversing 200,000 dictionary slots at once is certain to cover the whole result.
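
The behaviour described above can be imitated with a toy model. The sketch below is not how Redis is implemented: `ToyKeyspace` and `scan_calls` are hypothetical names, the slot count of 131,072 and the use of `zlib.crc32` as a hash are assumptions made for the illustration, and the cursor here advances sequentially, which real Redis cursors do not. It only shows why a small COUNT yields many calls with few or no matches, while a COUNT larger than the slot count finishes in one call.

```python
import fnmatch
import zlib


class ToyKeyspace:
    """A toy hash table: keys are spread over a fixed number of dictionary slots."""

    def __init__(self, keys, slots):
        self.slots = [[] for _ in range(slots)]
        for key in keys:
            # crc32 is only a stand-in for the server's hash function.
            self.slots[zlib.crc32(key.encode()) % slots].append(key)

    def scan(self, cursor, pattern, count):
        """Visit at most `count` slots from `cursor`; return (next_cursor, matches)."""
        end = min(cursor + count, len(self.slots))
        matches = [
            key
            for slot in self.slots[cursor:end]
            for key in slot
            if fnmatch.fnmatchcase(key, pattern)
        ]
        # Cursor 0 signals that the whole table has been visited.
        return (0 if end == len(self.slots) else end), matches


def scan_calls(space, pattern, count):
    """Run a full iteration; return (number of calls, all matching keys)."""
    cursor, calls, found = 0, 0, []
    while True:
        cursor, matches = space.scan(cursor, pattern, count)
        calls += 1
        found.extend(matches)
        if cursor == 0:
            return calls, found


space = ToyKeyspace([f"key{i}" for i in range(100_000)], slots=131_072)
print(scan_calls(space, "key1111*", 20)[0])       # thousands of calls, most returning nothing
print(scan_calls(space, "key1111*", 200_000)[0])  # a single call visits every slot
```

Both runs find the same matching keys. Only the number of round trips differs, which is the trade-off COUNT controls.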
scan is a family of commands. Besides traversing all keys, it can also traverse a specified container:
zscan traverses the elements of a zset (sorted set),
hscan traverses the fields of a hash, and
sscan traverses the elements of a set.
The first argument of the SSCAN, HSCAN, and ZSCAN commands is always a database key (the key of the container to traverse).
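
Because these commands share the same cursor protocol, one loop can serve all three. The following is a minimal sketch with a hypothetical helper name, `scan_container`. It assumes `scan_method` is a bound method such as redis-py's `hscan`, `sscan`, or `zscan`, called as `scan_method(key, cursor=..., match=..., count=...)` and returning `(next_cursor, chunk)`, where the chunk is a dict for hashes and a list otherwise.

```python
def scan_container(scan_method, key, match=None, count=None):
    """Yield every element of one container by following its cursor back to 0."""
    cursor = 0
    while True:
        cursor, chunk = scan_method(key, cursor=cursor, match=match, count=count)
        # HSCAN-style results arrive as a dict; SSCAN/ZSCAN-style results as a list.
        yield from (chunk.items() if isinstance(chunk, dict) else chunk)
        if int(cursor) == 0:
            break
```

An assumed usage would be `list(scan_container(client.hscan, 'myhash', match='field*'))`, where `client` and `'myhash'` are placeholders.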

In addition, when you refresh a database in Redis Desktop Manager, its console shows scan commands being issued automatically, so now you know what it is doing.
