The five basic Redis data structures: an in-depth explanation with source code and examples

I. Introduction to Redis

"Redis is an open source (BSD licensed), in-memory data structure store, used as a database, cache and message broker." (Taken from the official website)

Reposted from the WeChat official account "I Don't Have Three Hearts"

Redis is an open source, advanced key-value store and a suitable solution for building high-performance, scalable web applications. Redis is also called a data structure server, which means that clients can access a set of mutable data structures through commands, using a simple client-server protocol over TCP sockets. (Everything in Redis is stored as key-value pairs; only the data structure behind each value differs.)

Advantages of Redis

The following are some of the advantages of Redis:

  • Exceptionally fast - Redis is very fast: it can perform about 110,000 SET operations per second and about 81,000 GET operations per second.
  • Rich data types - Redis supports most of the data types developers commonly use, such as lists, sets, sorted sets and hashes. This makes it easy to solve a variety of problems, because we know which data type handles which problem best.
  • Atomic operations - All Redis operations are atomic, which ensures that when two clients access the server concurrently, the Redis server receives the updated value.
  • Multi-purpose - Redis is a multi-purpose tool that fits many use cases, such as caching, message queues (Redis natively supports publish/subscribe), and any short-lived data in an application, for example web application sessions and page hit counts.

Redis installation

This step is relatively simple, and you can find many good tutorials online, so it is not repeated here.

An installation tutorial that beginners can use as a reference: https://www.runoob.com/redis/redis-install.html

Local Redis performance test

Once Redis is installed, you can first run redis-server to start it, and then run the command redis-benchmark -n 100000 -q to measure local performance when executing 100,000 requests:

Of course, for various reasons performance differs between machines, so you can treat this test as just for fun.

II. The five basic Redis data structures

Redis has five basic data structures: string, list, hash (dictionary), set, and zset (sorted set). These five are the most basic and most important part of Redis knowledge. Below we explain each of them in turn, combining source code with some hands-on practice.

1) String (string)

A Redis string is a dynamic string, which means it can be modified. Its underlying implementation is somewhat similar to ArrayList in Java: it is backed by a character array. In sds.h/sdshdr in the source you can see the definition of the underlying Redis string type, SDS, i.e. the Simple Dynamic String structure:

/* Note: sdshdr5 is never used, we just access the flags byte directly.
 * However is here to document the layout of type 5 SDS strings. */
struct __attribute__ ((__packed__)) sdshdr5 {
    unsigned char flags; /* 3 lsb of type, and 5 msb of string length */
    char buf[];
};
struct __attribute__ ((__packed__)) sdshdr8 {
    uint8_t len; /* used */
    uint8_t alloc; /* excluding the header and null terminator */
    unsigned char flags; /* 3 lsb of type, 5 unused bits */
    char buf[];
};
struct __attribute__ ((__packed__)) sdshdr16 {
    uint16_t len; /* used */
    uint16_t alloc; /* excluding the header and null terminator */
    unsigned char flags; /* 3 lsb of type, 5 unused bits */
    char buf[];
};
struct __attribute__ ((__packed__)) sdshdr32 {
    uint32_t len; /* used */
    uint32_t alloc; /* excluding the header and null terminator */
    unsigned char flags; /* 3 lsb of type, 5 unused bits */
    char buf[];
};
struct __attribute__ ((__packed__)) sdshdr64 {
    uint64_t len; /* used */
    uint64_t alloc; /* excluding the header and null terminator */
    unsigned char flags; /* 3 lsb of type, 5 unused bits */
    char buf[];
};

You will notice that Redis defines the same structure several times, generics-style, with different field widths. Why not just use int directly?

Because when a string is short, len and alloc can be represented with a byte or a short. To optimize memory to the extreme, Redis uses a different structure for strings of different lengths.
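The saving can be illustrated with a small Python sketch. The thresholds follow the field widths of the structs above (5 bits, then 8, 16, 32 and 64 bits), and the header sizes are computed from the packed layouts. The function name pick_header is hypothetical, and this is an illustration of the idea rather than a copy of the Redis selection code.

```python
# Sketch: pick the smallest header whose length field can hold the string length.
# Header size = two length fields plus one flags byte
# (sdshdr5 consists of the flags byte only).

def pick_header(length):
    """Return (struct name, header size in bytes) for a string of `length` bytes."""
    if length < 1 << 5:        # fits in the 5 spare bits of the flags byte
        return "sdshdr5", 1
    if length < 1 << 8:        # fits in uint8_t
        return "sdshdr8", 1 + 1 + 1
    if length < 1 << 16:       # fits in uint16_t
        return "sdshdr16", 2 + 2 + 1
    if length < 1 << 32:       # fits in uint32_t
        return "sdshdr32", 4 + 4 + 1
    return "sdshdr64", 8 + 8 + 1

for n in (5, 100, 70_000):
    print(n, pick_header(n))
```

Under these assumptions a 100-byte string pays 3 bytes of header, instead of the 17 bytes that two 64-bit length fields would cost.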

Differences between SDS and C strings

Why not use C strings directly? Because this simple string representation in C does not meet the requirements of Redis strings in terms of safety, efficiency and functionality. As we know, C uses a character array of length N + 1 to represent a string of length N, and the last element of the array is always '\0'. (The following figure shows the character array for the C string "Redis")

Such a simple data structure can cause the following problems:

  • Getting the string length is an O(N) operation → because C does not store the length of the array, the whole array has to be traversed every time;
  • Buffer overflows and memory leaks are hard to prevent → for the same reason, a careless concatenation or truncation of the string can easily cause these problems;
  • C strings can only hold text data → because a C string must follow a certain encoding (such as ASCII), a '\0' appearing in the middle is taken as a premature end of the string, and the rest cannot be recognized;
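The first and third problems can be made concrete with a short Python sketch. The c_strlen function imitates how a C string's length is found, and the LenPrefixed class is a hypothetical stand-in for an SDS-like value that stores its length; neither is Redis code.

```python
# Sketch: a NUL-terminated buffer versus a length-prefixed one.

def c_strlen(buf):
    """Length of a C string: scan until the first NUL byte, O(N)."""
    n = 0
    while buf[n] != 0:
        n += 1
    return n

class LenPrefixed:
    """Hypothetical SDS-like value: the length is stored next to the bytes."""
    def __init__(self, data):
        self.len = len(data)         # stored once, read back in O(1)
        self.buf = data + b"\0"      # trailing NUL kept for C compatibility

    def value(self):
        return self.buf[:self.len]   # the stored length decides where the data ends

data = b"Red\0is"                    # binary data with an embedded NUL
print(c_strlen(data + b"\0"))        # the C view stops at the embedded NUL
s = LenPrefixed(data)
print(s.len, s.value())              # the length-prefixed view keeps all 6 bytes
```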

Taking the string append operation as an example, the Redis source is as follows:

/* Append the specified binary-safe string pointed by 't' of 'len' bytes to the
 * end of the specified sds string 's'.
 *
 * After the call, the passed sds string is no longer valid and all the
 * references must be substituted with the new pointer returned by the call. */
sds sdscatlen(sds s, const void *t, size_t len) {
    // Get the current length of the string
    size_t curlen = sdslen(s);

    // Grow if needed: if the capacity cannot hold the appended content, reallocate the byte array and copy the original string into the new array
    s = sdsMakeRoomFor(s, len);
    if (s == NULL) return NULL; // out of memory
    memcpy(s+curlen, t, len); // append the target bytes to the byte array
    sdssetlen(s, curlen+len); // set the length after appending
    s[curlen+len] = '\0'; // end the string with \0, convenient for debugging and printing
    return s;
}
  • Note: Redis limits the length of a string to at most 512 MB.
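The role of len and alloc during an append can be sketched in Python as follows. The DynString class is hypothetical, and the growth policy used here (double the requested size below 1 MB, otherwise add 1 MB) is an assumption about what sdsMakeRoomFor does, not something the excerpt above shows.

```python
MAX_PREALLOC = 1024 * 1024   # assumed preallocation cap of 1 MB

class DynString:
    """Hypothetical model of an SDS value: used length plus allocated capacity."""
    def __init__(self):
        self.buf = bytearray()   # only the first `len` bytes are meaningful
        self.len = 0             # bytes in use
        self.alloc = 0           # bytes allocated

    def _make_room_for(self, addlen):
        if self.alloc - self.len >= addlen:
            return                          # enough free space, nothing to do
        newlen = self.len + addlen
        if newlen < MAX_PREALLOC:
            newlen *= 2                     # small strings: double
        else:
            newlen += MAX_PREALLOC          # large strings: grow by a fixed step
        self.buf.extend(bytes(newlen - self.alloc))
        self.alloc = newlen

    def cat(self, t):
        self._make_room_for(len(t))
        self.buf[self.len:self.len + len(t)] = t   # same role as memcpy(s+curlen, t, len)
        self.len += len(t)
        return self

    def value(self):
        return bytes(self.buf[:self.len])

s = DynString().cat(b"Red").cat(b"is")
print(s.value(), s.len, s.alloc)
```

In this model the second append does not allocate at all, because the first one left spare capacity behind.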

Basic string operations

Once Redis is installed, we can use the redis-cli command-line tool to operate on it. Redis also officially provides an online debugger in which you can type commands: http://try.redis.io/#run

Setting and getting key-value pairs

> SET key value
OK
> GET key
"value"

As you can see, we usually use SET and GET to set and get string values.

The value can be any kind of string (including binary data). For example, you can store a .jpeg image under a key, as long as it does not exceed the 512 MB maximum.

When the key already exists, the SET command overwrites the previously set value:

> SET key newValue
OK
> GET key
"newValue"

You can also use EXISTS and DEL to check whether a key exists and to delete a key-value pair:

> EXISTS key
(integer) 1
> DEL key
(integer) 1
> GET key
(nil)

Setting and getting key-value pairs in batches

> SET key1 value1
OK
> SET key2 value2
OK
> MGET key1 key2 key3    # returns a list
1) "value1"
2) "value2"
3) (nil)
> MSET key1 value1 key2 value2
OK
> MGET key1 key2
1) "value1"
2) "value2"

Expiration and SET command extensions

You can set an expiration time on a key, and the key is deleted automatically when the time is up. This feature is commonly used to control cache expiry. (Expiration works for any data structure.)

> SET key value1
> GET key
"value1"
> EXPIRE key 5    # expires after 5 s
...               # wait 5 s
> GET key
(nil)
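The expiry behaviour shown above can be modelled in a few lines of Python. The MiniStore class and its hand-advanced clock are hypothetical and exist only for this sketch; it assumes expired keys are dropped lazily when they are next touched, and it also models a combined set-with-expiry (setex) and a set-if-absent (setnx) operation.

```python
class MiniStore:
    """Hypothetical in-memory model of SET / EXPIRE / SETEX / SETNX / GET."""
    def __init__(self):
        self.now = 0.0       # fake clock in seconds, advanced by hand
        self.data = {}       # key -> value
        self.expires = {}    # key -> absolute expiry time

    def _purge(self, key):
        # Lazy expiration: drop the key when it is touched after its deadline.
        if key in self.expires and self.now >= self.expires[key]:
            del self.data[key]
            del self.expires[key]

    def set(self, key, value):
        self.data[key] = value
        self.expires.pop(key, None)   # a plain SET clears any old expiry
        return "OK"

    def expire(self, key, seconds):
        self._purge(key)
        if key not in self.data:
            return 0
        self.expires[key] = self.now + seconds
        return 1

    def setex(self, key, seconds, value):
        self.set(key, value)          # SET and EXPIRE in one step
        self.expire(key, seconds)
        return "OK"

    def setnx(self, key, value):
        self._purge(key)
        if key in self.data:
            return 0                  # key exists: nothing is changed
        self.data[key] = value
        return 1

    def get(self, key):
        self._purge(key)
        return self.data.get(key)

r = MiniStore()
r.setex("key", 5, "value1")
r.now += 5                            # "wait" 5 seconds
print(r.get("key"))
print(r.setnx("key", "value1"), r.setnx("key", "other"), r.get("key"))
```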

The SETEX command is equivalent to SET + EXPIRE, and SETNX sets a value only if the key does not exist yet:

> SETEX key 5 value1
...    # wait 5 s, then get
> GET key
(nil)
> SETNX key value1    # the key does not exist, so the SET succeeds
(integer) 1
> SETNX key value1    # the key exists, so the SET fails
(integer) 0
> GET key
"value1"    # unchanged

Counting

If the value is an integer, you can also use the INCR command to increment it atomically, which means that even when multiple clients operate on the same key at the same time, no race condition occurs:

> SET counter 100
OK
> INCR counter
(integer) 101
> INCRBY counter 50
(integer) 151

The GETSET command: returning the original value

Strings also have a rather interesting command, GETSET. It does what its name says: it sets a value for the key and returns the original value:

> SET key value
> GETSET key value1
"value"

This makes it easy to keep statistics that need to be read and reset from time to time. For example, every time a user enters the system you run INCR on a key, and when you need the count you use GETSET to reset the key to 0, which achieves the statistical purpose.
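The counting pattern can be sketched in Python as follows. The Counter class is hypothetical; its lock only stands in for the fact that Redis executes one command at a time, so that an increment or a read-and-reset is never interleaved with another command.

```python
import threading

class Counter:
    """Hypothetical model of one integer key supporting INCR and GETSET."""
    def __init__(self):
        self._value = 0
        self._lock = threading.Lock()   # stands in for one-command-at-a-time execution

    def incr(self):
        with self._lock:
            self._value += 1
            return self._value

    def getset(self, new):
        with self._lock:
            old, self._value = self._value, new
            return old

visits = Counter()

def worker():
    for _ in range(1000):
        visits.incr()                   # one "user entered" event

threads = [threading.Thread(target=worker) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

total = visits.getset(0)                # read the total and reset it in one step
print(total)
```

Because reading and resetting happen in a single step, no increment can slip in between the two and get lost.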

2) List (list)

The Redis list corresponds to LinkedList in Java. Note that it is a linked list rather than an array. This means that inserting into and deleting from a list is very fast, with O(1) time complexity, but locating an element by index is slow, with O(n) time complexity.

We can see its definition in adlist.h/listNode in the source:

/* Node, List, and Iterator are the only data structures used currently. */

typedef struct listNode {
    struct listNode *prev;
    struct listNode *next;
    void *value;
} listNode;

typedef struct listIter {
    listNode *next;
    int direction;
} listIter;

typedef struct list {
    listNode *head;
    listNode *tail;
    void *(*dup)(void *ptr);
    void (*free)(void *ptr);
    int (*match)(void *ptr, void *key);
    unsigned long len;
} list;

As you can see, multiple listNode structures can form a doubly linked list through their prev and next pointers:

Although multiple listNode structures alone can already form a linked list, the list is more convenient to operate on when an adlist.h/list structure holds it:
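The same layout can be sketched in Python to show where the O(1) and O(n) costs come from. The Node and LinkedList classes and their method names are hypothetical counterparts of listNode and list, not a translation of adlist.c.

```python
class Node:
    def __init__(self, value):
        self.prev = None
        self.next = None
        self.value = value

class LinkedList:
    """Hypothetical counterpart of the list struct: head, tail and len."""
    def __init__(self):
        self.head = None
        self.tail = None
        self.len = 0

    def push_head(self, value):        # O(1): only the head pointers change
        node = Node(value)
        node.next = self.head
        if self.head is None:
            self.tail = node
        else:
            self.head.prev = node
        self.head = node
        self.len += 1

    def push_tail(self, value):        # O(1): the tail pointer avoids a walk
        node = Node(value)
        node.prev = self.tail
        if self.tail is None:
            self.head = node
        else:
            self.tail.next = node
        self.tail = node
        self.len += 1

    def index(self, i):                # O(n): must walk node by node
        node = self.head
        for _ in range(i):
            if node is None:
                break
            node = node.next
        return None if node is None else node.value

    def to_list(self):
        out, node = [], self.head
        while node is not None:
            out.append(node.value)
            node = node.next
        return out

lst = LinkedList()
lst.push_tail("A")
lst.push_tail("B")
lst.push_head("first")
print(lst.to_list(), lst.len, lst.index(1))
```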

Basic list operations

  • LPUSH and RPUSH add a new element to the left (head) and to the right (tail) of the list, respectively;
  • LRANGE retrieves a range of elements from the list;
  • LINDEX retrieves the element at the specified index from the list, equivalent to the get(int index) operation on a Java list;

Demonstration:

> RPUSH mylist A
(integer) 1
> RPUSH mylist B
(integer) 2
> LPUSH mylist first
(integer) 3
> LRANGE mylist 0 -1    # -1 means the last element, so this shows everything from the first element to the last
1) "first"
2) "A"
3) "B"

Implementing a queue with a list

A queue is a first-in, first-out (FIFO) data structure, commonly used for message queuing and asynchronous processing logic. It guarantees the order in which elements are accessed:

> RPUSH books python java golang
(integer) 3
> LPOP books
"python"
> LPOP books
"java"
> LPOP books
"golang"
> LPOP books
(nil)

Implementing a stack with a list

A stack is a first-in, last-out data structure, just the opposite of a queue:

> RPUSH books python java golang
> RPOP books
"golang"
> RPOP books
"java"
> RPOP books
"python"
> RPOP books
(nil)
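The two access patterns can be mirrored locally with Python's collections.deque, which also supports pushing and popping at both ends. This is only an analogy for the command pairs used above, not a Redis client.

```python
from collections import deque

books = deque()
books.extend(["python", "java", "golang"])   # like RPUSH books python java golang

# Queue: push on the right, pop from the left (RPUSH + LPOP).
fifo = deque(books)
queue_order = [fifo.popleft() for _ in range(len(fifo))]

# Stack: push on the right, pop from the right (RPUSH + RPOP).
lifo = deque(books)
stack_order = [lifo.pop() for _ in range(len(lifo))]

print(queue_order)
print(stack_order)
```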

3) Hash (dictionary)

The Redis dictionary is equivalent to HashMap in Java, and the internal implementation is nearly the same: both use an "array + linked list" structure and resolve hash collisions by separate chaining, so the structure combines the advantages of two different data structures. The source definition is in dict.h/dictht:

typedef struct dictht {
    // hash table array
    dictEntry **table;
    // hash table size
    unsigned long size;
    // hash table size mask, used to compute the index; always equal to size - 1
    unsigned long sizemask;
    // number of nodes currently in the hash table
    unsigned long used;
} dictht;

typedef struct dict {
    dictType *type;
    void *privdata;
    // two dictht structures inside
    dictht ht[2];
    long rehashidx; /* rehashing not in progress if rehashidx == -1 */
    unsigned long iterators; /* number of iterators currently running */
} dict;

The table attribute is an array. Each element of the array is a pointer to a dict.h/dictEntry structure, and each dictEntry structure holds one key-value pair:

typedef struct dictEntry {
    // key
    void *key;
    // value
    union {
        void *val;
        uint64_t u64;
        int64_t s64;
        double d;
    } v;
    // points to the next hash table node, forming a linked list
    struct dictEntry *next;
} dictEntry;

As the source above shows, the dictionary actually contains two hash tables internally. Normally only one of them holds values, but when the dictionary expands or shrinks, a new hash table must be allocated and the data is then moved over gradually (the reason is explained below).
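A single hash table with separate chaining can be sketched in Python like this. The Entry and HashTable classes are hypothetical counterparts of dictEntry and dictht; integer keys are used in the example because small non-negative integers hash to themselves in Python, which makes the collision easy to see.

```python
class Entry:
    def __init__(self, key, val, nxt=None):
        self.key, self.val, self.next = key, val, nxt

class HashTable:
    """Hypothetical fixed-size table mirroring dictht: table, size, sizemask, used."""
    def __init__(self, size=4):               # size must be a power of two
        self.table = [None] * size
        self.size = size
        self.sizemask = size - 1
        self.used = 0

    def _index(self, key):
        return hash(key) & self.sizemask      # same effect as hash % size

    def set(self, key, val):
        i = self._index(key)
        e = self.table[i]
        while e is not None:                  # walk the chain looking for the key
            if e.key == key:
                e.val = val
                return 0                      # updated an existing entry
            e = e.next
        self.table[i] = Entry(key, val, self.table[i])   # insert at the chain head
        self.used += 1
        return 1                              # added a new entry

    def get(self, key):
        e = self.table[self._index(key)]
        while e is not None:
            if e.key == key:
                return e.val
            e = e.next
        return None

ht = HashTable(4)
ht.set(1, "a")
ht.set(5, "b")       # 1 & 3 == 5 & 3, so both keys land in bucket 1
print(ht.get(1), ht.get(5), ht.used)
```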

Progressive rehash

Expanding a large dictionary is time-consuming: a new array must be allocated, and then all the elements in the linked lists of the old dictionary must be reattached under the new array. This is an O(n) operation, and as a single-threaded server Redis can hardly afford such a time-consuming process, so Redis uses progressive rehash to move the data in small steps:

During a progressive rehash, both the old and the new hash structures are retained, as shown above. Queries look in both hash structures, and in subsequent scheduled tasks and hash operation commands the contents of the old dictionary are gradually migrated to the new dictionary. When the migration is complete, the new hash structure replaces the old one.
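The idea can be sketched in Python as follows. The Dict class is hypothetical: it moves one bucket per operation, starts a rehash when the number of elements reaches the table size, and omits the scheduled background steps, so it illustrates the principle rather than reproducing dict.c.

```python
class Dict:
    """Hypothetical dict with two bucket arrays and step-by-step migration."""
    def __init__(self, size=4):
        self.ht = [[[] for _ in range(size)], None]   # ht[1] exists only while rehashing
        self.rehashidx = -1                            # -1: no rehash in progress
        self.used = 0

    def _bucket(self, table, key):
        return table[hash(key) & (len(table) - 1)]

    def _rehash_step(self):
        if self.rehashidx == -1:
            return
        old, new = self.ht
        for pair in old[self.rehashidx]:               # move one whole bucket
            self._bucket(new, pair[0]).append(pair)
        old[self.rehashidx] = []
        self.rehashidx += 1
        if self.rehashidx == len(old):                 # migration finished
            self.ht = [new, None]
            self.rehashidx = -1

    def _find(self, key):
        for table in self.ht:                          # look in both tables
            if table is None:
                continue
            for pair in self._bucket(table, key):
                if pair[0] == key:
                    return pair
        return None

    def set(self, key, val):
        self._rehash_step()
        pair = self._find(key)
        if pair is not None:
            pair[1] = val
            return
        if self.rehashidx == -1 and self.used >= len(self.ht[0]):
            self.ht[1] = [[] for _ in range(len(self.ht[0]) * 2)]
            self.rehashidx = 0                         # start rehashing, move nothing yet
        target = self.ht[1] if self.rehashidx != -1 else self.ht[0]
        self._bucket(target, key).append([key, val])   # new keys go to the new table
        self.used += 1

    def get(self, key):
        self._rehash_step()
        pair = self._find(key)
        return None if pair is None else pair[1]

d = Dict(4)
for i in range(5):
    d.set(i, i * 10)
print(d.rehashidx)                     # a rehash has started
print([d.get(i) for i in range(5)])    # every key stays readable during migration
print(d.rehashidx, len(d.ht[0]))
```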

Conditions for expanding and shrinking

Normally, expansion begins when the number of elements in the hash table equals the length of the first-dimension array, and the new array is twice the size of the original. However, while Redis is doing a bgsave (a persistence command), it tries not to expand, in order to avoid excessive separation of memory pages (copy-on-write). But if the hash table is already very full, with the number of elements reaching 5 times the length of the first-dimension array, expansion is forced.

When elements are deleted and the hash table becomes increasingly sparse, Redis shrinks the hash table to reduce the space occupied by the first-dimension array. The condition is that the number of elements is less than 10% of the array length. Shrinking does not consider whether Redis is doing a bgsave.
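These rules can be written down as two small predicates. The function names are hypothetical, and the exact comparison operators at the thresholds are assumptions taken from the wording above rather than from the Redis source.

```python
def needs_expand(used, size, bgsave_running):
    """Expand at load factor 1, or at load factor 5 even while a bgsave is running."""
    if used >= size * 5:
        return True                          # forced expansion
    return used >= size and not bgsave_running

def needs_shrink(used, size):
    """Shrink when fewer than 10% of the slots are in use."""
    return used * 10 < size

print(needs_expand(4, 4, bgsave_running=False))
print(needs_expand(4, 4, bgsave_running=True))
print(needs_expand(20, 4, bgsave_running=True))
print(needs_shrink(1, 16))
```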

Basic dictionary operations

Hashes also have a disadvantage: a hash structure consumes more storage than a single string, so whether to use a hash or a string should be weighed against actual needs:

> HSET books java "think in java"    # strings on the command line that contain spaces must be wrapped in quotation marks
(integer) 1
> HSET books python "python cookbook"
(integer) 1
> HGETALL books    # keys and values alternate
1) "java"
2) "think in java"
3) "python"
4) "python cookbook"
> HGET books java
"think in java"
> HSET books java "head first java"
(integer) 0    # returns 0 because this is an update
> HMSET books java "effective java" python "learning python"    # batch operation
OK

4) Set (set)

The Redis set is equivalent to HashSet in Java: its internal key-value pairs are unordered and unique. Its internal implementation is equivalent to a special dictionary in which every value is NULL.
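The "dictionary whose values are all NULL" idea can be sketched in Python with a dict whose values are all None. The MiniSet class is hypothetical, and its method names merely echo the Redis commands.

```python
class MiniSet:
    """Hypothetical set built on a dict whose values are all None."""
    def __init__(self):
        self._d = {}

    def sadd(self, *members):
        added = 0
        for m in members:
            if m not in self._d:
                self._d[m] = None      # only the key matters
                added += 1
        return added                   # number of members that were new

    def sismember(self, member):
        return 1 if member in self._d else 0

    def scard(self):
        return len(self._d)

books = MiniSet()
print(books.sadd("java"))              # new member
print(books.sadd("java"))              # duplicate, nothing added
print(books.sadd("python", "golang"))
print(books.sismember("java"), books.scard())
```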

Basic use of sets

Since the structure is relatively simple, let's look directly at how it is used:

> SADD books java
(integer) 1
> SADD books java    # duplicate
(integer) 0
> SADD books python golang
(integer) 2
> SMEMBERS books    # note the order: sets are unordered
1) "java"
2) "python"
3) "golang"
> SISMEMBER books java    # checks whether a value exists, equivalent to contains
(integer) 1
> SCARD books    # gets the size
(integer) 3
> SPOP books    # pops one element
"java"

5) Sorted set (zset)

This is probably the most distinctive Redis data structure. It is similar to a combination of SortedSet and HashMap in Java: on the one hand it is a set, which guarantees the uniqueness of its values; on the other hand it can assign each value a score, which represents its sort weight.

Its internal implementation uses a data structure called a "skip list". Because it is rather complex, only the principle is briefly mentioned here:

Imagine you are the boss of a startup. At the beginning there are only a few people, and everyone is on an equal footing. Later, as the company grows and the headcount increases, the cost of team communication gradually rises, so a team-leader system is introduced and the staff is divided into teams. Some people are then both employees and team leaders.

Later still, the company grows further and needs another level of hierarchy: departments. Each department then elects a department head from among its team leaders.

A skip list is similar to this mechanism. The bottom layer strings together all the elements (all the employees). Then, every few elements, a representative is picked, and these representatives are strung together with another level of pointers. Then second-level representatives are picked from among these representatives and strung together again. Eventually a pyramid structure is formed.

Think about your current geographic location: Asia > China > a province > a city > ...; it is exactly this kind of structure!
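The pyramid can be made concrete with a minimal skip list in Python. The class names, the level cap of 8 and the promotion probability of 0.25 are assumptions chosen for this sketch; it orders nodes by score only and leaves out what a real zset needs on top of that, such as ranks and tie-breaking between equal scores.

```python
import random

MAX_LEVEL = 8    # assumed cap for this sketch
P = 0.25         # assumed chance of promoting a node one level up

class SkipNode:
    def __init__(self, score, member, level):
        self.score = score
        self.member = member
        self.forward = [None] * level     # forward[i] is the next node on level i

class SkipList:
    def __init__(self, seed=0):
        self.head = SkipNode(float("-inf"), None, MAX_LEVEL)
        self.level = 1                    # highest level currently in use
        self.rng = random.Random(seed)    # seeded so runs are repeatable

    def _random_level(self):
        level = 1
        while level < MAX_LEVEL and self.rng.random() < P:
            level += 1
        return level

    def insert(self, score, member):
        update = [self.head] * MAX_LEVEL
        node = self.head
        for i in range(self.level - 1, -1, -1):        # from the top level down
            while node.forward[i] is not None and node.forward[i].score < score:
                node = node.forward[i]
            update[i] = node                            # last node before the slot
        level = self._random_level()
        self.level = max(self.level, level)
        new = SkipNode(score, member, level)
        for i in range(level):                          # splice into each level
            new.forward[i] = update[i].forward[i]
            update[i].forward[i] = new

    def find(self, score):
        node = self.head
        for i in range(self.level - 1, -1, -1):         # higher levels skip more nodes
            while node.forward[i] is not None and node.forward[i].score < score:
                node = node.forward[i]
        node = node.forward[0]
        return node.member if node is not None and node.score == score else None

    def members(self):
        out, node = [], self.head.forward[0]
        while node is not None:                         # level 0 links every element
            out.append(node.member)
            node = node.forward[0]
        return out

sl = SkipList()
for score, name in [(9.0, "think in java"), (8.6, "java cookbook"), (8.9, "java concurrency")]:
    sl.insert(score, name)
print(sl.members())
print(sl.find(8.9))
```

Level 0 plays the role of "all the employees", and each higher level is a sparser chain of representatives that lets a search skip over many nodes at once.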

Basic operations on the sorted set (zset)

> ZADD books 9.0 "think in java"
> ZADD books 8.9 "java concurrency"
> ZADD books 8.6 "java cookbook"
> ZRANGE books 0 -1    # lists members sorted by score; the arguments are a rank range
1) "java cookbook"
2) "java concurrency"
3) "think in java"
> ZREVRANGE books 0 -1    # lists members in reverse score order; the arguments are a rank range
1) "think in java"
2) "java concurrency"
3) "java cookbook"
> ZCARD books    # equivalent to count()
(integer) 3
> ZSCORE books "java concurrency"    # gets the score of the specified value
"8.9000000000000004"    # scores are stored internally as double, so there is a floating-point precision issue
> ZRANK books "java concurrency"    # rank
(integer) 1
> ZRANGEBYSCORE books 0 8.91    # traverses the zset by score range
1) "java cookbook"
2) "java concurrency"
> ZRANGEBYSCORE books -inf 8.91 withscores    # traverses the zset by the score range (-∞, 8.91] and also returns the scores; inf stands for infinite
1) "java cookbook"
2) "8.5999999999999996"
3) "java concurrency"
4) "8.9000000000000004"
> ZREM books "java concurrency"    # deletes a value
(integer) 1
> ZRANGE books 0 -1
1) "java cookbook"
2) "think in java"
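The odd-looking scores in the output above are a property of double-precision numbers and can be reproduced without Redis. The sketch below assumes the score is printed with 17 significant digits, which is what the output suggests; the scores dictionary simply reuses the example data.

```python
scores = {"think in java": 9.0, "java concurrency": 8.9, "java cookbook": 8.6}

# Ascending order by score, the order in which ZRANGE lists the members.
by_score = sorted(scores, key=scores.get)
print(by_score)

# Printing a double with 17 significant digits exposes the rounding error.
print("%.17g" % scores["java concurrency"])
print("%.17g" % scores["java cookbook"])
```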

Origin blog.csdn.net/kxkxyzyz/article/details/104762012