[Redis] An in-depth look at Redis data types: the hash type


Preface

Caching is one of the key strategies for improving performance and reducing database load when building and optimizing applications. Redis (Remote Dictionary Server) is a high-performance in-memory database widely used for caching and fast data access. Among its data types, the hash (Hash) is a powerful structure that is commonly used to store objects, mappings, and other field-value data.

In this article, we take a deep dive into the Redis hash type. We start with the basic hash commands, then cover their usage, the internal encodings, and practical application scenarios. Understanding the hash type helps us use Redis more effectively to optimize data storage and access and to improve application performance.

Next, let's take a deeper look at Redis hash types.

1. Commands related to hash type

1.1 HSET and HSETNX

  1. HSET
  • Function : Set the value of a field in the specified hash. If the field exists it is updated, otherwise it is created. Multiple field-value pairs can be set in one call (since Redis 4.0.0). The return value is the number of fields that were newly created.

  • Syntax :

    HSET key field value [field value ... ]

  2. HSETNX
  • Function : Set the value of a field in the specified hash only if the field does not exist. Returns 1 if the field was set, otherwise returns 0.
  • Syntax :
    HSETNX key field value

  3. Usage example
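
The return values described above can be illustrated with a small sketch. This is a toy in-memory Python model, not a Redis client: the `store` dictionary and the `hset` / `hsetnx` functions are hypothetical stand-ins that only mimic the documented behavior.

```python
# Toy in-memory model of HSET and HSETNX return values; not a Redis client.
store = {}  # key -> {field: value}


def hset(key, *pairs):
    """Set field-value pairs; return how many fields were newly created."""
    if not pairs or len(pairs) % 2 != 0:
        raise ValueError("expected one or more field/value pairs")
    h = store.setdefault(key, {})
    created = 0
    for field, value in zip(pairs[0::2], pairs[1::2]):
        if field not in h:
            created += 1
        h[field] = value
    return created


def hsetnx(key, field, value):
    """Set the field only if it is absent; return 1 if set, otherwise 0."""
    h = store.setdefault(key, {})
    if field in h:
        return 0
    h[field] = value
    return 1


print(hset("user:1", "name", "James", "age", "23"))  # 2: both fields are new
print(hset("user:1", "age", "24"))                   # 0: existing field updated
print(hsetnx("user:1", "city", "Beijing"))           # 1: field did not exist
print(hsetnx("user:1", "city", "Shanghai"))          # 0: value stays "Beijing"
```

Note that an update of an existing field is still a successful HSET; the 0 only says that no new field was created.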

1.2 HGET and HMGET

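  1. HGET
  • Function : Get the value of a field in the specified hash. Returns nil if the field or the key does not exist.
  • Syntax :
    HGET key field

  2. HMGET
  • Function : Get the values of multiple fields in the specified hash. The values are returned in the order the fields were requested, with nil for each field that does not exist.
  • Syntax :
    HMGET key field1 [field2 ...]

  3. Usage example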
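
As a rough illustration of how missing fields are reported, here is a toy Python model (hypothetical `store`, `hget`, and `hmget`, with `None` standing in for the nil reply). It is a sketch of the semantics, not a Redis client.

```python
# Toy in-memory model of HGET and HMGET; None stands in for Redis's nil reply.
store = {"user:1": {"name": "James", "age": "23"}}


def hget(key, field):
    """Return the field's value, or None if the key or field is missing."""
    return store.get(key, {}).get(field)


def hmget(key, *fields):
    """Return one value per requested field, in request order."""
    h = store.get(key, {})
    return [h.get(f) for f in fields]


print(hget("user:1", "name"))                  # James
print(hget("user:1", "city"))                  # None
print(hmget("user:1", "age", "city", "name"))  # ['23', None, 'James']
```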

1.3 HKEYS, HVALS and HGETALL

  1. HKEYS
  • Function : Get the names of all fields in the specified hash.
  • Syntax :
    HKEYS key

  2. HVALS
  • Function : Get the values of all fields in the specified hash.
  • Syntax :
    HVALS key

  3. HGETALL
  • Function : Get all fields and their corresponding values in the specified hash.
  • Syntax :
    HGETALL key

  4. Usage example
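
The three commands differ only in which part of the hash they return. The sketch below is a toy Python model with hypothetical `hkeys`, `hvals`, and `hgetall` functions. It returns fields in Python's insertion order, which should not be read as a guarantee about the order Redis uses.

```python
# Toy in-memory model of HKEYS, HVALS and HGETALL; not a Redis client.
store = {"user:1": {"name": "James", "age": "23", "city": "Beijing"}}


def hkeys(key):
    """Return all field names; an empty list if the key does not exist."""
    return list(store.get(key, {}).keys())


def hvals(key):
    """Return all values; an empty list if the key does not exist."""
    return list(store.get(key, {}).values())


def hgetall(key):
    """Return a copy of the whole field-value mapping."""
    return dict(store.get(key, {}))


print(hkeys("user:1"))    # ['name', 'age', 'city']
print(hvals("user:1"))    # ['James', '23', 'Beijing']
print(hgetall("user:1"))  # {'name': 'James', 'age': '23', 'city': 'Beijing'}
print(hgetall("user:2"))  # {}
```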

1.4 HEXISTS and HDEL

  1. HEXISTS
  • Function : Check whether a field exists in the specified hash.
  • Syntax :
    HEXISTS key field

  2. HDEL
  • Function : Delete one or more fields in the specified hash.
  • Syntax :
    HDEL key field1 [field2 ...]

  3. Usage example
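
A minimal sketch of these two commands, again as a toy Python model with hypothetical function names. It assumes the usual Redis behavior that HDEL reports how many fields it actually removed and that a hash with no fields left stops existing as a key.

```python
# Toy in-memory model of HEXISTS and HDEL; not a Redis client.
store = {"user:1": {"name": "James", "age": "23", "city": "Beijing"}}


def hexists(key, field):
    """Return 1 if the field exists in the hash, otherwise 0."""
    return 1 if field in store.get(key, {}) else 0


def hdel(key, *fields):
    """Delete the given fields; return how many were actually removed."""
    h = store.get(key)
    if h is None:
        return 0
    removed = 0
    for field in fields:
        if field in h:
            del h[field]
            removed += 1
    if not h:            # an empty hash is removed together with its key
        del store[key]
    return removed


print(hexists("user:1", "age"))          # 1
print(hdel("user:1", "age", "missing"))  # 1: only "age" existed
print(hexists("user:1", "age"))          # 0
print(hdel("user:1", "name", "city"))    # 2: the hash is now empty
print("user:1" in store)                 # False
```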

1.5 HLEN

  1. HLEN
  • Function : Get the number of fields in the specified hash table (that is, the size of the hash table).
  • Syntax :
    HLEN key
    
  2. Usage example
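
A short sketch of the behavior, using a toy Python model (the `hlen` function here is a hypothetical stand-in, not a Redis client call):

```python
# Toy in-memory model of HLEN; not a Redis client.
store = {"user:1": {"name": "James", "age": "23", "city": "Beijing"}}


def hlen(key):
    """Return the number of fields in the hash, or 0 if the key does not exist."""
    return len(store.get(key, {}))


print(hlen("user:1"))  # 3
print(hlen("user:2"))  # 0
```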

1.6 HINCRBY and HINCRBYFLOAT

  1. HINCRBY
  • Function : Increase the value of the specified field in the hash by an integer.
  • Syntax :
    HINCRBY key field increment

  2. HINCRBYFLOAT
  • Function : Increase the value of the specified field in the hash by a floating point number.
  • Syntax :
    HINCRBYFLOAT key field increment

  3. Usage example
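
The sketch below models the increment commands in plain Python. It assumes the usual Redis behavior that a missing field is treated as 0 and that a value which cannot be parsed as a number causes an error. The function names, the `ValueError`, and the string formatting of the float result are choices of this model, not the Redis implementation.

```python
# Toy in-memory model of HINCRBY and HINCRBYFLOAT; not a Redis client.
store = {"page:home": {"views": "10", "title": "Home"}}


def hincrby(key, field, increment):
    """Add an integer to the field (missing field counts as 0); return the new value."""
    h = store.setdefault(key, {})
    try:
        current = int(h.get(field, "0"))
    except ValueError:
        raise ValueError("hash value is not an integer") from None
    h[field] = str(current + increment)
    return current + increment


def hincrbyfloat(key, field, increment):
    """Add a float to the field (missing field counts as 0); return the new value as a string."""
    h = store.setdefault(key, {})
    try:
        current = float(h.get(field, "0"))
    except ValueError:
        raise ValueError("hash value is not a float") from None
    h[field] = format(current + increment, ".17g")
    return h[field]


print(hincrby("page:home", "views", 5))        # 15
print(hincrby("page:home", "likes", -2))       # -2: the field started at 0
print(hincrbyfloat("page:home", "score", 1.25))  # 1.25
print(hincrbyfloat("page:home", "score", 10.5))  # 11.75
```

A negative increment is how a decrement is expressed, since there is no separate decrement command for hash fields.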

1.7 Summary of hash related commands

The following is a summary of commands related to the hash type, including commands, functions and time complexity:

| Command | Function | Time complexity |
| --- | --- | --- |
| HSET | Set the value of a field in the hash; update it if it exists, otherwise create it. Multiple field-value pairs can be set at the same time. | O(1) per field-value pair, so O(N) when N pairs are set |
| HSETNX | Set the value of a field in the hash only if the field does not exist; return 1 on success, 0 otherwise. | O(1) |
| HGET | Get the value of a field in the specified hash. | O(1) |
| HMGET | Get the values of multiple fields in the specified hash. | O(N), N is the number of requested fields |
| HKEYS | Get the names of all fields in the specified hash. | O(N), N is the number of fields in the hash |
| HVALS | Get the values of all fields in the specified hash. | O(N), N is the number of fields in the hash |
| HGETALL | Get all fields and corresponding values in the specified hash. | O(N), N is the number of fields in the hash |
| HEXISTS | Check whether a field exists in the specified hash. | O(1) |
| HDEL | Delete one or more fields in the specified hash. | O(N), N is the number of deleted fields |
| HLEN | Get the number of fields in the specified hash (that is, the size of the hash). | O(1) |
| HINCRBY | Increase the value of the specified field in the hash by an integer. | O(1) |
| HINCRBYFLOAT | Increase the value of the specified field in the hash by a floating point number. | O(1) |

2. Internal encoding of hash type

In Redis, the hash data type has two internal encodings: ziplist (compressed list) and hashtable (hash table). Which one is used depends on the number of fields in the hash and on the size of its field names and values. (In Redis 7.0 and later, listpack replaces ziplist in the same role, and OBJECT ENCODING reports "listpack" instead.)

1. ziplist (compressed list):

ziplist is a compact data structure that Redis uses internally to encode small hashes. Its key characteristics are:

  • When a hash has few elements and all of its field names and values stay within configured size limits, Redis uses ziplist as the internal implementation.
  • With the default configuration, ziplist is used as long as the hash has no more than 512 elements (hash-max-ziplist-entries) and every field name and value is no longer than 64 bytes (hash-max-ziplist-value).
  • ziplist stores the elements of a hash contiguously in memory, so it uses less memory than hashtable. The trade-off is that a lookup has to scan the list, which is why it is only used for small hashes.

2. hashtable:

hashtable is the internal encoding Redis uses for larger hashes. Its key characteristics are:

  • When the number of elements exceeds the configured ziplist limit, or any field name or value is longer than 64 bytes, Redis switches the internal encoding to hashtable.
  • hashtable offers O(1) average read and write time complexity, which suits large hashes.
  • hashtable uses more memory than ziplist, but it keeps reads and writes fast when a hash is large or holds large values, where ziplist would become slow.

Based on the above description, here are some examples demonstrating the internal encoding of hash data types and under what conditions encoding conversion occurs:

Example 1: Using ziplist encoding

> hmset hashkey f1 v1 f2 v2
OK
> object encoding hashkey
"ziplist"

In this example, Redis uses ziplist as the internal encoding because the hash has few fields and all field names and values are short.

Example 2: A value longer than 64 bytes switches the encoding to hashtable

> hset hashkey f3 "one string is bigger than 64 bytes ..."
(integer) 1
> object encoding hashkey
"hashtable"

In this example, because one of the fields corresponds to a value larger than 64 bytes, Redis switches the internal encoding to hashtable.

Example 3: More than 512 fields switches the encoding to hashtable

> hmset hashkey f1 v1 f2 v2 f3 v3 ... (more than 512 fields) ...
OK
> object encoding hashkey
"hashtable"

In this example, since the number of fields exceeds 512, Redis converts the internal encoding to a hashtable.

These internal encodings balance memory usage and performance: ziplist saves memory for small hashes, and hashtable keeps operations fast for large ones. Redis performs the conversion automatically, based on the configured thresholds and the data being written, so no application code is needed to trigger it.
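
The selection rule can be summarized in a few lines of Python. This is only an illustrative model of the thresholds described in this section (512 entries, 64 bytes), not Redis source code. The function name `predicted_encoding` is hypothetical, and the model looks at a snapshot of the hash, whereas Redis applies the checks as data is written.

```python
# Illustrative model of the encoding rule described above; not Redis source code.
MAX_ENTRIES = 512      # default of hash-max-ziplist-entries
MAX_VALUE_BYTES = 64   # default of hash-max-ziplist-value


def predicted_encoding(fields):
    """Return the encoding a hash with these fields would be expected to use."""
    if len(fields) > MAX_ENTRIES:
        return "hashtable"
    for name, value in fields.items():
        # Both the field name and the value are measured in bytes.
        if len(name.encode()) > MAX_VALUE_BYTES or len(value.encode()) > MAX_VALUE_BYTES:
            return "hashtable"
    return "ziplist"


small = {"f1": "v1", "f2": "v2"}
long_value = {"f1": "v1", "f3": "x" * 65}
many_fields = {f"f{i}": "v" for i in range(513)}

print(predicted_encoding(small))        # ziplist
print(predicted_encoding(long_value))   # hashtable
print(predicted_encoding(many_fields))  # hashtable
```

The three sample hashes correspond to Examples 1 to 3 above.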

3. Application scenarios of hash type

In this section, we'll explore practical uses of hash types in applications. First, let's review the structure of user information stored in a relational database.

[Figure: user information stored in a relational data table]

The figure above shows two user records stored in a relational table. Each user attribute is a column of the table, and each user is a row. To map these two records into Redis, we can use the hash type.

Map user information using the hash type:

[Figure: user information represented as field-value mappings]

Compared with caching user information as a JSON string, the hash type is more intuitive and more flexible to update. We can use each user's ID as the suffix of the key and store each user attribute as a field-value pair, as in the following pseudocode:

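UserInfo getUserInfo(long uid) {
    // Build the Redis key from the uid
    String key = "user:" + uid;

    // Try to read the cached hash from Redis
    userInfoMap = Redis executes command: hgetall key;

    // Cache hit (hgetall returns an empty result when the key does not exist)
    if (userInfoMap is not empty) {
        // Rebuild the object from the field-value mapping
        UserInfo userInfo = buildObjectFromMap(userInfoMap);
        return userInfo;
    }

    // Cache miss: load the user information from the database by uid
    UserInfo userInfo = MySQL executes SQL: select * from user_info where uid = <uid>;

    // No row in the table for this uid
    if (userInfo == null) {
        respond with 404;
        return null;
    }

    // Save the user information to the cache as a hash
    Redis executes command: hmset key name userInfo.name age userInfo.age city userInfo.city;

    // Set an expiration time of 1 hour (3600 seconds) so that stale data does not linger
    Redis executes command: expire key 3600;

    // Return the user information
    return userInfo;
}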

The above code demonstrates a common caching strategy, first trying to get the data from the Redis cache, retrieving it from the database if it misses, and storing the result in Redis for subsequent access. This strategy can improve access performance and reduce the burden on the database.

However, there are two main differences between the hash type and a relational database:

  • Sparsity: in a hash, each key can have a different set of fields. In a relational table, adding a new column means every row has a value for it, even if that value is NULL.

  • Complex queries: relational databases support complex queries such as joins and aggregations. Redis is not suited to simulating them, and the cost of maintaining such logic on top of Redis is high.

Example of sparsity in a relational database:

[Figure: relational database sparsity]

With the hash type, user information can be mapped and stored in an intuitive way. Hashes are simple and flexible to use, are especially suitable for reading and updating individual attributes, and are memory-efficient as long as they stay small.

4. Comparison of native, serialized and hash type caching methods

When it comes to caching user information, there are a number of different caching methods to choose from. The following is a detailed comparison and analysis of three common caching methods: native string types, serialized string types (such as JSON format), and hash types. Here we explore their implementation methods, advantages, and disadvantages to help you better choose the caching strategy that works for your application.

4.1 Native string type

Implementation method: Use the native string type to store each user attribute as a separate key-value pair, for example:

set user:1:name James
set user:1:age 23
set user:1:city Beijing

Advantages:

  • Simple to implement: each attribute is stored under a separate key, which is easy to understand and maintain.
  • Very flexible for changes to individual attributes.

Disadvantages:

  • Uses many keys per user, and the per-key overhead leads to high memory usage.
  • User information is scattered across Redis, which lacks cohesion and makes batch operations and management inconvenient.
  • Not suitable when the complete user information needs to be fetched at once, because that requires reading many keys.

Applicable scenarios: This method is suitable for scenarios where individual and frequent changes to user attributes are required, but it is not suitable for scenarios where complete user information needs to be obtained at one time.

4.2 Serialized string type (such as JSON format)

Implementation method: Use the serialized string type to serialize user information in JSON format and store it as a key-value pair, for example:

set user:1 '{"name": "James", "age": 23, "city": "Beijing"}'

Advantages:

  • Suitable when the information is always read and written as a whole, and the programming is relatively simple.
  • Can use memory efficiently and is particularly suitable for storing large objects or data structures.

Disadvantages:

  • Serialization and deserialization add some overhead.
  • Not suitable for frequent updates or queries of individual attributes, because the whole value has to be read and rewritten.

Applicable scenarios: This method is suitable for scenarios where complete user information needs to be obtained at one time, especially when the user information is a complex object or data structure.

4.3 Hash type

Implementation method: Store each user as a single Redis hash, with one field per attribute, for example:

hmset user:1 name James age 23 city Beijing

Advantages:

  • Simple, intuitive and flexible.
  • Suitable for partial updates and reads, since single attributes can be read and written directly.
  • The internal encoding can be ziplist or hashtable, which gives good memory efficiency for small hashes.

Disadvantages:

  • The conversion between the ziplist and hashtable encodings needs to be kept under control: once a hash is converted to hashtable, it consumes more memory.
  • Fetching the complete user information requires HGETALL, which is O(N) in the number of fields, and the result still has to be rebuilt into an object. For large hashes this can be slow.

Applicable scenarios: The hash type is suitable for scenarios where local changes to user attributes or frequent individual attribute operations are required. It is also suitable for situations where flexible querying of attributes is required.

4.4 Summary

Choosing the appropriate caching method should be determined based on specific application needs and access patterns. Usually, these three caching methods can be comprehensively considered based on different data characteristics and operational requirements, and appropriately combined and used in applications to obtain the best performance and flexibility.

For example, a hash type cache can be used to handle local attribute changes and queries, while a serialized string type cache can be used to obtain complete user information. In this way, the advantages of Redis can be fully utilized to improve the efficiency of data access while maintaining flexibility and maintainability.
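
The practical difference between the three layouts shows up when a single attribute changes. The following sketch models each layout with plain Python dictionaries (hypothetical stand-ins for the Redis keyspace, not client calls) and updates the user's age in each one.

```python
import json

# Layout 1: native strings, one key per attribute.
string_keys = {"user:1:name": "James", "user:1:age": "23", "user:1:city": "Beijing"}

# Layout 2: one key holding a serialized JSON document.
json_key = {"user:1": json.dumps({"name": "James", "age": 23, "city": "Beijing"})}

# Layout 3: one key holding a hash (field -> value).
hash_key = {"user:1": {"name": "James", "age": "23", "city": "Beijing"}}

# Update the age to 24 in each layout.
string_keys["user:1:age"] = "24"      # one write, but three keys per user

doc = json.loads(json_key["user:1"])  # read, modify, and rewrite the whole value
doc["age"] = 24
json_key["user:1"] = json.dumps(doc)

hash_key["user:1"]["age"] = "24"      # one field write under a single key

# Number of top-level keys each layout needs for one user.
print(len(string_keys), len(json_key), len(hash_key))  # 3 1 1
```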

Origin blog.csdn.net/qq_61635026/article/details/132768903