NoSQL: Redis installation, configuration, and basic operations

1. Knowledge about cache
1.1 The concept of cache
A cache sits between two components that run at different speeds and accelerates access to the slower one. For example, the CPU's L1 and L2 caches hold the data the CPU has accessed most recently, main memory holds data the CPU frequently reads from the hard disk, and hard disks themselves carry caches of various sizes. Even the RAID card of a physical server has a cache. All of these exist to speed up the CPU's access to data on disk: the CPU is so fast that the hard disk cannot deliver data quickly enough to keep it busy, so the CPU cache, memory, RAID card cache, and disk cache each satisfy part of the CPU's demand for data. Reading from a cache therefore greatly improves the CPU's efficiency.


1.2 System cache
Buffer and cache:

Buffer: A buffer, also called a write buffer, is generally used for write operations. It bridges media that run at different speeds: data is first written to the nearest fast location (memory) and flushed to disk later, which increases write speed. The CPU writes the data to the disk buffer in memory and treats the write as complete, and the kernel writes it to disk at a later time. A sudden power failure on the server therefore loses whatever data is still only in memory.
Cache: A cache, also called a read cache, is generally used for read operations. The CPU reads files from memory; if the data is not in memory, it is first read from the hard disk into memory and then handed to the CPU. Keeping frequently read data in the nearest cache makes the next read fast.
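The difference between the two can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the classes `SlowDisk`, `WriteBuffer`, and `ReadCache` are hypothetical names invented for this example, and a dictionary stands in for the disk.

```python
class SlowDisk:
    """Stands in for the slow medium and counts how often it is touched."""
    def __init__(self):
        self.blocks = {}
        self.reads = 0
        self.writes = 0

    def read(self, key):
        self.reads += 1
        return self.blocks.get(key)

    def write(self, key, value):
        self.writes += 1
        self.blocks[key] = value


class WriteBuffer:
    """Accepts writes in memory and flushes them to disk later."""
    def __init__(self, disk):
        self.disk = disk
        self.pending = {}

    def write(self, key, value):
        self.pending[key] = value          # the caller returns immediately

    def flush(self):
        for key, value in self.pending.items():
            self.disk.write(key, value)
        self.pending.clear()               # unflushed data would be lost on power failure


class ReadCache:
    """Keeps data that was read once in memory so repeated reads skip the disk."""
    def __init__(self, disk):
        self.disk = disk
        self.cached = {}

    def read(self, key):
        if key not in self.cached:         # miss: go to the disk once
            self.cached[key] = self.disk.read(key)
        return self.cached[key]


disk = SlowDisk()
buf = WriteBuffer(disk)
buf.write("a", 1)
writes_before_flush = disk.writes          # the data only lives in memory so far
buf.flush()

cache = ReadCache(disk)
first = cache.read("a")
second = cache.read("a")                   # served from memory, no second disk read
```

After the run, the disk has been written once and read once, even though the value was read twice.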
 

1.3 Cache storage location and hierarchical structure

In the field of Internet applications, caching is the key to improving service response speed.

User layer: browser DNS cache, application DNS cache, operating system DNS cache, client
Proxy layer: CDN, reverse proxy cache
Web layer: Web server cache
Application layer: static page generation
Data layer: distributed cache, database
System layer: operating system cache
Physical layer: disk cache, RAID cache
 

DNS cache
The browser caches DNS results for 60 seconds by default, so accessing the same domain name again within 60 seconds does not trigger another DNS resolution.

application layer cache

Web services such as Nginx and PHP can enable application-level caches to speed up responses to user requests. In addition, languages such as PHP, Python, and Java are not executed directly from source: the source is first compiled into bytecode, which an interpreter or virtual machine then turns into machine code for execution. The compiled bytecode is therefore also a kind of cache, and occasionally stale bytecode keeps being used after new program code has been deployed. For this reason, clear the application cache before releasing a new version.

Static page generation can also speed up access: for pages that are built dynamically from database data, a program generates the static HTML files in advance. Product descriptions on e-commerce sites, comments, and other non-real-time data are good candidates for this technique.
 

data layer cache

Distributed cache services:
Redis
Memcached

Database:
MySQL query cache
InnoDB cache, MyISAM cache

Hardware cache:
CPU cache (L1 data cache and L1 instruction cache), L2 cache, L3 cache
Disk cache: Disk Cache
Disk array cache: RAID Cache, which can be backed by a battery to prevent data loss on power failure

2. Relational and non-relational databases
2.1 Relational database

A relational database is a structured database built on the relational model (a two-dimensional table model) and is generally record-oriented.
SQL (Structured Query Language) is the language of relational databases, used to query and manipulate the data stored in them.
Mainstream relational databases include Oracle, MySQL, SQL Server, Microsoft Access, DB2, PostgreSQL, etc.
To use any of these databases, you must first create a database, create tables, and design the table structure, and then store data according to that structure. If the data does not match the table structure, the write fails.
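The "structure first, data second" rule can be tried out with a small sketch. SQLite is used here only because it ships with Python; it is not one of the databases named above, and the `users` table and its columns are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# The table structure has to exist before any data can be stored.
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute("INSERT INTO users (id, name) VALUES (?, ?)", (1, "alice"))

try:
    # 'email' is not a column of the table, so the statement is rejected.
    conn.execute(
        "INSERT INTO users (id, name, email) VALUES (?, ?, ?)",
        (2, "bob", "bob@example.com"),
    )
    rejected = False
except sqlite3.OperationalError:
    rejected = True

rows = conn.execute("SELECT id, name FROM users").fetchall()
conn.close()
```

Only the row that fits the declared structure ends up in the table.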
 

2.2 Non-relational database
NoSQL (NoSQL = Not Only SQL), meaning "not just SQL", is the general term for non-relational databases.
Databases other than the mainstream relational databases are considered non-relational.
There is no need to create databases and tables in advance to define a storage structure. Each record can have different data types and a different number of fields (such as the text, pictures, videos, and music in a WeChat group chat).
Mainstream NoSQL databases include Redis, MongoDB, HBase (a distributed non-relational database used for big data), Memcached, Elasticsearch (ES for short, an index-oriented database), TSDB (time series databases), etc.
 

2.3 Differences between relational databases and non-relational databases:

(1) Different data storage methods

The main difference between relational and non-relational databases is the way data is stored.

Relational data is inherently tabular and is therefore stored in the rows and columns of tables. Tables can be associated with each other and stored together, and extracting data from them is easy.
Non-relational data, by contrast, does not fit into the rows and columns of a table and is grouped together in larger chunks. It is usually stored in data sets such as documents, key-value pairs, or graph structures. Your data and its characteristics are the main factor in choosing how to store and retrieve it. (Switching data types is easy, and one data set can contain several data types.)
 

(2) Different expansion methods

The biggest difference between SQL and NoSQL databases may be how they scale. Supporting growing demand requires scaling one way or another.

To support more concurrency, an SQL database scales vertically: processing power is increased by using a faster computer so that the same data set is processed faster. Because data is stored in relational tables, performance bottlenecks in operations that touch many tables have to be overcome with more powerful hardware. Although SQL databases have a lot of room to grow, vertical scaling eventually hits an upper limit. (Data is generally stored in the local file system. Reads can be spread out through read/write separation and load balancing, but reads and writes still consume IO capacity.)
NoSQL databases scale horizontally. Because non-relational storage is naturally distributed, a NoSQL database can share the load by adding more ordinary database servers (nodes) to the resource pool. (Data is distributed across different servers, which can be read and written concurrently to improve efficiency.)


Horizontal expansion: add servers (cheaper).

Vertical expansion: improve the hardware configuration, for example a higher-performance CPU, more CPU cores, more hard disks, faster disk IO, and more memory. (Apart from hard disks, these upgrades require shutting the machine down.)
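How horizontal expansion spreads data over several servers can be sketched as follows. This is a deliberately simple modulo-sharding illustration, not how any particular NoSQL product places its data; the node names and the `pick_node` helper are hypothetical.

```python
import hashlib

def pick_node(key, nodes):
    """Map a key to one node by hashing it (simple modulo sharding)."""
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return nodes[int(digest, 16) % len(nodes)]

nodes = ["node-a", "node-b", "node-c"]      # hypothetical server names

placement = {}
for i in range(1000):
    key = f"user:{i}"
    placement.setdefault(pick_node(key, nodes), []).append(key)

# How many keys each node ends up holding.
counts = {node: len(keys) for node, keys in placement.items()}
```

Each node stores and serves only its own share of the keys, so adding nodes adds capacity. A known weakness of plain modulo sharding is that changing the number of nodes moves most keys, which is why real systems tend to use schemes such as consistent hashing or fixed hash slots.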

(3) Different support for transactions

If data operations have strict transactional requirements, or complex queries need control over the execution plan, a traditional SQL database is the best choice in terms of performance and stability. SQL databases support fine-grained control over transaction atomicity and make it easy to roll back a transaction.
NoSQL databases can also perform transactional operations, but they cannot match relational databases in stability, so their real value lies in scalable operations and in processing large volumes of data.

In summary:

Relational database: especially suitable for tasks with strict transactional requirements and tasks that need control over the execution plan; fine-grained transaction control is stronger.

Non-relational database: transaction control is somewhat weaker; its value lies in good read and write performance, easy scaling, and processing large volumes of data.
 

2.4 Background of non-relational database

Non-relational databases emerged to deal with the "three highs" of Web 2.0 sites built entirely from dynamic pages:

(1) High performance: the database must handle highly concurrent reads and writes.

(2) Huge storage: massive amounts of data must be stored and accessed efficiently.

(3) High scalability and high availability: the database must be easy to scale and must stay available.

Relational and non-relational databases each have their own characteristics and application scenarios, and combining the two closely brings new ideas to Web 2.0 database design: let the relational database focus on relationships and data consistency, and let the non-relational database focus on storage and efficiency. For example, in a MySQL environment with read/write separation, frequently accessed data (hot data) can be stored in a non-relational database to improve access speed.
 

2.5 Comparison of data records between NoSQL and SQL

Relational database:
instance --> database --> table --> row (record), column (data field)


Non-relational database:
instance --> database --> collection --> key-value pair
A non-relational database does not require databases and collections (tables) to be created manually.
 

3 Knowledge about redis 
3.1 Introduction to redis 
Redis is an open source, in-memory key-value database written in C that provides client APIs for many languages. It offers a rich set of data structures and is mainly used as a database, cache, distributed lock, message queue, and so on.

The Redis server uses a single-process model, so multiple Redis processes can be started on one server at the same time, and the actual processing speed of each Redis instance depends entirely on the execution efficiency of its main process.

If only one Redis process runs on the server, its processing capacity drops to some extent when many clients access it at the same time.
If multiple Redis processes run on the same server, concurrent processing capacity improves, but the server's CPU comes under considerable pressure.


3.2 The five major data types of Redis
The basic data types are string, list (a doubly linked list), hash (a collection of field-value pairs), set (an unordered collection without duplicates), and sorted set, also called zset (an ordered set).

3.3 Advantages and disadvantages of redis 
(1) Extremely fast reads and writes: read throughput can reach up to 110,000 operations/s and write throughput up to 81,000 operations/s.

(2) Rich data structures: on top of the key-value model, Redis supports operations on Strings, Lists, Hashes, Sets, Sorted Sets, and other data types.

(3) Data persistence: data in memory can be saved to disk and loaded again on restart.

(4) Atomicity: every individual Redis command is atomic, and several commands can be grouped into a transaction.

(5) Data backup: data can be backed up in master-slave mode (master-slave replication).

Disadvantages of Redis
Cache and database double-write consistency problem
Cache avalanche problem
Cache breakdown problem
Cache concurrent competition problem

3.4 Applicable scenarios of Redis
As an in-memory database, Redis works as a high-performance cache and is commonly used for leaderboards, counters, lists of the most popular recent articles or comments, publish/subscribe, and so on.
Redis suits scenarios that need real-time data, data that expires and is evicted, no persistence or only weak consistency, and simple logic.
Which data is suitable to be placed in the cache?

● Data that must be available immediately. For example, querying the latest logistics status.
● Data with loose consistency requirements. For example, when store information is modified, the database changes right away and the cache catches up five minutes later, without affecting functionality.
● Data that is read often but updated rarely. For example, the advertisements on a site's home page get many visits but seldom change.
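The "cache catches up five minutes later" behaviour can be sketched with a small expiring cache. This is an in-process illustration only, not Redis itself; the `TTLCache` class, the key name, and the store data are invented for the example, and a fake clock keeps the run deterministic.

```python
import time

class TTLCache:
    """Tiny in-process cache whose entries expire after ttl seconds."""
    def __init__(self, ttl, clock=time.monotonic):
        self.ttl = ttl
        self.clock = clock
        self.entries = {}                        # key -> (value, expires_at)

    def get(self, key, load):
        entry = self.entries.get(key)
        now = self.clock()
        if entry is not None and entry[1] > now:
            return entry[0]                      # still fresh: serve from the cache
        value = load(key)                        # missing or expired: reload
        self.entries[key] = (value, now + self.ttl)
        return value


now = [0.0]                                      # fake clock, advanced by hand
database = {"store:1": "open 9-18"}
cache = TTLCache(ttl=300, clock=lambda: now[0])

v1 = cache.get("store:1", database.get)
database["store:1"] = "open 10-20"               # the database is modified
v2 = cache.get("store:1", database.get)          # the cache still returns the old value
now[0] += 301                                    # a little over five minutes later
v3 = cache.get("store:1", database.get)          # the entry expired, fresh data is loaded
```

The stale value is served for at most `ttl` seconds, which is acceptable exactly for the kinds of data listed above.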
 

3.5 Reasons why Redis uses single thread
First of all, it must be clear what "single-threaded" means for Redis: network IO and the reading and writing of key-value pairs are handled by one thread, while other work such as persistence and cluster data synchronization is carried out by additional threads or child processes. To understand why Redis uses a single thread, it helps to first look at the overhead of multithreading.

Multithreading usually increases system throughput or scalability, but threads typically access some shared resources at the same time. Guaranteeing correct access to those resources requires an additional mechanism, and that mechanism carries its own overhead. Controlling concurrent access from multiple threads has always been a hard problem. Without careful design, for example when a single coarse-grained mutex is used, the result is disappointing: even as threads are added, most of them wait to acquire the mutex, parallel execution degrades to serial execution, and throughput does not grow with the number of threads.
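The coarse-grained mutex case can be illustrated with a minimal sketch. It shows the pattern only, not Redis internals: four threads update one shared counter, and because every update has to hold the same lock, the updates happen one at a time no matter how many threads are started. The names `state` and `add_many` are made up for the example.

```python
import threading

state = {"counter": 0}          # the shared resource
lock = threading.Lock()         # one coarse-grained lock protecting it

def add_many(times):
    for _ in range(times):
        with lock:              # only one thread at a time gets past this line
            state["counter"] += 1

threads = [threading.Thread(target=add_many, args=(10000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

total = state["counter"]
```

The result is correct, but the critical section runs serially, so adding threads buys no extra throughput for this workload.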
 

In addition:

Multithreading was introduced in Redis 6.0. Before Redis 6.0, everything from network IO to the actual execution of read and write commands was done by a single thread. As network hardware has become faster, the bottleneck of Redis can shift to network IO: a single main thread may not process network requests as fast as the underlying hardware delivers them. To address this, Redis can use multiple IO threads to process network requests in parallel. These IO threads only handle network requests; read and write commands are still executed by a single thread.
 

3.6 The reason why redis runs fast

Redis is memory-based: most requests are pure memory operations, which are very fast.
Redis has efficient underlying data structures. To optimize memory use, each type generally has two underlying implementations.
Command execution is single-threaded, which avoids unnecessary context switches and resource contention; there is no CPU switching or locking overhead caused by multithreading.
IO multiplexing: a large number of client requests can be handled concurrently in network IO, which gives high throughput.

IO multiplexing means that one thread handles multiple IO streams; it is commonly known as the select/epoll mechanism. With Redis running on a single thread, this mechanism lets many listening sockets and connected sockets exist in the kernel at the same time. The kernel keeps watching these sockets for connection requests and data requests, and as soon as one arrives it is handed to the Redis thread for processing. One Redis thread thus handles many IO streams, which improves concurrency.
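The idea of one thread watching many sockets can be tried with Python's standard `selectors` module. This is a sketch of the general select/epoll pattern, not of the Redis event loop; socket pairs stand in for connected clients, and the labels `client-0` to `client-2` are made up.

```python
import selectors
import socket

sel = selectors.DefaultSelector()       # uses epoll on Linux, other mechanisms elsewhere

# Three socket pairs stand in for three connected clients.
pairs = [socket.socketpair() for _ in range(3)]
for i, (server_side, _client_side) in enumerate(pairs):
    server_side.setblocking(False)
    sel.register(server_side, selectors.EVENT_READ, data=f"client-{i}")

# Two of the three clients send a request.
pairs[0][1].sendall(b"PING")
pairs[2][1].sendall(b"GET k1")

# A single thread asks the kernel which sockets are ready and handles only those.
handled = {}
for key, _events in sel.select(timeout=1):
    handled[key.data] = key.fileobj.recv(1024)

sel.close()
for server_side, client_side in pairs:
    server_side.close()
    client_side.close()
```

The idle client costs nothing: the thread never blocks on it, because the kernel reports only the sockets that actually have data.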

 
3.7 Comparison between Redis and memcached 


4. Redis installation and configuration 
4.1 Redis source code compilation and installation 
# Stop the firewall and switch SELinux to permissive mode
systemctl stop firewalld
setenforce 0

# Install the build dependencies
yum install -y gcc gcc-c++ make

# Upload the software package to /opt and extract it
cd /opt/
tar zxvf redis-5.0.7.tar.gz -C /opt/
cd /opt/redis-5.0.7/

# Compile with 2 parallel jobs and install to /usr/local/redis
make -j2 && make PREFIX=/usr/local/redis install
# The Redis source package ships with a Makefile, so there is no need to run ./configure after extracting it; run make and make install directly.

# Run the install_server.sh script shipped with the package to set up the configuration files the Redis service needs
cd /opt/redis-5.0.7/utils
./install_server.sh
......
# Press Enter to accept each default until this prompt appears
Please select the redis executable path [] /usr/local/redis/bin/redis-server
# The default is /usr/local/bin/redis-server; change it manually to /usr/local/redis/bin/redis-server and take care to type it correctly the first time

---------------- the text between the dashed lines is script output with comments ----------------
Selected config:
Port           : 6379                                 # default listening port
Config file    : /etc/redis/6379.conf                 # configuration file path
Log file       : /var/log/redis_6379.log              # log file path
Data dir       : /var/lib/redis/6379                  # data file path
Executable     : /usr/local/redis/bin/redis-server    # executable file path
Cli Executable : /usr/local/bin/redis-cli             # client command-line tool
---------------------------------------------------------------------------------------------------

# When install_server.sh finishes, the Redis service is already running and listens on port 6379 by default
netstat -natp | grep redis

# Link the Redis executables into a directory on PATH so the system can find them
ln -s /usr/local/redis/bin/* /usr/local/bin/

4.2 Redis startup configuration 
# Redis service control
/etc/init.d/redis_6379 stop       # stop
/etc/init.d/redis_6379 start      # start
/etc/init.d/redis_6379 restart    # restart
/etc/init.d/redis_6379 status     # view status

# Edit the parameters in the configuration file
vim /etc/redis/6379.conf
......
bind 127.0.0.1 192.168.73.105         # listening addresses; adding the host's own IP is also what allows remote logins
port 6379                             # listening port
daemonize yes                         # run as a daemon, i.e. start in the background
pidfile /var/run/redis_6379.pid       # where the Redis PID file is stored
logfile /var/log/redis_6379.log       # where the log is saved
databases 16                          # number of databases (numbered 0-15)

/etc/init.d/redis_6379 restart    # restart the Redis service
 

The correct understanding of bind in Redis

bind specifies the IP addresses of this machine (more precisely, the IP addresses assigned to this machine's network interfaces) on which Redis listens. It is not a list of remote IP addresses that are allowed to connect to Redis.

If bind is specified, Redis only accepts requests that arrive through the specified interfaces. If it is not specified, Redis accepts requests arriving through any interface.

If protected mode is enabled and neither a bind address nor a password is set, Redis only accepts connections from the local machine.

Explanation of bind 127.0.0.1 (why only this machine can connect and others cannot):

ifconfig shows that the lo interface (with IP address 127.0.0.1) is the loopback interface (Local Loopback). Only the local machine can reach this loopback address; other computers can only reach their own loopback address.

Since the only computer that can send traffic through this lo interface is the machine itself, only this machine can connect and other computers cannot.

bind 0.0.0.0 means listening on the addresses of all network interfaces on the server.
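That bind names a local address to listen on, rather than a remote address to admit, is ordinary socket behaviour and can be seen with a few lines of Python. This sketch has nothing Redis-specific in it; the `listen_on` helper is a made-up name, and port 0 lets the operating system pick any free port.

```python
import socket

def listen_on(address):
    """Open a listening TCP socket on a local address, on a port chosen by the OS."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind((address, 0))          # port 0: let the OS pick a free port
    server.listen()
    return server

loopback_only = listen_on("127.0.0.1")     # reachable only from this machine
all_interfaces = listen_on("0.0.0.0")      # reachable through every network interface

loopback_addr = loopback_only.getsockname()[0]
wildcard_addr = all_interfaces.getsockname()[0]

# A client on the same machine can reach the loopback listener.
client = socket.create_connection(
    ("127.0.0.1", loopback_only.getsockname()[1]), timeout=1
)
client.close()
loopback_only.close()
all_interfaces.close()
```

In both calls the address passed to `bind` belongs to the listening machine itself, which is exactly how the bind directive should be read.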

 5. Redis command tool


5.1 redis-cli: Redis command line tool
redis-cli -h host -p port [-a password]

-h: specify the remote host
-p: specify the port number of the Redis service
-a: specify the password; if no database password is set, the -a option can be omitted
# With no options at all, redis-cli connects to the Redis database on this machine at 127.0.0.1:6379

# Local login
redis-cli
# Remote login
redis-cli -h 192.168.72.60 -p 6379 [-a password]

5.2 redis-benchmark test tool 
redis-benchmark is the official performance testing tool that ships with Redis; it can effectively test the performance of a Redis service.

Basic test syntax: redis-benchmark [option] [option value]

-h: specify the server host name
-p: specify the server port
-s: specify the server socket
-c: specify the number of concurrent connections
-n: specify the total number of requests
-d: specify the data size of the SET/GET value in bytes
-k: 1=keep alive 0=reconnect
-r: use random keys for SET/GET/INCR and random values for SADD
-P: pipeline <numreq> requests
-q: quiet mode, only show the query/sec values
--csv: output in CSV format
-l: loop, run the tests forever
-t: only run the comma-separated list of tests
-I: idle mode, just open N idle connections and wait

(1) Performance test with 100 concurrent connections and 100000 requests
redis-benchmark -h 192.168.73.105 -p 6379 -c 100 -n 100000

(2) Performance test of reading and writing 100-byte values
redis-benchmark -h 192.168.73.105 -p 6379 -q -d 100

(3) Performance test of the SET and LPUSH operations
redis-benchmark -t set,lpush -n 100000 -q

6. Simple operation of redis 


6.1 Access to redis key-value pairs 
 set: store data, the command format is set key value 
 get: get data, the command format is get key 

6.2 keys: list the keys that match a pattern
Set some key-value pairs first:

192.168.73.105:6379> set v1 1
OK
192.168.73.105:6379> set v2 2
OK
192.168.73.105:6379> set v3 4
OK
192.168.73.105:6379> set k1 5
OK
192.168.73.105:6379> set k2 6
OK
192.168.73.105:6379> set k3 7
OK
192.168.73.105:6379> set k4 8
OK

(1) List all keys
keys *
 

(2) Obtain keys of any length starting with a certain character
keys v*
keys k*

(3) List keys that start with a certain character followed by a specified number of characters
Add test data first:

192.168.73.105:6379> set v123 123
OK
192.168.73.105:6379> set v11 11
OK
192.168.73.105:6379> set v1124 1124
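The keys command takes glob-style patterns, in which * matches any run of characters and ? matches exactly one character. The sketch below approximates that matching with Python's `fnmatchcase` over the test keys set above, so it can be run without a server. The patterns `v?` and `v??` are assumptions chosen for illustration, since the original does not show the command used for step (3), and `fnmatchcase` is only a stand-in for Redis's own matcher.

```python
from fnmatch import fnmatchcase

# The keys created in the test data above.
keys = ["v1", "v2", "v3", "k1", "k2", "k3", "k4", "v123", "v11", "v1124"]

def match_keys(pattern):
    """Return the keys matching a glob-style pattern (* = any run, ? = one character)."""
    return [k for k in keys if fnmatchcase(k, pattern)]

starts_with_v = match_keys("v*")     # any number of characters after "v"
v_plus_one = match_keys("v?")        # exactly one character after "v"
v_plus_two = match_keys("v??")       # exactly two characters after "v"
```

With this data, `v?` selects v1, v2, and v3, while `v??` selects only v11.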


6.3 Checking whether a key exists
exists key

# A return value of 0 means the key does not exist; 1 means it exists


6.4 Delete key 
del key

6.5 type: view the data type of the value stored at a key
type key


6.6 rename: rename a key
The rename command renames the key whether or not the target key already exists, and the value of the source key overwrites the value of the target key.
In practice, check with the exists command whether the target key exists before deciding to run rename, so that important data is not overwritten.
Command format: rename source_key target_key

6.7 renamenx: rename a key only if the target name is free

The renamenx command renames an existing key after checking whether the new name already exists. If the target key exists, the rename is not performed and nothing is overwritten.

Command format: renamenx source_key target_key
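The difference between the two commands can be modelled on a plain dictionary. This is a toy model for illustration, not a Redis client: the functions `rename` and `renamenx` below are local stand-ins that only mimic the overwrite behaviour described above.

```python
def rename(db, source, target):
    """Mimics rename: always moves the value, overwriting the target if present."""
    if source not in db:
        raise KeyError("no such key")
    db[target] = db.pop(source)

def renamenx(db, source, target):
    """Mimics renamenx: renames only when the target is free; returns 1 or 0."""
    if source not in db:
        raise KeyError("no such key")
    if target in db:
        return 0
    db[target] = db.pop(source)
    return 1

db = {"v1": "1", "v2": "2"}
blocked = renamenx(db, "v1", "v2")     # the target exists, so nothing changes
after_renamenx = dict(db)
rename(db, "v1", "v2")                 # the target exists and its value is overwritten
```

After the final call only v2 remains, and it holds the value that used to belong to v1.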

6.8 dbsize: view the number of keys
dbsize

6.9 Setting and clearing the password
Setting and viewing the password
# Set the Redis login password
config set requirepass password
# View the Redis password
config get requirepass

Clearing the password
config set requirepass ''
 

7. Redis multi-database operation
Redis supports multiple databases. By default, Redis contains 16 databases, and the database names are sequentially named with numbers 0-15.

After using redis-cli to connect to the Redis database, the database with serial number 0 is used by default.

Multiple databases are independent of each other and do not interfere with each other.

7.1 Switching between multiple databases
Command format: select serial_number
# After connecting to Redis with redis-cli, the database with serial number 0 is used by default.
127.0.0.1:6379> select 10        # switch to the database with serial number 10
127.0.0.1:6379[10]> select 15    # switch to the database with serial number 15
127.0.0.1:6379[15]> select 0     # switch to the database with serial number 0
127.0.0.1:6379>

7.2 Move data between multiple databases 
Command format: move key db_index (db_index is the serial number of the target database)

7.3 Clear the data in the database
FLUSHDB: clear the data of the current database
FLUSHALL: clear the data of all databases
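How select, move, and the two flush commands relate to the numbered databases can be sketched with a toy model. The `MiniRedis` class is a hypothetical name and is not a Redis client; it only mirrors the behaviour described in this section, including the rule that move does nothing when the key is missing from the current database or already present in the target.

```python
class MiniRedis:
    """Toy model of numbered databases, for illustration only."""
    def __init__(self, databases=16):
        self.dbs = [dict() for _ in range(databases)]
        self.current = 0                       # database 0 is selected by default

    def select(self, index):
        if not 0 <= index < len(self.dbs):
            raise IndexError("DB index is out of range")
        self.current = index

    def set(self, key, value):
        self.dbs[self.current][key] = value

    def get(self, key):
        return self.dbs[self.current].get(key)

    def move(self, key, index):
        """Move key to another database; 1 on success, 0 if it cannot be moved."""
        source, target = self.dbs[self.current], self.dbs[index]
        if key not in source or key in target:
            return 0
        target[key] = source.pop(key)
        return 1

    def flushdb(self):
        self.dbs[self.current].clear()         # only the selected database

    def flushall(self):
        for db in self.dbs:                    # every database
            db.clear()


r = MiniRedis()
r.set("k1", "5")
moved = r.move("k1", 10)
in_db0 = r.get("k1")          # gone from database 0
r.select(10)
in_db10 = r.get("k1")         # now lives in database 10
r.flushall()
after_flush = r.get("k1")
```

Because each database is a separate key space, the same key name can exist in several databases without conflict until a move tries to bring two of them together.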

 8. Common errors and solutions of redis
8.1 Common operation and maintenance faults of Redis


Running keys * blocks the database. It is recommended to rename this command to an alias with rename-command in the configuration file.
When the memory limit is exceeded, some data is deleted. Redis offers several eviction policies; choose the one that suits your workload.
If persistence is not enabled and the instance restarts, all data is lost. Remember to enable persistence for any data that is not purely cache.
RDB persistence requires vm.overcommit_memory=1; otherwise persistence can fail.
In a master-slave setup without persistence, if the master restarts so quickly that the slave does not notice it went down, the slave clears its own data after resynchronizing with the now empty master. Before manually restarting the master node, turn off synchronization on the slave node first.
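The second point, data being deleted once a limit is exceeded, can be illustrated with a least-recently-used policy. This sketch is an exact LRU that limits the number of keys; it is a simplification of what Redis does, since Redis limits memory rather than key count and approximates LRU by sampling. The `LRUCache` class is a made-up name.

```python
from collections import OrderedDict

class LRUCache:
    """Evicts the least recently used key once max_keys is exceeded."""
    def __init__(self, max_keys):
        self.max_keys = max_keys
        self.data = OrderedDict()

    def get(self, key):
        if key not in self.data:
            return None
        self.data.move_to_end(key)             # mark as most recently used
        return self.data[key]

    def set(self, key, value):
        self.data[key] = value
        self.data.move_to_end(key)
        if len(self.data) > self.max_keys:
            self.data.popitem(last=False)      # drop the least recently used key


cache = LRUCache(max_keys=2)
cache.set("a", 1)
cache.set("b", 2)
cache.get("a")             # "a" is now more recent than "b"
cache.set("c", 3)          # over the limit, so "b" is evicted
remaining = list(cache.data)
```

Which key disappears depends entirely on the policy, which is why the choice of eviction strategy matters for data that must not be lost.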
 

8.2 Redis Troubleshooting
Use Redis monitoring to check QPS, cache hit rate, memory usage, and similar metrics.
Check whether machine-level resources are abnormal.
When a failure occurs, log on to the machine promptly and use redis-cli monitor to print the commands being executed, then analyze them (this cannot be done after the fact).
Confirm with R&D whether a big key is causing blocking (big keys can also be found during routine inspections), and check with colleagues in the team whether anyone made an operational mistake.
Work with operations and R&D colleagues to check whether traffic is normal and whether there is abusive or scripted traffic.

Origin blog.csdn.net/zl965230/article/details/130803463