Introduction to NoSQL
What is NoSQL
Non-relational database is NoSQL, relational database is MySQL
For relational databases, data needs to be stored in libraries, tables, rows, and fields. When querying, it is matched line by line according to the conditions. When the equivalent is very large, it takes time and resources because the data is stored on disk , You need to retrieve it from the disk according to your query conditions
NoSQL database storage principle is very simple (typical data type is kv, one key and one value), there is no complicated relationship chain, there are no rows, fields, and such complex data structures. For example, when mysql query, you need to find the corresponding library , Tables (usually multiple tables) and fields.
NoSQL database does not store the association between tables
NoSQL data can be stored in memory and query speed is very fast
Although NoSQL can outperform relational databases in performance, it cannot completely replace relational databases.
Because NoSQL has no complicated data structure, it is very easy to extend and supports distributed
2. Common NoSQL databases
Kv form: memcached, redis are suitable for storing user information, such as sessions (Session: user information), configuration files, parameters, shopping carts, etc. This information is generally linked to ID (key). In this scenario, a key-value database is a good choice.
Document database: mongodb stores data in the form of documents. Each document is a collection of a series of data items. Each data item has a name and a corresponding value. The value can be a simple data type, such as string, number, and date, or it can be a complex type, such as an ordered list and associated objects. The smallest unit of data storage is a document. The document attributes stored in the same table can be different, and the data can be stored in multiple forms such as XML, JSON, or JSONB.
Column storage: Hbase
图 :Neo4J、Infinite Graph、OrientDB
Expansion:
What are the common relational databases and non-relational databases?
Relational Database:
Oracle、DB2、Microsoft SQL Server、Microsoft Access、MySQL
Non-relational database:
NoSql、Cloudant、MongoDb、redis、HBase
The difference between the two databases:
Relational Database
Characteristics of relational databases
1. Relational database refers to a database that uses a relational model to organize data;
2. The biggest feature of relational databases is the consistency of transactions;
3. In simple terms, a relational model refers to a two-dimensional table model, and a relational database is a data organization composed of two-dimensional tables and the connections between them.
Advantages of relational databases
1. Easy to understand: The two-dimensional table structure is a concept very close to the logical world, and the relational model is easier to understand than other models such as mesh and hierarchy;
2. Easy to use: The general SQL language makes it very convenient to operate relational databases;
3. Easy to maintain: Rich integrity (entity integrity, referential integrity and user-defined integrity) greatly reduces the probability of data redundancy and data inconsistency;
4. Support SQL, which can be used for complex queries.
Disadvantages of relational databases
1. The huge price paid for maintaining consistency is its poor read and write performance;
2. Fixed table structure;
3. High concurrent read and write requirements;
4. Efficient reading and writing of massive data;
Non-relational database
Characteristics of non-relational databases
1. Use key-value pairs to store data;
2. Distributed;
3. Generally does not support ACID features;
4. A non-relational database is not strictly a database, it should be a collection of data structured storage methods.
Advantages of non-relational databases
1. No need to go through the sql layer analysis, high read and write performance;
2. Based on key-value pairs, the data is not coupled and easy to expand;
3. The format of the stored data: Nosql's storage format is key, value, document, picture, etc., document, picture, etc., while relational databases only support basic types.
Disadvantages of non-relational databases
1. SQL support is not provided, and the cost of learning and use is high;
2. No transaction processing, and support for additional functions such as bi and reports is not good;
Introduction to Memcached
In lamp or lamp architecture, memcached is used to cache some data, such as forum posts, visits, online people...
Memcached is developed by the LiveJournal team of a foreign community website. The purpose is to reduce the number of database visits by caching database query results, thereby improving the performance of dynamic web sites.
Official site http://www.memcached.org/
characteristic
The data structure is simple (kv), and the data is stored in memory
Multithreading (when your server has a lot of CPUs, you will obviously find that memcached is fast)
Based on the c/s architecture, the protocol is simple (meaning you have to start a server, and the client connects to the server)
Event handling based on libevent (nginx is also based on libevent)
Autonomous memory storage processing (slab allowcation)
There are two data expiration methods: Lazy Expiration and LRU
Disadvantages: does not support landing, does not support persistence, once you restart the server or restart the memcached service, the cached data will be lost
Solution: Save the data to the disk before restarting, and then restore the data back after restarting
Memcached data flow
Normal flow: For example, there is a forum with lnmp architecture. Normally, php will deal with the database, and then nginx will call php for data interaction. For example, to check the content of a post, first the user initiates a request to nginx, and nginx sends the php request The script for php is given to the service of php-fpm. After the php-fpm service parses the php script, it finds that the post is to be queried, and then the database is used to query the data. After the query is completed, nginx feedback To users
Principle of Slab Allocation
Divide the allocated memory into chunks of various sizes, and divide the chunks of the same size into groups (collections of chunks). Each chunk collection is called a slab.
The memory allocation of Memcached takes Page as the unit. The default value of Page is 1M, which can be specified by the -I parameter at startup.
Slab is composed of multiple pages, which are cut into multiple chunks according to the specified size.
Slab allocation
Growth factor
Memcached can specify the Growth Factor factor through the -f option at startup. This value controls the difference in chunk size. The default value is 1.25.
Use the memcached-tool command to view the different slab status of the specified Memcached instance, and you can see that the size (chunk size) of each Item is 1.25
命令:# memcached-tool 127.0.0.1:11211 display
Memcached data expiration method
Lazy Expiration
When creating a kv, you need to specify an expiration time (time stamp), which is determined by the user
Memcached does not monitor whether the record is expired internally. Instead, it looks at the time stamp of the record when getting it to check whether the record is expired. This technique is called lazy expiration. Therefore, Memcached will not consume CPU time on expiration monitoring.
LRU
In other words, we store a lot of data in memcache, but some of the data is often used and some are basically stored and have not been used. These will be marked as dangerous. When the space is not enough, these frequently not used will be overwritten.
Memcached will give priority to using the space of records that have timed out, but even so, there will be insufficient space when adding new records. At this time, a mechanism called Least Recently Used (LRU) is used to allocate space. As the name suggests, this is a mechanism to delete the "least recently used" record. Therefore, when the memory space is insufficient (when new space cannot be obtained from the slab class), it searches from the records that have not been used recently and allocates its space to the new record. From a practical point of view of caching, this model is ideal.
Memcached installation
Installation: yum install -y memcached libmemcached libevent memcached is based on libevent, so you must install libevent before installing me, rpm -qa libevent to see if it is installed or not, usually when installing memcached, he will automatically install it.
Start: systemctl start memcached
[root@awei-01 ~]# ps aux|grep memcached memcach+ 8484 0.0 0.0 326028 1204 ? Ssl 18:07 0:00 /usr/bin/memcached -u memcached -p 11211 -m 64 -c 1024 root 8491 0.0 0.0 112656 996 pts/0 S+ 18:07 0:00 grep --color=auto memcached
-u: Specify system user -p: monitor port -m: allocated memory size (unit: M mega) -c: maximum concurrent number
These parameters can be changed, but it does not have a configuration file. There are two ways to change it. One is to add parameters on the command line at startup, and the other is to change this file.
3. Configuration file: vim /etc/sysconfig/memcached can configure parameters
PORT="11211" port USER="memcached" user MAXCONN="1024" Maximum number of connections CACHESIZE="64" RAM OPTIONS="" other
For example, to add monitoring ip, you can change OPTIONS="" to: OPTIONS="127.0.0.1"
-m specifies memcached to allocate memory
-c specifies the maximum number of concurrent
-u specifies the user who runs the memcached service
View parameter usage: memcached -h
Check the running status of Memcached
Method 1: Built-in tool for viewing status: memcached-tool
Format: memcached-tool IP: port stats
用法memcached-tool 127.0.0.1:11211 stats
Mainly look: get_hits/curr_items= hit rate (that is, for example, you have some posts cached in me, whether the user visits through me when they visit)
connection_structures 11 curr_connections 10 curr_items 0 decr_hits 0 decr_misses 0 delete_hits 0 delete_misses 0 evicted_unfetched 0 evictions 0 expired_unfetched 0 get_hits 0 get_misses 0 hash_bytes 524288
Method 2: Command nc, need to install nc tool: yum install -y nc
Check which package installed this command: rpm -qf `which nc`
[root@awei-01 ~]# rpm -qf `which nc` nmap-ncat-6.40-19.el7.x86_64
Format: echo stats |nc IP port
Command usage: echo stats |nc 127.0.0.1 11211
[root@awei-01 ~]# echo stats |nc 127.0.0.1 11211 STAT pid 8484 STAT uptime 1327 STAT time 1599474551 STAT version 1.4.15 STAT libevent 2.0.21-stable STAT pointer_size 64 STAT rusage_user 0.013403 STAT rusage_system 0.040211 STAT curr_connections 10 STAT total_connections 12 STAT connection_structures 11
Method 3:
rpm -qa libmemcached is used to view a package. After installing libmemcached, you can use the command
Format: memstat --servers=IP:port
Usage: memstat --servers=127.0.0.1:11211
[root@awei-01 ~]# memstat --servers=127.0.0.1:11211 Server: 127.0.0.1 (11211) pid: 8484 uptime: 1406 time: 1599474630 version: 1.4.15 libevent: 2.0.21-stable pointer_size: 64 rusage_user: 0.014232 rusage_system: 0.042697 curr_connections: 10 total_connections: 13 connection_structures: 11 reserved_fds: 20 cmd_get: 0 cmd_set: 0
Memcached command line
Enter the memcached command: telnet
Usage: telnet 127.0.0.1 11211
[root@awei-01 ~]# telnet 127.0.0.1 11211 Trying 127.0.0.1... Connected to 127.0.0.1. Escape character is '^]'. ##Enter the command
Data storage command: set
Usage: set key2 0 30 2
View storage data command: get
Usage: get key2
get key2 VALUE key2 0 2 12 END
Memcached syntax rules
<command name> <key> <flags> <exptime> <bytes>\r\n <data block>\r\n
Note: \r\n is the Enter key under windows
<command name> can be set, add, replace
set: storage . Indicates to store the data according to the corresponding <key>, increase when not, and sometimes overwrite
add: add . Indicates that the data is added according to the corresponding <key>, but if the <key> already exists, the operation will fail
replace: Replace . It means to replace the data according to the corresponding <key>, but if the <key> does not exist, the operation fails.
<key> The key that the client needs to save data
<flags> is a 16-bit unsigned integer (represented in decimal notation). This flag will be stored together with the data that needs to be stored and returned when the client gets the data. The client can use this flag for special purposes. This flag is opaque to the server.
<exptime> is the expiration time. If it is 0, the stored data will never expire (but can be replaced by server algorithm: LRU, etc.). If it is not 0 (unix time or the number of seconds from this time), when the expiration, the server can guarantee that the user will not get the data (based on the server time).
<bytes> The number of bytes that need to be stored, when the user wants to store empty data, <bytes> can be 0
<data block> content to be stored, after the input is completed, the client needs to add \r\n (directly click Enter) as the end mark.
Memcached data example
Replace data: replace
set key1 1 100 3 abc STORED get key1 VALUE key1 1 3 abc END replace key1 1 100 4 abde STORED get key1 VALUE key1 1 4 abde END
Delete data: delete
delete key1 DELETED get key1 END
Memcached data export and import
Export:
memcached-tool 127.0.0.1:11211 dump > data.txt
cat data.txt
Import:
nc 127.0.0.1 11211 < data.txt
If the nc command does not exist, yum install nc
Note: The exported data has a timestamp. This timestamp is the point in time when the data expires. If the current time has exceeded the timestamp, it cannot be imported.
PHP connect to Memcached
Install the memcache extension of php first
cd /usr/local/src/
Download the extension package: wget http://www.apelearn.com/bbs/data/attachment/forum/memcache-2.2.3.tgz (This is an extension of PHP, not memcached)
Unzip: tar zxf memcache-2.2.3.tgz
Enter: cd memcache-2.2.3
Generate config file: /usr/local/php-fpm/bin/phpize
编译:./configure --with-php-config=/usr/local/php-fpm/bin/php-config
安装:make && make install
After installation, there will be a prompt like this: Installing shared extensions: /usr/local/php-fpm/lib/php/extensions/no-debug-non-zts-20131226/
Then modify php.ini: vim /usr/local/php-fpm/etc/php.ini
Add a line extension=memcache.so
Check: /usr/local/php/bin/php-fpm -m template is loaded successfully
PHP connect to Memcached
Download test script
curl www.apelearn.com/study_v2/.memcache.txt > 1.php 2>/dev/null
1.php content can also refer to https://coding.net/u/aminglinux/p/yuanke_centos7/git/blob/master/21NOSQL/1.php
Execute the script and test it:
/usr/local/php-fpm/bin/php 1.php
Or put 1.php under the root directory of a virtual host, and visit in the browser to see the effect
Finally, you can see the data as follows:
[0] => aaa
[1] => bbb
[2] => ccc
[3] => ddd
Store session in Memcached
This example is implemented in the lamp/lnmp environment
How to specify the session to be stored in the memcached service
Edit the configuration file: /usr/local/php-fpm/etc/php.ini
Edit php.ini, find the session related and add two lines, the default is files (stored in the local disk)
The default storage location/tmp/ directory
If you want to save it in memcache, comment it out;,
Restart the php-fpm service: /etc/init.d/php-fpm restart
If it is apache adding method:
Add to the corresponding virtual host in httpd.conf
php_value session.save_handler "memcache" ##Specify storage type
php_value session.save_path "tcp://192.168.0.9:11211" ##Specify the server IP port of memcached
If it is nginx, the method is as follows:
Add to the pool corresponding to php-fpm.conf
php_value[session.save_handler] = memcache##Specify storage type
php_value[session.save_path] = "tcp://192.168.0.9:11211 "##Specify the server IP of memcached
After the configuration is successfully accessed, the value in the blue box at the back is displayed, you can enter memchahed to get this value
telnet 127.0.0.1 11211
get i44nunao0g3o7vf2su0hnc5440
experiment:
Download a php file and save a script of the session
wget http://study.lishiming.net/.mem_se.txt
Put it in the default virtual host directory
mv .mem_se.txt /usr/local/apache2/htdocs/session.php
The content of session.php can refer to https://coding.net/u/aminglinux/p/yuanke_centos7/git/blob/master/21NOSQL/session.php
Use the command to access: curl localhost/session.php This will store a session
Similar to 1443702394<br><br>1443702394<br><br>i44nunao0g3o7vf2su0hnc5440 is stored under /tmp/ by default