Relational Databases and NoSQL

Relational databases represent all data as two-dimensional tables made up of rows and columns.

 

Advantages of relational databases:

1. Maintain data consistency (transaction processing)

2. Because the data is normalized, the overhead of updating data is very small (the same field basically exists in only one place)

3. Can perform complex queries such as joins

The ability to maintain data consistency is the biggest advantage of relational databases.
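
As an illustration of what transaction processing buys you, here is a minimal sketch using Python's built-in sqlite3 module. The `accounts` table, the `transfer` helper and the sample balances are assumptions made up for this example. The point is that both updates are applied together or not at all.

```python
import sqlite3

def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst atomically; return True on success."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            (balance,) = conn.execute(
                "SELECT balance FROM accounts WHERE name = ?", (src,)
            ).fetchone()
            if balance < 0:
                # Raising here undoes BOTH updates above.
                raise ValueError("insufficient funds")
        return True
    except ValueError:
        return False

def balances(conn):
    """Return {name: balance} for every account."""
    return dict(conn.execute("SELECT name, balance FROM accounts"))

# Hypothetical sample data in an in-memory database.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER NOT NULL)"
)
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 50)])
conn.commit()
```

Calling `transfer(conn, "alice", "bob", 500)` returns False and leaves both balances unchanged, because the failed transaction is rolled back as a whole.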

 

Disadvantages of relational databases:

Relational databases are not good at the following:

1. Writing large amounts of data

2. Creating indexes or changing the table schema on tables that are being updated

3. Applications in which the fields are not fixed

4. Simple queries that need to return results quickly

-- Writing large amounts of data

Concentrating reads and writes on a single database overwhelms it. Most websites therefore use master-slave replication to separate reads from writes, which improves read/write performance and makes the read side scalable.

In this master-slave mode, the primary (master) database handles writes and the secondary (slave) databases handle reads. Scaling reads is relatively simple: add more secondary databases. There is no equally simple way to scale writes.
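
Read-write separation can be sketched as a small router that sends writes to the primary and spreads reads over the secondaries. The class name `ReadWriteRouter`, the list of write verbs and the server names are hypothetical. A real setup would usually rely on the database driver or a proxy rather than inspecting the SQL text like this.

```python
import itertools

class ReadWriteRouter:
    """Route writes to the primary and rotate reads over the secondaries."""

    # Statements treated as writes in this sketch (not an exhaustive list).
    WRITE_VERBS = {"INSERT", "UPDATE", "DELETE", "REPLACE", "CREATE", "ALTER", "DROP"}

    def __init__(self, primary, secondaries):
        self.primary = primary
        # Round-robin over the secondaries; None when there are none.
        self._secondaries = itertools.cycle(secondaries) if secondaries else None

    def route(self, sql):
        """Return the server that should execute this statement."""
        words = sql.split()
        verb = words[0].upper() if words else ""
        if verb in self.WRITE_VERBS or self._secondaries is None:
            return self.primary
        return next(self._secondaries)

router = ReadWriteRouter("primary", ["secondary-1", "secondary-2"])
```

Adding a third secondary only means passing a longer list, which is why scaling reads is easy. Every write still ends up on the single primary, which is the bottleneck the text describes.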

First, to scale writes you can consider increasing the number of primary databases from one to two and running them as dual primaries that replicate to each other. This can indeed cut the load on each primary by half. However, updates can conflict, which may cause data inconsistency. To avoid this, requests for each table must be routed to a designated primary database.

Second, you can consider splitting the database and placing the parts on different database servers, for example by putting different tables on different servers. Splitting reduces the amount of data on each server, which reduces hard disk I/O and allows high-speed processing in memory. However, tables stored on different servers cannot be joined, so this has to be taken into account before the database is split. If a join is still required after the split, the association has to be done in the application program, which is very difficult.
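
To show what "associating in the program" means, the sketch below uses two separate in-memory SQLite databases as stand-ins for two database servers. The `users` and `orders` tables and the sample rows are invented for this example. Because no single SQL statement can see both tables, the join is done in two steps in application code.

```python
import sqlite3

# Two independent databases stand in for two database servers.
users_db = sqlite3.connect(":memory:")
orders_db = sqlite3.connect(":memory:")

users_db.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
users_db.executemany("INSERT INTO users VALUES (?, ?)", [(1, "alice"), (2, "bob")])

orders_db.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, user_id INTEGER, item TEXT)"
)
orders_db.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(10, 1, "book"), (11, 2, "pen"), (12, 1, "lamp")],
)

def orders_with_user_names():
    """Join orders to users in application code instead of in SQL."""
    # Step 1: read the orders from the first server.
    orders = orders_db.execute(
        "SELECT id, user_id, item FROM orders ORDER BY id"
    ).fetchall()
    user_ids = sorted({user_id for _, user_id, _ in orders})
    if not user_ids:
        return []
    # Step 2: fetch only the referenced users from the second server.
    placeholders = ",".join("?" * len(user_ids))
    names = dict(
        users_db.execute(
            f"SELECT id, name FROM users WHERE id IN ({placeholders})", user_ids
        )
    )
    # Step 3: combine the two result sets in memory.
    return [(order_id, names.get(user_id), item) for order_id, user_id, item in orders]
```

Even this small case needs two round trips and an in-memory lookup table. With more tables, filters and sorting, the application ends up reimplementing parts of the query engine.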

 

 

-- Creating indexes or changing the table structure on tables that are being updated

When using a relational database, you need to create indexes to speed up queries and change the table structure to add necessary fields. These operations generally lock the table, and while the lock is held, data changes (updates, inserts, deletes) are impossible. Time-consuming operations, such as creating an index on a table with a large amount of data or changing its structure, therefore need special care, because the data may be unavailable for updates for a long time.

 

-- Applications in which the fields are not fixed

If the fields are not fixed, a relational database is also difficult to use. Some people will say that it is enough to add a field whenever one is needed. That is not impossible, but in practice, changing the table structure over and over again is very painful. You can also define a large number of spare fields in advance, but then it is easy to lose track over time of which fields hold which data.

-- Simple queries that need to return results quickly ("simple" here means without complex query conditions)

This is not exactly a disadvantage, but relational databases are not good at returning results quickly for simple queries. They read data through SQL, so every query must be parsed, and there are additional overheads such as locking and unlocking tables. This does not mean that relational databases are too slow. The point is that if you want high-speed processing of simple queries, you do not necessarily have to use a relational database.

---------------------------

NoSQL database

Relational databases are widely applicable and can perform complex processing such as transactions and table joins. In contrast, NoSQL databases are used only in specific areas and basically do not perform complex processing, but they make up for exactly the weaknesses of relational databases listed above.

Advantage:

Easy data distribution

The relationships between data are the main reason for the name "relational database". To perform joins, a relational database has to keep the related data on the same server, which makes the data hard to distribute. This is also why relational databases are not good at writing large amounts of data. NoSQL databases, by contrast, do not support joins in the first place, and each piece of data is designed to be independent, so it is easy to distribute the data over multiple servers. This reduces the amount of data on each server, so even large volumes of writes become easier to handle, and reads are of course just as easy.
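
The idea of distributing independent records can be sketched with a simple hash-based placement. The names `shard_for` and `ShardedStore` are hypothetical, and plain dictionaries stand in for the servers. Real systems typically use more elaborate schemes so that servers can be added without moving most of the data.

```python
import hashlib

def shard_for(key, shard_count):
    """Map a key to a shard number in a stable, repeatable way."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count

class ShardedStore:
    """Key-value store whose data is spread over several 'servers' (dicts)."""

    def __init__(self, shard_count):
        self.shards = [{} for _ in range(shard_count)]

    def put(self, key, value):
        # Each key lives on exactly one shard; no other shard is involved.
        self.shards[shard_for(key, len(self.shards))][key] = value

    def get(self, key, default=None):
        return self.shards[shard_for(key, len(self.shards))].get(key, default)

store = ShardedStore(shard_count=4)
for i in range(100):
    store.put(f"user:{i}", i)
```

Because every read and write touches a single shard chosen from the key alone, both kinds of load are spread across the servers. This only works because no operation needs to combine records that live on different shards.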

 

Typical NoSQL databases

Temporary key-value stores (memcached, Redis), permanent key-value stores (ROMA, Redis), document-oriented databases (MongoDB, CouchDB), and column-oriented databases (Cassandra, HBase).

1. Key-value storage

Data is stored as key-value pairs. This is very fast, but data can basically only be retrieved by an exact match on the key. Depending on how the data is saved, key-value stores can be divided into three kinds: temporary, permanent, and both.

(1) Temporary

      "Temporary" means that data may be lost. memcached stores all data in memory, so saving and reading are very fast, but when memcached stops, the data is gone. Because the data lives in memory, data beyond the memory capacity cannot be held, and old data is discarded. In summary:

      . Data is saved in memory

      . Very fast save and read processing is possible

      . Data may be lost
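
The behaviour summarized above can be imitated with a tiny in-memory cache that discards old entries once it is full. The class `MemoryCache` and its `max_items` limit are hypothetical, and a simple least-recently-used policy is assumed here. This is a sketch of the idea, not of how memcached is implemented.

```python
from collections import OrderedDict

class MemoryCache:
    """In-memory key-value cache that drops the least recently used entry."""

    def __init__(self, max_items):
        self.max_items = max_items
        self._data = OrderedDict()  # oldest entry first, newest last

    def set(self, key, value):
        self._data[key] = value
        self._data.move_to_end(key)  # mark as most recently used
        while len(self._data) > self.max_items:
            self._data.popitem(last=False)  # capacity exceeded: old data is lost

    def get(self, key, default=None):
        if key not in self._data:
            return default
        self._data.move_to_end(key)  # a read also counts as a use
        return self._data[key]

cache = MemoryCache(max_items=2)
cache.set("a", 1)
cache.set("b", 2)
cache.get("a")     # "a" is now more recently used than "b"
cache.set("c", 3)  # the cache is full, so "b" is discarded
```

Everything lives in one Python object, so the data also disappears when the process stops, which is the other way the text says data can be lost.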

 (2) Permanent

       "Permanent" means that data will not be lost. Here the key-value store saves its data on the hard disk. Compared with the temporary kind, performance is lower because hard disk I/O is unavoidable, but its greatest advantage is that data is not lost. In summary:

       . Save data on hard drive

       . Can do very fast save and read processing (but not comparable to memcached)

       . Data will not be lost

(3) Both

       Redis falls into this category. Redis is somewhat special in that it is both temporary and permanent. It first saves data in memory and writes it to the hard disk when certain conditions are met (by default, when at least 1 key has changed within 15 minutes, at least 10 keys within 5 minutes, or at least 10,000 keys within 1 minute). This gives both the processing speed of in-memory data and the permanence of data written to disk. This type of database is especially suitable for handling array-type data. In summary:

       . Data is stored both in memory and on the hard disk

       . Very fast save and read processing is possible

       . Data saved on the hard drive will not disappear (can be recovered)

       . Suitable for handling array-type data
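
The write-to-disk conditions quoted above can be expressed as a small rule check. The function `should_snapshot` and the `SAVE_RULES` table are hypothetical names for this sketch. The thresholds are the ones given in the text, and the comparison is an approximation of the behaviour rather than Redis's actual code.

```python
# (seconds since the last save, minimum number of changed keys),
# taken from the defaults quoted in the text.
SAVE_RULES = [(900, 1), (300, 10), (60, 10000)]

def should_snapshot(seconds_since_last_save, changed_keys, rules=SAVE_RULES):
    """Return True if any rule says the in-memory data should be written to disk."""
    return any(
        seconds_since_last_save >= seconds and changed_keys >= changes
        for seconds, changes in rules
    )
```

With these rules, a busy server is saved often (every minute once 10,000 keys have changed) while a nearly idle one is still saved at least every 15 minutes after a single change. Changes made since the last save are the part that can still be lost.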

     

2. Document-oriented database

   MongoDB and CouchDB fall into this category. They are NoSQL databases, but they differ from key-value stores.

   (1) No table structure needs to be defined

     Even without a defined table structure, the database can be used as if one had been defined, which also saves the trouble of changing the table structure.

   (2) Complex query conditions can be used

     Unlike key-value stores, document-oriented databases can retrieve data using complex query conditions. They lack the transaction processing and join capabilities of relational databases, but almost everything else can be achieved.
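
Both properties can be illustrated in plain Python, with dictionaries standing in for documents. The `find` helper, its `(field, operator, value)` condition format and the sample documents are invented for this sketch and are not the query API of MongoDB or CouchDB.

```python
import operator

# Comparison operators supported by this sketch.
OPERATORS = {
    "eq": operator.eq,
    "gt": operator.gt,
    "lt": operator.lt,
    "in": lambda value, options: value in options,
}

def find(documents, conditions):
    """Return the documents that satisfy every (field, op, expected) condition."""
    results = []
    for doc in documents:
        # A document without the field simply does not match.
        if all(
            field in doc and OPERATORS[op](doc[field], expected)
            for field, op, expected in conditions
        ):
            results.append(doc)
    return results

# Documents in one collection do not need to share the same fields.
documents = [
    {"name": "alice", "age": 34, "tags": ["admin"]},
    {"name": "bob", "age": 27},
    {"name": "carol", "age": 41, "city": "Osaka"},
]

over_30 = find(documents, [("age", "gt", 30)])
over_30_in_osaka = find(documents, [("age", "gt", 30), ("city", "eq", "Osaka")])
```

Adding the `city` field to one document required no schema change, and the query on `age` goes beyond an exact key lookup, which is what separates this model from a key-value store.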

3. Column-oriented database

   Cassandra, HBase, and HyperTable belong to this type. Because of the explosive growth of data volumes in recent years, this type of NoSQL database has attracted particular attention.

   Ordinary relational databases store data in units of rows and are good at reading data row by row, for example retrieving the records that match specific conditions. For this reason, relational databases are also called row-oriented databases. In contrast, column-oriented databases store data in units of columns and are good at reading data column by column.
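
The difference between the two layouts can be shown with the same small data set stored both ways. The sample records and the `to_columns` helper are made up for this sketch, which assumes every row has the same fields.

```python
# Row-oriented layout: each record is stored together.
rows = [
    {"id": 1, "name": "alice", "age": 34},
    {"id": 2, "name": "bob", "age": 27},
    {"id": 3, "name": "carol", "age": 41},
]

def to_columns(rows):
    """Convert a list of row dicts into a dict of column lists."""
    columns = {}
    for row in rows:
        for field, value in row.items():
            columns.setdefault(field, []).append(value)
    return columns

# Column-oriented layout: all values of one field are stored together.
columns = to_columns(rows)

# Reading one whole record is a single access in the row layout ...
second_record = rows[1]

# ... while an aggregate over one field touches only that column.
average_age = sum(columns["age"]) / len(columns["age"])
```

In the row layout, computing the average age would mean visiting every full record. In the column layout, the names and ids are never read at all.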

The column-oriented database is highly scalable: even as the data grows, processing speed (especially write speed) does not drop, so it is mainly used where a large amount of data must be processed. It is also very useful as storage for batch programs that update large amounts of data. However, because the column-oriented way of thinking is very different from that of today's databases, it is difficult to apply.

 

Summary: Relational databases and NoSQL databases are not opposed but complementary. Use a relational database by default, and use a NoSQL database where NoSQL fits, so that NoSQL databases make up for the weaknesses of relational databases.
