Why do you need NoSQL?

http://www.infoq.com/cn/news/2011/01/nosql-why

[Editor's note] NoSQL boomed in 2010. In the pursuit of high performance and high reliability, websites large and small almost instinctively made NoSQL technology their first choice. At the start of this year, InfoQ China is honored to have invited Mr. Sun Li from Phoenix.com to share his experience and insights on NoSQL.


I am very honored to be invited to open a column about NoSQL on InfoQ, a technology media outlet that I respect very much. I also hope to use InfoQ to promote the development of NoSQL in China, and I hope that friends who share my interest will join in. This column series will first introduce NoSQL as a whole, then explain how to apply NoSQL to suitable scenarios in your own projects, and will also analyze some successful cases where appropriate. I hope that friends who have used NoSQL successfully can send me leads and material.

NoSQL concept

With the rapid development of web 2.0, non-relational, distributed data stores have grown quickly. These stores do not guarantee the ACID properties of relational databases. The concept of NoSQL was proposed in 2009. The most common interpretation of NoSQL is "non-relational", and "Not Only SQL" is also accepted by many people. (The term "NoSQL" was first used in 1998 as the name of a lightweight relational database.)

The NoSQL systems we use most are key-value stores, but there are also document, column-oriented, graph, and XML databases, among others. Databases of these kinds, such as cdb, qdbm, and bdb, were already used in various systems before the NoSQL concept appeared, but they were rarely used in web applications.

Bottlenecks of traditional relational databases

Traditional relational databases perform well, are stable and time-tested, are simple to use and powerful, and have accumulated a large number of success stories. In the Internet field, MySQL has become the undisputed king. It is no exaggeration to say that MySQL has made an outstanding contribution to the development of the Internet.

In the 1990s, the number of visits to a website was generally not large, and a single database could easily handle it. At that time, there were more static web pages, and there were not many dynamic interactive websites.

In the last 10 years, websites have grown rapidly. Popular forums, blogs, social networks (SNS), and Weibo have gradually led the trend in the web field. In the early days, forum traffic was not very large. If you got onto the Internet early, you may remember that some forum programs of that time stored their data in text files, which gives you an idea of how little traffic a typical forum had.

Memcached+MySQL

Later, as traffic increased, almost every website built on MySQL began to run into database performance problems. Web programs could no longer focus only on features; they also had to pursue performance. Programmers began to use caching heavily to relieve pressure on the database, and to optimize the database's schema and indexes. At first, file caches were the popular way to relieve database pressure, but as traffic kept growing, multiple web machines could not share a file cache, and large numbers of small cache files also caused relatively high IO pressure. At this point, Memcached naturally became a very fashionable technology.

As an independent distributed cache server, Memcached provides a shared, high-performance cache service for multiple web servers. To scale across several Memcached servers, keys were first distributed among them with a hash algorithm. Consistent hashing then appeared to solve the drawback of that approach: adding or removing a cache server forces a re-hash, which invalidates a large share of the cache. At that time, if you said in an interview that you had Memcached experience, it would definitely earn you extra points.
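To illustrate why consistent hashing limits cache invalidation, here is a small sketch of a hash ring with virtual nodes. It is not how any particular Memcached client implements it; the class name `HashRing`, the server names, the use of MD5, and the choice of 100 virtual nodes per server are all assumptions made for the example.

```python
import bisect
import hashlib


def _hash(value: str) -> int:
    """Map a string to a position on the ring."""
    return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16)


class HashRing:
    def __init__(self, nodes, replicas=100):
        self.replicas = replicas  # virtual nodes per server
        self._points = []         # sorted ring positions
        self._owner = {}          # ring position -> server name
        for node in nodes:
            self.add_node(node)

    def add_node(self, node):
        for i in range(self.replicas):
            point = _hash(f"{node}#{i}")
            self._owner[point] = node
            bisect.insort(self._points, point)

    def get_node(self, key):
        """Return the first server clockwise from the key's position."""
        idx = bisect.bisect(self._points, _hash(key)) % len(self._points)
        return self._owner[self._points[idx]]


keys = [f"user:{i}" for i in range(1000)]
ring = HashRing(["cache-a", "cache-b", "cache-c"])
before = {k: ring.get_node(k) for k in keys}
ring.add_node("cache-d")
after = {k: ring.get_node(k) for k in keys}
moved = [k for k in keys if before[k] != after[k]]
```

Every key in `moved` now belongs to the new server `cache-d`; keys owned by the other servers stay where they were, so only the entries that migrate become cache misses.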

MySQL master-slave replication and read-write separation

As write pressure on the database grew, Memcached could only relieve the read pressure. With reads and writes concentrated on a single database, the database became overwhelmed, and most websites began to use master-slave replication to separate reads from writes, improving read and write performance and the scalability of the read databases. MySQL's master-slave mode became standard for websites at this point.
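Read-write separation is usually applied in the application or in a proxy in front of the databases. The sketch below shows only the routing decision; `ReadWriteRouter` and the server names are hypothetical, and classifying statements by their first keyword is a simplification.

```python
import itertools


class ReadWriteRouter:
    """Send reads to replicas in round-robin order and writes to the master."""

    READ_VERBS = ("SELECT", "SHOW")

    def __init__(self, master, replicas):
        self.master = master
        self._replicas = itertools.cycle(replicas)

    def route(self, sql):
        parts = sql.split()
        verb = parts[0].upper() if parts else ""
        if verb in self.READ_VERBS:
            return next(self._replicas)
        return self.master


router = ReadWriteRouter("master", ["replica-1", "replica-2"])
target = router.route("INSERT INTO posts (title) VALUES ('hello')")
```

Adding replicas scales reads, but every write still lands on the single master. MySQL replication is asynchronous by default, so a read issued right after a write may return stale data from a replica.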

Splitting tables and databases (sharding)

As web 2.0 continued to develop rapidly, and with Memcached caching, MySQL master-slave replication, and read-write separation already in place, the write pressure on the MySQL master began to become the bottleneck while data volumes kept growing sharply. Because MyISAM uses table-level locks, it suffers serious lock contention under high concurrency, so a large number of high-concurrency MySQL applications began to use the InnoDB engine instead of MyISAM. At the same time, splitting tables and databases became a popular way to relieve write pressure and cope with data growth. It became a hot technology, a hot interview question, and a hot topic of discussion in the industry. It was also at this time that MySQL introduced table partitioning, which was not yet stable but brought hope to companies with average technical strength. MySQL also launched MySQL Cluster, but it has almost no success stories in the Internet industry: its performance cannot meet Internet requirements, although it does provide very strong high-reliability guarantees.
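A common way to split tables and databases is to derive the target database and table from the record's ID. The sketch below assumes a simple modulo rule with 4 databases of 8 tables each; the counts, the function name `locate`, and the `user_db_N` / `user_N` naming scheme are invented for illustration.

```python
DB_COUNT = 4        # assumed number of databases
TABLES_PER_DB = 8   # assumed number of tables inside each database


def locate(user_id):
    """Return the (database, table) pair that stores a given user."""
    slot = user_id % (DB_COUNT * TABLES_PER_DB)
    db_index = slot // TABLES_PER_DB
    table_index = slot % TABLES_PER_DB
    return f"user_db_{db_index}", f"user_{table_index}"


location = locate(33)  # slot 1, so database 0 and table 1
```

Because each database holds only part of the rows, writes are spread across several masters. Queries that span users, such as joins or global sorts, now have to be assembled in the application.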

The scalability bottleneck of MySQL

On the Internet, most MySQL deployments should be IO-intensive. In fact, if your MySQL is CPU-intensive, it very likely has a design problem that hurts performance and needs to be optimized. Developing MySQL applications for large data volumes and high concurrency is becoming more and more complex and technically challenging. Choosing the rules for splitting tables and databases takes experience. Although a technically strong company like Taobao has developed a transparent middleware layer to hide this complexity from developers, the complexity of the overall architecture cannot be avoided. Databases that have already been split run into expansion problems again at a certain stage, and a change in requirements may call for a new way of splitting.
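The expansion problem can be made concrete with a small calculation. Assuming rows are routed with a plain `id % shard_count` rule (an assumption for this sketch, not something the article specifies), growing from 4 to 5 shards changes the home shard of most rows:

```python
OLD_SHARDS = 4
NEW_SHARDS = 5

ids = range(10_000)
# A row must be migrated when its shard under the new rule differs.
moved = sum(1 for i in ids if i % OLD_SHARDS != i % NEW_SHARDS)
moved_fraction = moved / len(ids)
```

Here `moved_fraction` is 0.8: four out of five rows have to be copied to a different shard. This is one reason teams often grow shard counts by doubling, or adopt a routing table or consistent hashing instead of a bare modulo.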

MySQL databases often store large text fields, which makes the tables very large, makes database recovery very slow, and makes it hard to restore a database quickly. For example, 10 million 4KB texts add up to nearly 40GB. If this data can be moved out of MySQL, the MySQL database becomes very small.
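The size estimate can be checked with a quick calculation, taking 4KB as 4096 bytes (an assumption; the variable names are illustrative):

```python
ROW_COUNT = 10_000_000
TEXT_BYTES = 4 * 1024  # 4KB per text field

total_bytes = ROW_COUNT * TEXT_BYTES
total_gb = total_bytes / 10**9   # decimal gigabytes
total_gib = total_bytes / 2**30  # binary gibibytes
```

This gives about 41 GB in decimal units, or about 38 GiB in binary units, before counting indexes and row overhead.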

Relational databases are powerful, but they are not well suited for all application scenarios. MySQL's poor scalability (requiring complex technologies to implement), high IO pressure under big data, and difficulty in changing the table structure are exactly the problems faced by developers currently using MySQL.

Advantages of NoSQL

Easy to scale

There are many types of NoSQL databases, but their common feature is that they drop the relational features of relational databases. With no relationships between pieces of data, they are very easy to scale out, which in turn brings scalability at the architectural level.

Large data volumes, high performance

NoSQL databases have very high read and write performance, and they maintain it even with large amounts of data. This comes from their non-relational nature and the simple structure of the database. MySQL typically relies on the Query Cache, which is invalidated every time a table is updated. It is a coarse-grained cache, so in web 2.0 applications with frequent interaction it is not very effective. NoSQL caching is record-level, a fine-grained cache, so NoSQL performs much better in this respect.
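The difference in cache granularity can be sketched with two toy caches. Neither class reflects the real internals of MySQL or of any NoSQL product; `TableLevelCache`, `RecordLevelCache`, and their method names are hypothetical and only model the invalidation behaviour described above.

```python
class TableLevelCache:
    """Coarse-grained: any write to the table drops every cached result."""

    def __init__(self):
        self._results = {}

    def put(self, query, rows):
        self._results[query] = rows

    def on_table_write(self):
        self._results.clear()

    def __len__(self):
        return len(self._results)


class RecordLevelCache:
    """Fine-grained: a write drops only the record that changed."""

    def __init__(self):
        self._records = {}

    def put(self, key, value):
        self._records[key] = value

    def on_record_write(self, key):
        self._records.pop(key, None)

    def __len__(self):
        return len(self._records)


table_cache = TableLevelCache()
record_cache = RecordLevelCache()
for user_id in (1, 2, 3):
    table_cache.put(f"SELECT * FROM users WHERE id = {user_id}", {"id": user_id})
    record_cache.put(f"user:{user_id}", {"id": user_id})

# A single user is updated.
table_cache.on_table_write()
record_cache.on_record_write("user:2")
surviving = (len(table_cache), len(record_cache))
```

After one update the table-level cache is empty while the record-level cache still holds the two untouched users, which is why a write-heavy workload sees far fewer hits from a coarse-grained cache.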

Flexible data model

NoSQL does not require you to define fields in advance for the data you want to store, and it can store custom data formats at any time. In a relational database, adding and deleting fields is very troublesome, and on a table with a very large amount of data, adding a field is a nightmare. This is especially evident in the web 2.0 era of large data volumes.
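As a rough illustration of storing data without predefined fields, the sketch below keeps JSON documents in a dictionary that stands in for a key-value store. The `save` and `load` helpers and the field names are hypothetical.

```python
import json

store = {}  # stand-in (assumption) for a key-value store


def save(key, document):
    store[key] = json.dumps(document)


def load(key):
    return json.loads(store[key])


# An older record written before the new field existed.
save("user:1", {"name": "alice"})
# A newer record simply carries an extra field; no schema change is needed.
save("user:2", {"name": "bob", "followers": 120})

# Readers supply a default for records that lack the field.
followers = [load(k).get("followers", 0) for k in ("user:1", "user:2")]
```

The trade-off is that the application, rather than the database, becomes responsible for handling records that are missing a field.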

High availability

NoSQL can implement a high-availability architecture fairly easily without much impact on performance. For example, Cassandra and HBase can achieve high availability through their replication models.

Summary

The emergence of NoSQL databases has made up for the deficiencies of relational databases (such as MySQL) in some areas, and in those areas it can greatly reduce development and maintenance costs.

MySQL and NoSQL have their own characteristics and application scenarios. The close combination of the two will bring new ideas to the database development of web 2.0. Let relational databases focus on relationships and NoSQL on storage.

Reference reading
1. NoSQL: http://nosql-database.org/
2. NoSQL introduction on wiki: http://en.wikipedia.org/wiki/NoSQL
3. NoSQL related blog: http://nosql.mypopescu.com/
4. NoSQL related blog: http://blog.nosqlfan.com/
5. Sina Weibo NoSQL group: http://qtsina.com.cn/127870





http://www.infoq.com/cn/articles/tq-why-choose-redis

Research on Redis
