NoSQL-related knowledge
- 1. The evolution of stand-alone MySQL
- 2. Overview of NoSQL
- 3. 3V + 3 highs
- 4. Technical overview
- 5. Four categories of NoSQL
1. The evolution of stand-alone MySQL
1. Stand-alone MySQL (evolution 1)
1.1 User access process
App --> DAL --> MySQL
1.2 Background
A basic website generally gets little traffic, so a single database is entirely sufficient.
2. Cache (evolution 2)
2.1 Structure
Memcached (caching) + MySQL + vertical split
2.2 Introduction
Since about 80% of a website's traffic is reads, going to the database for every query is expensive. To reduce the pressure on the database, a cache can be used to keep reads efficient.
2.3 Development process
Optimize data structures and indexes --> file cache (IO) --> Memcached (the most popular technology at the time)
2.4 Functions
Implement read/write separation and caching
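The idea of putting a cache in front of MySQL can be sketched as a cache-aside read path. This is only a minimal illustration, not a real Memcached or MySQL client: the class CachedUserStore and the plain dicts standing in for the database and the cache are hypothetical.

```python
class CachedUserStore:
    """Cache-aside sketch: dicts stand in for MySQL and Memcached (assumption)."""

    def __init__(self, database):
        self.database = database  # plays the role of MySQL
        self.cache = {}           # plays the role of Memcached
        self.db_reads = 0         # how often a read fell through to the database

    def get(self, user_id):
        # Serve from the cache when possible.
        if user_id in self.cache:
            return self.cache[user_id]
        # Cache miss: read the database and remember the result.
        self.db_reads += 1
        value = self.database.get(user_id)
        if value is not None:
            self.cache[user_id] = value
        return value

    def update(self, user_id, name):
        # Write to the database, then invalidate the stale cached copy.
        self.database[user_id] = name
        self.cache.pop(user_id, None)


store = CachedUserStore({1: "alice", 2: "bob"})
store.get(1)  # miss: goes to the database
store.get(1)  # hit: served from the cache
print(store.db_reads)
```

Because most traffic is reads, repeated lookups of the same key are served from memory and only the first one reaches the database.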
3. Cluster (evolution 3)
3.1 Structure
Database and table sharding + horizontal split + MySQL cluster + cache
3.2 Introduction
MyISAM: uses table-level locks, which greatly reduce efficiency; serious lock contention occurs under high concurrency.
InnoDB: uses row-level locks.
3.3 Process
Shard databases and tables to relieve write pressure; MySQL also introduced the concept of table partitioning.
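A horizontal split can be illustrated by routing each record to a shard based on a hash of its key. The sketch below is a simplified assumption, not how any particular MySQL middleware works: SHARD_COUNT, shard_for, save_user, and load_user are hypothetical names, and each shard is just a dict.

```python
import zlib

SHARD_COUNT = 4  # hypothetical number of database shards

# Each dict stands in for one database instance (assumption).
shards = [dict() for _ in range(SHARD_COUNT)]


def shard_for(user_id, shard_count=SHARD_COUNT):
    """Map a key to a shard index with a stable hash, so routing is repeatable."""
    return zlib.crc32(user_id.encode("utf-8")) % shard_count


def save_user(user_id, record):
    # Writes for different users land on different shards, spreading write load.
    shards[shard_for(user_id)][user_id] = record


def load_user(user_id):
    # Reads use the same routing rule to find the right shard.
    return shards[shard_for(user_id)].get(user_id)


save_user("user-1001", {"name": "alice"})
print(shard_for("user-1001"), load_user("user-1001"))
```

A stable hash such as CRC32 is used here instead of Python's built-in hash(), whose result for strings changes between interpreter runs.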
3.4 Function
4. Now (evolution 4)
4.1 Structure
Load balancing + database and table sharding + horizontal split + MySQL cluster + cache + various servers
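As a rough illustration of the load-balancing layer in this structure, a round-robin balancer hands each incoming request to the next server in turn. The class name RoundRobinBalancer and the server names are hypothetical; real load balancers also handle health checks and weighting, which this sketch leaves out.

```python
import itertools


class RoundRobinBalancer:
    """Hand out servers in a fixed rotation (simplified sketch)."""

    def __init__(self, servers):
        servers = list(servers)
        if not servers:
            raise ValueError("at least one server is required")
        self._rotation = itertools.cycle(servers)

    def next_server(self):
        # Each call returns the next server, wrapping around at the end.
        return next(self._rotation)


balancer = RoundRobinBalancer(["app-1", "app-2", "app-3"])
print([balancer.next_server() for _ in range(4)])
```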
4.2 Background
The amount of data is large and changes rapidly, so relational databases such as MySQL are no longer enough on their own.
4.3 Function
2. Overview of NoSQL
1. Introduction
NoSQL ("Not Only SQL") generally refers to non-relational databases. With the rise of Web 2.0, traditional relational databases struggled to handle large-scale, highly concurrent community sites, so NoSQL has developed rapidly in today's big data environment. Many kinds of data, such as users' personal information, social networks, and geographic locations, do not need a fixed format for storage.
2. Features
(1) Easy to scale (there are no relationships between data items, so it is easy to scale out)
(2) Large data volume and high performance (Redis is commonly quoted at about 80,000 writes and 110,000 reads per second; NoSQL caches at the record level, which is fine-grained, so performance is relatively high)
(3) Diverse data types (no need to design the database schema in advance; data can be stored as it comes)
3. The difference between RDBMS and NoSQL
3.1 RDBMS (Relational Database)
Structured organization, SQL, data and relationships are stored in separate tables, adhere to ACID rules, etc.
3.2 NoSQL (non-relational database)
Not just structured data; no fixed query language; key-value, column, document, and graph storage;
eventual consistency, the CAP theorem and BASE; high performance, high availability, and high scalability.
3. 3V + 3 highs
1. 3V in the era of big data
(1) Volume (massive)
(2) Variety (diverse)
(3) Velocity (real-time)
2. Three highs in the era of big data
(1) High concurrency
(2) High scalability
(3) High performance
4. Technical overview
1. Basic product information
1.1 Scenarios
Name, price, business information
1.2 Technology
Relational database (MySQL/Oracle)
2. Product description and comments
2.1 Scenarios
Large amounts of text
2.2 Technology
Document database (MongoDB)
3. Pictures
3.1 Scenarios
Storing pictures
3.2 Technology
Distributed File System FastDFS
Taobao File System TFS
Google File System GFS
Hadoop File System HDFS
Alibaba Cloud File System OSS
4. The keywords of the product
4.1 Scenarios
search
4.2 Technology
Search engines: Solr, Elasticsearch, ISearch
5. Peak-period information for popular products
5.1 Scenarios
Product sales
5.2 Technology
In-memory databases: Redis, Tair, Memcached...
6. Commodity transactions, external payment interface
Third-party applications
5. Four categories of NoSQL
1. KV key-value pair
1.1 Example
Sina: Redis
Meituan: Redis + Tair
Alibaba, Baidu: Redis + Memcached
1.2 Application scenarios
Content caching; mainly used to handle high access loads on large amounts of data, and also used in some logging systems.
1.3 Data Model
A key maps to a value (a key-value pair), usually implemented with a hash table.
1.4 Advantages
Fast lookups
1.5 Disadvantages
The data is unstructured and usually only treated as string or binary data
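The key-value model described above can be sketched with a Python dict, which is itself a hash table. The class KeyValueStore is a hypothetical illustration and not the API of Redis, Tair, or Memcached. It also shows the disadvantage just mentioned: the store sees only opaque bytes, so the application has to serialize and interpret any structure itself.

```python
import json


class KeyValueStore:
    """Tiny in-memory key-value store; values are opaque bytes to the store."""

    def __init__(self):
        self._table = {}  # a Python dict is a hash table

    def set(self, key, value):
        if not isinstance(value, (bytes, bytearray)):
            raise TypeError("values must be bytes")
        self._table[key] = bytes(value)

    def get(self, key, default=None):
        # Lookup by key is a single hash-table access.
        return self._table.get(key, default)

    def delete(self, key):
        # Returns True if the key existed.
        return self._table.pop(key, None) is not None


store = KeyValueStore()
# The application serializes structure; the store cannot look inside it.
store.set("user:1", json.dumps({"name": "alice"}).encode("utf-8"))
print(json.loads(store.get("user:1")))
```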
2. Document database (BSON format, which is similar to JSON)
2.1 Examples
MongoDB, CouchDB
MongoDB is a database based on distributed file storage. It is written in C++ and is mainly used to process large numbers of documents. MongoDB sits between relational and non-relational databases: of the non-relational databases, it is the one most like a relational database.
2.2 Application scenarios
Web applications (similar to key-value, but the value is structured and, unlike a key-value store, the database can understand the content of the value)
2.3 Data Model
Key-value pairs in which the value is structured data.
2.4 Advantages
The data structure requirements are loose: the table structure can vary, and there is no need to define it in advance as in a relational database.
2.5 Disadvantages
Query performance is not high, and there is no unified query syntax.
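To make the contrast with a plain key-value store concrete, the sketch below treats a collection as a list of dicts and queries on fields inside the documents. It is an illustration only, not MongoDB's or CouchDB's query API; the function find and the sample products are hypothetical.

```python
def find(collection, **criteria):
    """Return the documents whose fields equal every given criterion."""
    return [
        doc
        for doc in collection
        if all(key in doc and doc[key] == value for key, value in criteria.items())
    ]


# Documents in one collection do not need to share the same fields.
products = [
    {"_id": 1, "name": "phone", "price": 699, "tags": ["electronics"]},
    {"_id": 2, "name": "novel", "author": "someone"},
]

print(find(products, name="phone"))
print(find(products, author="someone"))
```

The second document has an author field and no price, which a fixed relational table would not allow without a schema change.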
3. Column store database
3.1 Example
HBase (big data)
3.2 Application scenarios
distributed file system
3.3 Data Model
Data is stored in column families; data in the same column is stored together.
3.4 Advantages
The search speed is fast, the scalability is strong, and it is easier to perform distributed expansion.
3.5 Disadvantages
Relatively limited functions
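The idea of storing data from the same column together can be sketched by converting a row layout into a column layout. This is a deliberately simplified picture and does not reflect HBase's actual storage format; the function to_columns and the sample rows are hypothetical, and every row is assumed to have the same fields.

```python
rows = [
    {"id": 1, "name": "phone", "price": 699},
    {"id": 2, "name": "novel", "price": 15},
    {"id": 3, "name": "lamp", "price": 120},
]


def to_columns(rows):
    """Regroup row-oriented records so each column's values sit together."""
    columns = {}
    for row in rows:
        for field, value in row.items():
            columns.setdefault(field, []).append(value)
    return columns


columns = to_columns(rows)
# Reading one column touches only that column's values, not whole rows.
print(sum(columns["price"]))
```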
4. Graph database
It does not store pictures; it stores relationships between entities.
4.1 Examples
Neo4j, InfoGrid, InfiniteGraph
4.2 Application scenarios
Social networks, recommendation systems, etc., focusing on building relationship graphs
4.3 Data Model
graph structure
4.4 Advantages
Can use graph algorithms, such as shortest-path finding and N-degree relationship search.
4.5 Disadvantages
In many cases, it is necessary to calculate the entire graph to obtain the required information, and this structure is not easy to distribute.
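The two algorithms mentioned under advantages, shortest path and N-degree relationship search, can be sketched with breadth-first search over an adjacency list. This is an in-memory illustration rather than the API of Neo4j or any other graph database; the functions shortest_path and within_degrees and the sample friends graph are hypothetical.

```python
from collections import deque


def shortest_path(graph, start, goal):
    """Breadth-first search; returns the list of nodes on a shortest path, or None."""
    if start == goal:
        return [start]
    previous = {start: None}  # also serves as the visited set
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbor in graph.get(node, ()):
            if neighbor in previous:
                continue
            previous[neighbor] = node
            if neighbor == goal:
                # Walk back from the goal to the start to rebuild the path.
                path = [goal]
                while previous[path[-1]] is not None:
                    path.append(previous[path[-1]])
                return path[::-1]
            queue.append(neighbor)
    return None


def within_degrees(graph, start, n):
    """Nodes reachable from start in at most n hops, excluding start itself."""
    seen = {start}
    frontier = {start}
    for _ in range(n):
        frontier = {
            neighbor
            for node in frontier
            for neighbor in graph.get(node, ())
            if neighbor not in seen
        }
        seen |= frontier
    return seen - {start}


friends = {
    "ann": ["bob"],
    "bob": ["ann", "cat"],
    "cat": ["bob", "dan"],
    "dan": ["cat"],
}
print(shortest_path(friends, "ann", "dan"))
print(sorted(within_degrees(friends, "ann", 2)))
```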