When crawling data, we may need to cache large amounts of it, but the data is stored and read back without any complex relational operations. For this we choose a NoSQL database, which is easier to operate than a traditional relational database. Here I would like to talk about MongoDB, which is currently very popular as a cache database.
What is NoSQL?
NoSQL (NoSQL = Not Only SQL) means "not only SQL". The term refers to non-relational databases.
Modern computing systems generate a huge amount of data on the network every day.
A large part of this data is handled by relational database management systems (RDBMS). The relational model was proposed in E. F. Codd's 1970 paper "A relational model of data for large shared data banks", and it made data modeling and application programming easier.
The proven relational model is well suited to client-server programming and has delivered benefits far beyond what was expected. Today it is the dominant technology for structured data storage in networked business applications. However, in many cases the data set is too large to store on a single server and has to be spread across multiple servers. The relational model does not support this kind of scaling well, because a query that joins multiple tables may need data that lives on different servers. NoSQL databases, by contrast, are usually schemaless and are designed from the start to be sharded across servers without relying on joins. There are several kinds of NoSQL database: column stores (e.g. HBase), key-value stores (e.g. Redis), document-oriented databases (e.g. MongoDB) and graph databases (e.g. Neo4j).
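To make the idea of spreading records across servers without joins more concrete, here is a minimal sketch in plain Python. It is only an illustration of hash-based placement, not how any particular database implements sharding; the server names, record keys and the functions pick_server and shard are all made up for this example.

```python
import hashlib

# Hypothetical server names, used only for illustration.
SERVERS = ["server-a", "server-b", "server-c"]


def pick_server(key, servers):
    """Map a record key to one server using a stable hash of the key."""
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return servers[int(digest, 16) % len(servers)]


def shard(records, servers):
    """Group whole records by the server that owns their key."""
    placement = {name: {} for name in servers}
    for key, record in records.items():
        placement[pick_server(key, servers)][key] = record
    return placement


# Each record is self-contained, so reading it never needs a second server.
records = {
    "page:1": {"url": "http://example.com/a", "status": 200},
    "page:2": {"url": "http://example.com/b", "status": 404},
    "page:3": {"url": "http://example.com/c", "status": 200},
}

placement = shard(records, SERVERS)
for server, owned in placement.items():
    print(server, sorted(owned))
```

Because each record is stored whole under a single key, looking it up only involves the one server that pick_server returns, which is the property that a multi-table join across servers lacks.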
NoSQL advantages and disadvantages
Advantages:
- High Scalability
- Distributed Computing
- Low cost
- Architectural flexibility, semi-structured data
- No complicated relationships
Disadvantages:
- No standardization
- Limited query capabilities (so far)
- Eventual consistency is not intuitive to program against
NoSQL database classification
| Type | Some representatives | Features |
| --- | --- | --- |
| Column store | HBase, Cassandra, Hypertable | As the name suggests, data is stored by column. Its main strengths are that it stores structured and semi-structured data easily, compresses data easily, and has a very large I/O advantage for queries that touch only one or a few columns. |
| Document store | MongoDB, CouchDB | Documents are generally stored in a JSON-like format, and the stored content is document-typed. This makes it possible to index certain fields and so provide some of the features of a relational database. |
| Key-value store | Tokyo Cabinet / Tyrant, Berkeley DB, MemcacheDB, Redis | A value can be looked up quickly by its key. In general the store does not care about the format of the value and stores it as-is. (Redis includes other features.) |
| Graph store | Neo4j, FlockDB | Best for storing graph relationships. Solving these problems with a traditional relational database gives poor performance and a design that is inconvenient to use. |
| Object store | db4o, Versant | Data is manipulated with syntax similar to an object-oriented language and accessed in the form of objects. |
| XML database | Berkeley DB XML, BaseX | Stores XML data efficiently and supports XML-native query syntax such as XQuery and XPath. |
Well, so much for NoSQL. Now that we have a basic understanding of non-relational databases, the following sections introduce MongoDB specifically.
What is MongoDB?
MongoDB is written in C++. It is an open source database system based on distributed file storage.
Under high load, more nodes can be added to keep up server performance.
MongoDB is designed to provide a scalable, high-performance data storage solution for web applications.
MongoDB stores data as documents, which are data structures made up of key-value (key => value) pairs. MongoDB documents are similar to JSON objects. Field values can contain other documents, arrays, and arrays of documents.
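As a rough illustration of that document shape, the sketch below builds one record as a plain Python dictionary and round-trips it through JSON. It does not talk to MongoDB at all, and every field name and value in it is invented for the example.

```python
import json

# A hypothetical crawled-page record; all field names are made up.
page = {
    "url": "http://example.com/article/1",
    "title": "Example article",
    # A field whose value is another document.
    "author": {"name": "Alice", "email": "alice@example.com"},
    # A field whose value is an array.
    "tags": ["nosql", "mongodb"],
    # A field whose value is an array of documents.
    "comments": [
        {"user": "bob", "text": "Nice post"},
        {"user": "carol", "text": "Thanks"},
    ],
}

# The nested structure survives a round trip through JSON text unchanged.
as_json = json.dumps(page, indent=2)
restored = json.loads(as_json)

print(restored["author"]["name"])
print([comment["user"] for comment in restored["comments"]])
```

Everything that belongs to the page, including its author and comments, lives in one nested record, so reading it back needs no joins.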
What are the main features of MongoDB?
- MongoDB is a document-oriented database, which makes it simple and easy to work with.
- You can index any attribute of a MongoDB record to enable faster sorting.
- You can create mirrors of the data locally or over the network, which gives MongoDB stronger scalability.
- If the load increases, data can be distributed across other nodes in the network (this is called sharding).
- MongoDB supports rich query expressions. Queries are written in a JSON-like form and can easily reach objects and arrays embedded in documents.
- MongoDB uses the update() command to replace a whole document (record) or only some specified fields.
- Map/reduce in MongoDB is mainly used for batch processing and aggregation of data.
- The map function calls emit(key, value) as it walks through all records in the collection, and the keys and values are passed to the reduce function for processing.
- The map and reduce functions are written in JavaScript, and a MapReduce operation can be run through db.runCommand or the mapReduce command.
- GridFS is a built-in feature of MongoDB that can be used to store large numbers of files, including files larger than the 16 MB document size limit.
- MongoDB allows scripts to run on the server: you can write a function in JavaScript and execute it directly on the server, or store the function definition on the server and call it directly next time.
- MongoDB supports a variety of programming languages, including Ruby, Python, Java, C++, PHP and C#.
- MongoDB is simple to install.
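The map/reduce flow from the list above can be sketched in plain Python. This only imitates the idea of emit(key, value) followed by a reduce step; it is not MongoDB's JavaScript API, and the sample records and the names map_record, reduce_values and map_reduce are assumptions made for this example.

```python
from collections import defaultdict

# Hypothetical crawl records: which site a page came from and its size.
records = [
    {"site": "example.com", "bytes": 1200},
    {"site": "example.org", "bytes": 800},
    {"site": "example.com", "bytes": 300},
]


def map_record(record, emit):
    """Map step: emit one (key, value) pair per record."""
    emit(record["site"], record["bytes"])


def reduce_values(key, values):
    """Reduce step: combine all values emitted for one key."""
    return sum(values)


def map_reduce(records, map_fn, reduce_fn):
    """Run map over every record, group emitted values by key, then reduce."""
    grouped = defaultdict(list)

    def emit(key, value):
        grouped[key].append(value)

    for record in records:
        map_fn(record, emit)
    return {key: reduce_fn(key, values) for key, values in grouped.items()}


totals = map_reduce(records, map_record, reduce_values)
print(totals)  # {'example.com': 1500, 'example.org': 800}
```

In MongoDB itself the two functions would be written in JavaScript and run on the server, but the grouping by emitted key works along the same lines.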
MongoDB download
You can download the MongoDB installation package from the official website: https://www.mongodb.com/download-center#community .
MongoDB installation
Step 1: Run the installer: mongodb-win32-x86_64-2008plus-ssl-3.0.1-signed.msi
Step 2: Accept the license agreement
Step 3: Choose Custom installation
Step 4: Continue to the next step
Step 5: Finish
MongoDB configuration
First, create the directory where the database files will be stored
For example, d:\mongodb\data\db. The folder for the database files must be created before you start the MongoDB service, because the command does not create it automatically and the service cannot start without it.
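If you prefer to script this step, the following is a minimal Python sketch, not part of the official setup. It creates the data directory and builds the start command with the same --dbpath option used in this article, without actually running mongod. The helper names ensure_db_dir and mongod_command are invented here, and a temporary directory stands in for the d: drive so the sketch runs on any machine.

```python
import tempfile
from pathlib import Path


def ensure_db_dir(db_dir):
    """Create the data directory (and any missing parents) if it does not exist."""
    db_dir.mkdir(parents=True, exist_ok=True)
    return db_dir


def mongod_command(db_dir):
    """Build the mongod start command as an argument list, without running it."""
    return ["mongod", "--dbpath", str(db_dir)]


# Assumption: a temporary directory replaces the real drive root for this demo.
base = Path(tempfile.mkdtemp())
db_dir = ensure_db_dir(base / "mongodb" / "data" / "db")

command = mongod_command(db_dir)
print(db_dir.is_dir())
print(command)
```

On a real installation you would pass the actual data path instead of the temporary one and hand the argument list to something like subprocess.run.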
Open a command prompt (Windows key + R, then type cmd) and go to the D:\mongodb\bin directory (first type d: to switch to drive D, then enter cd d:\mongodb\bin).
Enter the following command to start the MongoDB server:
D:\mongodb\bin>mongod --dbpath D:\mongodb\data\db
Then enter
mongod.exe --dbpath=d:\db
Finally, start the service (this works only if mongod has already been registered as a Windows service named mongodb)
net start mongodb