NoSQL Chapter 2 After-Class Exercises

2.4 Exercises
1. Basic knowledge
1. In a stand-alone environment, the mechanical hard disk is the biggest bottleneck for database read and write speed, and it deserves attention in practice.
2. To expand server capacity, you can expand vertically or expand horizontally.
3. Different code commands differ slightly in processing speed, so programmers need to understand them and choose carefully.
4. Moving data processing from hard disk reads and writes into memory is vertical expansion; spreading big data across the memory of different servers for processing is horizontal expansion.
5. NoSQL mainly solves the problem of data storage and processing speed in a big data environment.
6. A traditional relational database (TRDB) is good at handling structured data, and NoSQL is good at handling unstructured or semi-structured data.
7. TRDB imposes rules and integrity constraints on the data storage structure, whereas NoSQL generally imposes no row-level locks or foreign key constraints on it. TRDB reads data from the hard disk and reads and writes with the "row" as the basic unit; NoSQL has no concept of a "row" and reads and writes a "block" of data directly by its address. When reading and writing the same information, NoSQL is therefore generally faster than TRDB.
8. For big data analysis, NoSQL has at least technical and cost advantages over TRDB.
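
The horizontal expansion described in item 4 can be illustrated with a minimal Python sketch. It assumes a simple hash-and-modulo placement rule; the server names and the `put`/`get` helpers are hypothetical, and each dictionary merely stands in for the memory of one machine.

```python
import hashlib

def pick_server(key: str, servers: list[str]) -> str:
    """Map a key to one server by hashing it (simple modulo placement)."""
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return servers[int(digest, 16) % len(servers)]

# Hypothetical server names; each dict stands in for one machine's memory.
servers = ["node-a", "node-b", "node-c"]
memory = {name: {} for name in servers}

def put(key, value):
    # Store the value only on the server that owns this key.
    memory[pick_server(key, servers)][key] = value

def get(key):
    # Look the value up on the same server the key hashes to.
    return memory[pick_server(key, servers)].get(key)

for i in range(9):
    put(f"user:{i}", {"id": i})

print(get("user:3"))
```

Each key lives on exactly one server, so adding servers adds memory capacity. Real systems usually use more careful placement schemes than plain modulo, because changing the number of servers would otherwise move most keys.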

2. Comprehensive application

  1. What are the four limitations of running a database in a stand-alone environment?
    1) The speed bottleneck of reading and writing data on a single machine
    2) The limited amount of data that a single machine can store
    3) The subtle differences in speed between instructions on a single machine
    4) Security issues

  2. What is the core physical difference between centralized data processing and distributed data processing?
    Centralized data processing deploys the whole project on one machine, which places relatively high demands on that machine's performance.
    Distributed data processing spreads the project across different machines and runs it there, so the performance demands on each individual machine are lower.

  3. The 500-meter spherical radio telescope (FAST) in Guizhou Province was completed in 2016 and produces 5 TB of data per day. This massive data will be retained for more than 10 years. Assuming that each PC server can store 20 TB (ignoring system and other operational storage requirements, and with no backup), how many servers of this capacity should be deployed to store 10 years of data?
    (5 × 365 × 10) / 20 = 912.5, so 913 servers are needed (rounded up, because a fraction of a server cannot be deployed).
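
    The arithmetic can be checked with a short Python sketch. The variable names are illustrative, and leap days are ignored, as in the exercise.

```python
import math

daily_tb = 5             # data produced per day, in TB
days_per_year = 365      # leap days ignored, as in the exercise
years = 10
server_capacity_tb = 20  # usable storage per server, in TB

total_tb = daily_tb * days_per_year * years   # total data over 10 years
exact = total_tb / server_capacity_tb         # fractional number of servers
servers_needed = math.ceil(exact)             # round up to whole servers

print(total_tb, exact, servers_needed)  # prints: 18250 912.5 913
```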

  4. What are the main similarities and differences between cluster and distributed processing?
    A distributed system is necessarily a cluster, but a cluster is not necessarily distributed (it may be a centralized multi-machine backup). "Cluster" is only a concept about the number of machines.

  5. Briefly describe the data processing principle of a Master/Slave distributed database.
    The operations executed on the master database are recorded in binary log files and transferred to the replication server (Slave). The Slave then replays these log files, so that its data stays in sync with the master server (Master).
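
    The log-replay idea can be sketched with a toy in-memory model. This is only an illustration under simplifying assumptions: the `Master` and `Slave` classes are hypothetical, a Python list stands in for the binary log, and no real database or network is involved.

```python
class Master:
    def __init__(self):
        self.data = {}
        self.log = []            # stand-in for the binary log

    def write(self, key, value):
        self.data[key] = value
        self.log.append(("set", key, value))

    def delete(self, key):
        self.data.pop(key, None)
        self.log.append(("del", key, None))


class Slave:
    def __init__(self):
        self.data = {}
        self.applied = 0         # how many log entries have been replayed

    def sync(self, master):
        # Replay only the log entries that have not been applied yet.
        for op, key, value in master.log[self.applied:]:
            if op == "set":
                self.data[key] = value
            else:
                self.data.pop(key, None)
        self.applied = len(master.log)


master, slave = Master(), Slave()
master.write("a", 1)
master.write("b", 2)
master.delete("a")
stale = dict(slave.data)   # the slave has seen nothing before replay
slave.sync(master)
print(stale, slave.data, master.data)
```

    Until `sync` runs, the Slave lags behind the Master; after the replay, both hold the same data.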

  6. Briefly describe the three characteristics of the CAP theorem.
    1) Consistency. All nodes see the same data; this can be understood as the synchronous data replication function.
    2) Availability. Every request receives a response; this can be understood as being able to serve operations such as updates at any time.
    3) Partition tolerance. The system keeps working when the network between nodes is partitioned; this can be understood as still being able to read valid data at any time.

  7. Explain what ACID is.
    Atomicity (A), consistency (C), isolation (I), durability (D)
    ACID is the set of four properties with which TRDB guarantees data reliability when processing data in transactions.
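
    Atomicity, the first of these properties, can be seen in a small sketch using Python's standard `sqlite3` module. The `account` table, the names in it, and the `transfer` helper are hypothetical; a CHECK constraint is assumed here as the rule that makes the second transfer fail.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account ("
    "name TEXT PRIMARY KEY, "
    "balance INTEGER CHECK (balance >= 0))"
)
conn.execute("INSERT INTO account VALUES ('alice', 100), ('bob', 50)")
conn.commit()

def transfer(conn, src, dst, amount):
    """Move money between accounts; either both updates apply or neither does."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False

ok = transfer(conn, "alice", "bob", 30)       # both updates are committed
failed = transfer(conn, "alice", "bob", 500)  # CHECK fails, credit is rolled back
balances = dict(conn.execute("SELECT name, balance FROM account"))
print(ok, failed, balances)
```

    In the failed transfer the credit to the destination account has already been executed when the debit violates the constraint, yet the rollback undoes it, so no partial result is left behind.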

  8. Explain what BASE is.
    Basically available (BA), soft state (S), and eventual consistency (E)
    BASE is the set of three properties that NoSQL relies on for data handling in place of strict transactions: it relaxes immediate consistency in exchange for availability.
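
    Eventual consistency can be illustrated with a toy model. This is a sketch under simplifying assumptions, not how any particular NoSQL product works: the `EventuallyConsistentStore` class is hypothetical, and a queue stands in for asynchronous replication between replicas.

```python
from collections import deque

class EventuallyConsistentStore:
    def __init__(self, n_replicas=3):
        self.replicas = [{} for _ in range(n_replicas)]
        self.pending = deque()   # updates not yet delivered to other replicas

    def write(self, key, value):
        # The write is acknowledged once replica 0 has it (basically available).
        self.replicas[0][key] = value
        for index in range(1, len(self.replicas)):
            self.pending.append((index, key, value))

    def read(self, key, replica_index):
        # A read may return stale data while updates are pending (soft state).
        return self.replicas[replica_index].get(key)

    def propagate(self):
        # Deliver queued updates in order; afterwards all replicas agree.
        while self.pending:
            index, key, value = self.pending.popleft()
            self.replicas[index][key] = value

store = EventuallyConsistentStore()
store.write("x", 1)
before = [store.read("x", i) for i in range(3)]
store.propagate()
after = [store.read("x", i) for i in range(3)]
print(before, after)  # prints: [1, None, None] [1, 1, 1]
```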

  9. Lacking a database operation language similar to SQL, what are the deficiencies of NoSQL's data operation commands? (Name at least two)
    1) Does not support SQL queries
    2) Does not support transactions

Origin blog.csdn.net/m0_46202060/article/details/115250448