Touge Big Data Assignment 5: NoSQL and Cloud Database

Extracurricular homework five: NoSQL and cloud database

  • Assignment details

Content

1. SQL cloud database experiments

(1) "Quick Start with RDS in 10 Minutes" (KooLabs, Huawei Cloud official experiment platform): create a database named RDS followed by the full pinyin spelling of your name, create a data table named table followed by the full spelling of your name, and run the public network connectivity test.

(2) "How to Quickly Connect to the Cloud Database RDS MySQL" (Yunqi Lab, Alibaba Cloud official experiment platform): create a database named RDS followed by the full spelling of your name, a data table named table followed by the full spelling of your name, and an ordinary user named user followed by the full spelling of your name.

(3) "MySQL Database Rapid Deployment Practice" (Yunqi Lab, Alibaba Cloud official experiment platform): create a database named RDS followed by the full spelling of your name and a data table named table followed by the full spelling of your name.

Experimental requirements:

Create a database named RDS followed by the full spelling of your name, create a data table named table followed by the full spelling of your name, create a common user named user followed by the full spelling of your name, and take a screenshot with the above information.
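
As a rough illustration of the naming rule above, the following sketch builds the MySQL statements one might run after connecting to the instance. The name `zhangsan`, the underscore separators, the column layout, the `'%'` host, the password placeholder, and the granted privileges are all assumptions made for illustration; on RDS the database and account are often created in the console instead.

```python
from typing import List


def build_mysql_statements(pinyin_name: str, password: str = "<password>") -> List[str]:
    """Return MySQL statements that follow the assignment's naming rule.

    Everything except the rds/table/user prefixes is a hypothetical choice.
    """
    db = f"rds_{pinyin_name}"
    table = f"table_{pinyin_name}"
    user = f"user_{pinyin_name}"
    return [
        f"CREATE DATABASE {db};",
        # Assumed two-column layout; the experiment does not prescribe one.
        f"CREATE TABLE {db}.{table} (id INT PRIMARY KEY, name VARCHAR(64));",
        f"CREATE USER '{user}'@'%' IDENTIFIED BY '{password}';",
        # An ordinary user gets data privileges on this one database only.
        f"GRANT SELECT, INSERT, UPDATE, DELETE ON {db}.* TO '{user}'@'%';",
    ]


if __name__ == "__main__":
    for statement in build_mysql_statements("zhangsan"):
        print(statement)
```

Running the script prints the four statements, which can be pasted into a MySQL client once the placeholder password is replaced.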

2. NoSQL cloud database experiments

(1) "Introduction to MongoDB Database" (Yunqi Lab, Alibaba Cloud official experiment platform): create a database named mongodb followed by the full spelling of your name, a collection named collection followed by the full spelling of your name, and an ordinary user named user followed by the full spelling of your name. Insert three documents containing your name information, then take a screenshot of a query that displays all document contents.

(2) "Implementing Online Game Points Ranking Based on Redis" (Yunqi Lab, Alibaba Cloud official experiment platform)
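
A points ranking of this kind is typically built on the Redis Sorted Set. The sketch below is an in-memory stand-in written in plain Python, not the Redis client API: the class and the player names are hypothetical, and each method only mimics the Redis command named in its comment (`ZADD`, `ZINCRBY`, `ZREVRANGE`, `ZREVRANK`) for non-negative indexes.

```python
from typing import Dict, List, Optional, Tuple


class Leaderboard:
    """In-memory stand-in for one Redis Sorted Set (member -> score)."""

    def __init__(self) -> None:
        self._scores: Dict[str, float] = {}

    def zadd(self, member: str, score: float) -> None:
        # Mimics: ZADD key score member (sets or overwrites the score)
        self._scores[member] = float(score)

    def zincrby(self, member: str, delta: float) -> float:
        # Mimics: ZINCRBY key delta member (a missing member starts at 0)
        self._scores[member] = self._scores.get(member, 0.0) + delta
        return self._scores[member]

    def _ranked(self) -> List[Tuple[str, float]]:
        # Highest score first; equal scores fall back to reverse name order.
        return sorted(self._scores.items(), key=lambda kv: (kv[1], kv[0]), reverse=True)

    def zrevrange(self, start: int, stop: int) -> List[Tuple[str, float]]:
        # Mimics: ZREVRANGE key start stop WITHSCORES (stop is inclusive)
        return self._ranked()[start:stop + 1]

    def zrevrank(self, member: str) -> Optional[int]:
        # Mimics: ZREVRANK key member (0-based rank, None if absent)
        for rank, (name, _) in enumerate(self._ranked()):
            if name == member:
                return rank
        return None


if __name__ == "__main__":
    board = Leaderboard()
    board.zadd("alice", 1200)
    board.zadd("bob", 950)
    board.zadd("carol", 1500)
    board.zincrby("bob", 400)      # bob wins a match
    print(board.zrevrange(0, 1))   # top two players with scores
    print(board.zrevrank("alice"))
```

Re-sorting on every read keeps the sketch short; a real Sorted Set keeps its members ordered as they are written, which is why it suits a leaderboard that is read far more often than it changes.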

Experimental requirements:

Create a database named mongodb followed by the full spelling of your name, create a collection named collection followed by the full spelling of your name, and create a common user named user followed by the full spelling of your name. Insert three documents containing your own name information, then take a screenshot of a query that displays all document contents.
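
To make these requirements concrete, the sketch below generates the mongosh commands one might type for them. The name `zhangsan`, the underscore separators, the document fields, the password placeholder, and the choice of the `readWrite` role are assumptions for illustration; the script only builds command strings and does not connect to MongoDB.

```python
import json
from typing import Dict, List


def build_documents(pinyin_name: str) -> List[Dict[str, object]]:
    """Three sample documents that each carry the student's name (fields are hypothetical)."""
    return [{"seq": i, "name": pinyin_name} for i in (1, 2, 3)]


def build_mongosh_commands(pinyin_name: str, password: str = "<password>") -> List[str]:
    """Return mongosh commands that follow the assignment's naming rule."""
    db = f"mongodb_{pinyin_name}"
    coll = f"collection_{pinyin_name}"
    user = {
        "user": f"user_{pinyin_name}",
        "pwd": password,
        # readWrite on a single database is one of the ordinary-user roles.
        "roles": [{"role": "readWrite", "db": db}],
    }
    docs = build_documents(pinyin_name)
    return [
        f"use {db}",
        f'db.createCollection("{coll}")',
        f"db.createUser({json.dumps(user)})",
        f'db.getCollection("{coll}").insertMany({json.dumps(docs)})',
        # find() with no filter returns every document in the collection.
        f'db.getCollection("{coll}").find()',
    ]


if __name__ == "__main__":
    for command in build_mongosh_commands("zhangsan"):
        print(command)
```

The last command is the query whose output the screenshot should show.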

3. Briefly answer the content of “Classroom Assessment”

  1. What is a cloud database instance? Do I need to install a database system to use it? Answer: A database instance is a running database program: a layer of data management software that sits between the user and the operating system and serves as the channel for accessing the database. Every operation on the data, including data definition, data query, data maintenance, and database operation control, is performed through the instance, and applications can reach the database only through it. A cloud database instance is provisioned and run by the cloud provider, so the user does not need to install a database system.
  2. Is the cloud database storage capacity fixed, or can it be larger or smaller? Answer: The storage capacity is not fixed; it can be sized up or down according to actual needs.
  3. What is the difference between a self-built database on ECS and a cloud database instance?
     (1) Self-built database on ECS:
        1) Reliability: high reliability can only be achieved with a good architecture. Achieving RPO = 0 is extremely costly, and R&D services need to be purchased separately.
        2) Security: pre-event protection relies on whitelists, security groups, and private network isolation. In-event protection (connection link encryption and data disk encryption) must be implemented separately; BYOK key rotation is difficult, and consultation costs are high. Post-event auditing is difficult, and SQL logs need to be saved separately.
        3) Price (hardware and spare parts costs): at least 2 ECS instances are required as primary and secondary instances. Two ECS instances with 2 CPUs, 4 GB memory, and 100 GB storage space (IOPS up to 6800) cost 6,800 yuan/year.
     (2) Cloud database instance:
        1) Reliability: data reliability is high, with automatic primary/secondary replication, data backup, log backup, and so on. MySQL 5.6 three-node Enterprise Edition achieves RPO (Recovery Point Objective) = 0. MySQL 5.7 three-node Enterprise Edition (MGR) achieves RPO = 0 and RTO (Recovery Time Objective) < 1 minute.
        2) Security: pre-event protection covers whitelists, security groups, and private network isolation. In-event protection covers connection link encryption and data transfer encryption (BYOK covers a variety of storage media). Post-event auditing is provided through SQL insights and historical events.
        3) Price (hardware and spare parts costs): only the RDS instance fee. For example, an instance with 2 CPUs, 4 GB memory, and 100 GB storage space (IOPS up to 6800) costs 8,000 yuan/year.
  4. Is MongoDB SQL or NoSQL? Why? Answer: MongoDB is NoSQL. It is a non-relational document database, a type designed to address problems that relational databases run into. Data is stored in JSON format (BSON internally), which is already familiar: the response to a REST API request, for example, is usually JSON. Compared with XML, JSON is simpler and does not need as many tags to define field names; in other words, JSON is self-describing. In addition, once JSON data is stored in MongoDB, reading a field that a document does not contain does not raise an error, unlike a SQL query that references a nonexistent column.
  5. Did you install MongoDB yourself during the experiment? What are the created database names and user names? What is a user? Why do you need a user password? What does a normal user role include? Answer: I installed it myself. The database created is rds_yanrong and the user is user_yanrong. The user is an ordinary user. The password is needed to keep the database secure. An ordinary user has one of two roles: read and readWrite.
  6. What is a collection? What is the document? What operations were performed on the documents during the experiment? Answer: A collection is the equivalent of a table in a relational database (RDB), except that it has no schema definition. Each MongoDB instance can contain multiple databases, and each database has its own collections and permissions. A document is the basic unit of data in MongoDB, equivalent to a record in an RDB; it is usually displayed in JSON format and stored in BSON format. In the experiment, documents were created, inserted, and deleted.
  7. Is Redis SQL or NoSQL? Why? Answer: Redis is a NoSQL database. It is a key-value store rather than a relational database, and it is used for caching data, backup and recovery, and auxiliary persistence.
  8. Which Redis data structure is used in the experiment? Answer: Redis has five common data structures: String, List, Hash, Set, and Sorted Set. The experiment uses the Sorted Set, which keeps members ordered by score and therefore suits a points ranking.

4. Exercises

  • 5.8 Exercises
  1. How to accurately understand the meaning of NoSQL? Answer: NoSQL is an approach to database management system design that differs from relational databases; it is a general term for non-relational databases. The data model it uses is not the relational model of traditional relational databases but a non-relational model such as key-value, column family, or document.
  2. Describe the aspects in which relational databases cannot meet the needs of Web 2.0 applications. Answer: Mainly the following: (1) they cannot meet the need to manage massive data; (2) they cannot meet the need for highly concurrent data access; (3) they cannot meet the need for high scalability and high availability.
  3. Why do some key features of relational databases become "dispensable" in the Web 2.0 era? Answer: (1) Web 2.0 website systems usually do not require strict database transactions; (2) Web 2.0 does not require strictly real-time reads and writes; (3) Web 2.0 usually does not involve large numbers of complex SQL queries.
  4. Please compare the advantages and disadvantages of NoSQL databases and relational databases.
     (1) Relational databases. Advantages: they rest on well-developed relational algebra theory, have strict standards, support the four ACID properties of transactions, offer efficient queries, are technically mature, and have technical support from professional companies. Disadvantages: poor scalability, poor support for massive data storage, an overly rigid data model that does not serve Web 2.0 applications well, and a transaction mechanism that reduces overall system performance.
     (2) NoSQL databases. Advantages: support for ultra-large-scale data storage, a flexible data model that serves Web 2.0 well, and powerful horizontal scaling. Disadvantages: no mathematical theoretical foundation, low performance on complex queries, no strong transactional consistency, data integrity that is hard to guarantee, technology that is not yet mature, a lack of professional technical support, and difficult maintenance.
  5. Describe the four major types of NoSQL databases. Answer: Key-value databases, column family databases, document databases, and graph databases.
  6. Describe the applicable occasions, advantages and disadvantages of key-value databases, column family databases, document databases and graph databases.
     Key-value database
       Applicable occasions: (not stated)
       Advantages: good scalability, good flexibility, and high performance under large volumes of write operations.
       Disadvantages: cannot store structured information; conditional queries are inefficient.
     Column family database
       Applicable occasions: cases that do not need ACID transaction support.
       Advantages: fast lookups, strong scalability, easy distributed expansion, and low complexity.
       Disadvantages: few functions; most do not support strong transactional consistency.
     Document database
       Applicable occasions: cases where transactions are applied only within a single document.
       Advantages: good performance (high concurrency), high flexibility, low complexity, and a flexible data structure; embedded documents allow frequently queried data to be stored in the same document; indexes can be built on keys or on content.
       Disadvantages: lack of a unified query syntax.
     Graph database
       Applicable occasions: highly interrelated data.
       Advantages: high flexibility, support for complex graph algorithms, and the ability to build complex relationship graphs.
       Disadvantages: high complexity; supports only a limited data scale.
  7. Describe the specific meaning of CAP theory. Answer: C (Consistency): any read operation can always read the result of a previously completed write operation. That is, in a distributed environment the data at multiple points is consistent, or in other words, all nodes have the same data at the same time.

A (Availability): the system returns data quickly and delivers the operation result within a bounded time, so that every request receives a response, whether it succeeds or fails.

P (Tolerance of Network Partition): when a network partition occurs (that is, some nodes in the system cannot communicate with the others), the separated parts of the system can still run normally; the loss or failure of any message does not stop the system from continuing to operate.

  8. Please give examples of how CAP theory is used in the design of different products. Answer:

CA. Emphasize consistency (C) and availability (A) and give up partition tolerance (P). The simplest approach is to put all transaction-related content on the same machine, which seriously limits the scalability of the system. Examples: traditional relational databases (MySQL, SQL Server, and PostgreSQL).

CP. Emphasize consistency (C) and partition tolerance (P) and give up availability (A). When a network partition occurs, the affected service must wait for the data to become consistent, so it cannot serve external requests during the wait. Examples: NoSQL databases such as Neo4J, BigTable, and HBase.

AP. Emphasize availability (A) and partition tolerance (P) and give up consistency (C), allowing the system to return inconsistent data. This is acceptable for many Web 2.0 websites, whose users care first about whether the service is available. When a user wants to publish a Weibo post, it must be published immediately, or the user will give up on the service; exactly when other users can read the post after it is published matters much less and does not affect the user experience. For Web 2.0 websites, therefore, availability and partition tolerance take priority over data consistency, and such sites are generally designed in the AP direction. An AP design does not have to give up consistency completely; it can settle for eventual consistency instead. Examples: NoSQL databases such as Dynamo, Riak, CouchDB, and Cassandra.

  9. Describe the meaning of the four ACID properties of the database. Answer: Atomicity means that a transaction is an indivisible unit of work: either all of its operations take effect or none do. Consistency means that all data must remain consistent when a transaction is completed. Isolation means that modifications made by one concurrent transaction must be isolated from modifications made by other concurrent transactions. Durability means that after a transaction is completed, its effect on the system is permanent, and the modification remains even if a fatal system failure occurs.
  10. Describe the specific meaning of BASE. Answer: BASE stands for Basically Available, Soft state, and Eventual consistency.
  11. Please explain the specific meaning of soft state, stateless and hard state. Answer: "Soft state" is defined in contrast to "hard state". When the data stored in the database is in a "hard state", data consistency is guaranteed, that is, the data is always correct. "Soft state" means that the state may be out of sync for a period of time, with a certain amount of lag.
  12. What is eventual consistency?
Answer: Eventual consistency is a weak form of consistency: if no new updates are made, all accesses eventually return the last updated value. Depending on when and how each process accesses the data after it has been updated, eventual consistency can be divided into:
     (1) Session consistency: access to the storage system is placed in the context of a session. As long as the session exists, the system guarantees "read what you have written" consistency. If the session is terminated by a failure, a new session must be established, and the guarantee does not carry over to the new session.
     (2) Monotonic write consistency: the system guarantees that write operations from the same process are executed in order. A system must guarantee this level of consistency; otherwise it is very difficult to program against.
     (3) Monotonic read consistency: if a process has already seen a certain value of a data object, no subsequent access returns a value older than that one.
     (4) Causal consistency: if process A notifies process B that it has updated a data item, subsequent accesses by process B obtain the latest value written by A. Accesses by a process C that has no causal relationship with process A still follow the general eventual consistency rules.
     (5) "Read what you have written" consistency: this can be regarded as a special case of causal consistency. After process A performs an update, it always accesses the updated value and never sees the old one.
  13. Describe the meaning of the inconsistency window. Answer: After an update operation OP completes, subsequent accesses can eventually read the latest value written by OP. The interval between the completion of OP and the moment when subsequent accesses can read that latest value is called the "inconsistency window".
  14. What different types of consistency can eventual consistency be divided into based on the time and manner in which each process accesses the data after updating the data?
Answer: Session consistency, monotonic write consistency, monotonic read consistency, causal consistency, and "read what you have written" consistency.
  15. What is NewSQL database? Answer: NewSQL is a collective term for various new scalable, high-performance databases. This type of database has NoSQL's capacity for storing and managing massive data while keeping the ACID and SQL features of traditional databases.
  16. Describe the differences between NewSQL databases, traditional relational databases and NoSQL databases. Answer: Traditional relational databases support ACID and SQL but scale poorly to massive data; NoSQL databases handle massive data but cannot provide strong transactional consistency. NewSQL databases have NoSQL's capacity for storing and managing massive data while keeping the ACID and SQL features of traditional databases.

  • 6.5 Exercises

  1. Describe the concept of cloud database. Answer: A cloud database is a database deployed and virtualized in a cloud computing environment. It is an emerging way of sharing infrastructure that developed against the background of cloud computing. It greatly increases the storage capacity of the database, eliminates duplicated provisioning of personnel, hardware, and software, makes software and hardware upgrades easier, and virtualizes many back-end functions. Cloud databases are characterized by high scalability, high availability, multi-tenancy, and effective distribution of resources.
  2. Compared with the traditional way of using software, what are the obvious advantages of the cloud computing method?
     Obtaining software
       Traditional method: invest in building a computer room and a hardware platform, then purchase software and install it locally.
       Cloud computing method: purchase software services directly from cloud computing vendors.
     Usage method
       Traditional method: the software is installed and used locally.
       Cloud computing method: the software runs on the cloud computing vendor's servers, and users can use the service over the network from anywhere with network access.
     Payment method
       Traditional method: a large one-time initial investment, covering construction of the computer room, hardware configuration, and purchase of various software (operating system, anti-virus, business software, etc.).
       Cloud computing method: the required IT resources are available immediately with zero upfront investment. Users pay only for the resources they use (use more, pay more; use less, pay less), and the price is extremely low.
     Maintenance cost
       Traditional method: the enterprise must pay to hire professional technicians for maintenance.
       Cloud computing method: zero cost; all maintenance work is done by the cloud computing vendor.
     Speed of obtaining IT resources
       Traditional method: building a computer room and purchasing, installing, and debugging equipment takes a long time.
       Cloud computing method: resources are available at any time and can be used immediately after the service is purchased.
     Sharing method
       Traditional method: self-built and self-sufficient.
       Cloud computing method: once the vendor has built a cloud computing service platform, it serves many users at the same time.
     Repair speed
       Traditional method: when problems such as viruses or system crashes occur, the enterprise's own IT personnel must handle them. Many ordinary enterprises have limited IT expertise and may even need external help, so problems usually cannot be solved immediately.
       Cloud computing method: when any problem occurs, the vendor's professional team responds promptly to keep the cloud service running normally.
     Resource utilization
       Traditional method: low. A large investment produces an IT system that is often used only by the enterprise itself, and resources are wasted whenever the enterprise does not need that much capacity.
       Cloud computing method: high. The platform serves a large number of users every day. When resources are idle, the cloud management system automatically shuts down and releases the surplus; when more are needed, it automatically starts and adds them.
     Cost of user relocation
       Traditional method: when a company moves, the original computer room facilities are scrapped and a large construction cost must be invested again at the new site.
       Cloud computing method: wherever the enterprise relocates, it can obtain cloud computing services over the network immediately and at zero cost. Because the resources are in the cloud rather than on the user side, relocation does not affect the distribution of IT resources.
     Resource scalability
       Traditional method: the service capacity of self-built IT infrastructure usually has an upper limit. When business volume suddenly increases, the existing infrastructure cannot meet the demand immediately, and time and money must be spent purchasing and installing new equipment; once the peak has passed, the extra equipment sits idle and resources are wasted.
       Cloud computing method: vendors can provide nearly unlimited IT resources (storage, computing, and other resources). Users can immediately obtain as much as they need and simply unsubscribe from excess resources when they are no longer needed, so there are no idle resources.
  3. What are the characteristics of cloud databases?
     1) Dynamic scalability: in theory, a cloud database has unlimited scalability and can meet ever-growing data storage needs.
     2) High availability: there is no single point of failure.
     3) Lower cost of use: services are usually provided to multiple users at the same time in a multi-tenancy model. Sharing resources in this way saves costs for users, and users consume software and hardware resources in the cloud on a pay-as-you-go basis, which avoids unnecessary waste.
     4) Ease of use: users of a cloud database do not need to control the machine that runs the database, nor do they need to know where it is.
     5) High performance: a large-scale distributed storage service cluster supports massive data access, automatic backup with redundant copies in multiple computer rooms, and automatic read/write separation.
     6) Maintenance-free: users do not need to worry about risks such as the stability of back-end machines and databases, network problems, computer room disasters, or load on a single database. Cloud database providers offer 7×24 professional service; expansion and migration are transparent to users and do not affect services; and monitoring is comprehensive and around the clock, so users do not have to deal with database failures in the middle of the night.
     7) Security: data isolation keeps the data of different applications in different databases so that they do not affect each other; security checks detect and deny malicious access in time; multi-point backup ensures that data is not lost.
  4. Let’s discuss the impact of cloud databases. Answer: In the era of big data, every enterprise is continuously generating large amounts of data almost every day. Different types of enterprises have different storage needs, and cloud databases can well meet the personalized storage needs of different enterprises. First of all, cloud databases can meet the massive data storage needs of large enterprises. Cloud databases have broad application prospects in the current big data era of data explosion. According to IDC's research report, enterprises' storage demand for structured data will increase by about 20% every year, while storage demand for unstructured data will increase by about 60% every year. Traditional relational databases are difficult to expand horizontally and cannot store such massive amounts of data. Therefore, cloud databases with high scalability have become a good choice for enterprises to store and manage massive data. Secondly, cloud databases can meet the low-cost data storage needs of small and medium-sized enterprises. Small and medium-sized enterprises have relatively limited investment in IT infrastructure and are very eager to obtain database services from third parties conveniently, quickly and cheaply. The cloud database adopts a multi-tenant approach to provide services to multiple users at the same time, which reduces the usage cost of a single user. Moreover, users usually pay on demand when using cloud database services, without wasting resources and causing additional expenses. Therefore, the cost of using cloud databases is very low. For small and medium-sized enterprises, it can greatly lower the threshold of enterprise informatization, allowing enterprises to obtain high-quality professional-level database services while paying lower costs, thereby effectively improving the level of enterprise informatization. In addition, cloud databases can meet the dynamically changing data storage needs of enterprises. 
The amount of data that enterprises need to store changes over time, sometimes increasing and sometimes decreasing. In small-scale applications, changes in system load can be absorbed by the system's idle redundant resources. In large-scale applications, however, traditional relational databases not only fail to meet application needs because of their poor scalability but also bring high storage costs and management overhead. The good scalability of cloud databases allows enterprises to raise database capacity immediately when demand increases and release excess capacity immediately when demand decreases, which better matches their dynamic data storage needs.
  5. Give examples of cloud database vendors and their representative products.
     (1) Traditional database vendors, such as Teradata, Oracle, IBM DB2, and Microsoft SQL Server.
     (2) Cloud providers involved in the database market, such as Amazon, Google, Yahoo!, Alibaba, Baidu, and Tencent.
     (3) Emerging vendors, such as Vertica, LongJump, and EnterpriseDB.
     Cloud database products (enterprise: products):
       Amazon: Dynamo, SimpleDB, RDS
       Google: Google Cloud SQL
       Microsoft: Microsoft SQL Azure
       Oracle: Oracle Cloud
       Yahoo!: PNUTS
       Vertica: Analytic Database v3.0 for the Cloud
       EnterpriseDB: Postgres Plus in the Cloud
       Alibaba: Alibaba Cloud RDS
       Baidu: Baidu Cloud Database
       Tencent: Tencent Cloud Database
  6. Describe the architecture of Microsoft SQL Azure. Answer: The SQL Azure architecture includes a virtual machine cluster in which the number of virtual machines can be increased or decreased dynamically as the workload changes. Each SQL Server VM (virtual machine) has the SQL Server 2008 database management system installed and stores data in the relational model. A database is usually distributed across 3 to 5 SQL Server VMs. Each SQL Server VM also runs both SQL Azure Fabric and the SQL Azure management services. The management services are responsible for replicating database data to meet the basic high availability requirements of SQL Azure. The SQL Azure Fabric and management services on different SQL Server VMs exchange monitoring information with each other so that the service as a whole can be monitored.
  7. Describe the functions of UMP system. Answer: The UMP system is a low-cost, high-performance MySQL cloud database solution whose key modules are implemented in Erlang. Developers apply to the platform for MySQL instance resources over the network and access data through a single entry point provided by the platform. The UMP system organizes server resources into resource pools and allocates resources to MySQL instances from those pools. Its components work together to provide services such as master-slave hot backup, data backup, migration, disaster recovery, read/write separation, and database and table sharding, all transparently to users. The system distinguishes three types of users: users with small data volume and traffic, medium-sized users, and users who need database and table sharding. Multiple small users can share one MySQL instance, a medium-sized user has exclusive use of one MySQL instance, and the multiple MySQL instances of a user who needs sharding share the same physical machine. In this way resources are virtualized and the overall cost is reduced. UMP isolates, allocates on demand, and limits CPU, memory, and I/O resources in two ways: "using Cgroup to limit MySQL process resources" and "limiting QPS (Queries Per Second) on the Proxy server side". It also supports dynamic scale-out and scale-in as the user's business develops, without interrupting data services. In addition, the system combines SSL database connections, data access IP whitelists, user operation logging, SQL interception, and other techniques to protect user data.
  8. Describe the components of the UMP system and their specific functions.

(1) Mnesia: Mnesia is a distributed database management system suited to telecommunications and other Erlang applications that require continuous operation and soft real-time characteristics. It is part of the Open Telecom Platform (OTP), a control system platform for building telecommunications applications.

(2) RabbitMQ: RabbitMQ is an industrial-grade message queue product developed in Erlang (similar in function to IBM's message queue product, IBM WebSphere MQ). It is used as message transmission middleware to achieve reliable message delivery.

(3) Zookeeper: Zookeeper is an efficient and reliable coordination system that provides basic services such as distributed locks, a unified naming service, status synchronization, cluster management, and management of distributed application configuration items. It is used to build distributed applications and relieves them of coordination tasks they would otherwise have to handle themselves. In the UMP system, Zookeeper mainly plays three roles: it acts as the global configuration server, provides distributed locks, and monitors all MySQL instances.

(4) LVS: LVS (Linux Virtual Server) is a virtual server cluster system. The UMP system uses LVS to achieve load balancing within the cluster.

(5) Controller server The Controller server provides various management services to the UMP cluster, realizing cluster member management, metadata storage, MySQL instance management, fault recovery, backup, migration, expansion and other functions.

(6) Web console The Web console provides users with a system management interface.

(7) Proxy server The Proxy server provides users with access to the MySQL database service. It fully implements the MySQL protocol, so users can connect to it with existing MySQL clients. The Proxy server uses the user name to look up the user's authentication information, resource quota limits (such as QPS, IOPS (I/O operations per second), and maximum number of connections), and the address of the backend MySQL instance; the user's SQL query is then forwarded to the corresponding MySQL instance. Beyond basic data routing, the Proxy server also implements many important functions, including shielding MySQL instance failures, read-write separation, database and table sharding, resource isolation, and access logging.

(8) Agent server The Agent server is deployed on each machine that runs MySQL processes. It manages the MySQL instances on that physical machine, performing operations such as creation, deletion, backup, migration, and master-slave switching. It is also responsible for collecting and analyzing MySQL process statistics, the slow query log, and the bin-log.

(9) Log analysis server The log analysis server stores and analyzes user access logs incoming from the Proxy server, and supports real-time query of slow logs and statistical reports over a period of time.

(10) Information statistics server The information statistics server uses RRDtool to periodically collect statistics on the number of user connections, QPS values, and the process status of MySQL instances. The results can be displayed visually in the Web interface and can also serve as a basis for future elastic resource allocation and automated MySQL instance migration.

(11) Yugong system The Yugong system is a tool that combines full replication with incremental replication based on bin-log analysis, making dynamic expansion, shrinking, and migration possible without downtime.

  9. Describe how master-slave backup is implemented in the UMP system. Answer: The UMP system creates two MySQL instances for each user, a master database and a slave database. The two instances are set up as backups of each other, so any update on either instance is replicated to the other. If the master goes down, the Controller server initiates a master-slave switch and modifies the mapping relationship. The failed master comes back online after recovery processing and replicates updates from the slave until the two are fully consistent, at which point a master-slave switch is initiated again.

  10. Describe how read-write separation is implemented in the UMP system. Answer: Because each user has two MySQL instances, the master and the slave, both can be used to separate the user's read and write operations and achieve load balancing. The UMP system implements read-write separation transparently to users. When this function is turned on, the Proxy server, which provides users with access to the MySQL database service, parses each SQL statement the user issues. A write operation is sent directly to the master database; read operations are distributed evenly between the master and slave databases.

  11. What two methods does the UMP system use to achieve resource isolation?
When multiple users share the same MySQL instance, or multiple MySQL instances coexist on the same physical machine, resource isolation is needed to protect users' applications and data; otherwise, one user's excessive consumption of system resources would seriously degrade performance for other users. UMP uses two methods:

1) Using Cgroup to limit MySQL process resources. Application scenario: multiple MySQL instances share the same physical machine. Implementation: limit the maximum CPU usage, memory, and IOPS available to each user's MySQL process.

2) Limiting QPS on the Proxy server side. Application scenario: multiple users share the same MySQL instance. Implementation: the Controller server monitors the resource consumption of the user's MySQL instance; if the quota is clearly exceeded, it notifies the Proxy server to limit the user's QPS by adding delay, which reduces the user's consumption of system resources.
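
The idea of limiting a user's QPS on the Proxy side by adding delay can be illustrated with a small Python sketch. This is not UMP's code: the class name `QpsThrottle`, the one-second sliding window, and the injectable `clock` and `sleep` parameters are all assumptions made for illustration, since the text above does not say how the delay is computed.

```python
import time
from collections import deque


class QpsThrottle:
    """Delay requests once a user exceeds a queries-per-second quota.

    Hypothetical helper for illustration only; not taken from UMP.
    """

    def __init__(self, max_qps, clock=time.monotonic, sleep=time.sleep):
        self.max_qps = max_qps
        self._clock = clock
        self._sleep = sleep
        self._recent = deque()  # timestamps of requests admitted in the last second

    def _evict(self, now):
        # Forget requests that are at least one second old.
        while self._recent and now - self._recent[0] >= 1.0:
            self._recent.popleft()

    def acquire(self):
        """Wait until the request fits the quota; return the total delay added."""
        total_delay = 0.0
        now = self._clock()
        self._evict(now)
        while len(self._recent) >= self.max_qps:
            # Sleep until the oldest request leaves the one-second window.
            delay = 1.0 - (now - self._recent[0])
            self._sleep(delay)
            total_delay += delay
            now = self._clock()
            self._evict(now)
        self._recent.append(now)
        return total_delay
```

In this sketch a proxy would keep one `QpsThrottle` per user and call `acquire()` before forwarding each query; requests within the quota pass through with no delay, and requests beyond it are slowed down rather than rejected.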

  12. Describe the three types of users in the UMP system. Answer: There are three types of users in the UMP system: users with small data volume and traffic, medium-sized users, and users who need to split databases and tables. Multiple small users can share the same MySQL instance. Each medium-sized user has a dedicated MySQL instance and can adjust its memory and disk space as needed; a user who needs more resources can be migrated to a server with free resources or a higher configuration. Users who need database and table sharding occupy multiple independent MySQL instances, which can either coexist on the same physical machine or each occupy a physical machine exclusively.

UMP implements resource scheduling by migrating MySQL instances. With the help of the Yugong system developed by the middleware team of Alibaba Group, UMP can expand, shrink, and migrate dynamically without downtime.

  13. How does the UMP system ensure data security?

1) SSL database connection. SSL (Secure Sockets Layer) is a security protocol that provides security and data integrity for network communications by encrypting network connections at the transport layer. The Proxy server implements the complete MySQL client/server protocol and can establish an SSL database connection with the client.

2) Data access IP whitelist. You can put the IP addresses that are allowed to access the cloud database into a "whitelist". Only IP addresses in the whitelist can access, and access from other IP addresses will be denied, thereby further ensuring account security.

3) Record user operation logs. All user operation records will be recorded to the log analysis server. By checking user operation records, hidden security vulnerabilities can be discovered.

4) SQL interception. The Proxy server can intercept various types of SQL statements as required, such as the full table scan statement "select *".
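
Two of the measures above, the IP whitelist and SQL interception, lend themselves to a short Python sketch. Everything here is hypothetical: the network ranges in `ALLOWED_NETWORKS`, the function names, and the regular expression used to spot a "select *" with no WHERE clause are illustrative assumptions, not UMP's actual rules.

```python
import ipaddress
import re

# Hypothetical whitelist; a real deployment would load this from configuration.
ALLOWED_NETWORKS = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("192.168.1.20/32"),
]

# Rough heuristic for a full table scan: "SELECT * FROM <table>" with nothing after it.
# It does not handle quoted or schema-qualified table names.
FULL_SCAN = re.compile(r"^\s*select\s+\*\s+from\s+\w+\s*;?\s*$", re.IGNORECASE)


def ip_allowed(client_ip):
    """Return True if the client address falls inside a whitelisted network."""
    addr = ipaddress.ip_address(client_ip)
    return any(addr in net for net in ALLOWED_NETWORKS)


def should_intercept(sql):
    """Return True if the statement looks like an unrestricted full table scan."""
    return FULL_SCAN.match(sql) is not None


def admit(client_ip, sql):
    """Apply the whitelist first, then SQL interception, then forward."""
    if not ip_allowed(client_ip):
        return "denied: IP not in whitelist"
    if should_intercept(sql):
        return "blocked: full table scan"
    return "forwarded"
```

Checking the whitelist before inspecting the SQL means a request from an unknown address is rejected without its statement ever being parsed.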

2023-04-04 00:02 Zhu Yaling submitted 2023-04-06 11:04 Zhu Yaling updated

Origin blog.csdn.net/qq_50530107/article/details/131260960