Learning about cloud databases - cloud database products and cloud database system architecture

1. Cloud database products

1.1. Overview of cloud database vendors

        Cloud database providers are mainly divided into three categories.
        ① Traditional database vendors, such as Teradata, Oracle, IBM (DB2), and Microsoft (SQL Server).
        ② Cloud providers entering the database market, such as Amazon, Google, Yahoo!, Alibaba, Baidu, and Tencent.
        ③ Emerging vendors, such as Vertica, LongJump, and EnterpriseDB.

        Common cloud database products on the market are shown in Table 6-3.

1.2. Amazon’s cloud database products

        Amazon is a pioneer in the cloud database market. In addition to providing the famous S3 storage service and EC2 computing service, Amazon also provides cloud-based database services SimpleDB and Dynamo.

        SimpleDB is a queryable distributed data storage system developed by Amazon. It was the first NoSQL database service on AWS (Amazon Web Services) and is tightly integrated with Amazon's AWS infrastructure. As the name suggests, SimpleDB is intended to be a simple database: its storage elements (attributes and values) are located by an id field that determines the position of the row, a structure sufficient for users' basic read, write, and query needs. SimpleDB provides an easy-to-use API for quickly storing and accessing data. However, SimpleDB is not a relational database: traditional relational databases use row-oriented storage, whereas SimpleDB uses "key/value" storage, and it mainly serves Web developers who do not need a relational database. SimpleDB also has some obvious shortcomings, such as the single-table size limit, unstable performance, and support for only eventual consistency.

        Dynamo absorbs the design ideas of SimpleDB and other NoSQL databases and targets more demanding applications that require scalable data storage and more advanced data management functions. Dynamo also uses "key/value" storage. The data it stores is unstructured: Dynamo does not recognize any structure in the values, and users must parse the values themselves. Keys in Dynamo are not stored as plain strings but as md5_key values (obtained by running the key through the MD5 algorithm), so data can only be accessed by key and queries are not supported. DynamoDB uses solid-state drives to provide consistently low read and write latency and is designed to scale to very large capacities while maintaining stable performance, albeit with a more restrictive query model.
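        To illustrate the access model described above, the toy Python sketch below hashes keys with MD5 and stores opaque values that only the application can interpret; it is a conceptual illustration only, not Dynamo's actual implementation or API.

```python
import hashlib

class TinyKeyValueStore:
    """A toy illustration (not Dynamo itself) of a key/value store that,
    like Dynamo as described above, addresses items by an MD5-hashed key
    and supports access only by key -- no secondary queries."""

    def __init__(self):
        self._data = {}

    def _md5_key(self, key):
        # Items are addressed by the MD5 digest of the key, so range scans
        # or attribute-based queries are impossible.
        return hashlib.md5(key.encode("utf-8")).hexdigest()

    def put(self, key, value):
        # Values are opaque bytes: the store does not interpret them, the
        # application must parse them itself.
        self._data[self._md5_key(key)] = value

    def get(self, key):
        return self._data.get(self._md5_key(key))

store = TinyKeyValueStore()
store.put("user:1001", b'{"name": "alice", "cart": ["book", "pen"]}')
print(store.get("user:1001"))   # the application parses the value itself
```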

        Amazon RDS (Amazon Relational Database Service) is a web service developed by Amazon that allows users to build and operate relational databases in a cloud environment (supporting databases such as MySQL and Oracle). Users only need to focus on application- and business-level concerns, without spending too much time on tedious database administration.

        In addition, Amazon cooperates closely with other database vendors. The Amazon EC2 application hosting service can already deploy many kinds of database products, including mainstream database platforms such as SQL Server, Oracle 11g, MySQL, and IBM DB2, as well as other products such as EnterpriseDB. EC2 serves as a scalable hosting environment in which developers can develop and host their own database applications.

1.3. Google’s cloud database products

        Google Cloud SQL is a MySQL-based cloud database launched by Google. The benefits of using Cloud SQL are obvious: all transactions run in the cloud and are managed by Google, so users do not need to configure the database or troubleshoot errors and can simply rely on it to get their work done. Because the data is replicated across multiple Google data centers, it is always available. Google also provides import and export services to make it easy to move databases into and out of the cloud. Cloud SQL offers the very familiar, traditional MySQL environment with JDBC support (for Java-based App Engine applications) and DB-API support (for Python-based App Engine applications), so most applications can run without extensive modification or debugging, and the data format is familiar to most developers and administrators. Another benefit of Google Cloud SQL is its integration with Google App Engine.
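        As a rough illustration of what working with a MySQL-compatible service through Python's DB-API looks like, the sketch below uses the PyMySQL driver with placeholder connection parameters; it is a generic DB-API example, not the App Engine-specific driver mentioned above.

```python
# Generic DB-API 2.0 sketch against a MySQL-compatible endpoint such as
# Cloud SQL. The driver (PyMySQL) and all connection parameters are
# placeholders chosen for illustration.
import pymysql

conn = pymysql.connect(
    host="cloudsql-instance.example.com",  # hypothetical endpoint
    user="app_user",
    password="app_password",
    database="appdb",
)
try:
    with conn.cursor() as cur:
        cur.execute("SELECT id, title FROM posts WHERE author = %s", ("alice",))
        for row in cur.fetchall():
            print(row)
    conn.commit()
finally:
    conn.close()
```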

1.4. Microsoft’s cloud database products

        In March 2008, Microsoft made the relational database capabilities of SQL Server available through SQL Data Services (SDS), becoming the first large database vendor to enter the cloud database market. Microsoft has since expanded SDS and renamed it SQL Azure. Microsoft's Azure platform provides a collection of Web services that allow users to create, query, and use SQL Server databases in the cloud over the network, while the location of the SQL Server instances in the cloud remains transparent to users. This was an important milestone for cloud computing. SQL Azure has the following features.

  • ① It is a relational database and supports T-SQL (Transact-SQL) for managing, creating, and operating cloud databases.
  • ② It supports stored procedures. Its data types and stored procedures are very similar to those of traditional SQL Server, so applications can be developed locally and then deployed to the cloud platform.
  • ③ It supports a large number of data types, covering almost all the typical SQL Server 2008 data types.
  • ④ It supports transactions in the cloud: local transactions are supported, but distributed transactions are not.

        The architecture of SQL Azure includes a virtual machine cluster that can dynamically grow or shrink as the workload changes, as shown in Figure 6-2. Each SQL Server VM (Virtual Machine) runs the SQL Server 2008 database management system and stores data in the relational model. Usually, a database is distributed across 3 to 5 SQL Server VMs. Each SQL Server VM also runs SQL Azure Fabric and SQL Azure Management Services; the latter is responsible for replicating the database to meet SQL Azure's basic high-availability requirements. The SQL Azure Fabric and management services on different SQL Server VMs exchange monitoring information with each other to keep the overall service monitorable.

1.5. Other cloud database products

        Yahoo! PNUTS is a massively parallel, geographically distributed database system developed for Web applications and is an important part of the Yahoo! cloud computing platform. Vertica Systems released its cloud database in 2008. 10gen's MongoDB and AppJet's AppJet database also provide corresponding cloud versions. IBM-backed EnterpriseDB likewise offers a cloud database running on Amazon EC2. LongJump, a new company competing with Salesforce, has launched a cloud database product based on the open-source database PostgreSQL. Intuit QuickBase also offers its own line of cloud databases. The Relational Cloud developed by MIT can automatically identify workload types and place similar workloads on the same data node; it adopts a graph-based data partitioning strategy that scales well for complex transactional workloads, and it also supports running SQL queries on encrypted data. Alibaba Cloud RDS is a relational database service provided by Alibaba Cloud that leases database instances running directly on physical servers to users. Baidu Cloud Database supports distributed relational database services (based on MySQL), distributed non-relational database storage services (based on MongoDB), and key/value non-relational database services (based on Redis).

2. Cloud database system architecture

        Different cloud database products adopt very different system architectures. The following uses the UMP (Unified MySQL Platform) system, developed by the core system database team of Alibaba Group, as an example.

2.1. UMP system overview

        The UMP system is a low-cost, high-performance MySQL cloud database solution whose key modules are implemented in Erlang. Developers apply to the platform for MySQL instance resources over the network, and the platform provides a single entry point for accessing data. The UMP system organizes various server resources into resource pools and allocates resources to MySQL instances from those pools. The system consists of a set of components that work together to provide, in a manner transparent to users, services such as master-slave hot standby, data backup, migration, disaster recovery, read/write separation, and sharding of databases and tables. Users are divided into three types: users with small data volume and traffic, medium-sized users, and users who need to shard databases and tables. Multiple small users can share the same MySQL instance, a medium-sized user gets a MySQL instance exclusively, and the multiple MySQL instances of a sharding user share the same physical machine. These measures virtualize resources and reduce the overall cost. UMP implements isolation, on-demand allocation, and limiting of CPU, memory, and I/O resources through two mechanisms: using Cgroups to limit the resources of MySQL processes, and limiting QPS (Queries Per Second) on the Proxy server side. It also supports dynamic capacity expansion and shrinkage as the user's business grows or contracts, without interrupting data services. In addition, the system combines SSL database connections, data-access IP whitelists, user operation logging, SQL interception, and other techniques to effectively protect user data security.
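        As an illustration of the second mechanism, the sketch below shows a simple token-bucket limiter of the kind a proxy tier could use to cap a user's QPS. It is only a conceptual Python sketch, not UMP's actual (Erlang) implementation, and the limit value is arbitrary.

```python
import time

class TokenBucket:
    """A simple token-bucket limiter, illustrating the idea of capping a
    user's QPS at the proxy tier. A sketch, not UMP's (Erlang) code."""

    def __init__(self, qps_limit):
        self.rate = qps_limit          # tokens added per second
        self.capacity = qps_limit      # burst size == one second's quota
        self.tokens = qps_limit
        self.last = time.monotonic()

    def allow(self):
        now = time.monotonic()
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False                   # caller should reject or delay the query

limiter = TokenBucket(qps_limit=100)   # e.g. a small shared-instance user
if limiter.allow():
    pass  # forward the SQL statement to the back-end MySQL instance
else:
    pass  # throttle: return an error or queue the request
```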

        In general, the UMP system architecture design follows the following principles.

  • ① Maintain a single external entrance to the system and maintain a single resource pool within the system.
  • ② Eliminate single points of failure and ensure high availability of services.
  • ③ Ensure that the system has good scalability and can dynamically add and delete computing and storage nodes.
  • ④ Ensure that the resources allocated to users are also elastic and scalable, and resources are isolated from each other to ensure the security of applications and data.

2.2. UMP system architecture

        The UMP system architecture is shown in Figure 6-3. The roles in the UMP system include the Controller server, Proxy server, Agent server, Web console, log analysis server, information statistics server, and Yugong system; the open-source components it depends on include Mnesia, LVS, RabbitMQ, and Zookeeper.

2.2.1.Mnesia

        Mnesia is a distributed database management system suited to telecommunications and other Erlang applications that require continuous operation and soft real-time behavior. It is part of the Open Telecom Platform (OTP), the control-system platform for building telecommunications applications. Erlang is a structured, dynamically typed programming language with built-in support for parallel computing, which makes it well suited to building distributed, soft real-time, parallel computing systems. Applications written in Erlang usually consist of thousands of lightweight processes at runtime that communicate with each other through message passing, and context switching between Erlang processes is far cheaper than between C processes. Mnesia is tightly coupled with the Erlang programming language; its biggest advantage is that when operating on data there is no impedance mismatch caused by the database and the programming language using different data formats. Mnesia supports transactions and transparent data sharding, uses two-phase locking to implement distributed transactions, and can scale linearly to at least 50 nodes. Mnesia's database schema can be reconfigured dynamically at runtime, and tables can be migrated or replicated to multiple nodes to improve fault tolerance. These characteristics are why Mnesia is used to provide distributed database services in this cloud database system.

2.2.2.RabbitMQ

        RabbitMQ is an industrial-grade message queue product developed in Erlang (similar in function to IBM's message queue product IBM WebSphere MQ) and is used as message-passing middleware to achieve reliable message delivery. Nodes in the UMP cluster do not need to establish dedicated connections to communicate with each other; instead, they communicate by reading and writing messages in queues.
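        The sketch below shows, with the pika Python client, how one node could publish a task message to a queue and another could consume it. The queue name and message contents are hypothetical, and UMP's own components are written in Erlang rather than Python.

```python
import pika

# Connection parameters and the queue name ("ump.tasks") are illustrative.
connection = pika.BlockingConnection(pika.ConnectionParameters(host="localhost"))
channel = connection.channel()
channel.queue_declare(queue="ump.tasks", durable=True)

# A controller-style node publishes a task message...
channel.basic_publish(
    exchange="",
    routing_key="ump.tasks",
    body=b'{"action": "backup", "instance": "mysql-042"}',
    properties=pika.BasicProperties(delivery_mode=2),  # persist the message
)

# ...and an agent-style node consumes it.
def handle(ch, method, properties, body):
    print("received task:", body)
    ch.basic_ack(delivery_tag=method.delivery_tag)

channel.basic_consume(queue="ump.tasks", on_message_callback=handle)
# channel.start_consuming()  # blocks; left commented so the sketch terminates
connection.close()
```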

2.2.3.Zookeeper

        Zookeeper is an efficient and reliable coordination system that provides basic services such as unified naming, status synchronization, cluster management, management of distributed application configuration items, and distributed locks. It is used to build distributed applications and to relieve them of the coordination tasks they would otherwise have to implement themselves (for Zookeeper's working principles, please refer to relevant books or online materials). In the UMP system, Zookeeper mainly plays the following three roles; a short code sketch illustrating all three follows them.

        (1) As a global configuration server. The UMP system runs on multiple servers, and some configuration items of the applications they run are identical. If these shared configuration items had to be modified, they would have to be changed on many servers at once, which is both tedious and error-prone. The UMP system therefore hands this kind of configuration information over to Zookeeper: the configuration is stored in a Zookeeper directory node, and every server that depends on it watches that node, i.e., monitors the state of the configuration information. Once the configuration changes, each server receives a notification from Zookeeper and then fetches the new configuration from it.

        (2) Providing distributed locks. Multiple Controller servers are deployed in the UMP cluster, but to keep the system correct, some operations may be executed by only one server at any given time. For example, after a MySQL instance fails, a master-slave switchover is required so that a healthy server takes over from the failed one; if every Controller server tracked the failure and initiated the switchover, the whole system would descend into chaos. Therefore, at any moment, one "general manager" must be elected from the cluster's Controller servers, and this "general manager" is responsible for initiating the various system tasks. Zookeeper's distributed lock facility is what allows such a "general manager" to be elected to manage the cluster.

        (3) Monitoring all MySQL instances. When a server running a MySQL instance fails, the failure must be detected promptly so that a healthy server can take its place. The UMP system uses Zookeeper to monitor all MySQL instances: each MySQL instance creates an ephemeral (temporary) directory node on Zookeeper when it starts, and when a MySQL instance dies, this ephemeral node is deleted. The background monitoring process catches this change and thus learns that the MySQL instance is no longer available.
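        The sketch below illustrates the three roles above using the kazoo Python client for Zookeeper. The node paths, addresses, and identifiers are hypothetical, and UMP itself (written in Erlang) does not use this library; the point is only to show a configuration watch, a leader election, and an ephemeral liveness node.

```python
from kazoo.client import KazooClient

zk = KazooClient(hosts="127.0.0.1:2181")   # placeholder Zookeeper address
zk.start()

# Role 1: shared configuration. Every server watches the same node and is
# notified whenever the configuration stored there changes.
zk.ensure_path("/ump/config")
zk.set("/ump/config", b"max_connections=200")

@zk.DataWatch("/ump/config")
def on_config_change(data, stat):
    if data is not None:
        print("configuration is now:", data.decode())

# Role 2: electing the "general manager". Every Controller runs the same
# code; Zookeeper guarantees that only one of them wins at any moment.
def act_as_leader():
    print("this controller is now the general manager")
    # initiate system tasks such as master-slave switchover here

election = zk.Election("/ump/controller_leader", identifier="controller-01")
# election.run(act_as_leader)  # blocks until elected, then calls the function

# Role 3: liveness of MySQL instances. Each instance registers an ephemeral
# node at startup; if its session dies, Zookeeper deletes the node and the
# children watch below fires, so monitors notice the failure.
zk.create("/ump/instances/mysql-042", b"10.0.0.42:3306",
          ephemeral=True, makepath=True)

@zk.ChildrenWatch("/ump/instances")
def on_instances_changed(children):
    print("live MySQL instances:", children)
```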

2.2.4.LVS

        LVS (Linux Virtual Server) is a virtual server cluster system for Linux. An LVS cluster uses IP load balancing and content-based request distribution. The scheduler is the sole entry point of the LVS cluster; it has excellent throughput, spreads requests evenly across the back-end servers, and automatically shields server failures, thereby turning a group of servers into a high-performance, highly available virtual server. The structure of the whole cluster is transparent to clients, and neither client nor server programs need to be modified. The UMP system uses LVS to achieve load balancing within the cluster.

2.2.5. Controller server

        The Controller server provides management services for the UMP cluster, implementing cluster membership management, metadata storage, MySQL instance management, fault recovery, backup, migration, capacity expansion, and other functions. A Mnesia distributed database service runs on the Controller server and stores the system metadata, mainly including the configuration and status information of cluster members and users, and the mapping between user names and the addresses of back-end MySQL instances (the "routing table"). When other components need this metadata, they send requests to the Controller server. To avoid single points of failure and ensure high availability, multiple Controller servers are deployed in the UMP system, and Zookeeper's distributed lock facility is used to elect a "general manager" responsible for scheduling and monitoring the various system tasks.

2.2.6.Web console

        The web console provides users with a system management interface.

2.2.7. Proxy server

        The Proxy server provides users with access to the MySQL database service. It fully implements the MySQL protocol, so users can connect to it with existing MySQL clients. From the user name, the Proxy server obtains the user's authentication information, resource quotas (such as QPS, IOPS (I/O operations Per Second), and the maximum number of connections), and the address of the back-end MySQL instance, and then forwards the user's SQL requests to the corresponding MySQL instance. Besides basic request routing, the Proxy server also implements many important functions, including masking MySQL instance failures, read/write separation, sharding of databases and tables, resource isolation, and recording user access logs.
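        A minimal sketch of the user-name lookup described above is given below; the routing-table contents and field names are hypothetical placeholders rather than UMP's actual metadata format.

```python
# Hypothetical routing/quota table keyed by user name, of the kind a proxy
# could consult before forwarding a query.
ROUTING_TABLE = {
    "alice": {
        "auth": "hashed-password-placeholder",
        "backend": ("10.0.0.42", 3306),
        "quota": {"qps": 100, "iops": 500, "max_connections": 20},
    },
}

def resolve(user):
    """Return (backend address, quotas) for a connecting user, or None."""
    entry = ROUTING_TABLE.get(user)
    if entry is None:
        return None                      # unknown user: reject the connection
    return entry["backend"], entry["quota"]

print(resolve("alice"))
```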

2.2.8. Agent server

        The Agent server is deployed on the machines that run MySQL processes. It manages the MySQL instances on each physical machine, performing operations such as master-slave switchover, creation, deletion, backup, and migration, and it is also responsible for collecting and analyzing MySQL process statistics, the slow query log (Slow Query Log), and the bin-log.

2.2.9.Log analysis server

        The log analysis server stores and analyzes the user access logs sent from the Proxy server, and supports real-time querying of slow logs and statistical reports over a given period.

2.2.10. Information statistics server

        The information statistics server uses RRDtool to periodically collect statistics on user connection counts, QPS values, and the process status of MySQL instances. The results can be displayed visually in the Web interface and can also serve, in the future, as the basis for elastic resource allocation and automatic migration of MySQL instances.

2.2.11. Yugong system

        The Yugong system is a tool that combines full replication with bin-log-based incremental replication, enabling dynamic capacity expansion, shrinkage, and migration without downtime.

2.3. UMP system functions

        The UMP system is built on a large cluster. Through the collaboration of its components, the system provides disaster recovery, read/write separation, sharding of databases and tables, resource management, resource scheduling, resource isolation, and data security.

2.3.1. Disaster recovery

        The cloud database must provide users with an always-available database connection. When a MySQL instance fails, the system must recover from the fault automatically, and the entire fault-handling process must be transparent to users, who remain unaware of what is happening in the background.

        To achieve disaster recovery, the UMP system creates two MySQL instances for each user, a master database and a slave database, which are configured as backups of each other: updates to either instance are replicated to the other. At the same time, the Proxy server ensures that data is written only to the master database.

        The status of the master and slave databases is maintained by Zookeeper, which monitors every MySQL instance in real time. Once the master database goes down, Zookeeper detects it immediately and notifies the Controller server. The Controller server then starts the master-slave switchover: it modifies the mapping between the user name and the back-end MySQL instance address in the routing table, marks the master database as unavailable, and, with the help of the message queue middleware RabbitMQ, notifies all Proxy servers to update their user-name-to-instance mappings. After this series of operations the switchover is complete, the user name is mapped to a new, healthy MySQL instance, and all of this is completely transparent to the user.

        The failed master database needs to be brought back online after it has been repaired. During its downtime and recovery, the slave database may have been updated many times, so when the master recovers it first copies all of the slave's updates to itself. When the master is about to catch up with the slave, the Controller server orders the slave to stop accepting updates and enter a non-writable state, temporarily prohibiting users from writing data; users may therefore be unable to write for a short period. Once the master has reached the same state as the slave, the Controller server initiates a master-slave switchover, marks the master as available in the routing table, and notifies the Proxy servers to switch write operations back to the master. User writes then resume, and finally the slave is set back to a writable state.
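        The sketch below condenses the switchover and recovery flow described above into a few Python functions. The helper functions and data structures are hypothetical stand-ins for UMP's (Erlang) internals, intended only to make the sequence of steps concrete.

```python
# Hypothetical helpers standing in for RabbitMQ notifications and replication.
def notify_proxies(user, entry):
    print(f"routing for {user} is now {entry}")

def copy_missing_updates(src, dst):
    print(f"replicating updates {src} -> {dst}")

routing = {"alice": {"master": "mysql-01", "slave": "mysql-02",
                     "write_target": "mysql-01", "master_available": True,
                     "slave_writable": True}}

def fail_over(user):
    """Master is down: mark it unavailable and send writes to the slave."""
    entry = routing[user]
    entry["master_available"] = False
    entry["write_target"] = entry["slave"]
    notify_proxies(user, entry)                 # e.g. broadcast over RabbitMQ

def recover_master(user):
    """The repaired master catches up; writes are then switched back to it."""
    entry = routing[user]
    copy_missing_updates(entry["slave"], entry["master"])   # bulk catch-up
    entry["slave_writable"] = False                         # brief write freeze
    copy_missing_updates(entry["slave"], entry["master"])   # final sync
    entry["master_available"] = True
    entry["write_target"] = entry["master"]
    notify_proxies(user, entry)                             # writes resume
    entry["slave_writable"] = True

fail_over("alice")
recover_master("alice")
```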

2.3.2. Read-write separation

        Since each user has two MySQL instances, a master and a slave, the pair can be fully exploited to separate the user's read and write operations and balance the load. The UMP system implements read/write separation transparently to users. When the feature is enabled, the Proxy server that provides the user's access to MySQL parses each SQL statement the user issues: write operations are sent directly to the master, while read operations are distributed evenly between the master and the slave. One situation can arise, however: the user has just written data to the master, and before that data has been replicated to the slave the user reads it from the slave, so the user either cannot find the data or reads an older version of it. To avoid this, the UMP system starts a timer after each write operation by a user; if the user issues a read within 300 ms of the timer starting, the read, whether of the just-written data or of anything else, is forcibly routed to the master. In practice the UMP system allows the 300 ms value to be changed, but in general 300 ms is enough to ensure the data has been replicated to the slave after being written to the master.
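        A minimal sketch of this routing rule, assuming a single master/slave pair and the 300 ms stickiness window described above (not UMP's actual implementation):

```python
import time

class ReadWriteRouter:
    """A sketch of proxy-side read/write splitting with a post-write
    stickiness window, as described above."""

    def __init__(self, master, slave, stickiness_ms=300):
        self.master = master
        self.slave = slave
        self.stickiness = stickiness_ms / 1000.0
        self.last_write = 0.0
        self._toggle = False

    def route(self, sql):
        is_read = sql.lstrip().lower().startswith("select")
        if not is_read:
            self.last_write = time.monotonic()
            return self.master                 # all writes go to the master
        if time.monotonic() - self.last_write < self.stickiness:
            return self.master                 # recent write: read the master
        # Otherwise alternate reads between master and slave ("evenly").
        self._toggle = not self._toggle
        return self.slave if self._toggle else self.master

router = ReadWriteRouter(master="mysql-01", slave="mysql-02")
print(router.route("INSERT INTO t VALUES (1)"))   # -> mysql-01
print(router.route("SELECT * FROM t"))            # -> mysql-01 (within 300 ms)
```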

2.3.3. Sub-database and sub-table

        UMP supports sharding (Shard/Horizontal Partition) transparently to users, but when creating an account the user must specify the multi-instance type and set the number of instances, and the system then creates multiple groups of MySQL instances accordingly. In addition, the user must define the sharding rules, for example choosing the partition field, that is, which field the databases and tables are partitioned on, and how values of the partition field are mapped to the different MySQL instances.

        When sharding is in use, the system processes a user query as follows: first, the Proxy server parses the user's SQL statement and extracts the information needed to rewrite and distribute it; second, it rewrites the SQL statement into multiple sub-statements, each targeting the corresponding MySQL instance, and distributes those sub-statements to the MySQL instances for execution; finally, it receives the execution results from each MySQL instance and merges them into the final result.
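        The sketch below illustrates this rewrite-distribute-merge flow under the assumption of simple hash partitioning on the partition field; the shard names and table are made up, and real deployments use whatever rules the user defines.

```python
# Illustrative hash partitioning over four hypothetical MySQL shards.
SHARDS = ["mysql-shard-0", "mysql-shard-1", "mysql-shard-2", "mysql-shard-3"]

def shard_for(partition_value):
    """Map a partition-field value to one MySQL instance."""
    return SHARDS[partition_value % len(SHARDS)]

def route_insert(user_id, row):
    """Rewrite an INSERT so it targets only the shard that owns user_id."""
    cols = ", ".join(row)
    vals = ", ".join(repr(v) for v in row.values())
    return shard_for(user_id), f"INSERT INTO orders ({cols}) VALUES ({vals})"

def scatter_gather(sql, execute):
    """For a query not restricted to one partition value, send the rewritten
    statement to every shard and merge the per-shard results."""
    results = []
    for shard in SHARDS:
        results.extend(execute(shard, sql))
    return results

print(route_insert(42, {"user_id": 42, "amount": 9.5}))
# scatter_gather("SELECT * FROM orders WHERE amount > 100", execute=my_runner)
```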

2.3.4. Resource management

        The UMP system uses a resource pool mechanism to manage the computing resources on its database servers, such as CPU, memory, and disk: all computing resources are placed in resource pools and allocated from them, and the resource pool is the basic unit from which resources are assigned to MySQL instances. All servers in the cluster are divided into multiple resource pools according to factors such as their hardware model and machine room, and each server is added to the corresponding pool. For each MySQL instance, the administrator specifies the resource pools in which its master and slave databases should reside, based on factors such as which machine rooms the application is deployed in and which computing resources it needs; the system's instance management service then, following the principle of load balancing, selects a lightly loaded server from the pool on which to create the MySQL instance. On top of the resource-pool division, UMP also uses Cgroups within each server to partition resources at a finer granularity, capping the resources each process group may use and keeping process groups isolated from one another.
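        As an illustration of the Cgroups-based limiting mentioned above, the sketch below writes limits into a cgroups v1 filesystem hierarchy. The paths and group names are assumptions, the operations require root, and real deployments typically drive this through dedicated tooling rather than ad-hoc scripts.

```python
# Sketch: cap CPU and memory for one MySQL instance's process group using
# cgroups v1 control files under /sys/fs/cgroup (assumed mount point).
import os

def limit_mysql_instance(name, pid, cpu_quota_us, mem_bytes):
    cpu_dir = f"/sys/fs/cgroup/cpu/ump/{name}"
    mem_dir = f"/sys/fs/cgroup/memory/ump/{name}"
    os.makedirs(cpu_dir, exist_ok=True)
    os.makedirs(mem_dir, exist_ok=True)

    # Cap CPU time per scheduling period (default 100 ms), e.g. 50000 us
    # is roughly half a core.
    with open(os.path.join(cpu_dir, "cpu.cfs_quota_us"), "w") as f:
        f.write(str(cpu_quota_us))
    # Cap the memory this instance's process group may use.
    with open(os.path.join(mem_dir, "memory.limit_in_bytes"), "w") as f:
        f.write(str(mem_bytes))
    # Move the mysqld process into both controllers' groups.
    for d in (cpu_dir, mem_dir):
        with open(os.path.join(d, "cgroup.procs"), "w") as f:
            f.write(str(pid))

# limit_mysql_instance("user-1001", pid=12345,
#                      cpu_quota_us=50_000, mem_bytes=2 * 1024**3)
```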

2.3.5. Resource Scheduling

        There are three types of users in the UMP system, namely users with relatively small data volume and traffic, medium-sized users, and users who need to split databases and tables. Multiple small-scale users can share the same MySQL instance. For medium-sized users, each user has a dedicated MySQL instance. Users can adjust the memory space and disk space according to their own needs. If the user needs more resources, they can migrate to a server with free resources or a higher configuration. For users with sub-databases and tables, they will have multiple independent MySQL instances. These instances can coexist on the same physical machine, or each instance can occupy a physical machine exclusively.

        UMP implements resource scheduling through the migration of MySQL instances. With the help of the Yugong system developed by the middleware team of Alibaba Group, UMP can achieve dynamic expansion, shrinkage and migration without downtime.

2.3.6.Resource isolation

        When multiple users share the same MySQL instance or multiple MySQL instances coexist on the same physical machine, resource isolation is necessary to protect the security of user applications and data; otherwise, one user consuming excessive system resources would seriously degrade the performance of other users. The UMP system adopts the two resource isolation methods shown in Table 6-4.

2.3.7.Data Security

        Data security is key to giving users, especially enterprise users, the confidence to use cloud database products. Databases store a great deal of business data, some of it commercial secrets whose leakage would cause losses to the enterprise. The UMP system therefore provides multiple mechanisms to ensure data security.

        (1) SSL database connection. SSL (Secure Sockets Layer) is a security protocol that provides security and data integrity for network communications. It encrypts network connections at the transport layer. The Proxy server implements the complete MySQL client/server protocol and can establish an SSL database connection with the client.
        (2) Data access IP whitelist. You can put the IP addresses that are allowed to access the cloud database into a "whitelist". Only IP addresses in the whitelist can access, and access from other IP addresses will be denied, thereby further ensuring account security.

        (3) Record user operation logs. All user operation records will be recorded to the log analysis server. By checking user operation records, hidden security vulnerabilities can be discovered.

        (4) SQL interception. The Proxy server can intercept various types of SQL statements according to requirements, such as the full table scan statement "select *".
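        The sketch below illustrates two of these checks, an IP whitelist and interception of bare full-table-scan statements, as a proxy tier might apply them; the addresses and the intercept pattern are illustrative only.

```python
import re

# Hypothetical whitelist and intercept pattern; real rules are configurable.
ALLOWED_IPS = {"10.1.2.3", "10.1.2.4"}
FULL_SCAN = re.compile(r"^\s*select\s+\*\s+from\s+\w+\s*;?\s*$", re.IGNORECASE)

def accept_connection(client_ip):
    """Only IP addresses on the whitelist may reach the cloud database."""
    return client_ip in ALLOWED_IPS

def accept_statement(sql):
    """Reject statements matching the intercept pattern (e.g. a bare SELECT *)."""
    return not FULL_SCAN.match(sql)

print(accept_connection("10.9.9.9"))                          # False: not whitelisted
print(accept_statement("SELECT * FROM big_t"))                # False: intercepted
print(accept_statement("SELECT id FROM big_t WHERE id = 1"))  # True
```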
