OpenStack Object Storage: Swift

1. Introduction to Swift

1.1. What is Swift?

Swift does not require RAID (redundant array of independent disks) and has no central unit or master control node. It applies consistent hashing and data redundancy at the software level, sacrificing a certain degree of data consistency to achieve high availability (HA) and scalability. It supports multi-tenancy as well as read and write operations on containers and objects, and is suited to storing unstructured data in Internet application scenarios.

(1) Highly available, distributed object storage service.

(2) Eventual consistency model.

(3) Suitable for storing unstructured data in Internet application scenarios (object storage).

(4) Built on relatively cheap, standard storage hardware (x86 architecture).

1.2. Structured data and unstructured data

(1) Structured data is data organized according to a clearly defined data model and format, usually presented in tabular form, such as the tables in a relational database. It has a clear schema and fixed fields, which makes it easy to organize, store and query. Common examples include dates, numbers, text, and various identifiers.

(2) Unstructured data is data that has no clear data model or format and cannot easily be presented or organized in tabular form. It includes files in many formats, such as text documents, audio files, videos, images, and emails.

1.3. Features of Swift storage:

(1) Based on a REST API, which makes access straightforward.

(2) Data is evenly distributed across the system, giving high reliability, efficient use of resources, and easy expansion.

(3) Hardware-independent, supporting a variety of standard hardware, without the need to customize specialized hardware equipment.

(4) There is no central database, and there is no single point performance bottleneck or single point of failure.

(5) Data is organized in three levels, Account/Container/Object, rather than in a nested file system hierarchy, and each item is kept in N copies (3 by default), so the data is highly reliable.

1.4. Why is there no central database in Swift?

A traditional relational database is usually centralized: all data is stored on one database server, which manages and processes every data operation request. Alternatively, the data may be spread across multiple regions while a central database stores the metadata, that is, the actual location of the data. That approach is still centralized, because it depends on the central database. In Swift, by contrast, the metadata consists of the rings and the account/container/object records, and it is itself distributed.

2. Swift principles and architecture

2.1. Consistent hashing

(1) Objects can be evenly distributed to virtual nodes in the virtual space through calculation.

(2) The amount of data that needs to be moved can be greatly reduced when adding or deleting nodes.

(3) During actual deployment, it is necessary to carefully calculate the appropriate number of virtual nodes to achieve a balance between storage space and workload.
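The properties listed above can be illustrated with a toy consistent-hash ring. This is a minimal sketch, not Swift's ring implementation: the `HashRing` class, the node names, the key names, and the choice of 100 virtual nodes per physical node are all assumptions made for illustration.

```python
import bisect
import hashlib


def _hash(key: str) -> int:
    """Map a string to a 32-bit integer using the first 4 bytes of its MD5 digest."""
    return int.from_bytes(hashlib.md5(key.encode("utf-8")).digest()[:4], "big")


class HashRing:
    """Toy consistent-hash ring; each physical node owns `vnodes` virtual nodes."""

    def __init__(self, nodes, vnodes=100):
        self.vnodes = vnodes
        self._points = []  # sorted hash positions of the virtual nodes
        self._owner = {}   # hash position -> physical node name
        for node in nodes:
            self.add_node(node)

    def add_node(self, node):
        for i in range(self.vnodes):
            point = _hash(f"{node}#{i}")
            if point in self._owner:
                continue  # skip the rare case of two virtual nodes colliding
            bisect.insort(self._points, point)
            self._owner[point] = node

    def lookup(self, key):
        # First virtual node clockwise from the key's hash, wrapping around.
        idx = bisect.bisect_right(self._points, _hash(key)) % len(self._points)
        return self._owner[self._points[idx]]


# Hypothetical object paths, used only to measure how many keys move.
keys = [f"AUTH_demo/photos/img{i}.jpg" for i in range(10000)]
ring = HashRing(["node1", "node2", "node3", "node4"])
before = {k: ring.lookup(k) for k in keys}
ring.add_node("node5")
after = {k: ring.lookup(k) for k in keys}
moved = sum(1 for k in keys if before[k] != after[k])
print(f"{moved / len(keys):.1%} of keys moved after adding node5")
```

Every key that changes owner moves to the new node, and roughly one fifth of the keys are expected to move, instead of nearly all of them as with a plain `hash(key) % node_count` scheme.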

Application of consistent hashing in Swift: when an object is stored, its Account/Container/Object path is used as the key, and the hash of that key decides which node stores it, which is how the data is distributed. If keys were stored in order, I/O hot spots would appear and the whole storage pool would deliver only the I/O performance of a single hard disk. At the same time, the full hash space is too large to look up efficiently, so the hash value is shifted right and only its high-order bits are kept. In the example here, 3 bits are kept, which divides the hash space into 2^3 = 8 equal partitions. Even with only 8 partitions, this reduces the number of lookups and improves performance.
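The hash-and-shift step described above can be sketched as follows. This is a simplified illustration under stated assumptions: it hashes the path with MD5 and keeps the top bits of the first 32 bits, and it omits the cluster-specific salt that a real deployment mixes into the path before hashing. `partition_for`, `PART_POWER`, and the account, container, and object names are hypothetical.

```python
import hashlib
import struct

PART_POWER = 3                # assumption: 2**3 = 8 partitions, as in the text
PART_SHIFT = 32 - PART_POWER  # right shift that keeps only the top PART_POWER bits


def partition_for(account, container, obj):
    """Return the partition number for an Account/Container/Object path."""
    path = f"/{account}/{container}/{obj}"
    digest = hashlib.md5(path.encode("utf-8")).digest()
    top32 = struct.unpack_from(">I", digest)[0]  # first 4 bytes, big-endian
    return top32 >> PART_SHIFT


# Count how 8000 hypothetical objects spread over the 8 partitions.
counts = [0] * (2 ** PART_POWER)
for i in range(8000):
    counts[partition_for("AUTH_demo", "photos", f"img{i}.jpg")] += 1
print(counts)
```

Because the hash output is close to uniform, sequentially named objects still land in all partitions in roughly equal numbers, which is what avoids the I/O hot spots mentioned above.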

2.2. Data consistency model

N: The total number of copies of the data. W: The number of copies for which the write operation is confirmed. R: The number of copies for the read operation.

(1) Strong consistency: R + W > N, to ensure that the read and write operations on the replicas will intersect, ensuring that the latest version can be read.

(2) Weak consistency: R + W <= N. If the copy sets of read and write operations do not intersect, dirty data may be read; suitable for scenarios with low consistency requirements.

Swift's default configuration is N=3, W=2, and R=1 or 2. Each object has 3 copies, which are stored on nodes in different zones as far as possible. W=2 means that a write is considered successful once at least 2 copies have been updated. With R=1, a read returns as soon as one copy has been read successfully, so an old version may be returned (weak consistency model). With R=2, the X-Newest: true header must be added to the read request; the metadata of two copies is then read at the same time and their timestamps are compared to determine which version is the latest (strong consistency model).
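The R + W > N rule can be checked by brute force for small values. The sketch below is only an illustration of the quorum arithmetic, not Swift code; `is_strong` and `always_overlap` are hypothetical helper names.

```python
from itertools import combinations


def is_strong(n, w, r):
    """Rule from the text: reads and writes must overlap when R + W > N."""
    return r + w > n


def always_overlap(n, w, r):
    """Brute force: does every R-replica read set share a replica with every W-replica write set?"""
    replicas = range(n)
    return all(
        set(ws) & set(rs)
        for ws in combinations(replicas, w)
        for rs in combinations(replicas, r)
    )


for r in (1, 2):
    print(f"N=3 W=2 R={r}: strong={is_strong(3, 2, r)}, overlap={always_overlap(3, 2, r)}")
```

With N=3 and W=2, a read of R=1 copy can hit the one replica that the write has not yet reached, while a read of R=2 copies always includes at least one updated replica.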

2.3. Ring

(1) Container Ring: The Container Ring manages containers in the Swift storage system. A container is a logical unit for organizing and managing objects in Swift, similar to a folder. Each container has a unique ID, and the Container Ring records the location of each container in the Swift storage cluster, such as which devices the container is stored on and the number of replicas. When a client needs to access an object in a container, Swift uses the Container Ring to locate the corresponding storage device and returns the relevant data.

(2) Account Ring: Account Ring is responsible for managing Accounts in the Swift storage system. Accounts are logical collection units used by Swift to organize and manage containers, similar to tenants. Each account also has a unique ID, and the Account Ring records the location information of each account in the Swift storage cluster, including which devices the account is stored on and the number of replicas. Through Account Ring, Swift can quickly find the corresponding storage device based on the account ID and perform related operations.

(3) Object Ring: The Object Ring records the location of each object in the Swift storage cluster, including which devices the object is stored on and the number of replicas. When a client needs to access a specific object, Swift uses the Object Ring to locate the corresponding storage device and returns the required data.

To read a piece of data, Swift must locate the container within the account, the object within the container, and finally the data corresponding to the object on disk. Each ring holds the mapping between virtual nodes (partitions) and devices. Note that different replicas of the same partition must not be stored in the same zone.
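A ring can be pictured as a table from partition number to the devices that hold its replicas. The sketch below is a deliberately naive illustration of that mapping and of the one-replica-per-zone constraint. It is not Swift's ring builder, which also accounts for device weights and balance; the `Device` fields, `build_ring`, and the IP addresses are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Device:
    id: int
    zone: int
    ip: str


def build_ring(devices, partitions, replicas=3):
    """Assign each partition `replicas` devices, each taken from a different zone."""
    by_zone = {}
    for dev in devices:
        by_zone.setdefault(dev.zone, []).append(dev)
    zones = sorted(by_zone)
    if len(zones) < replicas:
        raise ValueError("need at least as many zones as replicas")
    ring = []
    for part in range(partitions):
        row = []
        for r in range(replicas):
            # Consecutive zones (modulo the zone count) are always distinct.
            zone_devs = by_zone[zones[(part + r) % len(zones)]]
            row.append(zone_devs[part % len(zone_devs)])
        ring.append(row)
    return ring


# Four zones with two devices each; addresses are made up for the example.
devices = [Device(id=z * 2 + d, zone=z, ip=f"10.0.{z}.{d + 1}")
           for z in range(4) for d in range(2)]
ring = build_ring(devices, partitions=8)
for part, devs in enumerate(ring):
    print(part, [(d.id, d.zone) for d in devs])
```

Looking up an object then takes two steps: compute its partition number from the hash of its path, and read the device list for that partition from the ring.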

2.4. Swift API

Swift provides an HTTP-based REST service interface through the Proxy Server to perform CRUD and other operations on accounts, containers, and objects.
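The REST interface maps each level of the hierarchy to a URL path of the form /v1/{account}/{container}/{object}, with the HTTP method selecting the operation and the X-Auth-Token header carrying the credentials. The sketch below only builds the requests and does not send them. The endpoint, account name, token, and the `swift_request` helper are hypothetical placeholders.

```python
from urllib.parse import quote
from urllib.request import Request

ENDPOINT = "http://swift.example.com:8080/v1"  # hypothetical proxy server URL
ACCOUNT = "AUTH_demo"                          # hypothetical account name
TOKEN = "replace-with-a-real-token"            # placeholder, not a valid token


def swift_request(method, container=None, obj=None, data=None):
    """Build (but do not send) a request for an account, container, or object."""
    parts = [ACCOUNT]
    if container is not None:
        parts.append(container)
        if obj is not None:
            parts.append(obj)
    url = ENDPOINT + "/" + "/".join(quote(p) for p in parts)
    return Request(url, data=data, method=method, headers={"X-Auth-Token": TOKEN})


examples = [
    swift_request("GET"),                               # list containers in the account
    swift_request("PUT", "photos"),                     # create a container
    swift_request("PUT", "photos", "cat.jpg", b"..."),  # upload an object
    swift_request("GET", "photos", "cat.jpg"),          # download the object
    swift_request("DELETE", "photos", "cat.jpg"),       # delete the object
]
for req in examples:
    print(req.get_method(), req.full_url)
```

Passing any of these `Request` objects to `urllib.request.urlopen` would send it, given a reachable proxy server and a valid token.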

3. Swift’s architecture

Swift divides the entire storage into three levels: Account, Container and Object. An Account here is just a storage area and does not represent an "account" in the authentication system, although each Account usually corresponds to one tenant. This is why an OpenStack user working with Swift sees only Containers and Objects, not the Account. If the user switches to another tenant, they see the Containers and Objects under that tenant's Account instead.

Reposted from: blog.csdn.net/m0_73901077/article/details/134804777