Road to Architect: Storage Architecture

Architects cannot skip storage design when designing a system. This article covers the storage knowledge they need: storage usage types, common protocols, connection methods, and typical architecture cases in distributed storage.

Storage classification

Storage architecture

From an architectural perspective, storage can generally be divided into centralized storage and distributed storage.

The main feature of "centralized storage" is that all data is stored in one place. Remote terminals in offices at different locations connect to a central computer (host) over cables, which ensures that every terminal works with the same information.

Centralized storage is generally commercial and relatively expensive.

"Distributed storage" scatters data across multiple independent devices. It uses a scalable system structure in which multiple storage servers share the storage load and location servers track where data is stored. This improves the reliability, availability, and access efficiency of the system, and makes it easy to expand.

Distributed storage is generally self-built in combination with management software.
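The idea of storage servers sharing the load while a location server tracks where data lives can be sketched in a few lines. This is only a toy model under stated assumptions: the class names LocationServer and Cluster, the server names, and the hash-modulo placement rule are all hypothetical and are not taken from any particular product.

```python
import hashlib


class LocationServer:
    """Toy location server: decides which storage server holds each key."""

    def __init__(self, storage_servers):
        self.storage_servers = list(storage_servers)

    def locate(self, key):
        # Hash the key so that data is spread across the servers
        # (hypothetical placement rule, chosen for simplicity).
        digest = hashlib.md5(key.encode("utf-8")).hexdigest()
        index = int(digest, 16) % len(self.storage_servers)
        return self.storage_servers[index]


class Cluster:
    """Toy cluster: one in-memory dict stands in for each storage server."""

    def __init__(self, server_names):
        self.locator = LocationServer(server_names)
        self.disks = {name: {} for name in server_names}

    def put(self, key, data):
        server = self.locator.locate(key)
        self.disks[server][key] = data
        return server

    def get(self, key):
        server = self.locator.locate(key)
        return self.disks[server][key]


cluster = Cluster(["storage-1", "storage-2", "storage-3"])
for i in range(9):
    cluster.put(f"file-{i}.txt", f"payload {i}".encode())
```

Reading a key back goes through the same locate step, so a client never needs to know in advance which server holds the data. A real system would also have to handle replication and servers joining or leaving, which this sketch ignores.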

Storage usage

From the user's perspective, storage can be classified into three categories: block storage, file storage, and object storage.

Usage classification

"Block storage:" It generally appears as a volume or a hard disk. The object being operated on is the disk: the entire raw disk space is mapped to the host. The operating system must therefore partition and format the attached raw disk before it can be used. Block storage by itself does not provide file sharing.

"File storage:" It generally appears as directories and files. Data is stored and accessed as files organized in a directory structure. File storage also has to be mounted, but once mounted it appears as a directory whose files can be accessed directly, with no formatting required.

"Object storage:" The object being operated on is an object. It is essentially a key-value storage system that does not need to be mounted and is accessed directly through an application interface (API).

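The difference between the three usage types is easiest to see in how data is addressed. The sketch below is an in-memory illustration only: a bytearray stands in for a raw disk, a temporary directory stands in for a mounted file share, and a dict stands in for an object store. All function names (write_block, read_block, put_object, get_object) are hypothetical.

```python
import os
import tempfile

# Block storage: the consumer sees a flat range of fixed-size blocks and
# addresses data by block number. A bytearray stands in for the raw disk.
BLOCK_SIZE = 512
disk = bytearray(BLOCK_SIZE * 8)


def write_block(block_no, data):
    if len(data) > BLOCK_SIZE:
        raise ValueError("data does not fit in one block")
    start = block_no * BLOCK_SIZE
    disk[start:start + len(data)] = data


def read_block(block_no):
    start = block_no * BLOCK_SIZE
    return bytes(disk[start:start + BLOCK_SIZE])


write_block(2, b"raw bytes")
block_data = read_block(2).rstrip(b"\x00")

# File storage: the consumer sees directories and files and addresses
# data by path. A temporary directory stands in for the mount point.
with tempfile.TemporaryDirectory() as mount_point:
    os.makedirs(os.path.join(mount_point, "reports"))
    path = os.path.join(mount_point, "reports", "2020.txt")
    with open(path, "w") as f:
        f.write("quarterly report")
    with open(path) as f:
        file_data = f.read()

# Object storage: the consumer sees a flat key-value namespace and
# addresses data by key through an API. Nothing is mounted.
bucket = {}


def put_object(key, body, metadata=None):
    bucket[key] = {"body": body, "metadata": metadata or {}}


def get_object(key):
    return bucket[key]["body"]


put_object("reports/2020.txt", b"quarterly report", {"content-type": "text/plain"})
object_data = get_object("reports/2020.txt")
```

Note that the "/" in the object key is just part of the name. There is no real directory behind it, which is one reason object stores scale out more easily than file systems.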
Storage protocols

From the perspective of protocols, storage can be classified by the NFS, CIFS, and iSCSI protocols.

"NFS (Network File System)" is one of the mainstream shared file systems across heterogeneous platforms and is mainly used in Unix environments. With NFS, users and programs can access files on remote systems as if they were local files, so each node can use network resources as easily as local ones. In other words, NFS provides remote access to and sharing of files across different computers, operating systems, network architectures, and transport protocols. "Used for shared file storage."

"CIFS (Common Internet File System)" is mainly used in NT/Windows environments. It runs on top of TCP/IP, and with it Unix machines can be seen by Windows computers in Network Neighborhood. "Used for shared file storage."

"iSCSI (Internet SCSI, SCSI over IP)" carries SCSI commands and data blocks over a TCP/IP network, so a host can use a remote disk as if it were locally attached. It is not tied to a particular operating system. "Used for block storage."

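The protocol classification above can be captured as a small lookup table. This is a minimal sketch: the StorageProtocol class and the protocols_for helper are hypothetical names, and the ports listed are the commonly used default TCP ports for each protocol (CIFS is shown on 445, the port used when it runs directly over TCP).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StorageProtocol:
    name: str
    storage_type: str  # "file" or "block"
    default_port: int  # usual default TCP port


PROTOCOLS = {
    "NFS": StorageProtocol("NFS", "file", 2049),
    "CIFS": StorageProtocol("CIFS", "file", 445),
    "iSCSI": StorageProtocol("iSCSI", "block", 3260),
}


def protocols_for(storage_type):
    """Return the protocol names that serve the given storage type."""
    return sorted(
        p.name for p in PROTOCOLS.values() if p.storage_type == storage_type
    )
```

For example, protocols_for("file") returns the two file-sharing protocols, while protocols_for("block") returns only iSCSI.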
Storage connection methods

From the perspective of how storage is connected, it can be classified into three categories: DAS, NAS, and SAN.

"DAS (Direct Attached Storage):"
Direct attached storage works the same way as the storage in an ordinary PC. External storage devices are connected directly to the server's internal bus and form part of the server as a whole. Any access to resources on the storage device must go through the server.

DAS architecture

"NAS (Network Attached Storage):"
In a NAS structure, the storage system is no longer attached to a specific server or client through an I/O bus. Instead it connects directly to the network through a network interface, and users access it over the network. A NAS is in effect a storage device with a "thin server": it acts as a dedicated file server rather than a traditional general-purpose server, with most functions removed and only the file system functions needed for storage service retained.

NAS architecture

"SAN (Storage Area Network):"
A SAN is a network-centric storage structure. Unlike ordinary Ethernet, a SAN sits behind the servers: it is a high-performance "dedicated network (Fibre Channel)" built to connect servers with storage devices such as disk arrays and tape libraries.

SAN architecture

Mainstream storage vendors and products

Commercial storage vendors include: EMC, NetApp, DELL, Huawei, Sugon, etc.

Open source storage products include:

  • FastDFS (object)

  • Swift (object)

  • HDFS (file)

  • Lustre (file)

  • GlusterFS (file)

  • Ceph (block, file, object)

Distributed storage architecture

Centralized storage generally uses commercial software, and the vendor is responsible for installation and configuration. Here we mainly discuss distributed storage architecture.

According to how metadata is managed, storage architectures can be divided into two modes: symmetric and asymmetric.

In a symmetric architecture, all nodes play an equal role and jointly manage and maintain metadata. The nodes synchronize information and coordinate mutual-exclusion locks over a high-speed network. (The same components are installed on every node.)

In an asymmetric architecture, one or more nodes are responsible for managing metadata, and the other nodes must communicate with the metadata nodes frequently to obtain the latest metadata, such as directory listings and file attributes. (Metadata nodes and storage nodes are separate.)
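The contrast between the two modes can be sketched as follows. This is an illustration under stated assumptions, not how any specific product works: the MetadataServer class, its round-robin placement, and the symmetric_locate function are hypothetical. The symmetric side uses hashing as one possible way for every node to reach the same answer without asking a dedicated metadata node; a real symmetric system would still need to synchronize state between nodes.

```python
import hashlib

NODES = ["node-a", "node-b", "node-c"]


class MetadataServer:
    """Asymmetric mode: a dedicated node records where each file lives."""

    def __init__(self, storage_nodes):
        self.storage_nodes = storage_nodes
        self.locations = {}
        self.lookups = 0  # every read costs a round trip to this node

    def assign(self, filename):
        # Simple round-robin placement (hypothetical policy).
        node = self.storage_nodes[len(self.locations) % len(self.storage_nodes)]
        self.locations[filename] = node
        return node

    def lookup(self, filename):
        self.lookups += 1
        return self.locations[filename]


def symmetric_locate(filename, nodes):
    """Symmetric mode: every node runs the same function on the same
    inputs and gets the same answer, so no central lookup is needed."""
    digest = hashlib.sha256(filename.encode("utf-8")).digest()
    return nodes[int.from_bytes(digest[:8], "big") % len(nodes)]


mds = MetadataServer(NODES)
mds.assign("a.log")
mds.assign("b.log")
```

The lookups counter makes the trade-off visible: in the asymmetric mode the metadata node is on the path of every access and can become a bottleneck, while in the symmetric mode the cost moves into keeping all nodes consistent.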

Next, let's look at a representative product for each of the two architecture modes: FastDFS and Swift.

FastDFS typical architecture

FastDFS architecture

FastDFS uses an asymmetric architecture made up of Tracker servers and Storage servers.

The Tracker server acts as the central node and manages the cluster topology. Its main functions are load balancing and scheduling.

Storage servers are organized into volumes (groups). A volume contains multiple storage servers, which mirror one another so that each holds a backup of the others' data. The usable capacity of a volume is that of the storage server with the smallest capacity in it, so the storage servers in a group should be configured as identically as possible to avoid wasting storage space.
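The capacity rule is simple arithmetic and can be checked with a short sketch. The function names, group names, and capacity figures below are made up for illustration, and the assumption that total cluster capacity is the sum over independent volumes is mine rather than stated above.

```python
def volume_capacity(server_capacities_gb):
    """Usable capacity of one volume. Servers in a volume mirror each
    other, so the smallest server sets the limit."""
    return min(server_capacities_gb)


def wasted_space(server_capacities_gb):
    """Space on larger servers that can never be used."""
    usable = volume_capacity(server_capacities_gb)
    return sum(capacity - usable for capacity in server_capacities_gb)


def cluster_capacity(volumes):
    """Total usable capacity, assuming volumes are independent."""
    return sum(volume_capacity(caps) for caps in volumes.values())


# Hypothetical cluster: group1 is evenly configured, group2 is not.
volumes = {
    "group1": [500, 500, 500],
    "group2": [500, 1000],
}
```

With these figures group1 wastes nothing, while group2 offers only 500 GB of usable space and leaves 500 GB of the larger server idle, which is why evenly configured servers within a group are recommended.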

Swift typical architecture

Swift architecture

Swift adopts a completely symmetrical, resource-oriented distributed system architecture design, and all components are scalable.

Swift divides the entire storage into three levels: Account, Container and Object.
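The three levels nest inside one another, and the same nesting shows up in the resource path used to address an object. The sketch below models the hierarchy with nested dicts and builds a path of the form /v1/account/container/object. The helper names and the sample account, container, and object names are hypothetical, and this is a toy model rather than a client for a real Swift cluster.

```python
from urllib.parse import quote

# Toy in-memory model: account -> container -> object name -> body.
store = {}


def put(account, container, obj, body):
    store.setdefault(account, {}).setdefault(container, {})[obj] = body


def list_objects(account, container):
    return sorted(store[account][container])


def object_path(account, container, obj, api_version="v1"):
    """Build the resource path for an object. Slashes inside the object
    name are kept, because they are simply part of the name."""
    return "/" + "/".join([
        api_version,
        quote(account, safe=""),
        quote(container, safe=""),
        quote(obj, safe="/"),
    ])


put("AUTH_demo", "photos", "2020/cat 1.jpg", b"...")
put("AUTH_demo", "photos", "2020/dog.jpg", b"...")
path = object_path("AUTH_demo", "photos", "2020/cat 1.jpg")
```

Containers cannot be nested, so any apparent folder structure such as "2020/" lives entirely inside the object name.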

Summary

This article introduced the basics of storage. The material is fairly dry, so a general understanding is enough; the part to focus on is distributed storage architecture.



Origin blog.csdn.net/jianzhang11/article/details/108675600