Learning about Cloud Computing (4)

4. Basic knowledge of storage in cloud computing

1. Storage Architecture in Cloud Computing Virtualization

① Virtualized storage

In the virtualized storage architecture, the bottom layer is the physical disk.

The underlying hardware forms a storage pool, which is presented as either NAS storage or SAN storage. NAS storage already provides a file system. SAN storage must first be logically divided into logical volumes, and a file system is then added to those volumes. In both cases the result is a shared directory, and each virtual disk corresponds to a file in that directory.

In essence, virtualization turns a machine into a file or folder stored in a shared directory, and that file or folder corresponds to a virtual disk.

② Non-virtualized storage

In a non-virtualized storage architecture, the bottom layer is the physical disk.

A server's local hard disks can be used for both virtualized and non-virtualized storage.

A server's local hard disks are logically divided into logical volumes. With distributed storage, a distributed storage pool is created first and then logically divided into logical volumes. These logical volumes do not need a shared directory; they are presented directly as the virtual disks of virtual machines.

2. Physical disk type

2.1 SATA disk

SATA stands for Serial Advanced Technology Attachment, and a hard disk with a SATA (Serial ATA) port is also called a serial hard disk. A common SATA disk spins at 7,200 rpm. SATA uses a serial connection, and the Serial ATA bus uses an embedded clock signal, which gives it stronger error correction. Its biggest difference from earlier interfaces is that it checks the transmitted commands as well as the data and corrects errors automatically when it finds them, which greatly improves the reliability of data transmission. The serial interface also has a simple structure and supports hot plugging.

SATA disks typically offer more capacity than SAS disks at a lower price.

2.2 SAS disk

SAS (Serial Attached SCSI) is a new generation of SCSI technology. Like SATA, it uses serial transmission to achieve higher speeds, and its thinner cables free up space inside the chassis. SAS was developed as the successor to the parallel SCSI interface, with the goals of improving the performance, availability, and scalability of storage systems while remaining compatible with SATA hard drives. A common SAS disk spins at 15,000 rpm. Thanks to the more advanced interface design and the higher rotation speed, SAS disks read and write faster than SATA disks.

SAS disks are mainly used for applications that need high data throughput, low latency, and high reliability, typically enterprise-level storage.

2.3 NL-SAS disk

NL-SAS (Near Line SAS) combines the SAS disk interface with a SATA disk body. An NL-SAS hard disk spins at only 7,200 rpm, so it performs worse than a SAS hard disk. However, because it uses the SAS interface, its addressing and transfer speed are better than those of a SATA disk.

NL-SAS disks perform better than SATA disks, offer more capacity than SAS disks, and are priced between the two.

2.4 SSD disk

An SSD (Solid State Disk, also called a solid state drive) is a disk built from arrays of solid-state electronic memory chips. It consists of a control unit and storage units (flash chips or DRAM chips). SSDs use the same interface specifications and definitions, functions, and usage as ordinary hard disks, and they also match ordinary hard disks in shape and size. Compared with traditional mechanical hard disks, SSDs read and write faster, weigh less, use less energy, and are smaller, but their service life is limited and they are relatively expensive.

2.5 Physical disk performance price comparison

3. Centralized storage and distributed storage

3.1 Centralized storage

Centralized storage places the hard disks together in a disk enclosure, combines them with RAID to form a resource pool, and provides that pool to hosts.

RAID stands for Redundant Array of Independent Disks. In plain terms, it combines multiple hard disks into a single disk array that is managed as one unit.

RAID reads and writes across multiple disks in parallel to improve read and write speed, and it uses parity and hot spare techniques to protect against data loss and improve data safety.

The commonly used RAID types are: RAID 0, RAID 1, RAID 5, RAID 6, RAID 01, and RAID 10.

The less commonly used RAID types are: RAID 2, RAID 3, RAID 4, RAID 7, RAID 50, and RAID 53.

RAID 0: Data is striped across two or more hard disks that work at the same time, which increases read and write speed. If any one disk fails, all data is lost, so data safety cannot be guaranteed.

RAID 1: Each piece of data is written as two copies to two different hard disks. This protects the data and can improve read speed to some extent, but it costs half of the disk space.

RAID 5: Adds parity. In a three-disk example, each stripe stores two different data blocks and one parity block, and the parity blocks are spread across the disks rather than kept on a single disk. The contents of any one disk can be calculated from the other two, which protects the data while keeping good read and write speed. If two disks fail at the same time, the data cannot be recovered.

RAID 6: Stores two independent parity values for each stripe. In a four-disk example, each stripe holds two different data blocks and two different parity blocks, so the data can still be recovered if any two disks fail. It is safer than RAID 5, but its space utilization is lower.
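The parity idea behind RAID 5 can be illustrated with XOR. The following is a minimal sketch, not how any particular RAID controller is implemented; the block contents and the helper name `xor_blocks` are made up for illustration.

```python
from functools import reduce


def xor_blocks(*blocks: bytes) -> bytes:
    """XOR equal-length blocks together, byte by byte."""
    if len({len(block) for block in blocks}) != 1:
        raise ValueError("all blocks must have the same length")
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# One stripe on a hypothetical three-disk array: two data blocks plus parity.
data_a = b"cloud-st"
data_b = b"orage-01"
parity = xor_blocks(data_a, data_b)

# Suppose the disk holding data_b fails: rebuild it from the two survivors.
rebuilt_b = xor_blocks(data_a, parity)
print(rebuilt_b == data_b)
```

Because XOR is its own inverse, any single missing block in a stripe can be recomputed from the remaining blocks. Losing two blocks of the same stripe leaves too little information, which is why RAID 5 cannot survive two failed disks.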

3.2 Centralized storage type

SAN stands for Storage Area Network, and NAS stands for Network Attached Storage.

① Similarities between SAN and NAS

Both SAN and NAS systems are redundant storage systems that use RAID.

A redundant storage system is able to recover after the failure of one or more components, making it more stable than other types of storage. SAN and NAS solutions are very useful for those who need to store large amounts of data and need to be able to access this data stably and reliably.

② Differences between SAN and NAS

  • **SAN storage devices are connected via fiber optics, while NAS storage devices are connected via TCP/IP.** For this reason, SAN is usually used for advanced solutions, while NAS solutions are more accessible to home users and small businesses. To connect to a SAN, a device must be able to use SCSI over Fibre Channel. NAS is simpler by comparison: anything can connect to a NAS solution through Ethernet.

  • **SAN storage devices access data blocks, while NAS storage devices access individual files.** Either option may be preferable; it depends on the performance needs, the data, and the architecture of the system. For advanced applications that are data and resource intensive, block access may be preferable. For general stored data, NAS may be more straightforward and can lead to better performance.

  • **SAN storage connects multiple storage devices, while NAS storage operates as a single dedicated device.** A SAN solution essentially creates a group of storage devices that all operate on the same network. A NAS, on the other hand, keeps its storage on a single device. Functionally, this means that the two operate in very different ways: SAN is primarily hardware-dependent, while NAS is primarily network-dependent.

  • SAN storage provides raw devices, which look like empty hard disks on the host side; NAS storage has a file system, which looks like a directory on the host side.

3.3 Distributed storage

The real data of the physical hosts may already be stored in centralized storage that multiple hosts share. So that the hosts' own disks do not go to waste, distributed storage takes the idle hard disks on each host and uses a copy mechanism to combine them into a resource pool that all hosts can use.

3.4 Copy mechanism

① Data writing

Taking three copies as an example: data is first written into the distributed storage pool, and when it is flushed to disk it is replicated into three copies, each saved on a different hard disk.

② Data reading

Normally only one copy of the data is read. The second and third copies are read only when the disk holding the first copy is damaged. The copy mechanism effectively protects the data.
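The write and read paths above can be sketched as a small in-memory model. This is an illustration under stated assumptions, not the placement algorithm of any real product: the class `ReplicatedPool`, the hash-based placement, and the disk names are all hypothetical.

```python
import hashlib


class ReplicatedPool:
    """Toy storage pool that keeps several copies of each item on different disks."""

    def __init__(self, disk_names, copies=3):
        if copies > len(disk_names):
            raise ValueError("need at least as many disks as copies")
        self.disks = {name: {} for name in disk_names}
        self.failed = set()  # names of disks that are currently damaged
        self.copies = copies

    def placement(self, key):
        """Pick `copies` distinct disks for a key (hash-based, for illustration only)."""
        names = sorted(self.disks)
        start = int(hashlib.sha256(key.encode()).hexdigest(), 16) % len(names)
        return [names[(start + i) % len(names)] for i in range(self.copies)]

    def write(self, key, data):
        # Every copy lands on a different disk.
        for name in self.placement(key):
            self.disks[name][key] = data

    def read(self, key):
        # Read the first copy; fall back to later copies only if a disk is damaged.
        for name in self.placement(key):
            if name not in self.failed:
                return self.disks[name][key]
        raise OSError("all copies are unavailable")


pool = ReplicatedPool(["disk1", "disk2", "disk3", "disk4", "disk5"])
pool.write("block-42", b"payload")
pool.failed.add(pool.placement("block-42")[0])  # the first copy's disk breaks
print(pool.read("block-42"))
```

The read still succeeds after the first disk is marked as failed, because the second copy sits on a different disk.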

3.5 Common Distributed Storage Products

Ceph, Hadoop HDFS, Huawei Cloud FusionStorage, VMware vSAN

4. Virtualized storage and non-virtualized storage

4.1 Transformation path of virtualized storage in cloud computing

4.2 Non-virtualized storage conversion path in cloud computing

4.3 The difference between virtualized storage and non-virtualized storage

With virtualized storage, compute clusters use storage that already has a file system. With non-virtualized storage, compute clusters use storage without a file system, and the volume is formatted by the upper-layer operating system (the operating system of the virtual machine).

4.4 Relationship between RAID and LUN

A RAID group is built from several hard disks and is equivalent to one large physical volume made up of those disks as a whole. Hosts cannot use the physical volume directly. Instead, one or more logical units of a specified capacity are created on top of it. These logical units are called LUNs (Logical Unit Numbers) and can be mapped to hosts as basic block devices.
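As a rough sketch of this relationship, the code below computes the usable capacity of a RAID group and then carves LUNs out of it. The names `usable_capacity_gb` and `RaidGroup` are hypothetical, and the capacity rules are the usual textbook ones, not the behavior of a specific array.

```python
def usable_capacity_gb(level: str, disk_count: int, disk_gb: int) -> int:
    """Textbook usable capacity for a few RAID levels (equal-sized disks assumed)."""
    if level == "RAID0":
        minimum, usable_disks = 2, disk_count
    elif level == "RAID1":
        # Assumption: all disks mirror each other, so one disk's worth is usable.
        minimum, usable_disks = 2, 1
    elif level == "RAID5":
        minimum, usable_disks = 3, disk_count - 1  # one disk's worth of parity
    elif level == "RAID6":
        minimum, usable_disks = 4, disk_count - 2  # two disks' worth of parity
    else:
        raise ValueError(f"unsupported RAID level: {level}")
    if disk_count < minimum:
        raise ValueError(f"{level} needs at least {minimum} disks")
    return usable_disks * disk_gb


class RaidGroup:
    """A physical volume from which LUNs of a specified capacity are created."""

    def __init__(self, level: str, disk_count: int, disk_gb: int):
        self.capacity_gb = usable_capacity_gb(level, disk_count, disk_gb)
        self.luns = {}  # LUN name -> size in GB

    @property
    def free_gb(self) -> int:
        return self.capacity_gb - sum(self.luns.values())

    def create_lun(self, name: str, size_gb: int) -> None:
        if name in self.luns:
            raise ValueError(f"LUN {name} already exists")
        if size_gb <= 0 or size_gb > self.free_gb:
            raise ValueError("requested size does not fit in the RAID group")
        self.luns[name] = size_gb


group = RaidGroup("RAID5", disk_count=4, disk_gb=1000)
group.create_lun("lun0", 2000)
group.create_lun("lun1", 500)
print(group.capacity_gb, group.free_gb)  # 3000 500
```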

[Figure: the process of creating a LUN]

4.5 File system

Common file systems: virtualized cluster file system, NAS storage file system, operating system file system

[Figure: the process of mapping files to disk]

Formatting is the process of creating file system blocks.

A file system block leads to the corresponding logical extent in LVM, and the logical extent records which sector and which track on the disk hold the file. Through the file system, space for files can be allocated and files can be looked up.
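The mapping from a file to blocks and then to sectors can be sketched as follows. This is a deliberately simplified model that skips the LVM layer and assumes blocks map one-to-one onto consecutive sectors; `TinyFileSystem` and the sizes used are hypothetical.

```python
SECTOR_BYTES = 512   # assumed sector size
BLOCK_BYTES = 4096   # assumed file system block size


class TinyFileSystem:
    """Toy file system: formatting creates blocks, files are lists of block numbers."""

    def __init__(self, total_blocks: int):
        self.free_blocks = list(range(total_blocks))  # created by "formatting"
        self.files = {}  # file name -> list of block numbers

    def create(self, name: str, size_bytes: int) -> None:
        needed = -(-size_bytes // BLOCK_BYTES)  # ceiling division
        if needed > len(self.free_blocks):
            raise OSError("not enough free blocks")
        self.files[name] = [self.free_blocks.pop(0) for _ in range(needed)]

    def sectors(self, name: str):
        """Return the (first, last) sector range behind each block of a file."""
        per_block = BLOCK_BYTES // SECTOR_BYTES
        return [(b * per_block, b * per_block + per_block - 1) for b in self.files[name]]


fs = TinyFileSystem(total_blocks=8)
fs.create("a.txt", 5000)   # needs two 4096-byte blocks
fs.create("b.txt", 100)    # needs one block
print(fs.files["a.txt"], fs.sectors("b.txt"))
```

Looking up a file means finding its block list, and each block number then identifies a fixed range of sectors on the underlying device.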

5. Virtual machine disk introduction

From the user's perspective, the virtual disk is no different from the physical machine disk, but from the administrator's perspective, it is a file.

Common virtual machine disk formats:

| Virtual machine disk file format | Supported by |
| --- | --- |
| RAW | Common to all manufacturers |
| VMDK | VMware |
| VHD | Microsoft Hyper-V, Huawei FusionCompute |
| QCOW | Format specific to QEMU or KVM virtualization platforms |
| IS | Format specific to QEMU or KVM virtualization platforms |
| VDI | Oracle |
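Since a virtual disk is just a file to the administrator, its format can often be guessed from well-known signature bytes. The sketch below is a heuristic under stated assumptions, not a full parser: it checks only the QCOW magic at the start of the file, the `KDMV` magic of VMDK sparse extents, and the `conectix` cookie in a VHD footer. The function name `guess_disk_format` is hypothetical.

```python
def guess_disk_format(image: bytes) -> str:
    """Guess a virtual disk format from signature bytes (heuristic only)."""
    if image.startswith(b"QFI\xfb"):
        return "QCOW"
    if image.startswith(b"KDMV"):
        # Only VMDK sparse extents start this way; text descriptors do not.
        return "VMDK"
    if image[-512:].startswith(b"conectix"):
        # A VHD keeps a 512-byte footer at the end of the file.
        return "VHD"
    # RAW images have no header at all, so they cannot be told apart here.
    return "RAW or unknown"


fake_qcow = b"QFI\xfb" + bytes(60)
fake_vhd = bytes(1024) + b"conectix" + bytes(504)
print(guess_disk_format(fake_qcow), guess_disk_format(fake_vhd))
```

In practice a tool such as `qemu-img info` is the more reliable way to inspect an image; the sketch only shows that the format is a property of the file's contents.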

6. Storage features of Huawei virtualization products

6.1 Storage Architecture of Huawei Virtualization Products

6.2 Huawei Virtual Disk Features

① Type

  • Normal: The virtual disk is used only by its own virtual machine.

  • Shared: Multiple virtual machines can read from and write to the same virtual disk at the same time.

② Configuration mode

  • Normal: All of the required physical space is allocated at once; reads and writes are fast.
  • Thin provisioning: The required space is promised up front, but physical space is allocated only when it is actually needed; this saves space.
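The difference between the two configuration modes can be sketched with a toy model. This is an illustration only and not the FusionCompute API; the classes `Datastore` and `VirtualDisk` and all sizes are hypothetical.

```python
class Datastore:
    """Physical space that backs virtual disks."""

    def __init__(self, capacity_gb: int):
        self.capacity_gb = capacity_gb
        self.used_gb = 0

    def allocate(self, gb: int) -> None:
        if self.used_gb + gb > self.capacity_gb:
            raise RuntimeError("datastore is full")
        self.used_gb += gb


class VirtualDisk:
    def __init__(self, store: Datastore, size_gb: int, thin: bool):
        self.store, self.size_gb, self.thin = store, size_gb, thin
        self.written_gb = 0
        if not thin:
            # Normal mode: take all of the physical space up front.
            store.allocate(size_gb)

    def write(self, gb: int) -> None:
        if self.written_gb + gb > self.size_gb:
            raise ValueError("write exceeds the virtual disk size")
        if self.thin:
            # Thin mode: physical space is taken only when data is written.
            self.store.allocate(gb)
        self.written_gb += gb


store = Datastore(capacity_gb=100)
normal_disk = VirtualDisk(store, size_gb=40, thin=False)
thin_disk = VirtualDisk(store, size_gb=40, thin=True)
print(store.used_gb)  # 40: only the normal disk has consumed space so far
thin_disk.write(10)
print(store.used_gb)  # 50
```

The thin disk saves space until data is written, but a write to it can fail later if the datastore has been overcommitted and runs out of physical space.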

③ Disk mode

  • Dependent: The disk is included when snapshots are created and restored.
  • Independent-persistent: In independent mode, the disk is not included when snapshots are created or restored. In persistent mode, data is actually written to the disk and is retained when the virtual machine restarts.
  • Independent-non-persistent: In independent mode, the disk is not included when snapshots are created or restored. In non-persistent mode, data is not actually written to the disk and is not retained when the virtual machine restarts.

Origin blog.csdn.net/weixin_46706771/article/details/131706869