Advantages and disadvantages of block storage, file storage, and object storage, with application scenarios

  • Block storage provides service at the physical (raw device) level. The system that uses it formats it with its own file system, and once a system is using the volume, that system has it exclusively.
  • File storage provides service at the file system level. The file system already exists, and any system can access it as long as it has an interface for doing so.
  • Object storage also provides a file-system-like service, but the file system is optimized: it uses a flat layout and abandons the tree structure, which makes it easy to share and fast to access.
  • Details:
  • The three kinds of storage correspond to different access protocols, and that is what determines their essential differences.
  • File storage first: its main operation targets are files and folders. Taking NFS as an example, the file-related interfaces include LOOKUP / ACCESS / READ / WRITE / CREATE / REMOVE / RENAME, and the folder-related interfaces include MKDIR / RMDIR / READDIR. There are also FSSTAT / FSINFO interfaces that provide file-system-level information. POSIX and SAMBA are file storage protocols as well. File storage protocols put more emphasis on interface flexibility and access control.
  • Block storage: the main operation target is the disk. Taking SCSI as an example, the main interfaces are Read / Write / Read Capacity / Inquiry and so on. FC and iSCSI are also block storage protocols. Compared with file storage, there is no concept of files or a directory tree, and the protocols generally do not define the creation or deletion of disks. Block protocols put more emphasis on transmission control.
  • Object storage: the main operation target is the object (Object). Taking S3 as an example, the main interfaces are PUT / GET / DELETE and so on. Compared with file storage and block storage, there is no random read/write interface. Compared with file storage, there is no directory tree. Object protocols put more emphasis on simplicity.
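To make the three access models above concrete, here is a minimal in-memory sketch in Python. The class names (BlockDevice, FileStore, ObjectStore) and their methods are hypothetical and only mirror the style of the SCSI, NFS, and S3 operations listed above; they are not the real protocol APIs.

```python
from typing import Dict, List


class BlockDevice:
    """Block model: a fixed-size run of bytes addressed by offset and length."""

    def __init__(self, capacity: int) -> None:
        self._data = bytearray(capacity)

    def read_capacity(self) -> int:
        return len(self._data)

    def write(self, offset: int, payload: bytes) -> None:
        if offset < 0 or offset + len(payload) > len(self._data):
            raise ValueError("write outside the device")
        self._data[offset:offset + len(payload)] = payload

    def read(self, offset: int, length: int) -> bytes:
        if offset < 0 or offset + length > len(self._data):
            raise ValueError("read outside the device")
        return bytes(self._data[offset:offset + length])


class FileStore:
    """File model: named files inside a directory tree."""

    def __init__(self) -> None:
        self._dirs = {"/"}
        self._files: Dict[str, bytearray] = {}

    @staticmethod
    def _parent(path: str) -> str:
        return path.rsplit("/", 1)[0] or "/"

    def mkdir(self, path: str) -> None:
        if self._parent(path) not in self._dirs:
            raise FileNotFoundError(self._parent(path))
        self._dirs.add(path)

    def create(self, path: str) -> None:
        if self._parent(path) not in self._dirs:
            raise FileNotFoundError(self._parent(path))
        self._files[path] = bytearray()

    def write(self, path: str, offset: int, payload: bytes) -> None:
        buf = self._files[path]
        end = offset + len(payload)
        if len(buf) < end:
            buf.extend(bytes(end - len(buf)))  # grow the file with zeros
        buf[offset:end] = payload

    def read(self, path: str, offset: int, length: int) -> bytes:
        return bytes(self._files[path][offset:offset + length])

    def readdir(self, path: str) -> List[str]:
        prefix = path.rstrip("/") + "/"
        entries = self._dirs | set(self._files)
        return sorted(
            p[len(prefix):] for p in entries
            if p != path and p.startswith(prefix) and "/" not in p[len(prefix):]
        )


class ObjectStore:
    """Object model: a flat key space; whole objects only, no offsets, no tree."""

    def __init__(self) -> None:
        self._objects: Dict[str, bytes] = {}

    def put(self, key: str, data: bytes) -> None:
        self._objects[key] = bytes(data)

    def get(self, key: str) -> bytes:
        return self._objects[key]

    def delete(self, key: str) -> None:
        del self._objects[key]
```

The block device knows only offsets, the file store adds names and directories on top of offsets, and the object store keeps names but gives up both the tree and in-place partial writes.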
 
 
  • [Block storage]
  • Typical devices: disk arrays, hard disks, virtual hard disks
  • This interface is usually presented as a QEMU Driver or a Kernel Module. It has to implement the Linux Block Device interface or the Block Driver interface provided by QEMU. Examples include Sheepdog, AWS EBS, QingCloud's cloud disks, Alibaba Cloud's Pangu system, and Ceph RBD (RBD is Ceph's block-storage interface).
  • Block storage mainly maps raw disk space, as a whole, to a host. For example, suppose a disk array has five hard disks (for ease of explanation, assume each is 1 GB). Using RAID, LVM (logical volumes), or similar means, they can be divided into N logical disks. (Suppose the result is five logical disks of 1 GB each. These five logical disks are completely different in meaning from the original five physical disks. The first logical disk, for example, might take its first 200 MB from physical disk 1 and its second 200 MB from physical disk 2, so a logical disk is a virtual disk assembled from several physical disks.)
  • Block storage then maps these logical disks to the host. The host's operating system recognizes five hard disks, but it cannot tell whether they are logical or physical; it simply sees five raw physical disks. As far as the operating system can perceive, this is no different from attaching a physical disk directly.
  • The operating system still has to partition and format the raw disk before using it, exactly as it would a disk built into the host.
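The mapping from a logical disk to physical disks described above can be sketched as a table of extents. This is only an illustration: the LogicalDisk class is hypothetical, 1 GB is treated as 1000 MB, and the layout beyond the first two 200 MB extents (which come from the example in the text) is assumed.

```python
from bisect import bisect_right
from typing import List, Tuple

MB = 1024 * 1024

# One extent: (physical disk number, start offset on that disk, length)
Extent = Tuple[int, int, int]


class LogicalDisk:
    """A logical disk assembled from extents that live on several physical disks."""

    def __init__(self, extents: List[Extent]) -> None:
        self.extents = extents
        self.starts: List[int] = []  # logical start offset of each extent
        position = 0
        for _, _, length in extents:
            self.starts.append(position)
            position += length
        self.size = position

    def locate(self, logical_offset: int) -> Tuple[int, int]:
        """Translate a logical offset into (physical disk, physical offset)."""
        if not 0 <= logical_offset < self.size:
            raise ValueError("offset outside the logical disk")
        index = bisect_right(self.starts, logical_offset) - 1
        disk, physical_start, _ = self.extents[index]
        return disk, physical_start + (logical_offset - self.starts[index])


# First 200 MB from physical disk 1, second 200 MB from physical disk 2
# (as in the text); the remaining three extents are an assumed layout.
logical = LogicalDisk([
    (1, 0, 200 * MB),
    (2, 0, 200 * MB),
    (3, 0, 200 * MB),
    (4, 0, 200 * MB),
    (5, 0, 200 * MB),
])
```

The host only ever sees `logical.size` bytes of raw space; the translation done by `locate` stays hidden inside the array or volume manager.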
  • Advantages:
  • 1. RAID, LVM, and similar mechanisms provide data protection.
  • 2. Multiple inexpensive hard disks can be combined into one large logical disk, increasing capacity.
  • 3. Because a logical disk is assembled from several physical disks, data can be written to several disks in parallel, which improves read and write efficiency.
  • 4. Block storage often uses a SAN architecture; thanks to the transfer rate of the network and its encapsulation protocol, transfer and read/write rates are raised further.
  • Disadvantages:
  • 1. A SAN architecture requires buying a Fibre Channel card for each host as well as Fibre Channel switches, so the construction cost is high.
  • 2. Data cannot be shared between hosts. Outside a server cluster, once a raw block device is mapped to a host and formatted, it behaves like a local disk of that host. The local disk of host A cannot be used by host B, so data cannot be shared.
  • 3. It is not conducive to data sharing between hosts running different operating systems. Different operating systems use different file systems, and after formatting, data on different file systems is not shared. For example, Windows 7 / XP uses FAT32 / NTFS while Linux uses EXT4, and an EXT4 driver does not understand an NTFS file system. It is like a USB drive formatted as NTFS that cannot be read when plugged into a Linux laptop that has no NTFS driver. So block storage does not help with file sharing.
    • Typical devices: disk arrays, hard disks
    • Mainly maps raw disk space to a host for its use.
    • Advantages:
    • Data protection through RAID, LVM, and similar means.
    • Multiple inexpensive disks combined to increase capacity.
    • Logical disks assembled from multiple physical disks, improving read and write efficiency.
    • Disadvantages:
    • With SAN networking, Fibre Channel switches are required and the construction cost is high.
    • Data cannot be shared between hosts.
    • Usage scenarios:
    • Disk allocation for Docker containers and virtual machines.
    • Log storage.
    • File storage.
  •  
  • Block storage comes in two main types, listed below:
  •         1) DAS (Direct Attached Storage): storage connected directly to the host server. Each host server has its own storage device, and the storage devices of different hosts cannot communicate with each other. Accessing data across hosts requires relatively complex configuration; if the hosts run different operating systems, accessing each other's data is more complex still, and some systems cannot do it at all. DAS is typically used in environments with a simple network, little data exchange, and modest performance requirements. It is a relatively early technology.
  •         2) SAN (Storage Area Network): storage connected to a group of host servers through a dedicated high-speed (fibre) network. The network sits behind the host group and uses high-speed I/O interconnects such as SCSI, ESCON, and Fibre Channel. In general, a SAN suits environments that demand high network speed, high data reliability and security, and high-performance data sharing. It is characterized by high cost and good performance, and is used for critical applications with large data volumes, such as telecommunications and banking. It uses the SCSI block I/O command set and provides high-performance random I/O and data throughput through disk-level or FC (Fibre Channel) level data access. Its high bandwidth and low latency give it a place in high-performance computing, but SAN systems are expensive and scale poorly, so they cannot meet the needs of systems with thousands of CPUs.
  •  
  • [File Storage]
  • Typical devices: FTP servers, NFS servers, Samba
  • In the usual sense, this means storage that supports the POSIX interface. It belongs to the same category as a traditional file system such as Ext4, the difference being that distributed storage adds the ability to work in parallel, as in Ceph's CephFS (CephFS is Ceph's file-storage interface). Sometimes file storage interfaces that are not POSIX, such as GFS and HDFS, are also placed in this category.
  • File storage exists to overcome the problem that files on block storage cannot be shared.
  • File storage also comes as integrated hardware-and-software appliances, but in practice any ordinary server or laptop with a suitable operating system and software can provide FTP or NFS service. A server set up with such a service is one form of file storage.
  • A host can upload files to and download files from file storage directly. Unlike block storage, host A does not need to format the file storage, because the file storage already handles file management itself.
  • Advantages:
  • 1. Low cost: any machine will do, and ordinary Ethernet is enough. No dedicated SAN network is needed.
  • 2. Easy file sharing: for example, host A (Windows 7, NTFS file system) and host B (Linux, EXT4 file system) want to copy a movie to each other, which would not work directly. Add a host C (an NFS server): A copies the file to C, and B then copies it from C. (The example is simplistic; please bear with it.)
  • Disadvantages:
  • Low read/write rate and slow transfer rate: upload and download over Ethernet are slow, and all reads and writes are borne by the disks inside a single server. Compared with a disk array, where tens or hundreds of disks read and write at the same time, it is much slower.
    • Typical devices: FTP servers, NFS servers
    • File storage exists to overcome the problem that files on block storage cannot be shared.
    • An FTP or NFS service set up on a server is file storage.
    • Advantages:
    • Low cost; any machine will do.
    • Easy file sharing.
    • Disadvantages:
    • Low read/write rate.
    • Slow transfer rate.
    • Usage scenarios:
    • Log storage.
    • File storage that needs a directory structure.
  •        
  • NAS is usually a file-level storage product. NAS (Network Attached Storage) is a network storage device, typically attached directly to the network to provide data access services. A NAS device acts like a system that serves data files, and it is characterized by good value for money. It is used for data storage in areas such as education, government, and business.
  •         It accesses data using the NFS or CIFS command set, with the file as the unit of transfer, and provides networked storage over TCP/IP. It scales well, is inexpensive, and is easy for users to manage; NFS, for example, is widely used in cluster computing today. However, because of the NAS protocol's high overhead, low bandwidth, and high latency, it is not well suited to high-performance clusters.
  •  
  • [Object storage]
  • Typical devices: distributed servers with built-in high-capacity hard disks
  • Object storage is what is usually understood as a key-value store. Its interface consists of simple GET, PUT, DEL, and a few extensions. Examples include Qiniu, Upyun, Swift, and S3.
  •         To meet the needs of Linux clusters for high-performance storage and data sharing, researchers internationally have begun to study new storage architectures and new file systems, hoping to combine the advantages of SAN and NAS: direct disk access to improve performance, and shared files and metadata to simplify management. Object storage has thus become a hot topic for high-performance storage in Linux cluster systems, for example the object-based storage cluster system from Panasas and Lustre from Cluster File Systems.
  •         In general, object storage combines the high-speed direct disk access of a SAN with the distributed sharing of a NAS.
  •         The core idea is to separate the data path (data reads and writes) from the control path (metadata) and to build the storage system on object storage devices (Object-based Storage Device, OSD). Each object storage device has a degree of intelligence and can automatically manage the distribution of the data on it.
  •         An object storage system consists of the following components (objects, object storage devices, the metadata server, and the object storage system client):
  •         1. Object
  •         The object is the basic unit of data storage in the system. An object is a combination of file data and a set of attribute information (metadata); the attributes can define, per file, the RAID parameters, data distribution, and quality of service. Traditional storage systems use the file or the block as the basic unit of storage, and a block storage system must always track the attributes of every block in the system, whereas an object maintains its own attributes by communicating with the storage system. Within a storage device, every object has an object identifier, and objects are accessed through OSD commands that carry this identifier. There are usually several types of object: the root object on a storage device identifies the device and its attributes, and a group object is a collection of objects that share a resource management policy on the device.
  •         2. Object storage device
  •         An object storage device has some intelligence: it has its own CPU, memory, network, and disk system. The difference between an OSD and a block device lies not in the storage medium but in the access interface each provides. The main functions of an OSD are data storage and secure access. Currently OSDs are commonly implemented with a blade architecture. An OSD provides three main functions:
  •       (1) Data storage. The OSD manages object data and places it on standard disk systems. It does not provide a block-interface access method; a client requests data by object ID and offset when reading and writing.
  •       (2) Intelligent distribution. The OSD uses its own CPU and memory to optimize data distribution and supports data prefetching. Because the OSD can prefetch objects intelligently, disk performance can be optimized.
  •       (3) Management of per-object metadata. The OSD manages the metadata of the objects stored on it. This metadata resembles traditional inode metadata and typically includes the object's data blocks and the object's length. In a traditional NAS system this metadata is maintained by the file server; the object storage architecture moves the main metadata management work to the OSDs, which reduces the overhead on the client.
  •          3. Metadata server (Metadata Server, MDS)
  •          The MDS controls the interaction between the client and the objects on the OSDs. It mainly provides the following functions:
  •       (1) Object storage access.
  •         The MDS constructs and manages a view describing the distribution of each file, allowing the client to access objects directly. The MDS gives the client a capability to access the objects that make up a file; each OSD verifies this capability when it receives a request and only then grants access.
  •       (2) File and directory access management.
  •         The MDS builds a file structure on top of the storage system, including quota control, the creation and deletion of directories and files, and access control.
  •       (3) Client cache consistency.
  •         To improve client performance, object storage systems are generally designed to support client-side caching. Client-side caching introduces the problem of cache consistency. The MDS supports file-based client caching: when a cached file changes, it notifies the client to refresh its cache, preventing problems caused by an inconsistent cache.
  •         4. Object storage system client (Client)
  •         To let clients access objects on the OSDs effectively, the object storage system client has to be implemented on the compute nodes. It usually provides a POSIX file system interface so that applications can operate on it as they would a standard file system.
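The capability check mentioned under "(1) Object storage access" can be sketched as follows. This is a minimal illustration, not the actual OSD protocol: it assumes the MDS and the OSDs share a secret key and that a capability is an HMAC over the client name, object ID, and access right. The class and method names are hypothetical.

```python
import hashlib
import hmac
from typing import Dict


def _sign(secret: bytes, client: str, object_id: str, right: str) -> bytes:
    message = f"{client}|{object_id}|{right}".encode()
    return hmac.new(secret, message, hashlib.sha256).digest()


class MetadataServer:
    """Issues capabilities; it never touches the object data itself."""

    def __init__(self, secret: bytes) -> None:
        self._secret = secret

    def grant(self, client: str, object_id: str, right: str) -> bytes:
        return _sign(self._secret, client, object_id, right)


class ObjectStorageDevice:
    """Stores object data and verifies the capability on every request."""

    def __init__(self, secret: bytes) -> None:
        self._secret = secret
        self._objects: Dict[str, bytearray] = {}

    def _verify(self, client: str, object_id: str, right: str, cap: bytes) -> None:
        expected = _sign(self._secret, client, object_id, right)
        if not hmac.compare_digest(expected, cap):
            raise PermissionError("capability rejected")

    def write(self, client: str, object_id: str, offset: int,
              data: bytes, cap: bytes) -> None:
        self._verify(client, object_id, "w", cap)
        buf = self._objects.setdefault(object_id, bytearray())
        end = offset + len(data)
        if len(buf) < end:
            buf.extend(bytes(end - len(buf)))
        buf[offset:end] = data

    def read(self, client: str, object_id: str, offset: int,
             length: int, cap: bytes) -> bytes:
        self._verify(client, object_id, "r", cap)
        return bytes(self._objects[object_id][offset:offset + length])
```

The client talks to the MDS once to obtain a capability and then sends data requests, addressed by object ID and offset, straight to the OSD, which is the separation of control path and data path described earlier.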
    • Typical devices: distributed servers with built-in high-capacity hard disks (Swift, S3)
    • Multiple servers with built-in high-capacity hard disks run object storage management software and provide read and write access externally.
    • Advantages:
    • Fast reads and writes, like block storage.
    • Sharing and other features, like file storage.
    • Usage scenarios (suited to data that changes little):
    • Picture storage.
    • Video storage.
  •  
  • The most common object storage setup is several servers with built-in high-capacity hard disks running object storage software, plus a few more servers acting as management nodes with the object storage management software installed. The management nodes manage the other servers and provide read and write access externally.
  • Object storage appeared in order to overcome the respective disadvantages of block storage and file storage while carrying forward the advantages of both. In short, block storage reads and writes fast but is not conducive to sharing; file storage reads and writes slowly but is conducive to sharing. Could there be something that reads and writes fast and is also conducive to sharing? Hence object storage.
  • First, a file consists of attributes (the technical term is metadata, such as the file's size, modification time, and storage path) and content (referred to below as data).
  • A conventional file system such as FAT32 stores a file's data and metadata together. When a file is stored, it is first broken up according to the file system's minimum block size (for example, a 4 MB file on a file system with 4 KB blocks is broken into roughly 1,000 blocks) and then written to the hard disk, and the process does not distinguish between data and metadata. Each block leads to the address of the next block to read (in FAT32 this chain is recorded in the file allocation table), so the blocks have to be found in that order until all the blocks of the file have been read.
  • In this case the read/write rate is very slow, because even if 100 disk arms are available for reading and writing, the location of the next block is known only after the current one has been read, so in practice it is as if only one arm were working.
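The sequential dependency described above can be seen in a toy model of a chained block layout. This is a sketch only: the `fat` dictionary stands in for the table that records each block's successor, and the block numbers and contents are made up.

```python
from typing import Dict, List, Tuple

END_OF_CHAIN = -1  # marker for the last block of a file


def read_chain(fat: Dict[int, int], blocks: Dict[int, bytes],
               first_block: int) -> Tuple[bytes, List[int]]:
    """Read a file by following the chain; returns the data and the visit order."""
    data = bytearray()
    visited: List[int] = []
    current = first_block
    while current != END_OF_CHAIN:
        visited.append(current)
        data += blocks[current]
        # The next address is only known after this lookup, so the reads
        # cannot be issued in parallel.
        current = fat[current]
    return bytes(data), visited


# A three-block file scattered over the disk: 7 -> 2 -> 9
fat = {7: 2, 2: 9, 9: END_OF_CHAIN}
blocks = {7: b"obj", 2: b"ect", 9: b"!"}
content, order = read_chain(fat, blocks, first_block=7)
```

Each loop iteration depends on the result of the previous one, which is why adding more disk arms does not help in this layout.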
  • Object storage separates out the metadata. The control node is called the metadata server (a server plus object storage management software) and is responsible for storing the objects' attributes (mainly the information about which distributed servers an object's data has been broken up and stored on). The other distributed servers, which are responsible for storing data, are called OSDs and mainly store the data portion of files. When a user accesses an object, the request first goes to the metadata server, which only reports which OSDs hold the object. Suppose it reports that file A is stored on the three OSDs B, C, and D; the user then accesses those three OSD servers directly to read the data.
  • Because three OSDs are now transmitting data at the same time, the transfer is faster. The more OSD servers there are, the greater the improvement in read and write speed, and this is how fast reads and writes are achieved.
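The access flow above (ask the metadata server where the object is, then read from the OSDs directly and in parallel) can be sketched like this. The classes, the round-robin placement, and the chunk size are assumptions made for illustration; real systems use their own placement algorithms and network protocols.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, List, Tuple


class OSD:
    """Stores only data chunks, keyed by (object id, chunk index)."""

    def __init__(self, name: str) -> None:
        self.name = name
        self._chunks: Dict[Tuple[str, int], bytes] = {}

    def put(self, object_id: str, index: int, chunk: bytes) -> None:
        self._chunks[(object_id, index)] = chunk

    def get(self, object_id: str, index: int) -> bytes:
        return self._chunks[(object_id, index)]


class MetadataServer:
    """Stores only the layout: which OSD holds each chunk of an object."""

    def __init__(self) -> None:
        self._layout: Dict[str, List[str]] = {}

    def record(self, object_id: str, placement: List[str]) -> None:
        self._layout[object_id] = list(placement)

    def lookup(self, object_id: str) -> List[str]:
        return list(self._layout[object_id])


def write_object(mds: MetadataServer, osds: Dict[str, OSD],
                 object_id: str, data: bytes, chunk_size: int) -> None:
    names = sorted(osds)
    placement: List[str] = []
    for index, start in enumerate(range(0, len(data), chunk_size)):
        name = names[index % len(names)]  # simple round-robin placement
        osds[name].put(object_id, index, data[start:start + chunk_size])
        placement.append(name)
    mds.record(object_id, placement)


def read_object(mds: MetadataServer, osds: Dict[str, OSD],
                object_id: str) -> bytes:
    placement = mds.lookup(object_id)  # one metadata request
    with ThreadPoolExecutor(max_workers=max(1, len(placement))) as pool:
        futures = [pool.submit(osds[name].get, object_id, index)
                   for index, name in enumerate(placement)]
        return b"".join(f.result() for f in futures)  # chunks fetched concurrently


osds = {name: OSD(name) for name in ("B", "C", "D")}
mds = MetadataServer()
write_object(mds, osds, "A", b"123456789", chunk_size=3)
```

Here `mds.lookup("A")` reports the three OSDs B, C, and D, and `read_object` fetches one chunk from each of them at the same time. Unlike the chained layout, every chunk location is known up front, so the reads do not wait on one another.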
  • On the other hand, the object storage software has its own dedicated file system, so to the outside the OSDs act like file servers. The difficulty with file sharing does not arise, and the file sharing problem is solved as well.
  • That is why object storage emerged: it combines the advantages of block storage and file storage.
  •  
  • Finally, if object storage has the benefits of both block storage and file storage, why still use block storage or file storage?
  • 1. Some applications need storage mapped directly as a raw disk, such as databases. A database needs the storage mapped to it as a raw disk, which it then formats according to its own database file system, so it cannot use storage that has already been formatted with some other file system. Applications of this kind are better suited to block storage.
  • 2. Object storage costs more than ordinary file storage: it requires purchasing dedicated object storage software and high-capacity hard disks. If the amount of data is not massive and the goal is only file sharing, plain file storage is good enough and more cost-effective.
  • SAN and NAS technology has been around for decades, and a single SAN or NAS device has now reached a maximum capacity at the PB level, but it still struggles when faced with EB-level data. This is mainly determined by its architecture and service interfaces.
           A SAN uses the SCSI protocol as its underlying protocol. The granularity managed by SCSI is very small, typically in bytes (byte) or kilobytes (KB). The SCSI protocol also provides no read/write locking mechanism to guarantee the consistency of concurrent reads and writes by different applications. This makes it difficult to manage EB-scale storage resources or to share data among multiple servers or server clusters.
           NAS accesses data through file protocols. With a file protocol, the storage device can accurately identify the data content, and it offers a very rich set of file access interfaces, including complex read/write locks on directories and files. Files and directories are managed in a tree structure; every node is managed with a structure called an inode, and every directory and file has a corresponding inode. Directory depth and the number of children in a single directory grow rapidly as the total number of files increases. Once a file system holds more than about a hundred million files, the complex locking mechanism and frequent metadata access greatly reduce overall system performance.
         Traditional RAID technology and the scale-up architecture also prevent traditional SAN and NAS from becoming highly available, high-performance EB-scale storage units. With traditional disk-based RAID, a RAID group usually contains no more than about 20 disks, so even a PB-scale SAN or NAS is divided into many storage islands, which adds management complexity in EB-scale scenarios. At the same time, the scale-up architecture means that even if a SAN or NAS reached EB capacity, performance would become the limiting factor.
         How can we cope with the flood of data in the era of the information explosion? Suppose there were a "super data library" that provides a vast amount of storage space shared by many users (servers or server clusters), with a capacity thousands of times that of today's high-speed storage units (SAN and NAS). Users or applications accessing data would not need to know how the library shelves and manages its books (layout management); they would only need to supply a unique number (ID) to obtain the content of a book (the data). If a book becomes old and worn, the system automatically copies the worn pages (data on expiring or failed storage media) onto new paper (new storage media) and rebinds the book (recovery / reconstruction). Data users do not need to pay attention to this process; they simply obtain data resources as needed. Does such a "super data library" really exist?
  •   The birth of distributed object storage
         The emergence of object storage technology and of large-scale automated management technology means the "super data library" is no longer a distant dream. An object storage system (Object-Based Storage System) improves on the shortcomings of SAN and NAS and retains advantages of NAS such as data sharing. It replaces the SCSI block interface and the file access interface with a high-level abstract interface, hides the underlying storage implementation, and changes the vertical tree structure of NAS into a flat structure of equals, which improves scalability, strengthens reliability, and makes the storage platform-independent, among other important characteristics.
         (When different users access different parts of a POSIX file system, it both wastes time and complicates operation and maintenance. By contrast, a distributed storage system has clear advantages: application development on it is more convenient, it is easy to maintain and expand, and it balances load automatically. RESTful and HTTP interfaces take the place of the POSIX interface and the QEMU Driver interface.)
         (Erasure Code: a file is converted into a set of fragments, which are broken up and distributed across a pool of servers. As long as enough fragments remain, the original file can be reconstructed. This greatly reduces the storage space needed while maintaining the robustness of the data. However, Erasure Code does not fit every scenario, and in particular it is not suited to latency-sensitive network services.)
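The erasure-code idea (split into fragments, survive the loss of some, rebuild the original) can be illustrated with the simplest possible scheme: k data fragments plus one XOR parity fragment, which tolerates the loss of any single fragment. This is a deliberately reduced stand-in; production systems typically use codes such as Reed-Solomon that tolerate several lost fragments. The function names are hypothetical.

```python
from typing import List, Optional


def encode(data: bytes, k: int) -> List[bytes]:
    """Split data into k equal-size fragments (zero-padded) plus one XOR parity fragment."""
    size = -(-len(data) // k)  # ceiling division
    padded = data.ljust(size * k, b"\0")
    fragments = [padded[i * size:(i + 1) * size] for i in range(k)]
    parity = bytearray(size)
    for fragment in fragments:
        for i, byte in enumerate(fragment):
            parity[i] ^= byte
    return fragments + [bytes(parity)]


def decode(fragments: List[Optional[bytes]], length: int) -> bytes:
    """Rebuild the original data; at most one fragment may be missing (None)."""
    missing = [i for i, fragment in enumerate(fragments) if fragment is None]
    if len(missing) > 1:
        raise ValueError("single parity can rebuild only one lost fragment")
    restored = list(fragments)
    if missing:
        size = len(next(f for f in fragments if f is not None))
        rebuilt = bytearray(size)
        for fragment in fragments:
            if fragment is not None:
                for i, byte in enumerate(fragment):
                    rebuilt[i] ^= byte  # XOR of the survivors equals the lost fragment
        restored[missing[0]] = bytes(rebuilt)
    return b"".join(restored[:-1])[:length]  # drop the parity and the padding


original = b"object storage!"
pieces = encode(original, k=3)            # 3 data fragments + 1 parity
damaged = [pieces[0], None, pieces[2], pieces[3]]
recovered = decode(damaged, len(original))
```

With k = 3 the four fragments occupy 4/3 of the original size, whereas keeping a second full copy would occupy twice the original size; that is the space saving the text refers to. The extra computation on rebuild is one reason such codes are a poor fit for latency-sensitive workloads.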
  •   SNIA (Storage Networking Industry Association) defines an object storage device as follows:
    Ø Objects are self-contained: they include metadata, data, and attributes
         • The storage device can decide for itself where and how an object's data is laid out
         • The storage device can provide different QoS for different objects
    Ø Compared with a block device, an object storage device is more "intelligent": upper layers access an object by its object ID without needing to understand the object's specific spatial layout
         In other words, object storage is intelligent, better-encapsulated block storage that serves as part of a "file" or some other application-level logical structure. The mapping between files and objects is controlled by the upper layer. An object storage device may itself also be a distributed system, which gives a distributed object storage system.
  •   The benefit of using objects in place of traditional blocks is that an object's content comes from the application and is inherently related and "atomic". This makes possible:
    Ø More intelligent space management at the storage layer
    Ø Content-aware data prefetching and caching
    Ø Reliable multi-user shared access
    Ø Object-level security
         Object storage architectures also scale better. Besides the object ID and the user data, an object contains predefined attributes such as owner, time, size, and location, along with permissions and a number of custom attributes.
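A self-describing object of this kind might be modelled as below. The field names (owner, created, location, permissions, custom) are hypothetical and chosen only to mirror the attribute categories named above.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class StoredObject:
    """User data plus the attributes that travel with it."""
    object_id: str
    data: bytes
    owner: str                 # predefined attribute (assumed name)
    created: float             # predefined attribute: timestamp in seconds
    location: str              # predefined attribute: where the data lives
    permissions: str = "private"
    custom: Dict[str, str] = field(default_factory=dict)  # free-form attributes

    @property
    def size(self) -> int:
        # Size is derived from the data, so it can never go stale.
        return len(self.data)


# A flat namespace: object ID -> object, with no directory tree in between.
namespace: Dict[str, StoredObject] = {}

photo = StoredObject(
    object_id="9f2c",
    data=b"\xff\xd8\xff",
    owner="alice",
    created=1568000000.0,
    location="osd-3",
    custom={"content-type": "image/jpeg"},
)
namespace[photo.object_id] = photo
```

Because every attribute is stored with the object, a lookup by ID returns everything needed to interpret the data without consulting a separate directory structure.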
         EB-scale, scalable distributed object storage provides applications with a unified namespace and builds a unified, shareable EB-scale pool of storage resources. It fills the gap left by the absence of a mass storage unit in "Network Computing", the general-purpose computing model, and through a high-level data model abstraction it simplifies application access to data while making mass storage more intelligent.
        An object is a self-contained assembly of data and its descriptive information, and is the basic unit of storage on disk. Object storage simplifies how data is organized (for example, the tree of "directories" and "files" is replaced by flat "objects" identified by "ID") and reduces the complexity of the protocol interface (for example, by simplifying the complex locking mechanism and guaranteeing only eventual consistency), which improves the system's scalability to meet the challenge of massive data in the information explosion era. The intelligent self-management of objects also reduces the complexity of system maintenance and helps users lower total cost of ownership (TCO).
  • To summarize: a block device is fast and manages the storage space without organizing the data, but in most scenarios reading and writing data this way is inconvenient for the user (with a block device, the location of data is recorded, and data is read and written, by position offset plus length). A file system built on top of a block device helps organize and manage the data, which makes data storage more user-friendly (data is read and written by file name). The Ceph file system interface solves the problem that "Ceph block device + local file system" does not support shared reads and writes by multiple clients, but because of the complexity of the file system structure, its performance is worse than that of the Ceph block device. The object storage interface is a compromise: it preserves storage performance while also supporting shared reads and writes by multiple clients.

Origin www.cnblogs.com/sylar5/p/11520017.html