Advantages and disadvantages of the three storage types: block, file, and object

Block, file, and object storage from the application point of view: http://www.talkwithtrend.com/Article/178247

Object Store from theory to practice: https://baijiahao.baidu.com/s?id=1608194600020248113&wfr=spider&for=pc

Object storage, file storage, and block storage: what is the difference from the application point of view? https://www.sohu.com/a/144775333_151779

 

Object storage principle

 

1 Data structure of an object

Block storage and file storage each manage data in their own way; object storage manages data in the form of objects. The biggest difference between an object and a file is that an object adds metadata on top of the file. In general, an object consists of three parts: data, metadata, and an object ID.

The data of an object is typically unstructured, such as photos, videos, or documents. The metadata is the descriptive information associated with the object, such as the dimensions of a picture or the owner of a document. The object ID is a globally unique identifier used to tell objects apart.
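
As a minimal sketch of the three parts described above, an object could be modeled in Python as follows. The class name `StoredObject`, the metadata keys, and the use of a random UUID as the object ID are illustrative assumptions, not the schema of any particular product.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class StoredObject:
    """Illustrative model of an object: data + metadata + object ID."""
    data: bytes                                   # the unstructured payload
    metadata: dict = field(default_factory=dict)  # descriptive attributes
    # Assumption: a random UUID stands in for the globally unique ID.
    object_id: str = field(default_factory=lambda: uuid.uuid4().hex)


photo = StoredObject(
    data=b"\x89PNG placeholder bytes",
    metadata={"content_type": "image/png", "width": 1920, "height": 1080,
              "owner": "alice"},
)
print(photo.object_id, photo.metadata["owner"])
```

A plain file would carry only `data` plus whatever attributes the file system keeps; here the metadata travels with the object and can hold arbitrary application-defined keys.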

 

From the data structure point of view, the three kinds of storage are fundamentally different. Block storage is structured like an array, file storage like a tree (B-tree and its variants, such as B+ and B* trees), and object storage is basically a hash table.

Arrays and trees are familiar and need little explanation. The hash table is the core data structure of object storage, the key-value (KeyValue) storage one hears about so often: each object is looked up by its UID (the "key"), a hash of the key is computed, and the result leads to the corresponding target object (the "value"). An example of a hash table lookup is as follows:
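
A minimal sketch of such a lookup, assuming a fixed number of buckets and SHA-256 as the hash function (both are choices made for this illustration; the name `FlatObjectIndex` is hypothetical):

```python
import hashlib


class FlatObjectIndex:
    """Toy hash table: object ID (key) -> object (value), no directory tree."""

    def __init__(self, bucket_count=8):
        self._buckets = [[] for _ in range(bucket_count)]

    def _bucket_for(self, object_id):
        # Hash the key and map it to one bucket; cost does not depend on depth.
        digest = hashlib.sha256(object_id.encode("utf-8")).digest()
        return self._buckets[int.from_bytes(digest[:8], "big") % len(self._buckets)]

    def put(self, object_id, obj):
        bucket = self._bucket_for(object_id)
        for i, (key, _) in enumerate(bucket):
            if key == object_id:          # same key: overwrite the old value
                bucket[i] = (object_id, obj)
                return
        bucket.append((object_id, obj))

    def get(self, object_id):
        for key, obj in self._bucket_for(object_id):
            if key == object_id:
                return obj
        raise KeyError(object_id)


index = FlatObjectIndex()
index.put("9f1c2e", {"name": "cat.jpg"})
found = index.get("9f1c2e")
```

Every lookup is one hash computation plus a short scan of a single bucket, regardless of how many objects the index holds in total.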

The key-to-object mapping is simple and direct. Computing a hash is very fast, and the flat organization can grow very large without the depth of a tree, so it gives strong support for storing and accessing truly massive amounts of data. That is why it is used not only by object stores but also by many distributed NoSQL databases, such as Redis, MongoDB, Cassandra, and Dynamo.

 

2 Object access methods

Objects are easy to access: they are operated on through a REST interface, with HTTP verbs (GET, POST, PUT, DELETE, and so on) describing the operation. Another way is to use the client tools for the major cloud providers, for example s3cmd for Amazon S3, osscmd / ossutil for Alibaba Cloud, and coscmd for Tencent Cloud. These are command-line tools for operating on objects from within the operating system. The major cloud providers have detailed documentation, so the tools are not covered here.
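
To make the REST style concrete, here is a small sketch that only builds the HTTP requests with Python's standard library and does not send them. The endpoint `https://objects.example.com`, the `/bucket/key` URL layout, and the helper names are assumptions for illustration; real services also require signed authentication headers, which are left out here.

```python
from urllib.parse import quote
from urllib.request import Request

ENDPOINT = "https://objects.example.com"  # hypothetical endpoint


def object_url(bucket: str, key: str) -> str:
    # Assumed path-style layout: /<bucket>/<key>, with the key URL-encoded.
    return f"{ENDPOINT}/{quote(bucket, safe='')}/{quote(key)}"


def build_put(bucket: str, key: str, body: bytes) -> Request:
    """Upload (create or overwrite) a whole object."""
    return Request(object_url(bucket, key), data=body, method="PUT")


def build_get(bucket: str, key: str) -> Request:
    """Download an object."""
    return Request(object_url(bucket, key), method="GET")


def build_delete(bucket: str, key: str) -> Request:
    """Remove an object."""
    return Request(object_url(bucket, key), method="DELETE")


upload = build_put("my-bucket", "photos/cat 1.jpg", b"image bytes")
print(upload.get_method(), upload.full_url)
```

Passing one of these requests to `urllib.request.urlopen` would perform the call against a real endpoint.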

 

3 Advantages and disadvantages of object storage

First the advantages, some of which were touched on above:

High scalability: object storage can scale to tens or even hundreds of EB and makes good use of high-density storage.

High efficiency: the structure is flat, so system performance is not affected by complex directories.

No migration needed: an object store is a scale-out system; as capacity grows, data is automatically distributed across all object storage nodes according to an algorithm.

Security: object storage is usually accessed through HTTP calls, and the object store itself provides key-based authentication for data access.

Easy access: it supports not only HTTP(S), with a REST API for storing and retrieving data, but has also added NFS and SMB support.

Relatively low cost: compared with block storage, object storage is the most cost-effective type of data storage, and combined with cloud computing its characteristics come fully into play.
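
The text does not say which algorithm distributes the data. One common choice is consistent hashing, sketched below as an assumption rather than as the method of any specific product; the class name `HashRing`, the node names, and the number of virtual nodes are all illustrative.

```python
import bisect
import hashlib


def _hash(text: str) -> int:
    return int.from_bytes(hashlib.sha256(text.encode("utf-8")).digest()[:8], "big")


class HashRing:
    """Consistent-hash ring with virtual nodes (illustrative sketch)."""

    def __init__(self, nodes, vnodes=64):
        self._vnodes = vnodes
        self._points = []  # sorted list of (position on ring, node name)
        for node in nodes:
            self.add_node(node)

    def add_node(self, node):
        # Each node owns several points on the ring to even out the load.
        for i in range(self._vnodes):
            bisect.insort(self._points, (_hash(f"{node}#{i}"), node))

    def locate(self, object_id):
        # First ring point at or after the key's hash, wrapping around.
        idx = bisect.bisect_left(self._points, (_hash(object_id),))
        return self._points[idx % len(self._points)][1]


ring = HashRing(["node-a", "node-b", "node-c"])
keys = [f"object-{i}" for i in range(1000)]
before = {k: ring.locate(k) for k in keys}

ring.add_node("node-d")  # capacity expansion
after = {k: ring.locate(k) for k in keys}
moved = [k for k in keys if before[k] != after[k]]
print(f"{len(moved)} of {len(keys)} objects changed node")
```

With this scheme, adding a node reassigns only the objects that now fall on the new node's points; the rest stay where they were, which is what lets capacity grow without a bulk migration.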

Now the disadvantages:

Eventual consistency: because the nodes sit in different locations, data synchronization may be delayed or may fail for some time.

Not well suited to databases: object storage is better suited to files that change little or not at all. An application like a database needs to map directly onto a raw disk, for which block storage is more appropriate.

 

Products and market demand influence each other, but either way, the final form of a product has to match what applications require. The more diverse the applications, the finer the market segmentation and the richer the product range. In the storage industry, we can discuss the various kinds of storage from this angle of "application fit".

Conventional wisdom divides IT equipment into three categories, compute, storage, and network, with clear boundaries between them. Compute is familiar: servers, minicomputers, mainframes. Network means switches and routers. Storage includes built-in and external storage, the most common being the disk array. Before the concept of HCI (hyper-converged infrastructure) was hyped, compute, network, and storage were entirely separate, each with its own responsibilities. Today, let's set hyper-convergence aside and look at storage based only on the traditional understanding.

Logically, storage is generally divided into block storage, file storage, and object storage. The three differ noticeably in the applications and environments they fit.

Block storage (DAS / SAN) is typically used in dedicated systems. These applications demand high random-access performance and high reliability, and usually run a traditional database such as Oracle or DB2 on top. The connection is usually Fibre Channel (8Gb / 16Gb) using the FC protocol. Where requirements are a little lower, the connection may be Gigabit or 10 Gigabit Ethernet; a MySQL database, for example, might use IP SAN over the iSCSI protocol. Block storage usually has few users and little concurrent access; one storage system often serves a single application, such as a trading system or a billing system. Typical industries include finance, manufacturing, energy, and telecommunications.


 

File storage (NAS), by comparison, copes better with many users and many applications accessing it at once, and provides a convenient way to share data. In the PC era most user data is stored as files, and data is mostly shared as files too; FTP services, NFS services, and Samba shares are all typical file storage. Shared file access for dozens or even hundreds of users can be handled with NAS. In the SMB market, one or two NAS devices can support an entire IT department: CRM, SCM, OA, and mail systems can all run on NAS. Even in the early years of public cloud, when user numbers had not yet grown, the underlying hardware for cloud storage was a few NAS devices, and there were cases of cloud host images being stored on NAS as well. Broad compatibility and ease of use are the outstanding characteristics of file storage. Performance, however, is lower than SAN. NAS is essentially accessed over Ethernet, normally Gigabit Ethernet, using the NFS / CIFS protocols.


 

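The concept of object storage appeared somewhat later. The storage standards organization SNIA gave a definition as early as 2004, but early deployments were mostly in hyperscale systems, so it was not widely known and related products stayed lukewarm. Only when cloud computing and big data were pushed to everyone did it slowly enter public view. The block and file storage described above are basically used inside private local networks, whereas object storage is at its best on the Internet or public networks, where it mainly addresses massive data volumes and massive concurrent access. Internet-based applications are the main fit for object storage (the same holds for cloud computing: Internet-based applications are the easiest to move to the cloud, because they were already there before the word "cloud" existed). Basically every mature public cloud, domestic or foreign, offers an object storage product. Common fits include network drives, media and entertainment, medical PACS, meteorology, archiving, and other applications with very large volumes of relatively "cold" data that is not processed online. In such applications each item is large and the total is large too, which suits the massive capacity and easy scaling of object storage. Network drive applications are similar: the total data volume is large and so is concurrent access; supporting users on the order of 100,000 is a requirement that deserves a project of its own (for a sense of scale, think of 12306). Archiving applications simply hold large volumes of cold data, with no particular need for concurrent access. Some newer mobile applications are also a good fit: with smartphones and the mobile Internet everywhere, both the total volume of so-called UGD (user-generated data, such as photos and videos from phones) and the number of users are major challenges. Being able to store and fetch data directly with HTTP GET/PUT is attractive to mobile applications. Access to object storage usually goes over the Internet using HTTP. In terms of performance, a single connection is not fast (and reliability issues such as dropped connections and resumable transfers have to be handled); the real strength is the number of concurrent connections supported, and the aggregate bandwidth is considerable.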


 

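In terms of product form, block storage and file storage are mature products, with hardware in every specification and form factor. Object storage, by contrast, usually looks like a pile of servers or enhanced servers; after all, it is still used mostly in the Internet industry, with a DIY flavor.

For performance, capacity, and so on, I made a chart that compares the three kinds of storage directly.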

Figure: comparison of block, file, and object storage by performance and capacity (4.png)

 

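Block storage is like a supercar: it does not care whether it can carry a few more passengers; what it wants is top speed, plus stability and reliability at that speed. Every manufacturer takes each new model to the Nürburgring Nordschleife to set a fastest lap, doing everything possible to gain a second or two, and a lap over 7 minutes does not even make the top three. (Block storage capacity is not large, on the order of TB; the applications and environments it supports are fairly specialized (FC + Oracle); what matters is IOPS, and vendors want to run every new product through SPC-1, bragging when the results are good and quietly ignoring them when they are not.)
File storage is like a container truck: it suits all kinds of situations, holds a good amount of data (hundreds of TB), and is highly compatible. As long as it is a file, any kind of cargo can be loaded, and within its performance limits it can pull all the common systems. It has a standard POSIX interface: open the rear doors and load or unload. A truck is not picky about roads either; unlike block storage, which insists on a race track, it runs fine on an ordinary gigabit highway. It is not as fast as the block storage supercar, but it holds a steady 80 to 100 km/h.
Object storage is like an ocean freighter, built for truly massive volumes: tens to hundreds of PB of data, stacked neatly in units of containers (buckets) filled with all kinds of object data. One ship can handle the cargo (data) of a hundred thousand customers, all recorded precisely by key-value (KeyValue). Sea freight is slower, and the occasional network storm makes it unstable, but it supports resumable transfers and the cargo does arrive safely in the end. For bulk cargo, especially unstructured data, it is overall the quickest and most convenient option.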

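In terms of access, block storage is usually connected over a Fibre Channel network: the server or minicomputer is fitted with an FC HBA card and connects to the storage through an FC switch (IP SAN can use Gigabit Ethernet and connect with an iSCSI client). The host accesses the storage as a logical volume (Volume). Once connected, applications address the storage by start address and offset.
NAS file storage usually only needs a local network; Gigabit or 100 Mb Ethernet will do. Plug in the cable, and the server mounts the storage as a local folder through a NAS client built into the operating system, such as an NFS/CIFS/FTP client. As long as the mount is POSIX-compliant, applications can work with it through the standard open, seek, write/read, and close calls.
Object storage does not care about the network, and its access model is distinctive: you can only store, fetch, and delete (put/get/delete); you cannot open an object, modify it, and save it in place. You have to download it, make the change, and upload it again to overwrite the original object. This is because the Internet in between is unreliable, and there is no guarantee you will stay connected while editing. As the saying goes: you are on this side, the object is on the other, separated by mountains and seas that cannot be leveled.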
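
The three access styles above can be contrasted in a short, purely local sketch. Everything here is a stand-in chosen for illustration: a `bytearray` plays the block volume, a temporary file plays the mounted NAS folder, and a dictionary plays the object store; the function names are hypothetical.

```python
import os
import tempfile

# Block style: a flat volume addressed by offset and length.
volume = bytearray(4096)

def block_write(offset, payload):
    volume[offset:offset + len(payload)] = payload

def block_read(offset, length):
    return bytes(volume[offset:offset + length])

# File style: open, seek, write, close; the file is modified in place.
def file_patch(path, offset, payload):
    with open(path, "r+b") as f:
        f.seek(offset)
        f.write(payload)

# Object style: only whole-object put / get / delete.
store = {}

def put(key, data):
    store[key] = bytes(data)

def get(key):
    return store[key]

def delete(key):
    del store[key]

def object_patch(key, offset, payload):
    # No in-place edit: fetch everything, change it locally, upload again.
    buf = bytearray(get(key))
    buf[offset:offset + len(payload)] = payload
    put(key, buf)

block_write(512, b"abc")

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "note.txt")
    with open(path, "wb") as f:
        f.write(b"hello world")
    file_patch(path, 6, b"WORLD")
    with open(path, "rb") as f:
        file_result = f.read()

put("note.txt", b"hello world")
object_patch("note.txt", 6, b"WORLD")
object_result = get("note.txt")
```

The file and the object end up with the same content, but the object version transfers the entire object twice to change five bytes, which is why object storage suits data that is written once and rarely modified.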

One more topic is distributed storage. All three kinds of storage can be combined with the idea of distribution, giving distributed file systems, distributed block storage, and object storage, which is distributed by nature.
By definition, object storage places metadata management and data storage and access on separate nodes, and uses multiple nodes to handle many concurrent accesses, so it is naturally a distributed storage product. Distributed file systems are numerous; the open source and closed source products number in the dozens, each used in different fields. Distributed block storage products are relatively few, and they are hard to build. Personally I find this product form somewhat at odds with itself: the idea of distributing block storage conflicts with what block storage design pursues. Recall that the main point of block storage is speed. Distribution is bound to drag it down, because communication between nodes adds overhead, and under CAP, response time is sacrificed to maintain consistency. The advantage gained in return is scalability. It is like chaining supercars together with iron chains: how could they still run at high speed? The chains are heavier than the cars; are you going to string them together and run them as a train?
File storage, on the other hand, was a container truck to begin with, so coupling several together to run as a train is still feasible.


Origin www.cnblogs.com/sylar5/p/11520149.html