"Ceph Analysis" series (6)-some ideas about Ceph

This article collects some of the thinking that occurred to the author while investigating and analyzing Ceph. Because the material is rather loosely organized, and most of it reflects the author's personal opinions, it is presented as a separate article for discussion.

 

7.1 About Ceph performance

        So far, this series has not discussed Ceph performance in any detail, nor presented any performance data. The reason is simple: the author has not had the opportunity to carry out a thorough performance analysis of Ceph, nor has he seen reasonably comprehensive data from others. To avoid misleading readers with one-sided numbers, none are given here.

        Based on the author's personal experience, it is not easy to discuss the performance of an open source project in the systems field. The reason is that far too many interacting factors affect the performance of a system in an actual deployment: variations in hardware configuration, software version, parameter tuning, application load, scenario setup, and so on all lead to different performance test results. It is therefore difficult to say, in a single sentence, whether a project's performance is good or bad.

        To give an example from a different area: in the hypervisor field, many people tend to assume that ESXi performs better than KVM, yet on the SPECvirt benchmark results list, KVM-based systems frequently top the rankings. The reason is that, in addition to raw hardware performance, KVM exposes a large number of tunable configuration parameters, and how well they are tuned significantly affects system performance.

        Another example is Hadoop, the widely used open source big data tool. The same Hadoop cluster running the same application on the same data set can show total running times that differ by a factor of several when the configuration parameters differ.

        Precisely because parameter configuration, hardware specifications, software versions, application scenarios, and other factors can all have a significant impact on performance, evaluating the performance of a system like Ceph, which has many deployment options and many configuration parameters, requires careful thought.

        Seen from the other side, this is also one of the ways to make money from open source software: although the software itself is open source and anyone can download and install it for free, using it well requires deep professional expertise. Companies built on this model are common abroad, and they have begun to appear in China as well.

 

7.2 The fit between the Ceph architecture and the hardware platform

        Since Ceph was officially released in 2006, its core infrastructure, RADOS, has not undergone major changes. Essentially this is because the design of RADOS really is excellent and forward-looking, so there has been no need for fundamental rework. But that does not mean it is unnecessary to reflect on it properly.

        As mentioned earlier, in 2006 mainstream commodity processors were still single-core, and the capacities of a single memory module and a single hard disk were far smaller than today's mainstream levels. The basic hardware resource requirements of an OSD, however, have not changed. This means that in a typical current deployment, a single physical server is likely to have dozens of processor hardware threads and dozens of hard disks, and therefore to host dozens of OSDs running at the same time. Yet a basic assumption of the RADOS architecture is that the cluster consists of a large number of OSDs that operate and fail independently of one another, and today's typical hardware configurations may weaken the validity of this assumption. For example, if a server fails, or must be shut down for maintenance, dozens of OSDs go offline at once, and the number of affected PGs can reach into the thousands or even tens of thousands. Such a sudden event may put considerable pressure on the system's automatic maintenance mechanisms.
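        To make the scale concrete, here is a rough back-of-the-envelope sketch (not from the original article); the per-server OSD count and the PG-per-OSD target below are illustrative assumptions, not measurements.

```python
# Rough estimate of how many PGs are touched when one dense server goes offline.
# Both inputs are illustrative assumptions, not measured values.
osds_per_server = 36     # e.g. one OSD per disk on a 36-bay chassis (assumed)
pgs_per_osd = 100        # commonly cited target of roughly 100 PGs per OSD (assumed)

affected_pgs = osds_per_server * pgs_per_osd
print(f"PGs to re-peer or backfill after one server failure: ~{affected_pgs}")
# If several such servers go down together (say, a rack-level outage), the count
# quickly reaches the tens of thousands of PGs mentioned above.
```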

        This leads the author to suspect that the hardware platform Sage had in mind when designing Ceph was in fact a machine with modest processing power and relatively simple hardware specifications, probably closer to today's ARM-based or Intel Atom-based micro-servers. Perhaps deploying a Ceph cluster on micro-servers is a direction worth trying.

        In addition, Huawei and Seagate have launched IP hard drive products. Although the author has not studied them in depth, intuition suggests that this kind of new, lightweight, intelligent storage device may be very close to the OSD hardware platform that Sage envisioned back then.

 

7.3 Ceph and software-defined storage

        The words "software definition" are one of the hottest and most confusing concepts. Software-defined computing, software-defined networking, software-defined storage, and software-defined data centers are probably the most common related terms.

        What exactly is "software-defined" has not yet formed a completely consistent view. Moreover, referring to some precedents in the history of technological development, it may not be possible to form so-called consensus in the future. In this case, starting with a specific example, it may be easier to obtain an intuitive understanding, and thus establish a more systematic view.

        The author believes that, for any system, the essence of "software-defined" lies here: characteristics of the system, such as functionality or performance, that used to be fixed, or at best configurable within narrow limits, can now be defined and changed easily and flexibly through software.

        Take a physical server, for example. Once its hardware, CPU, memory, hard disks and so on, has been put together, the server's specifications and performance are fixed, and the range of performance and functionality that can be adjusted through BIOS settings is very limited. For a virtual machine, by contrast, even after the VM has been created and an operating system installed, characteristics such as the number and processing capacity of its CPU cores, its logical and actual physical memory sizes, the number, capacity, and read/write performance of its disks, and the number, model, and bandwidth of its network interfaces can all be controlled and changed easily and flexibly through software (some configuration changes require a VM restart to take effect), and this configuration can be driven by application-layer software. Comparing the two, the definability of the virtual machine is an intuitive example of software-defined computing.
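        As an illustration of that last point, application-layer control, the following minimal sketch (not from the original article) uses the libvirt Python bindings to change a KVM guest's vCPU count and maximum memory; the connection URI and the guest name "vm1" are assumptions.

```python
# Illustrative sketch: reconfiguring a virtual machine from application-layer code.
# Assumes libvirt-python is installed and a defined KVM guest named "vm1" exists.
import libvirt

conn = libvirt.open("qemu:///system")   # local hypervisor (assumed URI)
dom = conn.lookupByName("vm1")          # hypothetical guest name

# Change the persistent configuration; these take effect on the next boot.
dom.setVcpusFlags(4, libvirt.VIR_DOMAIN_AFFECT_CONFIG)
dom.setMemoryFlags(8 * 1024 * 1024,     # libvirt expects KiB, so this is 8 GiB
                   libvirt.VIR_DOMAIN_AFFECT_CONFIG | libvirt.VIR_DOMAIN_MEM_MAXIMUM)

conn.close()
```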

        Turning specifically to storage: in general, the main characteristics of a storage system include the storage type (file system? block storage? object storage?), storage capacity, storage performance (access bandwidth, access latency, and so on), and storage policies (backup policy, access security policy, advanced data processing functions, and so on). By analogy with the software-defined computing example above, it is natural to expect that in a software-defined storage system these characteristics, or at least most of them, should be definable through software.

        As far as Ceph is concerned, the feature that most clearly matches software-defined storage is undoubtedly that Ceph's storage type can be defined through software: the same RADOS cluster can provide block storage, object storage, and file system storage simply by deploying different upper-layer software and the corresponding client programs. For traditional storage systems this is hard to imagine. In addition, Ceph's storage policies, such as backup policies and background data processing functions, can also be easily defined or extended through software. From this perspective, Ceph can reasonably be regarded as a real-world case of software-defined storage.
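        The following minimal sketch (not from the original article) illustrates this point with the librados and librbd Python bindings: the same cluster, reached through the same ceph.conf, stores a plain object in one pool and a block-device image in another. The pool, object, and image names are assumptions, and a reachable cluster is required.

```python
# Minimal sketch: one RADOS cluster, two access styles (object and block).
# Assumes the python-rados and python-rbd bindings and a reachable cluster.
import rados
import rbd

cluster = rados.Rados(conffile="/etc/ceph/ceph.conf")
cluster.connect()

# Object-style access through librados (pool name is an assumption).
obj_ioctx = cluster.open_ioctx("objects-pool")
obj_ioctx.write_full("greeting", b"hello rados")
print(obj_ioctx.read("greeting"))
obj_ioctx.close()

# Block-style access through librbd against the very same cluster.
rbd_ioctx = cluster.open_ioctx("rbd")
rbd.RBD().create(rbd_ioctx, "demo-image", 1 * 1024**3)   # 1 GiB image
rbd_ioctx.close()

cluster.shutdown()
```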

 

7.4 Ceph and data-centric computing

        Traditionally, computing systems have been designed with computation at the center: data flows from storage, the network, or other devices into the processor, and flows back out to storage, the network, or other devices after processing. However, as the volume of data to be processed grows explosively, and as computing power grows faster than storage and transmission capacity, this style of processing may no longer be economical, because the cost of frequently moving large volumes of data across hard disks and networks is considerable.

        The concept of data-centric computing was proposed against this background. Its core idea is to let computation happen where the data is: wherever the data resides, send the computing task there, rather than shuttling the data around to reach "powerful" compute. The emergence of Hadoop is, in fact, a concrete embodiment of this data-centric idea.

        Another example of data-centric computing is ZeroVM, a lightweight virtualization technology that has appeared in the OpenStack community [1]. ZeroVM's idea is likewise to let computation happen where the data is. According to the official information, ZeroVM has been integrated with Swift, so that processing tasks can run directly on the Swift server side.

        In fact, Ceph provides the same kind of capability. Ceph's entire design rests on one of Sage's basic ideas: make full use of the computing power of the storage devices themselves. This idea not only lets OSDs cooperate with one another to carry out data access operations and cluster maintenance, it also lets the OSDs' spare computing power be used to run data processing tasks.

        RADOS already provides a mechanism for running dynamically loadable data processing plug-ins (object classes) directly on the OSDs, so that data can be processed on the server side, for example, automatically watermarking images or converting their size and format in the background in an image storage system. In principle, a Hadoop-like big data processing system could also be built on this capability.
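        As a rough sketch of how a client asks an OSD to run such a plug-in, the snippet below invokes the sample "hello" object class that ships with the Ceph source tree; whether that class is available on your OSDs, and whether your python-rados version exposes Ioctx.execute(), are assumptions to verify.

```python
# Minimal sketch: asking an OSD to run a server-side object-class method.
# Assumes python-rados exposes Ioctx.execute() and that Ceph's sample
# "hello" object class is loadable on the OSDs.
import rados

cluster = rados.Rados(conffile="/etc/ceph/ceph.conf")
cluster.connect()
ioctx = cluster.open_ioctx("objects-pool")   # pool name is an assumption

ioctx.write_full("greeting", b"payload")     # make sure the object exists
ret, out = ioctx.execute("greeting", "hello", "say_hello", b"")
print(out)                                   # produced on the OSD, not on the client

ioctx.close()
cluster.shutdown()
```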

        For big data, storage and processing are the two key technical areas. Since Ceph is itself an excellent storage system and can also host computing tasks directly, data-centric computing over big data may well be one of Ceph's potential applications.

 

7.5 Problems that Ceph may have in practical applications

        So far, this series has mostly presented Ceph's various advantages and strengths. No system is perfect, however, so in the spirit of nitpicking, a few words of criticism are in order here.

        From a non-technical point of view, Ceph's biggest problem is that it has not been popular for very long, so there is not much reference material available, especially in Chinese. There is no shortcut here: as the saying goes, the fire burns high when everyone adds firewood, and the community can only build up this material contribution by contribution.

        In addition, the most common criticism of Ceph is probably that it is not yet mature enough. But an open source project only matures as more people use and contribute to it, and Ceph is in the middle of that process, so maturity will take time and participation.

        Furthermore, in the author's view, Ceph's high degree of automation may be a double-edged sword. The benefits are many, but the drawback is that the system's operating state is not entirely under the administrator's control: a number of operations are triggered automatically by the system rather than by the administrator. This can add complexity to monitoring and controlling the system's state, and administrators need to adapt to it.

 

7.6 Industry needs around Ceph and possible business opportunities

        To be clear: the content of this section is pure wild speculation and does not constitute investment advice :-)

        First of all, Ceph installation, deployment, and performance tuning will inevitably become prominent needs. Integrating Ceph with commodity servers into storage solutions that are easy to deploy and perform well should therefore be one direction worth considering.

        At the same time, given Ceph's particular assumptions about the OSD hardware platform and the optimization space those assumptions open up, developing, at reasonable cost, a customized hardware platform better suited to Ceph OSDs (along the lines of micro-servers or IP hard disks) that emphasizes storage-oriented qualities such as high density, low power consumption, and easy maintenance could also become an option.

        In addition, there may also be demand for dedicated tooling for Ceph clusters, such as cluster monitoring and performance analysis software.

        Finally, a software toolkit for Ceph-based background data processing is also worth considering.
