Mobile Cloud operating system transformation in practice: optimizing cross-OS cloud host migration (1)

In recent years, the Linux operating system has developed rapidly in technology, community, and commercial solutions. Mobile Cloud has successively released the new-generation Tianyuan operating system and easy migration tools to ensure efficient migration of its services across all scenarios. In Mobile Cloud's CentOS migration practice, cross-OS virtual machine migration is an important part of the transformation. The production network environment is complex. Ensuring that customer workloads are not interrupted during virtual machine migration, and that migrated virtual machines run smoothly on the new Linux operating system platform, poses many technical challenges on the underlying virtualization side.

Key challenges

Same-source, heterogeneous virtualization components

The new virtualization components must run stably on multiple Linux operating systems and share one source tree across different operating systems and CPU architectures. The first problem to solve is therefore building heterogeneous targets from a single source.

OS compatibility adaptation

Core services such as compute, storage, and SDN must all remain compatible on the new OS. On the new platform, some of their compatibility and adaptation problems have to be solved in the middle layer, that is, the virtualization layer.

Cross-OS live migration without downtime

Migration across operating systems and across major versions of the virtualization components may fail because of differences in virtual machine CPU capabilities, memory layout, device structures, and so on, which affects business continuity.

Same-source, heterogeneous virtualization components

Same-source heterogeneity hides the differences between systems and architectures, reduces the number of versions deployed in the production network, and eases the burden of code maintenance. The goal is "one code base, compile once, run everywhere".

We solved many problems during this transformation. For example, different systems and architectures have different build and installation dependencies, so the spec file must declare the appropriate dependencies for each system and architecture. The old and new virtualization component packages conflict at install time, so the rpm Obsoletes mechanism is used to remove the old packages and upgrade the components smoothly. In addition, because the old and new virtualization components differ substantially, some function calls from the old version failed after the patches were ported, and some functions had to be redesigned around the code differences.

OS compatibility adaptation

To ensure that virtual machines run smoothly on the new OS, we had to solve a number of adaptation problems between compute, storage, SDN, and the virtualization components on the platform, and we made several optimizations along the way.

Compatibility with Python2

With Python2 at the end of its life cycle, libvirt-python has not supported Python2 since version 6.0.0. However, some products have a long transformation cycle and must temporarily stay on a Python2 environment as a transition. As the underlying component, the virtualization layer therefore needs to provide a libvirt-python package built for Python2. Starting from the syntax differences, interface changes, and module changes between Python3 and Python2, we modified the libvirt-python code on the new OS, including:

  • Syntax differences such as data types and class definitions.
  • Differences in API usage such as exception handling, input and output, and iterators.
  • Modules that were renamed or deprecated between Python2 and Python3.

After modifying more than 50 files and thousands of lines of code, we obtained a stable and reliable libvirt-python based on Python2.
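To illustrate the three kinds of differences listed above, here is a minimal sketch of code written so that it runs on both Python2 and Python3. It is not taken from the libvirt-python patches; the helper names (to_text, describe_error, sorted_items) are made up for this example.

```python
from __future__ import print_function  # print() is a function on Python2 too

import sys

# Renamed module: "Queue" on Python2, "queue" on Python3.
try:
    import queue
except ImportError:
    import Queue as queue

# Data types: Python2 has separate str and unicode types, Python3 only str.
if sys.version_info[0] >= 3:
    text_type = str
else:
    text_type = unicode  # noqa: F821 (this name exists only on Python2)


def to_text(value, encoding="utf-8"):
    """Return value as text on either interpreter (hypothetical helper)."""
    if isinstance(value, bytes) and not isinstance(value, text_type):
        return value.decode(encoding)
    return text_type(value)


def describe_error(func):
    """Call func and report a ValueError using syntax valid on both versions."""
    try:
        func()
    except ValueError as exc:  # "except ValueError, exc" is Python2-only
        return "error: %s" % exc
    return "ok"


def sorted_items(mapping):
    """dict.iteritems() is gone in Python3; items() works on both."""
    return sorted(mapping.items())


if __name__ == "__main__":
    print(to_text(b"vm-01"), describe_error(lambda: int("x")))
```

The try/except import and the "except ... as" form are the usual way to keep a single source file valid on both interpreters during a transition period.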

Adaptation of OVS-dpdk and QEMU

The stability and reliability of SDN directly affect the network service quality of users' virtual machines. To ensure that SDN runs smoothly on the new platform, we actively drove the adaptation work of the various SDN vendors on the new OS and solved multiple adaptation problems.

During SDN adaptation, we found that when QEMU acts as the server, restarting ovs may crash the virtual machine. By examining the QEMU coredump file, we located the code that triggers the crash. When QEMU acts as the server and ovs restarts, QEMU actively tries to reconnect. In the process, the TCP state of the network device is changed to TCP_CHARDEV_STATE_DISCONNECTED, which exposes a bug in the handling logic and crashes QEMU. In fact, QEMU as the server should not reconnect at all; the reconnect should be initiated by ovs as the client.

The fix is to have QEMU reconnect after an ovs restart only when QEMU acts as the client. Beyond this problem, we also solved several others, including network failure in Windows virtual machines after an ovs hot upgrade, and virtual machines hanging when running the testpmd test program on the Hygon (Haiguang) platform.
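The reconnect rule can be sketched as a small state model. This is a simplified illustration in Python, not QEMU's C code: the class SocketChardev and its fields are hypothetical, and only the state name DISCONNECTED mirrors TCP_CHARDEV_STATE_DISCONNECTED from the text.

```python
from enum import Enum


class State(Enum):
    DISCONNECTED = 0
    CONNECTING = 1
    CONNECTED = 2


class SocketChardev:
    """Toy model of a socket-backed character device (hypothetical)."""

    def __init__(self, is_server):
        self.is_server = is_server
        self.state = State.CONNECTED
        self.listening = False
        self.reconnect_scheduled = False

    def on_peer_closed(self):
        """Handle the peer (for example ovs) going away."""
        self.state = State.DISCONNECTED
        if self.is_server:
            # Server side: never dial out. Go back to listening and let
            # the client re-establish the connection.
            self.listening = True
        else:
            # Client side: this end is responsible for reconnecting.
            self.reconnect_scheduled = True

    def on_reconnect_timer(self):
        """Timer callback; only meaningful for the client role."""
        if self.reconnect_scheduled and self.state is State.DISCONNECTED:
            self.state = State.CONNECTING
            self.reconnect_scheduled = False
```

Keeping the role check in the disconnect handler means the server path never enters the reconnect logic, which is the property the fix relies on.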

Efficiency Optimization for Volume Migration Operations

Distributed storage provides the basic storage service for the cloud platform, and its use frequently involves volume migration and capacity queries. These operations were not efficient and needed optimization.

When native QEMU migrates a Ceph volume, the entire dirty bitmap is marked dirty before the migration starts. During the first round of migration, all data on the source disk is copied to the destination disk (regions that never held data are written as zeros), which increases the amount of data migrated and lengthens the migration. We optimized Ceph volume migration to reduce the amount of data in the first round, which greatly improves efficiency, especially when the source disk is large but holds little data. The specific steps are as follows:

  1. Modify the rbd driver in QEMU to add an interface that returns the distribution of used space of a Ceph volume on the back-end cluster.
  2. At the start of migration, use this interface to initialize the dirty bitmap for the volume migration.
  3. During migration, any new IO from the virtual machine marks the corresponding bits in the dirty bitmap. The dirty bitmap is cleaned iteratively: only storage blocks that contain data are copied, and blocks without data are skipped.
  4. Volume live migration is complete when the dirty bitmap is completely clear.
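The steps above can be sketched as follows. This is a simplified Python model, not the QEMU rbd driver: used_extents stands in for the output of the new used-space interface, and guest_writes is a hypothetical way of injecting blocks dirtied by the guest between passes.

```python
def init_dirty_bitmap(disk_size, granularity, used_extents):
    """Mark only blocks overlapping used extents as dirty (step 2).

    used_extents is a list of (offset, length) pairs in bytes.
    """
    nblocks = (disk_size + granularity - 1) // granularity
    bitmap = [False] * nblocks
    for offset, length in used_extents:
        if length <= 0:
            continue
        first = offset // granularity
        last = min((offset + length - 1) // granularity, nblocks - 1)
        for i in range(first, last + 1):
            bitmap[i] = True
    return bitmap


def migrate(bitmap, copy_block, guest_writes=()):
    """Copy dirty blocks until the bitmap is clear (steps 3 and 4).

    guest_writes is a sequence of lists of block indices; one list is
    applied after each pass to simulate new guest IO.
    """
    pending = list(guest_writes)
    copied = 0
    while any(bitmap):
        for i, dirty in enumerate(bitmap):
            if dirty:
                copy_block(i)      # blocks that were never dirty are skipped
                bitmap[i] = False
                copied += 1
        if pending:
            for i in pending.pop(0):
                bitmap[i] = True   # new guest IO re-dirties blocks
    return copied
```

With a 64-byte disk, 4-byte granularity, and two small used extents, only the blocks covering those extents are copied in the first pass instead of all 16.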

There are also some other optimizations:

  1. Ceph volume capacity queries now call the new interface, which improves query efficiency by more than 30%.
  2. QEMU now supports migrating Ceph volume snapshots, so the snapshot information of the source volume is retained after migration.

Cross-OS live migration without downtime

After solving the adaptation problems with the other core products, we turned our attention to cross-OS migration. We worked closely with members of the openEuler community's Virt SIG to address the several aspects that cross-OS migration must take into account, such as the guest's CPU capabilities and device state.

Migration fails due to an unsupported machine type on the destination

Migrating from the BC-Linux7 series to the BC-Linux For Euler series produces an "unsupported machine type" error. Comparing the machine types supported by the QEMU components of the two operating systems showed that the BC-Linux7 virtualization components had removed the native machine types of community QEMU and defined entirely private, customized machine types, so virtual machines could not be live migrated to openEuler as is.

Because the machine type cannot be changed during migration, the newer QEMU on BC-Linux For Euler must support the machine types of the older version for migration to succeed. To this end, we catalogued the devices supported by each machine type in the older QEMU and ported the corresponding machine types to the newer QEMU, so that migration no longer fails with an unsupported machine type.
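A pre-migration check for this class of failure can be sketched as below. It assumes the destination's supported machine types are available as text in the layout printed by "qemu-system-x86_64 -machine help" (a header line followed by one name per line); the sample text and the name bc-private-1.0 are invented for illustration and are not the real private machine types.

```python
def parse_machine_types(help_text):
    """Extract machine type names from '-machine help' style output."""
    names = set()
    for line in help_text.splitlines():
        line = line.strip()
        if not line or line.lower().startswith("supported machines"):
            continue
        names.add(line.split()[0])  # first column is the machine type name
    return names


def can_accept(source_machine_type, dest_help_text):
    """True if the destination QEMU lists the source VM's machine type."""
    return source_machine_type in parse_machine_types(dest_help_text)


# Invented sample output; "bc-private-1.0" is a hypothetical name.
DEST_BEFORE_PORT = """Supported machines are:
pc-i440fx-4.1        Standard PC (i440FX + PIIX, 1996) (default)
q35                  Standard PC (Q35 + ICH9, 2009) (alias of pc-q35-4.1)
"""
DEST_AFTER_PORT = DEST_BEFORE_PORT + (
    "bc-private-1.0       Hypothetical private machine type\n"
)
```

Running such a check before starting a migration turns a mid-migration failure into an early, explicit error.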

Migration fails due to differences in device structures

QEMU uses the VMStateDescription (VMSD) data structure to describe and manage device state, and the fields and subsections of a VMSD are sent to the destination during migration. For migration to succeed, the two ends must agree on the device structure: if the source has more fields than the destination, the extra fields must be suppressed so that vmstate_load_state on the destination does not encounter them when loading the device state; conversely, if the source has fewer fields, the destination must skip the missing ones.

During testing, when we migrated a virtual machine back from the BC-Linux For Euler series to the BC-Linux7 series, the destination could not receive the device state of the keyboard. Comparing the keyboard VMSDs showed that the newer QEMU additionally sends the kbd_extended_state field, and the migration failed because the destination lacks this field. kbd_extended_state_needed is the function that decides whether kbd_extended_state is sent (it returns true by default). To keep back-migration from failing because of kbd_extended_state, the extra field must not be sent when migrating back.

We also compared the VMSDs of other devices between the newer and older QEMU versions, virtio and vhost_user devices in particular, and modified the code to account for the differences.
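The role of the "needed" callback can be modelled in a few lines. This is a toy Python model of the idea, not QEMU's vmstate code: Subsection, send_device_state, load_device_state, and the dest_is_old_qemu switch are all hypothetical, and only the name kbd_extended_state_needed comes from the text.

```python
class Subsection:
    """Optional part of a device's state, sent only when needed(ctx) is true."""

    def __init__(self, name, needed):
        self.name = name
        self.needed = needed


def send_device_state(fields, subsections, ctx):
    """Source side: always send fields, send subsections conditionally."""
    stream = {"fields": dict(fields), "subsections": []}
    for sub in subsections:
        if sub.needed(ctx):
            stream["subsections"].append(sub.name)
    return stream


def load_device_state(stream, known_subsections):
    """Destination side: an unknown subsection makes the load fail."""
    for name in stream["subsections"]:
        if name not in known_subsections:
            raise ValueError("unknown subsection: %s" % name)
    return True


def kbd_extended_state_needed(ctx):
    # Hypothetical compatibility switch: do not send the extra state
    # when the destination runs the older QEMU.
    return not ctx.get("dest_is_old_qemu", False)


KBD_SUBSECTIONS = [Subsection("kbd_extended_state", kbd_extended_state_needed)]
```

In this model, sending with an empty context to a destination that knows no subsections raises an error, while sending with dest_is_old_qemu set to True loads cleanly.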

Live migration fails due to incompatible CPU features

A virtual machine CPU can be configured in three modes: custom (the smallest instruction set but the best live migration compatibility), host-passthrough (the largest instruction set but the worst live migration compatibility), and host-model (in between). However, the CPU features of a virtual machine depend not only on the virtualization configuration but also on the host's CPU model and operating system kernel. Even in custom or host-model mode, we encountered migration failures caused by CPU features missing at the destination.

Case 1: the arch-facilities feature is missing at the destination

The newer libvirt renamed this CPU feature from arch-facilities to arch-capabilities, so the destination does not recognize arch-facilities and live migration fails. The name must be changed to arch-capabilities in cpu_map.xml on the source so that the CPU feature compatibility check passes.
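The effect of the rename on a feature compatibility check can be shown with a small sketch. It is only an illustration of why the names must match, not libvirt's actual check and not the cpu_map.xml change described above; FEATURE_ALIASES and both functions are hypothetical.

```python
# Old name -> new name for the same CPU feature (from the text above).
FEATURE_ALIASES = {"arch-facilities": "arch-capabilities"}


def normalize(features):
    """Map old feature names to their current names."""
    return {FEATURE_ALIASES.get(name, name) for name in features}


def missing_on_destination(guest_features, dest_features):
    """Features the guest requires that the destination cannot provide."""
    return sorted(normalize(guest_features) - normalize(dest_features))
```

Without normalization, a guest requiring arch-facilities would be reported as incompatible with a destination that offers the very same feature under the name arch-capabilities.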

Case 2: the spec_ctrl feature is missing at the destination

The 3.10 kernel used by the BC-Linux7 series needs spec_ctrl enabled to mitigate the Spectre vulnerability, whereas the 4.19 kernel used by the BC-Linux For Euler series mitigates it by other means and leaves spec_ctrl disabled. The microcode on the destination must be updated to enable the spec_ctrl feature there.

Case 3: the hle/rtm features are missing at the destination

Live migration of a virtual machine configured with host-model fails because the destination lacks the hle/rtm features. Adding "tsx=on" to the kernel boot parameters of the destination node enables the relevant instruction set.
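A destination host can be checked for this condition before migration. The sketch below parses text in the format of /proc/cpuinfo and /proc/cmdline; the function names and the sample strings are made up for illustration.

```python
def cpu_flags(cpuinfo_text):
    """Return the CPU flags from the first 'flags' line of cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


def tsx_preflight(cpuinfo_text, cmdline_text):
    """Report whether hle/rtm are visible and whether tsx=on is set."""
    flags = cpu_flags(cpuinfo_text)
    return {
        "missing_flags": sorted({"hle", "rtm"} - flags),
        "tsx_on_cmdline": "tsx=on" in cmdline_text.split(),
    }


# Invented sample inputs; on a real host, read the two /proc files instead.
SAMPLE_CPUINFO = "processor\t: 0\nflags\t\t: fpu vme sse2 avx2\n"
SAMPLE_CMDLINE = "BOOT_IMAGE=/vmlinuz root=/dev/sda1 ro quiet"
```

On a real destination node, the two arguments would be the contents of /proc/cpuinfo and /proc/cmdline; a non-empty missing_flags list together with tsx_on_cmdline being False matches the situation described in this case.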

Summary

Through same-source heterogeneous virtualization components, OS compatibility adaptation, and close collaboration with the openEuler community on cross-OS live migration without downtime, we have ensured, in both principle and practice, that the CentOS migration and transformation work can proceed efficiently. However, a large number of nodes in the existing Mobile Cloud production network still need to be migrated, which places higher demands on migration efficiency and success rate. In the next article, we will share our work on improving and optimizing live migration performance. Stay tuned!


Origin blog.csdn.net/openEuler_/article/details/132182296