Principles of virtualization technology (CPU, memory, I/O)

This article comes from: http://www.ywnds.com/?p=5856

Virtualization

Cloud computing is now mature, and virtualization is an indispensable technology for building cloud infrastructure. A cloud computing system is essentially a large-scale distributed system. Virtualization creates multiple virtual platforms on one physical platform, and each virtual platform can join the cloud as an independent node of the distributed system. Compared with using the physical platform directly, virtualization offers large advantages in resource utilization, dynamic provisioning, and reliability. With virtualization, an enterprise does not have to abandon its existing infrastructure to build a new one, so it can make fuller use of its existing IT investment.

Virtualization technology

Virtualization is a broad term. It refers to running computing components on a virtual rather than a real basis, and it is a solution for simplifying management and optimizing resources.

In x86 virtualization, the newly introduced virtualization layer is usually called the virtual machine monitor (Virtual Machine Monitor, VMM), also known as the hypervisor. The environment in which the VMM runs, that is, the real physical platform, is called the host. The virtual platforms it creates are called guests, and the operating system that runs in a guest is called the guest operating system.

In a 1974 paper, Popek and Goldberg defined the basic requirements of "classical virtualization". In their view, a true VMM must meet at least three criteria:

  • Equivalence (equivalent execution): apart from differences in resource availability and timing, a program behaves exactly the same in the virtual environment as in the real one.
  • Performance: most instructions must run directly on the CPU.
  • Safety: the VMM must have full control of system resources.

Virtualization can be implemented in many ways: in software or with hardware assistance, and as full virtualization or paravirtualization. Each approach is introduced briefly below.

CPU virtualization technology

First, software virtualization and hardware virtualization

1) Virtualization: software solutions

Pure software virtualization, as the name suggests, uses software alone on an existing physical platform (often one without hardware virtualization support) to intercept and emulate accesses to the physical platform. A common example is QEMU, which emulates the fetch, decode, and execute cycle of an x86 processor entirely in software, so guest instructions never execute directly on the physical platform. Because every instruction is emulated in software, performance is often poor, but one host platform can emulate virtual machines of different architectures.
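The fetch, decode, and execute cycle described above can be sketched as a small interpreter. This is a minimal illustration only: the instruction set (`mov`, `add`, `dec`, `jnz`, `halt`) and the register names are invented for the example and are not QEMU's actual design.

```python
def emulate(program, max_steps=10_000):
    """Interpret a toy program one instruction at a time, purely in software."""
    regs = {"r0": 0, "r1": 0}
    pc = 0
    for _ in range(max_steps):
        if pc >= len(program):
            break
        op, *args = program[pc]          # fetch
        pc += 1
        if op == "mov":                  # decode and execute
            regs[args[0]] = args[1]
        elif op == "add":
            regs[args[0]] += regs[args[1]]
        elif op == "dec":
            regs[args[0]] -= 1
        elif op == "jnz":                # jump if register is non-zero
            if regs[args[0]] != 0:
                pc = args[1]
        elif op == "halt":
            break
        else:
            raise ValueError(f"unknown opcode: {op}")
    return regs


# Hypothetical guest program: sum 3 + 2 + 1 into r0.
program = [
    ("mov", "r0", 0),
    ("mov", "r1", 3),
    ("add", "r0", "r1"),   # index 2: loop body
    ("dec", "r1"),
    ("jnz", "r1", 2),
    ("halt",),
]
result = emulate(program)
```

Every guest instruction costs several host operations here, which is the reason the text gives for the poor performance of pure emulation.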

VMware's software virtualization instead uses dynamic binary translation (BT), which differs from QEMU-style emulation. BT is one way of accelerating virtualization; the other common one is hardware-assisted virtualization. With BT, guest instructions run directly on the physical platform within limits that the VMM controls. Before guest code runs, the VMM scans it, and any instruction that would escape the VMM's limits is dynamically replaced either with safe instructions that can run directly on the physical platform or with a call into the VMM. This performs much better than pure software emulation, but it loses the ability to virtualize across platforms. (The distinction is that emulation builds in software a device that does not physically exist, whereas virtualization partitions or otherwise shares a real device to provide a certain level of service.)

With BT, the guest's user space runs in CPU ring 3, the guest's kernel space runs in CPU ring 1, and the host's kernel space runs in CPU ring 0. BT monitors ring 1 and translates the guest kernel's privileged instructions whenever they occur. Ring 1 is otherwise unused on x86, and using it this way improves virtualization performance considerably. BT's major drawback is that it cannot cross architectures. QEMU, as an emulator, can emulate CPU architectures such as PowerPC and ARM whatever the underlying hardware. BT cannot, because it depends heavily on the underlying architecture: on an x86 CPU it can only create x86 virtual machines.
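The scan-and-replace step of binary translation can be sketched as follows. This is a simplified illustration, not VMware's implementation: instructions are plain strings, the `PRIVILEGED` set is a small toy subset, and the translation cache is an assumption added to show why translated code runs faster the second time.

```python
# Toy subset of instructions that must not run unmodified (assumption).
PRIVILEGED = frozenset({"cli", "sti", "hlt"})

_translation_cache = {}


def translate_block(block):
    """Rewrite a block once: privileged instructions become calls into the VMM."""
    key = tuple(block)
    if key not in _translation_cache:
        _translation_cache[key] = [
            ("vmm_call", insn) if insn in PRIVILEGED else ("native", insn)
            for insn in block
        ]
    return _translation_cache[key]


def run_block(block, vmm_log):
    """Run a translated block; return how many instructions ran natively."""
    native_count = 0
    for kind, insn in translate_block(block):
        if kind == "vmm_call":
            vmm_log.append(insn)      # the VMM emulates the effect
        else:
            native_count += 1         # would execute directly on the CPU
    return native_count
```

For example, `run_block(["mov", "cli", "add", "sti"], log)` runs two instructions natively and hands `cli` and `sti` to the VMM.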

In a pure software solution, the VMM sits in the software stack where the operating system traditionally sits, and the operating system sits where applications traditionally sit. This shift inevitably adds complexity. A more complex software stack makes the environment harder to manage, which in turn makes system reliability and security harder to guarantee.

2) Virtualization: hardware solutions

Hardware-assisted virtualization (HVM) means, in short, that the physical platform itself provides hardware support for intercepting and redirecting special instructions. Newer hardware even provides extra resources to help software virtualize key hardware resources, which improves performance. It can be thought of as adding to the CPU an extra ring, ring -1, provided specifically to support running virtual machines. On x86, for example, a CPU that supports virtualization has a specially optimized instruction set for controlling virtualization. With these instructions the VMM can easily place a guest in a restricted mode; once the guest tries to access physical resources, the hardware suspends the guest and returns control to the VMM. The VMM can also use hardware virtualization enhancements so that guest accesses to certain resources in restricted mode are redirected entirely by hardware to virtual resources that the VMM specifies, without suspending the guest and without involving VMM software.
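The suspend-and-return-control cycle can be modeled as a loop in which the guest runs until it touches something sensitive, the "hardware" exits to the VMM, and the VMM handles the event and resumes the guest. The sketch below is a software model only; the instruction names and the `sensitive` set are hypothetical, and real hardware performs the exit itself.

```python
def run_until_exit(program, pc, sensitive):
    """Model of restricted mode: run until a sensitive instruction is reached."""
    while pc < len(program):
        if program[pc] in sensitive:
            return pc, program[pc]    # guest suspended, control goes to the VMM
        pc += 1                       # ordinary instruction: runs at full speed
    return pc, None


def vmm_loop(program, sensitive):
    """VMM side: enter the guest, handle each exit, then resume after it."""
    exits = []
    pc = 0
    while True:
        pc, reason = run_until_exit(program, pc, sensitive)
        if reason is None:
            break
        exits.append(reason)          # the VMM emulates the instruction here
        pc += 1                       # resume the guest at the next instruction
    return exits
```

The fewer exits a workload causes, the closer it runs to native speed, which is why redirecting some accesses entirely in hardware helps.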

Because the hardware provides a new architecture on which an operating system can run directly, binary translation is no longer needed. This removes the associated performance overhead and greatly simplifies VMM design, so a VMM can be written to common standards and perform better.

Note that hardware virtualization is a complete solution rather than a single feature: it needs support from the CPU, the motherboard chipset, the BIOS, and software such as the VMM or, in some cases, the operating system itself. Even if only the CPU supports virtualization, a system running a VMM will perform better than one with no hardware support at all. Given the large demand for virtualization and the broad prospects for hardware virtualization products, Intel has kept improving its hardware virtualization product line. From late 2005, Intel began applying Intel Virtualization Technology (Intel VT) across its processors and released a series of processors with Intel VT, including desktop Pentium and Core parts and Xeon and Itanium server parts. Intel has continued to optimize hardware virtualization performance and add new virtualization features in each new processor generation. Today almost all processors on the market, from desktop Core i3/i5/i7 to server Xeon E3/E5/E7, support Intel VT, and it is likely to become standard on all Intel processors. AMD CPUs support hardware virtualization as well.
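On Linux, one common way to check whether the CPU advertises these extensions is to look for the `vmx` (Intel VT-x) or `svm` (AMD-V) flag in `/proc/cpuinfo`. The helper below is a small sketch; the function name `hw_virt_support` is made up for this example, and it takes the file contents as a string so it can be tried on any system.

```python
def hw_virt_support(cpuinfo_text):
    """Return 'Intel VT-x', 'AMD-V', or None based on CPU flags in cpuinfo text."""
    flags = set()
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            flags.update(value.split())
    if "vmx" in flags:
        return "Intel VT-x"
    if "svm" in flags:
        return "AMD-V"
    return None
```

On a Linux host it could be called as `hw_virt_support(open("/proc/cpuinfo").read())`. A `None` result may also mean that the feature is disabled in the BIOS, which matches the point above that firmware support is part of the picture.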

Summary

Hardware-assisted virtualization performs somewhat better than BT. As a rough illustration, if BT lets a virtual machine reach about 80% of the physical machine's performance, hardware-assisted virtualization (HVM) can reach about 85%. The intermediate translation step is still needed, but it is done directly by the hardware.

Second, full virtualization and paravirtualization

Full virtualization

Full virtualization presents the guest with a complete virtual x86 platform, including processor, memory, and peripherals. In theory it supports any operating system that can run on the real physical platform, which gives the greatest flexibility in configuring virtual machines. Existing x86 operating systems and software written for non-virtualized environments run correctly in the guest without any modification, which is the unmatched advantage of full virtualization.

Under full virtualization, the virtual machine does not know that it is running in a virtualized environment, and installing it is no different from installing on a physical machine. This requires support from the virtualization software, which must emulate the hardware resources; at the very least, CPU privileged instructions must be emulated in software. To keep each guest unaware that it runs in a virtual environment, the VMM has to present it with a CPU that accepts privileged instructions.

Virtualization and emulation are two distinct concepts: VMware's dynamic binary translation (BT) is a virtualization technique, whereas QEMU's software approach is emulation. The biggest difference is that emulation implements CPU rings 0 to 3 in software and therefore translates every instruction in all rings, whereas virtualization only needs to translate the privileged instructions of ring 0.

BT, QEMU, and hardware-assisted virtualization, as described above, are all full virtualization techniques. All of them require instruction translation, which takes complex steps; if this step can be simplified, virtual machine performance improves. Paravirtualization, described below, is one way to do so. In addition, in full virtualization mode:

If the CPU does not support hardware virtualization: the VMM uses BT to dynamically translate the privileged instructions issued by the virtual machine into instructions the physical CPU can run, and then runs them on the CPU.

If the CPU supports hardware virtualization: the VMM runs in ring -1 and the guest OS runs in ring 0.

Paravirtualization (para-virtualization)

Software alone can fully virtualize a platform that lacks hardware virtualization support: the VMM monitors each virtual machine and keeps them independent and isolated from one another. The cost is greater software complexity and a performance loss. One way to reduce this burden is to modify the guest operating system so that it knows it is running in a virtual environment and can cooperate with the VMM. This approach is called paravirtualization. The guest kernel knows that it runs on a virtualization layer, so it no longer relies on BT to use hardware resources; it requests memory or CPU time from the VMM directly instead of issuing instructions that must be translated. It can even use I/O devices through hypercalls (calls provided by the hypervisor), which brings it close to dealing with the hardware directly. With fewer translation steps in between, performance is naturally better; paravirtualized guests are said to reach about 90% of physical machine performance. In essence, paravirtualization relaxes the requirement that the VMM passively intercept specific guest instructions and replaces it with active notification by the guest operating system. The price is that the guest operating system's source code must be modified to implement that notification.
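The idea of a guest asking the VMM directly, rather than issuing an instruction that must be intercepted, can be sketched as a hypercall table. Everything here is hypothetical: the call names `alloc_pages` and `console_write` and the `-1` error convention are invented for the example and do not reflect Xen's actual hypercall interface.

```python
class Hypervisor:
    """Toy VMM exposing a small table of hypercalls to modified guests."""

    def __init__(self, total_pages):
        self.free_pages = total_pages
        self.console = []
        self._table = {
            "alloc_pages": self._alloc_pages,
            "console_write": self._console_write,
        }

    def hypercall(self, name, *args):
        """Single entry point: dispatch by name, return -1 for unknown calls."""
        handler = self._table.get(name)
        if handler is None:
            return -1
        return handler(*args)

    def _alloc_pages(self, count):
        if count > self.free_pages:
            return -1
        self.free_pages -= count
        return count

    def _console_write(self, text):
        self.console.append(text)
        return len(text)
```

A paravirtualized guest kernel would call `hv.hypercall("alloc_pages", 4)` where an unmodified kernel would manipulate hardware directly and rely on the VMM to catch it.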

Xen is an open-source example of paravirtualization. Before an operating system can run as a virtual server on the Xen hypervisor, it must be modified at the kernel level. Xen paravirtualization therefore suits open-source operating systems such as BSD, Linux, and Solaris, but not proprietary operating systems such as Windows, whose kernels cannot be modified because they are not open source.

Summary

Since the arrival of hardware-assisted virtualization, the performance of full virtualization has also improved. Compared with paravirtualization, full virtualization is simpler to use, and the virtualization is transparent to the guest. Full virtualization is therefore more in line with market demand; KVM, discussed later, is an example.

Memory virtualization technology

Having covered CPU virtualization, the most important of these techniques, we now turn to memory, the second major component among a computer's five main components, and to memory virtualization.

Memory management already resembles virtualization: memory is exposed through virtual addresses, and every process believes it can use all of physical memory. The non-virtualized and virtualized addressing modes are described below.

Without virtualization

Without virtualization, the system exposes physical memory (page frames) to processes through virtual addresses, and each process believes it can use all of physical memory. The CPU contains a memory management unit (MMU). Whenever a process accesses data, it uses a linear address, that is, a virtual address. The process passes this address to the CPU, but the CPU cannot locate the data from the virtual address alone, so the MMU translates it into the corresponding physical address, and the data can then be accessed. A process's address space is generally a contiguous range of virtual addresses, whereas the physical memory backing it is generally not contiguous.
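The MMU's job can be sketched with a single-level page table. This is a deliberate simplification: real x86 page tables have several levels, and the 4 KiB page size and dictionary representation are assumptions made for the example.

```python
PAGE_SIZE = 4096  # assumed page size in bytes


def translate(page_table, vaddr):
    """Translate a virtual address using {virtual page number: frame number}."""
    vpn, offset = divmod(vaddr, PAGE_SIZE)
    if vpn not in page_table:
        raise LookupError(f"page fault at virtual address {vaddr:#x}")
    return page_table[vpn] * PAGE_SIZE + offset


# Contiguous virtual pages 0 and 1 map to non-contiguous frames 7 and 3.
page_table = {0: 7, 1: 3}
paddr = translate(page_table, PAGE_SIZE + 20)
```

Virtual pages 0 and 1 are adjacent while frames 7 and 3 are not, which is the contiguous-virtual, scattered-physical layout described above.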

With virtualization

To virtualize memory and give each guest an isolated, contiguous memory space that starts at zero, virtual machine monitors such as KVM introduce a new layer of address space: the guest physical address space (Guest Physical Address, GPA). This is not a real physical address space; it is a mapping of part of the host's virtual address space into the guest. To the guest, its physical address space is contiguous and starts at zero. To the host, the guest physical address space is not necessarily contiguous and may map to several discontiguous ranges of host addresses.

In a virtual environment, the guest's physical addresses cannot be sent directly to the host's physical MMU. A guest physical address must first be translated into a host virtual address (Host Virtual Address, HVA), which the host then translates into a host physical address (Host Physical Address, HPA). Inside the guest, the guest operating system still performs its own translation from guest virtual address (Guest Virtual Address, GVA) to GPA. With this mapping, every guest memory access needs the hypervisor to intervene and perform several address translations in software, which is very inefficient.
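The chain GVA to GPA to HVA to HPA can be written as three lookups in a row. The sketch below reuses the single-level table simplification; the table contents are arbitrary example values.

```python
PAGE_SIZE = 4096  # assumed page size in bytes


def walk(table, addr):
    """One translation stage: look up the page, keep the offset."""
    page, offset = divmod(addr, PAGE_SIZE)
    return table[page] * PAGE_SIZE + offset


def gva_to_hpa(guest_pt, gpa_to_hva, host_pt, gva):
    """Software translation with three stages for a single guest access."""
    gpa = walk(guest_pt, gva)        # done by the guest OS page tables
    hva = walk(gpa_to_hva, gpa)      # done by the VMM's memory map
    return walk(host_pt, hva)        # done by the host page tables


hpa = gva_to_hpa({0: 5}, {5: 100}, {100: 42}, 123)
```

Each guest access costs three lookups instead of one, which illustrates the inefficiency the paragraph above describes.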

To make translation from GVA to HPA more efficient, there are two ways to translate guest virtual addresses directly into host physical addresses. The first is software-based: shadow page tables (Shadow Page Table), which KVM supports. The second relies on hardware-assisted MMU virtualization.

Shadow page tables are complicated to implement, because every virtual machine needs its own shadow page table. They also make the TLB (Translation Lookaside Buffer) hard to hit, especially with several virtual machines. The TLB caches the results of the MMU's virtual-to-physical translations, and its entries are keyed only by GVA. Different virtual machines use the same GVAs, so the TLB must be flushed on every virtual machine switch; otherwise one virtual machine would read another's data. The resulting low hit rate also hurts virtual machine performance.
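Conceptually, a shadow page table is the composition of the guest's own page table with the VMM's guest-physical-to-host-physical map. The sketch below shows only that composition, under the single-level simplification; a real VMM must also detect guest page table writes and rebuild the affected entries, which is where much of the complexity lies.

```python
def build_shadow(guest_pt, gpa_to_hpa):
    """Compose GVA->GPA with GPA->HPA into a direct GVA->HPA table.

    Guest pages whose backing frame is not mapped by the VMM are left out,
    so an access to them would fault into the VMM.
    """
    return {
        gva_page: gpa_to_hpa[gpa_page]
        for gva_page, gpa_page in guest_pt.items()
        if gpa_page in gpa_to_hpa
    }


# One shadow table per virtual machine (example values).
shadow_vm1 = build_shadow({0: 5, 1: 6, 2: 9}, {5: 42, 6: 43})
```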

Intel's EPT (Extended Page Tables) and AMD's NPT (Nested Page Tables) provide hardware support for memory virtualization. Both work on a similar principle: they translate guest virtual addresses to host physical addresses in hardware, an approach referred to as the virtualization MMU. With MMU virtualization, a process in the virtual machine still has its GVA translated to a GPA through the guest's own page tables, so nothing in the guest has to change and the benefits of full virtualization are kept. At the same time, the virtualization MMU automatically translates the address to the real host physical address (HPA). This greatly reduces the cost of the GPA-to-HPA step and improves virtual machine performance.

CPU manufacturers also provide hardware virtualization for the TLB. Previously a TLB entry recorded only the mapping from GVA to GPA, that is, two fields. It now has three fields: a field identifying the virtual machine is added, and the mapping recorded is from GVA to HPA. Each entry thus states explicitly which virtual machine's GVA-to-HPA mapping it holds.
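The effect of the extra field can be seen in a toy TLB whose key includes a virtual machine identifier. The class below is a sketch with invented names; it ignores capacity limits and eviction.

```python
class TaggedTLB:
    """Toy TLB whose entries are keyed by (vm_id, guest virtual page)."""

    def __init__(self):
        self.entries = {}

    def insert(self, vm_id, gva_page, hpa_page):
        self.entries[(vm_id, gva_page)] = hpa_page

    def lookup(self, vm_id, gva_page):
        """Return the cached host frame, or None on a TLB miss."""
        return self.entries.get((vm_id, gva_page))


tlb = TaggedTLB()
tlb.insert(vm_id=1, gva_page=0, hpa_page=42)
tlb.insert(vm_id=2, gva_page=0, hpa_page=77)   # same GVA, different VM
```

Both virtual machines use guest virtual page 0, yet their entries coexist, so switching between them does not require flushing the TLB.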

Summary

In short, without hardware support, memory virtualization can only use shadow page tables, which means the TLB has to be flushed constantly. With hardware-assisted memory virtualization, virtual machine performance improves considerably.

I/O virtualization technology

From the processor's point of view, peripherals are accessed through a set of I/O resources (port I/O or MMIO), and virtualizing those devices is called I/O virtualization. Examples include:

1) External storage: hard drives, optical discs, USB flash drives.

2) Network devices: network cards.

3) Display devices: VGA (graphics card).

4) Keyboard and mouse: PS/2, USB.

Others, such as serial devices and COM ports, are also I/O devices. I/O virtualization means supporting all of these devices. The idea is that the VMM intercepts the guest operating system's requests to a device and then emulates the behavior of the real device in software. Because device types are so varied, I/O virtualization is complex and takes many forms, so only some common I/O devices are discussed here.

There are generally three approaches to I/O virtualization:

The first: emulated I/O devices

The device is emulated entirely in software. This is the simplest approach but has the lowest performance; for I/O devices, emulation and full virtualization differ little. The VMM emulates an I/O device, together with its device driver, for the guest OS. To use the device, the guest OS kernel calls the driver for the emulated device, and the request reaches the emulated device in the VMM. The VMM emulates one such device for every guest it runs, so it also provides an I/O stack (multiple queues) to schedule these requests onto the real physical I/O device. A single request therefore takes several steps to complete.
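The intercept-then-emulate path can be sketched with a VMM that routes guest port writes to a software device model. The classes are hypothetical; the port number 0x3F8 is the conventional PC COM1 data port and is used here only as a familiar example.

```python
class EmulatedSerial:
    """Software model of a serial port: collects the bytes the guest writes."""

    PORT = 0x3F8  # conventional COM1 data port on PCs

    def __init__(self):
        self.output = []

    def write(self, value):
        self.output.append(chr(value & 0xFF))


class ToyVMM:
    """Routes intercepted guest port writes to the matching device model."""

    def __init__(self):
        self.devices = {}

    def register(self, port, device):
        self.devices[port] = device

    def handle_port_write(self, port, value):
        """Called on each intercepted write; returns False if no device matches."""
        device = self.devices.get(port)
        if device is None:
            return False
        device.write(value)
        return True
```

Each byte the guest writes costs one interception and one call into the device model, which is why fully emulated devices are the slowest of the three approaches.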

Examples: QEMU, VMware Workstation

The second: Paravirtualization

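Paravirtualized I/O performs better than emulation. The guest uses the I/O device directly through calls into the VMM, much as with CPU paravirtualization, and it knows that the device is virtualized rather than emulated. The VMM supplies the guest OS with a special driver, called the front-end I/O driver in paravirtualized I/O. Unlike with emulated devices, the guest OS's own I/O device no longer processes I/O requests: when the guest OS has an I/O request, its driver sends it straight to the VMM, and the part of the VMM that handles the device is called the back-end I/O driver.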
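The front-end and back-end split can be sketched as two drivers sharing a pair of queues. This is a loose illustration only: the class names are invented, the disk is a dictionary, and real implementations such as virtio use shared-memory rings and notifications rather than Python objects.

```python
from collections import deque


class SharedRing:
    """Request and response queues shared by the guest and the VMM."""

    def __init__(self):
        self.requests = deque()
        self.responses = deque()


class FrontendDriver:
    """Runs in the guest: queues requests instead of touching a device."""

    def __init__(self, ring):
        self.ring = ring

    def read_sector(self, sector):
        self.ring.requests.append(("read", sector))


class BackendDriver:
    """Runs in the VMM: drains requests and serves them from the real disk."""

    def __init__(self, ring, disk):
        self.ring = ring
        self.disk = disk              # {sector number: bytes}, stands in for a disk

    def process(self):
        handled = 0
        while self.ring.requests:
            _op, sector = self.ring.requests.popleft()
            self.ring.responses.append((sector, self.disk.get(sector, b"")))
            handled += 1
        return handled
```

Several requests can be queued and then handled in a single pass by the back end, avoiding the per-access interception that emulated devices require.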

Examples: Xen, virtio

The third: I/O passthrough

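I/O passthrough performs better than both emulation and paravirtualization, coming close to the native hardware device, because the guest OS uses a physical I/O device directly. It is, however, more troublesome to set up. The idea is to provide several physical I/O devices, for example several disks and several network cards, plan how many guest OSes the host will run, and have the VMM coordinate so that each guest OS gets its own physical device. Having several I/O devices is not enough: the I/O bridge on the motherboard must also support passthrough. Intel's technology for this is VT-d, a hardware-assisted virtualization technology based in the northbridge chipset, whose main purpose is to improve I/O flexibility, reliability, and performance.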

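Why does I/O passthrough need motherboard support? Why not simply let each virtual machine use one network card directly? The main reason is that on the traditional x86 server architecture all I/O devices usually share one centralized DMA (direct memory access) mechanism; DMA is a way of accelerating device access. Because it is centralized, several network cards managed by the VMM still use the same DMA. If the first guest OS uses the first network card directly and the second guest OS uses the second, they still share that DMA, and the DMA cannot tell which guest OS is using which network card. Technologies such as Intel VT-d exist to solve this problem, and also to handle interrupts for each guest.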
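One way to picture the fix is a remapping table keyed by device, so that each device's DMA addresses are translated through the memory map of the guest that owns it. The sketch below is a conceptual model with invented names and is not a description of the actual VT-d data structures.

```python
class ToyIOMMU:
    """Per-device DMA remapping: each device sees only its guest's memory."""

    def __init__(self):
        self.tables = {}   # device id -> {guest page: host page}

    def assign(self, device_id, guest_to_host):
        """Give a device to a guest by installing that guest's page map."""
        self.tables[device_id] = dict(guest_to_host)

    def translate_dma(self, device_id, guest_page):
        """Translate a DMA target page, rejecting anything outside the map."""
        table = self.tables.get(device_id)
        if table is None or guest_page not in table:
            raise PermissionError(
                f"DMA from {device_id} to guest page {guest_page} blocked"
            )
        return table[guest_page]


iommu = ToyIOMMU()
iommu.assign("nic0", {0: 100, 1: 101})   # first guest's network card
iommu.assign("nic1", {0: 200})           # second guest's network card
```

Both guests can program their card with guest page 0, and the two DMA transfers land in different host pages because the lookup is per device.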

Example: Intel VT-d

How is this implemented for specific devices?

1) How are disks virtualized?

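In virtualization, the CPU can be divided by time and memory can be divided by space. What about disks? They too could be divided by space, partitioning the disk into regions, but this does not seem to be done in practice; disks are generally virtualized by emulation.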

2) How are network cards virtualized?

Network cards can be virtualized by emulation, paravirtualization, or I/O passthrough. The implementation varies by VMM, and most VMMs offer all three.

3) How are graphics cards virtualized?

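Graphics cards are usually virtualized through a frame buffer, which gives each virtual machine its own window. Virtualizing display devices is fairly troublesome, so display performance in virtualized environments is usually not very good. It is adequate for displaying an installed Windows desktop, but not for graphics-processing workloads.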

4) How are the keyboard and mouse virtualized?

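The keyboard and mouse in a virtual machine are usually emulated. Focus capture associates the emulated device with the current physical device: in VMware Workstation, for example, once you click inside a virtual machine the mouse is captured by it, and all input then goes to that virtual machine.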

Summary

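This section briefly described how CPU, memory, and I/O virtualization are implemented. It also indicates how to choose hardware that maximizes virtualization performance: a CPU with hardware-assisted virtualization, such as Intel VT; hardware-assisted memory virtualization, such as the virtualization MMU and TLB; and hardware-assisted I/O virtualization, such as Intel VT-d. Hardware support alone is not enough, though: the virtualization software must actually make full use of it.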

Virtualization operating modes

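Type-I: the hypervisor runs directly on the hardware and provides its own hardware drivers.

Examples: VMware ESXi, Xen.

Xen is somewhat special. It is also installed directly on the hardware and provides a hypervisor, but the hypervisor handles only CPU, memory, and interrupts and provides no I/O drivers. A separate virtual machine running Linux must be installed to manage the I/O devices.

Type-II: the hypervisor runs on top of a host operating system.

Examples: VMware Workstation, KVM. (KVM is listed here because it runs inside a Linux host, although it is sometimes classified as Type-I because it is part of the kernel.)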

Type-III: other types

Besides virtualization hosted on an operating system or running directly on hardware, the following types are also common.

Container virtualization

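This is kernel-based virtualization in which each virtual machine is an independent container, but all containers run on the same hardware and share one kernel. The advantages are speed and ease of deployment. The drawback is that isolating resources between containers is harder, although fairly mature solutions now exist. Docker, currently very popular, is an example; some even say it is on course to replace virtualization.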

Emulator virtualization

An emulator emulates all of the hardware. QEMU is an example, and KVM uses QEMU.

Library virtualization

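The libraries of a different system are emulated on top of the operating system. For example, Wine on Linux can run Windows software, and Cygwin on Windows can run Linux software. Because operating systems today largely follow the POSIX standard, the library interfaces they provide follow a common standard. It is enough to run, on one platform, software that provides the other platform's libraries, and then run on top of it software compiled for the other system. The software must be compiled for the other platform because, although the library interfaces are unified, each system's ABI (application binary interface) is still different.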

Challenges of virtualizing the x86 platform

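x86 processors have four privilege levels, Ring 0 to Ring 3. The processor can access privileged resources or execute privileged instructions only when running in Ring 0 to Ring 2, and only in Ring 0 can it execute all privileged instructions. Operating systems on x86 generally use only Ring 0 and Ring 3: the kernel runs in Ring 0 (kernel space) and user processes run in Ring 3 (user space).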

Ring compression

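To meet the requirements above, the VMM itself must run in Ring 0. To keep the guest OS from controlling system resources, the guest OS has to lower its own privilege level and run in Ring 3 (Rings 1 and 2 are not used).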

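In addition, the VMM protects physical memory with paging or segment limits. Segment limits have no effect in 64-bit mode, and paging does not distinguish between Rings 0, 1, and 2. To unify and simplify VMM design, the guest OS can only run in Ring 3, like user processes. The VMM must monitor the guest OS's settings of privileged resources such as the GDT and IDT to prevent it from running in Ring 0, and must also protect the deprivileged guest OS from deliberate attack or accidental damage by guest processes.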

Ring aliasing (Ring Alias)

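By design, an operating system assumes that it runs in Ring 0, but a guest OS in a virtualized environment actually runs in Ring 1 or Ring 3. The VMM must therefore keep each guest OS from discovering that it is running in a virtual machine, or the "equivalent execution" criterion above is broken. For example, the x86 privilege level is stored in the CS code segment register, yet a guest OS can use the unprivileged PUSH instruction to push CS onto the stack and then POP it to inspect the value. Likewise, no exception occurs when a guest OS at a lower privilege level reads the privileged registers GDT, LDT, IDT, and TR. Both behaviors differ from what the guest OS normally expects.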
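The CS check is simple because the current privilege level occupies the low two bits of the CS selector. The helper below shows the arithmetic a guest kernel could perform after pushing and popping CS; the selector values in the usage note are arbitrary examples, not values taken from any particular operating system.

```python
def privilege_level_from_cs(cs_selector):
    """Return the ring number encoded in the low two bits of a CS selector."""
    return cs_selector & 0b11


def looks_deprivileged(cs_selector):
    """True if code that expects Ring 0 finds itself in another ring."""
    return privilege_level_from_cs(cs_selector) != 0
```

For instance, a selector of `0x08` decodes to Ring 0, `0x09` to Ring 1, and `0x1B` to Ring 3, so a kernel that reads back `0x09` can tell that it has been deprivileged.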

Address space compression

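Address space compression means that the VMM must reserve part of the guest OS's address space for its own use, which is another challenge for x86 virtualization. The VMM can run entirely in its own address space or partly in the guest OS's address space. The first option requires switching between VMM mode and guest OS mode, which is expensive; moreover, even when it runs in its own address space, the VMM still needs to reserve part of the guest OS's address space for control structures such as the IDT and GDT. Either way, the VMM must ensure that the guest OS cannot access or modify the address space the VMM uses.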

Non-privileged sensitive instructions

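Not all sensitive instructions on x86 belong to the privileged instruction set, so the VMM cannot correctly trap and handle them. For example, the unprivileged instruction SMSW stores the machine status word in a register, where the guest OS can read it, which violates the requirements of classical virtualization theory.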
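The classical requirement can be stated as a set relation: every sensitive instruction must also be privileged, so that it traps when the guest executes it. The check below expresses that; the instruction lists are a small illustrative sample rather than a complete classification of x86.

```python
def non_trapping_sensitive(sensitive, privileged):
    """Return the sensitive instructions that would not trap, sorted by name."""
    return sorted(set(sensitive) - set(privileged))


def classically_virtualizable(sensitive, privileged):
    """True when the sensitive set is a subset of the privileged set."""
    return not non_trapping_sensitive(sensitive, privileged)


# Illustrative sample: SMSW and POPF are sensitive but not privileged.
SAMPLE_SENSITIVE = {"smsw", "popf", "lgdt"}
SAMPLE_PRIVILEGED = {"lgdt", "hlt"}
escaping = non_trapping_sensitive(SAMPLE_SENSITIVE, SAMPLE_PRIVILEGED)
```

Any instruction in `escaping` has to be handled some other way, for example by binary translation, by modifying the guest, or by hardware assistance, as covered in the earlier sections.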

Silent privilege failures

Certain privileged x86 instructions do not raise an error when they fail, so the VMM cannot trap the failure. This violates the "equivalent execution" criterion of classical virtualization.

Interrupt virtualization

In a virtualized environment, both maskable and non-maskable interrupts should be managed by the VMM. However, every guest OS access to a privileged resource triggers a processor exception, and a guest OS masks and unmasks interrupts frequently. If the VMM has to handle each of these requests, overall system performance suffers greatly.

Origin www.cnblogs.com/bj-mr-li/p/11407927.html