Capacity management: using genetic algorithms to find a better virtual machine placement strategy

by Wang Guobing

With the rapid development of Internet technology, cloud computing has become the "water and electricity" of every industry. As the infrastructure of "Internet+", data centers are the hard guarantee behind cloud services. Whether in a traditional data center or a cloud data center, virtualization is an important way to raise resource utilization and reduce management costs.
According to statistics, by early 2016 AWS had reached a scale of 550,000 servers, while the number of virtual machines exceeded 800 million. With the number of virtual machines growing so sharply, the virtual machine placement scheme naturally becomes a key factor in a data center's resource utilization and its other requirements.

To increase the resource utilization of a data center, you need a suitable placement scheme that uses as few physical machines as possible, which both optimizes resource utilization and saves energy. To keep the system highly available, the placement scheme should also balance resource usage across the servers as far as possible.
Many studies describe virtual machine placement as a "multi-dimensional bin packing" problem. The items to be packed are the virtual machines, and an item's size is the resources it uses. The bins are the physical machines, and a bin's capacity is the configurable threshold of the physical machine. The resource types, which may include CPU, memory, disk, and network bandwidth, are the dimensions of the packing problem.
Suppose there are M physical machines and N virtual machines. In theory there are up to M to the N-th power (M^N) possible deployment schemes, and the problem is NP-hard. Heuristic algorithms such as genetic algorithms and simulated annealing perform very well on NP-hard problems. Here we introduce the ideas of the classical genetic algorithm and a concrete way to apply it to this problem.
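To make the bin packing view concrete, here is a minimal Python sketch. The resource tuples, the names `fits` and `search_space_size`, and the sample numbers are illustrative assumptions, not data from a real data center.

```python
# Hypothetical model: resources are tuples such as (CPU cores, memory in GB).

def fits(demand, used, capacity):
    """Return True if a VM with `demand` fits on a machine whose current
    usage is `used`, without exceeding `capacity` in any dimension."""
    return all(u + d <= c for d, u, c in zip(demand, used, capacity))


def search_space_size(num_machines, num_vms):
    """Upper bound on assignments: each of the N VMs picks one of M machines."""
    return num_machines ** num_vms


capacity = (16, 64)   # assumed threshold per physical machine
used = (12, 40)       # resources already taken on that machine
print(fits((4, 16), used, capacity))   # True: 16 <= 16 and 56 <= 64
print(fits((4, 32), used, capacity))   # False: memory 72 > 64
print(search_space_size(10, 20))       # 10**20 candidate assignments
```

Even 10 machines and 20 virtual machines give 10^20 candidate assignments, which is why exhaustive search is not an option.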
The genetic algorithm is a global optimization search algorithm. It is simple, general, robust, well suited to parallel processing, and efficient in practice, and it has been widely used in many fields.
A genetic algorithm treats all individuals in a population as its objects and uses randomized techniques to search an encoded parameter space efficiently. Selection, crossover, and mutation make up its genetic operations. Encoding, setting the initial population, designing the fitness function, designing the genetic operations, and setting the control parameters are the five core elements of a genetic algorithm. So, for the problem of "placing virtual machines on physical machines", how should we understand these five elements?
Using the concepts below, we map the real-world virtual machines, physical machines, virtual machine placement schemes, and the metrics used to evaluate a scheme onto the terminology of the genetic algorithm, and then solve the problem with the genetic algorithm:
- Chromosome: a chromosome represents one placement sequence of the virtual machines on physical machines, that is, one placement scheme. It is also called an individual in the population.
- Chromosome evaluation: different placement schemes have different fitness values, and the fitness value determines how good a scheme is. The number of physical machines used is the usual measure, together with factors such as availability constraints and limits on the number of virtual machines. Multi-objective problems of this kind can be solved with "fast non-dominated sorting" combined with the "local crowding distance" algorithm.
- Genetic operations (crossover, mutation): operations that change part of the virtual machine placement sequence on a chromosome.
- Population: a set of chromosomes, representing the collection of placement schemes at the current generation of evolution.
Now that the terminology of the genetic algorithm has been mapped onto the real problem, we show how each of the five GA elements maps to it:
Encoding
[Figure: example chromosomes encoding virtual machine placements]
As shown in the figure above, each complete sequence that places the virtual machines on physical machines is one chromosome. This is a typical "group encoding". Different chromosomes give different results: the two chromosomes in the figure end up using different numbers of physical machines.

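Setting the initial population
Initializing the population with a heuristic reduces the complexity of the optimization process, but the chromosomes in the population must also stay diverse. Diversity widens the solution space being searched, so that the algorithm approaches the global optimum rather than a local optimum.
In the heuristic, the first virtual machine is placed on the first physical machine, and the remaining virtual machines are then placed one by one in sequence order until the first physical machine runs out of resources. Put differently: for each virtual machine, first traverse the physical machines already in use and place it on one that can satisfy its resource demand. If none of them can, open a new physical machine for it. This is how one chromosome is generated. To keep the population diverse, we randomly shuffle the order of the virtual machine sequence each time we generate a new chromosome. The method is simple and effective.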
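The heuristic described above is essentially first-fit placement over a shuffled virtual machine order. A minimal sketch follows, assuming that all physical machines share one capacity vector and that every virtual machine fits on an empty machine. The names `first_fit` and `initial_population` and the sample demands are hypothetical.

```python
import random


def first_fit(vm_order, demands, capacity):
    """Build one chromosome: place VMs in the given order on the first
    machine in use that still has room, opening a new machine otherwise."""
    machines = []   # machines[i] is the list of VM ids on machine i
    loads = []      # loads[i] is the resource usage of machine i
    for vm in vm_order:
        demand = demands[vm]
        for i, load in enumerate(loads):
            if all(u + d <= c for u, d, c in zip(load, demand, capacity)):
                machines[i].append(vm)
                loads[i] = [u + d for u, d in zip(load, demand)]
                break
        else:
            # No machine in use can host this VM: open a new one.
            machines.append([vm])
            loads.append(list(demand))
    return machines


def initial_population(demands, capacity, size, rng):
    """Generate `size` chromosomes, shuffling the VM order for diversity."""
    population = []
    for _ in range(size):
        order = list(range(len(demands)))
        rng.shuffle(order)
        population.append(first_fit(order, demands, capacity))
    return population


demands = [(4, 8), (2, 4), (8, 16), (1, 2), (6, 12)]   # assumed (CPU, memory)
capacity = (10, 20)                                    # assumed threshold
population = initial_population(demands, capacity, size=4, rng=random.Random(0))
for chromosome in population:
    print(len(chromosome), chromosome)
```

Passing an explicit `random.Random` instance keeps runs reproducible, which helps when comparing parameter settings.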
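Designing the fitness function
Fitness is how the individuals in the current population are evaluated. For multi-objective optimization problems, one can generally use the fast non-dominated sorting from NSGA2, the second-generation multi-objective optimization algorithm, to divide the individuals into ranks, and then use the local density algorithm to estimate density and determine the priority of each individual.
This process sounds abstract, so how does it work in practice? In the accompanying figure, Cost f1 and Cost f2 measure an individual on "number of physical machines used" and "overall high availability", and a larger value means the scheme is better in that dimension. For the set of individuals {1,2,3,4}, each element has at least one other individual that is better than it in every respect. For example, individual 5 is better than individual 4 on both "number of physical machines used" and "overall high availability", so we say that "individual 5 dominates individual 4". For the set {5,6,7}, no individual dominates them, so they are called "non-dominated solutions". In theory, such non-dominated solutions should have the highest fitness.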
[Figure: individuals 1 to 7 plotted against Cost f1 and Cost f2]
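The dominance relation and the ranking it induces can be sketched in a few lines of Python. This is a simplified illustration: it peels off fronts by repeated comparison rather than using the bookkeeping of NSGA2's fast non-dominated sort, and the score values are invented so that individuals 5, 6, and 7 come out non-dominated, as in the description above. Larger values are treated as better, and list index 0 stands for individual 1.

```python
def dominates(a, b):
    """a dominates b: at least as good everywhere, strictly better somewhere."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def non_dominated_fronts(scores):
    """Split indices into fronts; front 0 holds the non-dominated solutions."""
    remaining = set(range(len(scores)))
    fronts = []
    while remaining:
        front = sorted(
            i for i in remaining
            if not any(dominates(scores[j], scores[i]) for j in remaining if j != i)
        )
        fronts.append(front)
        remaining -= set(front)
    return fronts


def crowding_distance(scores, front):
    """Density estimate inside one front; boundary points get infinity."""
    distance = {i: 0.0 for i in front}
    for k in range(len(scores[0])):
        ordered = sorted(front, key=lambda i: scores[i][k])
        low, high = scores[ordered[0]][k], scores[ordered[-1]][k]
        distance[ordered[0]] = distance[ordered[-1]] = float("inf")
        if high == low:
            continue
        for prev, cur, nxt in zip(ordered, ordered[1:], ordered[2:]):
            distance[cur] += (scores[nxt][k] - scores[prev][k]) / (high - low)
    return distance


# Invented (Cost f1, Cost f2) values for individuals 1..7.
scores = [(1, 1), (2, 1), (1, 2), (2, 2), (3, 5), (4, 4), (5, 3)]
fronts = non_dominated_fronts(scores)
print(fronts)                                 # [[4, 5, 6], [3], [1, 2], [0]]
print(dominates(scores[4], scores[3]))        # True: individual 5 dominates 4
print(crowding_distance(scores, fronts[0]))   # {4: inf, 5: 2.0, 6: inf}
```

A lower front index means a better rank, and within one front a larger crowding distance is preferred because it marks a less crowded region.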
Designing the genetic operations
The core operations of a genetic algorithm are "selection", "crossover", and "mutation". These three operations determine the direction in which the population evolves, and they are the basis on which the algorithm keeps optimizing toward our goal.
Selection: we use the constrained binary tournament selection method. Two individuals are drawn at random from the current population, and the one with the higher fitness becomes a parent. This is repeated until the number of selected individuals reaches the predetermined population size, and the selected individuals form the parent population of the next generation. Two individuals are compared with the crowded comparison operator introduced in NSGA2.
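A binary tournament with a crowded comparison can be sketched as follows. The sketch assumes that a rank (front index, lower is better) and a crowding distance (larger is better) have already been computed for every individual. The function names are hypothetical.

```python
import random


def crowded_better(i, j, rank, crowd):
    """Crowded comparison: prefer the lower rank, then the larger
    crowding distance when both individuals share a rank."""
    if rank[i] != rank[j]:
        return rank[i] < rank[j]
    return crowd[i] > crowd[j]


def binary_tournament(population, rank, crowd, rng):
    """Draw two distinct individuals at random, keep the better one, and
    repeat until the parent pool is as large as the population."""
    parents = []
    while len(parents) < len(population):
        i, j = rng.sample(range(len(population)), 2)
        winner = i if crowded_better(i, j, rank, crowd) else j
        parents.append(population[winner])
    return parents


population = ["scheme_a", "scheme_b", "scheme_c"]   # placeholder individuals
rank = [0, 1, 2]
crowd = [1.0, 1.0, 1.0]
parents = binary_tournament(population, rank, crowd, random.Random(42))
print(parents)
```

Because the worst-ranked individual loses every tournament it enters, it never reaches the parent pool in this toy example, while the better-ranked ones may appear more than once.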
Crossover: take two parents A and B chosen by the selection step. Randomly select one physical machine from A and insert it into B. Every physical machine originally in B that hosts a virtual machine contained in the inserted machine is removed from B. The other virtual machines on the removed machines are recorded and then placed on B again. This operation reassigns virtual machines to physical machines, and because the physical machine is chosen at random, it keeps crossover diverse.
Mutation: the mutation operation is relatively simple and easy to explain. Randomly select a chromosome a from the parents, randomly select one physical machine from a and delete it, then reinsert the virtual machines that were on it into a. The purpose is to reduce the number of physical machines used. Because the physical machine is chosen at random, there is no guarantee that every mutation moves in the right direction, so we can lower the probability that mutation occurs. This also shows why a sound selection step matters.
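Crossover and mutation share one repair step: virtual machines that lose their host are placed again. The sketch below uses first-fit for that step, which is an assumption, since the text does not say how re-placement is done. It also assumes one capacity vector for all machines and chromosomes stored as lists of VM id lists. All names are hypothetical.

```python
import random


def first_fit_insert(chromosome, vms, demands, capacity):
    """Place each VM on the first machine with room, opening a new one if needed."""
    for vm in vms:
        for machine in chromosome:
            used = [sum(demands[v][k] for v in machine) for k in range(len(capacity))]
            if all(u + d <= c for u, d, c in zip(used, demands[vm], capacity)):
                machine.append(vm)
                break
        else:
            chromosome.append([vm])
    return chromosome


def crossover(parent_a, parent_b, demands, capacity, rng):
    """Inject one random machine of A into a copy of B, then repair the copy."""
    injected = list(rng.choice(parent_a))
    moved = set(injected)
    child, displaced = [], []
    for machine in parent_b:
        if moved.isdisjoint(machine):
            child.append(list(machine))
        else:
            # This machine hosts a VM that the injected machine now holds:
            # drop it and remember its other VMs for re-placement.
            displaced.extend(vm for vm in machine if vm not in moved)
    child.append(injected)
    return first_fit_insert(child, displaced, demands, capacity)


def mutate(parent, demands, capacity, rng):
    """Delete one random machine and re-place its VMs on the remaining ones."""
    victim = rng.randrange(len(parent))
    child = [list(m) for i, m in enumerate(parent) if i != victim]
    return first_fit_insert(child, parent[victim], demands, capacity)


demands = [(4,), (4,), (4,), (4,)]   # four equal VMs, one resource dimension
capacity = (8,)                      # each machine holds two of them
parent_a = [[0, 1], [2, 3]]          # 2 machines
parent_b = [[0, 2], [1], [3]]        # 3 machines
rng = random.Random(7)
child = crossover(parent_a, parent_b, demands, capacity, rng)
mutant = mutate(parent_b, demands, capacity, rng)
print(len(child), child)
print(len(mutant), mutant)
```

Both operators work on copies, so the parents stay unchanged and can be reused by later tournaments.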
Setting the control parameters
Parameter setting mainly refers to the probability with which each genetic operation occurs. For example, we can set the probability of crossover to 0.6 and the probability of mutation to 0.05. A well-chosen parameter set lets the population reach a near-optimal solution faster and more reliably.
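In code, these probabilities simply gate whether each operator runs. The sketch below assumes `crossover(donor, child)` and `mutate(child)` are supplied by the caller. Here they are stand-in string functions so that the effect of the probabilities is easy to see. Taking the next parent in the list as the donor is also just an illustrative choice.

```python
import random


def next_generation(parents, crossover, mutate, rng, p_cross=0.6, p_mut=0.05):
    """Produce one child per parent, applying each operator with its probability."""
    children = []
    for i, parent in enumerate(parents):
        child = parent
        if rng.random() < p_cross:
            donor = parents[(i + 1) % len(parents)]   # assumed pairing rule
            child = crossover(donor, child)
        if rng.random() < p_mut:
            child = mutate(child)
        children.append(child)
    return children


def demo_crossover(donor, child):
    """Stand-in operator that only records which parents were combined."""
    return f"{donor}x{child}"


def demo_mutate(child):
    """Stand-in operator that only marks the child as mutated."""
    return f"{child}m"


parents = ["p0", "p1", "p2"]
rng = random.Random(1)
print(next_generation(parents, demo_crossover, demo_mutate, rng))
print(next_generation(parents, demo_crossover, demo_mutate, rng, p_cross=1.0, p_mut=1.0))
```

With both probabilities at 0 the parents pass through unchanged, and with both at 1 every child is crossed and mutated. The values 0.6 and 0.05 sit between these extremes.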
With the five elements above in place, we can fix the steps by which the genetic algorithm solves the virtual machine placement problem.
In general, the iteration stops under one of two conditions: first, the fitness value of the current best chromosome has reached the target value; second, the specified number of iterations has been reached. By choosing sensible genetic operation strategies based on human experience, we can quickly obtain a scheme that meets our objectives.
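The two stopping conditions fit into a short driver loop. This is a generic sketch, assuming a single fitness value where lower is better (for example the number of physical machines used) and a caller-supplied `next_generation` function. The toy problem at the end only exercises the control flow.

```python
def evolve(population, fitness, next_generation, target, max_generations):
    """Run until the best fitness reaches `target` or the generation budget
    is spent. Returns the best individual and the generations used."""
    best = min(population, key=fitness)
    for generation in range(max_generations):
        if fitness(best) <= target:          # condition 1: target reached
            return best, generation
        population = next_generation(population)
        candidate = min(population, key=fitness)
        if fitness(candidate) < fitness(best):
            best = candidate                 # keep the best seen so far
    return best, max_generations             # condition 2: budget exhausted


def shrink(population):
    """Toy generation step: every individual improves by one."""
    return [x - 1 for x in population]


# Toy problem: individuals are integers and fitness is the value itself.
print(evolve([5, 7], fitness=abs, next_generation=shrink,
             target=2, max_generations=10))   # (2, 3)
print(evolve([5, 7], fitness=abs, next_generation=shrink,
             target=2, max_generations=2))    # (3, 2)
```

Tracking the best individual separately means a later, worse generation cannot lose a good scheme that was already found.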


Origin blog.51cto.com/14281532/2448698