Detailed explanation of BlendTree animation blending algorithm

【The essence of blending】

If you understand skeletal animation, you know that a character's pose at a given moment is obtained by interpolating every bone between two adjacent keyframes. In other words, the pose is a blend of two keyframes.

So can a pose be blended from more than two keyframes? Yes.

The additional keyframes can come from different animation clips (AnimationClip). Each clip is given its own weight when blending, and the weights still sum to 1.

We specify in advance which animation clips take part in the blend, and the blending algorithm computes the weight of each clip at runtime.

Through blending, a limited number of animation clips can produce many character states, which reduces the number of clips that have to be authored.

Blending is generally used for character locomotion.

[One-dimensional linear interpolation blending]

One-dimensional linear interpolation blending requires that the animation clips differ in only one attribute, such as the character's movement speed or direction.

Taking speed as an example, suppose there are five animation clips with speeds of 1, 2, 3, 4, and 5 m/s. The clips are arranged on a line from left to right in order of this attribute value.

The BlendTree then holds five AnimationClips, records the attribute value of each one, and exposes a single parameter, Speed.

The character's movement speed changes at runtime, and its current value is the current value of the BlendTree's Speed parameter.

From the five AnimationClips, find the two whose speeds bracket the current speed most closely. These are the clips to blend at runtime.

(Note that if the current speed is less than the smallest clip speed or greater than the largest, only one adjacent clip exists and no blending is needed.)

Next, determine the weights of the two clips. They follow from the speed values of the two clips and the current speed: if the clips have speeds s1 and s2 and the current speed is s, then t = (s - s1) / (s2 - s1), the weight of the second clip is t, and the weight of the first clip is 1 - t. (If this is unclear, review linear interpolation first.)
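The search and the weight calculation described above can be sketched in Python as follows. This is only an illustration: the function name blend_weights_1d and the list-based representation of the clips are assumptions made for this example, not part of any engine API, and the clip values are assumed to be sorted and distinct.

```python
from bisect import bisect_right

def blend_weights_1d(thresholds, value):
    """Return one weight per clip for a one-dimensional blend.

    thresholds: sorted, distinct attribute values of the clips (e.g. speeds).
    value: current value of the blend parameter.
    """
    weights = [0.0] * len(thresholds)
    # Outside the covered range only the nearest end clip plays.
    if value <= thresholds[0]:
        weights[0] = 1.0
        return weights
    if value >= thresholds[-1]:
        weights[-1] = 1.0
        return weights
    # hi is the first clip whose value is greater than the parameter.
    hi = bisect_right(thresholds, value)
    lo = hi - 1
    # Linear interpolation factor between the two neighbouring clips.
    t = (value - thresholds[lo]) / (thresholds[hi] - thresholds[lo])
    weights[lo] = 1.0 - t
    weights[hi] = t
    return weights

print(blend_weights_1d([1, 2, 3, 4, 5], 2.25))  # [0.0, 0.75, 0.25, 0.0, 0.0]
```

With a speed of 2.25 m/s, only the 2 m/s and 3 m/s clips receive a non-zero weight, and the weights sum to 1.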

[Two-dimensional bilinear interpolation blending]

Bilinear interpolation requires four points: Q11, Q21, Q12, and Q22.

An interpolation coefficient a1 linearly interpolates Q11 and Q21 to give R1. The same coefficient a1 linearly interpolates Q12 and Q22 to give R2.

A second interpolation coefficient a2 linearly interpolates R1 and R2 to give P.

In animation blending, the four points represent four animation clips, and the coordinates of each point are that clip's attribute values. This coordinate system is called the parameter space.

Because a1 and a2 are shared, assume that a1 corresponds to the speed attribute and a2 to the orientation attribute. Then Q11 and Q21 must differ in speed, as must Q12 and Q22, while Q11 and Q12 must differ in orientation, as must Q21 and Q22. Q11 and Q12 share one speed value and Q21 and Q22 share the other; likewise Q11 and Q21 share one orientation value and Q12 and Q22 share the other.

That is, two values for each of the two attributes correspond to 4 animation clips.

If the two attributes have N and M different values respectively, they correspond to N*M animation clips.

In practice, first locate the four animation clips from the attribute values by searching twice, once per attribute. Each search works the same way as in one-dimensional linear interpolation blending. Then perform bilinear interpolation to obtain the final character pose.
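As a rough sketch, the weights of the four clips can be derived directly from a1 and a2. The function name bilinear_weights and the argument layout below are assumptions for illustration. The sketch assumes the four clips have already been located, so that x1 and x2 are the two speed values and y1 and y2 the two orientation values surrounding the target.

```python
def bilinear_weights(x, y, x1, x2, y1, y2):
    """Weights of the four corner clips for the target point (x, y).

    Corner layout (an assumption of this sketch):
    Q11 = (x1, y1), Q21 = (x2, y1), Q12 = (x1, y2), Q22 = (x2, y2).
    The target is clamped to the rectangle spanned by the corners.
    """
    # a1 interpolates along the first attribute, a2 along the second.
    a1 = min(max((x - x1) / (x2 - x1), 0.0), 1.0)
    a2 = min(max((y - y1) / (y2 - y1), 0.0), 1.0)
    return {
        "Q11": (1.0 - a1) * (1.0 - a2),
        "Q21": a1 * (1.0 - a2),
        "Q12": (1.0 - a1) * a2,
        "Q22": a1 * a2,
    }

# Speed halfway between 1 and 2 m/s, orientation halfway between -90 and 90.
print(bilinear_weights(1.5, 0.0, 1.0, 2.0, -90.0, 90.0))
# {'Q11': 0.25, 'Q21': 0.25, 'Q12': 0.25, 'Q22': 0.25}
```

Expanding P = (1 - a2) * R1 + a2 * R2 with R1 and R2 written out gives exactly these four products, which is why they always sum to 1.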

[2D triangular interpolation blending]

Linear interpolation uses two known points to compute an unknown point. Three known points can be interpolated in the same way.

y = a*y1 + b*y2 + c*y3. Generally, a + b + c = 1.

Since c = 1 - a - b, only two unknowns, a and b, need to be computed to obtain the interpolation result.

They can be computed exactly from the barycentric coordinates of the triangle: the target point and the three known points in the two-dimensional coordinate system form three sub-triangles, the cross product of two edge vectors gives twice the area of a triangle, and the ratio of each sub-triangle's area to the area of the whole triangle gives the corresponding coefficient.

In animation blending, three points represent three different animation clips, and the coordinates of the points are the property values of the animation clip.
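A minimal sketch of the area-ratio calculation follows. The function names are chosen for this example only, and the points are plain (x, y) tuples in parameter space. The 2D cross product gives twice the signed area of a triangle, and the factor of two cancels in the ratios.

```python
def cross(ax, ay, bx, by):
    """2D cross product: twice the signed area of the triangle spanned by a and b."""
    return ax * by - ay * bx

def barycentric_weights(p, p1, p2, p3):
    """Return (a, b, c) such that p = a*p1 + b*p2 + c*p3 and a + b + c = 1."""
    total = cross(p2[0] - p1[0], p2[1] - p1[1], p3[0] - p1[0], p3[1] - p1[1])
    if total == 0:
        raise ValueError("the three sample points are collinear")
    # Sub-triangle opposite p1 is (p, p2, p3); opposite p2 is (p, p3, p1).
    a = cross(p2[0] - p[0], p2[1] - p[1], p3[0] - p[0], p3[1] - p[1]) / total
    b = cross(p3[0] - p[0], p3[1] - p[1], p1[0] - p[0], p1[1] - p[1]) / total
    c = 1.0 - a - b
    return a, b, c

print(barycentric_weights((0.25, 0.25), (0, 0), (1, 0), (0, 1)))  # (0.5, 0.25, 0.25)
```

If any of the three values is negative, the target point lies outside the triangle, which can be used to reject a candidate triangle.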

What if there are many animation clips? Just as in 2D bilinear interpolation blending, first find the three clips to blend.

How do we find them?

We usually assume a central animation clip near the coordinate origin (preferably exactly at the origin). This is generally an Idle-like clip and is called the center point.

Connect the center point to every other point (also called a sampling point). The two attribute values passed in at runtime form a point called the target point. By checking which angle between two adjacent connecting lines contains the target point, we find the triangle made of the center point and those two sampling points.

How do we check? The vector cross product tells us whether a point lies to the left or to the right of a line.
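The sector search can be sketched as below. This is one possible approach rather than the method of any particular engine: it assumes the center point sits exactly at the origin, that the sampling points are given without the center point, and that adjacent sampling points are less than 180 degrees apart. The name find_sector is hypothetical.

```python
import math

def cross(a, b):
    """2D cross product; positive when b lies to the left of a."""
    return a[0] * b[1] - a[1] * b[0]

def find_sector(samples, target):
    """Return indices (i, j) of the two samples whose sector contains target.

    samples: sampling points around a center point at the origin.
    Returns None if no sector matches.
    """
    # Visit the samples in counter-clockwise order around the center.
    order = sorted(range(len(samples)),
                   key=lambda i: math.atan2(samples[i][1], samples[i][0]))
    for k in range(len(order)):
        i = order[k]
        j = order[(k + 1) % len(order)]
        a, b = samples[i], samples[j]
        # Inside the sector: left of (or on) the line center->a
        # and right of (or on) the line center->b.
        if cross(a, target) >= 0 and cross(target, b) >= 0:
            return i, j
    return None

samples = [(1, 0), (0, 1), (-1, 0), (0, -1)]
print(find_sector(samples, (0.5, 0.5)))    # (0, 1)
print(find_sector(samples, (-0.5, -0.5)))  # (2, 3)
```

The two returned samples together with the center point form the triangle whose barycentric coordinates give the blend weights.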

[More complex interpolation blending]

The three blending algorithms above are easy to come up with and easy to understand, but only 2 or 3 animation clips actually take part in the blend. If we want more clips to take part, how do we assign a weight to each one?

This is essentially a scattered data interpolation problem, built on one basic assumption: if the influence factor of a sampling point on the target point is h, and the sum of the influence factors of all sampling points is H, then the weight of that sampling point is w = h/H.

Note that if H = 1, then the influence factor is the weight.

All that remains is to find how to compute the influence factor h. The three simple methods above fit the same scheme. The interpolation methods are as follows:

Inverse Distance Weighted Interpolation
Natural Neighbors Interpolation
Radial Basis Function Interpolation
K-Nearest-Neighbors Interpolation
Gradient Band Interpolation

[Inverse distance weighted interpolation]

Compute the distance from every sampling point to the target point. The influence factor is the reciprocal of the distance: h = 1/distance(s, p).

It can be seen that the closer the sampling point is to the target point, the greater the influence factor.

To reduce the amount of computation, you can:

1. Limit the influence distance D of a point. A sampling point farther away than D is treated as having no influence.

2. Use the squared distance instead of the distance, which avoids the square root.
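A small sketch of inverse distance weighting that applies both shortcuts and then normalizes with w = h/H. The function name idw_weights and the max_distance parameter (the influence distance D) are names chosen for this example. Giving full weight to a sample that coincides with the target is an added assumption, needed to avoid dividing by zero.

```python
def idw_weights(samples, target, max_distance=None):
    """Inverse-distance weights using squared distances, normalized to sum to 1.

    samples: list of (x, y) sampling points.
    target: (x, y) target point.
    max_distance: optional influence distance D; farther samples get weight 0.
    """
    influences = []
    for i, (sx, sy) in enumerate(samples):
        d2 = (sx - target[0]) ** 2 + (sy - target[1]) ** 2
        if d2 == 0.0:
            # The target coincides with this sample: it gets the full weight.
            return [1.0 if j == i else 0.0 for j in range(len(samples))]
        if max_distance is not None and d2 > max_distance ** 2:
            influences.append(0.0)
        else:
            influences.append(1.0 / d2)  # h = 1 / distance^2, no square root
    total = sum(influences)  # H
    if total == 0.0:
        return influences  # no sample lies within the influence distance
    return [h / total for h in influences]  # w = h / H

weights = idw_weights([(0, 0), (2, 0)], (0.5, 0))
print(weights)  # approximately [0.9, 0.1]
```

Note that with the squared distance the falloff is steeper than with h = 1/distance, so the two variants do not produce the same weights.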

[Natural neighbor interpolation]

Natural neighbor interpolation is area-based. First, compute the region of influence of each sampling point using a Voronoi diagram.

Then add the target point and compute the Voronoi diagram formed by the target point and the surrounding points.

The area of overlap between the target point's Voronoi cell and a surrounding point's original Voronoi cell is that point's influence factor, as shown in the figure:

This method is computationally expensive and is not suitable for real-time use.

[Radial basis function interpolation]

From linear algebra we know that, in a vector space, any vector can be represented as a linear combination of a set of basis vectors. A two-dimensional vector space has 2 basis vectors, a three-dimensional vector space has 3, and an N-dimensional vector space has N.

In function space, any function can be represented by a linear combination of a set of basis functions.

A radial basis function is a real-valued function whose value depends only on the distance from an origin, so it has a single independent variable: the distance from the point to the origin. Note that this origin is not the origin of the coordinate system; we choose it ourselves. The distance is generally the Euclidean distance, the usual way of measuring the distance between two points.

The basic radial basis functions are:

In animation blending, the target point is the origin, and the influence factor of each sampling point is computed by applying the radial basis function to that point's distance from the origin.
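The sketch below follows the simplified use described here: evaluate a kernel at each sampling point's distance from the target and normalize. The two kernels shown (Gaussian and inverse quadratic) are common textbook choices and are not necessarily the ones the original list contained. The shape parameter epsilon and all function names are assumptions of this example. A full radial basis function interpolation would instead solve a linear system for the coefficients, which is not shown.

```python
import math

def gaussian(r, epsilon=1.0):
    """Gaussian kernel: 1 at r = 0, falling smoothly towards 0."""
    return math.exp(-(epsilon * r) ** 2)

def inverse_quadratic(r, epsilon=1.0):
    """Inverse quadratic kernel: 1 at r = 0, slower falloff than the Gaussian."""
    return 1.0 / (1.0 + (epsilon * r) ** 2)

def rbf_weights(samples, target, kernel=gaussian):
    """Influence h = kernel(distance to target), normalized with w = h / H."""
    influences = [kernel(math.dist(s, target)) for s in samples]
    total = sum(influences)
    return [h / total for h in influences]

print(rbf_weights([(0, 0), (2, 0)], (1, 0)))  # [0.5, 0.5]
near, far = rbf_weights([(0, 0), (2, 0)], (0.5, 0), kernel=inverse_quadratic)
print(near > far)  # True
```

Used this way, a sample that coincides with the target does not receive a weight of exactly 1, because the other samples keep a small non-zero influence.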

[K-nearest-neighbors interpolation]

This is similar to inverse distance weighted interpolation, except that only the K sampling points closest to the target point take part in the blend. The weight of the Kth sampling point is 0, and the weight of every farther sampling point is also 0. The influence factor is otherwise computed as in inverse distance weighted interpolation.
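One way to make the Kth neighbor's weight exactly 0 is to subtract the Kth neighbor's inverse distance from every influence factor. That subtraction is an assumption of this sketch, since the text does not spell out the formula, and the function name knn_weights is hypothetical.

```python
import math

def knn_weights(samples, target, k):
    """Weights from the k nearest samples; the k-th nearest gets weight 0."""
    dists = [math.dist(s, target) for s in samples]
    nearest = sorted(range(len(samples)), key=lambda i: dists[i])[:k]
    weights = [0.0] * len(samples)
    if dists[nearest[0]] == 0.0:
        # The target coincides with the closest sample.
        weights[nearest[0]] = 1.0
        return weights
    d_k = dists[nearest[-1]]
    for i in nearest:
        # Assumed influence: inverse distance, shifted so the k-th one is 0.
        weights[i] = 1.0 / dists[i] - 1.0 / d_k
    total = sum(weights)
    if total == 0.0:
        # All k neighbors are equally far away: share the weight evenly.
        for i in nearest:
            weights[i] = 1.0 / len(nearest)
        return weights
    return [w / total for w in weights]

samples = [(1, 0), (2, 0), (4, 0), (10, 0)]
print(knn_weights(samples, (0, 0), k=3))  # [0.75, 0.25, 0.0, 0.0]
```

Because a sample's weight reaches 0 at the moment it becomes the Kth neighbor, weights do not jump when the set of nearest neighbors changes.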

[Gradient band interpolation]

Take three points O, A, and B. In one-dimensional linear interpolation blending they lie on a straight line, and the influence factor of each sampling point follows from the distance from point O to point A and from point O to point B. The influence factors of A and B are then (here the weight equals the influence factor):

h(A) = 1 - OA/AB,  h(B) = 1 - OB/AB.

Note that, taking A to be on the left of B, if O is to the left of point A, then the influence factor of A is 1 and the influence factor of B is 0, and vice versa if O is to the right of point B.

What if point O does not lie on the line through A and B?

The distance now contains an extra term, the distance from O to the line through A and B. This term is the same for A and B and cancels out, so we only need the distances from O to A and from O to B measured along the line AB. The same formula then gives the influence of A and B on O.

How do we find this distance? It is the length of the projection of vector AO onto vector AB (review vector projection first if needed). From it we obtain h(A) and h(B).

h(A) is the gradient value of point A at point O relative to point B. If there are further points C, D, and E, the gradient value of A at O relative to each of them can be found in the same way.

The minimum of these gradient values is the influence factor of point A on the target point among all sampling points, expressed by the formula:

h(A) = min over every other sampling point X of (1 - (AO · AX) / |AX|^2)

Here (AO · AX) / |AX| is the projection length of AO onto AX, so dividing by |AX| once more gives the ratio used in the one-dimensional formula.
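The minimum-over-pairs rule can be sketched as follows. The function name gradient_band_weights is chosen for this example, samples are (x, y) tuples assumed to be distinct, and the clamping to the range 0 to 1 mirrors the one-dimensional note above.

```python
def gradient_band_weights(samples, target):
    """Gradient band influence per sample, normalized with w = h / H."""
    influences = []
    for i, a in enumerate(samples):
        h = 1.0  # starting at 1 also clamps the influence from above
        for j, b in enumerate(samples):
            if i == j:
                continue
            abx, aby = b[0] - a[0], b[1] - a[1]
            aox, aoy = target[0] - a[0], target[1] - a[1]
            # Projection of A->O onto A->B, as a fraction of |AB|.
            t = (aox * abx + aoy * aby) / (abx * abx + aby * aby)
            # Gradient value of A relative to B; keep the smallest one.
            h = min(h, 1.0 - t)
        influences.append(max(h, 0.0))
    total = sum(influences)
    return [h / total for h in influences]

samples = [(0, 0), (1, 0), (0, 1)]
print(gradient_band_weights(samples, (0, 0)))        # [1.0, 0.0, 0.0]
print(gradient_band_weights(samples, (0.25, 0.25)))  # approximately [0.6, 0.2, 0.2]
```

The cost grows with the square of the number of sampling points, since every sample is compared against every other sample.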

For a given sampling point, treating the target point as a variable gives the distribution of that sampling point's gradient value over the parameter space (the gradient band), as shown in the following figure:

It can be seen that P1 still has some influence near the vertex of the purple triangle. For blends that involve speed, you may not want P1 to have any influence there.

In that case the angle must also be considered when computing the influence factor. The angle is the one between the vector to the target point P and the vector to a sampling point A. The smaller the angle, the greater the influence.

Similarly, with O as the origin, if point P lies on the left side of line OA and sampling point B lies on the right side of line OA, then the weight of point A should be 1.

How do we combine distance and angle into one influence factor? Borrowing from polar coordinates, build a coordinate system of speed and angle, locate the target point and the sampling points in it, and then apply gradient band interpolation based on distance in that space. In other words, change the coordinate representation of the vectors, as shown in the figure below, where α is the control factor.
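The following is a hedged sketch of gradient bands in a speed-and-angle space. The exact mapping is an assumption of this example, since the figure that defines it is not reproduced here: the first coordinate is the difference in magnitude divided by the average magnitude of the two samples, and the second is the signed angle multiplied by alpha. The default alpha = 2.0, the rule that a zero-length vector has angle 0, and all function names are illustrative choices.

```python
import math

def signed_angle(a, b):
    """Signed angle from a to b in radians; 0 if either vector has zero length."""
    if (a[0] == 0 and a[1] == 0) or (b[0] == 0 and b[1] == 0):
        return 0.0
    return math.atan2(a[0] * b[1] - a[1] * b[0], a[0] * b[0] + a[1] * b[1])

def polar_gradient_band_weights(samples, target, alpha=2.0):
    """Gradient band weights computed in an assumed (speed, angle) space."""
    target_mag = math.hypot(target[0], target[1])
    influences = []
    for i, p_i in enumerate(samples):
        mag_i = math.hypot(p_i[0], p_i[1])
        h = 1.0
        for j, p_j in enumerate(samples):
            if i == j:
                continue
            mag_j = math.hypot(p_j[0], p_j[1])
            avg_mag = 0.5 * (mag_i + mag_j)
            # Sample i -> target and sample i -> sample j in (speed, angle) space.
            to_target = ((target_mag - mag_i) / avg_mag,
                         signed_angle(p_i, target) * alpha)
            to_other = ((mag_j - mag_i) / avg_mag,
                        signed_angle(p_i, p_j) * alpha)
            length_sq = to_other[0] ** 2 + to_other[1] ** 2
            t = (to_target[0] * to_other[0] + to_target[1] * to_other[1]) / length_sq
            h = min(h, 1.0 - t)
        influences.append(max(h, 0.0))
    total = sum(influences)
    return [h / total for h in influences] if total > 0 else influences

# Forward, right and idle clips; the target moves forward at half speed.
samples = [(0, 1), (1, 0), (0, 0)]
print(polar_gradient_band_weights(samples, (0, 0.5)))  # approximately [0.5, 0.0, 0.5]
```

In this example the sideways clip receives no weight for a target that points straight ahead, which is the behavior the angle term is meant to produce.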

[Evaluation criteria for weighting algorithms in animation blending]

Accuracy: If the target point coincides with a sampling point, that point's weight must be 1 and the weights of all other sampling points must be 0. In practice, a target point within a small range of a sampling point is treated as coinciding with it. The purpose is to preserve the original motion as the animator intended. Regression analysis, by contrast, estimates a best-fit value through statistical methods.
Cumulative sum: The sum of all weights must be 1.
Continuity: The weight function must have C0 continuity. A small change in the target point may only produce a small change in the weights; otherwise the blended motion will jitter.
Boundedness: All weights must lie between 0 and 1; otherwise the blend produces unpredictable and undesirable results.
Locality: Only sampling points within a certain range of the target point have weights greater than 0, and the weights of all other sampling points are 0. This improves blending efficiency.
Monotonicity: Each weight function should decrease from its global maximum to its global minimum. Strictly speaking, the weight function should not have a local minimum. For example, the weight of a running motion should decrease from 1 to 0 as the speed decreases from running speed to walking speed.
Density invariance: The algorithm should behave the same for uniformly and unevenly distributed sampling points. A region with many sampling points close together should not receive more weight than a single sampling point at a similar distance.
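Two of these criteria, cumulative sum and boundedness, are easy to check mechanically on any computed weight list. The helper below is a small illustrative sketch; the name check_weights and the tolerance value are assumptions of this example.

```python
def check_weights(weights, tol=1e-9):
    """Check the 'cumulative sum' and 'boundedness' criteria for one weight list."""
    return {
        "sums_to_one": abs(sum(weights) - 1.0) <= tol,
        "bounded": all(-tol <= w <= 1.0 + tol for w in weights),
    }

print(check_weights([0.75, 0.25, 0.0]))  # {'sums_to_one': True, 'bounded': True}
print(check_weights([1.2, -0.2]))        # {'sums_to_one': True, 'bounded': False}
```

The other criteria, such as continuity and monotonicity, describe how the weights change as the target point moves, so they cannot be verified from a single weight list.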

【References】

Unity - Manual: 2D Blending

"Game Engine Architecture"

Automated Semi‐Procedural Animation for Character Locomotion