"Towards Fast, Accurate and Stable 3D Dense Face Alignment" 3DDFA-V2 paper study and engineering implementation

1: The four main parts of the paper

[Figure: overview of the paper's four main contributions]
1: A lightweight network is trained to regress 62 3DMM parameters.
The default output is a vector of length 62 = 12 (pose) + 40 (shape) + 10 (expression).
The project released three models: mb05_120x120, mb1_120x120 and resnet_120x120 (MobileNet 0.5, MobileNet 1.0 and ResNet).
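For reference, this is roughly how the 62-dimensional vector splits into pose / shape / expression, mirroring the repo's _parse_param helper that appears later in this post (a reading sketch, not a verbatim copy of the repo code):

import numpy as np

def parse_param_62(param):
    R_ = param[:12].reshape(3, -1)           # 3 x 4 pose matrix
    R = R_[:, :3]                            # 3 x 3 rotation (with scale)
    offset = R_[:, -1].reshape(3, 1)         # 3 x 1 translation
    alpha_shp = param[12:52].reshape(-1, 1)  # 40 shape coefficients
    alpha_exp = param[52:].reshape(-1, 1)    # 10 expression coefficients
    return R, offset, alpha_shp, alpha_exp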
2: Proposed a meta-joint optimization strategy.
That is, the fWPDC and VDC loss functions are combined dynamically to speed up fitting: a meta-learning approach is used to dynamically adjust how fWPDC and VDC are weighted during training.
The concrete procedure used here is:
use VDC to fine-tune a model already trained with the WPDC loss (or, symmetrically, use WPDC to fine-tune a VDC-trained model). That is, first train the model with the WPDC loss as above and save its weights; then restart training, initialize the network with those WPDC-trained weights, and continue training with the VDC loss, as shown below.
[Figures: the two-stage WPDC-then-VDC fine-tuning schedule]
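A minimal PyTorch sketch of that two-stage schedule (the backbone, losses and data below are dummy placeholders, not the repo's training code, which was not released):

import torch
import torch.nn as nn

# Stand-in backbone: in practice this would be MobileNet regressing 62 parameters.
model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 120 * 120, 62))

def train(model, loss_fn, steps=10):
    opt = torch.optim.SGD(model.parameters(), lr=1e-3)
    for _ in range(steps):
        img = torch.randn(8, 3, 120, 120)    # dummy image batch
        target = torch.randn(8, 62)          # dummy ground-truth 3DMM parameters
        loss = loss_fn(model(img), target)
        opt.zero_grad()
        loss.backward()
        opt.step()

wpdc_loss = nn.MSELoss()  # placeholder for the real WPDC
vdc_loss = nn.MSELoss()   # placeholder for the real VDC

# Stage 1: train with WPDC and save the weights.
train(model, wpdc_loss)
torch.save(model.state_dict(), 'wpdc_ckpt.pth')

# Stage 2: reload the WPDC-trained weights and continue training with VDC.
model.load_state_dict(torch.load('wpdc_ckpt.pth'))
train(model, vdc_loss)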
(1) Vertex Distance Cost (VDC)
[Figure: the VDC loss formula]
The parameters here are the 62 3DMM parameters regressed by the model (12 pose + 40 shape + 10 expression).
V3d denotes the 3D vertices on the 3DMM head model reconstructed from those 62 parameters.
The code is implemented as:

pts3d = R @ (self.u_base + self.w_shp_base @ alpha_shp + self.w_exp_base @ alpha_exp). \
                    reshape(3, -1, order='F') + offset

T here is the 3x4 pose matrix; its last 3x1 column carries the translation (offset) information.

offset = R_[:, -1].reshape(3, 1)

[Figure: the base (mean) 3DMM head model]
Using the parameters predicted by the network, compute the 3D face points P_pred, then take an L2 loss against the ground-truth 3D points at the same vertex indices on the head model (these data were stored in a PKL file beforehand).
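A minimal NumPy sketch of that idea (an illustration only, not the repo's loss code; the bases u_base, w_shp, w_exp and both parameter sets are assumed to be given):

import numpy as np

# Reconstruct 3 x N vertices from one set of 62 parameters (same formula as the snippet above).
def reconstruct(u_base, w_shp, w_exp, R, offset, alpha_shp, alpha_exp):
    return R @ (u_base + w_shp @ alpha_shp + w_exp @ alpha_exp).reshape(3, -1, order='F') + offset

# VDC: L2 distance between vertices reconstructed from predicted vs. ground-truth parameters.
def vdc_loss(pts3d_pred, pts3d_gt):
    return np.mean(np.sum((pts3d_pred - pts3d_gt) ** 2, axis=0))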
(2) fWPDC (fast Weighted Parameter Distance Cost)
[Figure: the fWPDC loss formula]
fWPDC computes a weighted distance loss between the regressed 3DMM parameters and the ground-truth 3DMM parameters, with the per-parameter weights held fixed.
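A hedged sketch of that weighted parameter distance (again an illustration of the idea only; the paper derives the weights from each parameter's influence on the reconstructed vertices, which is omitted here):

import numpy as np

# Weighted L2 distance between predicted and ground-truth 62-dim parameter vectors,
# with one fixed weight per parameter.
def wpdc_loss(param_pred, param_gt, weights):
    return np.sum(weights * (param_pred - param_gt) ** 2)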
3: Landmark regression regularization:
a regression branch is added after the global pooling layer to regress the 68 2D landmark points.
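An illustrative PyTorch sketch of such a branch (an assumption for clarity; the layer sizes and names are not taken from the repo):

import torch
import torch.nn as nn

# Two heads on top of the globally pooled feature: one regresses the 62 3DMM
# parameters, the other regresses the 68 2D landmarks (68 * 2 = 136 values).
class TwoHead(nn.Module):
    def __init__(self, feat_dim=512):
        super().__init__()
        self.param_head = nn.Linear(feat_dim, 62)
        self.landmark_head = nn.Linear(feat_dim, 68 * 2)

    def forward(self, feat):
        return self.param_head(feat), self.landmark_head(feat)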
4: Short-video synthesis (skipped in this post)

2: Using the paper's project to generate 3D face point data

Official project address: https://github.com/cleardusk/3DDFA_V2

The project does not release the training code, only the test demo. You can still use it as a tool to generate 3D face points from your own data, and the results are quite good.
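A minimal usage sketch roughly following the repo's demo.py (the config and example paths are taken from the repo, but double-check them against the README before running):

import cv2
import yaml

from FaceBoxes import FaceBoxes
from TDDFA import TDDFA

cfg = yaml.load(open('configs/mb1_120x120.yml'), Loader=yaml.SafeLoader)
tddfa = TDDFA(**cfg)
face_boxes = FaceBoxes()

img = cv2.imread('examples/inputs/emma.jpg')
boxes = face_boxes(img)                     # face detection
param_lst, roi_box_lst = tddfa(img, boxes)  # 62 parameters per detected face
# In the unmodified repo, recon_vers returns one vertex list per face.
ver_lst = tddfa.recon_vers(param_lst, roi_box_lst, dense_flag=False)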
1: Run directly on the demo data provided with the project:
[Figure: demo output on the example images]
The face shape of the same person looks different from different angles in a 2D view. This is normal: a 2D view carries no depth information, so visual differences appear. However, no matter how the viewpoint changes, the relationship between any two 3D points of the same person is unchanged in 3D space. For example, in the figure below, the lengths of vectors BA and BC stay the same, and so does the angle between them.
Therefore, it is normal for a wide face to look narrower when viewed from above.
[Figure: vectors BA and BC between 3D face points]
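A tiny NumPy check of that claim (illustrative points only): applying the same rigid rotation to all points leaves the lengths of BA and BC and the angle between them unchanged.

import numpy as np

# Three illustrative 3D points and a rigid rotation about the z-axis.
A, B, C = np.array([0., 0., 0.]), np.array([1., 0., 0.]), np.array([1., 1., 0.5])
t = np.deg2rad(30)
R = np.array([[np.cos(t), -np.sin(t), 0.],
              [np.sin(t),  np.cos(t), 0.],
              [0., 0., 1.]])

def stats(A, B, C):
    BA, BC = A - B, C - B
    cos_angle = BA @ BC / (np.linalg.norm(BA) * np.linalg.norm(BC))
    return np.linalg.norm(BA), np.linalg.norm(BC), cos_angle

print(stats(A, B, C))               # before rotation
print(stats(R @ A, R @ B, R @ C))   # after rotation: same lengths, same angle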
2: Frontalize ("straighten") the face from every angle.
That is, remove the pose from the head-model reconstruction so that face shapes can be compared directly.
Replace the 3x3 rotation predicted by the model with an initialized identity rotation (in the code below it is rebuilt in the de-normalized parameter space from param_mean and param_std).
The code needs to be updated:

def recon_vers(self, param_lst, roi_box_lst, **kvs):
    dense_flag = kvs.get('dense_flag', False)
    size = self.size

    ver_lst = []  # vertices reconstructed with the predicted pose
    inp_lst = []  # "frontalized" vertices with the pose removed
    for param, roi_box in zip(param_lst, roi_box_lst):
        R, offset, alpha_shp, alpha_exp = _parse_param(param)
        if dense_flag:
            inp_dct = {'R': R, 'offset': offset, 'alpha_shp': alpha_shp, 'alpha_exp': alpha_exp}
            pts3d = self.bfm_session.run(None, inp_dct)[0]
            pts3d = similar_transform(pts3d, roi_box, size)
            pts3d_front = pts3d  # dense branch left unchanged; only the sparse path is frontalized here
        else:
            # Identity-like rotation rebuilt in the de-normalized parameter space:
            # the diagonal entries undo the normalization of the pose parameters.
            R2 = np.array([
                [-(self.param_std[0] + self.param_mean[0]), 0, 0],
                [0, self.param_std[5] + self.param_mean[5], 0],
                [0, 0, -(self.param_std[10] + self.param_mean[10])],
            ])

            # Keep only the z translation; zero out the x/y offset.
            d = np.array([0, 0, self.param_std[11] + self.param_mean[11]]).reshape(3, 1)

            # Reconstruction with the predicted pose (original behaviour).
            pts3d = R @ (self.u_base + self.w_shp_base @ alpha_shp + self.w_exp_base @ alpha_exp). \
                reshape(3, -1, order='F') + offset

            # Frontalized reconstruction: same shape/expression coefficients, pose removed.
            pts3d_front = R2 @ (self.u_base + self.w_shp_base @ alpha_shp + self.w_exp_base @ alpha_exp). \
                reshape(3, -1, order='F') + d

            pts3d = similar_transform(pts3d, roi_box, size)
            pts3d_front = similar_transform(pts3d_front, roi_box, size)

        ver_lst.append(pts3d)
        inp_lst.append(pts3d_front)

    return ver_lst, inp_lst

Effect:
[Figures: reconstructed 3D landmarks before and after removing the pose]

Origin blog.csdn.net/jiafeier_555/article/details/127307853