Turing Award winner Geoffrey Hinton's latest research: NASA, a method for estimating articulated three-dimensional models using neural networks

NASA: Neural Articulated Shape Approximation

Authors:

Timothy Jeruzalski, Boyang Deng, Mohammad Norouzi, JP Lewis, Geoffrey Hinton, Andrea Tagliasacchi (Google Research)



Preface

This article is an interpretation of "NASA: Neural Articulated Shape Approximation". The paper proposes a method that uses neural networks to estimate articulated, deformable 3D models. Compared with traditional methods, NASA offers low complexity, watertight models, high resolution, and good estimation quality.

 

Introduction

As deep learning plays an increasingly important role in computer vision and graphics, more and more methods represent three-dimensional geometry with neural networks. However, this work mainly targets ordinary non-deformable 3D models, and there is still little research on deformable 3D models with joint structures. Articulated models include the human body model, which is widely used in games, movies, virtual reality, and augmented reality, so research on this type of model is very important.

In this paper, the authors propose a new method that trains a decoder D to estimate an articulated 3D model. The 3D model generated by the decoder is represented by an indicator function. This function is parameterized by the 3D model and takes a point x in three-dimensional space as input: it outputs 1 when the point is inside the model and 0 when the point is outside. Unlike other methods, NASA focuses on estimating the indicator function of the 3D model from pose parameters, which describe how the model deforms.
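To make the idea of an indicator function concrete, here is a minimal sketch. The decoder in the paper is a learned network; the hand-written sphere below (the name `sphere_indicator` and its parameters are illustrative assumptions, not taken from the paper) only shows the input and output convention: a 3D point goes in, and 1 or 0 comes out.

```python
import math

def sphere_indicator(x, center=(0.0, 0.0, 0.0), radius=1.0):
    """Return 1 if the 3D point x lies inside (or on) the sphere, else 0.

    Hypothetical stand-in for a learned decoder: same interface, fixed shape.
    """
    return 1 if math.dist(x, center) <= radius else 0

print(sphere_indicator((0.2, 0.1, 0.0)))  # query a point inside the sphere
print(sphere_indicator((2.0, 0.0, 0.0)))  # query a point outside the sphere
```

A pose-conditioned model keeps this interface but lets the boundary move as the pose parameters change.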

The contributions of this paper are:

1. It proposes a method for estimating a three-dimensional model with deformable joints through a neural network;

2. By explicitly encoding the deformation structure of the model in the network, it achieves similar performance with fewer model parameters and better generalization than previous methods;

3. The indicator function is a representation that directly supports intersection and collision queries, with no need to convert it to another 3D model representation;

4. Compared with previous methods, the model learns the motion of three-dimensional human body models better.
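The third contribution can be illustrated with a small sketch of a collision query. It assumes two shapes that are each available only as an indicator function and tests whether any sample point lies inside both. The disks, the grid bounds, and the helper names (`make_disk`, `collides`) are illustrative assumptions; this is a brute-force approximation, not the paper's procedure.

```python
import math

def make_disk(center, radius):
    """Build a 2D indicator function for a disk (hypothetical stand-in shape)."""
    def indicator(p):
        return 1 if math.dist(p, center) <= radius else 0
    return indicator

def collides(ind_a, ind_b, lo=-2.0, hi=2.0, steps=41):
    """Approximate collision test: is any grid sample inside both shapes?"""
    step = (hi - lo) / (steps - 1)
    for i in range(steps):
        for j in range(steps):
            p = (lo + i * step, lo + j * step)
            if ind_a(p) and ind_b(p):
                return True
    return False

a = make_disk((0.0, 0.0), 1.0)
b = make_disk((1.5, 0.0), 1.0)  # overlaps a
c = make_disk((1.5, 1.5), 0.5)  # disjoint from a

print(collides(a, b))
print(collides(a, c))
```

Because the query only evaluates the two indicator functions, no mesh extraction step is needed; the accuracy of the answer depends on how densely the space is sampled.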


Figure 1: Results generated by the NASA model (source: [7])

 

Related work

For the deformation of articulated 3D models, skinning algorithms are traditionally used to tie the motion of the vertices on the triangular mesh surface to the motion of the model's skeleton. Among them, the LBS (Linear Blend Skinning) algorithm [1] expresses each transformed vertex as a weighted sum of the transformations of the bones associated with that vertex. However, LBS suffers from the "collapsing elbow" and "candy wrapper" artifacts [2]. For the representation of 3D models, researchers have proposed a series of deep learning-based methods that represent the model in parts [3] [4]. For the indicator function, a 3D model representation based on implicit fields, there is also much related work [5], but this work does not take deformation into account.
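The LBS rule described above can be sketched in a few lines. The example works in 2D with two bones for brevity; the tuple layout of a transform and the function names are illustrative assumptions rather than any library's API.

```python
import math

def rigid_transform(angle, tx, ty):
    """Pack a 2D rigid transform (rotation by angle in radians, then translation)."""
    return (math.cos(angle), math.sin(angle), tx, ty)

def apply(transform, v):
    """Apply a packed rigid transform to the 2D point v."""
    c, s, tx, ty = transform
    x, y = v
    return (c * x - s * y + tx, s * x + c * y + ty)

def linear_blend_skinning(v, weights, transforms):
    """Blend the bone-transformed copies of rest-pose vertex v.

    The weights are assumed to be non-negative and to sum to 1.
    """
    x = y = 0.0
    for w, t in zip(weights, transforms):
        px, py = apply(t, v)
        x += w * px
        y += w * py
    return (x, y)

# Bone 0 stays at rest; bone 1 rotates by 90 degrees around the joint at the origin.
bones = [rigid_transform(0.0, 0.0, 0.0), rigid_transform(math.pi / 2, 0.0, 0.0)]
v = (1.0, 0.0)

print(linear_blend_skinning(v, [1.0, 0.0], bones))  # vertex follows bone 0 only
print(linear_blend_skinning(v, [0.5, 0.5], bones))  # vertex blended between both bones
```

With equal weights the blended vertex lands at about (0.5, 0.5), closer to the joint than either transformed copy. This shrinkage is the volume loss behind the "collapsing elbow" artifact.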

 

Introduction to the NASA model

12193.jpg

1. Unstructured model ("U")

12194.png

2. Piecewise rigid model ("R")

12195.png

3. Piecewise deformable model ("D")

12196.png

4. Implementation details

12197.png

 

Experimental results

The model was tested on 2D and 3D data sets. Performance was measured by the intersection over union (IoU) between the prediction and the ground truth.
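As a sketch of how such an overlap score can be computed, assume the metric is the usual intersection over union (IoU) and that both the prediction and the ground truth are available as 0/1 occupancy values at the same sample points. The function name and the sample values below are illustrative, not data from the paper.

```python
def iou(pred, truth):
    """Intersection over union of two equal-length lists of 0/1 occupancy values."""
    inter = sum(1 for p, t in zip(pred, truth) if p and t)
    union = sum(1 for p, t in zip(pred, truth) if p or t)
    # Two empty shapes agree perfectly, so define their IoU as 1.
    return inter / union if union else 1.0

pred = [1, 1, 0, 0, 1]   # made-up predicted occupancy at five sample points
truth = [1, 0, 0, 1, 1]  # made-up ground-truth occupancy at the same points

print(iou(pred, truth))  # 2 shared points out of 4 occupied in either list
```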

1. Two-dimensional data

The two-dimensional data set contains 100 sets of motions. In this data set, the geometric shapes are generated in two ways: ① The rigid data set contains a set of shapes, one for each bone of the model. As the pose of the whole model changes, each individual shape stays unchanged. ② For the mixed data set, the deformed shape is obtained through the LBS algorithm. The experimental results on the two-dimensional data set are shown in Figures 3 to 5.
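The rigid data set described above can be sketched as a union of per-bone shapes that move with their bones but never change shape. In this sketch each part is a box defined in its bone's local frame, and the union is taken as the maximum of the part indicators; the box shapes, the `(angle, tx, ty)` pose layout, and the function names are all illustrative assumptions.

```python
import math

def to_local(p, angle, tx, ty):
    """Map world point p into a bone's local frame (inverse of rotate-then-translate)."""
    x, y = p[0] - tx, p[1] - ty
    c, s = math.cos(angle), math.sin(angle)
    return (c * x + s * y, -s * x + c * y)

def box_indicator(half_w, half_h):
    """Indicator of an axis-aligned box centred at the local origin (hypothetical part)."""
    def indicator(q):
        return 1 if abs(q[0]) <= half_w and abs(q[1]) <= half_h else 0
    return indicator

def rigid_model(p, parts, pose):
    """Union of rigid parts: p is inside the model if it is inside any posed part."""
    return max(part(to_local(p, *bone)) for part, bone in zip(parts, pose))

parts = [box_indicator(1.0, 0.2), box_indicator(1.0, 0.2)]
pose = [(0.0, 0.0, 0.0), (math.pi / 2, 1.0, 1.0)]  # (angle, tx, ty) per bone

print(rigid_model((0.5, 0.0), parts, pose))   # inside the first part
print(rigid_model((1.0, 1.5), parts, pose))   # inside the rotated second part
print(rigid_model((-0.5, 1.0), parts, pose))  # outside both parts
```

Changing `pose` moves the parts, while the box indicators themselves are never modified, which mirrors the statement that each individual shape stays unchanged.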


Figure 3: Estimation results on the two-dimensional data (source: [7])


Figure 4: Estimation results of the piecewise rigid model R (source: [7])


Figure 5: Estimation results of the piecewise deformable model D (source: [7])

All three models fit the training set well. Of the two structured models, "D" does not restrict each part from changing shape, so it performs better than "R" on the LBS data set. However, both "U" and "D" overfit, and only "R" still performs well on the test set.


Figure 6: Results on the test set (source: [7])

2. Three-dimensional data

The three-dimensional model was tested on the AMASS data set [6]. The results (Figure 7) are similar to those on the two-dimensional data.


Figure 7: Results on the 3D data set (source: [7])

 

Summary

This paper proposes a new approach that uses deep learning to estimate an articulated, deformable three-dimensional model from the model's pose parameters. It shows that the structured models (R, D) are more efficient and generalize better than the unstructured model (U). The method is of great significance for representing complex articulated models such as the human body.

Future directions:

1. Compared with "D", "R" generalizes better in the experiments, but "D" is still more useful in some scenarios. Is it possible to combine these two models?

2. For a deformable model, can the pose parameters {B_b} of the model be learned?

3. Can a signed distance function replace the current indicator function?

4. Can NASA be used for differentiable rendering?

5. Can a representation of the motion of a deformable three-dimensional model be obtained from two-dimensional information alone?

 

References:

[1] Alec Jacobson, Zhigang Deng, Ladislav Kavan, and J. P. Lewis. Skinning: Real-time shape deformation. In ACM SIGGRAPH Courses, 2014.

[2] J. P. Lewis, Matt Cordner, and Nickson Fong. Pose space deformation: A unified approach to shape interpolation and skeleton-driven deformation. In Proceedings of the 27th Annual Conference on Computer Graphics and Interactive Techniques, SIGGRAPH ’00, pages 165–172, New York, NY, USA, 2000. ACM Press/Addison-Wesley Publishing Co.

[3] Dominik Lorenz, Leonard Bereska, Timo Milbich, and Björn Ommer. Unsupervised part-based disentangling of object shape and appearance. arXiv:1903.06946, 2019.

[4] Lin Gao, Jie Yang, Tong Wu, Yu-Jie Yuan, Hongbo Fu, Yu-Kun Lai, and Hao Zhang. SDM-NET: Deep generative network for structured deformable mesh. ACM TOG, 2019.

[5] Jeong Joon Park, Peter Florence, Julian Straub, Richard Newcombe, and Steven Lovegrove. DeepSDF: Learning continuous signed distance functions for shape representation. CVPR, 2019.

[6] Naureen Mahmood, Nima Ghorbani, Nikolaus F. Troje, Gerard Pons-Moll, and Michael J. Black. AMASS: Archive of motion capture as surface shapes. ICCV, 2019.

[7] Jeruzalski, T., Deng, B., Norouzi, M., Lewis, J. P., Hinton, G., & Tagliasacchi, A. (2019). NASA: Neural Articulated Shape Approximation. arXiv preprint arXiv:1912.03207.

 

Author | Xiao Yunpeng

Typography | Academic Spinach

Proofreading | Academic Youth Association

Responsible Editor | Academic Youth Excellent Academic

 


Origin blog.csdn.net/AMiner2006/article/details/103611791