Reading notes on the paper "DensePose: Dense Human Pose Estimation In The Wild"

Disclaimer: This is an original article by the blogger. If you repost it, please include a link to the original post: https://blog.csdn.net/m0_37263345/article/details/83015268

First, introduction to the paper:

A pose estimation model based on dense points

Authors:

The paper is a joint work of Facebook AI and INRIA (the French National Institute for Research in Computer Science and Automation)

Effect:

Image pixels are put in correspondence with 3D surface coordinates (UV coordinates): 2D → 3D

Points in the 2D image are mapped to UV coordinates, and the UV coordinates can then be mapped back onto the 2D image.

Difficulty:

Background clutter, occlusion, and variations in pose and scale. Previous work on this problem required a depth sensor.

Innovation:

1: A new dataset was annotated. It is based on the COCO dataset, adds UV annotations, and is open source.

2: A network framework was designed that takes an RGB image as input and outputs UV coordinates. It modifies the structure of Mask R-CNN.

3: A "teacher" network was designed (used to generate the dataset).

Dataset:

Data source: the COCO2014 dataset

train: 26437 images

valminusminival: 5984 images
All (train + valminusminival): 32421 images, 48k humans
minival: 1508 images, 2.3k humans

About 50K people were annotated, with more than 5 million manually annotated points.

Teacher network:

Manual annotation phase: the human body is divided into 24 parts. Within each part, roughly equidistant points are generated by clustering; the number of points depends on the size of the part, up to a maximum of 14. Annotators then place these points on the 3D surface (UV coordinates), giving about 100-150 points per person.

Teacher: a fully convolutional teacher network is trained on the sparse, manually annotated UV coordinates and outputs denser points. For training, each person is cropped out of the image using the bounding box and the network is trained on these crops, which reduces the influence of the background.

Network architecture:

Backbone network: ResNet50, ResNet101, ResNeXt

RPN: FPN, RoIAlign pooling

Heads (mandatory): Fast R-CNN, body_uv

Heads (optional): mask, keypoint

The main contributions to the network structure are the body_uv branch and the cascade structure.

In detail:

(1) The body_uv branch uses the same network structure as the keypoint branch of Mask R-CNN (a stack of eight convolutions), followed by two loss functions:

Classification: a classifier assigns each pixel to one of 24 + 1 (background) classes, with a cross-entropy loss on the region the pixel belongs to. 25 channels.

Regression: a regressor then predicts the precise position within the region, with an L1 loss. 24 channels.
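The two losses of the body_uv head can be sketched for a single annotated pixel as follows. This is a minimal pure-Python illustration under the channel counts given above (25 classification channels, 24 regression channels per coordinate), not the Detectron implementation; the function names and the toy inputs are made up for the example.

```python
import math

NUM_PARTS = 24  # body parts; class 0 is background, so 25 classes in total


def part_classification_loss(logits, target_part):
    """Softmax cross-entropy over the 25 per-pixel class scores."""
    assert len(logits) == NUM_PARTS + 1
    m = max(logits)  # subtract the max for numerical stability
    log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum - logits[target_part]


def uv_regression_loss(u_pred, v_pred, target_part, u_gt, v_gt):
    """L1 loss on the U and V channels of the ground-truth part only.

    u_pred and v_pred hold one value per part (24 channels each);
    the channels of all other parts do not contribute to the loss.
    """
    assert len(u_pred) == NUM_PARTS and len(v_pred) == NUM_PARTS
    c = target_part - 1  # parts are numbered 1..24, channels 0..23
    return abs(u_pred[c] - u_gt) + abs(v_pred[c] - v_gt)


# Toy pixel labelled as part 7 with (U, V) = (0.34, 0.28).
logits = [0.0] * (NUM_PARTS + 1)
logits[7] = 5.0
u_pred = [0.5] * NUM_PARTS
v_pred = [0.5] * NUM_PARTS
cls_loss = part_classification_loss(logits, 7)
reg_loss = uv_regression_loss(u_pred, v_pred, 7, 0.34, 0.28)
```

Restricting the regression loss to the channel of the ground-truth part is what lets each part keep its own UV parameterization.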

(2) Network modification tricks: cascading the multi-task outputs exploits the synergy between tasks and the complementary information provided by the different sources of supervision.

Specifically, the keypoint output is fused with the body_uv / mask outputs, and the loss is then computed on the fused output.

Speed:

On a GTX 1080 GPU:

20-26 frames/s for 240 × 320 images

4-5 frames/s for 800 × 1100 images

Detail:

The approach is similar to DenseReg, but DenseReg uses a fully convolutional network and only predicts faces, which vary far less. (The first author is the same person.)

 

Second, the experimental part:

A fully convolutional network (DeepLab) is compared with a region-based network (Mask R-CNN); the region-based network is found to be superior.

PoseTrack challenge

For the same task, another set of annotations was collected on the PoseTrack dataset.

Train: 1680 images
Val: 782 images
Test: 2698 images
  • Data analysis
  • Training and validation

Data

COCO_Densepose

Train

Top-level keys of the annotation file: images, annotations, categories
Number of images in the small training set: 26437
Keys of the image dictionary: [u'license', u'file_name', u'coco_url', u'height', u'width', u'date_captured', u'flickr_url', u'id']
Example image dictionary: {u'license': 2, u'file_name': u'COCO_train2014_000000262145.jpg', u'coco_url': u'http://images.cocodataset.org/train2014/COCO_train2014_000000262145.jpg', u'height': 427, u'width': 640, u'date_captured': u'2013-11-20 02:07:55', u'flickr_url': u'http://farm8.staticflickr.com/7187/6967031859_5f08387bde_z.jpg', u'id': 262145}
Number of annotations in the small training set: 100403
Keys of the annotation dictionary: [u'segmentation', u'num_keypoints', u'dp_masks', u'area', u'iscrowd', u'dp_I', u'keypoints', u'id', u'dp_U', u'image_id', u'dp_V', u'bbox', u'category_id', u'dp_y', u'dp_x']
Key: segmentation, length: 1
Key: num_keypoints, value: 15
Key: dp_masks, length: 14
Key: area, value: 21258
Key: iscrowd, value: 0
Key: dp_I, length: 115
Key: keypoints, length: 51
Key: id, value: 1218400
Key: dp_U, length: 115
Key: image_id, value: 262145
Key: dp_V, length: 115
Key: bbox, length: 4
Key: category_id, value: 1
Key: dp_y, length: 115
Key: dp_x, length: 115
Example annotation dictionary (标注文件字典举例):
标注文件字典举例:{u'segmentation': [[453, 292.1, 457, 253.1, 439, 245.1, 438, 215.1, 439, 198.1, 420, 223.1, 414, 233.1, 401, 227.1, 400, 226.1, 398, 229.1, 391, 231.1, 387, 213.1, 399, 203.1, 404, 200.1, 413, 194.1, 418, 186.1, 408, 181.1, 415, 154.1, 418, 142.1, 419, 127.1, 422, 125.1, 419, 120.1, 412, 122.1, 407, 112.1, 402, 105.1, 389, 113.1, 390, 105.1, 395, 100.1, 395, 97.1, 398, 83.1, 407, 72.1, 417, 71.1, 424, 72.1, 428, 73.1, 436, 80.1, 441, 90.1, 446, 96.1, 456, 101.1, 472, 110.1, 480, 113.1, 493, 123.1, 499, 136.1, 504, 147.1, 509, 167.1, 515, 182.1, 531, 205.1, 532, 218.1, 525, 229.1, 514, 246.1, 499, 283.1, 499, 307.1, 499, 323.1, 499, 343.1, 505, 367.1, 505, 380.1, 505, 381.1, 486, 387.1, 482, 392.1, 479, 393.1, 469, 363.1, 453, 343.1, 451, 339.1, 454, 321.1, 453, 312.1, 460, 313.1, 458, 298.1, 452, 293.1]], u'num_keypoints': 15, u'dp_masks': [{u'counts': u'\\Qa03k7200M210N110N1N2100000O1000O0010O2O0O10001O0O1N200O2O001O0O1O?B0O:G0O;F017H5K011N001O010O2N001O010O002N001O1O011N001O010OO1000001O000O2O00000000001O00001O001N1000001O0O1000001N101O00001N101O002N1O002N002N002M102N1N103M001O002M2O003M001N102N2N002N001O002M3N002N002M101O002N2M102N001O002N2N002M102N001OUU9', u'size': [256, 256]}, {u'counts': u'lk5110l70TH0j7400O2O0NNZHNf72ZHNe77O01003M0O010O10O100000O10000O01000000O10000O10O1001O001N101M3N101M20[\\_1', u'size': [256, 256]}, {u'counts': u'i33m70L5O0O2O0O100O1OL`HIa77_HI_79_HIa77_HI`77`HK_75aHK_7:1N20OO200000000000O1000O10O0101N2N101O001N102M`Tf1', u'size': [256, 256]}, [], {u'counts': u'e_Y12m73N0O2O002N0O10000000000000000000000000000000M2L50_P`0', u'size': [256, 256]}, {u'counts': u'im[14l70N110L400M3N200M300M300M3N200M300L400N2M4O0M@c00ON30NN40ON0120ON30NO30NN3O20M120O010NO2010OO201N100M3O100N20P\\6', u'size': [256, 256]}, {u'counts': 
u'[SP16_14b4;_J9]5GcJ9]5X1M0O5L004K3N0000000000000000000O100000000O1000000O2O0000000O10001O000000000000000O1000O0N300N200L40ON3N200M30ON300M3N200M300L40OO2M300M30O0100N200O0N300O100N200N2O10ON300O100N10101L301N101M3M201M201N101E:0m\\5', u'size': [256, 256]}, {u'counts': u']oX17i70D<J600K410M300K410E;L400O100N20000000000000BoImNQ6S1oImNQ6k0f0I700I700E;0TR?', u'size': [256, 256]}, {u'counts': u'jfo03m70I8J6O0I8O0J7O0J8N2N001O002N001O002N2N001O011N001OL400L400J600K5L400N201M200M3N200M300O100M300IZRd0', u'size': [256, 256]}, {u'counts': u'fZ>4l70E;00E;F:00J600O1000O11O0O100O100O1O10000000000O1O1001O0000001O00000O101F901A>00BQWW1', u'size': [256, 256]}, [], {u'counts': u'Z[71o7001O1O0O2O001O0O3M100O100O0100O0100O010O100O01000M300L31000000O10000O100O10O11O0000000O100O1O100O1000000O1O100O100O1000000O1O100O10VmW1', u'size': [256, 256]}, {u'counts': u'_[>2n70O2O0O2O1O0O2O0O100O1O0100000O010O00010O1000O10O00100O010O0100000O12M103L104L00g\\W1', u'size': [256, 256]}, {u'counts': u'`P52n70L6aHJP77jHNV72jHNU7?00O100O100O102N0O2O002N0O2O000O10001O0O101O001O00000O2O00000000O100O100O1O100N200N200O10001N100O101O00001O001N101O1N102M101N102N002M2M201N102N000O2N10[gV1', u'size': [256, 256]}], u'area': 21258, u'iscrowd': 0, u'dp_I': [1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 2.0, 1.0, 1.0, 2.0, 1.0, 1.0, 2.0, 24.0, 24.0, 24.0, 24.0, 24.0, 24.0, 24.0, 24.0, 24.0, 24.0, 24.0, 24.0, 24.0, 24.0, 7.0, 7.0, 7.0, 7.0, 7.0, 7.0, 7.0, 9.0, 1.0, 1.0, 2.0, 8.0, 10.0, 10.0, 8.0, 10.0, 10.0, 8.0, 10.0, 8.0, 10.0, 8.0, 17.0, 17.0, 17.0, 17.0, 17.0, 17.0, 17.0, 17.0, 17.0, 17.0, 17.0, 21.0, 13.0, 13.0, 13.0, 13.0, 13.0, 13.0, 13.0, 13.0, 11.0, 13.0, 10.0, 12.0, 14.0, 12.0, 14.0, 12.0, 14.0, 12.0, 14.0, 12.0, 14.0, 14.0, 21.0, 21.0, 21.0, 21.0, 21.0, 21.0, 21.0, 21.0, 20.0, 20.0, 20.0, 20.0, 20.0, 20.0, 6.0, 6.0, 6.0, 6.0, 6.0, 3.0, 3.0, 3.0, 3.0, 3.0, 3.0, 4.0, 4.0, 4.0, 4.0, 4.0, 4.0], u'keypoints': [407, 115, 1, 407, 105, 2, 0, 0, 0, 425, 95, 2, 0, 0, 0, 435, 124, 2, 457, 105, 2, 428, 187, 2, 447, 182, 2, 
404, 210, 2, 419, 213, 2, 488, 222, 2, 515, 213, 2, 471, 293, 2, 487, 297, 2, 462, 372, 1, 486, 374, 2], u'id': 1218400, u'dp_U': [0.1581381857395172, 0.34230467677116394, 0.18912158906459808, 0.48821353912353516, 0.3884721100330353, 0.5135913491249084, 0.6443536877632141, 0.38792282342910767, 0.7409800291061401, 0.6505376100540161, 0.520325779914856, 0.8216454982757568, 0.8167153596878052, 0.6329784393310547, 0.8746959567070007, 0.7455276250839233, 0.9248602390289307, 0.5307193994522095, 0.5957505702972412, 0.5772902369499207, 0.42637890577316284, 0.9215373992919922, 0.4085750877857208, 0.557478129863739, 0.23966775834560394, 0.36956870555877686, 0.2535756826400757, 0.13388468325138092, 0.33851897716522217, 0.4326576590538025, 0.49585190415382385, 0.5520332455635071, 0.6304510235786438, 0.7085527181625366, 0.7812169194221497, 0.09183210134506226, 0.9279903769493103, 0.9287686944007874, 0.7573585510253906, 0.11356967687606812, 0.92024827003479, 0.8299093842506409, 0.28700605034828186, 0.75279301404953, 0.6569482088088989, 0.4226320683956146, 0.5276551842689514, 0.5945624709129333, 0.39777621626853943, 0.8695116639137268, 0.06964663416147232, 0.18622958660125732, 0.2207595258951187, 0.35709473490715027, 0.36292386054992676, 0.5079752802848816, 0.5260767936706543, 0.6477560997009277, 0.6901158690452576, 0.8074332475662231, 0.8204708099365234, 0.0556868351995945, 0.05974288657307625, 0.19443081319332123, 0.3120681643486023, 0.43140751123428345, 0.5478249788284302, 0.6082472801208496, 0.6847281455993652, 0.7598679661750793, 0.1376686841249466, 0.9295018315315247, 0.13133512437343597, 0.9445589184761047, 0.16949176788330078, 0.7630615234375, 0.3395075500011444, 0.6026873588562012, 0.5121622085571289, 0.4318861663341522, 0.6470393538475037, 0.2936002016067505, 0.7786632776260376, 0.9409124255180359, 0.21560901403427124, 0.16927339136600494, 0.39818519353866577, 0.3544045686721802, 0.5688581466674805, 0.4773864150047302, 0.641610324382782, 0.8311001062393188, 
0.8021774291992188, 0.625206470489502, 0.6972959041595459, 0.5318204164505005, 0.4284036159515381, 0.27827775478363037, 0.7861599922180176, 0.7340675592422485, 0.6325574517250061, 0.712203323841095, 0.6250669956207275, 0.6883633136749268, 0.7496581077575684, 0.6079471707344055, 0.25784364342689514, 0.6282087564468384, 0.7382251620292664, 0.6516999006271362, 0.6727080941200256, 0.5526376366615295, 0.7019383311271667, 0.5480226874351501, 0.6156435608863831], u'image_id': 262145, u'dp_V': [0.435714453458786, 0.4540521800518036, 0.18669910728931427, 0.4730636179447174, 0.1817651242017746, 0.2397734522819519, 0.5154634714126587, 0.0778612270951271, 0.47144177556037903, 0.24830983579158783, 0.10124488919973373, 0.42400211095809937, 0.17712591588497162, 0.16078531742095947, 0.7430467009544373, 0.8068962097167969, 0.5839595198631287, 0.8275789022445679, 0.6089735627174377, 0.4219698905944824, 0.7108063101768494, 0.27599748969078064, 0.4922265112400055, 0.17351959645748138, 0.4757252335548401, 0.31542858481407166, 0.34038570523262024, 0.2190510332584381, 0.284136563539505, 0.18819834291934967, 0.24188412725925446, 0.3010953962802887, 0.2897706627845764, 0.3234020173549652, 0.3240533769130707, 0.7283157706260681, 0.32366207242012024, 0.11646141856908798, 0.1395215094089508, 0.698147177696228, 0.09631854295730591, 0.3345729112625122, 0.6794231534004211, 0.10249005258083344, 0.33502835035324097, 0.6944578886032104, 0.19180409610271454, 0.7063501477241516, 0.17014046013355255, 0.660202145576477, 0.4905553162097931, 0.6760637760162354, 0.4298979938030243, 0.6051788330078125, 0.33344659209251404, 0.6024321913719177, 0.304057776927948, 0.4545254111289978, 0.24095730483531952, 0.42929837107658386, 0.22992774844169617, 0.505514919757843, 0.10040315985679626, 0.09234366565942764, 0.10799026489257812, 0.17639049887657166, 0.19133812189102173, 0.35492274165153503, 0.1694529801607132, 0.369148850440979, 0.24146153032779694, 0.3719113767147064, 0.18182919919490814, 0.8576633930206299, 
0.7511200308799744, 0.8856542110443115, 0.7903587222099304, 0.935828447341919, 0.7384026646614075, 0.8597866296768188, 0.7211704850196838, 0.7972701191902161, 0.7776200771331787, 0.7320044636726379, 0.641746997833252, 0.4209004044532776, 0.6608603000640869, 0.4275743067264557, 0.4784872233867645, 0.3132534921169281, 0.31148022413253784, 0.30090242624282837, 0.3639029562473297, 0.36897680163383484, 0.6103492975234985, 0.5535940527915955, 0.27160295844078064, 0.40101170539855957, 0.3286058008670807, 0.46157601475715637, 0.5251432061195374, 0.24991905689239502, 0.3960190713405609, 0.15772105753421783, 0.4226440191268921, 0.3048115074634552, 0.6505873799324036, 0.5413646101951599, 0.8349341750144958, 0.19350852072238922, 0.3841789960861206, 0.28090330958366394, 0.6460604071617126, 0.5135382413864136, 0.8778793215751648], u'bbox': [387, 71.1, 145, 322], u'category_id': 1, u'dp_y': [28.909210205078125, 36.55131530761719, 36.783695220947266, 48.302513122558594, 49.214847564697266, 60.31871795654297, 62.91701126098633, 67.61942291259766, 76.40342712402344, 76.62415313720703, 84.08012390136719, 88.83182525634766, 92.05416870117188, 97.28132629394531, 6.887820243835449, 7.758489608764648, 12.378670692443848, 13.159947395324707, 15.277416229248047, 19.964256286621094, 20.220537185668945, 20.284868240356445, 24.30759620666504, 26.47214126586914, 28.3717041015625, 29.94404411315918, 33.47541809082031, 37.36276626586914, 131.643798828125, 139.17393493652344, 146.8321533203125, 154.6294708251953, 162.16929626464844, 169.56320190429688, 176.72129821777344, 183.9130401611328, 101.44630432128906, 105.18871307373047, 110.8502197265625, 116.28874969482422, 119.52222442626953, 125.48912811279297, 131.7722930908203, 134.32630920410156, 141.18894958496094, 148.46072387695312, 154.85458374023438, 164.68812561035156, 169.036376953125, 180.37696838378906, 45.588768005371094, 49.0302734375, 53.33025360107422, 56.44441604614258, 61.177650451660156, 64.0594711303711, 69.0973129272461, 
71.62738037109375, 77.1959457397461, 79.06790161132812, 85.4889144897461, 86.12149047851562, 191.22286987304688, 197.965087890625, 204.69712829589844, 211.28915405273438, 217.55638122558594, 222.9026641845703, 226.37686157226562, 231.09103393554688, 236.93728637695312, 239.00830078125, 190.00265502929688, 191.6421356201172, 197.13681030273438, 198.41720581054688, 204.24436950683594, 204.8842010498047, 211.41250610351562, 211.76991271972656, 218.84889221191406, 219.04078674316406, 225.9837188720703, 233.1320343017578, 91.54401397705078, 92.83362579345703, 94.70879364013672, 97.85394287109375, 100.72178649902344, 101.68196868896484, 105.78349304199219, 107.72846984863281, 100.16338348388672, 104.2774887084961, 105.09274291992188, 108.75057220458984, 109.46098327636719, 113.06881713867188, 243.55999755859375, 243.78909301757812, 245.17449951171875, 248.11851501464844, 248.7158203125, 115.09693145751953, 117.0737075805664, 120.05931854248047, 120.77220916748047, 123.07715606689453, 125.72364807128906, 109.85590362548828, 112.39854431152344, 114.20746612548828, 116.95733642578125, 117.73690032958984, 122.15008544921875], u'dp_x': [111.5760498046875, 140.59548950195312, 88.38611602783203, 158.2552947998047, 115.38705444335938, 140.92547607421875, 172.6801300048828, 115.71037292480469, 183.8076171875, 149.18008422851562, 120.6231689453125, 196.7709503173828, 164.02395629882812, 134.32180786132812, 64.5458755493164, 46.724369049072266, 77.15164184570312, 32.21230697631836, 52.95413589477539, 67.25080871582031, 33.050376892089844, 87.5832290649414, 49.61521911621094, 77.81704711914062, 31.799861907958984, 62.45124435424805, 42.439605712890625, 54.14159393310547, 219.4923553466797, 210.85491943359375, 203.37953186035156, 198.5994110107422, 194.32286071777344, 189.5088653564453, 187.13658142089844, 185.8370361328125, 214.15895080566406, 181.10533142089844, 146.47915649414062, 211.83815002441406, 177.25453186035156, 144.3924102783203, 199.31747436523438, 170.66110229492188, 
144.83860778808594, 180.55224609375, 145.70240783691406, 172.3448486328125, 141.46939086914062, 157.65899658203125, 87.35091400146484, 72.78260040283203, 88.93101501464844, 71.13123321533203, 87.25609588623047, 69.8150405883789, 86.60945892333984, 69.17989349365234, 85.21100616455078, 67.95042419433594, 83.97472381591797, 65.75786590576172, 183.85504150390625, 182.17401123046875, 180.99432373046875, 178.43093872070312, 178.2622528076172, 170.95863342285156, 180.86807250976562, 171.43788146972656, 180.3823699951172, 168.86575317382812, 144.04090881347656, 163.4275665283203, 143.07244873046875, 162.6227569580078, 139.92333984375, 158.15179443359375, 136.32968139648438, 153.267333984375, 135.05577087402344, 150.7586212158203, 145.03981018066406, 148.38365173339844, 71.36695098876953, 86.16627502441406, 61.22429275512695, 75.84166717529297, 51.103759765625, 64.59684753417969, 51.51070022583008, 38.840843200683594, 87.83291625976562, 78.37515258789062, 90.03551483154297, 79.78938293457031, 69.83355712890625, 63.51275634765625, 185.44496154785156, 176.7206268310547, 169.04324340820312, 183.05271911621094, 173.74090576171875, 53.849117279052734, 41.75295639038086, 56.012142181396484, 32.203887939453125, 45.75273132324219, 34.0209846496582, 27.40822410583496, 16.41045379638672, 30.757139205932617, 8.695992469787598, 20.475650787353516, 8.169807434082031]}
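The `keypoints` list of length 51 in the dump above follows the standard COCO layout of 17 (x, y, visibility) triples, and `num_keypoints` counts the triples whose visibility flag is non-zero. A small sketch using the values of the example annotation (the helper name is made up for illustration):

```python
def split_keypoints(flat):
    """Group a flat COCO keypoints list into (x, y, visibility) triples."""
    assert len(flat) % 3 == 0
    return [tuple(flat[i:i + 3]) for i in range(0, len(flat), 3)]


# 'keypoints' field of the example annotation (id 1218400).
keypoints = [
    407, 115, 1, 407, 105, 2, 0, 0, 0, 425, 95, 2, 0, 0, 0,
    435, 124, 2, 457, 105, 2, 428, 187, 2, 447, 182, 2,
    404, 210, 2, 419, 213, 2, 488, 222, 2, 515, 213, 2,
    471, 293, 2, 487, 297, 2, 462, 372, 1, 486, 374, 2,
]

triples = split_keypoints(keypoints)
# visibility: 0 = not labelled, 1 = labelled but not visible, 2 = visible
labelled = [t for t in triples if t[2] > 0]
print(len(triples), len(labelled))  # prints: 17 15
```

The count of 15 labelled keypoints matches the `num_keypoints` value listed for this annotation.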

 

Model output:

Body_uv:

<type 'list'>
Total number of entries in the list: 12158
Keys of the body_uv result file: [u'image_id', u'category_id', u'uv', u'score', u'bbox']
Shape of the uv data: (3, 224, 44)
Example entry of the body_uv result file: {u'image_id': 885, u'category_id': 1, u'uv': array([[[0, 0, 0, ..., 0, 0, 0],
        [0, 0, 0, ..., 0, 0, 0],
        [0, 0, 0, ..., 0, 0, 0],
        ...,
        [0, 0, 0, ..., 0, 0, 0],
        [0, 0, 0, ..., 0, 0, 0],
        [0, 0, 0, ..., 0, 0, 0]],
       [[0, 0, 0, ..., 0, 0, 0],
        [0, 0, 0, ..., 0, 0, 0],
        [0, 0, 0, ..., 0, 0, 0],
        ...,
        [0, 0, 0, ..., 0, 0, 0],
        [0, 0, 0, ..., 0, 0, 0],
        [0, 0, 0, ..., 0, 0, 0]],
 
       [[0, 0, 0, ..., 0, 0, 0],
        [0, 0, 0, ..., 0, 0, 0],
        [0, 0, 0, ..., 0, 0, 0],
        ...,
        [0, 0, 0, ..., 0, 0, 0],
        [0, 0, 0, ..., 0, 0, 0],
        [0, 0, 0, ..., 0, 0, 0]]], dtype=uint8), u'score': 0.9986090064048767, u'bbox': [594.8278198242188, 26.12091064453125, 44, 224]}

 

The image:

Image size: (427, 640, 3)

Visualization of the I, U and V images:

Visualization of the isocontours of the UV fields:

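u'uv' output:
(3, 224, 44)
Total number of u coordinates: 9856
Number of valid u coordinates: 4281 (all other array elements are 0)
Conclusion:
The uv data (a 3D array) has the same size as the bounding box.
In the output uvi array of shape (3, 224, 44), the three channels carry the u, v and I values; its width and height match the bbox and are obtained from the 56 × 56 output feature map with cv2.resize(), i.e. by linear interpolation.
For each part index, the corresponding data comes from the matching patch of the feature map in that part's channel.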
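The shape bookkeeping in this conclusion can be checked with a short sketch. It assumes, as the notes state, that each channel is resized from a 56 × 56 head output to the bbox size by linear interpolation. The pure-Python `resize_bilinear` below is a simplified stand-in for `cv2.resize`, and the half-empty test map is made up for illustration.

```python
def resize_bilinear(src, out_h, out_w):
    """Resize a 2D list with bilinear interpolation (pixel centres aligned)."""
    in_h, in_w = len(src), len(src[0])
    out = []
    for i in range(out_h):
        # Map the centre of the output pixel back into source coordinates.
        y = min(max((i + 0.5) * in_h / out_h - 0.5, 0.0), in_h - 1.0)
        y0 = int(y)
        y1 = min(y0 + 1, in_h - 1)
        wy = y - y0
        row = []
        for j in range(out_w):
            x = min(max((j + 0.5) * in_w / out_w - 0.5, 0.0), in_w - 1.0)
            x0 = int(x)
            x1 = min(x0 + 1, in_w - 1)
            wx = x - x0
            top = src[y0][x0] * (1 - wx) + src[y0][x1] * wx
            bottom = src[y1][x0] * (1 - wx) + src[y1][x1] * wx
            row.append(top * (1 - wy) + bottom * wy)
        out.append(row)
    return out


# bbox of the example result entry: [x, y, width, height]
bbox = [594.8278198242188, 26.12091064453125, 44, 224]
width, height = int(bbox[2]), int(bbox[3])

# Hypothetical 56 x 56 head output: left half 0 (background), right half 1.
head_map = [[0.0] * 28 + [1.0] * 28 for _ in range(56)]
u_map = resize_bilinear(head_map, height, width)

total = height * width  # 224 * 44 = 9856, the "total number of u coordinates"
valid = sum(1 for row in u_map for value in row if value != 0)
```

Here `total` reproduces the 9856 coordinates reported above, and `valid` plays the role of the non-zero count (4281 in the real output).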

UV coordinate data analysis

(1) The training data provides dp_x, dp_y, dp_I, dp_U, dp_V

dp_x, dp_y: spatial coordinates of the manually annotated points on the image, scaled so that the bounding box is 256 × 256

dp_I: the index of the body part (one of 24) that each point belongs to

dp_U, dp_V: coordinates in UV space; each surface part has its own two-dimensional parameterization

For example:

The first figure shows the part that each point belongs to (dp_I).

The second and third figures place the points using x and y, and then color them by the values of u and v, respectively.
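To place the annotated points on the image as in these figures, dp_x and dp_y first have to be mapped from the bounding-box frame to image pixels. In the example annotation dp_x reaches about 219 although the bbox is only 145 pixels wide, which fits coordinates stored on a 256 × 256 box. The sketch below assumes that convention; the divisor of 255 and the function names are assumptions, not something taken from these notes.

```python
from collections import defaultdict


def to_image_coords(dp_x, dp_y, bbox, box_scale=255.0):
    """Map bbox-relative DensePose points to image pixel coordinates.

    Assumes dp_x / dp_y are scaled so the bbox spans roughly 0..box_scale.
    """
    x0, y0, w, h = bbox
    xs = [x0 + x / box_scale * w for x in dp_x]
    ys = [y0 + y / box_scale * h for y in dp_y]
    return xs, ys


def group_by_part(dp_I, xs, ys):
    """Collect image-space points per body-part index (1..24)."""
    parts = defaultdict(list)
    for part, x, y in zip(dp_I, xs, ys):
        parts[int(part)].append((x, y))
    return dict(parts)


# First three points of the example annotation (values rounded).
bbox = [387, 71.1, 145, 322]
dp_x = [111.576, 140.595, 88.386]
dp_y = [28.909, 36.551, 36.784]
dp_I = [1.0, 1.0, 1.0]

xs, ys = to_image_coords(dp_x, dp_y, bbox)
parts = group_by_part(dp_I, xs, ys)
```

Grouping by dp_I gives the per-part point sets of the first figure; coloring the same points by dp_U or dp_V gives the other two.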

(2) Model outputs

The model outputs I, U and V maps that have the same size as the bbox. The u and v values are dense, on the order of ten thousand values per person (224 × 44 = 9856 in the example above). This is the dense correspondence the paper refers to.

Rendered output:

(3) Loss computation of the model

     Softmax loss for classification, and L1 loss for the key-point regression

(4) Evaluation criteria of the model

body_uv uses OGPS as the similarity measure and AP as the evaluation metric.

OGPS: (1) match each labelled key point with the nearest predicted point.

            (2) compute the distance between the two points of each pair.
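The per-point distances are then turned into one score per person. The sketch below follows the geodesic point similarity (GPS) defined in the DensePose paper, where the distances are geodesic distances on the body surface and the normalizing constant is κ = 0.255. The function name and the sample distances are illustrative, and the OGPS code in the repository may differ in detail.

```python
import math

KAPPA = 0.255  # normalizing constant reported in the DensePose paper


def geodesic_point_similarity(distances, kappa=KAPPA):
    """Mean of exp(-d^2 / (2 * kappa^2)) over the annotated points."""
    if not distances:
        return 0.0
    scores = [math.exp(-d * d / (2.0 * kappa * kappa)) for d in distances]
    return sum(scores) / len(scores)


# Hypothetical distances for the annotated points of one person.
perfect = geodesic_point_similarity([0.0, 0.0, 0.0, 0.0])
mixed = geodesic_point_similarity([0.0, 0.1, 0.3, 0.6])
```

With this κ, a single point at distance 0.3 scores roughly 0.5. AP is then computed by thresholding the similarity, in the same way OKS is thresholded for keypoints.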

 

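Data processing during training:

Using the x, y and index information, the output 56 × 56 feature map is pooled by bilinear interpolation into an array of shape 196 × 25, and the loss is computed against the label uv array of shape 196 × 25. The labels body_u and body_v themselves have size 196 × 1 and are replicated 25 times to become 196 × 25.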
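A minimal sketch of this sampling step: each of up to 196 annotated points reads the output map at its (x, y) location by bilinear interpolation, once per channel, and the 196 × 1 label vector is tiled to 196 × 25 so that it lines up with the 25 sampled channels. The function names and the tiny 2 × 2 feature map are made up for illustration; this is not the Detectron operator.

```python
def bilinear_sample(fmap, x, y):
    """Read a 2D map at a fractional (x, y) location."""
    h, w = len(fmap), len(fmap[0])
    x = min(max(x, 0.0), w - 1.0)
    y = min(max(y, 0.0), h - 1.0)
    x0, y0 = int(x), int(y)
    x1, y1 = min(x0 + 1, w - 1), min(y0 + 1, h - 1)
    wx, wy = x - x0, y - y0
    top = fmap[y0][x0] * (1 - wx) + fmap[y0][x1] * wx
    bottom = fmap[y1][x0] * (1 - wx) + fmap[y1][x1] * wx
    return top * (1 - wy) + bottom * wy


def sample_channels(fmaps, points):
    """Return a len(points) x len(fmaps) array of sampled values."""
    return [[bilinear_sample(f, x, y) for f in fmaps] for x, y in points]


def tile_labels(labels, num_channels=25):
    """Repeat an N x 1 label vector into an N x num_channels array."""
    return [[value] * num_channels for value in labels]


# Toy example: 25 channels of a 2 x 2 map and two annotated points.
fmaps = [[[0.0, 1.0], [2.0, 3.0]] for _ in range(25)]
points = [(0.5, 0.5), (1.0, 0.0)]
sampled = sample_channels(fmaps, points)  # shape 2 x 25
targets = tile_labels([0.4, 0.9])         # shape 2 x 25
```

In the real setting `points` would hold the 196 annotation slots and `fmaps` the 25 channels of the 56 × 56 output.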

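(B), File overview

  1. detectron-download-cache: pre-trained parameter files
  2. detectron-output: folder where the trained parameters are saved
  3. configs: network configuration folder
  4. detectron: core code of the Detectron platform
  5. tools: Python scripts for training, testing and inference
  6. sh: shell scripts for training and testing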

(C), Modified files

  1. config.py

Added a nonlocal section

  2. dataset_catalog.py

Added the test set COCO_test2015

  3. task_evaluation.py

  4. body_uv_rcnn_heads.py

Added a series of heads

Decoupled: add_roi_body_uv_head_v1convX_Decoupling

ResNet-like structure: add_roi_body_uv_head_Modification_resnet

  5. fast_rcnn_heads.py

Added a head made of convolutions plus fully connected layers (the convolutions are shared; the fully connected layers are used only for classification)

Head: add_roi_Xconv1fc_gn_head_test

Matching output: add_fast_rcnn_outputs_test

  6. model_builder.py

To swap in the new Fast R-CNN head, replace the call at line 287 with:

fast_rcnn_heads.add_fast_rcnn_outputs_test(model, blob_frcn, dim_frcn)

To use the decoupled head, modify the body of the _add_roi_body_uv_head function at line 355.

  7. nonlocal_helper.py

Added the nonlocal part

  8. ResNet_nonlocal.py

Added a ResNet base network with nonlocal blocks

Fourth, the experiments (with ResNet50 as backbone)

(A), Modified body_uv_head: the network is made deeper, with a 24-layer residual structure and GN; 8% improvement

(B), Added nonlocal blocks after residual blocks 1 and 3 of the third stage and after residual blocks 1, 3 and 5 of the fourth stage of ResNet50; improvement of 1.3%-1.7%

(C), Replaced the two fully connected layers of the original fast_head with four shared convolutions followed by a classification branch with two fully connected layers; improvement of 0.5

(D), Decoupled the original eight convolutions of body_uv_head into one branch for ann and index and one branch for u and v, each with eight convolutions; improvement of 1.4%

(E), Image sizes are reduced in step during training, shrinking by 64 pixels at a time, with single-scale testing; improvement of 0.8%

(F), Test-time augmentation with SCALES: (400, 500, 600, 700, 900, 1000, 1100, 1200); improvement of 2%

(G), Multi-task: joint training of the keypoint branch and the body_uv branch; improvement of about 2%
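For the multi-scale test in the list above, each entry of SCALES is a target length for the shorter image side. The sketch below computes the resize factor the way Detectron-style preprocessing usually does, capping the longer side at a maximum size. The cap of 2000 pixels and the function name are assumptions, not values from these notes.

```python
def resize_factor(height, width, target_short_side, max_long_side):
    """Scale so the short side hits the target unless the long side overflows."""
    short_side = min(height, width)
    long_side = max(height, width)
    scale = float(target_short_side) / short_side
    if round(scale * long_side) > max_long_side:
        scale = float(max_long_side) / long_side
    return scale


SCALES = (400, 500, 600, 700, 900, 1000, 1100, 1200)
MAX_SIZE = 2000  # assumed cap on the longer side

# Image size from the example earlier in these notes: 427 x 640.
factors = [resize_factor(427, 640, s, MAX_SIZE) for s in SCALES]
sizes = [(round(427 * f), round(640 * f)) for f in factors]
```

The model is run once per entry of `sizes` and the predictions are combined, which is why this setting costs roughly eight forward passes per image.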
