The architecture for the YOLOv2 can be visualized
graph: http://ethereon.github.io/netscope/#/gist/d08a41711e48cf111e330827b1279c31
layer filters size input output 0 conv 32 3 x 3 / 1 416 x 416 x 3 -> 416 x 416 x 32 1 max 2 x 2 / 2 416 x 416 x 32 -> 208 x 208 x 32 2 conv 64 3 x 3 / 1 208 x 208 x 32 -> 208 x 208 x 64 3 max 2 x 2 / 2 208 x 208 x 64 -> 104 x 104 x 64 4 conv 128 3 x 3 / 1 104 x 104 x 64 -> 104 x 104 x 128 5 conv 64 1 x 1 / 1 104 x 104 x 128 -> 104 x 104 x 64 6 conv 128 3 x 3 / 1 104 x 104 x 64 -> 104 x 104 x 128 7 max 2 x 2 / 2 104 x 104 x 128 -> 52 x 52 x 128 8 conv 256 3 x 3 / 1 52 x 52 x 128 -> 52 x 52 x 256 9 conv 128 1 x 1 / 1 52 x 52 x 256 -> 52 x 52 x 128 10 conv 256 3 x 3 / 1 52 x 52 x 128 -> 52 x 52 x 256 11 max 2 x 2 / 2 52 x 52 x 256 -> 26 x 26 x 256 12 conv 512 3 x 3 / 1 26 x 26 x 256 -> 26 x 26 x 512 13 conv 256 1 x 1 / 1 26 x 26 x 512 -> 26 x 26 x 256 14 conv 512 3 x 3 / 1 26 x 26 x 256 -> 26 x 26 x 512 15 conv 256 1 x 1 / 1 26 x 26 x 512 -> 26 x 26 x 256 16 conv 512 3 x 3 / 1 26 x 26 x 256 -> 26 x 26 x 512 17 max 2 x 2 / 2 26 x 26 x 512 -> 13 x 13 x 512 18 conv 1024 3 x 3 / 1 13 x 13 x 512 -> 13 x 13 x1024 19 conv 512 1 x 1 / 1 13 x 13 x1024 -> 13 x 13 x 512 20 conv 1024 3 x 3 / 1 13 x 13 x 512 -> 13 x 13 x1024 21 conv 512 1 x 1 / 1 13 x 13 x1024 -> 13 x 13 x 512 22 conv 1024 3 x 3 / 1 13 x 13 x 512 -> 13 x 13 x1024 23 conv 1024 3 x 3 / 1 13 x 13 x1024 -> 13 x 13 x1024 24 conv 1024 3 x 3 / 1 13 x 13 x1024 -> 13 x 13 x1024 25 route 16 26 reorg / 2 26 x 26 x 512 -> 13 x 13 x2048 27 route 26 24 28 conv 1024 3 x 3 / 1 13 x 13 x3072 -> 13 x 13 x1024 29 conv 125 1 x 1 / 1 13 x 13 x1024 -> 13 x 13 x 125 30 detection