Hello! ImageNet ILSVRC 2012!

Speaking on behalf of the GPU-poor of computer vision, this time I want to get my hands on the dataset that shows up everywhere in papers but scares the poor away with its sheer size: ILSVRC2012 from ImageNet.


Remember BigGAN, the pay-to-win, domineering-CEO of GANs that used mountains of compute to crush every other "noise to image" model?

No time to explain, get on board!
## Download the poor person's BigGAN (4–8 GPU version)
>> git clone https://github.com/ajbrock/BigGAN-PyTorch.git
## Prepare the dataset and its preprocessing

Resources

http://www.image-net.org/challenges/LSVRC/2012/nnoupb/ILSVRC2012_img_test.tar
http://www.image-net.org/challenges/LSVRC/2012/nnoupb/ILSVRC2012_img_val.tar
http://www.image-net.org/challenges/LSVRC/2012/nnoupb/ILSVRC2012_img_train.tar
http://www.image-net.org/challenges/LSVRC/2012/nnoupb/ILSVRC2012_devkit_t12.tar
http://www.image-net.org/challenges/LSVRC/2012/nnoupb/ILSVRC2012_bbox_train_v2.tar

This step is the crux: how I actually got hold of the entire dataset.

  • First, download the auxiliary files (bounding boxes, devkit, etc.) and the smaller test and validation archives through Thunder (renting a one-day membership is recommended), simply pasting in the links above;
  • The 137 GB training archive is another matter: on the intranet the download speed is throttled hard and it will never finish, so I fell back on a resource shared on an enthusiastic netizen's network drive, where each category is its own archive, 1,000 in total.
## The original resource is gone; here is an equivalent one
https://pan.baidu.com/s/1hsDmdNI

After downloading comes decompression. I wrote a script to unpack the 1,000 .tar archives, which took a few hours to run:

import tarfile
import os
from tqdm import tqdm

if __name__ == '__main__':
    src_pth = 'xxxxxx/ILSVRC2012_img_train'                           # folder holding the 1000 .tar archives
    des_pth = 'xxxxxx/SuperDatasets/Image-Net/ILSVRC2012_img_train'   # where the class folders will go

    for i in tqdm(range(1000)):
        dir_name = '%04d' % (i + 1)          # archives are named 0001.tar ... 1000.tar
        dir_file = os.path.join(des_pth, dir_name)
        os.makedirs(dir_file, exist_ok=True)

        tar_file = os.path.join(src_pth, dir_name + '.tar')
        with tarfile.open(tar_file) as tar_fh:
            tar_fh.extractall(path=dir_file)  # extract into the per-class target folder
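Once the script has run, it is worth a quick sanity check that every class folder actually contains images. A minimal sketch, assuming the same des_pth layout as above (1,000 numbered sub-folders full of .JPEG files):

import os

des_pth = 'xxxxxx/SuperDatasets/Image-Net/ILSVRC2012_img_train'  # same target path as in the script above

total = 0
for dir_name in sorted(os.listdir(des_pth)):
    class_dir = os.path.join(des_pth, dir_name)
    if not os.path.isdir(class_dir):
        continue
    n = sum(1 for f in os.listdir(class_dir) if f.lower().endswith(('.jpeg', '.jpg')))
    total += n
    print(dir_name, n)

print('total images:', total)  # the full ILSVRC2012 train split has 1,281,167 images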

In the end I got all 1,000 class folders extracted. Okay, I was really touched ♪(^∀^●)ノ


Each subfolder contains images of a single category. For the Chinese and English names of the 1,000 categories, see this blog post: imagenet dataset category labels and corresponding English-Chinese comparison tables.
In practice, though, we don't care what each category actually is; we just refer to them as 0, 1, 2, ...


We just need to create a data directory in the project root, an ImageNet directory inside it, and an I128 directory inside that, i.e. data/ImageNet/I128. Then move the 1,000 training sub-directories into it.
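If you'd rather do the move in a script than by hand, here is a minimal sketch; the source path is a placeholder standing in for wherever you extracted the 1,000 class folders:

import os
import shutil

src = 'xxxxxx/SuperDatasets/Image-Net/ILSVRC2012_img_train'  # the 1000 extracted class folders (placeholder path)
dst = 'data/ImageNet/I128'                                   # layout expected by the BigGAN-PyTorch preprocessing

os.makedirs(dst, exist_ok=True)
for class_dir in sorted(os.listdir(src)):
    shutil.move(os.path.join(src, class_dir), os.path.join(dst, class_dir))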


Anyway, with the data collected, on to the preprocessing.

  • Run python make_hdf5.py --dataset I128 --batch_size 128 --data_root data
    HDF5 packs the data into a format that is friendlier to fast I/O, much like the lmdb used in the earlier EDVR blog post; but unlike lmdb, which stores one huge dict, HDF5 also lets you read in controlled batches, so it really is Σ( ° △ °|||) ︴ purpose-built for datasets.
  • Here, batch_size_overall = num_gpu × batch_size_per_gpu.
    The batch_size here is the overall one; the author says a single 16 GB card can handle a batch_size_per_gpu of 256, so the original BigGAN setting is 2048 = 256 × 8.
  • I ended up borrowing four 1080 Ti cards, call it 8 GB × 4, which would give 128 × 4 = 512; after thinking it over I kept it simple and just set the batch size to half of the author's 256, i.e. 128 (see the quick arithmetic sketch after this list).
    Let's see what happens under this setting.
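The quick arithmetic sketch mentioned above, just to keep the numbers straight (the 8 GB-per-card figure is the guess from the list above):

# batch_size_overall = num_gpu * batch_size_per_gpu
def overall_batch(num_gpu, per_gpu):
    return num_gpu * per_gpu

print(overall_batch(8, 256))   # 2048 -> the original BigGAN setting (8 x 16 GB cards)
print(overall_batch(4, 128))   # 512  -> what 4 cards at 128 each would give in principle
# in the runs below, --batch_size is simply set to 128 (half of 256) overall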
From the code files make_hdf5.py, utils.py and datasets.py, we can infer that the dataset should be laid out like this:
'''
/data/ImageNet
 ├I128   # ImageNet 128x128
 │ ├dog
 │ │ ├xxx.jpg
 │ │ ├xxy.jpg
 │ │ ...
 │ │ └zzz.jpg
 │ ├cat
 │ │ ├xxx.jpg
 │ │ ...
 │ │ ├xxy.jpg
 │ │ ...
 │ │ └zzz.jpg
 │ ...
 │ └class_n
 │   ├xxx.jpg
 │   ...
 │   ├xxy.jpg
 │   ...
 │   └zzz.jpg
 ├I256   # ImageNet 256x256
 ...
 └XXXX
'''

Once this finishes, we should get data/ImageNet/ILSVRC128.hdf5.
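A quick way to check the file looks sane is to open it with h5py; the dataset keys 'imgs' and 'labels' are what make_hdf5.py uses in the version of the repo I looked at, so verify against your copy:

import h5py  # pip install h5py

# keys and shapes below are assumptions based on make_hdf5.py; check your version
with h5py.File('data/ImageNet/ILSVRC128.hdf5', 'r') as f:
    print(list(f.keys()))     # expected: ['imgs', 'labels']
    print(f['imgs'].shape)    # (N, 3, 128, 128) uint8 images
    print(f['labels'].shape)  # (N,) integer class labels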

TIP

We found that the code does not actually require all 1,000 categories to take part in training, so we can use just a subset. Here we take the first 25 categories (stopping right at Mr. Owl), 32,500 images in total; one way to build such a subset is sketched below. The processing is very fast, under a minute, and the generated file is about 1.4 GB.
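A minimal sketch of one way to prepare such a subset: copy only the first N class folders into the I128 directory before running make_hdf5.py (the source path is again a placeholder):

import os
import shutil

src = 'xxxxxx/SuperDatasets/Image-Net/ILSVRC2012_img_train'  # all 1000 extracted class folders (placeholder path)
dst = 'data/ImageNet/I128'
num_classes = 25                                             # keep only the first 25 categories

os.makedirs(dst, exist_ok=True)
for class_dir in sorted(os.listdir(src))[:num_classes]:
    shutil.copytree(os.path.join(src, class_dir), os.path.join(dst, class_dir))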


  • Run python calculate_inception_moments.py --dataset I128_hdf5 --data_root data
    This uses the pretrained Inception network from torchvision.models to compute the dataset's Inception statistics, i.e. the moments μ and σ of the pooled features (and the Inception Score of the data), which are used later for evaluation; a small check of the saved file is sketched below.
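If you want to peek at the result, the script saves the statistics as an .npz file under --data_root; the exact filename and keys depend on the repo version, so treat the 'mu'/'sigma' keys below as an assumption and check your copy:

import glob
import numpy as np

# find whatever *_inception_moments.npz the script produced under data_root
path = glob.glob('data/*_inception_moments.npz')[0]
moments = np.load(path)
print(path)
print(moments['mu'].shape)     # mean of Inception pool features, e.g. (2048,)
print(moments['sigma'].shape)  # covariance of pool features, e.g. (2048, 2048)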

## Start training

# --dataset: which dataset to train on, out of I128, I256, C10, C100 (here the ImageNet-128 HDF5 file)
# --parallel / --shuffle / --num_workers / --batch_size: data loader settings
python train.py \
--dataset I128_hdf5 \
--parallel --shuffle --num_workers 8 --batch_size 128 \
--num_G_accumulations 2 --num_D_accumulations 2 \
--num_D_steps 1 --G_lr 1e-4 --D_lr 4e-4 --D_B2 0.999 --G_B2 0.999 \
--G_attn 64 --D_attn 64 \
--G_nl relu --D_nl relu \
--SN_eps 1e-8 --BN_eps 1e-5 --adam_eps 1e-8 \
--G_ortho 0.0 \
--G_init xavier --D_init xavier \
--ema --use_ema --ema_start 2000 --G_eval_mode \
--test_every 2000 --save_every 1000 --num_best_copies 5 --num_save_copies 2 --seed 0 \
--name_suffix SAGAN_ema
These are the settings that ran successfully for me.
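One note on the accumulation flags: if I read the BigGAN-PyTorch README correctly, gradient accumulation multiplies the effective batch per weight update, so with the settings above:

# effective batch per update = batch_size * num_accumulations (per my reading of the README)
print(128 * 2)   # 256 with the command above (--batch_size 128, --num_G_accumulations 2)
print(256 * 8)   # 2048, the BigGAN paper's batch size, for comparison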

Later I was down to only two cards, so I tried again with batch_size_overall set to 32, and for now it runs on the two cards:

CUDA_VISIBLE_DEVICES=0,1 python train.py --dataset I128_hdf5 --parallel --shuffle  --num_workers 8 --batch_size 32 --num_G_accumulations 1 --num_D_accumulations 1 --num_D_steps 1 --G_lr 1e-4 --D_lr 4e-4 --D_B2 0.999 --G_B2 0.999 --G_attn 64 --D_attn 64 --G_nl relu --D_nl relu --SN_eps 1e-8 --BN_eps 1e-5 --adam_eps 1e-8 --G_ortho 0.0 --G_init xavier --D_init xavier --ema --use_ema --ema_start 2000 --G_eval_mode --test_every 2000 --save_every 1000 --num_best_copies 5 --num_save_copies 2 --seed 0 --name_suffix SAGAN_ema

Origin blog.csdn.net/WinerChopin/article/details/102504046