Training Procedure

1. LOSS & OPTIMIZER

###########   LOSS & OPTIMIZER   ##########

criterion = nn.BCELoss()

optimizerD = torch.optim.Adam(netD.parameters(),lr=opt.lr, betas=(opt.beta1, 0.999))

optimizerG = torch.optim.Adam(netG.parameters(),lr=opt.lr, betas=(opt.beta1, 0.999))

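(1) nn.BCELoss()

Cross-entropy for binary classification. It expects probabilities in [0, 1], so a Sigmoid must be applied before this layer.

In binary classification there are only positive and negative examples, and their probabilities sum to 1, so predicting a single probability is enough. The loss therefore simplifies to

Loss(x_i, y_i) = -w_i [ y_i log(x_i) + (1 - y_i) log(1 - x_i) ]

Note that x and y can be vectors or matrices, and i is just an index: x_i is the predicted probability that the i-th sample is positive, y_i is the label of the i-th sample, and w_i is the weight of that term. The element-wise loss, x, y, and w therefore all have the same shape.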
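To make the formula concrete, here is a minimal pure-Python sketch that evaluates it element by element. The helper name bce_loss and the sample values are made up for illustration, and the weights default to 1.

```python
import math

def bce_loss(x, y, w=None):
    """Per-element weighted binary cross-entropy.

    x: predicted probabilities of the positive class (strictly between 0 and 1)
    y: labels (0 or 1)
    w: optional per-element weights (defaults to 1 for every element)
    """
    if w is None:
        w = [1.0] * len(x)
    return [-wi * (yi * math.log(xi) + (1 - yi) * math.log(1 - xi))
            for xi, yi, wi in zip(x, y, w)]

probs = [0.9, 0.2, 0.5]     # illustrative predictions
labels = [1.0, 0.0, 1.0]    # illustrative labels
losses = bce_loss(probs, labels)        # same length as probs and labels
mean_loss = sum(losses) / len(losses)   # scalar summary of the batch
print(losses, mean_loss)
```

The list losses has one entry per input element, matching the remark about shapes. By default nn.BCELoss goes one step further and averages these per-element values into a single scalar, which corresponds to mean_loss here.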

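(2) torch.optim.xxx()

To use torch.optim, first construct an Optimizer object, which holds the current state and updates the parameters based on the computed gradients.
To construct an optimizer, give it an iterable containing the parameters to optimize (all of which must be Variables). You can then specify optimizer-specific options such as the learning rate and weight decay.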

optimizerD = torch.optim.Adam(netD.parameters(),lr=opt.lr, betas=(opt.beta1, 0.999))

Taking an optimization step:
Every Optimizer implements a step() method that updates all of the parameters. It can be called in two ways:

optimizer.step()

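This is the simplified version supported by most optimizers. Call it once the gradients have been computed, for example with backward(), as in the loop below.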

for input, target in dataset:

    optimizer.zero_grad()

    output = model(input)

    loss = loss_fn(output, target)

    loss.backward()

    optimizer.step()

optimizer.step(closure)
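The second form is for optimizers that need to re-evaluate the loss several times within one step, such as LBFGS. You pass a closure that clears the gradients, recomputes the loss, calls backward(), and returns the loss. The sketch below is only an illustration: the toy linear model, the data, and the learning rate are assumptions and are not part of the DCGAN code.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
model = nn.Linear(2, 1)                      # toy model (hypothetical)
inputs = torch.randn(16, 2)                  # toy inputs (hypothetical)
targets = inputs.sum(dim=1, keepdim=True)    # toy regression target
loss_fn = nn.MSELoss()
optimizer = torch.optim.LBFGS(model.parameters(), lr=0.1)

def closure():
    # Clear old gradients, recompute the loss, backpropagate, return the loss.
    optimizer.zero_grad()
    loss = loss_fn(model(inputs), targets)
    loss.backward()
    return loss

initial_loss = loss_fn(model(inputs), targets).item()
for _ in range(5):
    optimizer.step(closure)   # the optimizer may call closure() several times
final_loss = loss_fn(model(inputs), targets).item()
print(initial_loss, final_loss)
```

Adam, which this DCGAN code uses, does not need a closure, so the plain optimizer.step() form is the one that appears in the training loop.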

2. GLOBAL VARIABLES

##########   GLOBAL VARIABLES   ###########

noise = torch.FloatTensor(opt.batchSize, opt.nz, 1, 1)

real = torch.FloatTensor(opt.batchSize, nc, opt.imageSize, opt.imageSize)

label = torch.FloatTensor(opt.batchSize)

real_label = 1

fake_label = 0



noise = Variable(noise)

real = Variable(real)

label = Variable(label)

if(opt.cuda):

    noise = noise.cuda()

    real = real.cuda()

    label = label.cuda()
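The code above targets an older PyTorch release. Since PyTorch 0.4, Variable has been merged into Tensor, so the same buffers can be created without the Variable wrapper. The following is a rough sketch of that equivalent, not the author's code; the sizes stand in for opt.batchSize, opt.nz, nc and opt.imageSize and are picked arbitrarily.

```python
import torch

# Hypothetical stand-ins for opt.batchSize, opt.nz, nc and opt.imageSize.
batch_size, nz, nc, image_size = 4, 100, 3, 64

# Choose the GPU when one is available, otherwise stay on the CPU.
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

# Pre-allocated buffers, created directly on the chosen device.
noise = torch.empty(batch_size, nz, 1, 1, device=device)
real = torch.empty(batch_size, nc, image_size, image_size, device=device)
label = torch.empty(batch_size, device=device)
real_label, fake_label = 1, 0

noise.normal_(0, 1)        # sample z from a standard normal, in place
label.fill_(real_label)    # every entry now holds the "real" label
```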

3. TRAINING

Discriminator

netD.zero_grad()

Train on real data: resize > output = netD(real)

errD_real = criterion(output, label)

errD_real.backward()

Train on fake data: resize > generate noise > fake = netG(noise) > output = netD(fake.detach())

errD_fake = criterion(output,label)

errD_fake.backward()

errD = errD_fake + errD_real

optimizerD.step()

Generator

netG.zero_grad()

Train on fake data: output = netD(fake)

errG = criterion(output, label)

errG.backward()

Note: the label here is 1 (real), whereas the fake data was labeled 0 when training the discriminator.

optimizerG.step()
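The two steps above differ in two ways: the discriminator step calls fake.detach(), and the generator step flips the label to 1. The small sketch below shows what detach() changes in practice. The two tiny networks are hypothetical stand-ins for netG and netD, chosen only so that the example runs quickly.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
netG = nn.Linear(4, 8)                                 # toy generator (hypothetical)
netD = nn.Sequential(nn.Linear(8, 1), nn.Sigmoid())    # toy discriminator (hypothetical)
criterion = nn.BCELoss()

noise = torch.randn(5, 4)
fake = netG(noise)

# Discriminator step on fake data: detach() cuts the graph at `fake`,
# so backward() stops there and never reaches the generator.
fake_labels = torch.zeros(5, 1)
errD_fake = criterion(netD(fake.detach()), fake_labels)
errD_fake.backward()
g_grad_after_d = netG.weight.grad       # None: no gradient reached G
d_grad_after_d = netD[0].weight.grad    # populated by the D step

# Generator step: no detach, and the labels are flipped to "real".
real_labels = torch.ones(5, 1)
errG = criterion(netD(fake), real_labels)
errG.backward()
g_grad_after_g = netG.weight.grad       # now populated
```

Because the generator's graph is left untouched by the discriminator's backward pass, the same fake batch can be reused for the generator update without running netG a second time.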

########### Training   ###########

for epoch in range(1,opt.niter+1):

    for i, (images,_) in enumerate(loader):

        ########### fDx ###########

        netD.zero_grad()

        # train with real data; resize `real` because the last batch may have fewer than

        # opt.batchSize images

        real.data.resize_(images.size()).copy_(images)

        label.data.resize_(images.size(0)).fill_(real_label)



        output = netD(real)

        errD_real = criterion(output, label)

        errD_real.backward()



        # train with fake data

        label.data.fill_(fake_label)

        noise.data.resize_(images.size(0), opt.nz, 1, 1)

        noise.data.normal_(0,1)



        fake = netG(noise)

        # detach fake here so that no gradients flow back into G during the D update

        output = netD(fake.detach())

        errD_fake = criterion(output,label)

        errD_fake.backward()



        errD = errD_fake + errD_real

        optimizerD.step()



        ########### fGx ###########

        netG.zero_grad()

        label.data.fill_(real_label)

        output = netD(fake)

        errG = criterion(output, label)

        errG.backward()

        optimizerG.step()



        ########### Logging #########

        print('[%d/%d][%d/%d] Loss_D: %.4f Loss_G: %.4f '

                  % (epoch, opt.niter, i, len(loader),

                     errD.data[0], errG.data[0]))



        ########## Visualize #########

        if(i % 50 == 0):

            vutils.save_image(fake.data,

                        '%s/fake_samples_epoch_%03d.png' % (opt.outf, epoch),

                        normalize=True)



torch.save(netG.state_dict(), '%s/netG.pth' % (opt.outf))

torch.save(netD.state_dict(), '%s/netD.pth' % (opt.outf))
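The two torch.save calls store only the parameters (the state_dict), not the network classes. To reuse the weights, build a network with the same architecture and load the file into it. Below is a minimal sketch of that round trip. The nn.Linear layer stands in for netG and a temporary directory stands in for opt.outf; both are assumptions made for the example.

```python
import os
import tempfile

import torch
import torch.nn as nn

net = nn.Linear(3, 2)                     # stand-in for netG (hypothetical)
outf = tempfile.mkdtemp()                 # stand-in for opt.outf (hypothetical)
path = os.path.join(outf, 'netG.pth')
torch.save(net.state_dict(), path)        # same call pattern as in the post

restored = nn.Linear(3, 2)                # must match the saved architecture
restored.load_state_dict(torch.load(path))
restored.eval()                           # switch to inference mode before sampling

# Every tensor in the restored network equals the one that was saved.
same = all(torch.equal(a, b)
           for a, b in zip(net.state_dict().values(),
                           restored.state_dict().values()))
```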

Source code: https://github.com/sunshineatnoon/Paper-Implementations/tree/master/dcgan

PyTorch study notes, July 19: DCGAN code study, part 3