“RuntimeError: Trying to backward through the graph a second time, but the buffers have already been freed. Specify retain_graph=True when calling backward the first time”

This error tends to appear when a model returns multiple outputs and one of them (here, the recurrent hidden state) is fed back into the next forward pass, as in the following training step:

        # zero the parameter gradients
        model.zero_grad()

        # forward + backward + optimize
        outputs, hidden = model(inputs, hidden)
        loss = _loss(outputs, session, items)
        acc_loss += loss.data[0]

        loss.backward()
        # Add parameters' gradients to their values, multiplied by learning rate
        for p in model.parameters():
            p.data.add_(-learning_rate, p.grad.data)
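
For reference, the following self-contained sketch reproduces the error with a plain nn.RNN; the layer sizes and the rnn/hidden names are made up for illustration and are not taken from the snippet above:

    import torch
    import torch.nn as nn

    rnn = nn.RNN(input_size=4, hidden_size=8, batch_first=True)
    hidden = torch.zeros(1, 2, 8)              # (num_layers, batch, hidden_size)

    for step in range(2):
        inputs = torch.randn(2, 5, 4)          # (batch, seq_len, input_size)
        outputs, hidden = rnn(inputs, hidden)  # hidden now references this step's graph
        loss = outputs.sum()
        loss.backward()                        # fails on the second iteration with the error above

The second backward() fails because hidden still points into the graph built in the first iteration, whose intermediate buffers were freed by the first backward() call.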

 

The first solution:

Detach/repackage the hidden state in between batches, so that each backward pass only covers the current batch. There are (at least) three ways to do this; a sketch applying this to the loop above follows the list.

  1. hidden.detach_()
  2. hidden = hidden.detach()
  3. hidden = Variable(hidden.data, requires_grad=True) 
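
As a rough sketch (not the original author's code), this is the training step from above with option 2 applied; model, _loss, session, items, acc_loss, and learning_rate are assumed from the snippet, while data_loader and init_hidden() are hypothetical names:

    hidden = model.init_hidden()                  # hypothetical helper returning the initial state
    for inputs, session, items in data_loader:    # hypothetical iterable of batches
        hidden = hidden.detach()                  # cut the link to the previous batch's graph
        model.zero_grad()
        outputs, hidden = model(inputs, hidden)
        loss = _loss(outputs, session, items)
        acc_loss += loss.item()
        loss.backward()                           # backpropagates through the current batch only
        # manual SGD step, as in the snippet above
        for p in model.parameters():
            p.data.add_(p.grad.data, alpha=-learning_rate)

All three ways break the link to the previous batch's graph; option 3 uses the legacy Variable API from pre-0.4 PyTorch, so on current versions option 1 or 2 is enough.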

The second solution:

Replace loss.backward() with loss.backward(retain_graph=True), but be aware that each successive batch will take more time than the previous one, because backward will have to propagate all the way back to the start of the first batch.

In practice, the second solution is very slow, and if memory is limited it will also run out of memory, because every previous batch's graph is kept alive.
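
For contrast, a minimal sketch of the second solution in the same hypothetical loop; only the backward() call changes, but because the hidden state is never detached, every batch's graph stays alive, which is exactly what makes later batches slower and more memory hungry:

    for inputs, session, items in data_loader:
        model.zero_grad()
        outputs, hidden = model(inputs, hidden)   # hidden is NOT detached here
        loss = _loss(outputs, session, items)
        acc_loss += loss.item()
        loss.backward(retain_graph=True)          # keep the graph so the next batch's backward
                                                  # can run through all previous batches
        for p in model.parameters():
            p.data.add_(p.grad.data, alpha=-learning_rate)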


Source: www.cnblogs.com/carlber/p/11959526.html