Two problems came up:
RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu!
RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cuda:1!
First problem: how to get a tensor that lives on the CPU onto the GPU.
The scenario I ran into: a model made up of many classes, each of which defines various tensors. Previously, whenever an error pointed at a tensor stuck on the CPU, I would just tack a .cuda() onto it there. That got tedious, so I refactored: define the matrices inside functions or classes instead.
It used to be written like this:
qtable_8x8 = [[16, 16, 16, 16, 17, 18, 20, 24],
              [16, 16, 16, 17, 18, 20, 24, 25],
              [16, 16, 17, 18, 20, 24, 25, 28],
              [16, 17, 18, 20, 24, 25, 28, 33],
              [17, 18, 20, 24, 25, 28, 33, 41],
              [18, 20, 24, 25, 28, 33, 41, 54],
              [20, 24, 25, 28, 33, 41, 54, 71],
              [24, 25, 28, 33, 41, 54, 71, 91]]
qtable_8x8 = nn.Parameter(torch.from_numpy(np.array(qtable_8x8)), requires_grad=False)
The fix: move it into a function:
def def_param():
    qtable_8x8 = [[16, 16, 16, 16, 17, 18, 20, 24],
                  [16, 16, 16, 17, 18, 20, 24, 25],
                  [16, 16, 17, 18, 20, 24, 25, 28],
                  [16, 17, 18, 20, 24, 25, 28, 33],
                  [17, 18, 20, 24, 25, 28, 33, 41],
                  [18, 20, 24, 25, 28, 33, 41, 54],
                  [20, 24, 25, 28, 33, 41, 54, 71],
                  [24, 25, 28, 33, 41, 54, 71, 91]]
    return nn.Parameter(torch.from_numpy(np.array(qtable_8x8)), requires_grad=False)
and call that function in the class's __init__:
class quantize(nn.Module):
    def __init__(self, factor):
        super(quantize, self).__init__()
        self.factor = factor
        self.rounding = diff_round
        self.qtable_8x8 = def_param()  # Here

    def forward(self, image):
        image = image.float() * 16 / (self.qtable_8x8 * self.factor)  # factor
        image = self.rounding(image)
        return image
At the same time, for a tensor defined this way to be moved to the right device together with the class, it has to be wrapped as an nn.Parameter(), like this:
qtable_8x8 = nn.Parameter(torch.from_numpy(np.array(qtable_8x8)), requires_grad=False)
Note that if no gradient is needed, pass requires_grad=False.
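The difference can be checked on a CPU-only machine: a tensor wrapped as nn.Parameter (even with requires_grad=False) is registered by nn.Module and therefore travels with module-level .to()/.cuda(), while a bare tensor attribute is left behind. A minimal sketch (class name is illustrative; a dtype conversion stands in for a device move so it runs without a GPU):

```python
import torch
import torch.nn as nn

class WithParam(nn.Module):
    def __init__(self):
        super().__init__()
        # Registered as a parameter: moves with .to(device) / .cuda()
        self.table = nn.Parameter(torch.ones(8, 8), requires_grad=False)
        # Plain attribute: nn.Module does not track it, so it stays put
        self.plain = torch.ones(8, 8)

m = WithParam().to(torch.float64)  # .to() converts registered tensors only
print(m.table.dtype)  # torch.float64 -> the Parameter was converted
print(m.plain.dtype)  # torch.float32 -> the bare tensor was skipped
```

For non-trainable constants like this, `self.register_buffer('table', tensor)` is an alternative worth knowing: a buffer also moves with the module and is saved in state_dict, without showing up in parameters().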
class rgb_to_ycbcr_jpeg(nn.Module):
    def __init__(self):
        super(rgb_to_ycbcr_jpeg, self).__init__()
        matrix = np.array(
            [[0.299, 0.587, 0.114],
             [-0.168736, -0.331264, 0.5],
             [0.5, -0.418688, -0.081312]], dtype=np.float32).T
        self.shift = nn.Parameter(torch.tensor([0., 128., 128.]), requires_grad=False)
        self.matrix = nn.Parameter(torch.from_numpy(matrix), requires_grad=False)
With that done, calling .to(device) or .cuda() on the big enclosing class DiffIpred moves all the tensors defined inside it onto CUDA as well:
elif layer == 'H264':
    self.noise_layers = DiffIpred(opt['noise']['H264']['factor']).to(device)
One more note: for an externally passed-in parameter (say, a parameter matrix you defined yourself) to be copied to each card under multi-GPU parallelism, it must likewise be wrapped in torch.nn.Parameter().
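This can be verified without multiple GPUs: nn.DataParallel replicates exactly what the module registers, so it is enough to check that the wrapped matrix shows up in parameters()/state_dict(). A sketch under that assumption (class name is illustrative):

```python
import torch
import torch.nn as nn

class Watermark(nn.Module):
    def __init__(self, pattern):
        super().__init__()
        # Wrapping the external matrix registers it, so DataParallel
        # will replicate it onto every card along with the weights.
        self.pattern = nn.Parameter(pattern, requires_grad=False)

    def forward(self, x):
        return x + self.pattern

m = Watermark(torch.randn(4, 4))
print('pattern' in m.state_dict())                   # True
print(any(p is m.pattern for p in m.parameters()))   # True
```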
Now for the second problem: how to give cuda:1 a copy of the tensors living on cuda:0.
The scenario I ran into: I had initialized a number of classes and wanted to invoke several of them at once. To avoid writing a pile of if/else branches I put them in a dict, and as a result, under multi-GPU parallelism, the custom nn.Parameter() parameters inside those modules were not replicated to all the cards.
Before the fix it looked like this:
class Noise_pool(nn.Module):
    def __init__(self, opt, device):
        super(Noise_pool, self).__init__()
        .........
        self.Hue = ColorJitter(opt, distortion='Hue')
        self.Rotation = kornia.augmentation.RandomRotation(degrees=opt['noise']['Rotation']['degrees'], p=opt['noise']['Rotation']['p'], keepdim=True)
        self.Affine = kornia.augmentation.RandomAffine(degrees=opt['noise']['Affine']['degrees'], translate=opt['noise']['Affine']['translate'], scale=opt['noise']['Affine']['scale'], shear=opt['noise']['Affine']['shear'], p=opt['noise']['Affine']['p'], keepdim=True)
        #
        self.Cropout = Cropout(opt)
        self.Dropout = Dropout(opt)
        # This dict below is the culprit: parallelism fails at init time!
        self.noise_pool_in1 = {'Identity': self.Identity, 'JpegTest': self.JpegTest, 'Jpeg': self.Jpeg,
                               'JpegMask': self.JpegMask, 'DiffJPEG': self.DiffJPEG, 'Crop': self.Crop, 'Resize': self.Resize,
                               'GaussianBlur': self.GaussianBlur, 'Salt_Pepper': self.Salt_Pepper, 'GaussianNoise': self.GaussianNoise,
                               'Brightness': self.Brightness, 'Contrast': self.Contrast, 'Saturation': self.Saturation,
                               'Hue': self.Hue, 'Rotation': self.Rotation, 'Affine': self.Affine}
        self.noise_pool_in2 = {'Cropout': self.Cropout, 'Dropout': self.Dropout}

    def forward(self, encoded, cover_img, noise_choice):
        .........
Solution:
Define the "dict" as a standalone method, and do not build it in __init__; call it directly in forward when you need it. The reason is probably that wrapping the submodules in a plain dict changes their status as nn.Module attributes, so at init time they are judged to be parameters that do not need to be parallelized.
def forward(self, encoded, cover_img, noise_choice):
    return self.noise_pool_in1()[noise_choice](encoded)

def noise_pool_in2(self):
    return {'Cropout': self.Cropout, 'Dropout': self.Dropout}

def noise_pool_in1(self):
    ......
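For reference, torch.nn.ModuleDict solves the same problem while keeping the dict in __init__: unlike a plain Python dict, it registers every value as a submodule, so .to(device) and DataParallel see them all. A minimal sketch with stand-in noise layers (class and key names are illustrative, not the ones from the project above):

```python
import torch
import torch.nn as nn

class NoisePool(nn.Module):
    def __init__(self):
        super().__init__()
        # ModuleDict registers each value as a submodule; a plain dict would not
        self.pool = nn.ModuleDict({
            'Identity': nn.Identity(),
            'Blur': nn.AvgPool2d(3, stride=1, padding=1),  # stand-in noise layer
        })

    def forward(self, x, noise_choice):
        return self.pool[noise_choice](x)

m = NoisePool()
# Both entries appear as registered submodules, so device moves reach them
print([name for name, _ in m.named_modules()])

x = torch.randn(1, 3, 8, 8)
print(m(x, 'Identity').shape)  # torch.Size([1, 3, 8, 8])
```

nn.ParameterDict plays the same role when the values are raw nn.Parameter tensors rather than modules.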