PyTorch DataLoader cannot be iterated

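This one is rather strange. Normally, whether you use a dataset you defined yourself or one of the built-in dataset utilities, wrapping it in a DataLoader gives you an iterable object.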

In my case I was sure the dataset could be printed, and constructing the DataLoader worked as well, yet iterating over the dataloader failed:

from torch.utils.data import DataLoader
dataloader = DataLoader(dataset, batch_size=123, shuffle=True, num_workers=6, drop_last=True)
for i in dataloader:
    print(123)

This raises an error similar to the one in this issue: https://github.com/AliaksandrSiarohin/first-order-model/issues/197

I can no longer find my own error output, so here is the traceback copied directly from the link above:

RuntimeError                              Traceback (most recent call last)
/usr/local/lib/python3.6/dist-packages/imageio/plugins/ffmpeg.py in _read_frame_data(self)
   620                     raise RuntimeError(
--> 621                         "Frame is %i bytes, but expected %i." % (len(s), framesize)
   622                     )

RuntimeError: Frame is 0 bytes, but expected 12288.

During handling of the above exception, another exception occurred:

CannotReadFrameError                      Traceback (most recent call last)
5 frames
/usr/local/lib/python3.6/dist-packages/imageio/plugins/ffmpeg.py in _read_frame_data(self)
   626                 err2 = self._stderr_catcher.get_text(0.4)
   627                 fmt = "Could not read frame %i:\n%s\n=== stderr ===\n%s"
--> 628                 raise CannotReadFrameError(fmt % (self._pos, err1, err2))
   629             return s, is_new
   630 

CannotReadFrameError: Could not read frame 1178:
Frame is 0 bytes, but expected 12288.
=== stderr ===
ffmpeg version 3.4.6-0ubuntu0.18.04.1 Copyright (c) 2000-2019 the FFmpeg developers
 built with gcc 7 (Ubuntu 7.3.0-16ubuntu3)
 configuration: --prefix=/usr --extra-version=0ubuntu0.18.04.1 --toolchain=hardened --libdir=/usr/lib/x86_64-linux-gnu --incdir=/usr/include/x86_64-linux-gnu --enable-gpl --disable-stripping --enable-avresample --enable-avisynth --enable-gnutls --enable-ladspa --enable-libass --enable-libbluray --enable-libbs2b --enable-libcaca --enable-libcdio --enable-libflite --enable-libfontconfig --enable-libfreetype --enable-libfribidi --enable-libgme --enable-libgsm --enable-libmp3lame --enable-libmysofa --enable-libopenjpeg --enable-libopenmpt --enable-libopus --enable-libpulse --enable-librubberband --enable-librsvg --enable-libshine --enable-libsnappy --enable-libsoxr --enable-libspeex --enable-libssh --enable-libtheora --enable-libtwolame --enable-libvorbis --enable-libvpx --enable-libwavpack --enable-libwebp --enable-libx265 --enable-libxml2 --enable-libxvid --enable-libzmq --enable-libzvbi --enable-omx --enable-openal --enable-opengl --enable-sdl2 --enable-libdc1394 --enable-libdrm --enable-libiec61883 --enable-chromaprint --enable-frei0r --enable-libopencv --enable-libx264 --enable-shared
 libavutil      55. 78.100 / 55. 78.100
 libavcodec     57.107.100 / 57.107.100
 libavformat    57. 83.100 / 57. 83.100
 libavdevice    57. 10.100 / 57. 10.100
 libavfilter     6.107.100 /  6.107.100
 libavresample   3.  7.  0 /  3.  7.  0
 libswscale      4.  8.100 /  4.  8.100
 libswresample   2.  9.100 /  2.  9.100
 libpostproc    54.  7.100 / 54.  7.100
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from '/content/gdrive/My Drive/first-order-motion-model/DBnuggest_1.mp4':
 Metadata:
   major_brand     : mp42
   minor_version   : 0
   compatible_brands: mp42mp41
   creation_time   : 2020-07-28T17:25:57.000000Z
 Duration: 00:00:19.71, start: 0.000000, bitrate: 368 kb/s
   Stream #0:0(eng): Video: h264 (Main) (avc1 / 0x31637661), yuv420p(tv, smpte170m), 64x64, 186 kb/s, 60 fps, 60 tbr, 60k tbn, 120 tbc (default)
   Metadata:
     creation_time   : 2020-07-28T17:25:58.000000Z
     handler_name    : Alias Data Handler
     encoder         : AVC Coding
   Stream #0:1(eng): Audio: aac (LC) (mp4a / 0x6134706D), 32000 Hz, mono, fltp, 158 kb/s (default)
   Metadata:
     creation_time   : 2020-07-28T17:25:58.000000Z
     handler_name    : Alias Data Handler

In short, the problem is in the dataset.

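Iterating over the dataloader is what actually loads the contents of the dataset. The data-reading code inside the dataset hit an exception, but that exception was not raised where I could see it. The linked issue handles this well: simply raise the exception.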
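The fix from the linked issue is not reproduced here. As a minimal sketch of the general idea, the reading step in `__getitem__` can be wrapped so that any failure is re-raised with the sample index and path attached. The names `FrameDataset`, `reader`, and `fake_reader` are hypothetical and used only for illustration; a real map-style dataset would subclass `torch.utils.data.Dataset`, which is left out so the sketch runs without PyTorch.

```python
class FrameDataset:
    """Hypothetical map-style dataset: only __len__ and __getitem__ are needed.

    In real code this would subclass torch.utils.data.Dataset.
    """

    def __init__(self, paths, reader):
        self.paths = list(paths)
        self.reader = reader  # hypothetical callable: path -> sample

    def __len__(self):
        return len(self.paths)

    def __getitem__(self, idx):
        path = self.paths[idx]
        try:
            return self.reader(path)
        except Exception as exc:
            # Re-raise instead of swallowing, and say which sample failed.
            raise RuntimeError(
                f"failed to load sample {idx} from {path!r}"
            ) from exc


def fake_reader(path):
    """Stand-in for a real video/image reader; fails on one file."""
    if path.endswith("broken.mp4"):
        raise ValueError("Frame is 0 bytes, but expected 12288.")
    return f"frames of {path}"


dataset = FrameDataset(["ok.mp4", "broken.mp4"], fake_reader)
print(dataset[0])
try:
    dataset[1]
except RuntimeError as err:
    print(err, "| cause:", repr(err.__cause__))
```

With the index and path in the message, the failure points at a specific file rather than at the dataloader as a whole.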

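To sum up: if you are using PyTorch and find that the dataloader cannot be used, consider that the data-loading step in the dataset may be raising an exception, and then either let that exception surface or fix its cause.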

It looks simple, but I really did not expect this kind of problem to hide behind the dataloader. It pays to be careful when defining your own dataset.

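At the root of it, part of the dataset's loading code was failing while the rest was fine. That is why I could print data from the dataset and still hit the error. Wherever loading can fail, raise an exception.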
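Since printing a few samples does not show that others are broken, one way to check is to index every sample once, outside the DataLoader, and collect the indices that raise. The following is a rough sketch under that assumption; `find_bad_indices` and `ToyDataset` are hypothetical names, and the toy class stands in for a real map-style dataset so the example runs without PyTorch.

```python
def find_bad_indices(dataset):
    """Index every sample once and collect (index, error) for those that raise."""
    bad = []
    for idx in range(len(dataset)):
        try:
            dataset[idx]
        except Exception as exc:
            bad.append((idx, repr(exc)))
    return bad


class ToyDataset:
    """Hypothetical stand-in for a dataset in which some samples are corrupt."""

    def __init__(self, size, corrupt):
        self.size = size
        self.corrupt = set(corrupt)

    def __len__(self):
        return self.size

    def __getitem__(self, idx):
        if idx in self.corrupt:
            raise RuntimeError("Frame is 0 bytes, but expected 12288.")
        return idx * 2


toy = ToyDataset(5, corrupt=[3])
print(toy[0])  # a healthy sample loads fine
bad = find_bad_indices(toy)
print([idx for idx, _ in bad])
```

Running the scan in the main process keeps each traceback attached to a concrete index, so the bad files can be repaired or left out before training.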

Reposted from blog.csdn.net/zhou_438/article/details/114435812