(1) TensorFlow "Data loss: corrupted record at 443118": possibly caused by the tfrecord file

https://github.com/tensorflow/tensorflow/issues/13463
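This issue collects a wide range of causes and proposed fixes.
The error may be caused by the tfrecord file itself.
Some people even attribute it to the editor: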
https://blog.csdn.net/Rrui7739/article/details/81003577

The following approach got rid of the error for me by skipping the records that fail to read:

parsed_image_dataset = parsed_image_dataset.apply(tf.data.experimental.ignore_errors())
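Since ignore_errors() only skips bad records silently, it can help to find out where the file is actually damaged. Below is a minimal sketch in pure Python, assuming an uncompressed TFRecord file laid out as documented (8-byte little-endian length, 4-byte masked CRC-32C of the length, the payload, 4-byte masked CRC-32C of the payload). The function name first_corrupt_offset is hypothetical, and I am assuming the number in the error message (443118) is a byte offset comparable to what this returns.

```python
import struct

# CRC-32C (Castagnoli) lookup table, reflected polynomial 0x82F63B78.
_CRC_TABLE = []
for _i in range(256):
    _crc = _i
    for _ in range(8):
        _crc = (_crc >> 1) ^ 0x82F63B78 if _crc & 1 else _crc >> 1
    _CRC_TABLE.append(_crc)


def crc32c(data: bytes) -> int:
    """Plain CRC-32C of data."""
    crc = 0xFFFFFFFF
    for byte in data:
        crc = _CRC_TABLE[(crc ^ byte) & 0xFF] ^ (crc >> 8)
    return crc ^ 0xFFFFFFFF


def masked_crc(data: bytes) -> int:
    """CRC-32C with the rotate-and-add mask used by the TFRecord format."""
    crc = crc32c(data)
    rotated = ((crc >> 15) | (crc << 17)) & 0xFFFFFFFF
    return (rotated + 0xA282EAD8) & 0xFFFFFFFF


def first_corrupt_offset(path):
    """Return the byte offset of the first bad record, or None if all pass."""
    with open(path, "rb") as f:
        offset = 0
        while True:
            header = f.read(12)
            if not header:
                return None  # clean end of file
            if len(header) < 12:
                return offset  # truncated header
            length_bytes = header[:8]
            (length_crc,) = struct.unpack("<I", header[8:])
            if masked_crc(length_bytes) != length_crc:
                return offset  # length field is damaged
            (length,) = struct.unpack("<Q", length_bytes)
            data = f.read(length)
            footer = f.read(4)
            if len(data) < length or len(footer) < 4:
                return offset  # truncated payload or footer
            if masked_crc(data) != struct.unpack("<I", footer)[0]:
                return offset  # payload is damaged
            offset += 12 + length + 4
```

Calling first_corrupt_offset("train.tfrecord") (the file name is a placeholder) on each input file shows whether the file itself is damaged, in which case regenerating it is probably a cleaner fix than ignoring errors at read time.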

However, a different error appeared right afterwards:

2021-10-26 22:05:03.367532: W tensorflow/core/framework/op_kernel.cc:1680] Invalid argument: required broadcastable shapes
Traceback (most recent call last):
  File "vae_no_r_12.py", line 579, in <module>
    VAER_GAN_instance.fit(parsed_image_dataset,epochs=epochs,callbacks=[MyPlotCallback_test(VAER_GAN_instance,a128batch),MyepochsaveCallback(save_dir,VAER_GAN_instance)])
  File "/home/dutxutengfei/anaconda3/envs/tensorflowpython37/lib/python3.7/site-packages/keras/engine/training.py", line 1184, in fit
    tmp_logs = self.train_function(iterator)
  File "/home/dutxutengfei/anaconda3/envs/tensorflowpython37/lib/python3.7/site-packages/tensorflow/python/eager/def_function.py", line 885, in __call__
    result = self._call(*args, **kwds)
  File "/home/dutxutengfei/anaconda3/envs/tensorflowpython37/lib/python3.7/site-packages/tensorflow/python/eager/def_function.py", line 917, in _call
    return self._stateless_fn(*args, **kwds)  # pylint: disable=not-callable
  File "/home/dutxutengfei/anaconda3/envs/tensorflowpython37/lib/python3.7/site-packages/tensorflow/python/eager/function.py", line 3040, in __call__
    filtered_flat_args, captured_inputs=graph_function.captured_inputs)  # pylint: disable=protected-access
  File "/home/dutxutengfei/anaconda3/envs/tensorflowpython37/lib/python3.7/site-packages/tensorflow/python/eager/function.py", line 1964, in _call_flat
    ctx, args, cancellation_manager=cancellation_manager))
  File "/home/dutxutengfei/anaconda3/envs/tensorflowpython37/lib/python3.7/site-packages/tensorflow/python/eager/function.py", line 596, in call
    ctx=ctx)
  File "/home/dutxutengfei/anaconda3/envs/tensorflowpython37/lib/python3.7/site-packages/tensorflow/python/eager/execute.py", line 60, in quick_execute
    inputs, attrs, num_outputs)
tensorflow.python.framework.errors_impl.InvalidArgumentError:  required broadcastable shapes
	 [[node add_6 (defined at vae_no_r_12.py:514) ]] [Op:__inference_train_function_7743]

Errors may have originated from an input operation.
Input Source operations connected to node add_6:
 binary_crossentropy_6/weighted_loss/Mul (defined at vae_no_r_12.py:507)	
 binary_crossentropy_5/weighted_loss/Mul (defined at vae_no_r_12.py:506)

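For this error, I tried changing batch_size to a divisor of the dataset size, and at first that seemed to work... but no, it still fails. The error appears to occur at random, after a different number of epochs each time.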
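One possible explanation, which the source does not confirm, is that the last batch is smaller than the rest: once ignore_errors() drops corrupted records, the number of usable records is no longer the number that batch_size was chosen to divide. The sketch below only illustrates that arithmetic. The helper batch_sizes and the record counts are hypothetical, not taken from the original script.

```python
def batch_sizes(num_records, batch_size, drop_remainder=False):
    """Sizes of the batches produced from num_records examples."""
    full, rest = divmod(num_records, batch_size)
    sizes = [batch_size] * full
    if rest and not drop_remainder:
        sizes.append(rest)  # final partial batch
    return sizes


# Hypothetical numbers: 1280 records on disk, batch size 128.
print(batch_sizes(1280, 128)[-1])  # 128: every batch is full

# One corrupted record is skipped, so 1279 records remain.
print(batch_sizes(1279, 128)[-1])  # 127: the last batch is short

# Dropping the remainder keeps all batches the same size.
print(batch_sizes(1279, 128, drop_remainder=True)[-1])  # 128
```

In tf.data the corresponding option is the drop_remainder argument of Dataset.batch, for example dataset.batch(batch_size, drop_remainder=True). Whether that removes the "required broadcastable shapes" error here depends on whether the model code really assumes a fixed batch size, which I have not verified.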

parsed_image_dataset = parsed_image_dataset.shard(3,1)
After adding this line, the error now always occurs on the 46th data item:

tensorflow.python.framework.errors_impl.InvalidArgumentError:  required broadcastable shapes
         [[node add_6 (defined at /vae-gan/vae_no_r_16_20211125.py:493) ]] [Op:__inference_train_function_7578]

Errors may have originated from an input operation.
Input Source operations connected to node add_6:
 binary_crossentropy_6/weighted_loss/Mul (defined at /vae-gan/vae_no_r_16_20211125.py:486)
 binary_crossentropy_5/weighted_loss/Mul (defined at /vae-gan/vae_no_r_16_20211125.py:485)

Function call stack:
train_function


Reposted from blog.csdn.net/qq_44065334/article/details/120980014