CUDA out of memory when setting output_hidden_states=True on a Hugging Face model

When using the Trainer provided by Hugging Face for model prediction, if the model is run with output_hidden_states=True, GPU memory usage keeps growing from batch to batch and eventually ends in a CUDA out of memory error.
Solution:

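In the model's final return value, set hidden_states to None. I have not confirmed the exact cause. A likely explanation is that Trainer accumulates every tensor the model returns across prediction batches on the GPU unless eval_accumulation_steps is set, so the per-layer hidden states pile up along with the logits.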
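The original screenshots of the fix are not available, so here is a minimal sketch of the idea, not the author's actual code. ClassifierOutput, backbone and forward are hypothetical stand-ins written with plain Python lists so the example runs without torch or transformers. In a real model the same pattern would apply inside forward, with the return value being a Hugging Face output class such as SequenceClassifierOutput.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class ClassifierOutput:
    """Hypothetical stand-in for a Hugging Face output class."""
    logits: List[float]
    hidden_states: Optional[Tuple[List[float], ...]] = None


def backbone(inputs: List[float]) -> ClassifierOutput:
    """Hypothetical stand-in for a model called with output_hidden_states=True."""
    layer1 = [x * 2.0 for x in inputs]
    layer2 = [x + 1.0 for x in layer1]
    return ClassifierOutput(logits=layer2, hidden_states=(layer1, layer2))


def forward(inputs: List[float]) -> ClassifierOutput:
    """Use the hidden states internally, but do not return them."""
    outputs = backbone(inputs)

    # Consume the hidden states here (toy feature: mean of the last layer).
    last_layer = outputs.hidden_states[-1]
    pooled = sum(last_layer) / len(last_layer)
    logits = [pooled]

    # Build a fresh return value with hidden_states=None, so a caller that
    # collects outputs batch after batch only keeps the logits.
    return ClassifierOutput(logits=logits, hidden_states=None)


if __name__ == "__main__":
    out = forward([1.0, 2.0, 3.0])
    print(out.logits, out.hidden_states)
```

The sketch builds a new output object rather than mutating the one returned by the backbone, so no reference to the hidden states survives in the returned value. If editing the model is not an option, Trainer.predict and Trainer.evaluate accept an ignore_keys argument, and TrainingArguments has an eval_accumulation_steps setting. Either may help with the same symptom, though neither was tested in the original post.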



Origin blog.csdn.net/q506610466/article/details/127195815