The following is the code I wrote to calculate perplexity (PPL).
It is based on the following definition of perplexity. (This is not the textbook form of the definition, but the form commonly used in experiments.)
Source: https://stackoverflow.com/questions/61988776/how-to-calculate-perplexity-for-a-language-model-using-pytorch
$$PPL = e^{cross\_entropy}$$
where $cross\_entropy$ is the cross-entropy loss. In other words, compute the cross-entropy loss and then apply exp() to it.
Note: the reduction parameter of F.cross_entropy must be mean. This is already the default, so it does not need to be set explicitly.
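To make the definition concrete, here is a minimal pure-Python sketch of the same formula, written without PyTorch. It assumes we already have the probability the model assigned to each correct token; the function name perplexity_from_probs and the example probabilities are made up for illustration only.

```python
import math


def perplexity_from_probs(target_probs):
    """Perplexity from the probabilities assigned to each correct token."""
    # Cross-entropy with mean reduction: the average negative log-likelihood.
    cross_entropy = -sum(math.log(p) for p in target_probs) / len(target_probs)
    # PPL = e^{cross_entropy}
    return math.exp(cross_entropy)


# A model that is uniform over 4 tokens gives every target probability 0.25,
# so its perplexity is (up to rounding) 4, the number of choices.
uniform_ppl = perplexity_from_probs([0.25, 0.25, 0.25, 0.25])

# A more confident model (hypothetical probabilities) gets a lower perplexity.
confident_ppl = perplexity_from_probs([0.9, 0.8, 0.95, 0.7])

print(uniform_ppl, confident_ppl)
```

This is also why the mean reduction matters: with sum reduction the exponent would grow with the number of tokens, and the result would no longer be comparable across sequences of different length.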
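import torch
import torch.nn.functional as F
from torch import Tensor


def perplexity(outputs: Tensor, targets: Tensor, config=None) -> Tensor:
    """
    Compute the perplexity of a language model.
    :param outputs: logits, shape [batch_size, seq_len, vocab_size]
    :param targets: target token ids, shape [batch_size, seq_len]
    :param config: configuration object, default: None
    :return: perplexity value (scalar tensor)
    """
    # ignore_index must be an int; -100 is the default used by F.cross_entropy.
    ignore_index = config.data.pad_id if config is not None else -100
    # Flatten to [batch_size * seq_len, vocab_size] and [batch_size * seq_len].
    ce = F.cross_entropy(outputs.reshape(-1, outputs.size(-1)), targets.reshape(-1),
                         ignore_index=ignore_index)
    return torch.exp(ce)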
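As a quick sanity check of why the padding id is passed as ignore_index, the sketch below compares perplexity with and without ignoring padded positions. Everything in it is hypothetical: PAD_ID stands in for config.data.pad_id, and the logits are hand-built so that the "model" always favors the padding token.

```python
import torch
import torch.nn.functional as F

PAD_ID = 0       # hypothetical padding id, standing in for config.data.pad_id
VOCAB_SIZE = 5

# Hand-built logits [batch_size, seq_len, vocab_size] that always favor PAD_ID.
logits = torch.zeros(2, 3, VOCAB_SIZE)
logits[..., PAD_ID] = 2.0

# Target ids [batch_size, seq_len]; half of the positions are padding.
targets = torch.tensor([[1, 2, PAD_ID],
                        [3, PAD_ID, PAD_ID]])

flat_logits = logits.reshape(-1, VOCAB_SIZE)
flat_targets = targets.reshape(-1)

# Padding counted as ordinary targets: the easy PAD predictions lower the loss.
ppl_with_pad = torch.exp(F.cross_entropy(flat_logits, flat_targets))

# Padding ignored: the mean is taken over the real tokens only.
ppl_no_pad = torch.exp(F.cross_entropy(flat_logits, flat_targets, ignore_index=PAD_ID))

print(ppl_with_pad.item(), ppl_no_pad.item())
```

In this toy setup the perplexity that counts padding comes out lower, because predicting the padding token is trivially easy. Ignoring padding avoids that overly optimistic number.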