The Softmax numerical overflow problem

Copyright notice: this is an original article by the blogger and may not be reproduced without the blogger's permission. https://blog.csdn.net/Richard__Ting/article/details/87924416

1. Definition of Softmax (normalized exponential function)

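Softmax is also called the normalized exponential function. It maps a vector of K real numbers to a vector of K positive numbers that sum to 1: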

$$\sigma: \boldsymbol{R}^K \to \{ z \in \boldsymbol{R}^K \mid z_i > 0, \sum\limits_{i=1}^K z_i = 1 \}$$

$$\sigma(z)_j = \frac{e^{z_j}}{\sum\limits_{k=1}^K e^{z_k}}, \quad \text{for } j = 1, \dots, K$$

2. A simple Python implementation

>>> import numpy as np
>>> def softmax(x):
...     return np.exp(x)/sum(np.exp(x))
... 
>>> softmax([1,2,3])
array([ 0.09003057,  0.24472847,  0.66524096])
>>> softmax([10,20,30])
array([  2.06106005e-09,   4.53978686e-05,   9.99954600e-01])

3. The overflow problem

>>> softmax([100,200,300])
array([  1.38389653e-87,   3.72007598e-44,   1.00000000e+00])
>>> softmax([1000,2000,3000])
__main__:2: RuntimeWarning: overflow encountered in exp
__main__:2: RuntimeWarning: invalid value encountered in true_divide
array([ nan,  nan,  nan])
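
The nan values have a simple cause: in float64, np.exp overflows to inf once its argument exceeds roughly 709.78, and inf divided by inf is nan. The following is a minimal sketch to check this, assuming NumPy's default float64; the variable names are illustrative only.

```python
import numpy as np

# The largest float64 is about 1.8e308, so exp(x) overflows once x
# exceeds log(largest float64), which is roughly 709.78.
limit = np.log(np.finfo(np.float64).max)

# Silence the overflow/invalid warnings so only the values are inspected.
with np.errstate(over="ignore", invalid="ignore"):
    below = np.exp(709.0)   # still a finite number
    above = np.exp(710.0)   # overflows to inf
    ratio = above / above   # inf / inf is nan, as in the softmax output above

print(limit, below, above, ratio)
```

So for the input [1000, 2000, 3000], every exponential is already inf before the division happens.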

4. Solution

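>>> def softmax(x):
...     m = np.max(x)  # largest input; avoids shadowing the built-in max
...     e = np.exp(x - m)  # every exponent is now <= 0, so exp cannot overflow
...     return e / np.sum(e)
... 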
>>> softmax([1000,2000,3000])
array([ 0.,  0.,  1.])
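
The function above handles a single vector. In practice softmax is often applied to each row of a batch, so here is a hedged sketch of the same max-subtraction trick along an axis. The name softmax_axis and its axis parameter are assumptions made for illustration, not part of the original post.

```python
import numpy as np

def softmax_axis(x, axis=-1):
    """Stable softmax along `axis` (hypothetical helper, not from the post)."""
    x = np.asarray(x, dtype=np.float64)
    # Subtract the maximum along the axis so the largest exponent is exp(0) = 1.
    shifted = x - np.max(x, axis=axis, keepdims=True)
    e = np.exp(shifted)
    # keepdims=True lets the sums broadcast back against each row.
    return e / np.sum(e, axis=axis, keepdims=True)

batch = [[1, 2, 3], [1000, 2000, 3000]]
probs = softmax_axis(batch)
print(probs)
```

Each row is shifted by its own maximum, so a row of large values does not affect the precision of the other rows.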

5. Why it works

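One identity explains it. For any constant m:

$$\frac{e^{x_i}}{\sum\limits_{j} e^{x_j}} = \frac{e^{-m}}{e^{-m}} \cdot \frac{e^{x_i}}{\sum\limits_{j} e^{x_j}} = \frac{e^{x_i - m}}{\sum\limits_{j} e^{x_j - m}}$$

Choosing m as the largest x_j makes every exponent at most 0, so no term can overflow. The largest term is e^0 = 1, so the denominator is at least 1 and the division is always well defined.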
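
The identity can also be checked numerically on inputs small enough that the unshifted version does not overflow. This is a small sketch, with softmax_naive as an illustrative name for the unshifted implementation from section 2.

```python
import numpy as np

def softmax_naive(x):
    # Direct translation of the definition, with no shift applied.
    e = np.exp(x)
    return e / np.sum(e)

x = np.array([1.0, 2.0, 3.0])
m = np.max(x)

# Shifting by the maximum leaves the result unchanged (exponents become -2, -1, 0).
same_max = np.allclose(softmax_naive(x), softmax_naive(x - m))

# The identity holds for any constant, not only the maximum.
same_other = np.allclose(softmax_naive(x), softmax_naive(x - 10.0))

print(same_max, same_other)
```

Any constant gives the same result mathematically; the maximum is the choice that also guarantees the exponentials stay within range.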

