The softmax function and its parameters (x, dim = -1, 0, 1, 2)

Introduction to the softmax function

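The softmax function is often used to predict the probabilities associated with a Multinoulli (categorical) distribution. It is defined as:
$\mathrm{softmax}(x_i) = \frac{\exp(x_i)}{\sum_{j=1}^{n} \exp(x_j)}$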

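In mathematics, especially in probability theory and related fields, the softmax function, also called the normalized exponential function, is a generalization of the logistic function. It "squashes" a K-dimensional vector z of arbitrary real numbers into another K-dimensional real vector σ(z) in which every element lies in the interval (0, 1) and all elements sum to 1.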
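As a quick check of these two properties, here is a minimal sketch that applies the definition directly to a small vector. The helper name `softmax_vec` and the sample values are illustrative assumptions, not taken from the article.

```python
import numpy as np

def softmax_vec(z):
    """Softmax of a 1-D vector, computed straight from the definition."""
    z = np.asarray(z, dtype=np.float64)
    e = np.exp(z)
    return e / e.sum()

z = np.array([-1.0, 0.0, 3.0])   # arbitrary real inputs, negative values included
p = softmax_vec(z)
print(p)         # every entry lies strictly between 0 and 1
print(p.sum())   # the entries sum to 1, up to floating-point rounding
```

The largest input receives the largest probability, and the ordering of the inputs is preserved in the outputs.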

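Softmax can be used as the output layer of a neural network for multi-class classification. A sigmoid (logistic) output handles only binary classification; for multi-class problems one can use multinomial logistic regression, which is also known as softmax regression. A softmax output layer paired with a log-likelihood (cross-entropy) cost also avoids the learning slowdown problem, and this combination is what is usually meant when softmax is described as a loss function.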
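The link between the binary and the multi-class case can be sketched in a few lines: a softmax over the two scores (t, 0) gives the same probability as the logistic function applied to t. The same sketch shows one common way to turn softmax into a loss, namely the negative log-probability of the correct class. The function names and sample scores below are illustrative assumptions.

```python
import numpy as np

def sigmoid(t):
    """Logistic function."""
    return 1.0 / (1.0 + np.exp(-t))

def softmax_vec(z):
    """Softmax of a 1-D vector, with the max subtracted for stability."""
    z = np.asarray(z, dtype=np.float64)
    e = np.exp(z - z.max())
    return e / e.sum()

def cross_entropy(z, target):
    """Negative log-probability that softmax assigns to class `target`."""
    return -np.log(softmax_vec(z)[target])

# Two-class softmax over the scores (t, 0) reduces to the logistic function.
t = 0.7
two_class = softmax_vec([t, 0.0])
print(two_class[0], sigmoid(t))

# Softmax combined with cross-entropy as a loss for three classes.
logits = np.array([2.0, 0.5, -1.0])
loss_good = cross_entropy(logits, 0)   # correct class has the highest score
loss_bad = cross_entropy(logits, 2)    # correct class has the lowest score
print(loss_good, loss_bad)
```

The loss is small when the correct class already has the highest score and grows as that score falls behind the others.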

Writing y = softmax(z), the derivative of the softmax function is:
$\frac{\partial y_i}{\partial z_j} = \begin{cases} y_i(1-y_i) & \text{if } i = j \\ -y_i y_j & \text{if } i \neq j \end{cases}$
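Both cases of the derivative can be written as one matrix, J = diag(y) - y yᵀ. The sketch below builds that Jacobian and compares it with a central finite-difference estimate. The function names, the step size `eps` and the sample input are assumptions chosen for illustration.

```python
import numpy as np

def softmax_vec(z):
    z = np.asarray(z, dtype=np.float64)
    e = np.exp(z - z.max())
    return e / e.sum()

def softmax_jacobian(z):
    """J[i, j] = dy_i/dz_j: y_i(1 - y_i) on the diagonal, -y_i y_j elsewhere."""
    y = softmax_vec(z)
    return np.diag(y) - np.outer(y, y)

def numeric_jacobian(z, eps=1e-6):
    """Central finite-difference estimate of the same Jacobian."""
    z = np.asarray(z, dtype=np.float64)
    n = z.size
    J = np.zeros((n, n))
    for j in range(n):
        step = np.zeros(n)
        step[j] = eps
        J[:, j] = (softmax_vec(z + step) - softmax_vec(z - step)) / (2 * eps)
    return J

z = np.array([0.5, -1.0, 2.0])
J = softmax_jacobian(z)
print(J)
print(np.allclose(J, numeric_jacobian(z), atol=1e-8))
```

Each column of J sums to zero, which reflects the fact that the outputs always sum to 1: raising one input can only move probability between classes.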

softmax function code

softmax(x, dim = -1, 0, 1, 2) [parameter details]
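The `dim` argument (as in PyTorch's `torch.softmax(input, dim)`) selects the axis along which the values are normalized. The sketch below mimics that behavior in NumPy through an `axis` parameter; it assumes a 3-D input so that all four values -1, 0, 1 and 2 are valid, and the helper name `softmax_along` is hypothetical.

```python
import numpy as np

def softmax_along(x, axis=-1):
    """Softmax along one axis of an N-D array (NumPy analogue of `dim`)."""
    x = np.asarray(x, dtype=np.float64)
    shifted = x - np.max(x, axis=axis, keepdims=True)   # for numerical stability
    e = np.exp(shifted)
    return e / np.sum(e, axis=axis, keepdims=True)

x = np.arange(24, dtype=np.float64).reshape(2, 3, 4)
for axis in (0, 1, 2, -1):
    y = softmax_along(x, axis=axis)
    # The shape never changes; only the axis that sums to 1 differs.
    print(axis, y.shape, np.allclose(y.sum(axis=axis), 1.0))
```

For a 3-D array, axis -1 and axis 2 refer to the same (last) axis, so they give identical results.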

NumPy implementation

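import numpy as np

# Per-column softmax: normalizes along axis=1. For a 3-D input such as
# (C, H, W), axis=1 is the row index, so each column of every slice sums to 1.
def softmax2d_cols(x):
    max_ = np.max(x, axis=1, keepdims=True)   # subtract the max for numerical stability
    e_x = np.exp(x - max_)
    total = np.sum(e_x, axis=1, keepdims=True)
    return e_x / total

# Per-row softmax: normalizes along axis=2, so x must have at least 3 dimensions.
# For a (C, H, W) input, each row of every slice sums to 1.
def softmax2d_rows(x):
    max_ = np.max(x, axis=2, keepdims=True)   # subtract the max for numerical stability
    e_x = np.exp(x - max_)
    total = np.sum(e_x, axis=2, keepdims=True)
    return e_x / total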


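def softmax(f):
    # Convert to float first: subtracting the max from a uint8 array would wrap
    # around, and an in-place subtraction would modify the caller's array.
    f = np.asarray(f, dtype=np.float64)
    e_f = np.exp(f - np.max(f))
    return e_f / np.sum(e_f)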
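Subtracting the maximum before exponentiating does not change the result, because softmax is invariant to adding a constant to every input, but it keeps `np.exp` from overflowing. A minimal sketch of the difference follows; the names `softmax_naive` and `softmax_stable` and the sample inputs are illustrative assumptions.

```python
import numpy as np

def softmax_naive(f):
    """Softmax without the max shift; overflows for large inputs."""
    e = np.exp(np.asarray(f, dtype=np.float64))
    return e / e.sum()

def softmax_stable(f):
    """Softmax with the max subtracted first."""
    f = np.asarray(f, dtype=np.float64)
    e = np.exp(f - f.max())
    return e / e.sum()

big = np.array([1000.0, 1001.0, 1002.0])
with np.errstate(over="ignore", invalid="ignore"):
    naive = softmax_naive(big)     # exp(1000) overflows to inf, so inf / inf gives nan
stable = softmax_stable(big)       # finite, well-defined probabilities

small = np.array([0.0, 1.0, 2.0])  # the same inputs shifted down by 1000
print(naive)
print(stable)
print(np.allclose(stable, softmax_stable(small)))
```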

C++ implementation

#include <algorithm>  // std::max_element
#include <cmath>      // std::exp
#include <vector>
#include "common.hpp"

// ========================= Activation Function: softmax =====================
template<typename _Tp>
int activation_function_softmax(const _Tp* src, _Tp* dst, int length)
{
	// Subtract the maximum before exponentiating to avoid overflow.
	const _Tp alpha = *std::max_element(src, src + length);
	_Tp denominator{ 0 };

	for (int i = 0; i < length; ++i) {
		dst[i] = std::exp(src[i] - alpha);
		denominator += dst[i];
	}

	// Normalize so that the outputs sum to 1.
	for (int i = 0; i < length; ++i) {
		dst[i] /= denominator;
	}

	return 0;
}

Computing softmax directly on vector values

def chw2hwc(vector):
    """
    Convert a flat CHW vector into an HWC image.
    `channel`, `height` and `width` are module-level globals, set as shown below.

    ### Original image (BGR)
    imgbgr = np.array([
        [[12, 10, 14], [18, 0, 14], [54, 23, 65], [54, 32, 65]],
        [[11, 23, 16], [25, 23, 19], [12, 9, 14], [21, 65, 65]],
        [[36, 15, 47], [52, 7, 14], [74, 23, 85], [54, 32, 65]],
        [[12, 3, 14], [12, 7, 14], [12, 69, 89], [54, 32, 65]],
    ], dtype=np.uint8)
    height, width, channel = imgbgr.shape

    ### HWC -> CHW (imgbgr.transpose((2, 0, 1)) does this directly; the reverse,
    ### CHW -> HWC, is transpose((1, 2, 0))). Written out here to show the data flow.
    b, g, r = cv2.split(imgbgr)
    list_arr = np.array([b, g, r]).reshape(-1)

    ### The resulting CHW vector
    vector = [12 18 54 54 11 25 12 21 36 52 74 54 12 12 12 54
              10  0 23 32 23 23  9 65 15  7 23 32  3  7 69 32
              14 14 65 65 16 19 14 65 47 14 85 65 14 14 89 65]
    """
    assert len(vector) == channel * height * width
    dst_mat = np.zeros((height, width, channel), dtype=np.uint8)
    for i in range(height * width):
        # Pixel i sits at row i // width, column i % width; each channel
        # occupies one contiguous block of height * width values.
        dst_mat[i//width, i % width, 0] = vector[0 * height * width + i]
        dst_mat[i//width, i % width, 1] = vector[1 * height * width + i]
        dst_mat[i//width, i % width, 2] = vector[2 * height * width + i]
    return dst_mat

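def chw2hwc_and_softmax2(vector):
    # Cast to float so that uint8 pixel values do not wrap around inside softmax.
    vector = np.asarray(vector, dtype=np.float64).reshape((channel, height * width))
    vector = vector.T              # shape (height * width, channel): one row per pixel
    for pixel in vector:
        print(softmax(pixel))      # softmax across the channels of one pixel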
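The per-pixel loop above can also be expressed without a Python loop, by reshaping the flat CHW vector and taking the softmax along the channel axis. This is a sketch under assumptions: the function name `softmax_over_channels` is hypothetical, the dimensions are passed in explicitly rather than read from globals, and the sample vector holds only the top-left 2 x 2 pixels of the example image.

```python
import numpy as np

def softmax_over_channels(vector, channel, height, width):
    """Softmax across channels for every pixel of a flat CHW vector."""
    chw = np.asarray(vector, dtype=np.float64).reshape(channel, height, width)
    shifted = chw - chw.max(axis=0, keepdims=True)   # per-pixel max over channels
    e = np.exp(shifted)
    probs = e / e.sum(axis=0, keepdims=True)
    return probs.transpose(1, 2, 0)                  # CHW -> HWC

channel, height, width = 3, 2, 2
# Flat CHW layout: all B values first, then all G values, then all R values.
vector = np.array([12, 18, 11, 25,
                   10,  0, 23, 23,
                   14, 14, 16, 19], dtype=np.uint8)

out = softmax_over_channels(vector, channel, height, width)
print(out.shape)
print(out[0, 0])   # probabilities for pixel (0, 0), whose BGR values are [12, 10, 14]
```

Casting to `float64` before subtracting matters here, since the input is `uint8` and unsigned subtraction would wrap around instead of going negative.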