Softmax and sigmoid explained in detail

Introduction to activation functions

For those who work in or follow the artificial intelligence field, the two activation functions softmax and sigmoid should be familiar. They are used not only in logistic regression but also come up regularly in interviews and written examinations. Mastering these two activation functions and their derivatives is essential and fundamental. The following sections describe softmax and sigmoid in detail:

softmax function

In mathematics, the softmax function, also known as the normalized exponential function, is a generalization of the logistic function. It "compresses" a k-dimensional vector z of arbitrary real numbers into another k-dimensional vector σ(z), so that each element lies in the range (0, 1) and all elements sum to 1.

Properties of softmax function

The formula of the softmax function is:
F(x_i) = \frac{exp(x_i)}{\sum_{i=0}^{k} exp(x_i)} \quad (i = 0, 1, 2, \ldots, k)
  x: the input data;
  exp: the exponential operation;
  F(x): the function output;
  every x value is mapped into the interval (0, 1);
  the mapped values sum to 1.
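
A small numeric check of these two properties (the input values below are chosen purely for illustration):

import numpy as np

x = np.array([1.0, 2.0, 3.0])
s = np.exp(x) / np.exp(x).sum()   # softmax of x
print(s)                          # approximately [0.0900, 0.2447, 0.6652], each in (0, 1)
print(s.sum())                    # 1.0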

Use of softmax function

  • Used in multi-class logistic regression models.
  • When building a neural network, the softmax function can be used as a layer; in practice it follows the final fully connected layer and maps the network's computed scores into the interval (0, 1), giving each category a probability (a minimal sketch follows this list).
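
A minimal sketch of that last step, a fully connected layer followed by softmax, with hypothetical layer sizes and randomly initialized weights:

import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(size=(3, 4))              # hypothetical weights: 3 classes, 4 input features
b = np.zeros(3)                          # hypothetical biases
x = np.array([0.5, -1.2, 3.0, 0.7])      # one input sample

logits = W @ x + b                       # output of the fully connected layer
probs = np.exp(logits - logits.max())    # subtract the max to keep exp from overflowing
probs /= probs.sum()                     # softmax: one probability per category

print(probs, probs.sum())                # probabilities lie in (0, 1) and sum to 1
print("predicted class:", probs.argmax())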

Implementation code of softmax

python implementation

import numpy as np

def softmax(x):
    orig_shape = x.shape
    # handle matrix and vector inputs separately
    if len(x.shape) > 1:
        # matrix: subtract each row's maximum to keep exp from overflowing
        constant_shift = np.max(x, axis=1).reshape(-1, 1)
        x = x - constant_shift
        # exponentiate
        x = np.exp(x)
        # sum over each row
        normalizer = np.sum(x, axis=1).reshape(-1, 1)
        # divide to obtain the softmax probabilities
        x = x / normalizer
    else:
        # vector: same steps with a scalar shift and sum
        constant_shift = np.max(x)
        x = x - constant_shift
        x = np.exp(x)
        normalizer = np.sum(x)
        x = x / normalizer
    assert x.shape == orig_shape
    return x
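
A quick usage check of the function above (example values chosen for illustration); each row of a matrix input, or the whole vector, becomes a probability distribution:

scores = np.array([[1.0, 2.0, 3.0],
                   [2.0, 2.0, 2.0]])
probs = softmax(scores)
print(probs)                                # each entry lies in (0, 1)
print(probs.sum(axis=1))                    # [1. 1.]
print(softmax(np.array([1.0, 2.0, 3.0])))   # a vector input works the same way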

softmax function plot

import numpy as np
import matplotlib.pyplot as plt

# reuse the softmax() function defined above

softmax_inputs = np.arange(0,5)
softmax_outputs=softmax(softmax_inputs)
print("softmax input:: {}".format(softmax_inputs))
print("softmax output:: {}".format(softmax_outputs))
# plot the curve
plt.plot(softmax_inputs,softmax_outputs)
plt.xlabel("Softmax Inputs")
plt.ylabel("Softmax Outputs")
plt.show()

(Figure: softmax inputs plotted against softmax outputs)
It can be seen from the figure that the larger the input value of softmax, the larger its output value.
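
Since the introduction stresses the derivative functions as well, here is a minimal sketch of the softmax derivative (the Jacobian J[i, j] = s[i] * (delta_ij - s[j])), reusing the softmax() defined above; the input values are only an example:

def softmax_jacobian(x):
    s = softmax(np.asarray(x, dtype=float))
    # diag(s) - outer(s, s) gives J[i, j] = s[i] * (delta_ij - s[j])
    return np.diag(s) - np.outer(s, s)

print(softmax_jacobian([1.0, 2.0, 3.0]))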

sigmoid function

The sigmoid function is a common S-shaped function in biology, also known as the S-shaped (logistic) growth curve. The sigmoid function is often used as the threshold function of neural networks, mapping variables into the range (0, 1).

Properties of the sigmoid function

The formula of the sigmoid function is:
F(x) = \frac{1}{1 + exp(-x)}
  x: the input data;
  exp: the exponential operation;
  F(x): the function output, a floating-point number;
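
For example, F(0) = 1 / (1 + exp(0)) = 0.5, F(5) ≈ 0.9933, and F(−5) ≈ 0.0067, so large positive inputs map close to 1 and large negative inputs map close to 0.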

Use of sigmoid function

  • The sigmoid function is used for binary classification in logistic regression models (a minimal sketch follows this list).
  • In neural networks, the sigmoid function is used as an activation function.
  • In statistics, the graph of the sigmoid function is a common cumulative distribution function.
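
A minimal sketch of binary classification with sigmoid in logistic regression; the weights, bias, and input below are hypothetical, chosen only for illustration:

import numpy as np

w = np.array([0.8, -0.4])             # hypothetical learned weights
b = 0.1                               # hypothetical learned bias
x = np.array([2.0, 1.0])              # one input sample

p = 1.0 / (1 + np.exp(-(w @ x + b)))  # sigmoid of the linear score: P(class = 1)
label = 1 if p >= 0.5 else 0          # threshold the probability at 0.5
print(p, label)                       # about 0.786 -> class 1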

Implementation code of sigmoid

python implementation

import numpy as np
def sigmoid(x):
    return 1.0/(1+np.exp(-x))
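
The introduction also points to the derivative functions; a minimal sketch of the sigmoid derivative, sigmoid(x) * (1 - sigmoid(x)), reusing the sigmoid() defined above:

def sigmoid_grad(x):
    s = sigmoid(x)
    return s * (1 - s)    # derivative of the sigmoid

print(sigmoid_grad(0.0))  # 0.25, the steepest slope of the sigmoid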

sigmoid function plot

import numpy as np
import matplotlib.pyplot as plt

# reuse the sigmoid() function defined above
sigmoid_inputs = np.arange(-10,10)
sigmoid_outputs = sigmoid(sigmoid_inputs)
print("sigmoid Input :: {}".format(sigmoid_inputs))
print("sigmoid Output :: {}".format(sigmoid_outputs))

plt.plot(sigmoid_inputs,sigmoid_outputs)
plt.xlabel("Sigmoid Inputs")
plt.ylabel("Sigmoid Outputs")
plt.show()

(Figure: sigmoid inputs plotted against sigmoid outputs)

Comparison of softmax and sigmoid

Item             softmax                                        sigmoid
Formula          F(x_i) = exp(x_i) / \sum_{i=0}^{k} exp(x_i)    F(x) = 1 / (1 + exp(-x))
Nature           Discrete probability distribution              Nonlinear mapping
Task             Multi-class classification                     Binary classification
Domain           A one-dimensional vector                       A single value
Range            (0, 1)                                         (0, 1)
Sum of results   Must be 1                                      A positive number (not necessarily 1)
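
As a numerical note on the relationship between the two (a quick check, with an arbitrarily chosen score): for two classes, softmax over the scores [x, 0] assigns the first class exactly sigmoid(x), so sigmoid can be viewed as the two-class special case of softmax.

import numpy as np

x = 1.7                          # arbitrary score
two_class = np.exp([x, 0.0])
two_class /= two_class.sum()     # softmax over [x, 0]
print(two_class[0])              # about 0.8455
print(1.0 / (1 + np.exp(-x)))    # sigmoid(x), the same value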

Source: blog.csdn.net/CFH1021/article/details/104841428