Softmax and sigmoid explained in detail

Introduction to activation functions

For those who work in or follow the artificial intelligence field, the two activation functions softmax and sigmoid should be familiar. They are used not only in logistic regression but also come up regularly in interviews and written examinations. Mastering these two activation functions and their derivatives is essential and fundamental. The following sections describe softmax and sigmoid in detail:

softmax function

In mathematics, the softmax function, also known as the normalized exponential function, is a generalization of the logistic function. It "compresses" a k-dimensional vector z of arbitrary real numbers into another k-dimensional vector σ(z), so that each element lies in the range (0, 1) and all elements sum to 1.

Properties of softmax function

The formula of the softmax function is:
F(x_i) = \frac{exp(x_i)}{\sum_{i=0}^{k} exp(x_i)} \quad (i = 0, 1, 2, \ldots, k)
  x: the input data;
  exp: the exponential operation;
  F(x): the function output;
  every x value is mapped into the interval (0, 1);
  the mapped values sum to 1.
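
A small numeric check of these two properties (the input values below are chosen purely for illustration):

import numpy as np

x = np.array([1.0, 2.0, 3.0])
s = np.exp(x) / np.exp(x).sum()   # softmax of x
print(s)                          # approximately [0.0900, 0.2447, 0.6652], each in (0, 1)
print(s.sum())                    # 1.0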

Use of softmax function

  • Used in multi-class logistic regression models.
  • When building a neural network, the softmax function can be used as a layer; in practice it follows the final fully connected layer and maps the network's computed scores into the interval (0, 1), giving each category a probability (a minimal sketch follows this list).
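
A minimal sketch of that last step, a fully connected layer followed by softmax, with hypothetical layer sizes and randomly initialized weights:

import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(size=(3, 4))              # hypothetical weights: 3 classes, 4 input features
b = np.zeros(3)                          # hypothetical biases
x = np.array([0.5, -1.2, 3.0, 0.7])      # one input sample

logits = W @ x + b                       # output of the fully connected layer
probs = np.exp(logits - logits.max())    # subtract the max to keep exp from overflowing
probs /= probs.sum()                     # softmax: one probability per category

print(probs, probs.sum())                # probabilities lie in (0, 1) and sum to 1
print("predicted class:", probs.argmax())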

Implementation code of softmax

python implementation

import numpy as np

def softmax(x):
    orig_shape = x.shape
    # handle matrix and vector inputs separately
    if len(x.shape) > 1:
        # matrix: subtract each row's maximum to keep exp from overflowing
        constant_shift = np.max(x, axis=1).reshape(-1, 1)
        x = x - constant_shift
        # exponentiate
        x = np.exp(x)
        # sum over each row
        normalizer = np.sum(x, axis=1).reshape(-1, 1)
        # divide to obtain the softmax probabilities
        x = x / normalizer
    else:
        # vector: same steps with a scalar shift and sum
        constant_shift = np.max(x)
        x = x - constant_shift
        x = np.exp(x)
        normalizer = np.sum(x)
        x = x / normalizer
    assert x.shape == orig_shape
    return x
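
A quick usage check of the function above (example values chosen for illustration); each row of a matrix input, or the whole vector, becomes a probability distribution:

scores = np.array([[1.0, 2.0, 3.0],
                   [2.0, 2.0, 2.0]])
probs = softmax(scores)
print(probs)                                # each entry lies in (0, 1)
print(probs.sum(axis=1))                    # [1. 1.]
print(softmax(np.array([1.0, 2.0, 3.0])))   # a vector input works the same way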

softmax function plot

import numpy as np
import matplotlib.pyplot as plt

# reuse the softmax() function defined above

softmax_inputs = np.arange(0,5)
softmax_outputs=softmax(softmax_inputs)
print("softmax input:: {}".format(softmax_inputs))
print("softmax output:: {}".format(softmax_outputs))
# plot the curve
plt.plot(softmax_inputs,softmax_outputs)
plt.xlabel("Softmax Inputs")
plt.ylabel("Softmax Outputs")
plt.show()

(Figure: softmax inputs plotted against softmax outputs)
It can be seen from the figure that the larger the input value of softmax, the larger its output value.
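
Since the introduction stresses the derivative functions as well, here is a minimal sketch of the softmax derivative (the Jacobian J[i, j] = s[i] * (delta_ij - s[j])), reusing the softmax() defined above; the input values are only an example:

def softmax_jacobian(x):
    s = softmax(np.asarray(x, dtype=float))
    # diag(s) - outer(s, s) gives J[i, j] = s[i] * (delta_ij - s[j])
    return np.diag(s) - np.outer(s, s)

print(softmax_jacobian([1.0, 2.0, 3.0]))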

sigmoid function

The sigmoid function is a common S-shaped function in biology, also known as the S-shaped (logistic) growth curve. The sigmoid function is often used as the threshold function of neural networks, mapping variables into the range (0, 1).

Properties of the sigmoid function

The formula of the sigmoid function is:
F(x) = \frac{1}{1 + exp(-x)}
  x: the input data;
  exp: the exponential operation;
  F(x): the function output, a floating-point number;
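
For example, F(0) = 1 / (1 + exp(0)) = 0.5, F(5) ≈ 0.9933, and F(−5) ≈ 0.0067, so large positive inputs map close to 1 and large negative inputs map close to 0.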

Use of sigmoid function

  • The sigmoid function is used for binary classification in logistic regression models (a minimal sketch follows this list).
  • In neural networks, the sigmoid function is used as an activation function.
  • In statistics, the graph of the sigmoid function is a common cumulative distribution function.
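
A minimal sketch of binary classification with sigmoid in logistic regression; the weights, bias, and input below are hypothetical, chosen only for illustration:

import numpy as np

w = np.array([0.8, -0.4])             # hypothetical learned weights
b = 0.1                               # hypothetical learned bias
x = np.array([2.0, 1.0])              # one input sample

p = 1.0 / (1 + np.exp(-(w @ x + b)))  # sigmoid of the linear score: P(class = 1)
label = 1 if p >= 0.5 else 0          # threshold the probability at 0.5
print(p, label)                       # about 0.786 -> class 1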

Implementation code of sigmoid

python implementation

import numpy as np
def sigmoid(x):
    return 1.0/(1+np.exp(-x))
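
The introduction also points to the derivative functions; a minimal sketch of the sigmoid derivative, sigmoid(x) * (1 - sigmoid(x)), reusing the sigmoid() defined above:

def sigmoid_grad(x):
    s = sigmoid(x)
    return s * (1 - s)    # derivative of the sigmoid

print(sigmoid_grad(0.0))  # 0.25, the steepest slope of the sigmoid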

sigmoid function plot

import numpy as np
import matplotlib.pyplot as plt

# reuse the sigmoid() function defined above
sigmoid_inputs = np.arange(-10,10)
sigmoid_outputs = sigmoid(sigmoid_inputs)
print("sigmoid Input :: {}".format(sigmoid_inputs))
print("sigmoid Output :: {}".format(sigmoid_outputs))

plt.plot(sigmoid_inputs,sigmoid_outputs)
plt.xlabel("Sigmoid Inputs")
plt.ylabel("Sigmoid Outputs")
plt.show()

(Figure: sigmoid inputs plotted against sigmoid outputs)

Comparison of softmax and sigmoid

Item             softmax                                        sigmoid
Formula          F(x_i) = exp(x_i) / \sum_{i=0}^{k} exp(x_i)    F(x) = 1 / (1 + exp(-x))
Nature           Discrete probability distribution              Nonlinear mapping
Task             Multi-class classification                     Binary classification
Domain           A one-dimensional vector                       A single value
Range            (0, 1)                                         (0, 1)
Sum of results   Must be 1                                      A positive number (not necessarily 1)
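
As a numerical note on the relationship between the two (a quick check, with an arbitrarily chosen score): for two classes, softmax over the scores [x, 0] assigns the first class exactly sigmoid(x), so sigmoid can be viewed as the two-class special case of softmax.

import numpy as np

x = 1.7                          # arbitrary score
two_class = np.exp([x, 0.0])
two_class /= two_class.sum()     # softmax over [x, 0]
print(two_class[0])              # about 0.8455
print(1.0 / (1 + np.exp(-x)))    # sigmoid(x), the same value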

Source: blog.csdn.net/CFH1021/article/details/104841428