Deep Learning Practical Tutorial (1): Perceptron

What is deep learning

[Figure: a fully connected neural network with an input layer, hidden layers, and an output layer]

Each circle in the figure above is a neuron, and each line represents a connection between neurons. The neurons are organized into layers: neurons in adjacent layers are connected, while neurons within the same layer are not. The leftmost layer is called the input layer, which receives the input data; the rightmost layer is called the output layer, from which we read the network's output. The layers between the input layer and the output layer are called hidden layers.

A neural network with many hidden layers (more than two) is called a deep neural network. Deep learning, in turn, is a machine learning method that uses deep architectures such as deep neural networks.

So what advantages does a deep network have over a shallow one? Simply put, deep networks have greater expressive power. In fact, a neural network with a single hidden layer can approximate any function, but it may require a very large number of neurons, whereas a deep network can often fit the same function with far fewer neurons. In other words, to fit a function you can use either a shallow, wide network or a deep, narrow network, and the latter is often more resource-efficient.

Deep networks also have a downside: they are not easy to train. Simply put, training a deep network takes a lot of data and a lot of technique. It is a craft.

Perceptron

If you are still confused at this point, that is normal. To understand neural networks, we should first understand their building block, the neuron, also called the perceptron. The perceptron algorithm was very popular from the 1950s to the 1970s and successfully solved many problems. It is also very simple.

Perceptron Definition

The following figure shows a perceptron:

[Figure: a perceptron with inputs x1, ..., xn, weights w1, ..., wn, a bias w0, an activation function f, and an output y]

It can be seen that a perceptron has the following components:

(1) Inputs and weights

A perceptron can receive multiple inputs

    (x1, x2, ..., xn),  with each xi ∈ R.

Each input carries a weight

    wi ∈ R,

and in addition there is a bias term

    b ∈ R,

shown as w0 in the figure above. R denotes the set of real numbers.

(2) Activation function

There are many choices for the activation function of a perceptron. For example, we can use the following step function f:

    f(z) = 1  if z > 0
    f(z) = 0  otherwise

(3) Output

The output of the perceptron is computed by the following formula:

    y = f(w · x + b)        (1)

where w · x = w1 x1 + w2 x2 + ... + wn xn.
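To make formula (1) concrete, here is a minimal sketch in plain Python. The helper name perceptron_output and the example numbers are ours for illustration, not part of the original article:

# Step function: returns 1 if z > 0, otherwise 0
def f(z):
    return 1 if z > 0 else 0

# Formula (1): y = f(w1*x1 + ... + wn*xn + b)
def perceptron_output(x, w, b):
    return f(sum(xi * wi for xi, wi in zip(x, w)) + b)

print(perceptron_output([0, 1], [0.5, 0.5], -0.8))  # prints 0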

If the formula above makes your head spin, don't worry. Let's work through a simple example.

Example: implementing the AND function with a perceptron

We will design a perceptron and make it implement the AND operation. As programmers know, AND is a binary function (it takes two arguments, x1 and x2), with the following truth table:

x1   x2   y
0    0    0
0    1    0
1    0    0
1    1    1

For convenience of calculation, we use 0 to represent False and 1 to represent True. This should be easy to understand; for C programmers, it is a matter of course.

We let

    w1 = 0.5, w2 = 0.5, b = -0.8.

The activation function is the step function f defined earlier. The perceptron is then equivalent to the AND function. Don't believe it? Let's do the math.

Take the first row of the truth table above as input, that is, x1 = 0 and x2 = 0. According to formula (1), the output is:

    y = f(w1 x1 + w2 x2 + b)
      = f(0.5 × 0 + 0.5 × 0 - 0.8)
      = f(-0.8)
      = 0

That is, when x1 and x2 are both 0, y is 0, which matches the first row of the truth table. Readers can verify the second, third, and fourth rows in the same way.
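If you would rather let the machine do the checking, a few lines of Python verify all four rows (a small sketch using the step function and the weights chosen above):

def f(z):
    return 1 if z > 0 else 0

w1, w2, b = 0.5, 0.5, -0.8  # the AND weights chosen above
for x1 in (0, 1):
    for x2 in (0, 1):
        print('%d and %d = %d' % (x1, x2, f(w1 * x1 + w2 * x2 + b)))
# 0 and 0 = 0; 0 and 1 = 0; 1 and 0 = 0; 1 and 1 = 1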

Example: implementing the OR function with a perceptron

Similarly, we can use a perceptron to implement the OR operation: simply change the bias term to -0.3. Let's check the calculation. Here is the truth table of the OR operation:

x1   x2   y
0    0    0
0    1    1
1    0    1
1    1    1

Let's check the second row. The input is x1 = 0 and x2 = 1; substituting into formula (1):

    y = f(w1 x1 + w2 x2 + b)
      = f(0.5 × 0 + 0.5 × 1 - 0.3)
      = f(0.2)
      = 1

That is, when x1 = 0 and x2 = 1, y is 1, which matches the second row of the OR truth table. The reader can verify the remaining rows.
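The same sketch as before, with only the bias changed to -0.3, verifies the whole OR truth table:

def f(z):
    return 1 if z > 0 else 0

w1, w2, b = 0.5, 0.5, -0.3  # same weights, OR bias
for x1 in (0, 1):
    for x2 in (0, 1):
        print('%d or %d = %d' % (x1, x2, f(w1 * x1 + w2 * x2 + b)))
# 0 or 0 = 0; 0 or 1 = 1; 1 or 0 = 1; 1 or 1 = 1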

What else can a perceptron do

In fact, a perceptron can do more than simple Boolean operations. It can fit any linear function, so any linear classification or linear regression problem can be solved with a perceptron. The Boolean operations above can be viewed as binary classification problems: given an input, the output is either 0 (class 0) or 1 (class 1). As shown below, the AND operation is a linear classification problem, because a straight line can separate class 0 (False, marked with red crosses) from class 1 (True, marked with green dots).

[Figure: the AND operation is linearly separable; a single straight line separates the red crosses (class 0) from the green dots (class 1)]

However, a perceptron cannot implement the XOR operation. As the figure below shows, XOR is not linearly separable: no single straight line can separate class 0 from class 1.

[Figure: the XOR operation is not linearly separable; no straight line separates the two classes]
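We can let the computer hint at this too. The brute-force sketch below searches a coarse grid of weights and biases and finds no line that classifies all four XOR points (the grid and its step size are our arbitrary choices, and a grid search is an illustration, not a proof):

def f(z):
    return 1 if z > 0 else 0

# The four XOR samples: ((x1, x2), label)
xor_table = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]

# Try every (w1, w2, b) on a grid from -2.0 to 2.0 in steps of 0.1
steps = [i / 10 for i in range(-20, 21)]
found = any(
    all(f(w1 * x1 + w2 * x2 + b) == t for (x1, x2), t in xor_table)
    for w1 in steps for w2 in steps for b in steps)
print('linear separator for XOR found:', found)  # prints False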

Perceptron training

Now you may be wondering: where did the weights and bias used above come from? They come from the perceptron training algorithm: initialize the weights and the bias to 0, then iteratively update wi and b with the following perceptron rule until training is complete:

    wi ← wi + Δwi
    b ← b + Δb

where

    Δwi = η (t - y) xi
    Δb = η (t - y)

Here wi is the weight corresponding to input xi, and b is the bias term. In fact, b can be viewed as the weight of an extra input xb whose value is always 1. t is the true value of the training sample, generally called the label, and y is the perceptron's output, computed according to formula (1). η is a constant called the learning rate, which controls how much the weights are adjusted at each step.

Each time, take the input vector x of one sample from the training data, compute the perceptron's output y, and then adjust the weights according to the rule above. The weights are updated once per sample. After several passes over the whole training set (that is, several epochs of iteration), the perceptron's weights are trained to implement the target function.
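To see the rule in action, here is a single update step worked out as a short sketch (the initial values, sample, and learning rate are illustrative and match the AND training setup used below):

def f(z):
    return 1 if z > 0 else 0

eta = 0.1                 # learning rate
w, b = [0.0, 0.0], 0.0    # weights and bias initialized to 0
x, t = [1, 1], 1          # one training sample: 1 AND 1 = 1

y = f(w[0] * x[0] + w[1] * x[1] + b)  # current output: f(0) = 0
delta = t - y                          # t - y = 1
w = [wi + eta * delta * xi for wi, xi in zip(w, x)]
b += eta * delta
print(w, b)  # [0.1, 0.1] 0.1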

Programming in Action: Implementing a Perceptron

The complete code for this article is available on GitHub.

For programmers, nothing beats learning by doing, and in many cases a line of code is worth a thousand words. Next, we will implement a perceptron.

Here are a few notes on the implementation:

  • We use Python. Python is widely used in machine learning, and writing Python programs is genuinely easy.
  • We use object-oriented programming. Object orientation is a particularly good tool for managing complexity: when dealing with a complex problem, an object-oriented design makes it easy to break it down into several simpler problems, which saves our brains.
  • We do not use numpy. Numpy implements many basic algorithms and is an essential tool for implementing machine learning algorithms, but to make the code easier to follow, the code below uses only plain Python (sparing you the time it takes to learn numpy).

Below is the implementation of the Perceptron class. It is very simple: only 27 lines excluding comments, and that count includes line breaks added for readability (no line exceeds about 60 characters).

from functools import reduce

class Perceptron:
    def __init__(self, input_num=0, activator=None):
        """
        Initialize the perceptron: set the number of inputs
        and the activation function (of type double -> double).
        """
        self.input_num = input_num
        self.activator = activator
        # Initialize the weights to 0
        self.weights = [0.0 for _ in range(input_num)]
        # Initialize the bias to 0
        self.bias = 0.0

    def __str__(self):
        """
        Print the learned weights and bias.
        """
        return "weights\t:%s\nbias\t:%f\n" % (self.weights, self.bias)

    def predict(self, input_vec):
        """
        Given an input vector, return the perceptron's output.
        """
        # Pair input_vec [x1, x2, x3, ...] with weights [w1, w2, w3, ...],
        # compute [x1*w1, x2*w2, x3*w3, ...] with map,
        # then sum the products with reduce and add the bias
        return self.activator(
            reduce(lambda a, b: a + b,
                   map(lambda x, w: x * w,
                       input_vec, self.weights),
                   0.0) + self.bias)

    def train(self, input_vecs, labels, iteration, rate):
        """
        Train on the input data (a list of vectors and the label for
        each vector), for the given number of iterations at the given
        learning rate.
        """
        for i in range(iteration):
            self._one_iteration(input_vecs, labels, rate)

    def _one_iteration(self, input_vecs, labels, rate):
        """
        One iteration: pass over all the training data once.
        """
        # Zip the inputs and labels together into a list of samples,
        # where each training sample is a pair (input_vec, label)
        samples = zip(input_vecs, labels)
        # For each sample, update the weights by the perceptron rule
        for input_vec, label in samples:
            # Compute the perceptron's output under the current weights
            output = self.predict(input_vec)
            # Update the weights
            self._update_weights(input_vec, output, label, rate)

    def _update_weights(self, input_vec, output, label, rate):
        """
        Update the weights according to the perceptron rule.
        """
        # Pair input_vec [x1, x2, x3, ...] with weights [w1, w2, w3, ...]
        # and apply the perceptron rule:
        # w <- w + rate * (label - output) * x
        delta = label - output
        self.weights = list(map(
            lambda x, w: w + rate * delta * x,
            input_vec, self.weights))
        # Update the bias
        self.bias += rate * delta


def f(x):
    """
    Define the activation function (a step function).
    """
    return 1 if x > 0 else 0


def get_training_dataset():
    """
    Build the training data from the AND truth table.
    """
    # Build the training data
    # List of input vectors
    input_vecs = [[1, 1], [0, 0], [1, 0], [0, 1]]
    # Expected outputs; note they correspond one-to-one with the inputs
    # [1,1] -> 1, [0,0] -> 0, [1,0] -> 0, [0,1] -> 0
    labels = [1, 0, 0, 0]
    return input_vecs, labels


def train_and_perceptron():
    """
    Train a perceptron using the AND truth table.
    """
    # Create a perceptron with 2 inputs (AND is a binary function)
    # and activation function f
    p = Perceptron(2, f)
    # Train for 10 iterations with a learning rate of 0.1
    input_vecs, labels = get_training_dataset()
    p.train(input_vecs, labels, 10, 0.1)
    # Return the trained perceptron
    return p


if __name__ == '__main__':
    # Train the AND perceptron
    and_perception = train_and_perceptron()
    # Print the learned weights
    print(and_perception)
    # Test
    print('1 and 1 = %d' % and_perception.predict([1, 1]))
    print('0 and 0 = %d' % and_perception.predict([0, 0]))
    print('1 and 0 = %d' % and_perception.predict([1, 0]))
    print('0 and 1 = %d' % and_perception.predict([0, 1]))

Save the program above as perceptron.py and run it from the command line. The output is:

weights	:[0.1, 0.2]
bias	:-0.200000

1 and 1 = 1
0 and 0 = 0
1 and 0 = 0
0 and 1 = 0

Amazing! The perceptron has fully learned the AND function. Readers can try using a perceptron to implement other functions.
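For example, OR needs nothing but different labels. The sketch below assumes the Perceptron class and the activation function f from perceptron.py above are in scope:

def train_or_perceptron():
    # Same inputs as AND; only the labels change
    input_vecs = [[1, 1], [0, 0], [1, 0], [0, 1]]
    labels = [1, 0, 1, 1]
    p = Perceptron(2, f)
    p.train(input_vecs, labels, 10, 0.1)
    return p

or_perceptron = train_or_perceptron()
print('1 or 0 = %d' % or_perceptron.predict([1, 0]))  # prints 1
print('0 or 0 = %d' % or_perceptron.predict([0, 0]))  # prints 0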

Summary

Having finally read (and written) our way to the summary... everyone is tired. If you are starting from zero, getting this far was probably quite a workout. That's fine; take a break. The good news is that you have taken the first step into deep learning, which is real progress. The bad news is that this was only the easy part, and plenty of difficulties still lie ahead. Then again, whatever is hard for you to learn is hard for everyone else too, and mastering a skill with a high barrier to entry is well worth it.

In the next article, we will discuss another kind of perceptron, the linear unit, and use it to introduce what may be the most important optimization algorithm of all: the gradient descent algorithm.

References

PS: This article is a repost that I found very well written and wanted to share. Some minor adjustments and corrections have been made to the original text.

Original link: https://www.zybuluo.com/hanbingtao/note/433855

Reference link: https://cuijiahua.com/blog/2018/10/dl-7.html

Thanks to the original author for his contribution!
