Using a feed-forward neural network (with hand-chosen parameters) to solve the XOR problem (Python implementation)

0. Problem

Exclusive OR (XOR, also written EOR or EX-OR). In 1969, Marvin Minsky and Seymour Papert published the book "Perceptrons", which pointed out two key limitations of neural networks: first, the perceptron cannot handle the XOR problem; second, the computers of the time lacked the computing power to process large-scale neural networks. These assertions cast doubt on neural networks as represented by the perceptron and led to an "ice age" of more than ten years in neural network research. The perceptron can be regarded as a one-layer feedforward neural network (not counting the input layer). The task is to use a two-layer feedforward neural network (not counting the input layer) to solve the XOR problem, with 100% accuracy required on the test set. Specifically, the training set and the test set are the same: S = {(1, 0, 1), (1, 1, 0), (0, 0, 0), (0, 1, 1)}, where the third component of each triple is the label.

1. Problem analysis

As the problem statement notes, the perceptron (a one-layer feedforward neural network) cannot realize XOR. This is because a one-layer network can only represent a linear decision boundary, while the XOR problem is not linearly separable.
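To see this concretely, suppose a single neuron step(w1*x1 + w2*x2 + b) computed XOR. The four rows of the XOR truth table would require

b <= 0,    w1 + b > 0,    w2 + b > 0,    w1 + w2 + b <= 0.

Adding the two strict inequalities gives w1 + w2 + 2b > 0, i.e. w1 + w2 + b > -b >= 0, which contradicts the last condition. No such w1, w2, b exist.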
From digital circuit theory, an XOR gate can be constructed from an AND gate, a NAND gate, and an OR gate, as shown in the figure.
[Figure: an XOR gate built from a NAND gate and an OR gate whose outputs feed an AND gate]
From the theory of the perceptron (a one-layer feedforward neural network), we know that the AND, NAND, and OR gates can each be realized by a one-layer network. It follows that the XOR gate can be realized by a two-layer feedforward neural network.
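As a quick sanity check (a sketch, not taken from the original post), each gate can be written as a single perceptron with a step activation. The weights below are one workable choice among many; composing the three gates yields XOR:

def AND(x1, x2):
    # single perceptron: fires only when both inputs are 1 (0.5 + 0.5 - 0.7 > 0)
    return int(0.5 * x1 + 0.5 * x2 - 0.7 > 0)

def NAND(x1, x2):
    # negation of AND: the same hyperplane with all signs flipped
    return int(-0.5 * x1 - 0.5 * x2 + 0.7 > 0)

def OR(x1, x2):
    # fires when at least one input is 1 (0.5 - 0.2 > 0)
    return int(0.5 * x1 + 0.5 * x2 - 0.2 > 0)

def XOR(x1, x2):
    # two layers: XOR(x1, x2) = AND(NAND(x1, x2), OR(x1, x2))
    return AND(NAND(x1, x2), OR(x1, x2))

for x in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(x, '->', XOR(*x))    # prints 0, 1, 1, 0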

2. Implementation steps

First, list the truth tables of the AND, NAND, OR, and XOR gates:
x1  x2 | AND  NAND  OR  XOR
 0   0 |  0    1    0    0
 0   1 |  0    1    1    1
 1   0 |  0    1    1    1
 1   1 |  1    0    1    0
Because the logic in the truth table is so simple, there is no need to train the network; the weights and biases can be set directly to construct a two-layer neural network.
The two-layer feedforward neural network constructed according to the truth table is shown in the figure.

[Figure: the two-layer network; inputs x1, x2 feed a NAND unit and an OR unit, whose outputs feed an AND unit that emits XOR(x1, x2)]
The activation function of this network is the step function. The weights and biases of each layer are:

W1 = [[-0.5, 0.5], [-0.5, 0.5]],  b1 = [0.7, -0.2]   (first layer: a NAND neuron and an OR neuron)
W2 = [0.5, 0.5],  b2 = [-0.7]   (second layer: an AND neuron)

so the forward pass is z1 = step(x · W1 + b1), y = step(z1 · W2 + b2).
Based on this, we can use Python to define the step function, initialize the network, and implement the forward pass.

import numpy as np

def step_function(x):
    # step activation: returns 1 where x > 0, else 0
    y = x > 0
    return y.astype(int)

def init_network():
    # parameters are set by hand; no training is needed for this problem
    network = {}
    # first layer: column 0 is a NAND neuron, column 1 is an OR neuron
    network['W1'] = np.array([[-0.5, 0.5], [-0.5, 0.5]])
    network['b1'] = np.array([0.7, -0.2])
    # second layer: a single AND neuron
    network['W2'] = np.array([0.5, 0.5])
    network['b2'] = np.array([-0.7])
    return network

def forward(network, x):
    W1, W2 = network['W1'], network['W2']
    b1, b2 = network['b1'], network['b2']
    # hidden layer computes (NAND(x1, x2), OR(x1, x2))
    NAND_OR = step_function(np.dot(x, W1) + b1)
    # output layer computes AND of the hidden units, i.e. XOR(x1, x2)
    XOR = step_function(np.dot(NAND_OR, W2) + b2)
    return XOR

Finally, the test set S = {(1, 0, 1), (1, 1, 0), (0, 0, 0), (0, 1, 1)} is fed into the constructed two-layer feedforward neural network, the outputs are compared with the labels in the test set, and the accuracy is obtained.

network = init_network()
s_test = np.array([[1, 0, 1], [1, 1, 0], [0, 0, 0], [0, 1, 1]])

y = []
accuracy_cnt = 0
for i in range(len(s_test)):
    # the first two columns are the inputs, the third column is the label
    y.append(forward(network, s_test[i, 0:2]))
    if y[i] == s_test[i, 2]:
        accuracy_cnt += 1
print('accuracy:', accuracy_cnt / len(s_test))
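Because np.dot and NumPy broadcasting also accept 2-D inputs, the same forward function can, as an optional variation (a sketch, not part of the original code), evaluate the whole batch in one call:

# optional batch version: run all four test inputs in a single forward pass
y_batch = forward(network, s_test[:, 0:2])              # -> array([1, 0, 0, 1])
print('accuracy:', np.mean(y_batch == s_test[:, 2]))    # -> accuracy: 1.0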

3. Experimental results and analysis

Running the code produces the output:

accuracy: 1.0

The constructed two-layer feedforward neural network thus solves the XOR problem, achieving 100% accuracy on the test set.

4. Source Code

XOR.py

# -*- coding: utf-8 -*-
"""
Created on Wed Dec 23 20:30:24 2020

@author: jiawei
"""

import numpy as np

def step_function(x):
    # step activation: returns 1 where x > 0, else 0
    y = x > 0
    return y.astype(int)

def init_network():
    # parameters are set by hand; no training is needed for this problem
    network = {}
    # first layer: column 0 is a NAND neuron, column 1 is an OR neuron
    network['W1'] = np.array([[-0.5, 0.5], [-0.5, 0.5]])
    network['b1'] = np.array([0.7, -0.2])
    # second layer: a single AND neuron
    network['W2'] = np.array([0.5, 0.5])
    network['b2'] = np.array([-0.7])
    return network

def forward(network, x):
    W1, W2 = network['W1'], network['W2']
    b1, b2 = network['b1'], network['b2']
    # hidden layer computes (NAND(x1, x2), OR(x1, x2))
    NAND_OR = step_function(np.dot(x, W1) + b1)
    # output layer computes AND of the hidden units, i.e. XOR(x1, x2)
    XOR = step_function(np.dot(NAND_OR, W2) + b2)
    return XOR

network = init_network()
s_test = np.array([[1, 0, 1], [1, 1, 0], [0, 0, 0], [0, 1, 1]])

y = []
accuracy_cnt = 0
for i in range(len(s_test)):
    # the first two columns are the inputs, the third column is the label
    y.append(forward(network, s_test[i, 0:2]))
    if y[i] == s_test[i, 2]:
        accuracy_cnt += 1
print('accuracy:', accuracy_cnt / len(s_test))

Origin: blog.csdn.net/mahoon411/article/details/112299993