Chapter 4 Building Neural Networks Using Java

• Build a neural network

• Activation functions

• Encog persistence

• Use Encog Analyst in code

This chapter shows how to construct feed-forward and simple recurrent neural networks with Encog and, in its final part, how to save these networks. Neural networks are created with the BasicNetwork and BasicLayer classes. In addition to these two classes, activation functions are used; their role is also discussed in this chapter.

Because neural networks can take a long time to train, it is important to save your work. Encog neural networks can be persisted using Java's built-in serialization, or they can be written to an EG file, which is a cross-platform text format. This chapter introduces both persistence methods.

The previous chapter used EncogAnalyst to automatically normalize data. EncogAnalyst can also automatically create a neural network from CSV data. This chapter shows how to use EncogAnalyst to create a neural network from code.

4.1 Building a Neural Network

A simple neural network can be created using BasicLayer and BasicNetwork objects. The following code creates several BasicLayer objects, which default to the hyperbolic tangent activation function.
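A minimal sketch of such code, based on the Encog 3 API; the one-argument BasicLayer constructor defaults to the hyperbolic tangent activation function and a bias neuron (BasicNetwork and BasicLayer come from the org.encog.neural.networks packages):

```java
BasicNetwork network = new BasicNetwork();
network.addLayer(new BasicLayer(2));   // input layer: 2 neurons
network.addLayer(new BasicLayer(3));   // hidden layer: 3 neurons
network.addLayer(new BasicLayer(1));   // output layer: 1 neuron
network.getStructure().finalizeStructure();
network.reset();
```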
This neural network has an input layer of two neurons, a hidden layer of three neurons, and an output layer of a single neuron. To use an activation function other than the hyperbolic tangent, such as the sigmoid, code similar to the following can be used:
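A sketch using the three-argument BasicLayer constructor, which takes an activation function, a bias flag, and a neuron count:

```java
BasicNetwork network = new BasicNetwork();
network.addLayer(new BasicLayer(null, true, 2));                      // input: no activation, bias neuron
network.addLayer(new BasicLayer(new ActivationSigmoid(), true, 3));   // hidden: sigmoid, bias neuron
network.addLayer(new BasicLayer(new ActivationSigmoid(), false, 1));  // output: sigmoid, no bias neuron
network.getStructure().finalizeStructure();
network.reset();
```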
The sigmoid activation function is passed to the hidden and output layers in the calls to addLayer. The value true specifies that the BasicLayer should have a bias neuron. The output layer has no bias neuron, and the input layer has no activation function. This is because bias neurons affect the next layer, while activation functions affect data coming from the previous layer.

Unless Encog is being used for some kind of experiment, always use a bias neuron. Bias neurons allow the activation function to be shifted off the origin of zero, which allows the network to produce an output of zero even when the inputs are not zero. The following URL provides more of the important mathematical background on bias neurons:

http://www.heatonresearch.com/wiki/Bias

Activation functions are attached to layers and are used to scale the data output by a layer. Encog applies a layer's activation function to the data that the layer is about to output. If no activation function is specified for a BasicLayer, the hyperbolic tangent activation function is used by default.

It is also possible to create context layers. A context layer can be used to create an Elman or Jordan style neural network. The following code can be used to create an Elman neural network.

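A sketch following the same pattern as Encog's ElmanPattern class, with illustrative layer sizes:

```java
BasicLayer input, hidden;
BasicNetwork network = new BasicNetwork();
network.addLayer(input = new BasicLayer(new ActivationTANH(), true, 2));
network.addLayer(hidden = new BasicLayer(new ActivationTANH(), true, 3));
network.addLayer(new BasicLayer(new ActivationTANH(), false, 1));
// Feed the hidden layer's previous output back into the network as context.
input.setContextFedBy(hidden);
network.getStructure().finalizeStructure();
network.reset();
```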
Note the setContextFedBy call. It creates a context link so that, on each iteration, the hidden layer's output from the previous iteration is fed back into the network as context. This is what makes the network Elman-style. Elman and Jordan networks are introduced in Chapter 7.

4.2 The role of the activation function

The previous section showed how to assign activation functions to layers. Many neural network architectures use activation functions to scale the output of layers. Encog provides a large number of activation functions that can be used to construct neural networks; the following sections introduce them.

Activation functions are attached to layers and are used to scale the data output by a layer; Encog applies the layer's activation function to the layer's output data. If an activation function is not specified for a BasicLayer, the default hyperbolic tangent activation function is used. All activation function classes must implement the ActivationFunction interface.

Activation functions also play an important role in training. Propagation training, which is introduced in the next chapter, requires that the activation function have a valid derivative. Not all activation functions have one, so whether an activation function provides a derivative can be an important factor in choosing it.

4.3 Encog Activation Functions

The following sections introduce each of Encog's activation functions. There are several factors to consider when choosing one. First, the type of neural network being built often determines the activation function required. Second, consider whether the network will be trained with propagation training, which requires the activation function to have a derivative. Finally, consider the range of values involved; some activation functions only handle positive values, or values within a certain range.

4.3.1 ActivationBiPolar

The ActivationBiPolar activation function is used with neural networks that require bipolar values. A bipolar value is either true or false: a value of 1 represents true, and a value of -1 represents false. The bipolar activation function ensures that any value passed to it becomes either 1 or -1. A sketch of the bipolar activation function follows.
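This sketch shows only the core logic; the real ActivationBiPolar class implements Encog's ActivationFunction interface and operates on a slice of the output array:

```java
// Any value greater than zero becomes 1; everything else becomes -1.
static void biPolarActivation(double[] x, int start, int size) {
    for (int i = start; i < start + size; i++) {
        x[i] = (x[i] > 0) ? 1.0 : -1.0;
    }
}
```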
As shown above, the output of this activation function is limited to 1 or -1. It is used in neural networks that require bipolar output from one layer to the next. The bipolar activation function has no derivative, so it cannot be used for propagation training.

4.3.2 ActivationCompetitive

The ActivationCompetitive function is used to force only a small group of neurons to win; the winners are the neurons with the highest outputs. The outputs of all neurons are held in the array passed to this function, and the size of the winning group is configurable. The function first decides the winners; all non-winning neurons are then set to 0, and the winners share the sum of the winning outputs.

The function first creates an array that tracks whether each neuron has been selected as one of the winners, together with a running total of the winning outputs. It then loops once per winner, until the requested number of winners (maxWinners) has been found. On each pass it loops over all of the layer's outputs to find the largest output among the neurons that have not already won; that neuron is marked as a winner, which prevents it from being selected again, and its output is added to the running total.

Once the correct number of winners has been determined, the values for winners and non-winners are adjusted: all non-winners are set to 0, and the winners share the sum of the winning outputs. A sketch of this logic is shown below.
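The sketch condenses that logic into a single helper method. The real ActivationCompetitive class implements Encog's ActivationFunction interface and reads the number of winners from its parameter array; here it is passed in as maxWinners:

```java
static void competitiveActivation(double[] x, int start, int size, int maxWinners) {
    // Track which neurons have already won, and sum the winning outputs.
    boolean[] winners = new boolean[x.length];
    double sumWinners = 0;

    // Loop once per winner until the requested number of winners is found.
    for (int i = 0; i < maxWinners; i++) {
        double maxFound = Double.NEGATIVE_INFINITY;
        int winner = -1;

        // Find the largest output among neurons that have not yet won.
        for (int j = start; j < start + size; j++) {
            if (!winners[j] && x[j] > maxFound) {
                winner = j;
                maxFound = x[j];
            }
        }
        if (winner == -1) {
            break; // fewer neurons than requested winners
        }

        // Mark the winner so it cannot be selected again, and add its
        // output to the running total.
        winners[winner] = true;
        sumWinners += maxFound;
    }

    // Non-winners become 0; each winner's output is divided by the sum
    // of the winning outputs.
    for (int i = start; i < start + size; i++) {
        x[i] = winners[i] ? (x[i] / sumWinners) : 0.0;
    }
}
```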
This type of activation function is used in competitive learning neural networks, such as the self-organizing map. It has no derivative, so it cannot be used for propagation training.

4.3.3 ActivationLinear

The ActivationLinear function is really no activation function at all; it simply implements the linear function shown in Equation 4.1:
Equation 4.1: f(x) = x
Figure 4.1 shows a simple image of a linear function:
[Figure 4.1: The linear activation function]
The Java implementation of the linear activation function is very simple; it does nothing.
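A sketch makes the point; the activation step leaves its input untouched:

```java
// f(x) = x, so there is nothing to do.
static void linearActivation(double[] x, int start, int size) {
    // intentionally empty
}
```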
The linear function is mainly used for certain types of neural networks that have no activation function, such as the self-organizing map. The linear activation function has a constant derivative, so it can be used for propagation training. The output layer of a propagation-trained feed-forward neural network sometimes uses a linear layer.

4.3.4 ActivationLOG

The ActivationLog activation function uses an algorithm based on the log function. The calculation it performs is shown below:
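Assuming the standard Encog ActivationLOG implementation, the calculation is:

f(x) = log(1 + x)   when x >= 0
f(x) = -log(1 - x)  when x < 0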
The resulting curve is similar to the hyperbolic tangent activation function, which will be discussed later in this chapter. Figure 4.2 shows a graph of the logarithmic activation function.
[Figure 4.2: The logarithmic activation function]
The logarithmic activation function can be used to prevent saturation. A hidden node is said to be saturated when, for a given set of inputs, its output is almost always near 1 or -1. This can significantly slow down training. The logarithmic activation function is an appropriate choice when training with the hyperbolic tangent is not succeeding.

As shown in Figure 4.2, the logarithmic activation function produces both positive and negative values, which means it can be used in neural networks that expect negative output. Some activation functions, such as the sigmoid, produce only positive output. The logarithmic activation function has a derivative, so it can be used for propagation training.

4.3.5 ActivationSigmoid

The ActivationSigmoid activation function should only be used when positive output is expected, because it produces only positive output. Its equation is as follows:
f(x) = 1 / (1 + e^(-x))
The ActivationSigmoid function changes the negative range to a positive range. An image of the sigmoid function is shown in Figure 4.3:
[Figure 4.3: The sigmoid activation function]
The ActivationSigmoid function is a very common choice for feed-forward and simple recurrent neural networks. However, the training data must not expect negative output. If negative output is required, the hyperbolic tangent activation function may be a better choice.

4.3.6 ActivationSoftMax

The ActivationSoftMax activation function scales all input values so that they sum to 1. It is sometimes used as the activation function of a hidden layer.

The activation function begins by summing the natural exponential of each neuron's output:

sum = e^(x_1) + e^(x_2) + ... + e^(x_n)

Each neuron's exponentiated output is then divided by this sum, producing outputs that add up to 1:

f(x_i) = e^(x_i) / sum

A short sketch of this calculation is shown below.
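This sketch is simply the softmax formula written out in Java, not the actual Encog source:

```java
static void softMaxActivation(double[] x, int start, int size) {
    // Step 1: exponentiate each output and accumulate the sum.
    double sum = 0;
    for (int i = start; i < start + size; i++) {
        x[i] = Math.exp(x[i]);
        sum += x[i];
    }
    // Step 2: divide by the sum so that the outputs add up to 1.
    for (int i = start; i < start + size; i++) {
        x[i] /= sum;
    }
}
```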
ActivationSoftMax is usually used in the output layer of a classification neural network.

4.3.7 ActivationTANH

The ActivationTANH activation function uses the hyperbolic tangent function, which may be the most commonly used activation function because it covers both positive and negative values. The hyperbolic tangent function is Encog's default activation function. Its formula is shown in Equation 4.3:
Equation 4.3: f(x) = tanh(x) = (e^(2x) - 1) / (e^(2x) + 1)
The graph of the hyperbolic tangent function in Figure 4.4 shows that this activation function accepts both positive and negative values.
[Figure 4.4: The hyperbolic tangent activation function]
The hyperbolic tangent function is a very common choice in feed-forward and simple recurrent neural networks. The hyperbolic tangent function has a derivative, so it can be used for propagation training.

4.4 Encog Persistence

Training a neural network can take a considerable amount of time, so it is important to take steps to ensure your work is preserved once training is complete. Encog provides two primary ways to store Encog data objects: file-based persistence in Encog's own format, or Java's own persistence.

Java provides its own means of serializing objects, called Java serialization. Java serialization allows many different object types to be written to a stream, such as a disk file. Java serialization of Encog objects works the same as for any other Java object: every important Encog object supports serialization by implementing the Serializable interface.

Java serialization is a quick way to store Encog objects, but it has some important limitations. Files created with Java serialization can only be used by Encog for Java; they are not compatible with Encog for .Net or Encog for Silverlight. Furthermore, Java serialization depends directly on the internal structure of the objects, so future versions of Encog may not be compatible with your serialized files.

To create files that can be used by all Encog platforms, consider the Encog EG format, which saves neural networks as text files with the extension .EG.

This chapter introduces both types of Encog persistence, starting with Encog EG persistence and how to save a neural network to an Encog EG file.

4.5 Persistence with Encog EG

Encog EG persistence files are Encog's native file format and are stored with the extension .EG. The Encog Workbench works with Encog EG files, and the format can be exchanged between different operating systems and Encog platforms, which makes it a good choice for an Encog application.

This section begins by looking at an XOR example that uses an Encog EG file. Later, the same example is repeated using Java serialization. We start by persisting the example with Encog EG.

4.5.1 Persistence using Encog EG

Persistence with Encog EG is very simple. The EncogDirectoryPersistence class is used to save objects to, and load objects from, Encog EG files. Here is a good example of Encog EG persistence:
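A minimal, self-contained sketch of such an example, built on the Encog 3 API; the class name, file name, and error threshold are illustrative:

```java
import java.io.File;

import org.encog.Encog;
import org.encog.engine.network.activation.ActivationSigmoid;
import org.encog.ml.data.MLDataSet;
import org.encog.ml.data.basic.BasicMLDataSet;
import org.encog.neural.networks.BasicNetwork;
import org.encog.neural.networks.layers.BasicLayer;
import org.encog.neural.networks.training.propagation.resilient.ResilientPropagation;
import org.encog.persist.EncogDirectoryPersistence;

public class XorEncogEGExample {

    public static final String FILENAME = "xor.eg"; // illustrative file name

    public static final double[][] XOR_INPUT = { {0, 0}, {1, 0}, {0, 1}, {1, 1} };
    public static final double[][] XOR_IDEAL = { {0}, {1}, {1}, {0} };

    public static void trainAndSave() {
        // A simple three-layer feed-forward network for the XOR operator.
        BasicNetwork network = new BasicNetwork();
        network.addLayer(new BasicLayer(null, true, 2));
        network.addLayer(new BasicLayer(new ActivationSigmoid(), true, 3));
        network.addLayer(new BasicLayer(new ActivationSigmoid(), false, 1));
        network.getStructure().finalizeStructure();
        network.reset();

        // The XOR training set, built from the inputs and expected outputs.
        MLDataSet trainingSet = new BasicMLDataSet(XOR_INPUT, XOR_IDEAL);

        // Train with resilient propagation (RPROP) until the error is small.
        ResilientPropagation train = new ResilientPropagation(network, trainingSet);
        do {
            train.iteration();
        } while (train.getError() > 0.01);

        // Show the final error and save the network; one Encog object per EG file.
        System.out.println("Final error: " + network.calculateError(trainingSet));
        EncogDirectoryPersistence.saveObject(new File(FILENAME), network);
    }

    public static void loadAndEvaluate() {
        // Reload the network from the EG file and re-check its error.
        BasicNetwork network =
                (BasicNetwork) EncogDirectoryPersistence.loadObject(new File(FILENAME));

        MLDataSet trainingSet = new BasicMLDataSet(XOR_INPUT, XOR_IDEAL);
        System.out.println("Error after reload: " + network.calculateError(trainingSet));
    }

    public static void main(String[] args) {
        trainAndSave();
        loadAndEvaluate();
        Encog.getInstance().shutdown();
    }
}
```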
This example consists of two main methods. The first, trainAndSave, trains a neural network and saves it to an Encog EG file. The second, loadAndEvaluate, loads the Encog EG file back and evaluates the network, which proves that the file was saved correctly. The main method simply calls these two methods in sequence. We begin with the trainAndSave method.
This method starts by creating a basic neural network to train the XOR operator. It is a simple three-layer feed-forward neural network.
A training set for the XOR operator is then created from the inputs and expected outputs.
This neural network will be trained using Resilient Propagation (RPROP).
RPROP is run iteratively until the error is very small. Training will be discussed in the next chapter; for now, what matters is that training produces an error rate that should remain the same after the network is saved and reloaded.
Once the network is trained, the final error rate is displayed and the neural network is saved.
The neural network can now be saved to a file using the saveObject method of the EncogDirectoryPersistence class; only one Encog object is saved per file.
Now that the Encog EG file has been created, the loadAndEvaluate method loads the neural network back from the file to make sure it still performs well.
Once the previously saved network has been loaded, it must be evaluated to prove that it is still trained. To do this, a training set for the XOR operator is created again.
The error is then computed for the given training data.
This error should be the same as it was when the neural network was originally saved.

4.6 Using Java Serialization

Standard Java serialization can also be used with Encog neural networks and training sets. Encog EG persistence is more flexible than Java serialization, but in some cases it is convenient to simply save the neural network to a binary file. This example again begins by calling a trainAndSave method.
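A minimal, self-contained sketch of the example, built on the Encog 3 API; the class name and file name are illustrative, and plain java.io object streams are used in place of Encog's SerializeObject helper so that the sketch stands on its own:

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;

import org.encog.Encog;
import org.encog.ml.data.MLDataSet;
import org.encog.ml.data.basic.BasicMLDataSet;
import org.encog.neural.networks.BasicNetwork;
import org.encog.neural.networks.layers.BasicLayer;
import org.encog.neural.networks.training.propagation.resilient.ResilientPropagation;

public class XorSerializationExample {

    public static final String FILENAME = "xor.ser"; // illustrative file name

    public static final double[][] XOR_INPUT = { {0, 0}, {1, 0}, {0, 1}, {1, 1} };
    public static final double[][] XOR_IDEAL = { {0}, {1}, {1}, {0} };

    public static void trainAndSave() throws Exception {
        // A simple three-layer feed-forward network for the XOR operator.
        BasicNetwork network = new BasicNetwork();
        network.addLayer(new BasicLayer(2));
        network.addLayer(new BasicLayer(3));
        network.addLayer(new BasicLayer(1));
        network.getStructure().finalizeStructure();
        network.reset();

        MLDataSet trainingSet = new BasicMLDataSet(XOR_INPUT, XOR_IDEAL);

        // Train with RPROP until the error rate drops below 1%.
        ResilientPropagation train = new ResilientPropagation(network, trainingSet);
        do {
            train.iteration();
        } while (train.getError() > 0.01);

        System.out.println("Final error: " + network.calculateError(trainingSet));

        // Save the trained network with standard Java serialization.
        try (ObjectOutputStream out =
                new ObjectOutputStream(new FileOutputStream(new File(FILENAME)))) {
            out.writeObject(network);
        }
    }

    public static void loadAndEvaluate() throws Exception {
        // Read the network back from the binary serialization file.
        BasicNetwork network;
        try (ObjectInputStream in =
                new ObjectInputStream(new FileInputStream(new File(FILENAME)))) {
            network = (BasicNetwork) in.readObject();
        }

        MLDataSet trainingSet = new BasicMLDataSet(XOR_INPUT, XOR_IDEAL);
        System.out.println("Error after reload: " + network.calculateError(trainingSet));
    }

    public static void main(String[] args) throws Exception {
        trainAndSave();
        loadAndEvaluate();
        Encog.getInstance().shutdown();
    }
}
```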
The trainAndSave method starts by creating a basic neural network to train the XOR operator; it is a simple three-layer feed-forward neural network, as in the listing above.
We will train this network using Resilient Propagation (RPROP).
The network is trained iteratively in a loop until the error rate drops below 1% (< 0.01).
Finally, the error rate of the neural network is displayed.
The network can be saved using plain Java serialization code or with Encog's SerializeObject utility class, whose save method writes any serializable object to a binary file. Here, the trained network is written to a binary file.
Now that the binary file has been created, the neural network is loaded back from that file in the loadAndEvaluate method to verify that it still performs well.
The SerializeObject class also provides a load method, which reads objects back from a binary serialization file.
With the network loaded, its error level is computed again.
This error level should match the error level with which the network was originally trained.

4.7 Summary

The BasicNetwork and BasicLayer classes are used to create feed-forward and simple recurrent neural networks. With these objects, neural networks can be created, and context layers can be connected to build simple recurrent networks such as the Elman neural network.

Encog uses activation functions to scale the output from neural network layers. By default, Encog uses the hyperbolic tangent function, which is a good general-purpose activation function. Any activation function class must implement the ActivationFunction interface. If an activation function is to be used for propagation training, it must be able to compute its derivative.

The ActivationBiPolar activation function class is used with networks that accept only bipolar values. The ActivationCompetitive class is used in competitive neural networks such as the self-organizing map. The ActivationLinear class is used when no activation function is desired. The ActivationLOG class works similarly to the ActivationTANH class, except that it does not saturate as easily when used in a hidden layer. The ActivationSigmoid class is similar to the ActivationTANH class, except that it returns only positive values. The ActivationSoftMax class scales the output so that it sums to 1.

This chapter discussed how to save Encog objects using two methods: the Encog EG format and Java serialization.

The Encog EG format is the preferred way to save Encog neural networks. These objects are accessed by their resource name, and EG files can be exchanged between any of the platforms that Encog supports.

Encog also allows Java serialization to store objects to disk or to a stream. Java serialization is much more restrictive than Encog EG files: because the binary files are produced directly from the objects, even a small change to an Encog object results in an incompatible file. In addition, other platforms cannot read these files.

The next chapter elaborates on neural network training, the process by which a network's weights are modified to produce the desired output. There are several ways to train a neural network; the next chapter covers propagation training.
