We will now look at how to structure a neural network for a very simple problem: creating a neural network that can function as an XOR operator. Learning the XOR operator is a frequent “first example” when demonstrating the architecture of a new neural network. Just as most new programming languages are first demonstrated with a program that simply displays “Hello World”, neural networks are frequently demonstrated with the XOR operator. In this sense, learning the XOR operator is the “Hello World” application for neural networks.

The XOR operator is one of three commonly used Boolean logical operators. The other two are the AND and OR operators. For each of these logical operators, there are four possible input combinations. For example, all possible combinations for the AND operator are shown below.

0 AND 0 = 0
1 AND 0 = 0
0 AND 1 = 0
1 AND 1 = 1

This should be consistent with how you learned the AND operator for computer programming. As its name implies, the AND operator will only return true, or one, when both inputs are true.

The OR operator behaves as follows.

0 OR 0 = 0
1 OR 0 = 1
0 OR 1 = 1
1 OR 1 = 1

This also should be consistent with how you learned the OR operator for computer programming. For the OR operator to be true, at least one of the inputs must be true.

The “exclusive or” (XOR) operator is less frequently used in computer programming, so you may not be familiar with it. XOR has the same output as the OR operator, except for the case where both inputs are true. The possible combinations for the XOR operator are shown here.

0 XOR 0 = 0
1 XOR 0 = 1
0 XOR 1 = 1
1 XOR 1 = 0

As you can see, the XOR operator only returns true when the two inputs differ. In the next section we will see how to structure the input, output, and hidden layers for the XOR operator.
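The three truth tables above can be checked directly in Java, which provides `&`, `|`, and `^` as built-in AND, OR, and XOR operators. The following sketch (the class and method names are ours, purely for illustration, and are not part of Encog) verifies the XOR table:

```java
public class XorTruth {

    // XOR returns 1 only when the two inputs differ;
    // Java's ^ operator implements this directly
    static int xor(int a, int b) {
        return a ^ b;
    }

    public static void main(String[] args) {
        int[][] cases = { { 0, 0 }, { 1, 0 }, { 0, 1 }, { 1, 1 } };
        for (int[] c : cases) {
            System.out.println(c[0] + " XOR " + c[1] + " = " + xor(c[0], c[1]));
        }
    }
}
```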

There are two inputs to the XOR operator and one output. The input and output layers will be structured accordingly. We will feed the input neurons the following **double** values:

0.0,0.0
1.0,0.0
0.0,1.0
1.0,1.0

These values correspond to the inputs to the XOR operator, shown above. We will expect the one output neuron to produce the following **double** values:

0.0
1.0
1.0
0.0

This is one way that the neural network can be structured. This method allows a simple feedforward neural network to learn the XOR operator. The feedforward neural network, also called a multilayer perceptron, is one of the first neural network architectures that we will learn.

There are other ways that the XOR data could be presented to the neural network. Later in this book we will see two examples of recurrent neural networks, the Elman and Jordan styles of neural networks. These methods would treat the XOR data as one long sequence. Essentially, concatenating the truth table for XOR produces one long XOR sequence, such as:

0.0,0.0,0.0,
0.0,1.0,1.0,
1.0,0.0,1.0,
1.0,1.0,0.0

The line breaks are only for readability. This is just treating XOR as a long sequence. By using the data above, the network would have a single input neuron and a single output neuron. The input neuron would be fed one value from the list above, and the output neuron would be expected to return the next value.
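As a sketch of how the truth table could be flattened into such a sequence, the following helper (the class and method names are ours, not part of Encog) concatenates each input pair with its output, in the ordering shown above:

```java
public class XorSequence {

    // Flatten the XOR truth table into one long sequence:
    // input1, input2, output, repeated for each row
    static double[] buildSequence(double[][] inputs, double[] outputs) {
        double[] seq = new double[inputs.length * 3];
        for (int i = 0; i < inputs.length; i++) {
            seq[i * 3]     = inputs[i][0];
            seq[i * 3 + 1] = inputs[i][1];
            seq[i * 3 + 2] = outputs[i];
        }
        return seq;
    }

    public static void main(String[] args) {
        double[][] inputs = { { 0, 0 }, { 0, 1 }, { 1, 0 }, { 1, 1 } };
        double[] outputs  = { 0, 1, 1, 0 };
        System.out.println(java.util.Arrays.toString(buildSequence(inputs, outputs)));
    }
}
```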

This shows that there is often more than one way to model the data for a neural network. How you model the data will greatly influence the success of your neural network. If one particular model is not working, you may need to consider another. For the examples in this book we will consider the first model we looked at for the XOR data.

Because the XOR operator has two inputs and one output, the neural network will follow suit. Additionally, the neural network will have a single hidden layer, with two neurons to help process the data. The choice of two neurons in the hidden layer is arbitrary and often comes down to trial and error. The XOR problem is simple, and two hidden neurons are sufficient to solve it. A diagram for this network can be seen in Figure 1.4.
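To see why two hidden neurons are sufficient, consider a 2-2-1 network with hand-picked weights and a simple step activation. This is purely illustrative: the weights, thresholds, and names below are our own choices, and Encog's trained network would use a sigmoid activation with different, learned weights.

```java
public class XorByHand {

    // Step activation: fires 1 when the weighted sum exceeds the threshold
    static double step(double sum, double threshold) {
        return sum > threshold ? 1.0 : 0.0;
    }

    // A 2-2-1 network with hand-picked weights:
    // hidden neuron 1 computes OR, hidden neuron 2 computes AND,
    // and the output neuron computes (OR and not AND), which is XOR
    static double compute(double x1, double x2) {
        double h1 = step(1.0 * x1 + 1.0 * x2, 0.5); // OR
        double h2 = step(1.0 * x1 + 1.0 * x2, 1.5); // AND
        return step(1.0 * h1 - 1.0 * h2, 0.5);      // OR and not AND
    }

    public static void main(String[] args) {
        double[][] inputs = { { 0, 0 }, { 1, 0 }, { 0, 1 }, { 1, 1 } };
        for (double[] p : inputs) {
            System.out.println(p[0] + " XOR " + p[1] + " = " + compute(p[0], p[1]));
        }
    }
}
```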

**Figure 1.4: Neuron Diagram for the XOR Network**

Usually, the individual neurons are not drawn on neural network diagrams. There are often too many. Similar neurons are grouped into layers. The Encog workbench displays neural networks on a layer-by-layer basis. Figure 1.5 shows how the above network is represented in Encog.

**Figure 1.5: Encog Layer Diagram for the XOR Network**

The code needed to create this network is relatively simple.

```java
BasicNetwork network = new BasicNetwork();
network.addLayer(new BasicLayer(2));
network.addLayer(new BasicLayer(2));
network.addLayer(new BasicLayer(1));
network.getStructure().finalizeStructure();
network.reset();
```

In the above code you can see a **BasicNetwork** being created. Three layers are added to this network. The first layer, which becomes the input layer, has two neurons. The hidden layer is added second, and it has two neurons also. Lastly, the output layer is added, which has a single neuron. Finally, the **finalizeStructure** method must be called to inform the network that no more layers are to be added. The call to **reset** randomizes the weights in the connections between these layers.

These weights make up the long-term memory of the neural network. Additionally, some layers have threshold values that also contribute to the long-term memory of the neural network. Some neural networks also contain context layers which give the neural network a short-term memory as well. The neural network learns by modifying these weight and threshold values. We will learn more about weights and threshold values in Chapter 2.
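As a rough sketch of how much long-term memory the 2-2-1 XOR network holds, we can count one weight per connection and one threshold per non-input neuron. This bookkeeping is our own illustration; Encog manages these values internally.

```java
public class WeightCount {

    // Count connection weights between adjacent fully connected layers,
    // plus one threshold value per non-input neuron
    static int countWeights(int[] layers) {
        int count = 0;
        for (int i = 0; i < layers.length - 1; i++) {
            count += layers[i] * layers[i + 1]; // connection weights
            count += layers[i + 1];             // thresholds
        }
        return count;
    }

    public static void main(String[] args) {
        // 2-2-1 network: (2*2 + 2) + (2*1 + 1) = 9 adjustable values
        System.out.println(countWeights(new int[] { 2, 2, 1 }));
    }
}
```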

Now that the neural network has been created, it must be trained. Training is discussed in the next section.

To train the neural network, we must construct a **NeuralDataSet** object. This object contains the inputs and the expected outputs. To construct this object, we must create two arrays. The first array will hold the input values for the XOR operator. The second array will hold the ideal outputs for each of the corresponding input values. These will correspond to the possible values for XOR. To review, the four possible values are as follows:

0 XOR 0 = 0
1 XOR 0 = 1
0 XOR 1 = 1
1 XOR 1 = 0

First we will construct an array to hold the four input values to the XOR operator. This is done using a two dimensional **double** array. This array is as follows:

```java
public static double XOR_INPUT[][] = {
    { 0.0, 0.0 },
    { 1.0, 0.0 },
    { 0.0, 1.0 },
    { 1.0, 1.0 } };
```

Likewise, an array must be created for the expected outputs for each of the input values. This array is as follows:

```java
public static double XOR_IDEAL[][] = {
    { 0.0 },
    { 1.0 },
    { 1.0 },
    { 0.0 } };
```

Even though there is only one output value, we must still use a two-dimensional array to represent the output. If there had been more than one output neuron, there would have been additional columns in the above array.

Now that the two input arrays have been constructed a **NeuralDataSet** object must be created to hold the training set. This object is created as follows.

```java
NeuralDataSet trainingSet = new BasicNeuralDataSet(XOR_INPUT, XOR_IDEAL);
```

Now that the training set has been created, the neural network can be trained. Training is the process where the neural network's weights are adjusted to better produce the expected output. Training will continue for many iterations, until the error rate of the network is below an acceptable level. First, a training object must be created. Encog supports many different types of training.

For this example we are going to use Resilient Propagation (RPROP). RPROP is perhaps the best general-purpose training algorithm supported by Encog. Other training techniques are provided as well, as certain problems are solved better with certain training techniques. The following code constructs an RPROP trainer.

```java
final Train train = new ResilientPropagation(network, trainingSet);
```

All training classes implement the **Train** interface. The RPROP algorithm is implemented by the **ResilientPropagation** class, which is constructed above.

Once the trainer has been constructed the neural network should be trained. Training the neural network involves calling the **iteration** method on the **Train** class until the error is below a specific value.

```java
int epoch = 1;
do {
    train.iteration();
    System.out.println("Epoch #" + epoch + " Error:" + train.getError());
    epoch++;
} while(train.getError() > 0.01);
```

The above code loops through as many iterations, or epochs, as it takes to get the error rate for the neural network to be below 1%. Once the neural network has been trained, it is ready for use. The next section will explain how to use a neural network.
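The exact formula Encog uses for this error value is not shown here, but a mean squared error over the training set conveys the general idea: the error shrinks as the network's actual outputs approach the ideal outputs. The following is a minimal sketch (the class and method names are ours):

```java
public class NetworkError {

    // Mean squared error: average of the squared differences
    // between actual and ideal outputs
    static double meanSquaredError(double[] actual, double[] ideal) {
        double sum = 0.0;
        for (int i = 0; i < actual.length; i++) {
            double diff = actual[i] - ideal[i];
            sum += diff * diff;
        }
        return sum / actual.length;
    }

    public static void main(String[] args) {
        double[] actual = { 0.1, 0.9, 0.8, 0.2 };
        double[] ideal  = { 0.0, 1.0, 1.0, 0.0 };
        System.out.println("Error: " + meanSquaredError(actual, ideal));
    }
}
```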

Making use of the neural network involves calling the **compute** method on the **BasicNetwork** class. Here we loop through every training set value and display the output from the neural network.

```java
System.out.println("Neural Network Results:");
for(NeuralDataPair pair: trainingSet) {
    final NeuralData output = network.compute(pair.getInput());
    System.out.println(pair.getInput().getData(0) + "," + pair.getInput().getData(1)
        + ", actual=" + output.getData(0) + ",ideal=" + pair.getIdeal().getData(0));
}
```

The **compute** method accepts a **NeuralData** object and also returns a **NeuralData** object, which contains the output from the neural network. This output is displayed to the user. When the program is run, the training results are displayed first. For each epoch, the current error rate is displayed.

```
Epoch #1 Error:0.5604437512295236
Epoch #2 Error:0.5056375155784316
Epoch #3 Error:0.5026960720526166
Epoch #4 Error:0.4907299498390594
...
Epoch #104 Error:0.01017278345766472
Epoch #105 Error:0.010557202078697751
Epoch #106 Error:0.011034965164672806
Epoch #107 Error:0.009682102808616387
```

The error starts at 56% at epoch 1. By epoch 107 the error has dropped below 1% and training stops. Because the neural network was initialized with random weights, it may take a different number of iterations to train each time the program is run. Additionally, though the final error rate may vary, it should always end below 1%.

Finally, the program displays the results from each of the training items as follows:

```
Neural Network Results:
0.0,0.0, actual=0.002782538818034049,ideal=0.0
1.0,0.0, actual=0.9903741937121177,ideal=1.0
0.0,1.0, actual=0.9836807956566187,ideal=1.0
1.0,1.0, actual=0.0011646072586172778,ideal=0.0
```

As you can see, the network has not been trained to give the exact results. This is normal. Because the network was trained to a 1% error, each of the results will generally be within 1% of the expected value.
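When the network's continuous outputs are used as Boolean values, they are typically thresholded back to 0 or 1. The following is a sketch assuming a 0.5 cutoff (the cutoff and the names are our own choices, not part of Encog):

```java
public class RoundOutput {

    // Treat the continuous network output as a Boolean:
    // values of 0.5 or above become 1, everything else becomes 0
    static int toBoolean(double output) {
        return output >= 0.5 ? 1 : 0;
    }

    public static void main(String[] args) {
        double[] actual = { 0.00278, 0.99037, 0.98368, 0.00116 };
        for (double a : actual) {
            System.out.println(a + " -> " + toBoolean(a));
        }
    }
}
```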

Because the neural network is initialized to random values, the final output will be different on a second run of the program.

```
Neural Network Results:
0.0,0.0, actual=0.005489822214926685,ideal=0.0
1.0,0.0, actual=0.985425090860287,ideal=1.0
0.0,1.0, actual=0.9888064742994463,ideal=1.0
1.0,1.0, actual=0.005923146369557053,ideal=0.0
```

Above, you see a second run of the program. The output is slightly different. This is normal.

This is the first Encog example. You can see the complete program in Listing 1.1.

**Listing 1.1: Solve XOR with RPROP**

```java
// /org/encog/examples/neural/xorresilient/XORResilient.java
/*
 * Encog Artificial Intelligence Framework v2.x
 * Java Examples
 * http://www.heatonresearch.com/encog/
 * http://code.google.com/p/encog-java/
 *
 * Copyright 2008-2009, Heaton Research Inc., and individual contributors.
 * See the copyright.txt in the distribution for a full listing of
 * individual contributors.
 *
 * This is free software; you can redistribute it and/or modify it
 * under the terms of the GNU Lesser General Public License as
 * published by the Free Software Foundation; either version 2.1 of
 * the License, or (at your option) any later version.
 *
 * This software is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
 * Lesser General Public License for more details.
 *
 * You should have received a copy of the GNU Lesser General Public
 * License along with this software; if not, write to the Free
 * Software Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA
 * 02110-1301 USA, or see the FSF site: http://www.fsf.org.
 */
package org.encog.examples.neural.xorresilient;

import org.encog.neural.data.NeuralData;
import org.encog.neural.data.NeuralDataPair;
import org.encog.neural.data.NeuralDataSet;
import org.encog.neural.data.basic.BasicNeuralDataSet;
import org.encog.neural.networks.BasicNetwork;
import org.encog.neural.networks.layers.BasicLayer;
import org.encog.neural.networks.training.Train;
import org.encog.neural.networks.training.propagation.resilient.ResilientPropagation;
import org.encog.util.logging.Logging;

/**
 * XOR: This example is essentially the "Hello World" of neural network
 * programming. This example shows how to construct an Encog neural
 * network to predict the output from the XOR operator. This example
 * uses resilient propagation (RPROP) to train the neural network.
 * RPROP is the best general purpose supervised training method provided by
 * Encog.
 *
 * @author $Author$
 * @version $Revision$
 */
public class XORResilient {

    public static double XOR_INPUT[][] = {
        { 0.0, 0.0 },
        { 1.0, 0.0 },
        { 0.0, 1.0 },
        { 1.0, 1.0 } };

    public static double XOR_IDEAL[][] = {
        { 0.0 },
        { 1.0 },
        { 1.0 },
        { 0.0 } };

    public static void main(final String args[]) {
        Logging.stopConsoleLogging();

        BasicNetwork network = new BasicNetwork();
        network.addLayer(new BasicLayer(2));
        network.addLayer(new BasicLayer(2));
        network.addLayer(new BasicLayer(1));
        network.getStructure().finalizeStructure();
        network.reset();

        NeuralDataSet trainingSet = new BasicNeuralDataSet(XOR_INPUT, XOR_IDEAL);

        // train the neural network
        final Train train = new ResilientPropagation(network, trainingSet);

        int epoch = 1;
        do {
            train.iteration();
            System.out.println("Epoch #" + epoch + " Error:" + train.getError());
            epoch++;
        } while(train.getError() > 0.01);

        // test the neural network
        System.out.println("Neural Network Results:");
        for(NeuralDataPair pair: trainingSet) {
            final NeuralData output = network.compute(pair.getInput());
            System.out.println(pair.getInput().getData(0) + "," + pair.getInput().getData(1)
                + ", actual=" + output.getData(0) + ",ideal=" + pair.getIdeal().getData(0));
        }
    }
}
```
