
    You are now familiar with all of the layer and synapse types supported by Encog. This section gives a brief introduction to building artificial neural networks (ANNs) with them. You will see how to construct several neural network types, starting with one that learns the XOR operator. XOR is a small problem, but it is a good enough introduction to neural network architectures for now. We will see more interesting examples as the book progresses. We begin with the feedforward neural network.

Creating Feedforward Neural Networks

    The feedforward neural network is one of the oldest types of neural networks still in common use. It is often called a perceptron or, when it has hidden layers, a multilayer perceptron. The feedforward neural network works by placing one or more hidden layers between an input layer and an output layer. Figure 2.7 shows an Encog Workbench diagram of a feedforward neural network.

Figure 2.7: The Feedforward Neural Network


    Listing 2.3 shows a simple example of a feedforward neural network learning to recognize the XOR operator.

// /org/encog/examples/neural/xorresilient/XORResilient.java
/*
 * Encog Artificial Intelligence Framework v2.x
 * Java Examples
 * http://www.heatonresearch.com/encog/
 * http://code.google.com/p/encog-java/
 * 
 * Copyright 2008-2009, Heaton Research Inc., and individual contributors.
 * See the copyright.txt in the distribution for a full listing of 
 * individual contributors.
 *
 * This is free software; you can redistribute it and/or modify it
 * under the terms of the GNU Lesser General Public License as
 * published by the Free Software Foundation; either version 2.1 of
 * the License, or (at your option) any later version.
 *
 * This software is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
 * Lesser General Public License for more details.
 *
 * You should have received a copy of the GNU Lesser General Public
 * License along with this software; if not, write to the Free
 * Software Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA
 * 02110-1301 USA, or see the FSF site: http://www.fsf.org.
 */

package org.encog.examples.neural.xorresilient;

import org.encog.neural.data.NeuralData;
import org.encog.neural.data.NeuralDataPair;
import org.encog.neural.data.NeuralDataSet;
import org.encog.neural.data.basic.BasicNeuralDataSet;
import org.encog.neural.networks.BasicNetwork;
import org.encog.neural.networks.layers.BasicLayer;
import org.encog.neural.networks.training.Train;
import org.encog.neural.networks.training.propagation.back.Backpropagation;
import org.encog.neural.networks.training.propagation.manhattan.ManhattanPropagation;
import org.encog.neural.networks.training.propagation.resilient.ResilientPropagation;
import org.encog.neural.networks.training.strategy.Greedy;
import org.encog.util.logging.Logging;

/**
 * XOR: This example is essentially the "Hello World" of neural network
 * programming.  This example shows how to construct an Encog neural
 * network to predict the output from the XOR operator.  This example
 * uses resilient propagation (RPROP) to train the neural network.
 * RPROP is the best general purpose supervised training method provided by
 * Encog.
 * 
 * @author $Author$
 * @version $Revision$
 */
public class XORResilient {

	public static double XOR_INPUT[][] = { { 0.0, 0.0 }, { 1.0, 0.0 },
			{ 0.0, 1.0 }, { 1.0, 1.0 } };

	public static double XOR_IDEAL[][] = { { 0.0 }, { 1.0 }, { 1.0 }, { 0.0 } };

	public static void main(final String args[]) {
		
		Logging.stopConsoleLogging();
		
		BasicNetwork network = new BasicNetwork();
		network.addLayer(new BasicLayer(2));
		network.addLayer(new BasicLayer(2));
		network.addLayer(new BasicLayer(1));
		network.getStructure().finalizeStructure();
		network.reset();

		NeuralDataSet trainingSet = new BasicNeuralDataSet(XOR_INPUT, XOR_IDEAL);
		
		// train the neural network
		final Train train = new ResilientPropagation(network, trainingSet);

		
		int epoch = 1;

		do {
			train.iteration();
			System.out
					.println("Epoch #" + epoch + " Error:" + train.getError());
			epoch++;
		} while(train.getError() > 0.01);

		// test the neural network
		System.out.println("Neural Network Results:");
		for(NeuralDataPair pair: trainingSet ) {
			final NeuralData output = network.compute(pair.getInput());
			System.out.println(pair.getInput().getData(0) + "," + pair.getInput().getData(1)
					+ ", actual=" + output.getData(0) + ",ideal=" + pair.getIdeal().getData(0));
		}
	}
}

    As you can see from the above listing, constructing a three-layer feedforward neural network takes only a few lines. Three new BasicLayer objects are created and added to the neural network with calls to the addLayer method: an input layer with two neurons, a hidden layer with two neurons, and an output layer with one neuron. Because no synapse type is specified, the three layers are connected together using the WeightedSynapse.
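
    To make the layer structure concrete, the following is a minimal sketch in plain Java (no Encog classes) of what a 2-2-1 feedforward network computes when a pattern is presented to it. The sigmoid activation function and the hand-picked weights and threshold (bias) values are assumptions chosen for illustration. They are not the values Encog would learn, and Encog's layers may use a different activation function.

```java
/** Hand-wired 2-2-1 feedforward network. All weights are illustrative assumptions. */
class XorForwardSketch {

    // Hidden layer: neuron 0 approximates OR, neuron 1 approximates AND.
    static final double[][] HIDDEN_WEIGHTS = { { 20.0, 20.0 }, { 20.0, 20.0 } };
    static final double[] HIDDEN_BIAS = { -10.0, -30.0 };

    // Output layer: fires when the OR neuron is on and the AND neuron is off.
    static final double[] OUTPUT_WEIGHTS = { 20.0, -20.0 };
    static final double OUTPUT_BIAS = -10.0;

    static double sigmoid(double x) {
        return 1.0 / (1.0 + Math.exp(-x));
    }

    /** Propagates one input pattern through the hidden layer to the single output. */
    static double compute(double a, double b) {
        double[] hidden = new double[2];
        for (int h = 0; h < 2; h++) {
            double sum = HIDDEN_WEIGHTS[h][0] * a + HIDDEN_WEIGHTS[h][1] * b + HIDDEN_BIAS[h];
            hidden[h] = sigmoid(sum);
        }
        double sum = OUTPUT_WEIGHTS[0] * hidden[0] + OUTPUT_WEIGHTS[1] * hidden[1] + OUTPUT_BIAS;
        return sigmoid(sum);
    }

    public static void main(String[] args) {
        double[][] inputs = { { 0, 0 }, { 1, 0 }, { 0, 1 }, { 1, 1 } };
        for (double[] in : inputs) {
            System.out.println(in[0] + "," + in[1] + " -> " + compute(in[0], in[1]));
        }
    }
}
```

    With these weights the output is close to zero for the inputs 0,0 and 1,1 and close to one for 1,0 and 0,1. Training, described next, is the process of finding such weights automatically instead of choosing them by hand.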

    You will notice that after the neural network is constructed, it is trained. There are quite a few ways to train a neural network in Encog. Training is the process where the weights and thresholds are adjusted to values that will produce the desired output from the neural network. This example uses resilient propagation (RPROP) training. RPROP is the best choice for most neural networks to be trained with Encog. For certain special cases, some of the other training types may be used.

// train the neural network 		
final Train train = new ResilientPropagation(network, trainingSet); 	

    With the trainer set up, we must now cycle through a number of iterations, or epochs. Each of these training iterations should decrease the “error” of the neural network, although it can occasionally rise slightly from one epoch to the next. The error measures the difference between the current actual output of the neural network and the desired output.

int epoch = 1;
do {
    train.iteration();
    System.out.println("Epoch #" + epoch + " Error:" + train.getError());
    epoch++;

    Continue training the neural network so long as the error rate is greater than one percent.

} while(train.getError() > 0.01);
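
    A loop that stops only on an error threshold will run forever if training never reaches that threshold. The following is a minimal sketch of the same loop with an added cap on the number of epochs. The IterativeTrainer interface and the HalvingTrainer class are hypothetical stand-ins written for this sketch so that it runs without Encog. They are not Encog's Train interface, and the halving error is invented purely for illustration.

```java
/** Training loop with an epoch cap. The trainer types here are hypothetical stand-ins. */
class CappedTrainingLoop {

    /** Minimal trainer contract assumed for this sketch (not Encog's Train). */
    interface IterativeTrainer {
        void iteration();
        double getError();
    }

    /** Fake trainer whose error simply halves on every iteration. */
    static final class HalvingTrainer implements IterativeTrainer {
        private double error = 1.0;

        public void iteration() {
            error /= 2.0;
        }

        public double getError() {
            return error;
        }
    }

    /** Runs until the error target or the epoch cap is reached and returns the epochs used. */
    static int train(IterativeTrainer trainer, double targetError, int maxEpochs) {
        int epoch = 0;
        do {
            trainer.iteration();
            epoch++;
        } while (trainer.getError() > targetError && epoch < maxEpochs);
        return epoch;
    }

    public static void main(String[] args) {
        IterativeTrainer trainer = new HalvingTrainer();
        int epochs = train(trainer, 0.01, 5000);
        System.out.println("Stopped after " + epochs + " epochs, error=" + trainer.getError());
    }
}
```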

    Now that the neural network has been trained, we should test it. To do this, the same data that the neural network was trained with is presented to the neural network. The following code does this.

// test the neural network
System.out.println("Neural Network Results:");
for (NeuralDataPair pair : trainingSet) {
    final NeuralData output = network.compute(pair.getInput());
    System.out.println(pair.getInput().getData(0) + "," + pair.getInput().getData(1)
            + ", actual=" + output.getData(0) + ",ideal=" + pair.getIdeal().getData(0));
}

    This will produce the following output:

Epoch #1 Error:0.9902997764512583 
Epoch #2 Error:0.6762359214192293 
Epoch #3 Error:0.49572129129302844 
Epoch #4 Error:0.49279160045197135 
Epoch #5 Error:0.5063357328001542 
Epoch #6 Error:0.502484567412553 
Epoch #7 Error:0.4919515177527043 
Epoch #8 Error:0.49157058621332506 
Epoch #9 Error:0.48883664423510526 
Epoch #10 Error:0.48977067420698456 
Epoch #11 Error:0.4895238942630234 
Epoch #12 Error:0.4870271073515729 
Epoch #13 Error:0.48534672846811844 
Epoch #14 Error:0.4837776485977757 
Epoch #15 Error:0.48184530627656685 
Epoch #16 Error:0.47980242878514856 
Epoch #17 Error:0.47746641141708474 
Epoch #18 Error:0.4748474362926616 
Epoch #19 Error:0.47162728117571795 
Epoch #20 Error:0.46807640808835427 
... 
Epoch #495 Error:0.010583637636670955 
Epoch #496 Error:0.010748859630158925 
Epoch #497 Error:0.010342203029249158 
Epoch #498 Error:0.00997945501479827 
Neural Network Results:
0.0,0.0, actual=0.005418223644461675,ideal=0.0
1.0,0.0, actual=0.9873413174817033,ideal=1.0
0.0,1.0, actual=0.9863636878918781,ideal=1.0
1.0,1.0, actual=0.007650291171204077,ideal=0.0

    As you can see, the error starts off high and, apart from occasional small increases, decreases until it drops below 0.01. Finally, the patterns are presented to the neural network. As you can see, the neural network can handle the XOR operator. It does not produce the exact output it was trained with, but it is very close. The values 0.0054 and 0.0077 are very close to zero, just as 0.987 and 0.986 are very close to one.
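
    To put a single number on "very close", the difference between actual and ideal outputs can be summarized as one error value. The following sketch computes a root mean square (RMS) error over the four outputs listed above. The choice of RMS is an assumption made for illustration. The text does not state which formula train.getError() uses, so this value should not be expected to match the reported training error exactly.

```java
/** Summarizes how far the network outputs are from the ideal outputs. */
class XorErrorSketch {

    /** Root mean square difference between actual and ideal outputs. */
    static double rmsError(double[] actual, double[] ideal) {
        double sumSquares = 0.0;
        for (int i = 0; i < actual.length; i++) {
            double diff = actual[i] - ideal[i];
            sumSquares += diff * diff;
        }
        return Math.sqrt(sumSquares / actual.length);
    }

    public static void main(String[] args) {
        // Actual and ideal outputs copied from the test run shown above.
        double[] actual = { 0.005418223644461675, 0.9873413174817033,
                0.9863636878918781, 0.007650291171204077 };
        double[] ideal = { 0.0, 1.0, 1.0, 0.0 };
        System.out.println("RMS error: " + rmsError(actual, ideal));
    }
}
```

    For these four outputs the result is roughly 0.01, the same order of magnitude as the error reported in the final training epoch.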

    For this network, we are testing the neural network with exactly the same data that the neural network was trained with. Generally, this is a very bad practice. You want to test the neural network on data that it was not trained with. This lets you see how the neural network is performing with new data that it has never processed before. However, the XOR function only has four possible combinations, and they all represent unique patterns that the network must be trained for. The neural network in the next example will not use all of its data for training. The neural network will be tested on data it has never been presented with before.

Creating Self-Connected Neural Networks

    The Hopfield neural network is a good example of a self-connected neural network. The Hopfield neural network contains a single layer of neurons. This layer is connected to itself. Every neuron on the layer is connected to every other neuron on the same layer. However, no neuron is connected to itself. Figure 2.8 shows a Hopfield neural network diagrammed in the Encog Workbench.

Figure 2.8: The Hopfield Neural Network


    Listing 2.4 shows a simple example of a Hopfield neural network learning to recognize patterns.

// /org/encog/examples/neural/hopfield/HopfieldAssociate.java
package org.encog.examples.neural.hopfield;

import org.encog.neural.data.bipolar.BiPolarNeuralData;
import org.encog.neural.networks.BasicNetwork;
import org.encog.neural.networks.logic.HopfieldLogic;
import org.encog.neural.pattern.HopfieldPattern;

/**
 * Simple class to recognize some patterns with a Hopfield Neural Network.
 * This is very loosely based on an example by Karsten Kutza, 
 * written in C on 1996-01-30.
 * http://www.neural-networks-at-your-fingertips.com/hopfield.html
 * 
 * I translated it to Java and adapted it to use Encog for neural
 * network processing.  I mainly kept the patterns from the 
 * original example.
 *
 */
public class HopfieldAssociate {

	final static int HEIGHT = 10;
	final static int WIDTH = 10;
	
	/**
	 * The neural network will learn these patterns.
	 */
	public static final String[][] PATTERN  = { { 
		"O O O O O ",
        " O O O O O",
        "O O O O O ",
        " O O O O O",
        "O O O O O ",
        " O O O O O",
        "O O O O O ",
        " O O O O O",
        "O O O O O ",
        " O O O O O"  },

      { "OO  OO  OO",
        "OO  OO  OO",
        "  OO  OO  ",
        "  OO  OO  ",
        "OO  OO  OO",
        "OO  OO  OO",
        "  OO  OO  ",
        "  OO  OO  ",
        "OO  OO  OO",
        "OO  OO  OO"  },

      { "OOOOO     ",
        "OOOOO     ",
        "OOOOO     ",
        "OOOOO     ",
        "OOOOO     ",
        "     OOOOO",
        "     OOOOO",
        "     OOOOO",
        "     OOOOO",
        "     OOOOO"  },

      { "O  O  O  O",
        " O  O  O  ",
        "  O  O  O ",
        "O  O  O  O",
        " O  O  O  ",
        "  O  O  O ",
        "O  O  O  O",
        " O  O  O  ",
        "  O  O  O ",
        "O  O  O  O"  },

      { "OOOOOOOOOO",
        "O        O",
        "O OOOOOO O",
        "O O    O O",
        "O O OO O O",
        "O O OO O O",
        "O O    O O",
        "O OOOOOO O",
        "O        O",
        "OOOOOOOOOO"  } };

	/**
	 * The neural network will be tested on these patterns, to see
	 * which of the last set they are the closest to.
	 */
	public static final String[][] PATTERN2 = { { 
		"          ",
        "          ",
        "          ",
        "          ",
        "          ",
        " O O O O O",
        "O O O O O ",
        " O O O O O",
        "O O O O O ",
        " O O O O O"  },

      { "OOO O    O",
        " O  OOO OO",
        "  O O OO O",
        " OOO   O  ",
        "OO  O  OOO",
        " O OOO   O",
        "O OO  O  O",
        "   O OOO  ",
        "OO OOO  O ",
        " O  O  OOO"  },

      { "OOOOO     ",
        "O   O OOO ",
        "O   O OOO ",
        "O   O OOO ",
        "OOOOO     ",
        "     OOOOO",
        " OOO O   O",
        " OOO O   O",
        " OOO O   O",
        "     OOOOO"  },

      { "O  OOOO  O",
        "OO  OOOO  ",
        "OOO  OOOO ",
        "OOOO  OOOO",
        " OOOO  OOO",
        "  OOOO  OO",
        "O  OOOO  O",
        "OO  OOOO  ",
        "OOO  OOOO ",
        "OOOO  OOOO"  },

      { "OOOOOOOOOO",
        "O        O",
        "O        O",
        "O        O",
        "O   OO   O",
        "O   OO   O",
        "O        O",
        "O        O",
        "O        O",
        "OOOOOOOOOO"  } };

	public BiPolarNeuralData convertPattern(String[][] data, int index)
	{
		int resultIndex = 0;
		BiPolarNeuralData result = new BiPolarNeuralData(WIDTH*HEIGHT);
		for(int row=0;row<HEIGHT;row++)
		{
			for(int col=0;col<WIDTH;col++)
			{
				char ch = data[index][row].charAt(col);
				result.setData(resultIndex++, ch=='O');
			}
		}
		return result;
	}
	
	public void display(BiPolarNeuralData pattern1,BiPolarNeuralData pattern2)
	{
		int index1 = 0;
		int index2 = 0;
		
		for(int row = 0;row<HEIGHT;row++)
		{
			StringBuilder line = new StringBuilder();
			
			for(int col = 0;col<WIDTH;col++)
			{
				if(pattern1.getBoolean(index1++))
					line.append('O');
				else
					line.append(' ');
			}
			
			line.append("   ->   ");
			
			for(int col = 0;col<WIDTH;col++)
			{
				if(pattern2.getBoolean(index2++))
					line.append('O');
				else
					line.append(' ');
			}
			
			
			
			System.out.println(line.toString());
		}
	}

	
	public void evaluate(BasicNetwork hopfield, String[][] pattern)
	{
		HopfieldLogic hopfieldLogic = (HopfieldLogic)hopfield.getLogic();
		for(int i=0;i<pattern.length;i++)
		{
			BiPolarNeuralData pattern1 = convertPattern(pattern,i);
			hopfieldLogic.setCurrentState(pattern1);
			int cycles = hopfieldLogic.runUntilStable(100);
			BiPolarNeuralData pattern2 = (BiPolarNeuralData)hopfieldLogic.getCurrentState();
			System.out.println("Cycles until stable(max 100): " + cycles + ", result=");
			display( pattern1, pattern2);
			System.out.println("----------------------");
		}
	}
	
	public void run()
	{
		HopfieldPattern pattern = new HopfieldPattern();
		pattern.setInputNeurons(WIDTH*HEIGHT);
		BasicNetwork hopfield = pattern.generate();
		HopfieldLogic hopfieldLogic = (HopfieldLogic)hopfield.getLogic();

		for(int i=0;i<PATTERN.length;i++)
		{
			hopfieldLogic.addPattern(convertPattern(PATTERN,i));
		}
		
		evaluate(hopfield,PATTERN);
		evaluate(hopfield,PATTERN2);
	}
	
	public static void main(String[] args)
	{
		HopfieldAssociate program = new HopfieldAssociate();
		program.run();
	}
	
}

    The Hopfield example begins by creating a HopfieldPattern object. The pattern classes allow common types of neural networks to be constructed automatically. You simply provide the parameters for the type of neural network you wish to create, and the pattern takes care of setting up layers, synapses, parameters and tags.

HopfieldPattern pattern = new HopfieldPattern();

    This Hopfield neural network is going to recognize graphic patterns. These graphic patterns are mapped onto grids, and the network needs one input neuron for each cell in the grid. The number of input neurons is therefore the width times the height, which is 10 × 10 = 100 in this example.

pattern.setInputNeurons(WIDTH*HEIGHT); 	

    The Hopfield pattern requires very little input, just the number of input neurons. Other patterns will require more parameters. Now that the HopfieldPattern has been provided with all that it needs, the generate method can be called to create the neural network.

BasicNetwork hopfield = pattern.generate(); 	

    The logic object is obtained for the Hopfield network.

HopfieldLogic hopfieldLogic = (HopfieldLogic)hopfield.getLogic();

    The logic class is used to add the patterns that the neural network is to be trained on. This is similar to the training seen in the last section, except that it happens much faster for the simple Hopfield neural network: each pattern is added in a single call rather than through many training iterations.

for(int i=0;i<PATTERN.length;i++){ 		
  hopfieldLogic.addPattern(convertPattern(PATTERN,i)); 		
} 		 
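
    The following is a minimal sketch in plain Java of how a pattern can be stored in a Hopfield weight matrix using the classic Hebbian rule, in which each weight is increased by the product of the two neuron values it connects. That addPattern works this way internally is an assumption; the text does not show its implementation. The sketch also illustrates the connectivity described earlier: every neuron is connected to every other neuron, and the diagonal stays at zero because no neuron is connected to itself.

```java
import java.util.Arrays;

/** Stores bipolar patterns in a Hopfield-style weight matrix (Hebbian rule, assumed). */
class HopfieldWeightsSketch {

    /** Adds one bipolar pattern (+1/-1) to a square weight matrix, leaving the diagonal at zero. */
    static void addPattern(double[][] weights, int[] pattern) {
        for (int i = 0; i < pattern.length; i++) {
            for (int j = 0; j < pattern.length; j++) {
                if (i != j) {
                    weights[i][j] += pattern[i] * pattern[j];
                }
            }
        }
    }

    public static void main(String[] args) {
        double[][] weights = new double[4][4];
        addPattern(weights, new int[] { 1, -1, 1, -1 });
        for (double[] row : weights) {
            System.out.println(Arrays.toString(row));
        }
    }
}
```

    Because storing a pattern is a single pass over the matrix, no iterative error-driven loop is needed, which is why this kind of "training" is so much faster than the propagation training used for the feedforward network.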

    Now that the network has been “trained” we will test it. Just like in the last section, we will evaluate the neural network with the same data with which it was trained.

evaluate(hopfield,PATTERN); 	

    However, in addition to the data that the network has already been presented with, we will also present new data. This new data consists of distorted images of the patterns that the network was trained on. The network should still be able to recognize the patterns, even though they are distorted.

evaluate(hopfield,PATTERN2);

    The following shows the output of the Hopfield neural network. The Hopfield neural network is first presented with the patterns it was trained on, and it simply echoes these patterns. Next, the Hopfield neural network is presented with distorted versions of the patterns with which it was trained. As you can see from the output below, the Hopfield neural network still recognizes the patterns.

Cycles until stable(max 100): 1, result=
O O O O O    ->   O O O O O 
 O O O O O   ->    O O O O O
O O O O O    ->   O O O O O 
 O O O O O   ->    O O O O O
O O O O O    ->   O O O O O 
 O O O O O   ->    O O O O O
O O O O O    ->   O O O O O 
 O O O O O   ->    O O O O O
O O O O O    ->   O O O O O 
 O O O O O   ->    O O O O O
----------------------
Cycles until stable(max 100): 1, result=
OO  OO  OO   ->   OO  OO  OO
OO  OO  OO   ->   OO  OO  OO
  OO  OO     ->     OO  OO  
  OO  OO     ->     OO  OO  
OO  OO  OO   ->   OO  OO  OO
OO  OO  OO   ->   OO  OO  OO
  OO  OO     ->     OO  OO  
  OO  OO     ->     OO  OO  
OO  OO  OO   ->   OO  OO  OO
OO  OO  OO   ->   OO  OO  OO
----------------------
Cycles until stable(max 100): 1, result=
OOOOO        ->   OOOOO     
OOOOO        ->   OOOOO     
OOOOO        ->   OOOOO     
OOOOO        ->   OOOOO     
OOOOO        ->   OOOOO     
     OOOOO   ->        OOOOO
     OOOOO   ->        OOOOO
     OOOOO   ->        OOOOO
     OOOOO   ->        OOOOO
     OOOOO   ->        OOOOO
----------------------
Cycles until stable(max 100): 1, result=
O  O  O  O   ->   O  O  O  O
 O  O  O     ->    O  O  O  
  O  O  O    ->     O  O  O 
O  O  O  O   ->   O  O  O  O
 O  O  O     ->    O  O  O  
  O  O  O    ->     O  O  O 
O  O  O  O   ->   O  O  O  O
 O  O  O     ->    O  O  O  
  O  O  O    ->     O  O  O 
O  O  O  O   ->   O  O  O  O
----------------------
Cycles until stable(max 100): 1, result=
OOOOOOOOOO   ->   OOOOOOOOOO
O        O   ->   O        O
O OOOOOO O   ->   O OOOOOO O
O O    O O   ->   O O    O O
O O OO O O   ->   O O OO O O
O O OO O O   ->   O O OO O O
O O    O O   ->   O O    O O
O OOOOOO O   ->   O OOOOOO O
O        O   ->   O        O
OOOOOOOOOO   ->   OOOOOOOOOO
----------------------
Cycles until stable(max 100): 2, result=
             ->   O O O O O 
             ->    O O O O O
             ->   O O O O O 
             ->    O O O O O
             ->   O O O O O 
 O O O O O   ->    O O O O O
O O O O O    ->   O O O O O 
 O O O O O   ->    O O O O O
O O O O O    ->   O O O O O 
 O O O O O   ->    O O O O O
----------------------
Cycles until stable(max 100): 2, result=
OOO O    O   ->   OO  OO  OO
 O  OOO OO   ->   OO  OO  OO
  O O OO O   ->     OO  OO  
 OOO   O     ->     OO  OO  
OO  O  OOO   ->   OO  OO  OO
 O OOO   O   ->   OO  OO  OO
O OO  O  O   ->     OO  OO  
   O OOO     ->     OO  OO  
OO OOO  O    ->   OO  OO  OO
 O  O  OOO   ->   OO  OO  OO
----------------------
Cycles until stable(max 100): 2, result=
OOOOO        ->   OOOOO     
O   O OOO    ->   OOOOO     
O   O OOO    ->   OOOOO     
O   O OOO    ->   OOOOO     
OOOOO        ->   OOOOO     
     OOOOO   ->        OOOOO
 OOO O   O   ->        OOOOO
 OOO O   O   ->        OOOOO
 OOO O   O   ->        OOOOO
     OOOOO   ->        OOOOO
----------------------
Cycles until stable(max 100): 2, result=
O  OOOO  O   ->   O  O  O  O
OO  OOOO     ->    O  O  O  
OOO  OOOO    ->     O  O  O 
OOOO  OOOO   ->   O  O  O  O
 OOOO  OOO   ->    O  O  O  
  OOOO  OO   ->     O  O  O 
O  OOOO  O   ->   O  O  O  O
OO  OOOO     ->    O  O  O  
OOO  OOOO    ->     O  O  O 
OOOO  OOOO   ->   O  O  O  O
----------------------
Cycles until stable(max 100): 2, result=
OOOOOOOOOO   ->   OOOOOOOOOO
O        O   ->   O        O
O        O   ->   O OOOOOO O
O        O   ->   O O    O O
O   OO   O   ->   O O OO O O
O   OO   O   ->   O O OO O O
O        O   ->   O O    O O
O        O   ->   O OOOOOO O
O        O   ->   O        O
OOOOOOOOOO   ->   OOOOOOOOOO
----------------------

    As you can see, the neural network can recognize the distorted values as well as those values with which it was trained. This is a much more comprehensive test than was performed in the previous section.

    The program code for the evaluate method will now be examined. This shows how to present a pattern to the neural network.

public void evaluate(BasicNetwork hopfield, String[][] pattern) 	
{ 		

    First the logic object is obtained.

  HopfieldLogic hopfieldLogic = (HopfieldLogic)hopfield.getLogic(); 	

    Loop over all of the patterns and present each to the neural network.

  for(int i=0;i<pattern.length;i++) 		
  { 		
    BiPolarNeuralData pattern1 = convertPattern(pattern,i); 

    The pattern is obtained from the array and converted to a form that can be presented to the neural network. The graphic patterns are binary: each pixel is either on or off. To convert the image, each pixel is mapped to a number. We are using bipolar numbers, so a displayed pixel becomes 1 and a hidden pixel becomes -1.
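
    As a minimal sketch of this conversion, the following plain Java method maps one row of a pattern to bipolar values. It uses an int array instead of Encog's BiPolarNeuralData so that it runs on its own. The class and method names are made up for this illustration.

```java
import java.util.Arrays;

/** Converts a text row of 'O' and space characters to bipolar values. */
class BipolarSketch {

    /** Maps each character to +1 for 'O' (displayed pixel) or -1 for anything else. */
    static int[] toBipolar(String row) {
        int[] result = new int[row.length()];
        for (int i = 0; i < row.length(); i++) {
            result[i] = (row.charAt(i) == 'O') ? 1 : -1;
        }
        return result;
    }

    public static void main(String[] args) {
        // Prints [1, -1, -1, 1, -1, -1, 1, -1, -1, 1]
        System.out.println(Arrays.toString(toBipolar("O  O  O  O")));
    }
}
```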

    The Hopfield neural network has a current state. The neurons will be at either 1 or -1 level. The current state of the Hopfield network is set to the pattern that we want to recognize.

    hopfieldLogic.setCurrentState(pattern1); 

    The Hopfield network will be run until it stabilizes. A Hopfield network adjusts its state until the state no longer changes. At this point it has stabilized. The Hopfield neural network will typically stabilize on one of the patterns that it was trained on, although this is not guaranteed for every input. The following code will run the Hopfield network until it stabilizes, up to 100 iterations.

    int cycles = hopfieldLogic.runUntilStable(100); 		
    BiPolarNeuralData pattern2 = (BiPolarNeuralData)hopfieldLogic.getCurrentState(); 
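
    The following is a minimal sketch in plain Java of what running until stable can look like. It assumes Hebbian weights for a single stored pattern and a synchronous update in which every neuron takes the sign of its weighted input. Both are assumptions made for illustration and are not taken from Encog's HopfieldLogic source. The cycle in which no change is detected is counted, which is consistent with the output above reporting one cycle for stored patterns and two for distorted ones.

```java
import java.util.Arrays;

/** Recall in a tiny Hopfield-style network; the update rule is an assumed, simplified one. */
class HopfieldRecallSketch {

    /** Builds Hebbian weights for a single bipolar pattern, with a zero diagonal. */
    static int[][] train(int[] pattern) {
        int n = pattern.length;
        int[][] weights = new int[n][n];
        for (int i = 0; i < n; i++) {
            for (int j = 0; j < n; j++) {
                weights[i][j] = (i == j) ? 0 : pattern[i] * pattern[j];
            }
        }
        return weights;
    }

    /** One synchronous update: each neuron becomes +1 if its weighted input is positive, else -1. */
    static int[] step(int[][] weights, int[] state) {
        int[] next = new int[state.length];
        for (int i = 0; i < state.length; i++) {
            int sum = 0;
            for (int j = 0; j < state.length; j++) {
                sum += weights[i][j] * state[j];
            }
            next[i] = (sum > 0) ? 1 : -1;
        }
        return next;
    }

    /** Updates the state in place until it stops changing or maxCycles is reached; returns cycles used. */
    static int runUntilStable(int[][] weights, int[] state, int maxCycles) {
        int cycles = 0;
        while (cycles < maxCycles) {
            cycles++;
            int[] next = step(weights, state);
            if (Arrays.equals(next, state)) {
                break;
            }
            System.arraycopy(next, 0, state, 0, state.length);
        }
        return cycles;
    }

    public static void main(String[] args) {
        int[] stored = { 1, 1, 1, -1, -1, -1 };
        int[][] weights = train(stored);
        int[] distorted = { 1, -1, 1, -1, -1, -1 }; // one value flipped
        int cycles = runUntilStable(weights, distorted, 100);
        System.out.println("Cycles: " + cycles + ", state=" + Arrays.toString(distorted));
    }
}
```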

    Once the network's state has stabilized it is displayed.

    System.out.println("Cycles until stable(max 100): " + cycles + ", result="); 			
    display( pattern1, pattern2); 			
    System.out.println("----------------------");
  }
}

    These are just a few of the neural network types that can be constructed with Encog. As the book progresses, you will learn many more.


Copyright 2005 - 2012 by Heaton Research, Inc. Heaton Research™ and Encog™ are trademarks of Heaton Research.