Implementing Local Minima Escape
We will now examine an example that uses both simulated annealing and genetic algorithms to help find the most optimal weight matrix. For this example we will revisit the XOR example that was shown in Chapter 3. This example program used a multi-layer feedforward backpropagation neural network to solve the XOR logic gate. This application can also be applied to other logic gates simply by modifying its training grid. For this example we will assume that you are using the XOR training grid provided with this book.
The example program looks similar to the example program from Chapter 3. This program is shown in Figure 10.1.

Figure 10.3: The Genetic/Annealing XOR Example Program
The primary difference is two additional regions just below the top of the application. These areas allow you to control the simulated annealing and genetic algorithm portions of the program.
Just as with the XOR example program from Chapter 3 you are able to begin a training cycle. Once you begin a training cycle, by clicking the "Train" button, the program will begin training the neural network. As the neural network is trained you will see the current error for the neural network. This error should decrease as the program runs.
Usually this error number will decrease to well below 10%. Once this happens you have a sufficiently trained neural network. If this happens you should click the "Randomize" button and begin training again. In this particular case we are looking for a weight matrix that will get stuck in a local minimum and not decrease below 10%.
In addition to the backpropgation training that was provided in Chapter 3, this neural network can also be trained in other ways. Clicking on the "Anneal" button will begin training the neural network using the simulated annealing algorithm. Clicking on the "Genetic" button will begin training the neural network using a genetic algorithm.
If you run both the simulated annealing and genetic algorithms you will notice a considerable difference in the outputs of each. The simulated annealing routine will execute much faster, usually within a few seconds a weight matrix will be established. This weight matrix will almost always be lower in error than the weight matrix determined by the backpropagation algorithm. This is because the simulated annealing algorithm was not caught up by local minima.
Executing the genetic algorithm will take a considerably longer time to execute than the simulated annealing. If you examine the process list for your computer, while the genetic algorithm is running, you will also notice the genetic algorithm takes up considerably more memory than the simulated annealing algorithm. Because of this simulated annealing has become a very popular means of training neural networks because of its quick execution and low memory requirements.
The example program remains similar to the XOR problem from Chapter 3, however additional user interface elements have been added for simulated annealing and genetic algorithm. We will now examine how this example program was constricted.
You will now be shown how this program was constructed. We will begin with the user interface. The user interface for the simulated annealing and genetic algorithms version of the XOR problem adds some user interface elements to the XOR example shown in Chapter 3. The additional user interface elements are discussed in the next section.
The User Interface
This program has essentially the same user interface as the XOR example shown in Chapter 3. Because of this we will not examine the complete source code of the user interface. The complete source code to the user interface can be found on the companion CD-ROM that came with this book. We will now highlight some of the differences between the user interface source code.
The user interface code is contained in the file XorMinimaExample.java. Rather than just having the "Train" and "Run" buttons, the minima example also contains buttons to allow the program to train using simulated annealing and a genetic algorithm. When the user selects to train using simulated annealing the following method is called.
/**
* Train the neural network with simulated
* annealing.
*/
protected void anneal()
{
double xorData[][] = getGrid();
double xorIdeal[][] = getIdeal();
int update=0;
status.setText("Starting...");
btnRun.setEnabled(false);
btnGenetic.setEnabled(false);
btnAnneal.setEnabled(false);
SimulateAnnealing anneal = new SimulateAnnealing(network);
anneal.start(10,2,100);
anneal.setInput(xorData);
anneal.setIdeal(xorIdeal);
for (int x=0;x<100;x++) {
anneal.anneal();
status.setText("Anneal Cycle = " + x + ",Error = " + anneal.getGlobalError());
}
btnRun.setEnabled(true);
btnGenetic.setEnabled(true);
btnAnneal.setEnabled(true);
}
None of the actual simulated annealing algorithm is actually contained in this method. Rather this method simply serves as a bridge between the user interface code and the SimulateAnnealing class that actually performs the simulated annealing.
As you can see the method begins by disabling the buttons so that the user does not accidentally initiate two training sessions simultaneously. Then the simulated annealing class is instantiated and passed a few parameters. The following line does this.
SimulateAnnealing anneal = new SimulateAnnealing(network);
anneal.start(10,2,100);
The ten and two arguments specify the beginning and ending temperatures. The 100 value specified how many cycles will be used. It is necessary for the algorithm to know how many cycles will be used so that the temperature is decreased in an orderly fashion that will cause it to end at the ending temperature.
The starting and ending temperatures can be altered if the network is not being trained well. Higher values for the starting temperature will cause greater deviations in the weight matrix. This may be necessary to reach acceptable error levels with certain neural networks. Lower ending temperatures often allow the algorithm to fine tune the weight matrix more before the algorithm completes. Some degree of experimentation is generally required to find a optimal set of parameters.
Next the genetic algorithm is given the training data so that the error of the neural network can be determined at each cycle.
anneal.setInput(xorData);
anneal.setIdeal(xorIdeal);
Finally the method runs through 100 cycles and applies the annealing algorithm at each cycle. The global error is reported to the user interface at each cycle.
for (int x=0;x<100;x++) {
anneal.anneal();
status.setText("Anneal Cycle = " + x + ",Error = " + anneal.getGlobalError());
}
The method that is used to train the neural network with a genetic algorithm is very similar to the method just examined. The genetic algorithm method is shown here.
/**
* Train the neural network with a genetic algorithm.
*/
protected void genetic()
{
double xorData[][] = getGrid();
double xorIdeal[][] = getIdeal();
int update=0;
status.setText("Starting...");
Genetic genetic = new Genetic(network);
genetic.setInput(xorData);
genetic.setIdeal(xorIdeal);
genetic.start();
btnRun.setEnabled(false);
btnGenetic.setEnabled(false);
btnAnneal.setEnabled(false);
int x=0;
while (genetic.getGlobalError()>.1) {
x++;
genetic.generation();
status.setText("Genetic Cycle = " + x + ",Error = " + genetic.getGlobalError());
}
btnRun.setEnabled(true);
btnGenetic.setEnabled(true);
btnAnneal.setEnabled(true);
}
Just as was the case with the simulated annealing method the genetic method is merely an interface between the user interface and the class that actually handles the genetic algorithm. It is the Genetic class that actually implements the genetic algorithm.
The Neural Network
Only minor changes were made to the neural network class to support genetic algorithms and simulated annealing. Two method were added to the Network class that allows better access to the weight matrix and threshold values of the neural network.
The simulated annealing and genetic algorithms work best when the weight matrix and thresholds are presented as a linear array of doubles. To do this two new method were added. The "fromArray" method will copy an array to the weight matrix and thresholds of the neural network. The "toArray" method is provided to convert the weight matrixes and thresholds to a simple array. The "toArray" method is shown here.
/**
* Convert to an array. This is used with some training algorithms
* that require that the "memory" of the neuron(the weight and threshold
* values) be expressed as a linear array.
*
* @return The memory of the neuron.
*/
public double []toArray()
{
double result[] = new double[matrix.length+thresholds.length];
for (int i=0;i<matrix.length;i++)
result[i] = matrix[i];
for (int i=0;i<thresholds.length;i++)
result[matrix.length+i] = thresholds[i];
return result;
}
As you can see the "toArray" method does nothing more than allocate a new array that is large enough to hold both the weight matrix and thresholds of the neural network. Then the weight matrix and threshold values are copied to this new array. The "fromArray" method does the opposite. The "fromArray" method is shown here.
/**
* Use an array to populate the memory of the neural network.
*
* @param array An array of doubles.
*/
public void fromArray(double array[])
{
for (int i=0;i<matrix.length;i++)
matrix[i] = array[i];
for (int i=0;i<thresholds.length;i++)
thresholds[i] = array[matrix.length+i];
}
As you can see, the "fromArray" method copies an array, which was created by "toArray", back into the weight matrix and threshold values of the neural network.
These two methods will allow more direct access to the weight matrix and threshold values that will be needed by the simulated annealing and genetic algorithms. We will now examine the classes that implement the simulated annealing and genetic algorithm classes. You were already introduced to these classes back in Chapters 8 and 9. These classes were modified somewhat to be adapted to use with neural networks.
These changes were necessary due to some inherent differences in neural network training and the traveling salesman problem. In the traveling salesman problem an order of cities is what is changing. In neural network training it is the weights that are changing in value. Also, where it was illegal for the traveling salesman to visit a city more than once, this there is no such constraint on neural network training. In the next two sections you will be shown exactly how this training is done. We will begin with simulated annealing.
Simulated Annealing
You will now be shown how the simulated annealing learning algorithm was implemented. The simulated annealing algorithm fits completely in the SimulatedAnnealing class. This file is shown in Listing 10.1.
Listing 10.1: Simulated Annealing (SimulateAnnealing.java)
public class SimulateAnnealing {
protected Network network;
protected double input[][];
protected double ideal[][];
protected double startTemperature;
protected double stopTemperature;
protected int cycles;
protected double globalError;
/**
* The constructor.
*
* @param network The neural network that is to be trained.
*/
public SimulateAnnealing(Network network)
{
this.network = network;
}
/**
* Set the training input.
*
* @param input The data that is to be used to train the neural network.
*/
public void setInput(double input[][])
{
this.input = input;
}
/**
* Set the ideal results for the training data.
*
* @param ideal The ideal results for the training data.
*/
public void setIdeal(double ideal[][])
{
this.ideal = ideal;
}
/**
* The current temperature.
*/
protected double temperature;
/**
* Randomize the weights and thresholds. This function
* does most of the work of the class. Each call to this
* class will randomize the data according to the current
* temperature. The higher the temperature the more randomness.
*/
protected void randomize()
{
double ratio = temperature * Math.sqrt(12);
double array[] = network.toArray();
for (int i=0;i<array.length;i++) {
double add = 0.5 - (Math.random());
add /= startTemperature;
add *= temperature;
array[i] = array[i] + add;
}
network.fromArray(array);
}
/**
* Determine the error of the current weights and
* thresholds.
*/
protected double determineError()
{
for (int i=0;i<input.length;i++) {
network.computeOutputs(input[i]);
network.calcError(ideal[i]);
}
return network.getError(input.length);
}
/**
* Initialize the simulated annealing class.
*
* @param startTemp The starting temperature.
* @param stopTemp The ending temperature.
* @param cycles The number of cycles to use.
*/
public void start(double startTemp,double stopTemp,int cycles)
{
temperature = startTemp;
this.cycles = cycles;
this.startTemperature = startTemp;
this.stopTemperature = stopTemp;
}
/**
* Called to perform one cycle of the annealing process.
*/
public void anneal()
{
double bestArray[];
globalError = determineError();
bestArray = network.toArray();
for (int i=0;i<100;i++) {
double curError;
randomize();
curError = determineError();
if (curError<globalError) {
bestArray = network.toArray();
globalError = curError;
}
}
network.fromArray(bestArray);
double ratio = Math.exp(Math.log(stopTemperature/startTemperature)/(cycles-1));
temperature*=ratio;
}
/**
* Get the current temperature.
*
* @return The current temperature.
*/
public double getTemperature()
{
return temperature;
}
/**
* Get the global error.
*
* @return The global error.
*/
public double getGlobalError()
{
return globalError;
}
}
To understand this class we will begin by looking at the properties. The properties of this class are listed below.
- network - The neural network that is being trained.
- input - Training data that will be used to determine the error for the neural network.
- ideal - Holds the ideal results from the neural network for the training sets.
- startTemperature - The beginning temperature.
- stopTemperature - The ending temperature. This value should be reached once the number of cycles, specified by the cycles property, has passed.
- cycles - The total number of cycles that the simulated annealing algorithm is to be run for.
- globalError - The current error of the neural network. This value is to be minimized.
To begin the process of simulated annealing the start method must be called. The start method accepts the network to be optimized and sets up the other properties based on that network.
Most of the work is done by the "randomize" method. This method will randomize the weights and tolerances to a degree specified by the current temperature. The algorithm works by repeatedly calling the randomize method, as the temperature drops. This allows an optimal weight matrix and threshold values to be gradually created.
The algorithm used to randomize the weight matrix and threshold values is not complex. The "randomize" method works by looping across all of the weight matrix and threshold values and multiplying each by a ratio. This ratio is derived directly from the current temperature.
The main entry point for external class is the anneal method. The anneal method is responsible for actually calling the "randomize" method. The anneal method begins by looping through 100 iterations at the current temperature. This is done by calling the "randomize" method 100 times. Each time through the iteration the new version of the weight matrix, which was created by the "randomize" method is evaluated. If it is better than the previous "best matrix", then it is replaced. At the end of these 100 iterations you will be left with the best weight matrix of all considered.
Once the 100 iterations have been completed the "anneal" method must decrease the temperature. The temperature is decreased in a way so that the temperature will reach the ending temperature by the time that program has reached the maximum number of cycles. Each time the "anneal" method is called is considered one cycle. The total number of cycles was specified when the simulated annealing class was instantiated. To allow the temperature to be decreased in this way the following method is used. First a ratio is calculated that determines how much to decrease the temperature by.
double ratio = Math.exp(Math.log(stopTemperature/startTemperature)/(cycles-1));
Next this ratio is multiplied against the current temperature.
temperature*=ratio;
Now that you have see now the simulated annealing class was implemented you will be shown the class that allows genetic algorithms.
Genetic Algorithms
Unlike the simulated annealing algorithm, the genetic algorithm is implemented in more than one class. The Chromosome class is provided to hold one individual chromosome. A chromosome is one life form. In this case, the life form represents the weight matrix and threshold values of a neural network. The chromosomes are processed by the GeneticAlgoritm class. We will begin by examining the Chromosome class.
The Chromosome class
All life forms in the genetic algorithm, that we are using, have on single chromosome. Because of this the Chromosome class implements the individual life forms that will be used for the genetic algorithm. The complete listing for this class is shown in Listing 10.2.
Listing 10.2: Chromosomes (Chromosome.java)
class Chromosome {
protected static final int GENE_SIZE = 64;
protected int [] gene;
protected double cost;
protected double mutationPercent;
protected int cutLength;
double matrix[];
/**
* The constructor, takes a list of cities to
* set the initial "genes" to.
*
* @param cities The order that this chromosome would
* visit the
* cities. These cities can be thought of as the
* genes of this chromosome.
*/
Chromosome(Network network) {
matrix = network.toArray();
gene = new int[matrix.length*GENE_SIZE];
cost = 0.0;
fromMatrix();
cutLength = gene.length/2;
}
/**
* Copy the genes to a weight matrix.
*/
public void toMatrix()
{
int idx = 0;
for (int i=0;i<matrix.length;i++) {
long l = 0;
long or = 1;
for (int j=0;j<GENE_SIZE;j++) {
if ( gene[idx++]!=0 )
l = l | or;
or+=or;
}
matrix[i] = Double.longBitsToDouble(l);
}
}
/**
* Copy the weight matrix to the genes.
*/
public void fromMatrix()
{
int idx = 0;
for (int i=0;i<matrix.length;i++) {
long l = Double.doubleToLongBits(matrix[i]);
long and = 1;
for (int j=0;j<GENE_SIZE;j++) {
gene[idx++] = (l & and)!=0?1:0;
and+=and;
}
}
}
/**
* Calculate the cost of this solution. This is
* the error of this weight matrix against the
* sample data.
*
* @param network The neural network to use to calculate the cost.
* @param trial
* @param ideal
*/
public void calculateCost(Network network,double trial[][],double ideal[][])
{
toMatrix();
network.fromArray(matrix);
for (int i=0;i<trial.length;i++) {
network.computeOutputs(trial[i]);
network.calcError(ideal[i]);
}
cost = network.getError(trial.length);
}
/**
* Get the cost for this chromosome. Note that
* the cost must be calculated first.
*
* @return The cost of this chromosome's solution.
*/
double getCost() {
return cost;
}
/**
* Get the ith gene in this chromosome.
*
* @param i The city you want.
* @return The ith city.
*/
int getGene(int i) {
return gene[i];
}
/**
* Set all genes.
*
* @param list A list of genes.
*/
void setGenes(int [] list) {
for ( int i=0;i<gene.length;i++ ) {
gene[i] = list[i];
}
}
/**
* Set the cut value.
*
* @param cut The new cut value.
*/
void setCut(int cut) {
cutLength = cut;
}
/**
* Set the percent of new births that will be mutated.
*
* @param prob The probability that a mutation will occur.
*/
void setMutation(double prob) {
mutationPercent = prob;
}
/**
* Assuming this chromosome is the "mother" mate with
* the passed in "father".
*
* @param father The father.
* @param offspring1 Returns the first offspring
* @param offspring2 Returns the second offspring.
* @return The amount of mutation that was applied.
*/
int mate(Chromosome father, Chromosome offspring1, Chromosome offspring2) {
int cutpoint = (int) (0.999999*Math.random()*(double)(gene.length-cutLength));
int off1 [] = new int[gene.length];
int off2 [] = new int[gene.length];
// mate
int n1 = gene.length/2;
int n2 = gene.length-n1;
int c = cutpoint;
while ((n1--)>0) {
c = (c+1)%gene.length;
off1[c] = this.getGene(c);
off2[c] = father.getGene(c);
}
while ((n2--)>0) {
c = (c+1)%gene.length;
off2[c] = this.getGene(c);
off1[c] = father.getGene(c);
}
int mutate = 0;
if ( Math.random() < mutationPercent ) {
int iswap1 = (int) (0.999999*Math.random()*(double)(gene.length));
int iswap2 = (int) (0.999999*Math.random()*(double)gene.length);
int i = off1[iswap1];
off1[iswap1] = off1[iswap2];
off1[iswap2] = i;
mutate++;
}
if ( Math.random() < mutationPercent ) {
int iswap1 = (int) (0.999999*Math.random()*(double)(gene.length));
int iswap2 = (int) (0.999999*Math.random()*(double)gene.length);
int i = off2[iswap1];
off2[iswap1] = off2[iswap2];
off2[iswap2] = i;
mutate++;
}
// copy results
offspring1.setGenes(off1);
offspring1.toMatrix();
offspring2.setGenes(off2);
offspring2.toMatrix();
return mutate;
}
/**
* Sort the chromosomes by their cost.
*
* @param chromosomes An array of chromosomes to sort.
* @param num How much of the chromosome list to sort.
*/
public static void sortChromosomes(Chromosome chromosomes[],int num) {
Chromosome ctemp;
boolean swapped = true;
while ( swapped ) {
swapped = false;
for ( int i=0;i<num-1;i++ ) {
if ( chromosomes[i].getCost() > chromosomes[i+1].getCost() ) {
ctemp = chromosomes[i];
chromosomes[i] = chromosomes[i+1];
chromosomes[i+1] = ctemp;
swapped = true;
}
}
}
}
/**
* Get the weight matrix associated with this chromosome.
*
* @return The weight matrix associated with this chromosome.
*/
public double[] getMatrix()
{
return matrix;
}
}
There are several properties that are contained by the chromosome. They are defined as follows.
- GENE_SIZE - The size of an individual gene. In this case it is 64 because each double has 8 bytes, and each of those bytes has 8 bits.
- gene - An array of the genes. In this case, the genes contain the binary representation of the weight matrix and threshold values of the neural network.
- cost - The error that this chromosome gives when evaluated.
- mutationPercent - The percent of genes that should be mutated.
- cutLength - Where to cut the chromosome during mating.
- matrix - The weight matrix and threshold values that this chromosome corresponds to.
The genetic algorithm prefers that data be expresses as one long stream of genes. This allows these genes to be cut at certain intervals and spliced back together during the process of reproduction, or crossover. To do this an important question becomes how exactly we represent the weight matrix and thresholds of the neural network as one long stream of information.
This is similar to the process of converting the weight matrix and thresholds to a single array of doubles, as was done for simulated annealing. Thought he process is similar, there are some very important differences. Breaking the weight matrix and threshold values into individual numbers is far too coarse a granularity for a genetic algorithm. If the individual weights were the genes, then the weights would simply be swapped about, and not actually modified. This is because a genetic algorithm does not actually modify the genes themselves. Rather the genes are simply recombined from the parents to produce the child. Because of this we need a finer level of granularity for the genes.
Storing the Gene Sequence
The binary representation of numbers is often chosen as the level of granularity required for an individual gene. This causes the individual binary digit to become the gene. This works well as "strips" of this binary information can be taken from both mother and father to produce the child. This level of detail is also below the individual weight number, so the weights will change as mating occurs.
To represent this binary stream of information we need to choose a Java data type. To maintain some degree of similarity to the traveling salesman version of the genetic algorithm the Java int type was chosen. This also allows marginal changes to the encode and decode routines to change the level of grandularty. For example, you may choose to modify the program so that individual digits of the weight and threshold values become the genes.
To encode and decode between binary and the weight matrix array, two data structures are provided by the Chromosome class. First, a variable named "matrix" is provided that holds the matrix and threshold values from the neural network. This is data structure holds the data in the same format as is returned from the "toArray" method of the Network class. The binary representation of the genes is stored in the "gene" array. Two methods are provided that allow you to move data between the gene array and the matrix. These methods are named "toMatrix" and "fromMatrix", The "toMatrix" method is shown here.
public void toMatrix()
{
int idx = 0;
for (int i=0;i<matrix.length;i++) {
long l = 0;
long or = 1;
for (int j=0;j<GENE_SIZE;j++) {
if ( gene[idx++]!=0 )
l = l | or;
or+=or;
}
matrix[i] = Double.longBitsToDouble(l);
}
}
As you can see, this method loops through each double value that is in the matrix. For each of these values a corresponding value is built from the gene array. Because the gene array stores its information in binary, and the size of a double is 8 bytes, it takes 64 bits to create a single double value. As these bits are read, they are concatenated to a long value. Once the long value is full, the Java method "Double.longBitsToDouble" is called to actually create the double value.
Because Java is a high-level language you cannot simply access the bytes that make up a type, such as double. Yet the designers of Java realized that there are cases where the programmer will need access to the individual bits of number format such as double. To solve this issue two Java methods were added. The "Double.longBitsToDouble" method is provided to convert individual bits to a double type. Additionally, the "Double. Double.doubleToLongBits " method is provided to convert a double to the bits that make up that value.
In addition to the "toMatrix" method, which converts the gene binary sequence to a weight matrix and threshold array, the "fromMatrix" method is also provided to do the reverse. The "fromMatrix" method will take the weight matrix and threshold values and fill the "gene" array with the binary representation. The "fromMatrix" method is shown here.
public void fromMatrix()
{
int idx = 0;
for (int i=0;i<matrix.length;i++) {
long l = Double.doubleToLongBits(matrix[i]);
long and = 1;
for (int j=0;j<GENE_SIZE;j++) {
gene[idx++] = (l & and)!=0?1:0;
and+=and;
}
}
}
As you can see from the above code the "fromMatrix" method begins by looping through each of the matrix values. For each matrix values 64 bits are generated. The GENE_SIZE constant shown above holds the value of 64. The logical and operator is used to determine if each bit position is active or not.
The Mating Process
The "mate" method of the Chromosome class is provided to allow the chromosome to mate with another chromosome. Two offspring are always produced as a result of this mating. You will notice that the mate method, as seen in Listing 10.3 is somewhat different than the mate algorithm used in traveling salesman problem of Chapter 8.
The main difference between the mating algorithm of the traveling salesman problem and neural network training lies in one constraint that the traveling salesman problem has that neural network training does not. In the case of the traveling salesman problem it is illegal to use the same gene more than once. If the same gene were used more than once this would mean that the traveling salesman were visiting the same city more than once. This is a basic requirement of the traveling salesman problem.
When the genetic algorithm is applied to neural network training, no such constraint is present. Because the individual genes for the neural network are made up of binary bits, there really are only two gene types. It would not be possible to require a unique gene at each position on the chromosome. This makes the process of cutting the gene of the parents to produce the outputs much simpler than the process that was used in Chapter 8 for the traveling salesman problem. We will now examine the mating process.
First a rand cutpoint is chosen. This cut point should like somewhere within the cutLength specified earlier.
int cutpoint = (int) (0.999999*Math.random()*(double)(gene.length-cutLength));
Two arrays are then allocated to hold the two offspring.
int off1 [] = new int[gene.length];
int off2 [] = new int[gene.length];
Next the number of genes that will be in each area of the cut is determined. These two numbers will always be within one unit of each other. This is because the gene cut will wrap around the gene sequence. If you thought of the gene sequence as a circle, each cut would have 180 degrees. The only thing that varys is the starting point of the cut.
int n1 = gene.length/2;
int n2 = gene.length-n1;
int c = cutpoint;
Next the program loops through and assigns the mother's DNA for the first cut to offspring one and the father's DNA to offspring two.
while ((n1--)>0) {
c = (c+1)%gene.length;
off1[c] = this.getGene(c);
off2[c] = father.getGene(c);
}
The same process is completed for the second cut, except the mother and father are switched. The program loops through and assigns the mother's DNA for the first cut to offspring two and the father's DNA to offspring one.
while ((n2--)>0) {
c = (c+1)%gene.length;
off2[c] = this.getGene(c);
off1[c] = father.getGene(c);
}
Finally mutation may be allowed to occur. A random number is selected based on the mutation percent. If mutation is to occur then two of the genes in the offspring will be switched. This simple alteration provides some variety to the population that might otherwise become stagnant with no new genetic material being introduced. The mutation process for the first offspring is shown below.
int mutate = 0;
if ( Math.random() < mutationPercent ) {
int iswap1 = (int) (0.999999*Math.random()*(double)(gene.length));
int iswap2 = (int) (0.999999*Math.random()*(double)gene.length);
int i = off1[iswap1];
off1[iswap1] = off1[iswap2];
off1[iswap2] = i;
mutate++;
}
The same mutation algorithm is then carried out on the second offspring. Now that you have seen how the individual chromosomes function we must take a look at the main class for the genetic algorithm, the "Genetic" class. The Genetic class holds all of the chromosomes and regulates the mating process. The "Genetic" class will be discussed in the next section.
The Genetic Algorithm Class
We will now examine the controller class for the genetic algorithm. Most of the work is done in the Chromosome class that we just discussed. The controller class for the genetic algorithm is shown in Listing 10.3.
Listing 10.3: Genetic Algorithm(Genetic.java)
public class Genetic {
public double globalError = 0;
public static final int POPULATION_SIZE = 5000;
public static final double MUTATION_PERCENT = 0.10;
protected int matingPopulationSize = POPULATION_SIZE/2;
protected int favoredPopulationSize = matingPopulationSize/2;
protected double input[][];
protected double ideal[][];
protected Chromosome [] chromosomes;
Network network;
/**
* Get the error for the best chromosome.
*
* @return The error for the best chromosome.
*/
public double getGlobalError()
{
return globalError;
}
/**
* Construct a new genetic algorithm class.
*
* @param network The neural network that is to be optimized.
*/
public Genetic(Network network)
{
this.network = network;
}
/**
* Init the genetic algorithm.
*/
public void start()
{
chromosomes = new Chromosome[POPULATION_SIZE];
for (int i=0;i<POPULATION_SIZE;i++) {
network.reset();
chromosomes[i] = new Chromosome(network);
chromosomes[i].calculateCost(network,input,ideal);
Chromosome c = chromosomes[i];
c.fromMatrix();
c.toMatrix();
}
Chromosome.sortChromosomes(chromosomes,chromosomes.length);
}
/**
* Set the training data that will be used to evaluate the
* error level of the weight matrixes.
*
* @param input The training data.
*/
public void setInput(double input[][])
{
this.input = input;
}
/**
* Set the idea results for the training data.
*
* @param ideal The ideal results for the training data.
*/
public void setIdeal(double ideal[][])
{
this.ideal = ideal;
}
/**
* Process one generation.
*/
public void generation()
{
int ioffset = matingPopulationSize;
int mutated = 0;
double thisCost = 500.0;
double oldCost = 0.0;
double dcost = 500.0;
int countSame = 0;
// Mate the chromosomes in the favored population
// with all in the mating population
for ( int i=0;i<favoredPopulationSize-1;i++ ) {
Chromosome cmother = chromosomes[i];
// Select partner from the mating population
int father = (int) ( 0.999999*Math.random()*(double)matingPopulationSize);
Chromosome cfather = chromosomes[father];
mutated += cmother.mate(cfather,chromosomes[ioffset],chromosomes[ioffset+1]);
ioffset += 2;
}
// The new generation is in the matingPopulation area
// move them to the correct area for sort.
for ( int i=0;i<matingPopulationSize;i++ ) {
chromosomes[i] = chromosomes[i+matingPopulationSize];
chromosomes[i].calculateCost(network,input,ideal);
}
// Now sort the new mating population
Chromosome.sortChromosomes(chromosomes,matingPopulationSize);
double cost = chromosomes[0].getCost();
dcost = Math.abs(cost-thisCost);
thisCost = cost;
double mutationRate = 100.0 * (double) mutated / (double) matingPopulationSize;
globalError = thisCost;
chromosomes[0].toMatrix();
network.fromArray(chromosomes[0].getMatrix());
}
}
There are several properties that are used by the genetic algorithm controller class. These properties are listed here.
- globalError - The error of the current best chromosome.
- POPULATION_SIZE - A constant that holds the total size of the population.
- MUTATION_PERCENT - The percent of new-born chromosomes that are mutated.
- matingPopulationSize - The size of the mating population. This is half of the total population.
This the favored population will select partners from this population. - favoredPopulationSize - The size of the favored population.
This is half of the size of the mating population. - input - The training data that is used to determine the error for a given chromosome.
- ideal - The ideal results from the training data.
- chromosomes - The chromosomes that make up the total population.
- network - The neural network that is to be trained.
Most of the work that is done by the genetic algorithm's controller class occurs in the methods "start" and "generation". The start method prepares the genetic algorithm for execution. The generation method is then called successively for each generation that is to be processed.
The "start" method is relatively simple. It begins by initializing some of the properties that will be used as the genetic algorithm executes. The "start" method then continues by creating the required number of chromosomes. Each chromosome is created and set to a random gene sequence. Obviously these chromosomes will not be well suited. This will be corrected by successive generations as evolution occurs.
The "generation" method involves more steps. The "generation" method begins by looping through the favored population and mating each chromosome with a random selection from the preferred population. The following loop does this.
for ( int i=0;i<favoredPopulationSize-1;i++ ) {
Chromosome cmother = chromosomes[i];
// Select partner from the mating population
int father = (int) ( 0.999999*Math.random()*(double)matingPopulationSize);
Chromosome cfather = chromosomes[father];
mutated += cmother.mate(cfather,chromosomes[ioffset],chromosomes[ioffset+1]);
ioffset += 2;
}
The children that are created from these matings are stored in the bottom half of the population array. These children are now moved to the top of the population and the population is sorted.
for ( int i=0;i<matingPopulationSize;i++ ) {
chromosomes[i] = chromosomes[i+matingPopulationSize];
chromosomes[i].calculateCost(network,input,ideal);
}
// Now sort the new mating population
Chromosome.sortChromosomes(chromosomes,matingPopulationSize);
As you can see there are quite a number of steps involved in the genetic algorithm. The genetic algorithm also consumes a considerable amount of memory, as each chromosome's patterns and weight matrixes must be stored. The genetic algorithm also takes considerably longer to execute than a comparable solution from simulated annealing. That is why simulated annealing is generally preferred to genetic algorithms. That is not to say that genetic algorithms have no place. Genetic algorithms are frequently beyond neural networks.
