The Kohonen neural network differs considerably from the feedforward back propagation neural network. The Kohonen neural network differs both in how it is trained and how it recalls a pattern. The Kohohen neural network does not use any sort of activation function. Further, the Kohonen neural network does not use any sort of a bias weight.
Output from the Kohonen neural network does not consist of the output of several neurons. When a pattern is presented to a Kohonen network one of the output neurons is selected as a "winner". This "winning" neuron is the output from the Kohonen network. Often these "winning" neurons represent groups in the data that is presented to the Kohonen network. For example, in Chapter 7 we will examine an OCR program that uses 26 output neurons. These 26 output neurons map the input patterns into the 26 letters of the Latin alphabet.
The most significant difference between the Kohonen neural network and the feed forward back propagation neural network is that the Kohonen network trained in an unsupervised mode. This means that the Kohonen network is presented with data, but the correct output that corresponds to that data is not specified. Using the Kohonen network this data can be classified into groups. We will begin our review of the Kohonen network by examining the training process.
It is also important to understand the limitations of the Kohonen neural network. You will recall from the previous chapter that neural networks with only two layers can only be applied to linearly separable problems. This is the case with the Kohonen neural network. Kohonen nueral networks are used because they are a relatively simple network to construct that can be trained very rapidly.
I will now show you how the Kohonen neural network recognizes a pattern. We will begin by examining the structure of the Kohonen neural network. Once you understand the structure of the Kohonen neural network, and how it recognizes patterns, you will be shown how to train the Kohonen neural network to properly recognize the patterns you desire. We will begin by examining the structure of the Kohonen neural network.
The Kohonen neural network works differently than the feed forward neural network that we learned about in Chapter 5. The Kohonen neural network contains only an input and output layer of neurons. There is no hidden layer in a Kohonen neural network. First we will examine the input and output to a Kohonen neural network.
The input to a Kohonen neural network is given to the neural network using the input neurons. These input neurons are each given the floating point numbers that make up the input pattern to the network. A Kohonen neural network requires that these inputs be normalized to the range between -1 and 1. Presenting an input pattern to the network will cause a reaction from the output neurons.
The output of a Kohonen neural network is very different from the output of a feed forward neural network. Recall from Chapter 5, that if we had a neural network with five output neurons we would be given an output that consisted of five values. This is not the case with the Kohonen neural network. In a Kohonen neural network only one of the output neurons actually produces a value. Additionally, this single value is either true or false. When the pattern is presented to the Kohonen neural network, one single output neuron is chosen as the output neuron. Therefore, the output from the Kohonen neural network is usually the index of the neuron (i.e. Neuron #5) that fired. The structure of a typical Kohonen neural network is shown in Figure 6.1.
Figure 6.1: A Kohonen Neural Network
Now that you understand the structure of the Kohonen neural network we will examine how the network processes information. To examine this process we will step through the calculation process. For this example we will consider a very simple Kohonen neural network. This network will have only two input and two output neurons. The input given to the two input neurons is shown in Table 6.1.
Table 6.1: Sample Inputs to a Kohonen Neural Network
|Input Neuron 1 (I1)||0.5|
|Input Neuron 2 (I2)||0.75|
We must also know the connection weights between the neurons. These connection weights are given in Table 6.2.
Table 6.2: Connection weights in the sample Kohonen neural network
Using these values we will now examine which neuron would win and produce output. We will begin by normalizing the input.
The Kohonen neural network requires that its input be normalized. Because of this some texts refer to the normalization as a third layer. For the purposes of this book the Kohonen neural network is considered a two layer network because there are only two actual neuron layers at work in the Kohonen neural network.
The requirements that the Kohonen neural network places on its input data is one of the most sever limitations of the Kohonen neural network. Input to the Kohonen neural network should be between the values -1 and 1. In addition, each of the inputs should fully use the range. If one, or more, of the input neurons were to use only the numbers between 0 and 1, the performance of the neural network would suffer.
To normalize the input we must first calculate the "vector length" of the input data, or vector. This is done by summing the squares of the input vector. In this case it would be.
(0.5 * 0.5) + (0.75 * 0.75)
This would result in a "vector length" of 0.8125. If the length becomes too small, say less than the length is set to that same arbitrarily small value. In this case the "vector length" is a sufficiently large number. Using this length we can now determine the normalization factor. The normalization factor is the reciprocal of the square root of the length. For our value the normalization factor is calculated as follows.
This results in a normalization factor of 1.1094. This normalization process will be used in the next step where the output layer is calculated.
To calculate the output the input vector and neuron connection weights must both be considered. First the "dot product" of the input neurons and their connection weights must be calculated. To calculate the dot product between two vectors you must multiply each of the elements in the two vectors. We will now examine how this is done.
The Kohonen algorithm specifies that we must take the dot product of the input vector and the weights between the input neurons and the output neurons. The result of this is as follows.
As you can see from the above calculation the dot product would be 0.395. This calculation will be performed for the first output neuron. This calculation will have to be done for each of the output neurons. Through this example we will only examine the calculations for the first output neuron. The calculations necessary for the second output neuron are calculated in the same way.
This output must now be normalized by multiplying it by the normalization factor that was determined in the previous step. You must now multiply the dot product of 0.395 by the normalization factor of 1.1094. This results in an output of 0.438213. Now that the output has been calculated and normalized it must be mapped to a bipolar number.
As you may recall from Chapter 2 a bipolar number is an alternate way of representing binary numbers. In the bipolar system the binary zero maps to -1 and the binary remains a 1. Because the input to the neural network normalized to this range we must perform a similar normalization to the output of the neurons. To make this mapping we add one and divide the result in half. For the output of 0.438213 this would result in a final output of 0.7191065.
The value 0.7191065 is the output of the first neuron. This value will be compared with the outputs of the other neuron. By comparing these values we can determine a "winning" neuron.
We have seen how to calculate the value for the first output neuron. If we are to determine a winning output neuron we must also calculate the value for the second output neuron. We will now quickly review the process to calculate the second neuron. For a more detailed description you should refer to the previous section.
The second output neuron will use exactly the same normalization factor as was used to calculate the first output neuron. As you recall from the previous section the normalization factor is 1.1094. If we apply the dot product for the weights of the second output neuron and the input vector we get a value of 0.45. This value is multiplied by the normalization factor of 1.1094 to give the value of 0.0465948. We can now calculate the final output for neuron 2 by converting the output of 0.0465948 to bipolar yields 0.49923.
As you can see we now have an output value for each of the neurons. The first neuron has an output value of 0.7191065 and the second neuron has an output value of 0.49923. To choose the winning neuron we choose the output that has the largest output value. In this case the winning neuron is the first output neuron with an output of 0.7191065, which beats neuron two's output of 0.49923.
You have now seen how the output of the Kohonen neural network was derived. As you can see the weights between the input and output neurons determine this output. In the next section we will see how you can adjust these weights can be adjusted to produce output that is more suitable for the desired task. The training process is what modified these weights. The training process will be described in the next section.
In this section you will learn to train a Kohonen neural network. There several steps involved in this training process. Overall the process for training a Kohonen neural network involves stepping through several epochs until the error of the Kohonen neural network is below acceptable level. In this section we will learn these individual processes. You'll learn how to calculate the error rate for Koenig neural network, you'll learn how to adjust the weights for each epoch. You will also learn to determine when no more epochs are necessary to further train the neural network.
The training process for the Kohonen neural network is competitive. For each training set one neuron will "win". This winning neuron will have its weight adjusted so that it will react even more strongly to the input the next time. As different neurons win for different patterns, their ability to recognize that particular pattern will be increased.
We will first examine the overall process involving training the Kohonen neural network. These individual steps are summarized in figure 6.2.
Figure 6.2: Training the Kohonen neural network.
As you can see from the above diagram to Koenig neural network is trained by repeating epochs until one of two things happens. If they calculated error is below acceptable level business at block will complete the training process. On the other hand, if the error rate has all only changed by a very marginal amount this individual cycle will be aborted with tile any additional epochs taking place. If it is determined that the cycle is to be aborted the weights will be initialized random values and a new training cycle began. This training cycle will continue the previous training cycle and that it will analyze epochs on to solve get the two is either abandoned or produces a set of weights that produces an acceptable error level.
The most important part in the network's training cycles are the individual epochs. We will now examine what happens turning each of these epochs. We will began by examining how the weights are adjusted for each epoch.
The learning rate is a constant that will be used by the learning algorithm. The learning rate must be a positive number less than 1. Typically the learning rate is a number such as .4 or .5. In the following section the learning rate will be specified by the symbol alpha.
Generally setting the learning rate to a larger value will cause the training to progress faster. Though setting the learning rate to too large a number could cause the network to never converge. This is because the oscillations of the weight vectors will be too great for the classification patterns to ever emerge. Another technique is to start with a relatively high learning rate and decrease this rate as training progresses. This allows initial rapid training of the neural network that will be "fine tuned" as training progresses.
The learning rate is just a variable that is used as part of the algorithm used to adjust the weights of the neurons. In the next section we will see how these weights are adjusted using the learning rate.
The entire memory of the Kohonen neural network is stored inside of the weighted connections between the input and output layer. The weights are adjusted in each epoc. An epoch occurs when training data is presented to the Kohonen neural network and the weights are adjusted based on the results of this item of training data. The adjustments to the weights should produce a network that will yield more favorable results the next time the same training data is presented. Epochs continue as more and more data is presented to the network and the weights are adjusted.
Eventually the return on these weight adjustments will diminish to the point that it is no longer valuable to continue with this particular set of weights. When this happens the entire weight matrix is reset to new random values. This forms a new cycle. The final weight matrix that will be used will be the best weight matrix determined from each of the cycles. We will now examine how these weights are transformed.
The original method for calculating the changes to weights, which was proposed by Kohonen, is often called the additive method. This method uses the following equation.
The variable x is the training vector that was presented to the network. The variable is the weight of the winning neuron, and the variable is the new weight. The double vertical bars represent the vector length. This method will be implemented in the Kohonen example shown later in this chapter.
The additive method generally works well for Kohonen neural networks. Though in cases where the additive method shows excessive instability, and fails to converge, an alternate method can be used. This method is called the subtractive method. The subtractive method uses the following equations.
These two equations show you the basic transformation that will occur on the weights of the network. In the next section you will see how these equations are implemented as a Java program, and their use will be demonstrated.
Before we can understand how to calculate the error for chronic neural network must first understand what the error means. The coming neural network is trained in an unsupervised fashion so the definition of the error is somewhat different involving the normally think of as an error.
As your alternate previous Chapter unsupervised training involved Calculating and error which was the difference between the anticipated output of the neural network and the actual output of the neural network. In this chapter we are examining unsupervised training. And unsupervised training there is no anticipated output. Because of this you may be wondering exactly how we can calculate an error. The answer is that the error where Calculating is not be true error, or at least not an error in the normal sense of the word.
The purpose of the Kohonen neural network is to classify the input into several sets. The error for the Kohonen neural network, therefore, must be able to measure how well the network is classifying these items. We will examine two methods for determining the error in this section. There is no official way to calculate the error for a Kohonen neural network. The error is just a percent number that gives an idea of how well the Kohonen network is classifying the input into the output groups. The error itself is not used to modify the weights, as was done in the back propagation algorithm. The method to determine this error will be discussed when we see how to implement a Java training method.