A feedforward neural network is similar to the types of neural networks that we have already examined. Like many other types of neural networks, the feedforward neural network begins with an input layer. The input layer may be connected to a hidden layer or directly to the output layer. If it is connected to a hidden layer, that hidden layer can in turn be connected to another hidden layer or directly to the output layer. There can be any number of hidden layers, including none at all, as long as the input layer ultimately leads to an output layer. In common use, most neural networks will have one hidden layer, and it is very rare for a neural network to have more than two hidden layers.
Figure 5.1 illustrates a typical feedforward neural network with a single hidden layer.
Figure 5.1: A typical feedforward neural network (single hidden layer).
Neural networks with more than two hidden layers are uncommon.
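To make the layer-to-layer flow concrete, the following is a minimal sketch of a forward pass through a network with a single hidden layer. The class name, the sigmoid activation function, the weight layout (one row per target neuron, with the bias stored in the last column), and the sample weight values are all assumptions made for illustration; they are not taken from the example programs in this book.

```java
import java.util.Arrays;

// Minimal sketch of a feedforward pass: input -> one hidden layer -> output.
// The sigmoid activation and the weight layout are illustrative assumptions.
class FeedforwardSketch {

    static double sigmoid(double x) {
        return 1.0 / (1.0 + Math.exp(-x));
    }

    // weights[j][i] connects source neuron i to target neuron j.
    // The last column of each row holds the bias for target neuron j.
    static double[] layer(double[] in, double[][] weights) {
        double[] out = new double[weights.length];
        for (int j = 0; j < weights.length; j++) {
            double sum = weights[j][in.length]; // start from the bias
            for (int i = 0; i < in.length; i++) {
                sum += weights[j][i] * in[i];
            }
            out[j] = sigmoid(sum);
        }
        return out;
    }

    // The input pattern passes through the hidden layer, then the output layer.
    static double[] compute(double[] input, double[][] hidden, double[][] output) {
        return layer(layer(input, hidden), output);
    }

    public static void main(String[] args) {
        // Two input neurons, three hidden neurons, one output neuron.
        double[][] hidden = {
            { 0.5, -0.4,  0.1},
            {-0.3,  0.8,  0.0},
            { 0.2,  0.2, -0.1}
        };
        double[][] output = { {0.7, -0.6, 0.3, 0.05} };
        double[] result = compute(new double[] {1.0, 0.0}, hidden, output);
        System.out.println(Arrays.toString(result));
    }
}
```

Adding a second hidden layer in this sketch would simply mean one more call to `layer` between the existing two.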
As we saw in the previous section, there are many ways that feedforward neural networks can be constructed. You must decide how many neurons will be inside the input and output layers. You must also decide how many hidden layers you are going to have and how many neurons will be in each of them.
There are many techniques for choosing these parameters. In this section we will cover some of the general “rules of thumb” that you can use to assist you in these decisions; however, these rules will only take you so far. In nearly all cases, some experimentation will be required to determine the optimal structure for your feedforward neural network. There are many books dedicated entirely to this topic. For a thorough discussion of structuring feedforward neural networks, you should refer to the book Neural Smithing: Supervised Learning in Feedforward Artificial Neural Networks (MIT Press, 1999).
The input layer is the conduit through which the external environment presents a pattern to the neural network. Once a pattern is presented to the input layer, the output layer will produce another pattern. In essence, this is all the neural network does. The input layer should represent the condition for which we are training the neural network. Every input neuron should represent some independent variable that has an influence over the output of the neural network.
It is important to remember that the inputs to the neural network are floating-point numbers. These values are expressed as the primitive Java data type “double.” This is not to say that you can only process numeric data with the neural network; if you wish to process a form of data that is non-numeric, you must develop a process that converts this data to a numeric representation. In chapter 12, “OCR and the Self-Organizing Map,” I will show you how to communicate graphic information to a neural network.
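One common way to turn non-numeric data into doubles is one-of-n encoding, in which each possible category gets its own input neuron. The sketch below assumes a small fixed list of category names; the class name, method name, and the color categories are hypothetical and serve only as an illustration.

```java
import java.util.Arrays;
import java.util.List;

// Sketch: one-of-n encoding of a categorical (non-numeric) value into doubles.
// The class and method names are hypothetical, chosen for illustration.
class OneOfNEncoder {

    // Returns an array with 1.0 at the position of the matching category
    // and 0.0 everywhere else.
    static double[] encode(List<String> categories, String value) {
        int index = categories.indexOf(value);
        if (index < 0) {
            throw new IllegalArgumentException("Unknown category: " + value);
        }
        double[] result = new double[categories.size()]; // initialized to 0.0
        result[index] = 1.0;
        return result;
    }

    public static void main(String[] args) {
        List<String> colors = Arrays.asList("red", "green", "blue");
        System.out.println(Arrays.toString(encode(colors, "green")));
    }
}
```

With this encoding, a field with three possible values occupies three input neurons rather than one.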
The output layer of the neural network is what actually presents a pattern to the external environment. The pattern presented by the output layer is determined by the pattern that was presented to the input layer. The number of output neurons should be directly related to the type of work that the neural network is to perform.
To determine the number of neurons to use in your output layer, you must first consider the intended use of the neural network. If the neural network is to be used to classify items into groups, then it is often preferable to have one output neuron for each group to which input items can be assigned. If the neural network is to perform noise reduction on a signal, then it is likely that the number of input neurons will match the number of output neurons. In this sort of neural network, you will want the patterns to leave the neural network in the same format in which they entered.
For a specific example of how to choose the number of input neurons and the number of output neurons, consider a program that is used for optical character recognition (OCR), such as the program presented in the example in chapter 12, “OCR and the Self-Organizing Map.” To determine the number of neurons used for the OCR example, we will first consider the input layer. The number of input neurons that we will use is the number of pixels that might represent any given character. Characters processed by this program are normalized to a universal size that is represented by a 5x7 grid. A 5x7 grid contains a total of 35 pixels. Therefore, the OCR program has 35 input neurons.
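The mapping from a 5x7 grid to 35 input neurons can be sketched as follows. This is not the code of the chapter 12 program; the class name, the row-by-row ordering, and the choice of 1.0 for a set pixel and 0.0 for a clear pixel are assumptions made for illustration.

```java
// Sketch: flatten a 5x7 pixel grid into 35 input values, one per input neuron.
// The 1.0 / 0.0 pixel values and the row-by-row order are assumptions.
class GridToInput {
    static final int WIDTH = 5;
    static final int HEIGHT = 7;

    // grid[row][col] is true when the pixel is set.
    static double[] toInput(boolean[][] grid) {
        if (grid.length != HEIGHT) {
            throw new IllegalArgumentException("Expected " + HEIGHT + " rows");
        }
        double[] input = new double[WIDTH * HEIGHT];
        for (int row = 0; row < HEIGHT; row++) {
            if (grid[row].length != WIDTH) {
                throw new IllegalArgumentException("Expected " + WIDTH + " columns");
            }
            for (int col = 0; col < WIDTH; col++) {
                input[row * WIDTH + col] = grid[row][col] ? 1.0 : 0.0;
            }
        }
        return input;
    }

    public static void main(String[] args) {
        boolean[][] grid = new boolean[HEIGHT][WIDTH];
        grid[0][2] = true; // a single set pixel in the top row
        double[] input = toInput(grid);
        System.out.println("Input neurons: " + input.length);
    }
}
```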
The number of output neurons used by the OCR program will vary depending on how many characters the program has been trained to recognize. The default training file that is provided with the OCR program is used to train it to recognize 26 characters. Using this file, the neural network will have 26 output neurons. Presenting a pattern to the input neurons will fire the appropriate output neuron that corresponds to the letter that the input pattern represents.
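Reading the result of such a network amounts to finding which output neuron fired most strongly. The sketch below assumes that the 26 output neurons are ordered to correspond to the letters A through Z and that the "winning" neuron is simply the one with the largest value; both are illustrative assumptions, and the class and method names are hypothetical.

```java
// Sketch: map the strongest of 26 output neurons back to a letter.
// Assumes output neuron 0 corresponds to 'A', neuron 1 to 'B', and so on.
class LetterDecoder {

    // Returns the index of the output neuron with the largest value.
    static int winner(double[] output) {
        if (output.length == 0) {
            throw new IllegalArgumentException("No output neurons");
        }
        int best = 0;
        for (int i = 1; i < output.length; i++) {
            if (output[i] > output[best]) {
                best = i;
            }
        }
        return best;
    }

    static char toLetter(double[] output) {
        if (output.length != 26) {
            throw new IllegalArgumentException("Expected 26 output neurons");
        }
        return (char) ('A' + winner(output));
    }

    public static void main(String[] args) {
        double[] output = new double[26];
        output[2] = 0.9; // the third neuron fires most strongly
        System.out.println(toLetter(output));
    }
}
```

If the program were trained on a different character set, the number of output neurons and the index-to-character mapping would change accordingly.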