A Historical Perspective on Neural Networks
Neural networks have been used with computers since the 1950s. Over the years, many different neural network architectures have been presented. This section covers some of the history behind neural networks and how that history led to the neural networks of today. We will begin this exploration with the Perceptron.
Perceptron
The perceptron is one of the earliest neural networks. Invented at the Cornell Aeronautical Laboratory in 1957 by Frank Rosenblatt, the Perceptron was an attempt to understand human memory, learning, and cognitive processes. In 1960, Rosenblatt demonstrated the Mark I Perceptron. The Mark I was the first machine that could “learn” to identify optical patterns.
The Perceptron grew out of the biological neural studies of researchers such as D.O. Hebb, Warren McCulloch, and Walter Pitts. McCulloch and Pitts were the first to describe biological neural networks and are credited with coining the phrase “neural network.” They developed a simplified model of the neuron, called the MP neuron, centered on the idea that a nerve will fire an impulse only if its threshold value is exceeded. The MP neuron functioned as a sort of scanning device that read predefined input and output associations to determine the final output. MP neurons were incapable of learning because they had fixed thresholds. As a result, MP neurons could serve only as hard-wired logic devices that were set up manually.
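The idea of a fixed-threshold neuron can be made concrete with a minimal sketch. The function names (`mp_neuron`, `logic_and`, `logic_or`), the use of unit weights on every input, and the specific threshold values are assumptions chosen for illustration; they are not taken from McCulloch and Pitts' original formulation.

```python
def mp_neuron(inputs, threshold):
    """Fire (return 1) only if the summed binary inputs exceed the threshold."""
    total = sum(inputs)
    return 1 if total > threshold else 0


# The thresholds are fixed by hand, so each neuron is a hard-wired logic
# device. Nothing in this code can adjust them in response to examples.
def logic_and(a, b):
    return mp_neuron([a, b], threshold=1)  # both inputs must be 1


def logic_or(a, b):
    return mp_neuron([a, b], threshold=0)  # at least one input must be 1


and_table = [logic_and(a, b) for a in (0, 1) for b in (0, 1)]
or_table = [logic_or(a, b) for a in (0, 1) for b in (0, 1)]
print("AND:", and_table)
print("OR: ", or_table)
```

Changing the behavior of such a neuron means editing its threshold by hand, which is the limitation the Perceptron set out to remove.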
Because the MP neuron did not have the ability to learn, it was very limited when compared to the infinitely more flexible and adaptive human nervous system upon which it was modeled. Rosenblatt determined that a learning network model could improve its responses by adjusting the weights on the connections between its neurons. This was taken into consideration when Rosenblatt designed the Perceptron.
The Perceptron showed early promise for neural networks and machine learning, but it had one very large shortcoming: it was unable to learn to recognize input that was not “linearly separable.” This would prove to be a huge obstacle that would take some time to overcome.
Perceptrons and Linear Separability
To see why the Perceptron failed, you must first understand exactly what is meant by a linearly separable problem. Consider a neural network that accepts two binary digits (0 or 1) and outputs one binary digit. The inputs and output of such a neural network could be represented by Table 1.1.
Table 1.1: A Linearly Separable Function
| Input 1 | Input 2 | Output |
|---|---|---|
| 0 | 0 | 1 |
| 0 | 1 | 0 |
| 1 | 0 | 1 |
| 1 | 1 | 1 |
This table is linearly separable. To see why, examine Figure 1.5. Table 1.1 is shown, in the form of a logic diagram, in Figure 1.5a. Notice that a single straight line can be drawn to separate the output values of 1 from the output values of 0. This is what makes the table linearly separable. Table 1.2 shows a table that is not linearly separable.
Figure 1.5: Linearly/Non-Linearly Separable Function 
Table 1.2: A Non-Linearly Separable Function
| Input 1 | Input 2 | Output |
|---|---|---|
| 0 | 0 | 0 |
| 0 | 1 | 1 |
| 1 | 0 | 1 |
| 1 | 1 | 0 |
The above table, which happens to be the XOR function, is not linearly separable. This can be seen in Figure 1.5b, on the right side of Figure 1.5. There is no way to draw a single straight line that separates the 0 outputs from the 1 outputs. As a result, Table 1.2 is said to be non-linearly separable. A Perceptron could not be trained to recognize Table 1.2.
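The difference between the two tables can be observed by training a single-neuron perceptron on each of them. The following is a minimal sketch, not Rosenblatt's original procedure: it assumes the common error-correction update rule with a step activation, and the names `predict` and `train_perceptron`, the zero starting weights, the learning rate, and the epoch limit are all illustrative choices.

```python
def predict(weights, bias, x):
    """Step activation: output 1 when the weighted sum is positive."""
    activation = bias + sum(w * xi for w, xi in zip(weights, x))
    return 1 if activation > 0 else 0


def train_perceptron(samples, max_epochs=100, rate=1.0):
    """Return (weights, bias, epoch); epoch is None if training never settles."""
    weights = [0.0, 0.0]
    bias = 0.0
    for epoch in range(1, max_epochs + 1):
        errors = 0
        for x, target in samples:
            delta = target - predict(weights, bias, x)
            if delta != 0:
                # Nudge the weights and bias toward the desired output.
                errors += 1
                bias += rate * delta
                weights = [w + rate * delta * xi for w, xi in zip(weights, x)]
        if errors == 0:
            # A full pass with no mistakes: a separating line has been found.
            return weights, bias, epoch
    return weights, bias, None


table_1_1 = [((0, 0), 1), ((0, 1), 0), ((1, 0), 1), ((1, 1), 1)]
table_1_2 = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]  # XOR

w1, b1, epoch1 = train_perceptron(table_1_1)
w2, b2, epoch2 = train_perceptron(table_1_2)
print("Table 1.1 settled in epoch:", epoch1)
print("Table 1.2 settled in epoch:", epoch2)
```

For Table 1.2 the function returns `None` for the epoch no matter how large `max_epochs` is made. An error-free pass would require a single line that separates the XOR outputs, and no such line exists.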
The Perceptron’s inability to solve non-linearly separable problems would prove to be a major obstacle not only to the Perceptron, but to the entire field of artificial intelligence. A former classmate of Rosenblatt, Marvin Minsky, along with Seymour Papert, published the book Perceptrons in 1969. This book mathematically discredited the Perceptron model. Fate was to further rule against the Perceptron in 1971, when Rosenblatt died in a boating accident. Without Rosenblatt to defend the Perceptron and neural networks, interest diminished for over a decade.
While the XOR problem was the nemesis of the Perceptron, current neural networks have little problem learning the XOR function or other non-linearly separable problems. In fact, the XOR problem has become a sort of “Hello World” problem for new neural network software. While the XOR problem was eventually surmounted, another test, the Turing Test, remains unsolved to this day.
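One way to see why later networks are not stopped by XOR is to place a hidden layer between the inputs and the output. The sketch below is an illustration only: its weights are set by hand rather than learned, and the decomposition of XOR into an OR neuron and a NAND neuron feeding an AND neuron is one well-known choice among several. The names `step` and `xor_network` are hypothetical.

```python
def step(value):
    """Threshold activation: fire when the weighted sum is positive."""
    return 1 if value > 0 else 0


def xor_network(x1, x2):
    # Hidden layer: each neuron draws its own straight line through the inputs.
    h_or = step(x1 + x2 - 0.5)       # 1 when at least one input is 1
    h_nand = step(-x1 - x2 + 1.5)    # 1 unless both inputs are 1
    # Output layer: fires only when both hidden neurons fire (AND).
    return step(h_or + h_nand - 1.5)


xor_table = [xor_network(a, b) for a in (0, 1) for b in (0, 1)]
print("XOR:", xor_table)
```

Each hidden neuron still separates its inputs with a single line, but the output neuron combines the two lines, which a single-layer perceptron cannot do.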
The Turing Test
The Turing Test was proposed in a 1950 paper by Dr. Alan Turing. The test is designed to measure the advance of AI research. It is far more complex than the XOR problem and has yet to be solved.
To understand the Turing Test, think of an Internet instant message window. Using an instant message program, you can chat with someone at another computer. Suppose a stranger sends you an instant message and you begin chatting. Are you sure that this stranger is a human being? Perhaps you are talking to an AI-enabled computer program. Could you tell the difference? This is the “Turing Test.” If you are unable to distinguish the AI program from another human being, then that program has passed the “Turing Test.”
No computer program has ever passed the Turing Test, or even come close to passing it. In the 1950s it was assumed that a computer program capable of passing the Turing Test was no more than a decade away. But like many of the other lofty goals of AI, passing the Turing Test has yet to be realized.
Passing the Turing Test is quite complex. It requires the computer to read English, or some other human language, and understand the meaning of each sentence. Then the computer must be able to access a database that comprises the knowledge a typical human has amassed over several decades of existence. Finally, the computer program must be capable of forming a response, and perhaps of questioning the human with whom it is interacting. This is no small feat, and it goes well beyond the capabilities of current neural networks.
One of the most complex parts of passing the Turing Test is working with the database of human knowledge. This has given rise to a new test called the “Limited Turing Test.” The Limited Turing Test works similarly to the actual Turing Test: a human is allowed to conduct a conversation with a computer program. The difference is that the human must restrict the conversation to one narrow subject area. This limits the size of the human experience database.
Neural Networks Today and in the Future
Neural networks have existed since the 1950s. They have come a long way since the early Perceptrons, which were easily defeated by problems as simple as the XOR operator. Yet neural networks still have a long way to go.
Neural Networks Today
Neural networks are in use today for a wide variety of tasks. Most people think of neural networks as attempting to emulate the human mind or pass the Turing Test. In practice, most neural networks used today take on far less glamorous roles than the neural networks frequently seen in science fiction.
Speech and handwriting recognition are two common uses for today’s neural networks. Chapter 7 contains an example that illustrates a neural network handwriting recognition program. Neural networks tend to work well for both speech and handwriting recognition because these types of programs can be trained to the individual user.
Data mining is a process where large volumes of data are “mined” for trends and other statistics that might otherwise be overlooked. Very often in data mining the programmer is not particularly sure what final outcome is being sought. Neural networks are often employed in data mining because of their trainability.
Perhaps the most common form of neural network used by modern applications is the feedforward backpropagation neural network. This network feeds inputs forward from one layer to the next as it processes them. Backpropagation refers to the way in which the neurons are trained in this sort of neural network. Chapter 3 begins your introduction to this sort of network.
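The "feed forward" half of that name can be sketched in a few lines. This is only an illustration of passing values from one layer to the next. The sigmoid activation, the layer sizes, and every weight and bias value below are arbitrary assumptions, and the backpropagation training step is not shown.

```python
import math


def sigmoid(x):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def layer_forward(inputs, weights, biases):
    """Compute one layer's outputs from the previous layer's outputs."""
    return [
        sigmoid(bias + sum(w * x for w, x in zip(neuron_weights, inputs)))
        for neuron_weights, bias in zip(weights, biases)
    ]


def feed_forward(inputs, layers):
    """Pass the inputs through each layer in order and return the final outputs."""
    activations = inputs
    for weights, biases in layers:
        activations = layer_forward(activations, weights, biases)
    return activations


# Hypothetical network: 2 inputs -> 2 hidden neurons -> 1 output neuron.
layers = [
    ([[0.5, -0.4], [0.3, 0.8]], [0.1, -0.2]),  # hidden layer
    ([[1.2, -0.7]], [0.05]),                   # output layer
]
output = feed_forward([1.0, 0.0], layers)
print("Network output:", output)
```

Training would consist of adjusting the numbers in `layers` until the outputs match the desired values, which is the job backpropagation performs.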
A Fixed Wing Neural Network
Some researchers suggest that perhaps the neural network itself is a fallacy, and that other methods of modeling human intelligence must be explored. The ultimate goal of AI is to produce a thinking machine. Does this mean that such a machine would have to be constructed exactly like a human brain? Does it mean that, to solve the AI puzzle, we should seek to imitate nature? Imitating nature has not always led mankind to the best solution. Consider the airplane.
Man has been fascinated with the idea of flight since the beginnings of civilization. Many inventors throughout history worked towards the development of the “Flying Machine.” To create a flying machine, most of these inventors looked to nature, where they found the only working model of a flying machine: the bird. Most inventors who aspired to create a flying machine built various forms of ornithopters.
Ornithopters are flying machines that work by flapping their wings. This is how a bird flies, so it seemed only logical that this would be the way to create such a device. However, none of the ornithopters were successful. They simply could not generate sufficient lift to overcome their weight. Many designs were tried. Figure 1.6 shows one such design that was patented in the late 1800s.
Figure 1.6: An Ornithopter
It was not until Wilbur and Orville Wright decided to use a fixed wing design that airplane technology began to truly advance. For years, the paradigm of modeling the bird was pursued. Once the two brothers broke with this tradition, this area of science began to move forward. Perhaps AI is no different. Perhaps it will take a new paradigm, outside of the neural network, to usher in the next era of AI.




