Introduction to Neural Networks with Java, 2nd Edition EBook Available


Introduction to Neural Networks with Java, Second Edition
is now available for purchase in
e-book form! You can also download all examples from this book. We will be posting about
half of it online soon. The book will go off to the printer on Monday and will show up
on Amazon (and the others) in paperback form within a few weeks. The C# neural network
book is nearly complete. It should come out sometime in October 2008.

Introduction to Neural Networks with Java, Second Edition, introduces the Java programmer
to the world of Neural Networks and Artificial Intelligence. Neural network architectures,
such as the feedforward, Hopfield, and self-organizing map architectures, are discussed.
Training techniques, such as backpropagation, genetic algorithms, and simulated annealing,
are also introduced. Practical examples are given for each neural network. Examples
include the traveling salesman problem, handwriting recognition, financial prediction,
game strategy, mathematical functions, and Internet bots. All Java source code is available
online for easy downloading.

Chapter 1 provides an overview of neural networks. You will be introduced to the mathematical
underpinnings of neural networks and how to calculate their values manually. You will also
see how neural networks use weights and thresholds to determine their output. Matrix math plays
a central role in neural network processing.
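
As a rough illustration of how weights and a threshold determine an output, here is a minimal sketch of a single neuron. The class and method names are hypothetical and are not taken from the book's source code, and the simple step activation is an assumption made for brevity.

```java
// Minimal sketch of a single threshold neuron.
// All names here are illustrative assumptions, not the book's API.
final class ThresholdNeuron {

    /** Returns the weighted sum of the inputs: sum of inputs[i] * weights[i]. */
    static double weightedSum(double[] inputs, double[] weights) {
        if (inputs.length != weights.length) {
            throw new IllegalArgumentException("inputs and weights must have the same length");
        }
        double sum = 0.0;
        for (int i = 0; i < inputs.length; i++) {
            sum += inputs[i] * weights[i];
        }
        return sum;
    }

    /** Fires (returns 1) when the weighted sum reaches the threshold, otherwise returns 0. */
    static int output(double[] inputs, double[] weights, double threshold) {
        return weightedSum(inputs, weights) >= threshold ? 1 : 0;
    }

    public static void main(String[] args) {
        double[] weights = {1.0, 1.0};
        // With both weights at 1.0 and a threshold of 1.5, the neuron acts like a logical AND.
        System.out.println(output(new double[] {1, 1}, weights, 1.5));
        System.out.println(output(new double[] {1, 0}, weights, 1.5));
    }
}
```

The weighted sum is just a dot product, which is one reason matrix math shows up so often in neural network code.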

Chapter 2 introduces matrix operations and demonstrates how to implement them in Java. The
mathematical concepts of matrix operations used later in this book are discussed. Additionally,
Java classes are provided which accomplish each of the required matrix operations. One of the
most basic neural networks is the Hopfield neural network.

Chapter 3 demonstrates how to use a Hopfield Neural Network. You will be shown how to
construct a Hopfield neural network and how to train it to recognize patterns.
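
To give a feel for what training a Hopfield network involves, the following sketch stores one bipolar pattern (values of +1 and -1) using the common outer-product (Hebbian) rule and then recalls it from a noisy input. This is an assumption about the general technique; the book's own classes and training details may differ, and the names below are hypothetical.

```java
// Minimal Hopfield sketch: store one bipolar pattern and recall it.
// Uses the outer-product rule with a zero diagonal (an assumed, common formulation).
final class HopfieldSketch {

    /** Builds the weight matrix w[i][j] = pattern[i] * pattern[j], with no self-connections. */
    static int[][] train(int[] pattern) {
        int n = pattern.length;
        int[][] weights = new int[n][n];
        for (int i = 0; i < n; i++) {
            for (int j = 0; j < n; j++) {
                if (i != j) {
                    weights[i][j] = pattern[i] * pattern[j];
                }
            }
        }
        return weights;
    }

    /** Performs one synchronous update: each neuron outputs the sign of its weighted input. */
    static int[] recall(int[][] weights, int[] input) {
        int n = input.length;
        int[] output = new int[n];
        for (int i = 0; i < n; i++) {
            int sum = 0;
            for (int j = 0; j < n; j++) {
                sum += weights[i][j] * input[j];
            }
            output[i] = sum >= 0 ? 1 : -1;
        }
        return output;
    }

    public static void main(String[] args) {
        int[][] weights = train(new int[] {1, -1, 1, -1});
        // The last element of the input is flipped relative to the stored pattern.
        int[] recalled = recall(weights, new int[] {1, -1, 1, 1});
        System.out.println(java.util.Arrays.toString(recalled));
    }
}
```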

Chapter 4 introduces the concept of machine learning. To train a neural network, the weights
and thresholds are adjusted until the network produces the desired output. There are many
different ways training can be accomplished. This chapter introduces the different training
methods.

Chapter 5 introduces perhaps the most common neural network architecture, the feedforward
backpropagation neural network. This type of neural network is the central focus of this book.
In this chapter, you will see how to construct a feedforward neural network and how to train
it using backpropagation. Backpropagation may not always be the optimal training algorithm.

Chapter 6 expands upon backpropagation by showing how to train a network using a genetic
algorithm. A genetic algorithm creates a population of neural networks and only allows the
best networks to “mate” and produce offspring. Simulated annealing can also be a very
effective means of training a feedforward neural network.

Chapter 7 continues the discussion of training methods by introducing simulated annealing.
Simulated annealing simulates the heating and cooling of a metal to produce an optimal solution.
Neural networks may contain unnecessary neurons.

Chapter 8 explains how to prune a neural network to its optimal size. Pruning allows unnecessary
neurons to be removed from the neural network without adversely affecting the error
rate of the network. The neural network will process information more quickly with fewer
neurons. Prediction is another popular use for neural networks.

Chapter 9 introduces temporal neural networks, which attempt to predict the future. Prediction
networks can be applied to many different problems, such as the prediction of sunspot cycles,
weather, and the financial markets.

Chapter 10 builds upon Chapter 9 by demonstrating how to apply temporal neural networks to
the financial markets. The resulting neural network attempts to predict the direction of
the S&P 500. Another neural network architecture is the self-organizing map (SOM). SOMs
are often used to group input into categories and are generally trained with an unsupervised
training algorithm. An SOM uses a winner-takes-all strategy, in which the output is provided
by the winning neuron; output is not produced by each of the neurons.
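
The winner-takes-all idea can be sketched in a few lines. The example below assumes the winner is the output neuron whose weight vector is closest to the input by squared Euclidean distance; other implementations pick the winner differently (for example, by largest dot product), so treat this as one possible formulation with hypothetical names.

```java
// Winner-takes-all sketch for a self-organizing map.
// Assumption: the winner is the neuron whose weight vector is nearest to the input.
final class SomWinner {

    /** Returns the index of the output neuron with the smallest squared distance to the input. */
    static int winner(double[][] weights, double[] input) {
        int best = -1;
        double bestDistance = Double.POSITIVE_INFINITY;
        for (int neuron = 0; neuron < weights.length; neuron++) {
            double distance = 0.0;
            for (int i = 0; i < input.length; i++) {
                double diff = weights[neuron][i] - input[i];
                distance += diff * diff;
            }
            if (distance < bestDistance) {
                bestDistance = distance;
                best = neuron;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        double[][] weights = {{0.0, 0.0}, {1.0, 1.0}, {0.0, 1.0}};
        // Only the winning neuron's index is reported; the other neurons produce no output.
        System.out.println(winner(weights, new double[] {0.9, 0.8}));
    }
}
```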

Chapter 11 provides an introduction to SOMs and demonstrates how to use them.
Handwriting recognition is a popular use for SOMs.

Chapter 12 continues where Chapter 11 leaves off, by demonstrating how to use an SOM to
read handwritten characters. The neural network must be provided with a sample of the handwriting
that it is to analyze. This handwriting is categorized using the 26 characters of the
Latin alphabet. The neural network is then able to recognize new characters.

Chapter 13 introduces bot programming and explains how to use a neural network to help
identify data. Bots are computer programs that perform repetitive tasks. An HTTP bot is a
special type of bot that uses the web much like a human uses it. The neural network is
trained to recognize the specific types of data for which the bot is searching.

The book ends with Chapter 14, which discusses the future of neural networks, quantum
computing, and how it applies to neural networks. The Encog neural network framework is
also introduced.

Comments

Nice information you share

Allen wrote:

Nice information you share here. I build my local search marketing networks, and I have already shared this information there. This is good, unique content for people and developers.

I am so interested about

fghgf wrote:

I am very interested in virus programming and I want to learn about writing virus programs, so please send me some ebooks that give complete information about virus writing.

What ebook conversion tool is

navins wrote:

What ebook tool is used to read this book? Is a PDF version available? What is the cost of buying this book?

it is $19.99

jeffheaton wrote:

Click on the book link for more info or visit http://www.heatonresearch.com/book. It is sold as a DRM-free PDF.

Using a Neural Network

greek.god wrote:

The type of problem amenable to solution by a neural network is defined by the way they work and the way they are trained. Neural networks work by feeding in some input variables, and producing some output variables. They can therefore be used where you have some known information, and would like to infer some unknown information (see Patterson, 1996; Fausett, 1994). Some examples are:

Stock market prediction. You know last week's stock prices and today's DOW, NASDAQ, or FTSE index; you want to know tomorrow's stock prices.

Credit assignment. You want to know whether an applicant for a loan is a good or bad credit risk. You usually know applicants' income, previous credit history, etc. (because you ask them these things).

Control. You want to know whether a robot should turn left, turn right, or move forwards in order to reach a target; you know the scene that the robot's camera is currently observing.

Needless to say, not every problem can be solved by a neural network. You may wish to know next week's lottery result, and know your shoe size, but there is no relationship between the two. Indeed, if the lottery is being run correctly, there is no fact you could possibly know that would allow you to infer next week's result. Many financial institutions use, or have experimented with, neural networks for stock market prediction, so it is likely that any trends predictable by neural techniques are already discounted by the market, and (unfortunately), unless you have a sophisticated understanding of that problem domain, you are unlikely to have any success there either!

Another important requirement for the use of a neural network, therefore, is that you know (or at least strongly suspect) that there is a relationship between the proposed known inputs and unknown outputs. This relationship may be noisy (you certainly would not expect that the factors given in the stock market prediction example above could give an exact prediction, as prices are clearly influenced by other factors not represented in the input set, and there may be an element of pure randomness), but it must exist.

In general, if you use a neural network, you won't know the exact nature of the relationship between inputs and outputs - if you knew the relationship, you would model it directly. The other key feature of neural networks is that they learn the input/output relationship through training. There are two types of training used in neural networks, with different types of networks using different types of training. These are supervised and unsupervised training, of which supervised is the most common and will be discussed in this section (unsupervised learning is described in a later section).

In supervised learning, the network user assembles a set of training data. The training data contains examples of inputs together with the corresponding outputs, and the network learns to infer the relationship between the two. Training data is usually taken from historical records. In the above examples, this might include previous stock prices and DOW, NASDAQ, or FTSE indices, records of previous successful loan applicants, including questionnaires and a record of whether they defaulted or not, or sample robot positions and the correct reaction.

The neural network is then trained using one of the supervised learning algorithms (of which the best known example is back propagation, devised by Rumelhart et al., 1986), which uses the data to adjust the network's weights and thresholds so as to minimize the error in its predictions on the training set. If the network is properly trained, it has then learned to model the (unknown) function that relates the input variables to the output variables, and can subsequently be used to make predictions where the output is not known.

Gathering Data for Neural Networks

Once you have decided on a problem to solve using neural networks, you will need to gather data for training purposes. The training data set includes a number of cases, each containing values for a range of input and output variables. The first decisions you will need to make are: which variables to use, and how many (and which) cases to gather.

The choice of variables (at least initially) is guided by intuition. Your own expertise in the problem domain will give you some idea of which input variables are likely to be influential. As a first pass, you should include any variables that you think could have an influence - part of the design process will be to whittle this set down.

Neural networks process numeric data in a fairly limited range. This presents a problem if data is in an unusual range, if there is missing data, or if data is non-numeric. Fortunately, there are methods to deal with each of these problems. Numeric data is scaled into an appropriate range for the network, and missing values can be substituted for using the mean value (or other statistic) of that variable across the other available training cases (see Bishop, 1995).
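
The two fixes just described, mean substitution and range scaling, might look like the sketch below. It assumes missing values are marked with Double.NaN and that simple min-max scaling is acceptable; both choices, and all names, are illustrative rather than taken from any particular library.

```java
// Preprocessing sketch: fill missing values with the mean, then scale into a target range.
// Assumptions: missing values are encoded as Double.NaN; min-max scaling is used.
final class Preprocess {

    /** Replaces each NaN with the mean of the values that are present. */
    static double[] fillMissingWithMean(double[] values) {
        double sum = 0.0;
        int present = 0;
        for (double v : values) {
            if (!Double.isNaN(v)) {
                sum += v;
                present++;
            }
        }
        if (present == 0) {
            throw new IllegalArgumentException("no non-missing values to average");
        }
        double mean = sum / present;
        double[] filled = values.clone();
        for (int i = 0; i < filled.length; i++) {
            if (Double.isNaN(filled[i])) {
                filled[i] = mean;
            }
        }
        return filled;
    }

    /** Linearly maps values from [min, max] of the data onto [low, high]. */
    static double[] scaleToRange(double[] values, double low, double high) {
        double min = Double.POSITIVE_INFINITY;
        double max = Double.NEGATIVE_INFINITY;
        for (double v : values) {
            min = Math.min(min, v);
            max = Math.max(max, v);
        }
        double[] scaled = new double[values.length];
        for (int i = 0; i < values.length; i++) {
            // If every value is identical there is no spread, so map everything to the low bound.
            scaled[i] = (max == min) ? low : low + (values[i] - min) * (high - low) / (max - min);
        }
        return scaled;
    }

    public static void main(String[] args) {
        double[] filled = fillMissingWithMean(new double[] {2.0, Double.NaN, 4.0});
        System.out.println(java.util.Arrays.toString(scaleToRange(filled, 0.0, 1.0)));
    }
}
```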

Handling non-numeric data is more difficult. The most common form of non-numeric data consists of nominal-value variables such as Gender={Male, Female}. Nominal-valued variables can be represented numerically. However, neural networks do not tend to perform well with nominal variables that have a large number of possible values.

For example, consider a neural network being trained to estimate the value of houses. The price of houses depends critically on the area of a city in which they are located. A particular city might be subdivided into dozens of named locations, and so it might seem natural to use a nominal-valued variable representing these locations. Unfortunately, it would be very difficult to train a neural network under these circumstances, and a more credible approach would be to assign ratings (based on expert knowledge) to each area; for example, you might assign ratings for the quality of local schools, convenient access to leisure facilities, etc.

Other kinds of non-numeric data must either be converted to numeric form, or discarded. Dates and times, if important, can be converted to an offset value from a starting date/time. Currency values can easily be converted. Unconstrained text fields (such as names) cannot be handled and should be discarded.

The number of cases required for neural network training frequently presents difficulties. There are some heuristic guidelines, which relate the number of cases needed to the size of the network (the simplest of these says that there should be ten times as many cases as connections in the network). Actually, the number needed is also related to the (unknown) complexity of the underlying function which the network is trying to model, and to the variance of the additive noise. As the number of variables increases, the number of cases required increases nonlinearly, so that with even a fairly small number of variables (perhaps fifty or fewer) a huge number of cases are required. This problem is known as "the curse of dimensionality," and is discussed further later in this chapter.
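
As a quick worked example of the "ten times as many cases as connections" rule of thumb, the sketch below counts the connections in a fully connected feedforward network and multiplies by ten. It assumes threshold (bias) values are not counted as connections, and the names are hypothetical.

```java
// Rule-of-thumb sketch: suggested training cases = 10 * number of connections.
// Assumption: layers are fully connected and thresholds are not counted as connections.
final class TrainingSetSize {

    /** Counts the connections between each pair of adjacent layers. */
    static int countConnections(int[] layerSizes) {
        int connections = 0;
        for (int i = 0; i + 1 < layerSizes.length; i++) {
            connections += layerSizes[i] * layerSizes[i + 1];
        }
        return connections;
    }

    /** Applies the ten-cases-per-connection heuristic. */
    static int suggestedCases(int[] layerSizes) {
        return 10 * countConnections(layerSizes);
    }

    public static void main(String[] args) {
        // A 4-3-1 network has 4*3 + 3*1 = 15 connections, so the heuristic suggests 150 cases.
        System.out.println(suggestedCases(new int[] {4, 3, 1}));
    }
}
```

This is only a starting point; as noted above, the complexity of the underlying function and the noise level also affect how many cases are needed.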

For most practical problem domains, the number of cases required will be hundreds or thousands. For very complex problems more may be required, but it would be a rare (even trivial) problem which required fewer than a hundred cases. If your data is sparser than this, you really don't have enough information to train a network, and the best you can do is probably to fit a linear model. If you have a larger, but still restricted, data set, you can compensate to some extent by forming an ensemble of networks, each trained using a different resampling of the available data, and then average across the predictions of the networks in the ensemble.
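
The ensemble idea can be sketched as two small steps: draw a resampled set of case indices for each network, then average the trained networks' predictions. The sketch below assumes resampling with replacement (bootstrap style) and stands in for each trained network with a plain function; actual network training is omitted, and all names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;
import java.util.function.DoubleUnaryOperator;

// Ensemble sketch: resample the training cases, then average the models' predictions.
// Assumptions: sampling is with replacement; each "model" is a stand-in function.
final class EnsembleSketch {

    /** Draws caseCount indices in [0, caseCount) with replacement. caseCount must be positive. */
    static int[] resampleIndices(int caseCount, Random rng) {
        int[] indices = new int[caseCount];
        for (int i = 0; i < caseCount; i++) {
            indices[i] = rng.nextInt(caseCount);
        }
        return indices;
    }

    /** Returns the mean of the predictions made by every model in the ensemble. */
    static double averagePrediction(List<DoubleUnaryOperator> models, double input) {
        if (models.isEmpty()) {
            throw new IllegalArgumentException("ensemble is empty");
        }
        double sum = 0.0;
        for (DoubleUnaryOperator model : models) {
            sum += model.applyAsDouble(input);
        }
        return sum / models.size();
    }

    public static void main(String[] args) {
        // In practice each model would be a network trained on its own resampled cases.
        List<DoubleUnaryOperator> models = new ArrayList<>();
        models.add(x -> x);
        models.add(x -> 2 * x);
        models.add(x -> 3 * x);
        System.out.println(averagePrediction(models, 2.0));
    }
}
```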

Many practical problems suffer from data that is unreliable: some variables may be corrupted by noise, or values may be missing altogether. Neural networks are also noise tolerant. However, there is a limit to this tolerance; if there are occasional outliers far outside the range of normal values for a variable, they may bias the training. The best approach to such outliers is to identify and remove them (either discarding the case, or converting the outlier into a missing value). If outliers are difficult to detect, a city block error function (see Bishop, 1995) may be used, but this outlier-tolerant training is generally less effective than the standard approach.
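
To see why a city block error function is more tolerant of outliers, compare it with a sum-of-squares error on the same data. The sketch below assumes "city block" means the sum of absolute differences and takes sum of squares as the standard error; the names are hypothetical.

```java
// Error function sketch: city block (sum of absolute differences) versus sum of squares.
// Assumption: sum of squares is the "standard" error the text compares against.
final class ErrorFunctions {

    /** Sum of absolute differences between expected and actual outputs. */
    static double cityBlock(double[] expected, double[] actual) {
        double error = 0.0;
        for (int i = 0; i < expected.length; i++) {
            error += Math.abs(expected[i] - actual[i]);
        }
        return error;
    }

    /** Sum of squared differences between expected and actual outputs. */
    static double sumSquared(double[] expected, double[] actual) {
        double error = 0.0;
        for (int i = 0; i < expected.length; i++) {
            double diff = expected[i] - actual[i];
            error += diff * diff;
        }
        return error;
    }

    public static void main(String[] args) {
        double[] expected = {1.0, 2.0, 3.0};
        double[] actual = {1.0, 2.0, 13.0}; // the last value is an outlier, off by 10
        // The outlier contributes 10 to the city block error but 100 to the squared error.
        System.out.println(cityBlock(expected, actual));
        System.out.println(sumSquared(expected, actual));
    }
}
```

Because squaring magnifies large differences, a single outlier pulls on the squared error far more strongly than on the city block error.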


Copyright 2005 - 2010 by Heaton Research, Inc. Heaton Research™ and Encog™ are trademarks of Heaton Research.