Minima
To understand the significance of global and local minima, we must first establish exactly what is meant by the term minima. Minima are the low points of a curve on a graph: points where the curve is lower than it is at the nearby points on either side. Figure 10.1 shows a graph that contains several minima.

Figure 10.1: Several minima
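As a rough illustration of this definition, the sketch below scans a list of sampled curve heights and reports every point that is lower than both of its neighbours. The function name `find_minima` and the sample values in `curve` are assumptions made for this example; they are not taken from Figure 10.1.

```python
def find_minima(values):
    """Return the indices where a value is lower than both of its neighbours."""
    return [
        i
        for i in range(1, len(values) - 1)
        if values[i] < values[i - 1] and values[i] < values[i + 1]
    ]


# Hypothetical heights of a curve sampled at evenly spaced points.
curve = [5, 3, 4, 1, 2, 0, 6]

minima = find_minima(curve)                       # [1, 3, 5]
lowest = min(minima, key=lambda i: curve[i])      # 5, the deepest of the dips
```

Each index in `minima` marks the bottom of one dip in the curve, and `lowest` picks out the deepest of them.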
Local Minima
The function whose minimum we are seeking is not nearly as simple as the graph shown in Figure 10.1. We are seeking the minimum of the error function chosen for the neural network. Because the entire weight matrix serves as input to this function, with one dimension for every weight, it produces a far more complex graph than the one shown in Figure 10.1.
A local minimum is any point that forms the bottom of a selected region of the graph. This region can be of any size. Backpropagation has a tendency to find these local minima. Once the backpropagation learning algorithm settles into one of these local minima, it is very difficult for the algorithm to continue its search for the global minimum. The global minimum is the absolute lowest point that the graph of the error function ever attains.
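The tendency to settle into a local minimum can be seen with plain gradient descent on a one-dimensional stand-in for the error function. This is only a sketch: the function `error`, its derivative `gradient`, the learning rate, and the starting points are all assumptions chosen for illustration, and a real network's error function has one dimension per weight rather than one in total.

```python
def error(x):
    """Toy 'error function' with a shallow dip near x = 1.13 and a deeper one near x = -1.30."""
    return x**4 - 3 * x**2 + x


def gradient(x):
    """Derivative of error(x)."""
    return 4 * x**3 - 6 * x + 1


def gradient_descent(start, learning_rate=0.01, steps=1000):
    """Repeatedly step downhill from `start`; the search can only ever move downhill."""
    x = start
    for _ in range(steps):
        x -= learning_rate * gradient(x)
    return x


from_right = gradient_descent(2.0)    # settles in the shallow dip near x = 1.13
from_left = gradient_descent(-2.0)    # settles in the deeper dip near x = -1.30
```

Starting from the right, the search stops in the shallower dip even though a lower point exists, because every small step away from that dip leads uphill. Which minimum is found depends entirely on where the search begins.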
The Search for the Global Minimum
The objective of training a neural network is the minimization of the error function. Supervised training using a backpropagation-based learning algorithm can become trapped in a local minimum of the error function. This can occur because the steepest descent and conjugate gradient training methods used with backpropagation are local minimization algorithms. They have no mechanism that allows them to escape such a local minimum.
Simulated annealing and genetic algorithms are two algorithms that have the capability to move out of regions near local minima. Though there is no guarantee that they will find the true global minimum, these algorithms can often help to find a more suitable local minimum, that is, a local minimum with a lower error score. Figure 10.2 shows the chart of an arbitrary function that has several local minima and a global minimum.

Figure 10.2: The global minimum
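To make the idea of moving out of a local minimum concrete, here is a minimal simulated annealing sketch on a one-dimensional toy function. Everything in it is an assumption made for illustration: the function `error`, the parameter names, the starting temperature, the cooling rate, and the step size. It is not the implementation used to train a network, where the search would run over the whole weight matrix.

```python
import math
import random


def error(x):
    """Toy 'error function' with a shallow dip near x = 1.13 and a deeper one near x = -1.30."""
    return x**4 - 3 * x**2 + x


def simulated_annealing(start, temperature=2.0, cooling=0.995,
                        steps=5000, step_size=0.5, seed=0):
    """Random search that sometimes accepts uphill moves while the temperature is high."""
    rng = random.Random(seed)
    current = best = start
    for _ in range(steps):
        candidate = current + rng.uniform(-step_size, step_size)
        delta = error(candidate) - error(current)
        # Always accept an improvement; accept a worse point with a
        # probability that shrinks as the temperature falls.
        if delta < 0 or rng.random() < math.exp(-delta / temperature):
            current = candidate
            if error(current) < error(best):
                best = current
        temperature *= cooling
    return best


best = simulated_annealing(2.0)
```

Because worse points are occasionally accepted early on, the search can climb out of the shallow dip that would trap a purely downhill method. The returned point is never worse than the starting point, but, as noted above, nothing guarantees that it is the global minimum.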
