Flat Spot Problem

The Flat Spot Problem is an issue observed by Scott Fahlman in a paper on the Quickprop training method. The flat spot occurs in certain activation functions, particularly the sigmoid (logistic) activation function. Encog does not apply the flat spot correction to the hyperbolic tangent or ReLU activation functions. The flat spot can make it very difficult for a neural network to train properly with propagation training. Because of the flat spot problem, certain hidden neurons can be rendered completely useless, which can greatly increase training time and decrease the overall efficiency of the neural network. For small neural networks with just a few hidden neurons, if enough random weights fall into the flat spot range, the neural network will fail to converge.

To see why the flat spot problem exists, consider that all propagation training methods require the derivative of the activation function. The derivative of the sigmoid activation function, written in terms of the unit's output, is:

$$o_j (1 - o_j)$$

Where $o_j$ is the sigmoid output from unit $j$. This derivative approaches zero when $o_j$ is near 1.0 or 0.0, that is, when the unit is saturated. The graph illustrates this. You can see the flat spot at the top of the graph, near zero.
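To make the vanishing derivative concrete, here is a minimal Python sketch. It is not Encog code; the function names `sigmoid` and `sigmoid_derivative` are chosen only for illustration. It evaluates the sigmoid derivative, expressed in terms of the output $o_j$, at a few input values.

```python
import math


def sigmoid(x):
    """Logistic (sigmoid) activation function."""
    return 1.0 / (1.0 + math.exp(-x))


def sigmoid_derivative(o):
    """Sigmoid derivative written in terms of the sigmoid output o."""
    return o * (1.0 - o)


if __name__ == "__main__":
    # As the input grows, the output saturates toward 1.0 and the
    # derivative shrinks toward zero: this is the flat spot.
    for x in (0.0, 2.0, 5.0, 10.0):
        o = sigmoid(x)
        print(f"x={x:5.1f}  o={o:.6f}  derivative={sigmoid_derivative(o):.6f}")
```

The derivative peaks at 0.25 when the output is 0.5 and falls off rapidly on either side. Because propagation training scales each weight update by this derivative, a saturated unit receives almost no update.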

Eliminating the flat spot is as simple as adding a constant, such as 0.1, to the derivative function. This results in:

$$o_j (1 - o_j) + 0.1$$

This generally has a very positive effect on propagation training. No change is made to the activation function itself, so the flat spot modification is needed only at training time.
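The adjustment described above can be sketched in a few lines of Python. This is an illustrative sketch, not Encog's implementation: the names `FLAT_SPOT_CONSTANT`, `sigmoid_derivative`, and the `fix_flat_spot` parameter are assumptions made for this example, and 0.1 is the example constant mentioned in the text.

```python
import math

# Example constant from the text; the name is hypothetical.
FLAT_SPOT_CONSTANT = 0.1


def sigmoid(x):
    """Logistic (sigmoid) activation function. Unchanged by the fix."""
    return 1.0 / (1.0 + math.exp(-x))


def sigmoid_derivative(o, fix_flat_spot=True):
    """Sigmoid derivative in terms of the output o.

    When fix_flat_spot is True, a small constant is added so the value
    never drops to zero for saturated units.
    """
    d = o * (1.0 - o)
    if fix_flat_spot:
        d += FLAT_SPOT_CONSTANT
    return d


if __name__ == "__main__":
    o = sigmoid(10.0)  # a saturated unit, output very close to 1.0
    print("without fix:", sigmoid_derivative(o, fix_flat_spot=False))
    print("with fix:   ", sigmoid_derivative(o, fix_flat_spot=True))
```

With the constant added, a saturated unit still contributes a derivative of roughly 0.1, so its weights continue to receive meaningful updates. Only the derivative used during training changes; `sigmoid` itself is the same function used when the trained network is run.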

Encog Handling of the Flat Spot

By default, Encog addresses the flat spot. This has been shown to enhance Encog training. However, you can disable the flat spot processing. To do this, set the FixFlatSpot property on any propagation training object to false.

References

  • “An Empirical Study of Learning Speed in Back-Propagation Networks” (Scott E. Fahlman, 1988)