# Resilient Propagation

**Resilient Propagation** (RPROP) is one of the best general-purpose training methods that Encog provides for neural networks. It is not the best training method in every case, but it is in most. RPROP can be used with feedforward neural networks and simple recurrent neural networks. Resilient propagation is a type of propagation training and is susceptible to the Flat Spot Problem for certain activation functions.

Resilient propagation will typically outperform backpropagation by a considerable factor. Additionally, RPROP has no parameters that must be set. Backpropagation requires that a learning rate and momentum value be specified. Finding an optimal learning rate and momentum value for backpropagation can be difficult. This is not necessary with resilient propagation.

In versions prior to Encog 3.0, Encog used the RPROP technique described by M. Riedmiller ^{[1]}. As of Encog 3.0, Encog switched to four different RPROP algorithms described in a paper by C. Igel ^{[2]}.

Currently Encog makes use of the following four RPROP algorithms:

- RPROP+
- RPROP-
- iRPROP+
- iRPROP-

## Usage

The Encog RPROP trainer is used much like any other trainer in Encog. It implements the IMLTrain (C#) or the MLTrain (Java) interface. The following code shows how to create an RPROP object. For a complete example see the Hello World Example.

### Java Usage

The following code creates an RPROP trainer in Java.

    MLTrain train = new ResilientPropagation(network, trainingSet);

### C# Usage

The following code creates an RPROP trainer in C#.

    IMLTrain train = new ResilientPropagation(network, trainingSet);

## Calculation

In this section we will take a look at how resilient propagation actually functions. The change in weight is calculated as follows.

Clause | Meaning | Pseudo-code Variable |
---|---|---|
$\Delta_{ij}^{(t)}$ | Update value for the current iteration t. | delta |
$\Delta_{ij}^{(t-1)}$ | Update value for iteration t-1. | lastDelta |
$\frac{\partial E}{\partial w_{ij}}^{(t)}$ | Gradient of the weight from i to j for iteration t. | gradient |
$\frac{\partial E}{\partial w_{ij}}^{(t-1)}$ | Gradient of the weight from i to j for iteration t-1. | lastGradient |
$E$ | The training error. | error |
$E^{(t-1)}$ | The training error for iteration t-1. | lastError |
$\Delta w_{ij}^{(t)}$ | The change in weight from i to j made by the current iteration (t). | weightChange |
$\Delta w_{ij}^{(t-1)}$ | The change in weight from i to j made by the previous iteration (t-1). | lastWeightChange |
$\eta^{+}$ | Positive step value. Typically 1.2. | positiveStep |
$\eta^{-}$ | Negative step value. Typically 0.5. | negativeStep |

### Riedmiller Implementation

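The original paper describes RPROP as a two-step process. First, we update the weights. This is done with the following formula.

$$
\Delta w_{ij}^{(t)} =
\begin{cases}
-\Delta_{ij}^{(t)} & \text{if } \frac{\partial E}{\partial w_{ij}}^{(t)} > 0 \\
+\Delta_{ij}^{(t)} & \text{if } \frac{\partial E}{\partial w_{ij}}^{(t)} < 0 \\
0 & \text{otherwise}
\end{cases}
$$

Here we calculate the change in weight ($\Delta w_{ij}$) depending on what the update value ($\Delta_{ij}$) is.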

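Once the weight changes are calculated, we determine the new weight update value. This is done with the following formula.

$$
\Delta_{ij}^{(t)} =
\begin{cases}
\eta^{+} \cdot \Delta_{ij}^{(t-1)} & \text{if } \frac{\partial E}{\partial w_{ij}}^{(t-1)} \cdot \frac{\partial E}{\partial w_{ij}}^{(t)} > 0 \\
\eta^{-} \cdot \Delta_{ij}^{(t-1)} & \text{if } \frac{\partial E}{\partial w_{ij}}^{(t-1)} \cdot \frac{\partial E}{\partial w_{ij}}^{(t)} < 0 \\
\Delta_{ij}^{(t-1)} & \text{otherwise}
\end{cases}
$$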

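Now let's see how this is actually done, broken into a series of smaller steps. First we must determine the sign of the change in the derivative. This requires a **sgn** function, defined here.

$$
\operatorname{sgn}(x) =
\begin{cases}
1 & \text{if } x > 0 \\
0 & \text{if } x = 0 \\
-1 & \text{if } x < 0
\end{cases}
$$

The actual change in sign is calculated as follows.

$$
c = \operatorname{sgn}\left( \frac{\partial E}{\partial w_{ij}}^{(t)} \cdot \frac{\partial E}{\partial w_{ij}}^{(t-1)} \right)
$$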
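The sign test can be sketched in a few lines of Java. This is only an illustration: the class and method names (`SignChangeSketch`, `sgn`, `signChange`) are hypothetical and are not part of Encog.

```java
// Hypothetical helper, not Encog code: computes the sign-change indicator c.
final class SignChangeSketch {

    /** Returns 1 for positive x, -1 for negative x, and 0 for zero. */
    static int sgn(double x) {
        if (x > 0) {
            return 1;
        }
        if (x < 0) {
            return -1;
        }
        return 0;
    }

    /** c = sgn(gradient * lastGradient): 1 = same sign, -1 = sign flipped, 0 = a gradient is zero. */
    static int signChange(double gradient, double lastGradient) {
        return sgn(gradient * lastGradient);
    }

    public static void main(String[] args) {
        System.out.println(signChange(0.3, 0.1));   // both positive
        System.out.println(signChange(0.3, -0.1));  // sign flipped
        System.out.println(signChange(0.0, 0.5));   // one gradient is zero
    }
}
```

Only the sign of the product matters, so the magnitudes of the two gradients never influence which branch is taken.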

What we do now depends on the sign of c.

**If c>0 Then**

If this is the case, then the sign has not changed. This is good, so we try to accelerate with a larger update value.

$$w_{ij}^{(t+1)} = w_{ij}^{(t)} + \Delta w_{ij}^{(t)}$$

**Else If c<0 Then**

If this is the case, then the sign has changed. This means that the last update was too big and we jumped over a local minimum.

**Else If c=0 Then**

If this is the case, the product of the two gradients is zero, so at least one of them is zero. We continue to apply the weight update, but do not change the update value.

$$w_{ij}^{(t+1)} = w_{ij}^{(t)} + \Delta w_{ij}^{(t)}$$

**End If**
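As a rough numeric illustration of how the update value evolves, assume an initial update value of 0.1 (a value chosen here only for the example), the typical step values $\eta^{+} = 1.2$ and $\eta^{-} = 0.5$, and no upper or lower limit on the update value. If the gradient keeps its sign for two iterations and then flips, the update value changes as follows.

$$
\begin{aligned}
\Delta^{(1)} &= 0.1 \\
\Delta^{(2)} &= 1.2 \cdot 0.1 = 0.12 && (c > 0) \\
\Delta^{(3)} &= 1.2 \cdot 0.12 = 0.144 && (c > 0) \\
\Delta^{(4)} &= 0.5 \cdot 0.144 = 0.072 && (c < 0)
\end{aligned}
$$

The update value grows gradually while the gradient keeps its sign and is halved as soon as a minimum is overshot.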

### Implementing RPROP+

RPROP+ is very similar to the original Riedmiller implementation. The main difference is that we revert the previous iteration's weight change if the sign of the gradient changes in the current iteration. This is sometimes called "weight backtracking". The basic programmatic implementation for calculating each weight change is shown here.

    // multiply the current and previous gradient, and take the
    // sign. We want to see if the gradient has changed its sign.
    change = sign( gradient * lastGradient )
    weightChange = 0

    // if the gradient has retained its sign, then we increase the
    // delta so that it will converge faster
    if change > 0
      delta = min( delta * positiveStep , maxStep)
      weightChange = -sign(gradient) * delta
      lastGradient = gradient
    // if change<0, then the sign has changed, and the last
    // delta was too big
    else if change < 0
      delta = max( delta * negativeStep , minStep)
      weightChange = -lastWeightChange
      lastGradient = 0
    // if change==0 then there is no change to the delta
    else if change == 0
      weightChange = -sign( gradient ) * delta
      lastGradient = gradient
    end if

    lastDelta = delta
    lastWeightChange = weightChange
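The RPROP+ pseudocode above can be sketched for a single weight in Java. This is an illustration under stated assumptions, not Encog's implementation: the class name `RpropPlusSketch`, the initial update value of 0.1, and the limits `MAX_STEP` and `MIN_STEP` are all chosen for the example.

```java
// Hypothetical single-weight RPROP+ sketch, not Encog code.
final class RpropPlusSketch {
    static final double POSITIVE_STEP = 1.2; // typical value from the table above
    static final double NEGATIVE_STEP = 0.5; // typical value from the table above
    static final double MAX_STEP = 50.0;     // assumed upper limit for delta
    static final double MIN_STEP = 1e-6;     // assumed lower limit for delta

    double delta = 0.1;            // assumed initial update value
    double lastGradient = 0.0;
    double lastWeightChange = 0.0;

    /** Returns the weight change for one iteration, given the current gradient. */
    double update(double gradient) {
        double change = Math.signum(gradient * lastGradient);
        double weightChange;
        if (change > 0) {
            // Same sign as before: grow the update value and keep going.
            delta = Math.min(delta * POSITIVE_STEP, MAX_STEP);
            weightChange = -Math.signum(gradient) * delta;
            lastGradient = gradient;
        } else if (change < 0) {
            // Sign flipped: shrink the update value and undo the last step.
            delta = Math.max(delta * NEGATIVE_STEP, MIN_STEP);
            weightChange = -lastWeightChange;
            lastGradient = 0.0;
        } else {
            // A gradient was zero: step with the unchanged update value.
            weightChange = -Math.signum(gradient) * delta;
            lastGradient = gradient;
        }
        lastWeightChange = weightChange;
        return weightChange;
    }

    public static void main(String[] args) {
        RpropPlusSketch weight = new RpropPlusSketch();
        for (double g : new double[] {2.0, 1.0, -0.5, -0.2}) {
            System.out.println("gradient " + g + " -> weight change " + weight.update(g));
        }
    }
}
```

In a full trainer, one such set of state values would be kept per weight, typically in parallel arrays indexed by weight.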

### Implementing RPROP-

RPROP- questions how important the "weight backtracking" of RPROP+ actually is. RPROP- removes this and simplifies the algorithm.

    // multiply the current and previous gradient, and take the
    // sign. We want to see if the gradient has changed its sign.
    change = sign( gradient * lastGradient )
    weightChange = 0

    // if the gradient has retained its sign, then we increase the
    // delta so that it will converge faster
    if change > 0
      delta = min( delta * positiveStep , maxStep)
    // if change<0, then the sign has changed, and the last
    // delta was too big
    else if change < 0
      delta = max( delta * negativeStep , minStep)
    end if

    weightChange = -sign( gradient ) * delta
    lastGradient = gradient
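To see RPROP- at work, the sketch below applies it to a single weight with the toy error function E(w) = (w - 3)^2, whose gradient is 2(w - 3). The error function, the class name `RpropMinusSketch`, the initial update value of 0.1, and the step limits are assumptions made for this example and do not come from Encog.

```java
// Hypothetical RPROP- demonstration on a one-dimensional quadratic, not Encog code.
final class RpropMinusSketch {
    static final double POSITIVE_STEP = 1.2;
    static final double NEGATIVE_STEP = 0.5;
    static final double MAX_STEP = 50.0; // assumed upper limit for delta
    static final double MIN_STEP = 1e-6; // assumed lower limit for delta

    /** Runs RPROP- from the starting weight w and returns the final weight. */
    static double minimize(double w, int iterations) {
        double delta = 0.1;        // assumed initial update value
        double lastGradient = 0.0;
        for (int t = 0; t < iterations; t++) {
            double gradient = 2.0 * (w - 3.0); // dE/dw for E(w) = (w - 3)^2
            double change = Math.signum(gradient * lastGradient);
            if (change > 0) {
                delta = Math.min(delta * POSITIVE_STEP, MAX_STEP);
            } else if (change < 0) {
                delta = Math.max(delta * NEGATIVE_STEP, MIN_STEP);
            }
            // No backtracking: always step against the current gradient.
            w += -Math.signum(gradient) * delta;
            lastGradient = gradient;
        }
        return w;
    }

    public static void main(String[] args) {
        System.out.println("final weight: " + minimize(0.0, 100));
    }
}
```

Only the sign of the gradient is used, so rescaling the error function would not change the path the weight takes.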

### Implementing iRPROP+

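The iRPROP+ algorithm revisits the "weight backtracking" seen in RPROP+: when the gradient changes sign, the previous weight change is reverted only if the training error has also increased. Some research suggests that iRPROP+ is the best-performing RPROP algorithm.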

    // multiply the current and previous gradient, and take the
    // sign. We want to see if the gradient has changed its sign.
    change = sign( gradient * lastGradient )
    weightChange = 0

    // if the gradient has retained its sign, then we increase the
    // delta so that it will converge faster
    if change > 0
      delta = min( delta * positiveStep , maxStep)
      weightChange = -sign(gradient) * delta
      lastGradient = gradient
    // if change<0, then the sign has changed, and the last
    // delta was too big
    else if change < 0
      delta = max( delta * negativeStep , minStep)
      // only backtrack if the error got worse
      if currentError > lastError then
        weightChange = -lastWeightChange
      end if
      lastGradient = 0
    // if change==0 then there is no change to the delta
    else if change == 0
      weightChange = -sign( gradient ) * delta
      lastGradient = gradient
    end if

    lastDelta = delta
    lastWeightChange = weightChange
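The only step where iRPROP+ differs from RPROP+ is the sign-change branch, which can be isolated in a small Java sketch. The class and method names here (`IrpropPlusBranchSketch`, `signChangedWeightChange`) are hypothetical and not taken from Encog.

```java
// Hypothetical sketch of the iRPROP+ sign-change branch, not Encog code.
final class IrpropPlusBranchSketch {

    /**
     * Weight change when the gradient has flipped sign (change < 0).
     * The previous step is reverted only if the training error increased;
     * otherwise the weight is left where it is for this iteration.
     */
    static double signChangedWeightChange(double lastWeightChange,
                                          double currentError,
                                          double lastError) {
        if (currentError > lastError) {
            return -lastWeightChange;
        }
        return 0.0;
    }

    public static void main(String[] args) {
        // Error went up from 0.4 to 0.5, so the previous step is undone.
        System.out.println(signChangedWeightChange(-0.12, 0.5, 0.4));
        // Error went down from 0.4 to 0.3, so the previous step is kept.
        System.out.println(signChangedWeightChange(-0.12, 0.3, 0.4));
    }
}
```

Note that the error is a property of the whole network, so the same comparison result applies to every weight in a given iteration.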

### Implementing iRPROP-

iRPROP- is very similar to RPROP-. No weight backtracking is used. However, the gradient is set to zero when it changes its sign, so no weight change is applied in that iteration and the **lastGradient** is stored as zero.

    // multiply the current and previous gradient, and take the
    // sign. We want to see if the gradient has changed its sign.
    change = sign( gradient * lastGradient )
    weightChange = 0

    // if the gradient has retained its sign, then we increase the
    // delta so that it will converge faster
    if change > 0
      delta = min( delta * positiveStep , maxStep)
    // if change<0, then the sign has changed, and the last
    // delta was too big
    else if change < 0
      delta = max( delta * negativeStep , minStep)
      // zero the gradient so that lastGradient is stored as zero
      gradient = 0
    end if

    weightChange = -sign( gradient ) * delta
    lastGradient = gradient
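A single-weight Java sketch of iRPROP- is shown below, assuming that the gradient is zeroed on a sign change so that the stored **lastGradient** becomes zero and the weight is not moved in that iteration. The class name `IrpropMinusSketch`, the initial update value of 0.1, and the step limits are assumptions for the example rather than Encog's actual code.

```java
// Hypothetical single-weight iRPROP- sketch, not Encog code.
final class IrpropMinusSketch {
    static final double POSITIVE_STEP = 1.2;
    static final double NEGATIVE_STEP = 0.5;
    static final double MAX_STEP = 50.0; // assumed upper limit for delta
    static final double MIN_STEP = 1e-6; // assumed lower limit for delta

    double delta = 0.1;        // assumed initial update value
    double lastGradient = 0.0;

    /** Returns the weight change for one iteration, given the current gradient. */
    double update(double gradient) {
        double change = Math.signum(gradient * lastGradient);
        if (change > 0) {
            delta = Math.min(delta * POSITIVE_STEP, MAX_STEP);
        } else if (change < 0) {
            delta = Math.max(delta * NEGATIVE_STEP, MIN_STEP);
            gradient = 0.0; // forget the gradient after a sign change
        }
        lastGradient = gradient;
        if (gradient == 0.0) {
            return 0.0; // nothing to follow this iteration
        }
        return -Math.signum(gradient) * delta;
    }

    public static void main(String[] args) {
        IrpropMinusSketch weight = new IrpropMinusSketch();
        for (double g : new double[] {1.0, 2.0, -1.0, -1.0}) {
            System.out.println("gradient " + g + " -> weight change " + weight.update(g));
        }
    }
}
```

Because the stored gradient is zero after a sign change, the next iteration falls into the "no change" case and steps with the already reduced update value.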