Somewhat strange behavior detected

Diamantregen's picture

Hello,

Could someone please shed some light on the following? The code below gives a different result (both the error and the weights differ) when the line marked with the arrow is uncommented, but I'm not sure why.

The goal is to continue a previously interrupted training run without having to keep the ITrain object in memory.

Thanks!

Regards,
Roger

Here is the code:

BasicNetwork network = new BasicNetwork();
network.AddLayer(new BasicLayer(new ActivationSigmoid(), false, 1));
network.AddLayer(new BasicLayer(new ActivationSigmoid(), true, 1));
network.AddLayer(new BasicLayer(new ActivationSigmoid(), true, 1));
network.Logic = new FeedforwardLogic();
network.Structure.FinalizeStructure();
NetworkCODEC.ArrayToNetwork(new double[] { 0.3233232, 0.1212328, -0.11, 0.984 }, network);
INeuralDataSet trainingSet = new BasicNeuralDataSet(new double[][] { new double[] { 0 } }, new double[][] { new double[] { 0 } });
ITrain train = new Backpropagation(network, trainingSet, 0.3d, 0.9d);
Console.WriteLine("Start at " + DateTime.Now);
double mse = 0;
/********************/
for (int i = 0; i < 1; i++)
{
    train.Iteration();
    mse = train.Error;
}
//train = new Backpropagation(network, trainingSet, 0.3d, 0.9d); //<---- uncommenting this gives a different result
for (int i = 0; i < 2; i++)
{
    train.Iteration();
    mse = train.Error;
}
/********************/
Console.WriteLine("End at " + DateTime.Now);
Console.WriteLine("MSE before last iteration: " + mse);

double[] doubleArray = NetworkCODEC.NetworkToArray(network);
Console.WriteLine(doubleArray[0]);
Console.WriteLine(doubleArray[1]);
Console.WriteLine(doubleArray[2]);
Console.WriteLine(doubleArray[3]);

SeemaSingh's picture

Backpropagation uses momentum to help training. You are specifying a momentum of 0.9, which is nearly "full" momentum. Momentum means adding that fraction of the previous iteration's weight delta to the current update. When you create a new trainer, there is no previous iteration, so you lose the momentum for just that one iteration (the first one with the new trainer).
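To make that concrete, the textbook momentum update can be written as below. This is the standard form of the rule, not a transcription of Encog's internals, and the symbols (\eta for the learning rate of 0.3, \alpha for the momentum of 0.9) are my own labels.

```latex
% Standard backpropagation weight update with momentum (textbook form, assumed here)
\Delta w_t = -\eta \, \frac{\partial E}{\partial w} + \alpha \, \Delta w_{t-1},
\qquad
w_{t+1} = w_t + \Delta w_t

% With a freshly created trainer there is no stored previous delta, so \Delta w_{t-1} = 0
% and the first update reduces to plain gradient descent:
\Delta w_t = -\eta \, \frac{\partial E}{\partial w}
```

With \alpha = 0 the second term vanishes in both cases, which is consistent with getting identical results when momentum is set to zero.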

I tried your example, and you are correct: you get different results. However, when I ran your example with momentum set to zero for both trainers, I got the same result.

Diamantregen's picture

Thanks for the explanation.

So, is there a solution to this problem? For example, is it possible to save the missing information and pass it to Encog when training resumes? I would like to avoid persisting the whole ITrain object; preferably only the weights, and possibly that missing information, would be persisted. When training resumes, the persisted information would be loaded and passed to Encog when reinitializing the network. Thanks.

SeemaSingh's picture

Somehow that previous iteration delta array needs to be saved. This same thing exists in RPROP, only WAY worse. RPROP has two arrays that have to be stored so that training can continue, or else training REALLY suffers: a 2% error can jump back up to 40% or so.
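Here is a rough sketch of why saving that delta matters. It is a toy, not Encog code: the MomentumTrainer class, its LastDelta property, and the one-weight error function E(w) = 0.5 * w^2 are all made up for illustration. It only shows that a new trainer seeded with the saved delta continues exactly like the uninterrupted one, while a new trainer without it diverges.

```csharp
using System;

// Hypothetical toy trainer (NOT part of Encog): one weight, error E(w) = 0.5 * w^2,
// so the gradient is simply w.
public sealed class MomentumTrainer
{
    private readonly double learningRate;
    private readonly double momentum;

    public double Weight { get; private set; }

    // The "previous iteration delta" that is lost when a trainer is recreated.
    public double LastDelta { get; private set; }

    public MomentumTrainer(double weight, double learningRate, double momentum, double lastDelta = 0.0)
    {
        Weight = weight;
        LastDelta = lastDelta;
        this.learningRate = learningRate;
        this.momentum = momentum;
    }

    public void Iteration()
    {
        double gradient = Weight; // dE/dw for E(w) = 0.5 * w^2
        double delta = -learningRate * gradient + momentum * LastDelta;
        Weight += delta;
        LastDelta = delta;
    }
}

public static class Demo
{
    // Three iterations on a single trainer.
    public static double RunUninterrupted()
    {
        var train = new MomentumTrainer(0.5, 0.3, 0.9);
        for (int i = 0; i < 3; i++) train.Iteration();
        return train.Weight;
    }

    // One iteration, then a brand-new trainer for the remaining two.
    // restoreDelta decides whether the saved delta is handed to the new trainer.
    public static double RunWithNewTrainer(bool restoreDelta)
    {
        var first = new MomentumTrainer(0.5, 0.3, 0.9);
        first.Iteration();

        double savedDelta = restoreDelta ? first.LastDelta : 0.0;
        var second = new MomentumTrainer(first.Weight, 0.3, 0.9, savedDelta);
        for (int i = 0; i < 2; i++) second.Iteration();
        return second.Weight;
    }

    public static void Main()
    {
        Console.WriteLine("Uninterrupted:          " + RunUninterrupted());
        Console.WriteLine("New trainer, no delta:  " + RunWithNewTrainer(false));
        Console.WriteLine("New trainer, with delta: " + RunWithNewTrainer(true));
    }
}
```

In this toy only the weight and the last delta need to be persisted, which is the same idea as storing the weights plus the previous iteration delta array rather than the whole trainer.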

To fix this we added the pause and resume methods on training. They return an EncogPersisted object, TrainingContinuation, that can be used to restart the training exactly where you left off. Currently only RPROP supports it. I didn't think backprop needed it, but that is a good point, the momentum does get lost.

I have the task of reworking the training continuation stuff for Encog 2.5, as the flat networks affected it somewhat. While I am hooking it back together, I will also implement it for backprop. That only makes sense.

Diamantregen's picture

That sounds really cool! So just creating a new TrainingContinuation object, setting its content and passing it to the Resume method will make it easy to continue the training.

Thanks for implementing it for BackProp!

Regards,
Roger


Copyright 2005 - 2012 by Heaton Research, Inc. Heaton Research™ and Encog™ are trademarks of Heaton Research.