XOR sample sometimes gets stuck in an endless loop during training

rtsang's picture

I have tried out the first example, XOR. When I run the program several times, training gets stuck in an endless loop in about 2 out of 4 runs. Is this a bug?

The following is from the log:
Epoch #1 Error:1.4791786351330383
Epoch #2 Error:1.4791786351330383
Epoch #3 Error:1.3784874783572416
Epoch #4 Error:1.2166564202505221
Epoch #5 Error:1.0090751839897298
Epoch #6 Error:0.7597002609965798
Epoch #7 Error:0.5088553838519776
Epoch #8 Error:0.5081817343214782
Epoch #9 Error:0.5062081150734146
Epoch #10 Error:0.5049785769710606
Epoch #11 Error:0.44567714493314387
Epoch #12 Error:0.4961471412954777
Epoch #13 Error:0.49553390783811174
Epoch #14 Error:0.4799110972052888
Epoch #15 Error:0.4323748815718149
Epoch #16 Error:0.45242778931345634
Epoch #17 Error:0.45242770617683864
Epoch #18 Error:0.44350462319411615
Epoch #19 Error:0.4195678365974226
Epoch #20 Error:0.42218320785683333
Epoch #21 Error:0.42218320785683333
Epoch #22 Error:0.41685864777979437
Epoch #23 Error:0.4137373032835609
Epoch #24 Error:0.4134488386335428
Epoch #25 Error:0.4137184194833095
Epoch #26 Error:0.4095866841043894
Epoch #27 Error:0.4108853874447044
Epoch #28 Error:0.40988407480508776
Epoch #29 Error:0.40880427953877463
Epoch #30 Error:0.4088915593308395
Epoch #31 Error:0.4088915593308395
Epoch #32 Error:0.4088915593308395
Epoch #33 Error:0.40826322623260924
Epoch #34 Error:0.40826322623260924
Epoch #35 Error:0.40826322623260924
Epoch #36 Error:0.4084383648603632
Epoch #37 Error:0.4084383648603632
Epoch #38 Error:0.4084383648603632
Epoch #39 Error:0.40825938339646917
Epoch #40 Error:0.40833841582419056
Epoch #41 Error:0.40833841582419056
Epoch #42 Error:0.40833841582419056
Epoch #43 Error:0.4082521066510239
Epoch #44 Error:0.40830210385666277
Epoch #45 Error:0.40830210385666277
Epoch #46 Error:0.40830210385666277
Epoch #47 Error:0.4082521597859816
Epoch #48 Error:0.4082713752121419

...
Epoch #7310 Error:0.4082482907521338
Epoch #7311 Error:0.4082482907526949
Epoch #7312 Error:0.4082482907526949
Epoch #7313 Error:0.4082482907526949
Epoch #7314 Error:0.4082482907521338
Epoch #7315 Error:0.4082482907521338
Epoch #7316 Error:0.4082482907521338
Epoch #7317 Error:0.4082482907526949
Epoch #7318 Error:0.4082482907526949
Epoch #7319 Error:0.4082482907526949
Epoch #7320 Error:0.4082482907521338

jeffheaton's picture

It is actually not a bug; it depends on the training technique. Especially with RPROP, SCG and LMA, some random starting weights are simply not trainable for the XOR network. The very small size of the XOR training set does not help either.

Once you start with a new random weight/bias matrix, the network will train just fine. Bigger neural networks have this problem much less often.
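To illustrate the point about starting over with new random weights, here is a minimal sketch in plain Java. It does not use Encog: the class `XorRestartSketch`, its methods, the 2-2-1 sigmoid layout, the learning rate and the epoch limits are all assumptions chosen for illustration, and it uses simple online backpropagation rather than RPROP, SCG or LMA. Each attempt trains from fresh random weights and is abandoned if it does not reach the target error within the epoch limit.

```java
import java.util.Random;

class XorRestartSketch {
    static final double[][] INPUT = {{0, 0}, {0, 1}, {1, 0}, {1, 1}};
    static final double[] IDEAL = {0, 1, 1, 0};

    /** Outcome of training: the last measured error and the restarts used. */
    static class Result {
        final double error;
        final int restarts;
        Result(double error, int restarts) {
            this.error = error;
            this.restarts = restarts;
        }
    }

    static double sigmoid(double x) {
        return 1.0 / (1.0 + Math.exp(-x));
    }

    /** 9 weights: w[0..5] = two hidden neurons (2 inputs + bias each), w[6..8] = output neuron. */
    static double[] randomWeights(Random rnd) {
        double[] w = new double[9];
        for (int i = 0; i < w.length; i++) {
            w[i] = rnd.nextDouble() * 2 - 1; // uniform in [-1, 1)
        }
        return w;
    }

    /** One epoch of online backpropagation; returns the mean squared error seen during the epoch. */
    static double epoch(double[] w, double rate) {
        double sum = 0;
        for (int p = 0; p < INPUT.length; p++) {
            double x0 = INPUT[p][0], x1 = INPUT[p][1];
            double h0 = sigmoid(w[0] * x0 + w[1] * x1 + w[2]);
            double h1 = sigmoid(w[3] * x0 + w[4] * x1 + w[5]);
            double y = sigmoid(w[6] * h0 + w[7] * h1 + w[8]);
            double err = IDEAL[p] - y;
            sum += err * err;
            // Deltas are computed before any weight changes.
            double dy = err * y * (1 - y);
            double d0 = dy * w[6] * h0 * (1 - h0);
            double d1 = dy * w[7] * h1 * (1 - h1);
            w[6] += rate * dy * h0; w[7] += rate * dy * h1; w[8] += rate * dy;
            w[0] += rate * d0 * x0; w[1] += rate * d0 * x1; w[2] += rate * d0;
            w[3] += rate * d1 * x0; w[4] += rate * d1 * x1; w[5] += rate * d1;
        }
        return sum / INPUT.length;
    }

    /** Trains until targetError is reached, re-randomizing the weights after each failed attempt. */
    static Result trainWithRestarts(long seed, double targetError, int maxEpochs, int maxRestarts) {
        Random rnd = new Random(seed);
        double error = Double.MAX_VALUE;
        for (int restart = 0; restart <= maxRestarts; restart++) {
            double[] w = randomWeights(rnd); // fresh starting point for this attempt
            for (int e = 0; e < maxEpochs; e++) {
                error = epoch(w, 0.5);
                if (error <= targetError) {
                    return new Result(error, restart);
                }
            }
        }
        return new Result(error, maxRestarts); // gave up; error is from the last attempt
    }

    public static void main(String[] args) {
        Result r = trainWithRestarts(42L, 0.01, 20000, 100);
        System.out.println("error=" + r.error + " restarts=" + r.restarts);
    }
}
```

The sketch only gives up on an attempt after a fixed number of epochs. Detecting a stalled error earlier, as a reset strategy would, is a refinement on the same idea.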

jeffheaton's picture

By the way, the examples in 2.4 contain a "strategy" that detects bad initial weights and resets them if needed, so you actually won't see this happen in 2.4. I just checked in an improvement to address this. I had been meaning to do it for a while. It will be rolled out to C# as well.

spdracer22's picture

Was this an improvement to the core? Is it in 2.4.2?

I am having the exact same issue as the OP with 2.4.2. I'd been stressing for the past couple of hours trying to figure out what was going on. XOR only trains with RProp about 1 in 6 times, and even when it does, it gives grossly incorrect compute results, even when trained to a very low error level...

Good to know why exactly this is happening, and that I'm not missing anything.

jeffheaton's picture

Each of the examples that might run into this has the following line added.

// reset if improvement is less than 1% over 5 cycles
train.addStrategy(new RequiredImprovementStrategy(5));

There are many options on the constructor for RequiredImprovementStrategy, so this is not the only way to do it.
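For readers who want to see the idea behind "reset if improvement is less than 1% over 5 cycles", here is a rough sketch in plain Java. It is not Encog's implementation of RequiredImprovementStrategy: the class `ImprovementMonitor`, its constructor parameters and the exact counting rule are assumptions made for illustration. The caller feeds it the error after each epoch and re-randomizes the weights when it returns true.

```java
class ImprovementMonitor {
    private final double minImprovement; // required relative improvement, e.g. 0.01 for 1%
    private final int cycles;            // consecutive poor epochs tolerated before a reset
    private double referenceError = Double.NaN;
    private int badCycles = 0;

    ImprovementMonitor(double minImprovement, int cycles) {
        this.minImprovement = minImprovement;
        this.cycles = cycles;
    }

    /** Call once per epoch with the current error; returns true when a reset is due. */
    boolean shouldReset(double error) {
        if (Double.isNaN(referenceError)) {
            referenceError = error; // first observation after start or reset
            return false;
        }
        double improvement = (referenceError - error) / referenceError;
        if (improvement >= minImprovement) {
            referenceError = error; // good progress: move the baseline forward
            badCycles = 0;
            return false;
        }
        badCycles++;
        if (badCycles >= cycles) {
            referenceError = Double.NaN; // forget history; training starts over
            badCycles = 0;
            return true;
        }
        return false;
    }

    public static void main(String[] args) {
        ImprovementMonitor monitor = new ImprovementMonitor(0.01, 5);
        // Plateau value taken from the log earlier in this thread.
        for (int epoch = 1; epoch <= 6; epoch++) {
            boolean reset = monitor.shouldReset(0.4082482907521338);
            System.out.println("Epoch #" + epoch + " reset=" + reset);
        }
    }
}
```

In a training loop, a true result would be the point to assign new random weights and continue iterating, which is what the plateau in the original log never got.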

spdracer22's picture

Thanks, I'll give that a try.


Copyright 2005 - 2012 by Heaton Research, Inc. Heaton Research™ and Encog™ are trademarks of Heaton Research.