Threading bug in TaskGroup.java

fallsm's picture

Some people have reported that iteration() never returns. I hit the same problem and think I've solved it. In TaskGroup.waitForComplete(), the completion check is performed before the lock is acquired. If the final worker completes after the main thread does the check but before it gets the lock, the worker's notification is sent before the main thread starts waiting, so the main thread then sits in this.mightBeDone.await() forever without being notified.

I added a second check after the lock is acquired:

    try {
        if (!getNoTasks()) {
            this.mightBeDone.await();
        }
    } catch (InterruptedException e) {
    ...

Since making this change, iteration() has returned every time across millions of trials.
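For anyone who wants to see the pattern in isolation, here is a minimal, self-contained sketch of checking the condition while holding the lock. It is not the actual Encog source: the class TaskGroupSketch, the remaining counter, and the taskFinished() and runOnce() methods are made up for illustration, and only the names waitForComplete, getNoTasks and mightBeDone come from the code above. It also assumes mightBeDone is a java.util.concurrent Condition created from a ReentrantLock. The sketch uses while instead of if because Condition.await() is documented as allowing spurious wakeups.

```java
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical stand-in for TaskGroup; not the real Encog class.
class TaskGroupSketch {
    private final ReentrantLock lock = new ReentrantLock();
    // Assumption: the condition is created from the same lock that guards the counter.
    private final Condition mightBeDone = lock.newCondition();
    private int remaining; // number of unfinished tasks, guarded by lock

    TaskGroupSketch(int tasks) {
        this.remaining = tasks;
    }

    // Hypothetical worker callback: decrement and signal while holding the lock.
    void taskFinished() {
        lock.lock();
        try {
            remaining--;
            mightBeDone.signalAll();
        } finally {
            lock.unlock();
        }
    }

    boolean getNoTasks() {
        lock.lock();
        try {
            return remaining == 0;
        } finally {
            lock.unlock();
        }
    }

    // The check happens after the lock is acquired, so a worker cannot
    // signal between the check and the call to await().
    void waitForComplete() throws InterruptedException {
        lock.lock();
        try {
            while (!getNoTasks()) { // ReentrantLock allows this nested acquire
                mightBeDone.await();
            }
        } finally {
            lock.unlock();
        }
    }

    // Starts the given number of worker threads and waits for all of them.
    static boolean runOnce(int workers) {
        TaskGroupSketch group = new TaskGroupSketch(workers);
        for (int i = 0; i < workers; i++) {
            new Thread(group::taskFinished).start();
        }
        try {
            group.waitForComplete();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        return group.getNoTasks();
    }
}
```

Calling TaskGroupSketch.runOnce(4) in a loop is a quick way to exercise the wait; with the check moved outside the lock, the same loop would be expected to hang eventually.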

Hope this helps.

SeemaSingh's picture

That seems quite logical. I tested it, and it seems to work quite well. I went ahead and checked this into both 2.4.3 and 2.5 (mainline). Someone should check the C# side too; it might have the same issue.

jeffheaton's picture

Thank you! That is a really good catch. That makes sense to me, and we will get that implemented. I will also try to get a few really long training runs going, just to see if I can still see any sort of lockup.

Also, I looked at the C# code, and it uses a somewhat different method, but I will take a closer look and see if I want to tighten things up a bit.

jeffheaton's picture

Thanks for putting that in. I think it is a good fix.


Copyright 2005 - 2012 by Heaton Research, Inc. Heaton Research™ and Encog™ are trademarks of Heaton Research.