one-of-n giving some weird results

jmvaret's picture

Hi all,

I am working with a NN for image/pattern recognition. For an initial test, I set it up to recognize two kinds of images; let's call them "A" and "B".

So I set up an output layer with two neurons: if the NN believes the input to be image "A", the output neuron array should be [1,0].

For image "B", it should be [0,1].

Basically it does work, and I must say I'm fairly impressed with the results (even considering that the network has been poorly trained with little training data).

However, I'm getting some weird outputs that I don't know how to interpret.

For one, I sometimes get negative results. For example, when the input is image "B", something like [-0.008, +0.992] appears.
Note, however, that the values are well within the error the NN was trained to (I usually train until nn_error drops below 0.01), so technically the NN is giving the right answer, but I'm not sure whether negative values are normal behaviour for this NN.

Also, my two output neurons don't seem to be complementary. I usually get answers like [0.006, 0.98], for example. It was my understanding that the two output neurons should add up to 1? As you can see in this example, 0.006 + 0.98 != 1.

Finally, it's not clear what this NN should answer if it recognises neither image "A" nor image "B", i.e., when it's presented with an unknown/untrained image. In such a case, how should a one-of-n output layer behave? Perhaps something like [0.4, 0.6]?

Also, there are other instances where one output neuron falls well within the acceptable error range, but the other output neuron is far off (like [0.996, 0.21], for example).

0.996 seems to be yelling "I'm seeing image 'A'!!", but what about that other neuron's 0.21? It's well over the error range we defined...
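
For what it's worth, a common way to read a one-of-n output is to take the largest neuron as the answer and check how far ahead of the runner-up it is. Below is a minimal sketch in plain Java. The names `OneOfNDecoder`, `winner` and `margin` are hypothetical, made up for illustration, and are not part of Encog.

```java
// Hypothetical helper (not Encog API): interprets a one-of-n output vector.
final class OneOfNDecoder {

    /** Returns the index of the largest output, i.e. the "winning" class. */
    static int winner(double[] output) {
        int best = 0;
        for (int i = 1; i < output.length; i++) {
            if (output[i] > output[best]) {
                best = i;
            }
        }
        return best;
    }

    /**
     * Returns the gap between the largest and second-largest outputs.
     * A small gap suggests an ambiguous answer. Expects at least two outputs.
     */
    static double margin(double[] output) {
        double first = Double.NEGATIVE_INFINITY;
        double second = Double.NEGATIVE_INFINITY;
        for (double v : output) {
            if (v > first) {
                second = first;
                first = v;
            } else if (v > second) {
                second = v;
            }
        }
        return first - second;
    }

    public static void main(String[] args) {
        double[] output = {0.996, 0.21}; // the example from this post
        System.out.println("winner index = " + winner(output));
        System.out.println("margin       = " + margin(output));
    }
}
```

Read this way, [0.996, 0.21] is still a clear win for the first neuron. Nothing forces independent output neurons to sum to 1, so the second value is better seen as a separate, weaker vote than as a leftover probability.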

Thanks a lot for your comments and help,

jmvaret's picture

OK, I figured it out... the problem was that BasicNetwork's default activation function is ActivationTANH; it seems that TANH does some weird things with positively-normalized values (i.e., [0, +1]). I guess it's better not to use TANH when you're sure your NN won't be using negative values at all.

So I changed my activation functions to ActivationSigmoid(), and now the NN works like clockwork.
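
To illustrate the difference, here is a small sketch in plain Java comparing the two output ranges. It uses the standard logistic sigmoid and hyperbolic tangent formulas, which I am assuming are what ActivationSigmoid and ActivationTANH compute; the class and method names are hypothetical and it does not call Encog itself.

```java
// Hypothetical demo class (not Encog API): compares activation output ranges.
final class ActivationRanges {

    /** Logistic sigmoid: output always lies strictly between 0 and 1. */
    static double sigmoid(double x) {
        return 1.0 / (1.0 + Math.exp(-x));
    }

    /** Hyperbolic tangent: output lies between -1 and +1. */
    static double tanh(double x) {
        return Math.tanh(x);
    }

    public static void main(String[] args) {
        double[] inputs = {-5.0, -0.01, 0.0, 0.01, 5.0};
        for (double x : inputs) {
            System.out.printf("x=%6.2f  sigmoid=%.4f  tanh=%+.4f%n",
                    x, sigmoid(x), tanh(x));
        }
    }
}
```

With tanh, a target of 0 sits at the midpoint of the curve, so a slightly negative net input gives a slightly negative output, which would explain values like -0.008. The sigmoid cannot go below 0.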

Regards,
Jose

jeffheaton's picture

Glad you figured it out, and thanks for posting the solution for others.

I agree with you on TANH. The problem is that you were normalizing to only half of its range: TANH outputs span [-1, +1], while your targets only used [0, +1].
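
If you did want to keep TANH, one option is to rescale the targets so they use its full range. A minimal sketch in plain Java, assuming the targets start out in [0, 1]; `TanhScaling`, `toTanhRange` and `fromTanhRange` are hypothetical names for illustration, not Encog functions.

```java
// Hypothetical helper (not Encog API): maps values between [0, 1] and [-1, +1].
final class TanhScaling {

    /** Maps a value from [0, 1] onto [-1, +1], the full output range of tanh. */
    static double toTanhRange(double value) {
        return 2.0 * value - 1.0;
    }

    /** Inverse mapping: brings a tanh-range output back to [0, 1]. */
    static double fromTanhRange(double value) {
        return (value + 1.0) / 2.0;
    }

    public static void main(String[] args) {
        double[] target = {0.0, 1.0}; // a one-of-n target in [0, 1]
        for (double t : target) {
            double scaled = toTanhRange(t);
            System.out.println(t + " -> " + scaled + " -> " + fromTanhRange(scaled));
        }
    }
}
```

With this mapping a one-of-n target of [0, 1] becomes [-1, +1], so a negative output simply means "not this class".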


Copyright 2005 - 2012 by Heaton Research, Inc. Heaton Research™ and Encog™ are trademarks of Heaton Research.