one-of-n giving some weird results
Hi all,
I am working with a NN for image/pattern recognition. For an initial test, I set it up to recognize two kinds of images; let's call them "A" and "B".
I therefore set up an output layer with two neurons, so that if the NN believes the input is image "A", the output neuron array should be [0,1].
For image "B", it should be [1,0].
Basically it works, and I must say I'm fairly impressed with the results (even considering that the network has been poorly trained, with little training data).
However, I'm getting some weird outputs that I do not know how to interpret.
For one, I sometimes get negative values. For example, with image "B" as the input, something like [-0.008, +0.992] will appear.
Note, however, that these values are well within the error to which the NN was trained (I usually train until nn_error drops below 0.01), so technically the NN is giving the right answer, but I am not sure whether negative values are normal behaviour for this NN.
Also, my two output neurons don't seem to be complementary. I usually get answers like [0.006, 0.98], for example. It was my understanding that the two output neurons should add up to 1? As you can see in this example, 0.006 + 0.98 != 1.
Finally, it's not clear what the NN should answer when it recognises neither "A" nor "B", i.e., when it is presented with an unknown/untrained image. In that case, how should a one-of-n output layer behave? Perhaps something like [0.4, 0.6]?
Also, there are other cases where one output neuron falls well within the acceptable error range but the other is far off (like [0.996, 0.21], for example).
0.996 seems to be yelling "I'm seeing image 'A'!!", but what does the 0.21 on the other neuron mean? It's well over the error range we defined...
Thanks a lot for your comments and help,




OK, I figured it out... The problem was that ActivationTANH is BasicNetwork's default activation function; it seems that TANH does some weird things with positively-normalized values (i.e., [0, +1]). I guess it's better not to use TANH when you're sure your NN won't be using negative values at all.
So I changed my activation functions to ActivationSigmoid(), and now the NN works like clockwork.
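For anyone who hits the same thing, here is a minimal sketch of why the negative outputs appeared. It is plain Java and makes no Encog calls; the class and method names are made up for illustration. It just evaluates the two activation formulas for a few weighted sums: tanh produces values in (-1, +1), so a slightly negative output such as -0.008 is possible, while the logistic sigmoid stays in (0, 1).

```java
public class ActivationRanges {

    // Logistic sigmoid: output is always strictly between 0 and 1.
    static double sigmoid(double x) {
        return 1.0 / (1.0 + Math.exp(-x));
    }

    // Hyperbolic tangent: output is always strictly between -1 and +1.
    static double tanh(double x) {
        return Math.tanh(x);
    }

    public static void main(String[] args) {
        // Hypothetical weighted sums arriving at an output neuron.
        double[] sums = {-4.0, -0.5, 0.0, 0.5, 4.0};
        for (double s : sums) {
            System.out.printf("sum=%5.1f  tanh=%7.4f  sigmoid=%6.4f%n",
                    s, tanh(s), sigmoid(s));
        }
    }
}
```

Note that neither function forces the two output neurons to add up to 1, since each neuron is computed independently.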
Regards,
Jose,
Glad you figured it out, and thanks for posting the solution for others.
I agree with you on TANH. Its output spans [-1, +1], so by normalizing to [0, +1] you were using only half of its range.
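If you ever want to keep TANH, one option is to rescale the targets so they cover its whole range. The following is only a rough sketch in plain Java, not Encog API; the class and helper names are hypothetical. It maps [0, 1] values to [-1, +1] and back, and builds a one-of-n target that uses -1 for "off" and +1 for "on".

```java
import java.util.Arrays;

public class TanhTargetScaling {

    // Map a value from [0, 1] to [-1, +1].
    static double toTanhRange(double v) {
        return 2.0 * v - 1.0;
    }

    // Map a value from [-1, +1] back to [0, 1].
    static double toUnitRange(double v) {
        return (v + 1.0) / 2.0;
    }

    // One-of-n target for the tanh range: -1 everywhere, +1 at classIndex.
    static double[] oneOfN(int classIndex, int classCount) {
        double[] target = new double[classCount];
        Arrays.fill(target, -1.0);
        target[classIndex] = 1.0;
        return target;
    }

    public static void main(String[] args) {
        // Target for the first of two classes.
        System.out.println(Arrays.toString(oneOfN(0, 2)));
        // A raw tanh-range output read back on the [0, 1] scale.
        System.out.println(toUnitRange(toTanhRange(0.25)));
    }
}
```

With targets encoded this way, a negative output is simply the expected "off" signal rather than something odd.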