Activation functions
I have read elsewhere and heard/read from you that the sigmoid function is inappropriate for negative values. Is this only for academic purposes or am I totally not getting something?
Example: Assume I have a feedforward, backpropagation net with ten inputs that are always positive, and one output that is in the range of -100 through 100.
Would there be any disadvantage to using the sigmoid function and normalizing my results so that 0.0 represents -100, 0.5 represents 0, and 1.0 represents 100?
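For concreteness, the normalization described here is just a linear rescaling of the target range onto the sigmoid's output range. A minimal sketch, assuming plain Python and two made-up helper names (`to_unit` and `from_unit` are hypothetical, not from any library or from the book):

```python
def to_unit(y, lo=-100.0, hi=100.0):
    """Map a target value in [lo, hi] onto [0, 1] for a sigmoid output neuron."""
    return (y - lo) / (hi - lo)


def from_unit(s, lo=-100.0, hi=100.0):
    """Map a sigmoid output in [0, 1] back onto the original [lo, hi] range."""
    return lo + s * (hi - lo)


# The three reference points from the question: -100, 0 and 100.
print(to_unit(-100.0), to_unit(0.0), to_unit(100.0))

# Converting a network output of 0.75 back to the original scale.
print(from_unit(0.75))
```

The same pair of functions would also cover the input case by passing a different `lo` and `hi`.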
Let me then change the question: What if one or more inputs were negative and my normalization turned them into positive numbers?
I fear I am mixing theory you presented (Sigmoid vs. TANH) with being practical or worse yet am not getting something altogether!
Thanks for the book, videos, and this site!




I think what Jeff means is that if you would like output from the neural network to be in the full -1 to +1 range, don't use sigmoid, since sigmoid only outputs values between 0 and 1. I've used neural networks before where I want the output to be -1 for false, +1 for true. There I have to use something other than sigmoid, at least for the output layer. In theory I could use other types for the first few layers.
You could always take the sigmoid output, multiply it by 2, then subtract 1. That would give you a -1 to 1 range. It might not be quite what you want, but it should work.
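Here is a rough sketch of that rescaling, assuming plain Python with only the standard `math` module (the function names `sigmoid` and `scaled_sigmoid` are just illustrative). It also prints `tanh(x / 2)` next to it, because 2 * sigmoid(x) - 1 is mathematically the same curve as tanh(x / 2), so the trick amounts to a tanh with a gentler slope:

```python
import math


def sigmoid(x):
    """Standard logistic function; output lies between 0 and 1."""
    # Not hardened against overflow for very large negative x.
    return 1.0 / (1.0 + math.exp(-x))


def scaled_sigmoid(x):
    """Sigmoid stretched to the -1..1 range: multiply by 2, subtract 1."""
    return 2.0 * sigmoid(x) - 1.0


# Compare the rescaled sigmoid with tanh(x / 2) at a few points.
for x in (-5.0, 0.0, 5.0):
    print(x, scaled_sigmoid(x), math.tanh(x / 2.0))
```

If the training code needs the derivative of the activation, it has to be adjusted for the factor of 2 as well.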
Archistrage