Problems with SOM training in a Devnagari (Nepali) OCR
I'm trying to develop a Devnagari (Nepali) OCR application. Line detection, word detection, and character segmentation are all working. The neural network for this application is a SOM. I trained it with sample data like the following:
प:11111100011000111111000010000100001
म:11111010010100111111010010000100001
न:11111000011111110001100010000100001
स:11111010011111110111110010110100001
व:11111011111101110011111110000100001
ल:11111110011111110101110010100100001
क:11111001001111110101111010010000100
त:11111000011111110001110010110100101
त्र:11111111110001101111110010000100001
The problem is that the characters are not identified correctly at the recognition stage. I looked through your OCR example and adapted it for Devnagari character recognition.
Here are some screenshots you can look at:
http://sujandhakal.com.np/view/a-nepali-ocr-:-problems/2-12-42.html
What could the problem be, and how can I fix it? Please reply soon.




I've thought of trying to write an OCR app for Marathi, also Devnagari, but have not tried. Done correctly, the horizontal line could be quite useful for picking up lines of text.
My guess is that you might be downsampling too far. Devnagari is more complex than the Latin alphabet that this example was written for, so you might be losing too much detail in the downsample.
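A rough way to check this guess is to measure how far apart the training patterns are after downsampling. The sketch below is only an illustration, not part of the OCR example: it assumes each training string in the question is a 5x7 grid read row by row (35 bits, which the post does not state explicitly), and the helper names `as_grid`, `hamming`, and `closest_pairs` are made up for this example. It prints one glyph as a grid and lists the character pairs that differ in the fewest cells.

```python
from itertools import combinations

# Training patterns copied from the question; each string is 35 bits long.
SAMPLES = {
    "प": "11111100011000111111000010000100001",
    "म": "11111010010100111111010010000100001",
    "न": "11111000011111110001100010000100001",
    "स": "11111010011111110111110010110100001",
    "व": "11111011111101110011111110000100001",
    "ल": "11111110011111110101110010100100001",
    "क": "11111001001111110101111010010000100",
    "त": "11111000011111110001110010110100101",
    "त्र": "11111111110001101111110010000100001",
}

# Assumed grid shape: 5 columns by 7 rows (5 * 7 = 35 bits).
WIDTH, HEIGHT = 5, 7


def as_grid(bits, width=WIDTH):
    """Split a flat bit string into rows of `width` characters."""
    return [bits[i:i + width] for i in range(0, len(bits), width)]


def hamming(a, b):
    """Count the positions where two equal-length bit strings differ."""
    if len(a) != len(b):
        raise ValueError("patterns must have the same length")
    return sum(x != y for x, y in zip(a, b))


def closest_pairs(samples):
    """Return (distance, char_a, char_b) tuples, most similar pairs first."""
    pairs = [
        (hamming(samples[a], samples[b]), a, b)
        for a, b in combinations(samples, 2)
    ]
    return sorted(pairs)


if __name__ == "__main__":
    # Show one glyph the way the network sees it.
    for row in as_grid(SAMPLES["प"]):
        print(row.replace("1", "#").replace("0", "."))

    # List the five pairs separated by the fewest cells.
    for distance, a, b in closest_pairs(SAMPLES)[:5]:
        print(f"{a} vs {b}: {distance} of {WIDTH * HEIGHT} cells differ")
```

With these samples, न and त differ in only 4 of the 35 cells, so a small amount of noise in segmentation or downsampling could move a character from one class to the other. If the closest pairs stay this close, a larger downsample grid would be the first thing to try.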
OK, let me try changing the downsampling width and height. One question: can we use a feed-forward network for OCR? I guess not, but what do you suggest? My work is to compare a feed-forward backpropagation NN and a SOM (Kohonen) in an OCR, so please reply.