
Text lemmatization problem

Hi, yesterday I stumbled upon the book "Neural networks for Java" and wondered whether it is feasible to use neural networks to lemmatize words. The target language is inflectionally rich, so it is hard to find lemmas with simple "classical programming" rules.
I have a 100k word/lemma/MSD corpus (MSD = morphosyntactic description = POS tag). The lemma of a word depends heavily on the word's morphology (i.e. type, gender, case, number, subtype, etc.).
I have a trained POS tagger, so the input would be an unknown word together with its POS tag, and the output would be the word's lemma.
Which neural network architecture would be best to try? How should I structure the input?
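
To make the input/output question more concrete, here is a rough sketch of one encoding I could imagine, not something I have tested on my corpus. It assumes the lemma can be reached by stripping a suffix from the word and appending a new one, so the network would predict a rule label instead of a whole string. The function names, the `k=4` window and the example tag `Ncfpn` are all made up for illustration; the example words are English only to keep it readable.

```python
def edit_rule(word, lemma):
    """Derive a (n_strip, suffix) rule: strip n_strip chars from the end of word, then append suffix."""
    # Length of the longest common prefix of word and lemma.
    p = 0
    while p < min(len(word), len(lemma)) and word[p] == lemma[p]:
        p += 1
    return len(word) - p, lemma[p:]


def apply_rule(word, rule):
    """Apply a (n_strip, suffix) rule to a word to get the predicted lemma."""
    n_strip, suffix = rule
    return word[:len(word) - n_strip] + suffix


def features(word, msd, k=4):
    """Symbolic input features: the last k characters (left-padded with '_') plus the MSD tag."""
    tail = word[-k:].rjust(k, "_")
    return [f"c{i}={ch}" for i, ch in enumerate(tail)] + [f"msd={msd}"]


# Hypothetical training pair: the rule becomes the class label for the network.
rule = edit_rule("studies", "study")
x = features("studies", "Ncfpn")
```

Each distinct rule in the corpus would then be one output class, and the features could be one-hot encoded as the network input. Whether this works for irregular forms or prefix changes is exactly what I am unsure about.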

Any suggestions/ideas/hints/lookups are more than welcome!
Thanks in advance!

Nikola
