I am using Encog to implement neural network (NN) and SVM-based bioinformatics analyses. In one bioinformatics sub-field, the gold standard for comparing two binary classification models is the area under the ROC curve (AUC), and I must report this value in publications.
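AUC requires a ranked ordering of the classification predictions. This is not a problem for NN, since a continuous confidence value is reported for each prediction. For SVM, the only reported values are 0 or 1. While this is sufficient for metrics such as the F1 score, it is not sufficient for AUC, because hard labels collapse the ROC curve to a single operating point.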
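To illustrate the difference, here is a minimal, self-contained sketch that computes AUC from ranks (the Mann-Whitney formulation, with tied scores given their average rank). The class name `AucSketch`, the method `auc`, and the sample scores are made up for illustration; none of this is part of Encog or LIBSVM.

```java
import java.util.Arrays;
import java.util.Comparator;

// Hypothetical helper, not part of Encog: rank-based AUC (Mann-Whitney U).
class AucSketch {

    // scores: higher means "more likely positive"; labels: 1 = positive, 0 = negative.
    static double auc(double[] scores, int[] labels) {
        int n = scores.length;
        Integer[] order = new Integer[n];
        for (int k = 0; k < n; k++) {
            order[k] = k;
        }
        // Sort indices by ascending score.
        Arrays.sort(order, Comparator.comparingDouble(idx -> scores[idx]));

        // Assign 1-based ranks; tied scores share the average rank of their group.
        double[] ranks = new double[n];
        int start = 0;
        while (start < n) {
            int end = start;
            while (end + 1 < n && scores[order[end + 1]] == scores[order[start]]) {
                end++;
            }
            double avgRank = (start + end) / 2.0 + 1.0;
            for (int k = start; k <= end; k++) {
                ranks[order[k]] = avgRank;
            }
            start = end + 1;
        }

        double posRankSum = 0.0;
        long nPos = 0;
        for (int k = 0; k < n; k++) {
            if (labels[k] == 1) {
                posRankSum += ranks[k];
                nPos++;
            }
        }
        long nNeg = n - nPos;
        if (nPos == 0 || nNeg == 0) {
            throw new IllegalArgumentException("AUC needs at least one positive and one negative");
        }
        // U statistic of the positives divided by the number of positive/negative pairs.
        return (posRankSum - nPos * (nPos + 1) / 2.0) / ((double) nPos * nNeg);
    }

    public static void main(String[] args) {
        int[] labels = {1, 1, 0, 1, 0, 0};
        // Continuous confidences, as an NN (or a probability-enabled SVM) might report.
        double[] continuous = {0.9, 0.8, 0.7, 0.6, 0.3, 0.1};
        // The same predictions thresholded at 0.5, as a plain SVM classifier reports them.
        double[] hard = {1, 1, 1, 1, 0, 0};

        System.out.println("AUC from continuous scores: " + auc(continuous, labels));
        System.out.println("AUC from 0/1 outputs:       " + auc(hard, labels));
    }
}
```

With the continuous scores, 8 of the 9 positive/negative pairs are ordered correctly (AUC of 8/9). With the thresholded outputs, every pair that falls on the same side of the threshold becomes a tie worth half a pair, so the value drops to 7.5/9 and no longer reflects how well the model ranks examples.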
LIBSVM has a parameter option (-b 1) to "obtain a model with probability information and predict test data with probability estimates." The org.encog.ml.svm.SVM constructor creates a default org.encog.mathutil.libsvm.svm_parameter object with the probability parameter set to 0. There is currently no way to set custom SVM parameters before training.
I think an option to use probability estimates for SVM training and evaluation would be a very valuable addition to Encog. I envision that a user could supply a custom svm_parameter (or at least change the probability parameter), and the probability results would be returned as an array from MLData.getData(). It would be important for a persisted SVM model to retain the ability to return probabilities when it is later used to classify new data.
Would there be support for including this enhancement in a future version of Encog? I would be happy to help test the new feature.