Problems with Normalization

mos:

I tried to segregate the first 30 rows of data from a CSV file (100 rows long), using the following code:

BasicNeuralDataSet target = new BasicNeuralDataSet();
DataNormalization norm = new DataNormalization();
norm.setTarget(new NormalizationStorageNeuralDataSet(target));
norm.addSegregator(new IndexRangeSegregator(0,30));

It didn't work: target is empty. Without the segregator it works fine (all 100 rows are normalized).

In addition, I didn't succeed in marking output fields as ideal data. I used the following code:

outputField[i].setIdeal(true);
norm.addOutputField(outputField[i]);

When I iterate over the target (via target.getData().iterator()), getIdeal() returns null.

jeffheaton:

Here is your code, modified to work on my system. I created test.csv, which contains two input fields and one ideal field.

BasicNeuralDataSet target = new BasicNeuralDataSet();
DataNormalization norm = new DataNormalization();
norm.setTarget(new NormalizationStorageNeuralDataSet(target));
norm.addSegregator(new IndexRangeSegregator(0,30));
InputField a,b,c;
norm.addInputField(a=new InputFieldCSV(true,new File("d:\\test.csv"),0));
norm.addInputField(b=new InputFieldCSV(true,new File("d:\\test.csv"),1));
norm.addInputField(c=new InputFieldCSV(false,new File("d:\\test.csv"),2));
norm.addOutputField(new OutputFieldDirect(a));
norm.addOutputField(new OutputFieldDirect(b));
norm.addOutputField(new OutputFieldDirect(c));
norm.process();
System.out.println(target.getRecordCount());
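To make the intent of the range segregator concrete, here is a minimal plain-Java sketch of the filtering it is being asked to do. It does not use Encog and is not Encog's implementation; the class IndexRangeSketch and the method keepRange are hypothetical names, and treating both bounds as inclusive is an assumption (the thread does not say how IndexRangeSegregator treats its upper index).

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not part of Encog.
final class IndexRangeSketch {

    /**
     * Keeps the rows whose zero-based index lies in [low, high].
     * Both bounds are inclusive here, which is an assumption.
     */
    static List<double[]> keepRange(List<double[]> rows, int low, int high) {
        List<double[]> kept = new ArrayList<>();
        for (int i = 0; i < rows.size(); i++) {
            if (i >= low && i <= high) {
                kept.add(rows.get(i));
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // Build 100 rows with two input values and one ideal value each.
        List<double[]> rows = new ArrayList<>();
        for (int i = 0; i < 100; i++) {
            rows.add(new double[] {i, i * 2.0, i % 2});
        }
        // Indexes 0..29 select the first 30 rows.
        List<double[]> firstRows = keepRange(rows, 0, 29);
        System.out.println(firstRows.size());
    }
}
```

Under these assumptions, a range of 0 to 29 selects exactly 30 rows, so comparing that count with target.getRecordCount() is a quick way to see how the real segregator treats its bounds.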

Anton:

Thanks, this really works.

I tried it this way, and it doesn't work with the IndexRangeSegregator, but it does work with the IndexSampleSegregator. Why do some segregators work while others don't? I use some other classes (InputFieldNeuralDataSet, and NormalizationStorageNeuralDataSet with a different constructor; with this constructor the dataset can be configured with the input and ideal counts).

This code works with IndexSampleSegregator but not with IndexRangeSegregator.


/**
 * Normalize the dataset.
 * @param data dataset to normalize
 * @param report optional status report (may be null)
 * @param segregator optional segregator (may be null)
 * @return the normalized dataset
 */
private static NeuralDataSet normalize(final NeuralDataSet data, StatusReportable report, Segregator segregator) {
    int inputCount = data.getInputSize();
    int idealCount = data.getIdealSize();
    int sum = inputCount + idealCount;

    DataNormalization norm = new DataNormalization();
    // norm.addSegregator(new IndexRangeSegregator(2,4)); // does not work! It skips all rows
    if (report != null) {
        norm.setReport(report);
    }

    InputFieldNeuralDataSet[] toNorm = new InputFieldNeuralDataSet[sum];
    OutputFieldRangeMapped[] normed = new OutputFieldRangeMapped[sum];

    for (int i = 0; i < sum; i++) {
        // The first inputCount fields are network inputs; the rest are ideal values.
        if (i < inputCount) {
            toNorm[i] = new InputFieldNeuralDataSet(true, data, i);
        } else {
            toNorm[i] = new InputFieldNeuralDataSet(false, data, i);
        }

        // Map every field into the range [0, 1].
        normed[i] = new OutputFieldRangeMapped(toNorm[i], 0, 1);
        norm.addInputField(toNorm[i]);
        norm.addOutputField(normed[i]);
    }
    NormalizationStorageNeuralDataSet resultDataSet = new NormalizationStorageNeuralDataSet(inputCount, idealCount);
    norm.setTarget(resultDataSet);
    if (segregator != null) {
        norm.addSegregator(segregator);
    }
    norm.process();
    return resultDataSet.getDataset();
}

I have another question: can I exclude some fields from normalization, for example normalize only the second field/column but not the first?
// edit: I found this snippet of yours in the other thread:

OutputFieldRangeMapped mpgField = (OutputFieldRangeMapped)((List)norm.getOutputFields()).get(0);
mpgField.convertBack(output.getData(0));

But I don't know where to get the output object; I only have a NeuralDataSet / BasicNeuralDataSet.
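For reference, the arithmetic behind range mapping and converting back can be sketched in plain Java, along with leaving some columns untouched. This is only an illustration under stated assumptions, not Encog's code: SelectiveRangeMapSketch, map, convertBack and normalizeColumns are hypothetical names, and the mapping is assumed to be ordinary min-max scaling. In the reply above, fields are passed through unchanged with OutputFieldDirect, which may be the Encog counterpart of the "false" entries in the flag array below.

```java
import java.util.Arrays;

// Hypothetical illustration only; not part of Encog.
final class SelectiveRangeMapSketch {

    /** Maps value from [min, max] into [low, high] (assumed min-max scaling). */
    static double map(double value, double min, double max, double low, double high) {
        return (value - min) / (max - min) * (high - low) + low;
    }

    /** Inverse of map: recovers the original value from a normalized one. */
    static double convertBack(double normalized, double min, double max, double low, double high) {
        return (normalized - low) / (high - low) * (max - min) + min;
    }

    /** Normalizes into [0, 1] only the columns flagged true; other columns are copied as-is. */
    static double[][] normalizeColumns(double[][] rows, boolean[] normalize) {
        int cols = normalize.length;
        double[] min = new double[cols];
        double[] max = new double[cols];
        Arrays.fill(min, Double.POSITIVE_INFINITY);
        Arrays.fill(max, Double.NEGATIVE_INFINITY);
        // First pass: find the observed range of every column.
        for (double[] row : rows) {
            for (int c = 0; c < cols; c++) {
                min[c] = Math.min(min[c], row[c]);
                max[c] = Math.max(max[c], row[c]);
            }
        }
        // Second pass: map flagged columns, pass the others through.
        double[][] out = new double[rows.length][cols];
        for (int r = 0; r < rows.length; r++) {
            for (int c = 0; c < cols; c++) {
                boolean canMap = normalize[c] && max[c] > min[c];
                out[r][c] = canMap ? map(rows[r][c], min[c], max[c], 0, 1) : rows[r][c];
            }
        }
        return out;
    }

    public static void main(String[] args) {
        double[][] data = {{10, 100}, {20, 200}, {30, 300}};
        // Leave the first column alone, normalize only the second.
        double[][] result = normalizeColumns(data, new boolean[] {false, true});
        System.out.println(Arrays.toString(result[1]));
        // Convert the normalized second field of that row back to its original scale.
        System.out.println(convertBack(result[1][1], 100, 300, 0, 1));
    }
}
```

Converting back needs the same minimum and maximum that were used for the forward mapping, which is presumably why the snippet above fetches the original output field from norm before calling convertBack.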


Copyright 2005 - 2012 by Heaton Research, Inc. Heaton Research™ and Encog™ are trademarks of Heaton Research.