Command Line Time-Series

From Encog Machine Learning Framework
Sunspots Time Series

This example shows how to use the Encog Command Line Utility, together with the Encog Analyst, to perform a Time Series prediction. Time series prediction is the process where a Machine Learning Method learns to predict numbers in a series. Using Supervised Learning, the Machine Learning Method is provided with a Training Set and learns to predict future numbers based on previous ones.

This example will use the Encog Analyst to learn to predict sunspot cycles. It makes use of the Sunspot Data Set, which provides the average number of sunspots per month over many years.

Steps for Running the Example

To walk through this example, follow these steps. This example has been updated to Encog 3.0.

Step 1: Download the Sunspot Data

Sunspot data can be downloaded from the following URL.

http://solarscience.msfc.nasa.gov/greenwch/spot_num.txt

The NASA site is sometimes unreliable. You can also obtain the data from this wiki as File:Sunspots.csv.

You can see a small sample of this data here.

YEAR MON  SSN   DEV
1749  1  58.0  24.1
1749  2  62.6  25.1
1749  3  70.0  26.6
1749  4  55.7  23.6
1749  5  85.0  29.4
1749  6  83.5  29.2
1749  7  94.8  31.1
1749  8  66.3  25.9
1749  9  75.9  27.7
1749 10  75.5  27.7

As you can see there are several attributes provided. The attributes, listed in order, are:

  • Column 1. Year
  • Column 2. Month
  • Column 3. Average number of sunspots this month
  • Column 4. Standard deviation of sunspots this month

We would like to create a Machine Learning Method that learns to predict the sunspot cycles when provided with the sunspot numbers of some previous months. We will attempt to predict future values of column 3 using its own previous values. Columns 1, 2 and 4 are not useful for prediction. We will divide the data into a training data set and an evaluation data set. The larger training data set will be used for the Machine Learning Method to learn from. The evaluation data set will be used to test the Machine Learning Method on data that it was not trained with. It is also possible to use cross validation with a single data set.
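
The division into training and evaluation data can be sketched in Python. This is only an illustration and not how Encog Analyst implements it: the helper names parse_sunspots and split_in_order are hypothetical, and the 75/25 proportion is an assumption taken from the percentages that appear in the SEGREGATE section of the EGA file later on this page. The rows are kept in their original order because this is time series data.

```python
SAMPLE = """\
YEAR MON  SSN   DEV
1749  1  58.0  24.1
1749  2  62.6  25.1
1749  3  70.0  26.6
1749  4  55.7  23.6
1749  5  85.0  29.4
1749  6  83.5  29.2
1749  7  94.8  31.1
1749  8  66.3  25.9
"""


def parse_sunspots(text):
    """Parse whitespace-delimited rows into dicts keyed by lower-cased header names."""
    lines = [line for line in text.splitlines() if line.strip()]
    header = [name.lower() for name in lines[0].split()]
    rows = []
    for line in lines[1:]:
        year, mon, ssn, dev = line.split()
        values = (int(year), int(mon), float(ssn), float(dev))
        rows.append(dict(zip(header, values)))
    return rows


def split_in_order(rows, train_fraction=0.75):
    """Split rows into (training, evaluation) without shuffling (hypothetical helper)."""
    cut = int(len(rows) * train_fraction)
    return rows[:cut], rows[cut:]


rows = parse_sunspots(SAMPLE)
train, evaluation = split_in_order(rows)

# Only column 3 (ssn) is used for prediction in this example.
ssn_series = [row["ssn"] for row in train]
print(len(train), len(evaluation), ssn_series[:3])
```

How Encog rounds the split when the row count is not evenly divisible is not covered by this sketch.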

Step 2: Use the Analyst Wizard

Now that you have downloaded the input data, you should use the Encog Analyst Wizard to create an Encog Analyst File. The Encog Analyst File (*.ega) is a script file that tells Encog Analyst how to process your data file. To generate an EGA File, execute the command given below.

D:\test>EncogCmd wizard sunspots.csv
Encog 3.0.0(32-bit) Command Line Utility
Copyright 2011 by Heaton Research, Inc. Released under the Apache License

Executing command: wizard
Enter value for [headers] (default=True): t
Enter value for [format] (default=decpnt|comma): decpnt|space
Enter value for [goal] (default=c): r
Enter value for [targetField] (default=): ssn
Enter value for [method] (default=ff):
Enter value for [range] (default=-1t1):
Enter value for [missing] (default=DiscardMissing):
Enter value for [lagWindow] (default=0): 30
Enter value for [leadWindow] (default=0): 1
Enter value for [includeTarget] (default=False): t
Enter value for [normalize] (default=True):
Enter value for [randomize] (default=True): f
Enter value for [segregate] (default=True):
Enter value for [balance] (default=False):
Enter value for [cluster] (default=False):
Analyzing data
Saving analyst file
Done.  Runtime was 00:00:01 (1000ms).

D:\test>

Make sure you include all the settings from above. These settings are explained below.

  • File Format - As you can see from the sample data above, this CSV file is space (or tab) delimited. Choose Decimal Point (USA/English) & Space Separator, entered as decpnt|space.
  • Machine Learning - Set this to Feedforward or Support Vector Machine. This example assumes a Feedforward Neural Network, which is the default (ff).
  • Goal - Set this to Regression.
  • Target Field - Set this to ssn. The file has headers, so the field can be identified by name. This is the sunspot number field.
  • CSV Headers - Set this field to true. As you can see from the file above, there are headers.
  • Normalization Range - Set to -1 to 1.
  • Lag Count - Set to 30. We will use the last 30 months to predict.
  • Lead Count - Set to 1. We will predict the next month.
  • Include Target in Input - Set to true. The ssn is being used to predict, but it is also what is predicted.
  • Normalize - Set to true. The data should be normalized to consistent ranges. This is necessary for both Neural Networks and Support Vector Machines.
  • Randomize - Set to false. This is time series data. Order is very important, so we do not want to randomize.
  • Segregate - Set to true. We do want to segregate into training data and evaluation data.
  • Balance - Set to false. Balancing is most often used with classification. Balancing is not needed in this example, as the data is not unbalanced.
  • Cluster - Set to false. There is nothing to cluster, so clustering is not needed.
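
The effect of the lag and lead settings can be sketched in Python. This is a minimal illustration of the windowing idea, not Encog's own code: make_windows is a hypothetical helper, and the assumption is that each training sample pairs `lag` consecutive values with the value `lead` steps after the last of them. A lag of 3 is used here only to keep the printed output short. The example above uses 30.

```python
def make_windows(series, lag=30, lead=1):
    """Pair each run of `lag` consecutive values with the value `lead` steps after it."""
    samples = []
    last_start = len(series) - lag - lead
    for start in range(last_start + 1):
        inputs = series[start:start + lag]          # the previous `lag` months
        target = series[start + lag + lead - 1]     # the month to be predicted
        samples.append((inputs, target))
    return samples


# First six ssn values from the sample data shown in Step 1.
series = [58.0, 62.6, 70.0, 55.7, 85.0, 83.5]

for inputs, target in make_windows(series, lag=3, lead=1):
    print(inputs, "->", target)
```

A series shorter than lag + lead yields no samples at all, which is why some input data is always consumed before the first prediction can be made.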

Generated Analyst File

Encog Analyst will now generate an EGA File with the same base name as your data file. You should now see two files in the project directory: sunspots.csv and sunspots.ega.

The EGA File that was generated by analyzing the Sunspot Data Set is shown below in full. It is specific to this data set, but yours should be very similar.

[HEADER]
[HEADER:DATASOURCE]
rawFile=FILE_RAW
sourceFile=
sourceFormat=decpnt|space
sourceHeaders=t
[SETUP]
[SETUP:CONFIG]
allowedClasses=integer,string
csvFormat=decpnt|comma
inputHeaders=t
maxClassCount=50
[SETUP:FILENAMES]
FILE_EVAL_NORM=Sunspots_eval_norm.csv
FILE_EVAL=Sunspots_eval.csv
FILE_RAW=Sunspots.csv
FILE_ML=Sunspots_train.eg
FILE_OUTPUT=Sunspots_output.csv
FILE_NORMALIZE=Sunspots_norm.csv
FILE_TRAINSET=Sunspots_train.egb
FILE_TRAIN=Sunspots_train.csv
[DATA]
[DATA:CONFIG]
goal=regression
[DATA:STATS]
"name","isclass","iscomplete","isint","isreal","amax","amin","mean","sdev"
"year",0,1,1,1,2011,1749,1879.58359822,75.6809068041
"mon",1,1,1,1,12,1,6.4968213605,3.4532785777
"ssn",0,1,0,1,253.8,0,51.8915448188,44.308527387
"dev",0,1,0,1,90.2,0,20.2255880483,11.8458515538
[DATA:CLASSES]
"field","code","name"
"mon","1","1",263
"mon","10","10",262
"mon","11","11",262
"mon","12","12",262
"mon","2","2",263
"mon","3","3",262
"mon","4","4",262
"mon","5","5",262
"mon","6","6",262
"mon","7","7",262
"mon","8","8",262
"mon","9","9",262
[NORMALIZE]
[NORMALIZE:CONFIG]
sourceFile=FILE_TRAIN
targetFile=FILE_NORMALIZE
[NORMALIZE:RANGE]
"name","io","timeSlice","action","high","low"
"year","input",0,"range",1,-1
"year","input",-1,"range",1,-1
"year","input",-2,"range",1,-1
"year","input",-3,"range",1,-1
"year","input",-4,"range",1,-1
"year","input",-5,"range",1,-1
"year","input",-6,"range",1,-1
"year","input",-7,"range",1,-1
"year","input",-8,"range",1,-1
"year","input",-9,"range",1,-1
"year","input",-10,"range",1,-1
"year","input",-11,"range",1,-1
"year","input",-12,"range",1,-1
"year","input",-13,"range",1,-1
"year","input",-14,"range",1,-1
"year","input",-15,"range",1,-1
"year","input",-16,"range",1,-1
"year","input",-17,"range",1,-1
"year","input",-18,"range",1,-1
"year","input",-19,"range",1,-1
"year","input",-20,"range",1,-1
"year","input",-21,"range",1,-1
"year","input",-22,"range",1,-1
"year","input",-23,"range",1,-1
"year","input",-24,"range",1,-1
"year","input",-25,"range",1,-1
"year","input",-26,"range",1,-1
"year","input",-27,"range",1,-1
"year","input",-28,"range",1,-1
"year","input",-29,"range",1,-1
"mon","input",0,"equilateral",1,-1
"mon","input",-1,"equilateral",1,-1
"mon","input",-2,"equilateral",1,-1
"mon","input",-3,"equilateral",1,-1
"mon","input",-4,"equilateral",1,-1
"mon","input",-5,"equilateral",1,-1
"mon","input",-6,"equilateral",1,-1
"mon","input",-7,"equilateral",1,-1
"mon","input",-8,"equilateral",1,-1
"mon","input",-9,"equilateral",1,-1
"mon","input",-10,"equilateral",1,-1
"mon","input",-11,"equilateral",1,-1
"mon","input",-12,"equilateral",1,-1
"mon","input",-13,"equilateral",1,-1
"mon","input",-14,"equilateral",1,-1
"mon","input",-15,"equilateral",1,-1
"mon","input",-16,"equilateral",1,-1
"mon","input",-17,"equilateral",1,-1
"mon","input",-18,"equilateral",1,-1
"mon","input",-19,"equilateral",1,-1
"mon","input",-20,"equilateral",1,-1
"mon","input",-21,"equilateral",1,-1
"mon","input",-22,"equilateral",1,-1
"mon","input",-23,"equilateral",1,-1
"mon","input",-24,"equilateral",1,-1
"mon","input",-25,"equilateral",1,-1
"mon","input",-26,"equilateral",1,-1
"mon","input",-27,"equilateral",1,-1
"mon","input",-28,"equilateral",1,-1
"mon","input",-29,"equilateral",1,-1
"ssn","input",0,"range",1,-1
"ssn","input",-1,"range",1,-1
"ssn","input",-2,"range",1,-1
"ssn","input",-3,"range",1,-1
"ssn","input",-4,"range",1,-1
"ssn","input",-5,"range",1,-1
"ssn","input",-6,"range",1,-1
"ssn","input",-7,"range",1,-1
"ssn","input",-8,"range",1,-1
"ssn","input",-9,"range",1,-1
"ssn","input",-10,"range",1,-1
"ssn","input",-11,"range",1,-1
"ssn","input",-12,"range",1,-1
"ssn","input",-13,"range",1,-1
"ssn","input",-14,"range",1,-1
"ssn","input",-15,"range",1,-1
"ssn","input",-16,"range",1,-1
"ssn","input",-17,"range",1,-1
"ssn","input",-18,"range",1,-1
"ssn","input",-19,"range",1,-1
"ssn","input",-20,"range",1,-1
"ssn","input",-21,"range",1,-1
"ssn","input",-22,"range",1,-1
"ssn","input",-23,"range",1,-1
"ssn","input",-24,"range",1,-1
"ssn","input",-25,"range",1,-1
"ssn","input",-26,"range",1,-1
"ssn","input",-27,"range",1,-1
"ssn","input",-28,"range",1,-1
"ssn","input",-29,"range",1,-1
"dev","input",0,"range",1,-1
"dev","input",-1,"range",1,-1
"dev","input",-2,"range",1,-1
"dev","input",-3,"range",1,-1
"dev","input",-4,"range",1,-1
"dev","input",-5,"range",1,-1
"dev","input",-6,"range",1,-1
"dev","input",-7,"range",1,-1
"dev","input",-8,"range",1,-1
"dev","input",-9,"range",1,-1
"dev","input",-10,"range",1,-1
"dev","input",-11,"range",1,-1
"dev","input",-12,"range",1,-1
"dev","input",-13,"range",1,-1
"dev","input",-14,"range",1,-1
"dev","input",-15,"range",1,-1
"dev","input",-16,"range",1,-1
"dev","input",-17,"range",1,-1
"dev","input",-18,"range",1,-1
"dev","input",-19,"range",1,-1
"dev","input",-20,"range",1,-1
"dev","input",-21,"range",1,-1
"dev","input",-22,"range",1,-1
"dev","input",-23,"range",1,-1
"dev","input",-24,"range",1,-1
"dev","input",-25,"range",1,-1
"dev","input",-26,"range",1,-1
"dev","input",-27,"range",1,-1
"dev","input",-28,"range",1,-1
"dev","input",-29,"range",1,-1
"ssn","output",1,"range",1,-1
[RANDOMIZE]
[RANDOMIZE:CONFIG]
sourceFile=
targetFile=
[CLUSTER]
[CLUSTER:CONFIG]
clusters=
sourceFile=
targetFile=
type=
[BALANCE]
[BALANCE:CONFIG]
balanceField=
countPer=
sourceFile=
targetFile=
[SEGREGATE]
[SEGREGATE:CONFIG]
sourceFile=FILE_RAW
[SEGREGATE:FILES]
"file","percent"
"FILE_TRAIN",75
"FILE_EVAL",25
[GENERATE]
[GENERATE:CONFIG]
sourceFile=FILE_NORMALIZE
targetFile=FILE_TRAINSET
[ML]
[ML:CONFIG]
architecture=?:B->TANH->19:B->TANH->?
evalFile=FILE_EVAL
machineLearningFile=FILE_ML
outputFile=FILE_OUTPUT
trainingFile=FILE_TRAINSET
type=feedforward
[ML:TRAIN]
arguments=
cross=
targetError=0.01
type=rprop
[TASKS]
[TASKS:task-cluster]
cluster
[TASKS:task-create]
create
[TASKS:task-evaluate]
evaluate
[TASKS:task-evaluate-raw]
set ML.CONFIG.evalFile="FILE_EVAL_NORM"
set NORMALIZE.CONFIG.sourceFile="FILE_EVAL"
set NORMALIZE.CONFIG.targetFile="FILE_EVAL_NORM"
normalize
evaluate-raw
[TASKS:task-full]
segregate
normalize
generate
create
train
evaluate
[TASKS:task-generate]
segregate
normalize
generate
[TASKS:task-train]
train

Now that the wizard has generated the EGA file, we will make several changes to it in the next section. For more information on the format of this file, see the article on EGA Files and Encog Analyst.

Step 3: Modifications to the EGA File

Sometimes your EGA file will need no modifications. This is not the case with the Sunspot example.

The month (mon) was treated as a class. We can remove its class definition. We will not be using the month field. Modify the data section to look like this.

[DATA]
[DATA:CONFIG]
goal=regression
[DATA:STATS]
"name","isclass","iscomplete","isint","isreal","amax","amin","mean","sdev"
"year",0,1,1,1,2011,1749,1879.58359822,75.6809068041
"mon",0,1,1,1,12,1,6.4968213605,3.4532785777
"ssn",0,1,0,1,253.8,0,51.8915448188,44.308527387
"dev",0,1,0,1,90.2,0,20.2255880483,11.8458515538
[DATA:CLASSES]
"field","code","name"
[NORMALIZE]

Make sure you delete the class rows under [DATA:CLASSES] and change the isclass value for mon (the 1 following "mon") to 0.

Remove year, mon and dev from the normalized fields. This section should now look like this.

[NORMALIZE:RANGE]
"name","io","timeSlice","action","high","low"
"ssn","input",0,"range",1,-1
"ssn","input",-1,"range",1,-1
"ssn","input",-2,"range",1,-1
"ssn","input",-3,"range",1,-1
"ssn","input",-4,"range",1,-1
"ssn","input",-5,"range",1,-1
"ssn","input",-6,"range",1,-1
"ssn","input",-7,"range",1,-1
"ssn","input",-8,"range",1,-1
"ssn","input",-9,"range",1,-1
"ssn","input",-10,"range",1,-1
"ssn","input",-11,"range",1,-1
"ssn","input",-12,"range",1,-1
"ssn","input",-13,"range",1,-1
"ssn","input",-14,"range",1,-1
"ssn","input",-15,"range",1,-1
"ssn","input",-16,"range",1,-1
"ssn","input",-17,"range",1,-1
"ssn","input",-18,"range",1,-1
"ssn","input",-19,"range",1,-1
"ssn","input",-20,"range",1,-1
"ssn","input",-21,"range",1,-1
"ssn","input",-22,"range",1,-1
"ssn","input",-23,"range",1,-1
"ssn","input",-24,"range",1,-1
"ssn","input",-25,"range",1,-1
"ssn","input",-26,"range",1,-1
"ssn","input",-27,"range",1,-1
"ssn","input",-28,"range",1,-1
"ssn","input",-29,"range",1,-1
"ssn","output",1,"range",1,-1
[RANDOMIZE]
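
As a rough illustration of what a "range" action with high=1 and low=-1 means, the Python sketch below applies the common min-max mapping to ssn values. It is an assumption that Encog's range normalization corresponds to this formula. The function names are hypothetical, and the minimum and maximum (0 and 253.8) are the amin and amax values listed for ssn under [DATA:STATS].

```python
def normalize_range(value, data_min, data_max, low=-1.0, high=1.0):
    """Map value from [data_min, data_max] onto [low, high] (min-max scaling)."""
    return (value - data_min) / (data_max - data_min) * (high - low) + low


def denormalize_range(norm, data_min, data_max, low=-1.0, high=1.0):
    """Inverse of normalize_range: map a normalized value back to the raw scale."""
    return (norm - low) / (high - low) * (data_max - data_min) + data_min


# amin and amax for the ssn field, taken from [DATA:STATS].
SSN_MIN, SSN_MAX = 0.0, 253.8

for ssn in (0.0, 58.0, 253.8):
    n = normalize_range(ssn, SSN_MIN, SSN_MAX)
    back = denormalize_range(n, SSN_MIN, SSN_MAX)
    print(ssn, round(n, 4), round(back, 1))
```

The inverse mapping matters because the network's output is also in the normalized range and has to be converted back before it can be compared with raw sunspot numbers.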

The final version of the EGA File is shown here. Note that it also sets targetError to 0.05, where the wizard-generated file above used 0.01.

[HEADER]
[HEADER:DATASOURCE]
rawFile=FILE_RAW
sourceFile=
sourceFormat=decpnt|space
sourceHeaders=t
[SETUP]
[SETUP:CONFIG]
allowedClasses=integer,string
csvFormat=decpnt|comma
inputHeaders=t
maxClassCount=50
[SETUP:FILENAMES]
FILE_EVAL_NORM=sunspots_eval_norm.csv
FILE_EVAL=sunspots_eval.csv
FILE_RAW=sunspots.csv
FILE_ML=sunspots_train.eg
FILE_OUTPUT=sunspots_output.csv
FILE_NORMALIZE=sunspots_norm.csv
FILE_TRAINSET=sunspots_train.egb
FILE_TRAIN=sunspots_train.csv
[DATA]
[DATA:CONFIG]
goal=regression
[DATA:STATS]
"name","isclass","iscomplete","isint","isreal","amax","amin","mean","sdev"
"year",0,1,1,1,2011,1749,1879.7504761905,75.7774977578
"mon",0,1,1,1,12,1,6.4942857143,3.4520478
"ssn",0,1,0,1,253.8,0,51.8854920635,44.2817275261
"dev",0,1,0,1,90.2,0,20.2241587302,11.8395416514
[DATA:CLASSES]
"field","code","name"
[NORMALIZE]
[NORMALIZE:CONFIG]
missingValues=DiscardMissing
sourceFile=FILE_TRAIN
targetFile=FILE_NORMALIZE
[NORMALIZE:RANGE]
"name","io","timeSlice","action","high","low"
"ssn","input",0,"range",1,-1
"ssn","input",-1,"range",1,-1
"ssn","input",-2,"range",1,-1
"ssn","input",-3,"range",1,-1
"ssn","input",-4,"range",1,-1
"ssn","input",-5,"range",1,-1
"ssn","input",-6,"range",1,-1
"ssn","input",-7,"range",1,-1
"ssn","input",-8,"range",1,-1
"ssn","input",-9,"range",1,-1
"ssn","input",-10,"range",1,-1
"ssn","input",-11,"range",1,-1
"ssn","input",-12,"range",1,-1
"ssn","input",-13,"range",1,-1
"ssn","input",-14,"range",1,-1
"ssn","input",-15,"range",1,-1
"ssn","input",-16,"range",1,-1
"ssn","input",-17,"range",1,-1
"ssn","input",-18,"range",1,-1
"ssn","input",-19,"range",1,-1
"ssn","input",-20,"range",1,-1
"ssn","input",-21,"range",1,-1
"ssn","input",-22,"range",1,-1
"ssn","input",-23,"range",1,-1
"ssn","input",-24,"range",1,-1
"ssn","input",-25,"range",1,-1
"ssn","input",-26,"range",1,-1
"ssn","input",-27,"range",1,-1
"ssn","input",-28,"range",1,-1
"ssn","input",-29,"range",1,-1
"ssn","output",1,"range",1,-1
[RANDOMIZE]
[RANDOMIZE:CONFIG]
sourceFile=
targetFile=
[CLUSTER]
[CLUSTER:CONFIG]
clusters=
sourceFile=
targetFile=
type=
[BALANCE]
[BALANCE:CONFIG]
balanceField=
countPer=
sourceFile=
targetFile=
[SEGREGATE]
[SEGREGATE:CONFIG]
sourceFile=FILE_RAW
[SEGREGATE:FILES]
"file","percent"
"FILE_TRAIN",75
"FILE_EVAL",25
[GENERATE]
[GENERATE:CONFIG]
sourceFile=FILE_NORMALIZE
targetFile=FILE_TRAINSET
[ML]
[ML:CONFIG]
architecture=?:B->TANH->19:B->TANH->?
evalFile=FILE_EVAL
machineLearningFile=FILE_ML
outputFile=FILE_OUTPUT
trainingFile=FILE_TRAINSET
type=feedforward
[ML:TRAIN]
arguments=
cross=
targetError=0.05
type=rprop
[TASKS]
[TASKS:task-cluster]
cluster
[TASKS:task-create]
create
[TASKS:task-evaluate]
evaluate
[TASKS:task-evaluate-raw]
set ML.CONFIG.evalFile="FILE_EVAL_NORM"
set NORMALIZE.CONFIG.sourceFile="FILE_EVAL"
set NORMALIZE.CONFIG.targetFile="FILE_EVAL_NORM"
normalize
evaluate-raw
[TASKS:task-full]
segregate
normalize
generate
create
train
evaluate
[TASKS:task-generate]
segregate
normalize
generate
[TASKS:task-train]
train

Step 4: Execute the Analyst Script

Now that the EGA File has been created, you can execute it. The following command executes the analyst script.

D:\test>EncogCmd analyst sunspots.ega
Encog 3.0.0(32-bit) Command Line Utility
Copyright 2011 by Heaton Research, Inc. Released under the Apache License

Executing command: analyst

Beginning Task#1/6 : segregate
1 : Analyzing
3146/3146 : Done analyzing
1/3146 : Processing
3146/3146 : Done processing
Task segregate completed, task elapsed time 00:00:02

Beginning Task#2/6 : normalize
1 : Processing
0 : Done processing
Task normalize completed, task elapsed time 00:00:06

Beginning Task#3/6 : generate
Task generate completed, task elapsed time 00:00:08

Beginning Task#4/6 : create
Task create completed, task elapsed time 00:00:09

Beginning Task#5/6 : train
Iteration #1 Error:39.257923% elapsed time = 00:00:09
Iteration #2 Error:19.048824% elapsed time = 00:00:10
Iteration #3 Error:4.779346% elapsed time = 00:00:10
Task train completed, task elapsed time 00:00:10

Beginning Task#6/6 : evaluate
1 : Analyzing
787/787 : Done analyzing
1/787 : Processing
787/787 : Done processing
Task evaluate completed, task elapsed time 00:00:10
Done.  Runtime was 00:00:10 (10889ms).

D:\test>

This takes the data through 6 tasks: segregate, normalize, generate, create, train and evaluate. There may be more or fewer tasks for other Encog Analyst projects, depending on what options are chosen. The entire execution should take under a minute on most computers.

This process will also create a number of files. The complete list of files in this project is:

  • sunspots.csv - The raw data.
  • sunspots.ega - The EGA File. This is the Encog Analyst script.
  • sunspots_eval.csv - The evaluation data.
  • sunspots_norm.csv - The normalized version of sunspots_train.csv.
  • sunspots_output.csv - The output from running sunspots_eval.csv.
  • sunspots_train.csv - The training data.
  • sunspots_train.eg - The Machine Learning Method that was trained.
  • sunspots_train.egb - The binary training data, created from sunspots_norm.csv.

Step 5: Examine the Output

To see how well the newly trained Machine Learning Method performed, examine sunspots_output.csv. You can see part of this file here.

"year","mon","ssn","dev","Output:ssn(t+1)"
1945,8,25.9,19.8,null
1945,9,34.9,20.6,null
1945,10,68.8,17.7,null
1945,11,46.0,11.1,null
1945,12,27.4,10.2,null
1946,1,47.6,26.2,null
1946,2,86.2,22.0,null
1946,3,76.6,19.1,null
1946,4,75.7,21.4,null
1946,5,84.9,39.4,null
1946,6,73.5,27.1,null
1946,7,116.2,30.5,null
1946,8,107.2,19.4,null
1946,9,94.4,30.9,null
1946,10,102.3,29.7,null
1946,11,123.8,30.8,null
1946,12,121.7,26.1,null
1947,1,115.7,49.0,null
1947,2,133.4,39.4,null
1947,3,129.8,50.8,null
1947,4,149.8,53.5,null
1947,5,201.3,60.9,null
1947,6,163.9,50.9,null
1947,7,157.9,31.9,null
1947,8,188.8,74.3,null
1947,9,169.4,50.8,null
1947,10,163.6,56.3,null
1947,11,128.0,48.7,null
1947,12,116.5,20.1,null
1948,1,108.5,17.2,null
1948,2,86.1,17.9,159.6885674362
1948,3,94.8,32.8,125.3623887972
1948,4,189.7,27.1,132.9868693205
1948,5,174.0,69.3,158.1542987258
1948,6,167.8,26.6,107.9866861177
1948,7,142.2,28.3,141.0083318548
1948,8,157.9,35.3,96.525206343
1948,9,143.3,55.9,114.9390769972
1948,10,136.3,44.9,86.7132162749
1948,11,95.8,21.8,93.4707737339
1948,12,138.0,46.2,94.1546579071
1949,1,119.1,29.6,121.825714721
1949,2,182.3,34.7,127.9518746504
1949,3,157.5,31.6,112.2009100841
1949,4,147.0,22.7,80.3982728611
1949,5,106.2,22.1,79.1085177763
1949,6,121.7,41.3,70.294781526
1949,7,125.8,44.6,88.0089099047
1949,8,123.8,47.9,80.6791463643
1949,9,145.3,37.3,130.0433476074
1949,10,131.6,47.8,124.4394206065
1949,11,143.5,18.9,107.3720144043

The output from the Machine Learning Method is the far-right column. This data is the evaluation data set, which the Machine Learning Method was not trained on. Notice that the first 30 lines have no output. Because we have a lag of 30 months, no prediction can occur until that much input data is available. The expected output is the third column. The program does not predict the exact number of sunspots. However, it does predict the cycles, the rises and falls, very well.
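
A quick way to inspect such a file is sketched below in Python, using a few rows copied from the listing above. The helper names are hypothetical. Which row a prediction should be compared against is an assumption here: the column name Output:ssn(t+1) suggests the following month (shift=1), while shift=0 compares against the ssn value on the same row.

```python
import csv
import io

OUTPUT_SAMPLE = '''"year","mon","ssn","dev","Output:ssn(t+1)"
1947,12,116.5,20.1,null
1948,1,108.5,17.2,null
1948,2,86.1,17.9,159.6885674362
1948,3,94.8,32.8,125.3623887972
1948,4,189.7,27.1,132.9868693205
1948,5,174.0,69.3,158.1542987258
'''


def read_output(text):
    """Return (actual ssn, prediction or None) for every row of the output file."""
    reader = csv.DictReader(io.StringIO(text))
    rows = []
    for record in reader:
        raw = record["Output:ssn(t+1)"]
        prediction = None if raw == "null" else float(raw)
        rows.append((float(record["ssn"]), prediction))
    return rows


def mean_absolute_error(rows, shift=1):
    """Average |prediction - actual|, comparing row i's prediction with row i+shift."""
    errors = []
    for i, (_, prediction) in enumerate(rows):
        j = i + shift
        if prediction is not None and j < len(rows):
            errors.append(abs(prediction - rows[j][0]))
    return sum(errors) / len(errors) if errors else None


rows = read_output(OUTPUT_SAMPLE)
warmup = sum(1 for _, prediction in rows if prediction is None)
print("rows without a prediction:", warmup)
print("mean absolute error (shift=1):", mean_absolute_error(rows, shift=1))
```

On the full file, the count of rows without a prediction should match the number of leading null rows seen in the listing.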

Understanding the Example

This is an example of time series prediction. The output from the Machine Learning Method is the next number in a series, which here is the average number of sunspots for the following month. In this example we used a neural network. However, a Support Vector Machine works well too. From a purely "black box" standpoint, a Support Vector Machine is very similar to a Neural Network. Both accept input data and produce output data. For time series, the input and output of a Support Vector Machine are identical to those of a Neural Network.

External Links

The completed example can be downloaded here.
