Weight

From Encog Machine Learning Framework

This article looks at the structure of the Encog flat network. The flat network is how Encog represents most neural networks at the lowest level. This representation lets Encog process neural networks with maximum speed, and it keeps them in a form that can easily be sent to a GPU.

A flat network stores the weight matrix as one long array. All layers share the same array, so your neural network is simply represented as one array of double numbers. You can easily see this array by calling the method DumpWeights(), provided on the BasicNetwork class. You can programmatically access this array using the NetworkCODEC, which allows a weight array to be copied to and from a BasicNetwork.
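The idea of copying weights to and from a single array can be sketched in a few lines. This is a plain Python illustration, not the Encog API: the function names encode and decode and the example values are assumptions made for this sketch.

```python
from typing import List, Tuple

Matrix = List[List[float]]


def encode(matrices: List[Matrix]) -> List[float]:
    """Concatenate every row of every layer matrix into one flat list."""
    flat: List[float] = []
    for matrix in matrices:
        for row in matrix:
            flat.extend(row)
    return flat


def decode(flat: List[float], shapes: List[Tuple[int, int]]) -> List[Matrix]:
    """Rebuild per-layer matrices of the given (rows, cols) shapes."""
    matrices: List[Matrix] = []
    pos = 0
    for rows, cols in shapes:
        # Each row is a contiguous slice of the flat list.
        matrix = [flat[pos + r * cols: pos + (r + 1) * cols] for r in range(rows)]
        matrices.append(matrix)
        pos += rows * cols
    if pos != len(flat):
        raise ValueError("flat array length does not match the shapes")
    return matrices


# Hypothetical weights: one output neuron and two hidden neurons,
# each fed by three source neurons (two regular neurons plus a bias).
layers = [
    [[0.1, 0.2, 0.3]],
    [[0.4, 0.5, 0.6], [0.7, 0.8, 0.9]],
]
flat = encode(layers)
restored = decode(flat, [(1, 3), (2, 3)])
```

Because the whole network is one list of numbers, it can be saved, restored, or handed to an optimizer without any knowledge of the layer structure.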

Usually, the exact structure of the weight array used by NetworkCODEC is unimportant to an Encog programmer. The array is simply a quick way to store a network in memory so that you can later restore it, or so that a genetic algorithm or simulated annealing can modify the weights. However, if the exact structure is important to you, this article describes how a flat network lays out its weights.

This document is current as of Encog 3.0. It is unlikely that the flat weight structure will change in future versions of Encog. However, it did change format slightly with Encog 2.5. This was a performance improvement to better support bias neurons.

Weight Initialization

Weights are generally initialized to random numbers. These random numbers are typically generated using a Linear Congruential Generator. However, there are several means for doing this.
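As an illustration, a linear congruential generator can fill a weight array with values from a fixed range. This is a minimal sketch, not Encog's randomizer: the class and function names are hypothetical, and the multiplier, increment, and modulus are one commonly published constant set, chosen here only as an assumption.

```python
from typing import List


class LinearCongruentialGenerator:
    """state = (multiplier * state + increment) mod modulus."""

    def __init__(self, seed: int) -> None:
        # Assumed constants; Encog's own generator may use different ones.
        self.modulus = 2 ** 32
        self.multiplier = 1664525
        self.increment = 1013904223
        self.state = seed % self.modulus

    def next_float(self) -> float:
        """Advance the state and return a value in [0, 1)."""
        self.state = (self.multiplier * self.state + self.increment) % self.modulus
        return self.state / self.modulus


def random_weights(count: int, low: float, high: float, seed: int) -> List[float]:
    """Return `count` weights scaled into the range [low, high)."""
    rng = LinearCongruentialGenerator(seed)
    return [low + (high - low) * rng.next_float() for _ in range(count)]


# Nine weights, enough for the small feedforward example in this article.
weights = random_weights(9, -1.0, 1.0, seed=42)
```

The same seed always produces the same weights, which makes training runs repeatable.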

Weight Initialization Performance

Your choice of weight initialization method affects how fast your neural network trains. The following output comes from an Encog benchmark program that compares the different weight initialization methods.

Average iterations needed (lower is better)
Range random: 529.82
Nguyen-Widrow: 366.98
Fan-In: 528.38
Gaussian: 615.6
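To give a sense of what one of these methods does, here is a sketch of the textbook form of Nguyen-Widrow initialization for a single hidden layer. It is an assumption that this matches the general idea of Encog's version; the details of Encog's implementation may differ, and the function name is hypothetical.

```python
import math
import random
from typing import List, Tuple


def nguyen_widrow(inputs: int, hidden: int, seed: int = 0) -> Tuple[List[List[float]], List[float], float]:
    """Textbook Nguyen-Widrow: scale each hidden neuron's weights to norm beta."""
    rng = random.Random(seed)
    beta = 0.7 * hidden ** (1.0 / inputs)
    weights: List[List[float]] = []
    for _ in range(hidden):
        # Start from small random values, then rescale the whole row.
        row = [rng.uniform(-0.5, 0.5) for _ in range(inputs)]
        norm = math.sqrt(sum(w * w for w in row))
        weights.append([beta * w / norm for w in row])
    # Bias weights are drawn uniformly from [-beta, beta].
    biases = [rng.uniform(-beta, beta) for _ in range(hidden)]
    return weights, biases, beta


hidden_weights, hidden_biases, beta = nguyen_widrow(inputs=2, hidden=2)
```

Range randomization, by contrast, draws every weight independently from one fixed interval and performs no rescaling.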

Feedforward Structure

We will first look at a feedforward neural network. The example neural network has the following attributes.

Input Layer: 2 neurons, 1 bias
Hidden Layer: 2 neurons, 1 bias
Output Layer: 1 neuron

This gives the network a total of 7 neurons, counting the bias neurons.

These neurons are numbered as follows. This is the order in which the FlatNetwork.LayerOutput property stores the network output.

Neuron 0: Output 1
Neuron 1: Hidden 1
Neuron 2: Hidden 2
Neuron 3: Bias 2 (set to 1, usually)
Neuron 4: Input 1
Neuron 5: Input 2
Neuron 6: Bias 1 (set to 1, usually)

Graphically, you can see the network as follows.

Method-ff-2.png

The flat network keeps several index arrays that allow it to navigate the flat structure quickly. These are listed here.

contextTargetOffset: [0, 0, 0]
contextTargetSize: [0, 0, 0]
layerFeedCounts: [1, 2, 2]
layerCounts: [1, 3, 3]
layerIndex: [0, 1, 4]
layerOutput: [0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 1.0]
weightIndex: [0, 3, 9]
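The layerIndex and weightIndex values above follow from layerCounts and layerFeedCounts. The sketch below reproduces them; the function name build_indexes is hypothetical, and the rule it uses (each fed neuron receives one weight from every neuron in the layer that feeds it, bias included) is inferred from the numbers listed in this article.

```python
from itertools import accumulate
from typing import List, Tuple


def build_indexes(layer_counts: List[int],
                  layer_feed_counts: List[int]) -> Tuple[List[int], List[int]]:
    """Derive layerIndex and weightIndex (output layer first, input layer last)."""
    # layerIndex: where each layer's neurons start in layerOutput.
    layer_index = [0] + list(accumulate(layer_counts))[:-1]

    # weightIndex: where each layer's incoming weights start.
    # The final entry is the total number of weights.
    weight_index: List[int] = []
    total = 0
    last = len(layer_counts) - 1
    for i in range(len(layer_counts)):
        weight_index.append(total)
        if i < last:
            # Fed neurons in layer i times all neurons in the layer feeding it.
            total += layer_feed_counts[i] * layer_counts[i + 1]
    return layer_index, weight_index


layer_index, weight_index = build_indexes([1, 3, 3], [1, 2, 2])
print(layer_index)   # [0, 1, 4]
print(weight_index)  # [0, 3, 9]
```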

This is the structure of the flat weights.

Weight 0: H1->O1
Weight 1: H2->O1
Weight 2: B2->O1
Weight 3: I1->H1
Weight 4: I2->H1
Weight 5: B1->H1
Weight 6: I1->H2
Weight 7: I2->H2
Weight 8: B1->H2
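The weight ordering above is the order in which a forward pass consumes the array: for each target neuron in turn, one weight per source neuron. The following sketch walks the flat arrays of this example. The function names and the sigmoid activation are assumptions made for illustration, and the weight values are made up.

```python
import math
from typing import List

# Index arrays copied from the feedforward example above.
LAYER_FEED_COUNTS = [1, 2, 2]
LAYER_COUNTS = [1, 3, 3]
LAYER_INDEX = [0, 1, 4]
WEIGHT_INDEX = [0, 3, 9]
LAYER_OUTPUT = [0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 1.0]


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def compute(weights: List[float], inputs: List[float]) -> List[float]:
    """Return the full layerOutput array after one forward pass."""
    output = list(LAYER_OUTPUT)  # bias neurons keep their value of 1.0
    input_start = LAYER_INDEX[-1]
    for i, value in enumerate(inputs):
        output[input_start + i] = value

    # Walk from the input layer (last block) toward the output layer (first).
    for layer in range(len(LAYER_INDEX) - 1, 0, -1):
        in_start = LAYER_INDEX[layer]
        in_end = in_start + LAYER_COUNTS[layer]
        out_start = LAYER_INDEX[layer - 1]
        out_end = out_start + LAYER_FEED_COUNTS[layer - 1]
        w = WEIGHT_INDEX[layer - 1]
        for target in range(out_start, out_end):
            total = 0.0
            for source in range(in_start, in_end):
                total += weights[w] * output[source]
                w += 1
            output[target] = sigmoid(total)
    return output


# Made-up weights in the order Weight 0 .. Weight 8.
example_weights = [1.0, -1.0, 0.5, 0.2, 0.4, -0.1, -0.3, 0.6, 0.1]
result = compute(example_weights, [1.0, 0.0])
network_output = result[0]  # neuron 0 is Output 1
```

Note that the weights are read strictly in sequence within each layer, which is what makes the flat layout fast to process.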


SRN (Elman) Structure

Next we will look at the simple recurrent neural network, in this case an Elman network. This network has the following attributes.

Input Layer: 1 neuron, 2 context, 1 bias
Hidden Layer: 2 neurons, 1 bias
Output Layer: 1 neuron

This gives the network a total of 8 neurons, counting the bias and context neurons. These neurons are numbered as follows. This is the order in which the FlatNetwork.LayerOutput property stores the network output.

Neuron 0: Output 1
Neuron 1: Hidden 1
Neuron 2: Hidden 2
Neuron 3: Bias 2  (set to 1, usually)
Neuron 4: Input 1
Neuron 5: Bias 1  (set to 1, usually)
Neuron 6: Context 1
Neuron 7: Context 2

Graphically, you can see the network as follows.

Method-elman-1.png

The flat network keeps several index arrays that allow it to navigate the flat structure quickly. These are listed here.

contextTargetOffset: [0, 6, 0]
contextTargetSize: [0, 2, 0]
layerFeedCounts: [1, 2, 1]
layerCounts: [1, 3, 4]
layerIndex: [0, 1, 4]
layerOutput: [0.0, 0.0, 0.0, 1.0, 0.0, 1.0, 0.0, 0.0]
layerOutput (sample values, apparently captured after a computation): [0.9822812812031464, 0.5592674561520378, 0.9503503698749067, 1.0, 0.0, 1.0, 0.9822812812031464, 0.2871920274550327]
weightIndex: [0, 3, 11]

This is the structure of the flat weights.

Weight 0: H1->O1
Weight 1: H2->O1
Weight 2: B2->O1
Weight 3: I1->H1
Weight 4: B1->H1
Weight 5: C1->H1
Weight 6: C2->H1
Weight 7: I1->H2
Weight 8: B1->H2
Weight 9: C1->H2
Weight 10: C2->H2
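As a cross-check on the layout, the label of each weight slot can be generated from the neuron numbering and layerFeedCounts. This sketch assumes the same ordering as the feedforward example, where each target neuron receives one weight per neuron in the layer that feeds it. The function name and the short neuron names are hypothetical. For the Elman network, weightIndex [0, 3, 11] implies 11 slots, 8 of which feed the hidden layer.

```python
from typing import List


def weight_labels(layer_names: List[List[str]],
                  layer_feed_counts: List[int]) -> List[str]:
    """Label each flat weight slot as 'source->target', output layer first."""
    labels: List[str] = []
    for layer in range(len(layer_names) - 1):
        # Only the fed neurons of a layer receive weights; bias and
        # context neurons, listed after them, do not.
        targets = layer_names[layer][:layer_feed_counts[layer]]
        sources = layer_names[layer + 1]
        for target in targets:
            for source in sources:
                labels.append(f"{source}->{target}")
    return labels


# Neuron names in the same order as the Elman neuron numbering.
elman_names = [["O1"], ["H1", "H2", "B2"], ["I1", "B1", "C1", "C2"]]
elman_labels = weight_labels(elman_names, [1, 2, 1])
for slot, label in enumerate(elman_labels):
    print(f"Weight {slot}: {label}")
```

Each hidden neuron therefore has four incoming weights: one from the input neuron, one from the bias, and one from each context neuron.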

Training

Training is the process by which a neural network's weights are adjusted to produce the desired output. A trained neural network will typically have an appearance as shown here. The weights will be distributed in a narrow range around zero.

Weights-trained.png
