Equilateral encoding is a means by which data made up of classes can be normalized into an array of floating point values. Equilateral encoding accomplishes the same goal as one-of-n/one-hot/dummy variables. That is, it encodes nominal values (classes) for input into machine learning. It has been reported to sometimes give better results than one-of-n encoding (Masters, 1993).
Equilateral encoding is somewhat rare compared to the much more popular dummy variable/one-hot encoding methods commonly used today. I frequently used equilateral encoding in some of my earlier works. Because I have not seen a substantial improvement with equilateral encoding, I generally stick with the more common one-hot encoding for categorical data. Equilateral encoding is generally reported to bring two main features to the table:
• Requires one fewer output than one-of-n
• Spreads the “blame” of error across more neurons than one-of-n
Equilateral encoding uses one fewer output than one-of-n. This means that if you have ten categories to encode, one-of-n will require ten outputs while equilateral will require only nine. This gives you a slight performance boost that might have been more important in 1991 than 2017.
During training, the output neurons are constantly checked against the ideal output values provided in the training set. The error between the actual output and the ideal output is represented by a delta. This limits the number of neurons that contribute to an incorrect answer for one-hot encoding. We will look at a case where a neural network must deal with a class of size 7. For this example we will normalize between 0 and 1. If the ideal were class one and the actual class two, we would have the following:

Ideal (class #1): [1, 0, 0, 0, 0, 0, 0]
Actual (class #2): [0, 1, 0, 0, 0, 0, 0]
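The per-neuron deltas for a one-hot comparison like the one above are easy to sketch in JavaScript (the function names here are my own, for illustration):

```javascript
// One-hot encode a zero-based class index into an array of n values.
function oneHot(classIndex, n) {
  const v = new Array(n).fill(0);
  v[classIndex] = 1;
  return v;
}

// Per-neuron deltas between the ideal and actual outputs.
function deltas(ideal, actual) {
  return ideal.map((x, i) => x - actual[i]);
}

const ideal = oneHot(0, 7);  // class #1
const actual = oneHot(1, 7); // class #2
console.log(deltas(ideal, actual));
// -> [1, -1, 0, 0, 0, 0, 0]: only two of the seven neurons carry any error
```
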
Only two of the output neurons are incorrect. Yet the entire group of neurons is part of the answer. Equilateral encoding seeks to spread the “guilt” for this error over more of the neurons. To do this, we must come up with a unique set of values for each class. Each set of values should have an equal Euclidean distance from the others. The equal distance makes sure that incorrectly choosing class #1 for class #7 has the same error weight as choosing class #2 for class #4. Equilateral encoding produces a lookup table of (n−1) values for each of the n classes. For example, the encoding table for this 7-class example is as follows:
 Class #1: [0.12, 0.28, 0.34, 0.38, 0.40, 0.42]
 Class #2: [0.88, 0.28, 0.34, 0.38, 0.40, 0.42]
 Class #3: [0.50, 0.94, 0.34, 0.38, 0.40, 0.42]
 Class #4: [0.50, 0.50, 0.97, 0.38, 0.40, 0.42]
 Class #5: [0.50, 0.50, 0.50, 0.98, 0.40, 0.42]
 Class #6: [0.50, 0.50, 0.50, 0.50, 0.99, 0.42]
 Class #7: [0.50, 0.50, 0.50, 0.50, 0.50, 1.00]
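A table like the one above can be generated programmatically. The sketch below builds the rows as the vertices of a regular simplex, the standard construction for equilateral encoding; the function names are my own, and the exact printed values depend on rounding:

```javascript
// Build the n x (n-1) equilateral matrix. The rows form a regular
// simplex in (n-1) dimensions: every pair of rows is the same
// Euclidean distance apart.
function equilateral(n) {
  const m = Array.from({ length: n }, () => new Array(n - 1).fill(0));
  m[0][0] = -1;
  m[1][0] = 1;
  for (let k = 2; k < n; k++) {
    // Shrink the existing k-vertex simplex...
    const f = Math.sqrt(k * k - 1.0) / k;
    for (let i = 0; i < k; i++) {
      for (let j = 0; j < k - 1; j++) m[i][j] *= f;
    }
    // ...push it away from the new vertex along the new axis...
    for (let i = 0; i < k; i++) m[i][k - 1] = -1.0 / k;
    // ...and place the new vertex at 1 on that axis.
    m[k][k - 1] = 1;
  }
  return m;
}

// Rescale from [-1, 1] into [0, 1] to match the table's range.
function normalize01(m) {
  return m.map(row => row.map(x => (x + 1) / 2));
}

const table = normalize01(equilateral(7)); // 7 classes -> 6 outputs each
// table[0] ≈ [0.12, 0.28, 0.34, 0.38, 0.40, 0.42] after rounding
```
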
Applying this to the example above, we now have the following:

Ideal (class #1): [0.12, 0.28, 0.34, 0.38, 0.40, 0.42]
Actual (class #2): [0.88, 0.28, 0.34, 0.38, 0.40, 0.42]

Unlike the one-hot example above, every output neuron now has a fractional target value, so during training the error delta is distributed across the outputs rather than concentrated in a single “hot” neuron per class.
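Decoding works by finding the table row nearest the network's actual output. A sketch, using the 7-class table from above (the function name is my own):

```javascript
// The 7-class equilateral lookup table from the article.
const table = [
  [0.12, 0.28, 0.34, 0.38, 0.40, 0.42],
  [0.88, 0.28, 0.34, 0.38, 0.40, 0.42],
  [0.50, 0.94, 0.34, 0.38, 0.40, 0.42],
  [0.50, 0.50, 0.97, 0.38, 0.40, 0.42],
  [0.50, 0.50, 0.50, 0.98, 0.40, 0.42],
  [0.50, 0.50, 0.50, 0.50, 0.99, 0.42],
  [0.50, 0.50, 0.50, 0.50, 0.50, 1.00],
];

// Decode an output vector to a zero-based class index by choosing the
// table row with the smallest Euclidean distance to the output.
function decodeEquilateral(table, output) {
  let best = 0;
  let bestDist = Infinity;
  for (let i = 0; i < table.length; i++) {
    const d = Math.hypot(...table[i].map((x, j) => x - output[j]));
    if (d < bestDist) { bestDist = d; best = i; }
  }
  return best;
}
```

Because every pair of rows is equidistant, a noisy output is simply snapped to whichever class it is closest to, with no class pair penalized more than any other.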
I originally learned of equilateral encoding from the following two sources. Guiver is the creator of the algorithm. I have never been able to find Guiver’s original article; I learned of this method from Masters. To see how to calculate the values for equilateral encoding, you can see my JavaScript example.
 Masters, T. (1993). Practical neural network recipes in C++. Morgan Kaufmann.
 Guiver, J. P., & Klimasauskas, C. C. (1991). Applying neural networks, Part IV: Improving performance. PC AI Magazine, 5, 34–41.