Encog performance related to Matrix operations

Tomer.Gal's picture

Hi Jeff,
I've bought your C# Neural Networks book, and I think you've written a very good and useful book.

I have a M.Sc in Computer Science and have worked at Intel for 4 years.

Due to my familiarity with optimizations (software & hardware), I have a note about performance related to Encog:

Let's say that M is a matrix and Mt is the transposed matrix.
Then, Vector * M should be slower than Mt * Vector.
Vector * M = The vector multiplied by the COLUMNS of the matrix
Mt * Vector = The ROWS of a matrix multiplied by the vector.
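To make the two forms concrete, here is a minimal Java sketch. The class and method names (MatVecOrder, vectorTimesMatrix, matrixTimesVector, transpose) are made up for illustration and are not Encog's API. Both methods compute the same numbers; only the order in which the matrix is read differs.

```java
import java.util.Arrays;

final class MatVecOrder {
    // Vector * M: output j is the dot product of v with COLUMN j of m.
    static double[] vectorTimesMatrix(double[] v, double[][] m) {
        int rows = m.length, cols = m[0].length;
        double[] out = new double[cols];
        for (int j = 0; j < cols; j++) {
            double sum = 0.0;
            for (int i = 0; i < rows; i++) {
                sum += v[i] * m[i][j]; // walks down a column: a different row array each step
            }
            out[j] = sum;
        }
        return out;
    }

    // Mt * Vector: output j is the dot product of ROW j of mt with v.
    static double[] matrixTimesVector(double[][] mt, double[] v) {
        double[] out = new double[mt.length];
        for (int j = 0; j < mt.length; j++) {
            double[] row = mt[j]; // one contiguous row, read left to right
            double sum = 0.0;
            for (int i = 0; i < row.length; i++) {
                sum += row[i] * v[i];
            }
            out[j] = sum;
        }
        return out;
    }

    // Builds Mt from M (hypothetical helper, only used for the comparison).
    static double[][] transpose(double[][] m) {
        double[][] t = new double[m[0].length][m.length];
        for (int i = 0; i < m.length; i++) {
            for (int j = 0; j < m[0].length; j++) {
                t[j][i] = m[i][j];
            }
        }
        return t;
    }

    public static void main(String[] args) {
        double[][] m = {{1, 2, 3}, {4, 5, 6}};
        double[] v = {10, 20};
        System.out.println(Arrays.toString(vectorTimesMatrix(v, m)));            // [90.0, 120.0, 150.0]
        System.out.println(Arrays.toString(matrixTimesVector(transpose(m), v))); // [90.0, 120.0, 150.0]
    }
}
```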

The reason is related to hardware, the cache.
Let's say you have a 128x128 matrix. I'll give you an example of accessing a single row vs. a single column:
When accessing a row, you'll have a cache miss for the first array cell, but accessing the other array cells will result in cache hits, as the cache line was brought in from memory after the first miss.
So, 1 miss, 127 hits.

When accessing a column, almost all of your accesses will result in a miss, as each element sits on a different row and therefore on a different cache line.
So, 128 MISSES.
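The counts above can be reproduced with a small model. This is only a sketch under stated assumptions: the matrix is stored as one flat row-major array that starts on a cache-line boundary, the cache is cold, and there is no hardware prefetching. The names CacheLineCount and linesTouched are hypothetical. With a line that holds a whole 128-element row the model gives 1 line per row, matching the figures above. With 64-byte lines and 8-byte doubles (8 elements per line, an assumption about typical hardware) a row spans 16 lines, while a column still touches 128.

```java
import java.util.HashSet;
import java.util.Set;

final class CacheLineCount {
    // Counts the distinct cache lines touched when reading `count` elements
    // of a flat array, starting at index `start` and stepping by `stride`.
    // elementsPerLine = cache line size / element size (assumed, not measured).
    static int linesTouched(int start, int stride, int count, int elementsPerLine) {
        Set<Integer> lines = new HashSet<>();
        for (int k = 0; k < count; k++) {
            lines.add((start + k * stride) / elementsPerLine);
        }
        return lines.size();
    }

    public static void main(String[] args) {
        int n = 128; // 128x128 matrix, row-major: element (r, c) is at index r * n + c

        // Line large enough to hold a whole row (the simplification used in the post).
        System.out.println("row,    128 per line: " + linesTouched(0, 1, n, 128)); // 1
        System.out.println("column, 128 per line: " + linesTouched(0, n, n, 128)); // 128

        // 64-byte line, 8-byte doubles: 8 elements per line (assumed hardware).
        System.out.println("row,      8 per line: " + linesTouched(0, 1, n, 8));   // 16
        System.out.println("column,   8 per line: " + linesTouched(0, n, n, 8));   // 128
    }
}
```

Each distinct line corresponds to one miss in this model, so the row walk is cheaper in both settings.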

That's a huge performance difference, (1 miss, 127 hits) vs. (0 hits, 128 misses).
I have noticed that in Encog you've assigned neuron 1 to column 1 of the matrix, and so on.
So, your multiplication is that of a Vector * M, which should be the less preferred solution (performance wise).

Check it out using a simple benchmark. If the penalty is big and a swap is needed, then it's better to represent the neurons inside the weight matrix as rows and not columns.
Basically, a transpose of the current representation.
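A simple benchmark along these lines might look like the sketch below. It is a rough timing harness, not a rigorous one: the class name LayoutBenchmark, the matrix size, the repetition counts and the warm-up loop are all arbitrary choices for illustration, and none of this is Encog code. Results will vary with the JVM, the JIT and the hardware, so treat the numbers as indicative only.

```java
import java.util.Random;

final class LayoutBenchmark {
    // Neurons stored as COLUMNS (Vector * M): the inner loop walks down a column.
    static double columnLayout(double[] v, double[][] m) {
        double total = 0.0;
        for (int j = 0; j < m[0].length; j++) {
            double sum = 0.0;
            for (int i = 0; i < m.length; i++) {
                sum += v[i] * m[i][j];
            }
            total += sum;
        }
        return total;
    }

    // Neurons stored as ROWS (Mt * Vector): the inner loop walks along a row.
    static double rowLayout(double[][] mt, double[] v) {
        double total = 0.0;
        for (double[] row : mt) {
            double sum = 0.0;
            for (int i = 0; i < row.length; i++) {
                sum += row[i] * v[i];
            }
            total += sum;
        }
        return total;
    }

    public static void main(String[] args) {
        int n = 1024, reps = 200; // arbitrary sizes for the experiment
        Random rnd = new Random(42);
        double[][] m = new double[n][n];
        double[][] mt = new double[n][n];
        double[] v = new double[n];
        for (int i = 0; i < n; i++) {
            v[i] = rnd.nextDouble();
            for (int j = 0; j < n; j++) {
                m[i][j] = rnd.nextDouble();
                mt[j][i] = m[i][j]; // transposed copy
            }
        }

        double sink = 0.0; // accumulated so the JIT cannot discard the work
        for (int r = 0; r < 20; r++) { // warm-up so both methods get compiled
            sink += columnLayout(v, m) + rowLayout(mt, v);
        }

        long t0 = System.nanoTime();
        for (int r = 0; r < reps; r++) sink += columnLayout(v, m);
        long t1 = System.nanoTime();
        for (int r = 0; r < reps; r++) sink += rowLayout(mt, v);
        long t2 = System.nanoTime();

        System.out.printf("column layout: %d ms%n", (t1 - t0) / 1_000_000);
        System.out.printf("row layout:    %d ms%n", (t2 - t1) / 1_000_000);
        System.out.println("checksum: " + sink);
    }
}
```

For numbers you intend to act on, a proper harness such as JMH is a safer choice than hand-rolled timing loops.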

The 2nd note... Have you thought about GPU acceleration?

Regards,
Tomer Gal

jeffheaton's picture

I have noticed something along those lines, and was wondering if it might be more efficient to swap the rows and columns for the Encog weight matrices. I will look more into this. It would not be a hard change.

As to GPU, that is interesting. I've read some about it. It is something I would like to see Encog get more into, but it is not something I know a great deal about, specifically how to actually request that the GPU perform matrix operations using Java or C#.

Jeff

Tomer.Gal's picture

Hi Jeff,
This link should interest you:
http://www.codeproject.com/KB/graphics/GPUNN.aspx

"The GPU version speeds up by 270 times compared to CPU version...."

jeffheaton's picture

That looks very interesting, I will be reading more about this. I am very interested in adding GPU support for both Encog Java and C#.

theron92's picture

Sorry, I couldn't understand it. I am a business student, so I have no engineering-level knowledge.

jeffheaton's picture

Do you not understand? You need to be more specific. Basically, we are offloading some of the work that would normally be done by the CPU to the graphics card. The graphics card actually has many CPUs on it.

Thanks...

Jeff


Copyright 2005 - 2012 by Heaton Research, Inc. Heaton Research™ and Encog™ are trademarks of Heaton Research.