OpenCL/GPU Processing

jeffheaton's picture

I am making some progress on GPU programming. I just finished my first neural network that executes entirely on a GPU. This is just the beginning, and it is somewhat "rough", but it does execute on the GPU! Once I have it polished a bit more, I will check it into Encog.

The next big step will be to get the gradient calculation working on the GPU. I am really excited to see what this will do. Most graphics cards have at least a hundred or so stream processors. I should be able to hand every training set element off to the stream processors and let them tackle the elements simultaneously. This could really speed up Encog training, though I have no idea by how much at this point.
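To make the "one training set element per stream processor" idea concrete, here is a minimal sketch in plain C. It is not Encog's code: it assumes a single linear neuron with squared error, and the names `element_gradient` and `batch_gradient` are hypothetical. In a real OpenCL kernel the element index would come from `get_global_id(0)`; here a host loop stands in for the stream processors so the sketch runs anywhere.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-element "kernel". Each call handles one training
   element and writes its gradient contribution into its own slice of a
   flat output array, so no two work-items write to the same location.
   In OpenCL, gid would be get_global_id(0). */
void element_gradient(size_t gid, size_t n_inputs,
                      const float *inputs,   /* n_elements * n_inputs, flattened */
                      const float *ideal,    /* n_elements */
                      const float *weights,  /* n_inputs */
                      float *grad_out)       /* n_elements * n_inputs, flattened */
{
    const float *x = inputs + gid * n_inputs;
    float actual = 0.0f;
    for (size_t i = 0; i < n_inputs; i++)
        actual += weights[i] * x[i];

    /* Gradient of 0.5 * (actual - ideal)^2 with respect to weight i. */
    float err = actual - ideal[gid];
    for (size_t i = 0; i < n_inputs; i++)
        grad_out[gid * n_inputs + i] = err * x[i];
}

/* Host-side stand-in: the first loop plays the role of the stream
   processors, the second sums the per-element results into one gradient. */
void batch_gradient(size_t n_elements, size_t n_inputs,
                    const float *inputs, const float *ideal,
                    const float *weights, float *scratch, float *grad)
{
    assert(n_inputs > 0);
    for (size_t gid = 0; gid < n_elements; gid++)
        element_gradient(gid, n_inputs, inputs, ideal, weights, scratch);

    for (size_t i = 0; i < n_inputs; i++) {
        grad[i] = 0.0f;
        for (size_t gid = 0; gid < n_elements; gid++)
            grad[i] += scratch[gid * n_inputs + i];
    }
}
```

Because every element writes to its own slice of `scratch`, the per-element calls are independent of each other, which is what would let them run simultaneously. The final summation is done serially here for simplicity.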

OpenCL is really pretty interesting. Very much "C" based. C was my third programming language, after BASIC and 6510 assembly, so I am finding OpenCL to be reasonably easy to program.

soupbone's picture

Are you using any sort of "threading" or concurrency on the GPU? It has multiple processors, right?

jeffheaton's picture

Using full threading on the GPU. So for my lower-end (but modern) card we have probably 100 stream processors working on it.

Graphics cards are actually fairly interesting. You have to thread, because they have a very large number of low-powered processors. I am not totally sure how fast a single stream processor is, but they do not feel all that powerful. But you are often dealing with over 100 of them! So threading becomes very important.

OpenCL is interesting too. Threading is very much built into the language, way more so than in Java or C#. However, OpenCL is primitive. Non-OOP. Very much like old-school C programming. All 1D arrays.
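As an illustration of the "all 1D arrays" style, here is a small sketch in plain C of a weight matrix stored as one flat array and multiplied by a vector. The function names and the row-major layout are assumptions made for this example, not something taken from Encog. In OpenCL, each row could be handled by its own work-item, with the row number coming from `get_global_id(0)`.

```c
#include <assert.h>
#include <stddef.h>

/* Row-major index into a matrix that is stored as a flat 1D array. */
size_t flat_index(size_t row, size_t col, size_t n_cols)
{
    return row * n_cols + col;
}

/* One work-item's share of y = W * x: computes a single output row. */
void matvec_row(size_t row, size_t n_cols,
                const float *w, const float *x, float *y)
{
    float sum = 0.0f;
    for (size_t col = 0; col < n_cols; col++)
        sum += w[flat_index(row, col, n_cols)] * x[col];
    y[row] = sum;
}

/* Host-side stand-in for launching one work-item per row. */
void matvec(size_t n_rows, size_t n_cols,
            const float *w, const float *x, float *y)
{
    assert(n_cols > 0);
    for (size_t row = 0; row < n_rows; row++)
        matvec_row(row, n_cols, w, x, y);
}
```

Each row only reads shared inputs and writes its own `y[row]`, so the rows do not depend on one another.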

juanchoc's picture

Some high-end graphics cards have around 1600 stream processors (price around US$500), so you could expect up to 16x better performance than with a 100-processor card if you get one of those.

jeffheaton's picture

Thanks for the info.

I will probably upgrade in a month or two, once I have the basics of this working. I am not much of a gamer, so I never paid too much attention to the latest graphics cards. But these stream processors can just tear through my matrix math! So I will very likely upgrade soon.

I would also like to have two just so I can test with that setup, since there are a few extra steps I have to go through if I want to spread the workload over my CPU cores and multiple GPUs. It will also help in my own neural network research.
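One of those extra steps is deciding which device gets which training elements. A minimal sketch in plain C, assuming a simple even split into contiguous chunks (the `chunk_t` type and `device_chunk` function are hypothetical names for this example):

```c
#include <assert.h>
#include <stddef.h>

/* A contiguous range of training elements assigned to one device. */
typedef struct {
    size_t start;
    size_t count;
} chunk_t;

/* Split n_items as evenly as possible across n_devices. The first
   (n_items % n_devices) devices each receive one extra item. */
chunk_t device_chunk(size_t n_items, size_t n_devices, size_t device)
{
    assert(n_devices > 0 && device < n_devices);
    size_t base = n_items / n_devices;
    size_t extra = n_items % n_devices;

    chunk_t c;
    c.start = device * base + (device < extra ? device : extra);
    c.count = base + (device < extra ? 1 : 0);
    return c;
}
```

An even split treats every device as equally fast. A mix of CPU cores and GPUs would likely need chunk sizes weighted by device speed, which this sketch does not attempt.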

Plus I will likely write an article about GPU programming in the near future.

I find this really interesting.

http://fastra.ua.ac.be/en/specs.html

They pretty much built a near-supercomputer from off-the-shelf parts, using four high-end dual video cards. 8 GPUs total! All for around 4,000 Euros. They claim it outperforms a cluster of 100 PCs!!


Copyright 2005 - 2012 by Heaton Research, Inc. Heaton Research™ and Encog™ are trademarks of Heaton Research.