Amazon EC2 Encog Performance
Amazon EC2 provides a number of different performance tiers. I've benchmarked Encog on several of these tiers to see how well Encog Multitasking scales up. So far the results have been quite good. Encog does not yet have scalable GPU processing, so these numbers use CPU power only. At least for now!
Encog on an Amazon EC2 c1.xlarge
The c1.xlarge is similar to my desktop computer, a quad-core Intel i7, in terms of its CPU capabilities. Amazon lists the instance as follows.
- High-CPU Extra Large Instance
- 7 GB of memory
- 20 EC2 Compute Units (8 virtual cores with 2.5 EC2 Compute Units each)
- 1690 GB of local instance storage
- 64-bit platform
Executing the Encog for C benchmark produced the following results.
* * Encog C/C++(64 bit) Command Line v0.1 * *
Processor/Core Count: 8
Performing benchmark
Input Count: 10
Ideal Count: 1
Records: 10000
Iterations: 100
Benchmark time(seconds): 42
I ran this a few times; results were typically in the 40-42 second range. My desktop quad-core typically lands in almost exactly that range. The cost for this Amazon EC2 instance is $0.68 (USD) an hour.
Encog on an Amazon EC2 cc2.8xlarge
- Cluster Compute Eight Extra Large
- 60.5 GB of memory
- 88 EC2 Compute Units
- 3370 GB of local instance storage
- 64-bit platform
- 10 Gigabit Ethernet
* * Encog C/C++(64 bit) Command Line v0.1 * *
Processor/Core Count: 32
Performing benchmark
Input Count: 10
Ideal Count: 1
Records: 10000
Iterations: 100
Benchmark time(seconds): 10
You can see a large difference from the previous test. With four times the number of cores, the benchmark finished in roughly one quarter of the time (10 seconds versus 42). For this test, Encog scaled up well from 8 cores to 32 cores.
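To put a number on that scaling, here is a minimal sketch that computes the speedup between the two benchmark runs and compares it with the ideal linear speedup. The function names are my own for illustration and are not part of Encog. The times, core counts, and EC2 Compute Unit figures are the ones quoted in this post. Keep in mind that the benchmark reports whole seconds, so the ratios are approximate.

```python
def speedup(baseline_seconds, scaled_seconds):
    """How many times faster the scaled run finished than the baseline."""
    return baseline_seconds / scaled_seconds


def scaling_efficiency(observed_speedup, baseline_resource, scaled_resource):
    """Observed speedup divided by the ideal (linear) speedup.

    1.0 means perfectly linear scaling for the chosen resource measure.
    """
    return observed_speedup / (scaled_resource / baseline_resource)


# Figures quoted in this post.
C1_SECONDS, C1_CORES, C1_ECU = 42, 8, 20
CC2_SECONDS, CC2_CORES, CC2_ECU = 10, 32, 88

observed = speedup(C1_SECONDS, CC2_SECONDS)
by_cores = scaling_efficiency(observed, C1_CORES, CC2_CORES)
by_ecu = scaling_efficiency(observed, C1_ECU, CC2_ECU)

print(f"speedup: {observed:.2f}x")
print(f"efficiency vs. core count: {by_cores:.2f}")
print(f"efficiency vs. EC2 Compute Units: {by_ecu:.2f}")
```

Measured against core count the efficiency comes out slightly above 1.0, which most likely reflects the coarse one-second timer and the fact that the two instance types do not have identical cores. Measured against Amazon's EC2 Compute Units (20 versus 88) it comes out slightly below 1.0.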
The following screen capture was taken while Encog was in the middle of training. You can see that even with 32 cores, Encog was keeping every CPU quite busy. The cost for this machine is $2.40 (USD) an hour.
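Since the two instances differ in both price and speed, it can help to compare what a single benchmark run costs on each. The sketch below is a rough illustration using the hourly prices and benchmark times quoted in this post. It assumes cost is proportional to run time and ignores billing granularity, so treat it as a relative comparison rather than an actual bill. The helper name is hypothetical.

```python
SECONDS_PER_HOUR = 3600


def cost_per_run(hourly_rate_usd, run_seconds):
    """Pro-rated cost of one run, assuming cost scales linearly with time."""
    return hourly_rate_usd * run_seconds / SECONDS_PER_HOUR


# Hourly prices and benchmark times quoted in this post.
c1_cost = cost_per_run(0.68, 42)    # c1.xlarge
cc2_cost = cost_per_run(2.40, 10)   # cc2.8xlarge

print(f"c1.xlarge:   ${c1_cost:.4f} per benchmark run")
print(f"cc2.8xlarge: ${cc2_cost:.4f} per benchmark run")
print(f"cost ratio (cc2 / c1): {cc2_cost / c1_cost:.2f}")
```

Under these assumptions the larger instance is slightly cheaper per run despite its higher hourly price, because it finishes the benchmark more than four times faster.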
Encog on an Amazon EC2 cg1.4xlarge
Amazon makes available a node type that offers two high-end GPUs. This instance type is used to test Encog in dual-GPU mode. The following shows Encog dumping the CUDA stats for Amazon's two Tesla M2050s on their GPU cluster instance.
* * Encog C/C++(64 bit, CUDA) Command Line v0.1 * *
Processor/Core Count: 16
Device 0: Tesla M2050
CUDA Driver Version / Runtime Version 3.2 / 3.1
CUDA Capability Major/Minor version number: 2.0
Total amount of global memory: 2687 MBytes (2817982464 bytes)
(14) Multiprocessors x (32) CUDA Cores/MP: 448 CUDA Cores
GPU Clock Speed: 1.15 GHz
Total amount of constant memory: 65536 bytes
Total amount of shared memory per block: 49152 bytes
Total number of registers available per block: 32768
Warp size: 32
Maximum number of threads per block: 1024
Maximum sizes of each dimension of a block: 1024 x 1024 x 64
Maximum sizes of each dimension of a grid: 65535 x 65535 x 1
Maximum memory pitch: 2147483647 bytes
Texture alignment: 512 bytes
Device 1: Tesla M2050
CUDA Driver Version / Runtime Version 3.2 / 3.1
CUDA Capability Major/Minor version number: 2.0
Total amount of global memory: 2687 MBytes (2817982464 bytes)
(14) Multiprocessors x (32) CUDA Cores/MP: 448 CUDA Cores
GPU Clock Speed: 1.15 GHz
Total amount of constant memory: 65536 bytes
Total amount of shared memory per block: 49152 bytes
Total number of registers available per block: 32768
Warp size: 32
Maximum number of threads per block: 1024
Maximum sizes of each dimension of a block: 1024 x 1024 x 64
Maximum sizes of each dimension of a grid: 65535 x 65535 x 1
Maximum memory pitch: 2147483647 bytes
Texture alignment: 512 bytes
Performing CUDA test.
Vector Addition
CUDA Vector Add Test was successful.
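As a quick sanity check on the device stats above, the following sketch recomputes the derived figures from the raw ones: CUDA cores per GPU, total cores across both GPUs, and global memory in megabytes. The helper names are hypothetical, and the assumption that "MBytes" in the dump means 1024 x 1024 bytes is mine, made because it is the interpretation that reproduces the reported 2687.

```python
BYTES_PER_MBYTE = 1024 * 1024  # assumption: "MBytes" in the dump is 2**20 bytes


def total_cuda_cores(multiprocessors, cores_per_mp, device_count=1):
    """CUDA cores across device_count identical GPUs."""
    return multiprocessors * cores_per_mp * device_count


def whole_mbytes(byte_count):
    """Byte count expressed in whole megabytes, truncated."""
    return byte_count // BYTES_PER_MBYTE


# Raw figures from the Tesla M2050 output in this post.
cores_per_gpu = total_cuda_cores(14, 32)
cores_both_gpus = total_cuda_cores(14, 32, device_count=2)
memory_mbytes = whole_mbytes(2817982464)

print(f"CUDA cores per GPU: {cores_per_gpu}")
print(f"CUDA cores across both GPUs: {cores_both_gpus}")
print(f"global memory per GPU: {memory_mbytes} MBytes")
```

The per-GPU core count and the memory figure match what the dump reports (448 and 2687), so the two devices together expose 896 CUDA cores.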
