Amazon EC2 Encog Performance

From Encog Machine Learning Framework
Revision as of 14:19, 5 February 2012 by JeffHeaton

Amazon EC2 provides a number of different performance tiers. I've benchmarked Encog on several of these tiers to see how well Encog multitasking scales up. So far the results have been quite good. Encog does not yet have scalable GPU processing, so these numbers use CPU power only. At least for now!

Encog on an Amazon EC2 c1.xlarge

The c1.xlarge is similar to my desktop computer, a quad-core Intel i7, in terms of its CPU capabilities. Amazon lists it as follows.

  • High-CPU Extra Large Instance
  • 7 GB of memory
  • 20 EC2 Compute Units (8 virtual cores with 2.5 EC2 Compute Units each)
  • 1690 GB of local instance storage
  • 64-bit platform

Executing the Encog for C benchmark produced the following results.

* * Encog C/C++(64 bit) Command Line v0.1 * *
Processor/Core Count: 8

Performing benchmark
Input Count: 10
Ideal Count: 1
Records: 10000
Iterations: 100
Benchmark time(seconds): 42

I ran this a few times; results were typically in the 40-42 second range. My desktop quad-core typically lands in nearly the same range. The cost for this Amazon EC2 instance is $0.68 (USD) an hour.

Encog on an Amazon EC2 cc2.8xlarge

  • Cluster Compute Eight Extra Large
  • 60.5 GB of memory
  • 88 EC2 Compute Units
  • 3370 GB of local instance storage
  • 64-bit platform
  • 10 Gigabit Ethernet

* * Encog C/C++(64 bit) Command Line v0.1 * *
Processor/Core Count: 32

Performing benchmark
Input Count: 10
Ideal Count: 1
Records: 10000
Iterations: 100
Benchmark time(seconds): 10

You can see a large difference from the previous test. There are four times as many cores, and the benchmark ran in roughly one quarter of the time (10 seconds versus 42). For this test, Encog scaled up well from 8 cores to 32 cores.
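The scaling claim can be checked with a little arithmetic. The following is a minimal sketch in Python, not part of Encog; the function names are made up for illustration, and the inputs are the core counts, benchmark times and EC2 Compute Unit figures quoted above.

```python
def speedup(baseline_seconds, scaled_seconds):
    """How many times faster the scaled run finished than the baseline."""
    return baseline_seconds / scaled_seconds


def parallel_efficiency(baseline_seconds, baseline_cores,
                        scaled_seconds, scaled_cores):
    """Speedup divided by the growth in core count; 1.0 means linear scaling."""
    core_ratio = scaled_cores / baseline_cores
    return speedup(baseline_seconds, scaled_seconds) / core_ratio


# Figures reported above: c1.xlarge (8 cores, 42 s), cc2.8xlarge (32 cores, 10 s).
observed_speedup = speedup(42, 10)
observed_efficiency = parallel_efficiency(42, 8, 10, 32)

# Ratio of advertised EC2 Compute Units (88 for cc2.8xlarge, 20 for c1.xlarge).
compute_unit_ratio = 88 / 20

print(f"speedup:            {observed_speedup:.2f}x")
print(f"efficiency:         {observed_efficiency:.2f}")
print(f"compute unit ratio: {compute_unit_ratio:.2f}x")
```

An efficiency slightly above 1.0 should not be read as better-than-linear scaling. The benchmark reports whole seconds, and the c1.xlarge runs varied between 40 and 42 seconds; taking 40 seconds as the baseline gives a speedup of exactly 4.0. The cores on the two instance types are also not identical, which the compute unit ratio hints at.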

The following screen capture was taken while Encog was in the middle of training. You can see that even with 32 cores, Encog was keeping every CPU quite busy. The cost for this machine is $2.40 (USD) an hour.

Encog-32core.png
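The hourly prices can also be turned into a rough cost per benchmark run. This is a sketch only: it assumes cost is proportional to run time, which ignores any minimum billing increment, and the function name is hypothetical. The prices and times are the ones quoted on this page.

```python
SECONDS_PER_HOUR = 3600


def cost_per_run(hourly_rate_usd, run_seconds):
    """Pro-rated cost of one run, assuming billing proportional to time."""
    return hourly_rate_usd * run_seconds / SECONDS_PER_HOUR


# c1.xlarge: $0.68 an hour, 42 second benchmark.
c1_cost = cost_per_run(0.68, 42)
# cc2.8xlarge: $2.40 an hour, 10 second benchmark.
cc2_cost = cost_per_run(2.40, 10)

print(f"c1.xlarge:   ${c1_cost:.4f} per run")
print(f"cc2.8xlarge: ${cc2_cost:.4f} per run")
print(f"cc2.8xlarge / c1.xlarge cost: {cc2_cost / c1_cost:.2f}")
```

Under that assumption the larger instance costs somewhat less per run even though its hourly rate is higher, because the run finishes about four times sooner.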

Encog on an Amazon EC2 cg1.4xlarge

Amazon makes available a node type that offers two high-end GPUs. This instance type is used to test Encog in dual-GPU mode. The following shows Encog dumping the CUDA stats for Amazon's two Tesla M2050s on their GPU cluster instance.

* * Encog C/C++(64 bit, CUDA) Command Line v0.1 * *
Processor/Core Count: 16
Device 0: Tesla M2050
   CUDA Driver Version / Runtime Version          3.2 / 3.1
  CUDA Capability Major/Minor version number:    2.0
  Total amount of global memory:                 2687 MBytes (2817982464 bytes)

  (14) Multiprocessors x (32) CUDA Cores/MP:     448 CUDA Cores
  GPU Clock Speed:                               1.15 GHz
  Total amount of constant memory:               65536 bytes
  Total amount of shared memory per block:       49152 bytes
  Total number of registers available per block: 32768
  Warp size:                                     32
  Maximum number of threads per block:           1024
  Maximum sizes of each dimension of a block:    1024 x 1024 x 64
  Maximum sizes of each dimension of a grid:     65535 x 65535 x 1
  Maximum memory pitch:                          2147483647 bytes
  Texture alignment:                             512 bytes
Device 1: Tesla M2050
   CUDA Driver Version / Runtime Version          3.2 / 3.1
  CUDA Capability Major/Minor version number:    2.0
  Total amount of global memory:                 2687 MBytes (2817982464 bytes)

  (14) Multiprocessors x (32) CUDA Cores/MP:     448 CUDA Cores
  GPU Clock Speed:                               1.15 GHz
  Total amount of constant memory:               65536 bytes
  Total amount of shared memory per block:       49152 bytes
  Total number of registers available per block: 32768
  Warp size:                                     32
  Maximum number of threads per block:           1024
  Maximum sizes of each dimension of a block:    1024 x 1024 x 64
  Maximum sizes of each dimension of a grid:     65535 x 65535 x 1
  Maximum memory pitch:                          2147483647 bytes
  Texture alignment:                             512 bytes
Performing CUDA test.
Vector Addition
CUDA Vector Add Test was successful.
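A few of the device figures above can be cross-checked by hand. The sketch below is plain Python arithmetic on the numbers printed in the dump; it does not query a GPU, and the variable names are chosen here for illustration.

```python
BYTES_PER_MIB = 1024 * 1024

# Per-device figures copied from the CUDA stats above.
multiprocessors = 14
cores_per_multiprocessor = 32
global_memory_bytes = 2817982464
device_count = 2

# Multiprocessors times cores per multiprocessor gives the per-GPU core count.
cores_per_gpu = multiprocessors * cores_per_multiprocessor
total_cores = cores_per_gpu * device_count

# The "MBytes" figure matches the byte count divided by 2**20, truncated.
global_memory_mib = global_memory_bytes / BYTES_PER_MIB

print(f"CUDA cores per GPU: {cores_per_gpu}")
print(f"CUDA cores in total: {total_cores}")
print(f"Global memory per GPU: {int(global_memory_mib)} MBytes")
```

The per-GPU core count agrees with the 448 CUDA cores reported for each Tesla M2050, and the memory figure agrees with the 2687 MBytes shown.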