Here is the progress I've made:
- installed Bumblebee
- installed the NVIDIA drivers (64-bit)
The first test I ran to make sure the installation was good at this point was:
cd /opt/VirtualGL/bin
optirun ./glxspheres
output:
Polygons in scene: 62464
Visual ID of window: 0x20
Context is Direct
OpenGL Renderer: GeForce GT 620M/PCIe/SSE2
78.883064 frames/sec - 88.033499 Mpixels/sec
78.467656 frames/sec - 87.569904 Mpixels/sec
80.333654 frames/sec - 89.652358 Mpixels/sec
80.144371 frames/sec - 89.441118 Mpixels/sec
78.961290 frames/sec - 88.120799 Mpixels/sec
Compared to the speeds for running it without optirun:
./glxspheres
output:
Polygons in scene: 62464
Visual ID of window: 0x20
Context is Direct
OpenGL Renderer: Mesa DRI Intel(R) Ivybridge Mobile x86/MMX/SSE2
60.521833 frames/sec - 67.542366 Mpixels/sec
59.924707 frames/sec - 66.875973 Mpixels/sec
60.537515 frames/sec - 67.559867 Mpixels/sec
60.056097 frames/sec - 67.022604 Mpixels/sec
59.947988 frames/sec - 66.901955 Mpixels/sec
Great! Optirun accessed the GPU (a GeForce GT 620M) and used it to run at about 80 fps, as opposed to the roughly 60 fps given by the Intel chip. But what I really want to do is run computations on the GPU using PyCUDA. I found a great sample speed test written with PyCUDA here: http://wiki.tiker.net/PyCuda/Examples/SimpleSpeedTest.
I started up IPython with optirun ipython --pylab and then ran the script. The smaller the time the better, and the first three values printed after each time are the results of the repeated numerical computation: they should all be identical.
output:
Using nbr_values == 8192
Calculating 100000 iterations
SourceModule time and first three results:
0.264477s, [ 0.005477 0.005477 0.005477]
Elementwise time and first three results:
0.349142s, [ 0.005477 0.005477 0.005477]
Elementwise Python looping time and first three results:
2.101326s, [ 0.005477 0.005477 0.005477]
GPUArray time and first three results:
6.302837s, [ 0.005477 0.005477 0.005477]
CPU time and first three results:
4.637617s, [ 0.005477 0.005477 0.005477]
The speed test compares five different methods of computation. The first four run on the GPU, and the last one runs on the CPU. Some of my times are better than the examples given in the comments of the code, and some are worse. The fastest GPU method (SourceModule, 0.26 s) beat the CPU (4.64 s) by roughly a factor of 17, though the naive GPUArray approach (6.30 s) was actually slower than the CPU. So it looks like I might be able to expect around a factor of 10 speed improvement from using the GPU, but maybe not much more than that. I wish I had someone else's results from their ASUS Zenbook (GPU specs here: http://www.geforce.com/hardware/notebook-gpus/geforce-gt-620m/specifications) to compare to!
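As a GPU-free sanity check of those numbers, the core computation can be reproduced with NumPy alone. This is a rough sketch, assuming (from the printed results) that the benchmark repeatedly applies sin to an array of ones: iterating sin drives every element toward sqrt(3/n_iter), and sqrt(3/100000) is about 0.005477, which matches the values above.

```python
import numpy as np

# CPU-only sketch of (what I take to be) the benchmark's core loop.
# Assumption: the test applies sin to an array of ones, n_iter times,
# which is consistent with the converged value printed above.
nbr_values = 8192
n_iter = 100000

a = np.ones(nbr_values, dtype=np.float32)
for _ in range(n_iter):
    a = np.sin(a)  # repeated sin converges toward sqrt(3 / n_iter)

print(a[:3])  # all three elements identical, near 0.005477
```

Since this does the same 100,000 vectorized sin evaluations over 8192 values, its wall-clock time should land in the same ballpark as the 4.6 s CPU figure reported above.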