GPU Computation
WestGrid has created a small GPU testbed as part of the checkers cluster at the University of Alberta. Four nVidia Quadro Plex 2200 S4 units are connected to eight checkers nodes. Each Quadro Plex 2200 S4 contains four Quadro FX 5800 GPUs, for a total of 16 GPUs, so each of the eight checkers nodes has access to two GPUs for interactive visualization or GPU-based computation. For more information on the WestGrid Visualization Server and its hardware and software configuration, see the Checkers QuickStart page.
Running GPU jobs on checkers
Using the checkers cluster for GPU computation is quite straightforward. There is a separate queue for submitting jobs to the GPU nodes. Users should specify the queue in their PBS script as follows:
#!/bin/bash
#PBS -q gpu
#PBS -N cuda
#PBS -l nodes=1:ppn=1
echo "Hello from $HOSTNAME: date = `date`"
nvcc --version
echo "Finished at `date`"
The script above requests one processor on one node in the gpu queue. A GPU application started from such a job runs on one of the GPUs attached to the node allocated to the job. For more information on running jobs using the batch queuing system, please refer to the Running Jobs web page.
Running jobs with multiple GPUs
Each GPU-enabled checkers node has two GPUs attached. Codes that make use of multiple GPUs can therefore either run on a single node and use up to two GPUs, or run on multiple nodes and use two or more GPUs in total, depending on how many nodes are requested. The following script requests two GPUs on a single node:
#!/bin/bash
#PBS -q gpu
#PBS -N cuda
#PBS -l nodes=1:ppn=1,gres=gpu:2
echo "Hello from $HOSTNAME: date = `date`"
nvcc --version
echo "Finished at `date`"
The above PBS batch script allocates a single processor on a single node, with two GPUs allocated to that job. Using "ppn=2,gres=gpu:1" allocates two processors on the node and a single GPU per allocated processor. Using "nodes=4:ppn=2,gres=gpu:1" allocates 8 processors and 8 GPUs across 4 nodes (2 processors and 2 GPUs per node).
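The arithmetic behind these requests can be checked with a small shell sketch. The function name gpu_request_totals is hypothetical and is not a WestGrid tool. It assumes, as described above, that the gres=gpu value is counted per allocated processor, and it returns a non-zero status when a request would need more than the two GPUs available on a node.

```shell
# Hypothetical helper (not a WestGrid tool): totals implied by
#   -l nodes=N:ppn=P,gres=gpu:G
# ASSUMPTION: gres=gpu:G is counted per allocated processor.
gpu_request_totals() (
    nodes=$1; ppn=$2; gres=$3
    cpus=$(( nodes * ppn ))        # processors across all nodes
    gpus=$(( cpus * gres ))        # GPUs across all nodes
    per_node=$(( ppn * gres ))     # GPUs needed on each node
    echo "cpus=$cpus gpus=$gpus gpus_per_node=$per_node"
    # Each GPU node has only two GPUs, so larger per-node requests cannot be met.
    [ "$per_node" -le 2 ]
)

gpu_request_totals 1 1 2   # prints: cpus=1 gpus=2 gpus_per_node=2
gpu_request_totals 4 2 1   # prints: cpus=8 gpus=8 gpus_per_node=2
```

The two calls correspond to the first and last requests described above.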
Note that there is a scheduling limitation when requesting more than 2 CPUs per node on the GPU nodes. The normal way of requesting more CPUs, for example "ppn=8", has the unwanted side effect of telling the queuing system that the job also requires 8 GPUs on that node (one GPU per "task" or allocated processor). Since none of the GPU nodes has more than 2 GPUs, such a job will NEVER run. To use more than two CPUs on a GPU node, request two processors with "ppn=2" (which in turn allocates 2 GPUs) together with "naccesspolicy=singlejob". This reserves the whole node for your job and allows you to use up to 8 CPUs and 2 GPUs on that node. As a result, it is currently only possible to schedule a job that uses 1 CPU and 1 GPU, 2 CPUs and 2 GPUs, or 8 CPUs and 2 GPUs. Please check back here for updates on fixes for this scheduling limitation.
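The workaround can be summarized as a mapping from the number of CPUs a single-node job wants to the resource string it should request. The sketch below is illustrative only: gpu_node_request is a hypothetical name, and placing naccesspolicy=singlejob on the same resource list as nodes and ppn is an assumption based on the description above, so confirm the exact form on the Running Jobs web page.

```shell
# Hypothetical helper (not a WestGrid tool): print a -l resource string for
# a job on ONE GPU node, given the number of CPUs the job wants to use.
# ASSUMPTION: naccesspolicy=singlejob goes on the same -l list as nodes/ppn.
gpu_node_request() (
    cpus=$1
    case "$cpus" in
        1|2)   echo "nodes=1:ppn=$cpus" ;;                     # 1 or 2 CPUs, one GPU each
        [3-8]) echo "nodes=1:ppn=2,naccesspolicy=singlejob" ;; # whole node: up to 8 CPUs, 2 GPUs
        *)     echo "unsupported CPU count: $cpus (expected 1 to 8)" >&2
               exit 1 ;;
    esac
)

gpu_node_request 8   # prints: nodes=1:ppn=2,naccesspolicy=singlejob
```

The result could then be passed to the scheduler on the command line, for example with qsub -q gpu -l "$(gpu_node_request 8)" myjob.pbs, where myjob.pbs stands for your own batch script.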
For more information on running jobs using the batch queuing system, please refer to the Running Jobs web page.
Compiling GPU software
WestGrid has allocated two nodes, with two GPUs each (4 GPUs in total), for interactive use, primarily for visualization (see the WestGrid Visualization Server page for more details about visualization). Because these nodes have GPUs attached, they are the nodes you should use for compiling and testing your GPU codes. Note that you SHOULD NOT run long GPU computations on these nodes; they should be used for compilation and testing only.
To connect to an interactive node, use a secure shell (ssh) client. Two nodes are currently allocated for this purpose, node 151 and node 152. These nodes accept ssh connections directly and have the IP addresses 206.12.25.51 and 206.12.25.52, respectively. Use the following command to connect to node 151:
ssh 206.12.25.51
You can then use the nVidia developer environment as you would on any machine to compile GPU code. For more information on how to compile GPU code using CUDA please refer to the nVidia CUDA documentation.
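As a quick test of the compiler on an interactive node, the following sketch writes a very small CUDA program, compiles it with nvcc, and runs it. The file name devices.cu and the program itself are illustrative assumptions, not part of the WestGrid setup. The program only asks the CUDA runtime how many devices are visible, so it is short enough for the interactive nodes.

```shell
# Write a tiny CUDA source file (hypothetical name: devices.cu).
cat > devices.cu <<'EOF'
#include <cstdio>

int main() {
    int n = 0;
    // Ask the CUDA runtime how many GPUs this node can see.
    cudaGetDeviceCount(&n);
    printf("CUDA devices visible: %d\n", n);
    return 0;
}
EOF

# Compile and run only if nvcc is on the PATH.
if command -v nvcc >/dev/null 2>&1; then
    nvcc -o devices devices.cu && ./devices || echo "compile or run failed" >&2
else
    echo "nvcc not found; log in to an interactive GPU node first" >&2
fi
```

On a node with two GPUs attached, the program would be expected to report two devices.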
Modules and GPU compilation
The checkers cluster, like most WestGrid machines, uses Linux modules to set up the user environment (see the WestGrid modules page for more information about modules). The visualization and CUDA modules are loaded automatically whenever you log in to a visualization node on the checkers cluster. This gives you a default user environment that puts the nVidia developer tools in your path, so you can issue simple commands like nvcc (the nVidia compiler front end) rather than typing the full path to each command.
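To confirm that the developer tools are actually in your path after logging in, a check along the following lines may help. The function name check_tools is hypothetical. It only reports where each named command is found and makes no assumption about the module names used on checkers.

```shell
# Hypothetical helper: report where each named command is found on the PATH.
# Exits with a non-zero status if any of them is missing.
check_tools() (
    status=0
    for tool in "$@"; do
        if path=$(command -v "$tool" 2>/dev/null); then
            echo "$tool: $path"
        else
            echo "$tool: not found"
            status=1
        fi
    done
    exit "$status"
)

check_tools nvcc || echo "the CUDA tools do not appear to be in your path"
```

If nvcc is reported as not found, running module list shows which modules are currently loaded in your session.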
