Running Jobs on the WestGrid Cortex System
Introduction
Please see the Running Jobs page for a general introduction to the batch queuing and scheduling system used at WestGrid sites. Additional site-specific information and examples for the Cortex system are shown below.
File Systems
Jobs should be submitted from the $HOME space. It is located on the GPFS file systems /scratch_ibm and /scratch_ds4700 and is available on every IBM machine in the complex as a cluster file system.
For more information on file systems on Cortex see here.
Submitting Jobs
Use the Torque qsub command to submit jobs on WestGrid. The output from the job will be routed to the directory from which you submitted the job. The syntax for the qsub command is: qsub [ option(s) ] [ script-file ]. For more information on the qsub command see Cluster Resources documentation here.
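As a minimal sketch of the syntax above, the following submits a script file and keeps the job identifier that qsub prints on success. The file name myjob.sh is only the example name used on this page, and the checks around qsub are there so the snippet exits cleanly on a host without Torque installed.

```shell
#!/bin/sh
# Sketch: submit a job script and keep the job identifier that qsub prints.
script="myjob.sh"   # example script name; substitute your own

if ! command -v qsub >/dev/null 2>&1; then
    echo "qsub not found; run this on a host with Torque installed" >&2
elif [ ! -f "$script" ]; then
    echo "$script not found in the current directory" >&2
elif job_id=$(qsub "$script"); then
    echo "Submitted $script as job $job_id"
else
    echo "qsub rejected $script" >&2
fi
```

Because job output is routed to the submission directory, it is convenient to run this from the directory where you want the output files to appear.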
Torque Submission Scriptfile
| Torque Script | Documentation |
| #! /bin/sh | |
| #PBS -S /bin/sh | This is the shell that PBS will use to execute your script file. If omitted, your login shell on the execution host is used. This has an impact on which set of startup files are processed, and consequently, which set of environment variables your script inherits. This shell and the one specified on the #! line don't have to be the same. The only rule is that the syntax of the script must be consistent with the shell specified on the #! line. |
| #PBS -q ibms | The requested queue. For this example we have chosen "ibms", the default queue on Cortex. Each queue has resource limits that reflect the capabilities of the machine(s) hosting the queue or that are derived from administrative policies. Please see the Queues section below for more information. |
| #PBS -l host=synapse | The requested host. For this example we have chosen "synapse". It is not recommended to set the host variable unless you require your job to run on a specific host. Jobs without a specified host will tend to run earlier, as they may be run on other hosts when space becomes available. The host specified must be able to run the job and must meet the walltime, CPU count, memory size, and queue restrictions, if set. |
| #PBS -l ncpus=64 | Number of CPUs required by your job. In this example, 64 processors are being requested. Remember to request one CPU for each parallel thread of execution. The syntax for specifying processors is different from that used on commodity clusters such as Glacier. |
| #PBS -l walltime=12:00:00 | This specifies the maximum amount of elapsed time required by your job. This should be larger than the time needed to reach the first checkpoint. See the main Running Jobs page for more detail. On Cortex, the maximum walltime you can request is 24 hours. |
| #PBS -m bea | Instructs PBS to email you when your job begins, ends, or is aborted. The actual email address is specified on the next line, i.e., in the "M" directive. |
| #PBS -M your_email@ualberta.ca | Specifies the email address that will receive PBS notifications. The address shown here is only an example; please be sure to substitute your own email address. |
| #PBS -N myjob | In this example we have chosen an unimaginative name of "myjob". Choose a short name (fewer than 16 characters, no spaces) that you can use to identify your job. If omitted, the name of the script (truncated to the first 15 characters) is used. |
| ./a.out | This line executes the program a.out |
For the sake of simplicity, we recommend that you save the script in the same directory as your program files, preferably in a personal directory under /scratch. Give the script an appropriate name; in our example we called it "myjob.sh". When the job runs, the script is executed on your behalf using the specified shell on the specified host. At this point, it's just an ordinary shell script and all comments, including PBS directives, are ignored. The program (a.out in the example) runs as a child process of the script. When the program terminates, execution returns to the script, which itself terminates, finally terminating the job.
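To show how the table rows fit together as one file, the sketch below writes the example directives into myjob.sh using a heredoc. The values are the example values from the table, not recommendations. In particular, the host line can be dropped unless the job must run on a specific host, and the email address is a placeholder.

```shell
#!/bin/sh
# Sketch: write the example directives from the table into myjob.sh.
# The quoted 'EOF' keeps the shell from expanding anything inside the heredoc.
cat > myjob.sh <<'EOF'
#!/bin/sh
#PBS -S /bin/sh
#PBS -q ibms
#PBS -l host=synapse
#PBS -l ncpus=64
#PBS -l walltime=12:00:00
#PBS -m bea
#PBS -M your_email@ualberta.ca
#PBS -N myjob
./a.out
EOF

# Count the PBS directives that were written, as a quick sanity check.
directive_count=$(grep -c '^#PBS' myjob.sh)
echo "myjob.sh written with $directive_count PBS directives"
```

Writing the file by hand in an editor works just as well; the heredoc is used here only so the whole script can be shown as a single runnable block.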
Large memory jobs
For jobs requiring more than 4 GB per process, calculate an equivalent number of processors by dividing the total memory required by 4 GB. Use the resulting number as your ncpus parameter if it is greater than the number of processors your job requires. Do not use "-l mem=..." or "-l vmem=..." directives in your job scripts for the machines available through Cortex.
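The calculation can be sketched in shell arithmetic as follows. The values of total_mem_gb and procs_needed are hypothetical, and rounding the division up to a whole processor is an assumption made here so that the full memory requirement is covered.

```shell
#!/bin/sh
# Sketch: choose ncpus for a large-memory job at 4 GB per processor.
# total_mem_gb and procs_needed are hypothetical example values.
total_mem_gb=100
procs_needed=8
gb_per_cpu=4

# Ceiling division (assumption: round up so the full memory request is covered).
mem_cpus=$(( (total_mem_gb + gb_per_cpu - 1) / gb_per_cpu ))

# Use the larger of the memory-derived count and the processor count.
if [ "$mem_cpus" -gt "$procs_needed" ]; then
    ncpus=$mem_cpus
else
    ncpus=$procs_needed
fi

echo "#PBS -l ncpus=$ncpus"
```

With these example values the job needs 25 processors' worth of memory, which exceeds the 8 processors it would otherwise request, so the directive printed is "#PBS -l ncpus=25".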
Queues
There are a variety of queues to which jobs can be submitted: two production queues, 'ibms' and 'pwr4'; a debugging queue, 'test'; and an administrator-adjustable queue, 'special'. Some characteristics of the jobs accepted by the various queues are shown in the table below.
| Submit Queue Name | Execute Queue Name | Nodes That Jobs Will Be Run On: | Min CPUs | Max CPUs | Default CPUs | Max Walltime (hh:mm) | Default Walltime (hh:mm) |
| ibms | q1 | Cortex | 1 | 2 | 1 | 24:00 | 12:00 |
| ibms | q2 | Dendrite | 3 | 29 | 4 | 24:00 | 12:00 |
| ibms | q3 | Dendrite, Synapse | 30 | 64 | 64 | 24:00 | 12:00 |
| pwr4 | pwr4 | Bigfoot, Adenine, Guanine | 1 | 32 | 1 | 24:00 | 12:00 |
| test | test | ANY | 1 | 64 | 1 | 00:10 | 00:01 |
| special | special | ANY | 1 | 64 | 1 | ANY | ANY |
There is one queue for general use: ibms. For the ibms queue, the minimum number of processors for a job is 32 on synapse, 3 on dendrite, and 1 on cortex. Jobs submitted to the ibms queue will appear as queue q1, q2, or q3 when monitored with the qstat, pqstat, or checkjob commands.
Jobs submitted to the 'pwr4' queue will run on the IBM power4 p690 hosts: bigfoot, adenine, and guanine.
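To illustrate which execute queue name a job is likely to appear under, the sketch below maps an ncpus request to q1, q2, or q3 using only the Min CPUs and Max CPUs columns of the table. The function ibms_execute_queue is a hypothetical helper, not a WestGrid command, and it assumes routing depends on CPU count alone; the scheduler itself makes the real decision.

```shell
#!/bin/sh
# Sketch: map an ncpus request in the ibms queue to its execute queue,
# using the Min CPUs / Max CPUs columns of the queue table.
# ibms_execute_queue is a hypothetical helper, not a WestGrid command.
ibms_execute_queue() {
    ncpus=$1
    if [ "$ncpus" -ge 1 ] && [ "$ncpus" -le 2 ]; then
        echo q1
    elif [ "$ncpus" -ge 3 ] && [ "$ncpus" -le 29 ]; then
        echo q2
    elif [ "$ncpus" -ge 30 ] && [ "$ncpus" -le 64 ]; then
        echo q3
    else
        echo "ncpus=$ncpus is outside the 1-64 range accepted by ibms" >&2
        return 1
    fi
}

ibms_execute_queue 64
```

For the 64-processor example used earlier on this page, the function prints q3.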
There is a 'test' queue used to run very short test jobs, for example to test job submission scripts. There is a limit of 1 job per user but no restriction on the minimum number of processors to run on any machine.
Access to the 'special' queue is by special permission only and will only be granted in special cases, such as in order to meet RAC allocations that could not otherwise be met within our site's general job policies. Please contact support@westgrid.ca if you wish to use it.
Priority
The priority of a job is determined by a combination of the following factors:
- A project group's recent usage of the system and the project group's RAC allocation, using the fairshare algorithm.
- Large parallel and large memory jobs are favored.
- How long the job has been sitting in the queue. Only a user's oldest jobs gain eligible queue time and priority this way.
For more information, please see the information on WestGrid allocations and the WestGrid Resource Allocation Committee (RAC), as well as the Cluster Resources documentation on the fairshare algorithm, resource priority factor, and service priority factor.
Monitoring Jobs
The status and priority of jobs, both running and queued, can be checked with the local command pqstat, which provides a detailed and concise view of jobs in the complex. An explanation of the output of the pqstat command can be found here.
The checkjob command shows very detailed information for a single job. An explanation of the output of the checkjob command can be found here.
The mdiag -n command shows a summary of the true current condition of machines in a cluster and is available to users on the Cortex complex. An explanation of the output of the mdiag -n command can be found here.
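The three monitoring commands can be combined in a small helper, sketched below. The function show_job_status is hypothetical, and calling pqstat without arguments is an assumption, since pqstat is a local command whose options are not described here. Each command is skipped if it is not installed on the current host.

```shell
#!/bin/sh
# Sketch: run the monitoring commands described above for one job.
# show_job_status is a hypothetical helper; pass it the job ID printed by qsub.
show_job_status() {
    job_id=$1
    for tool in pqstat checkjob mdiag; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "$tool not found on this host; skipping" >&2
            continue
        fi
        case $tool in
            pqstat)   pqstat ;;              # jobs in the complex (no-argument call is an assumption)
            checkjob) checkjob "$job_id" ;;  # detailed view of one job
            mdiag)    mdiag -n ;;            # current condition of the machines
        esac
    done
}
```

Call it with the identifier of one of your own jobs, for example show_job_status followed by the job ID that qsub reported at submission.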
Updated 2008-12-04.
