Gaussian on WestGrid
Table of Contents
- Introduction
- Migrating from Lattice
- User Responsibilities
- System Characteristics and Limitations
- Using Gaussian
Introduction
WestGrid has acquired a full commercial license for Gaussian 09 (G09). Gaussian 03 (G03) will also remain available for a time, although we encourage users to migrate to Gaussian 09 as soon as possible. Gaussian software is available for use by all approved WestGrid account holders, subject to some license restrictions.
Note: In addition to using WestGrid, University of Alberta users may apply to use Gaussian 03 on machines operated by Academic Information and Communication Technologies. Contact research.support@ualberta.ca for more information. Similarly, University of Calgary researchers may apply to use Gaussian 03 on local U of C resources by contacting support@hpc.ucalgary.ca.
Migrating from Lattice
Gaussian was formerly available on the WestGrid Lattice cluster. (The old documentation for running on Lattice is here.)
If you have checkpoint files generated using Gaussian 03 on the Lattice cluster, these can be converted for use with Gaussian 09 on Checkers using a utility, c8609, supplied with Gaussian 09. This command-line utility accepts one argument, the name of (or path to) the checkpoint file to be converted. Note that the conversion occurs in place, overwriting the file given as the argument. If you would like to retain the original checkpoint file, make a copy of it before running c8609. As explained in more detail below, the module command should be run first to set up your environment.
Example of using c8609 interactively on Checkers to convert an old checkpoint file, water.chk, to a format suitable for Gaussian 09:
module load gaussian
cp water.chk water.chk.old
c8609 water.chk
User Responsibilities
Due to licensing restrictions, researchers must agree to certain conditions in order to use Gaussian software on WestGrid systems. Follow the directions given here to apply.
Users are expected to be generally familiar with Gaussian capabilities, input file format and the use of restart files. Also, please read the Efficiency Considerations section of the Gaussian 09 Online Manual in order to learn about memory and disk requirements of different types of analyses.
The following sections also give system-specific information that is important for effective use of Gaussian on the WestGrid Checkers cluster.
- Please note that inappropriate use of the memory-related parameters can cause jobs to fail or prevent the batch scheduler from using the system efficiently.
System Characteristics and Limitations
Gaussian is available to approved WestGrid users on checkers.westgrid.ca. See the Checkers QuickStart Guide for an overview of the Checkers cluster.
Parallel job limitations
The Linda environment is not available on this system, so a parallel job is restricted to at most the eight processor cores on a single node.
File system issues
By default, if the Gaussian environment is initialized by an appropriate module command (described below), the Gaussian scratch directory will be automatically assigned to use the most appropriate file system.
To avoid overloading the NFS file server, do not include a %rwf directive of the form:
%rwf=read_write_file.wf
in your Gaussian command file. Including such a directive places the frequently accessed temporary "read/write file" (rwf) in the job's current working directory on the NFS server. Leaving this directive out of the input file puts the rwf file in a job-specific temporary directory in a non-NFS local file system on the execution node. This yields better performance for the Gaussian job and takes the load off the NFS file server. It is often necessary to include a %chk directive in order to save the checkpoint file, but it is never necessary to save the rwf file.
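One way to catch a stray %rwf directive before submitting a job is a quick grep over the input file. The following is a minimal sketch, not a WestGrid-supplied tool: the function name has_rwf_directive is hypothetical, and it assumes the directive appears at the start of a line (optionally preceded by white space) and is matched without regard to case.

```shell
#!/bin/bash
# Hypothetical helper: succeeds (exit status 0) if the given Gaussian
# input file contains a %rwf directive, which this page advises against.
has_rwf_directive() {
    # -i: ignore case; -q: print nothing, only set the exit status.
    grep -iq '^[[:space:]]*%rwf[[:space:]]*=' "$1"
}

# Example usage with the placeholder input file name used on this page.
input="your_g09_commands.com"
if [ -f "$input" ] && has_rwf_directive "$input"; then
    echo "Warning: $input contains a %rwf directive; consider removing it."
fi
```

The check only reports the directive; removing it from the input file is left to the user.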
Using Gaussian
Job Submission
Gaussian is available to WestGrid users only on checkers.westgrid.ca. Like other jobs on WestGrid systems, Gaussian jobs are run by submitting an appropriate script for batch scheduling using the qsub command. A sample script, gaussian.pbs, is shown further down on this page. For example, to submit a serial Gaussian job with a time limit of 168 hours (one week), use a command of the form:
qsub -l walltime=168:00:00 gaussian.pbs
See the Checkers QuickStart Guide and the Running Jobs page for more information about submitting jobs on Checkers.
Job Time Limit and Restart (Checkpoint) Files
At the time of writing, the maximum time limit for serial jobs on Checkers is 21 days (504 hours). The -l flag with the walltime option, as shown on the qsub command above, is used to request a specific limit on the elapsed time for the job. The format is qsub -l walltime=hhh:mm:ss for a given number of hours (hhh), minutes (mm) and seconds (ss).
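The arithmetic behind the hhh:mm:ss format can be sketched with two small shell functions. These helpers are illustrative only (the names hours_to_walltime and days_to_walltime are hypothetical) and handle whole hours or whole days only.

```shell
#!/bin/bash
# Hypothetical helper: convert a whole number of hours to hhh:mm:ss.
hours_to_walltime() {
    printf '%d:00:00\n' "$1"
}

# Hypothetical helper: convert a whole number of days to hhh:mm:ss.
days_to_walltime() {
    hours_to_walltime $(( $1 * 24 ))
}

hours_to_walltime 168   # one week: prints 168:00:00
days_to_walltime 21     # the 21-day serial limit: prints 504:00:00
```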
To avoid lost work if there is an interruption during a long job, it is recommended that a checkpoint file be specified in your Gaussian command file, using the %chk directive, for example:
%chk=restart.chk
If you underestimated the time required for the job or if it was stopped due to a system problem (other than a disk failure!), you may be able to use the restart.chk file to continue the calculation in a subsequent job.
Matching TORQUE and Gaussian Memory Limits
The Gaussian command file directive %mem=[Gaussian_memory] can be used to increase the internal memory allocation for the Gaussian program, where, for example, Gaussian_memory=1600MB. At least for G03, the amount of memory actually used by Gaussian is significantly more than the amount requested by %mem. If TORQUE is told only the %mem value, the scheduler can assign jobs to nodes that do not have sufficient memory, which can lead to job failures and conflicts with other users' jobs.
The amount of extra memory used varies from about 300 MB to 700 MB, increasing with the amount requested. For example, if you use %mem=1600MB, you should tell TORQUE that your job needs 2000 MB, but if you request %mem=3000MB, TORQUE should be advised that the job needs 3700 MB or more.
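The margin calculation can be sketched in shell as follows. This is an illustration under stated assumptions, not a WestGrid utility: the function names are hypothetical, only %mem values written in MB are recognized, and the default margin of 700 MB is simply the upper end of the range quoted above, chosen to be conservative rather than measured for any particular job.

```shell
#!/bin/bash
# Hypothetical helper: print a TORQUE mem request for a Gaussian %mem
# value given in MB ($1) plus a safety margin in MB ($2).
# Assumption: the margin defaults to 700 MB, the top of the quoted range.
torque_mem_mb() {
    echo "$(( $1 + ${2:-700} ))MB"
}

# Hypothetical helper: print the number from the first %mem directive in
# a Gaussian input file. Only values in MB (e.g. %mem=1600MB) are handled.
gaussian_mem_mb() {
    sed -n 's/^[[:space:]]*%[Mm][Ee][Mm][[:space:]]*=[[:space:]]*\([0-9][0-9]*\)[Mm][Bb][[:space:]]*$/\1/p' "$1" | head -n 1
}

torque_mem_mb 1600 400   # prints 2000MB, matching the first example above
torque_mem_mb 3000       # prints 3700MB, using the default 700 MB margin
```

Always adding the largest margin is the simplest choice, but it may request more memory than a small job needs; pass a smaller second argument if you have measured the actual overhead.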
The mem option is used with the -l flag on the qsub command line to tell TORQUE how much memory the job requires. It can be combined with the walltime option, separated by a comma, as in this example:
qsub -l walltime=168:00:00,mem=2000MB gaussian.pbs
Running Parallel Gaussian Jobs
Some of the analyses available in the Gaussian suite support parallel processing. If you have not previously run parallel Gaussian jobs, please ask your colleagues for advice on which kind of analyses work well in parallel, or do some short test runs to compare the elapsed time as you increase the number of processors from 1 to 2 (or up to 8). No more than 8 processors may be used for a single Gaussian job on Checkers.
To request a parallel calculation, use the %nproc directive in your Gaussian command file. For example, to use two processors:
%nproc=2
As with the memory, it is not sufficient to tell only Gaussian how many processors you wish to use. TORQUE must also be told, so that the batch job handling system assigns your job the correct number of processors. This is done using the nodes=1:ppn=... option of the -l flag on the qsub command line. For example:
qsub -l nodes=1:ppn=2 gaussian.pbs
In the example, ppn stands for processors per node. Two processors were requested. The number of nodes requested should always be one for Gaussian jobs. Please ensure that a colon, not a comma, is used to separate the nodes=1 from the ppn.
Administrators have noticed that users sometimes forget to add the memory and processor directives on the qsub command line. As noted on the Running Jobs page, you can add these directives to the batch job script instead, with lines of the form:
#PBS -l mem=2000MB
#PBS -l nodes=1:ppn=2
This is also illustrated in the sample job below.
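Because the processor count has to agree in two places, a small pre-submission check can help. The sketch below is not part of Gaussian or TORQUE, and all function names are hypothetical. It assumes the Gaussian input file uses a %nproc=N directive and the batch script contains a "#PBS -l nodes=1:ppn=N" line, as described on this page.

```shell
#!/bin/bash
# Hypothetical helper: print N from the first %nproc=N directive in a
# Gaussian input file (matched without regard to case).
gaussian_nproc() {
    sed -n 's/^[[:space:]]*%[Nn][Pp][Rr][Oo][Cc][[:space:]]*=[[:space:]]*\([0-9][0-9]*\).*$/\1/p' "$1" | head -n 1
}

# Hypothetical helper: print N from the first "#PBS ... ppn=N" line in
# a batch job script.
pbs_ppn() {
    sed -n 's/^#PBS.*ppn=\([0-9][0-9]*\).*$/\1/p' "$1" | head -n 1
}

# Succeeds only if both values are present and equal.
# Arguments: $1 = Gaussian input file, $2 = batch job script.
check_procs_match() {
    chk_nproc=$(gaussian_nproc "$1")
    chk_ppn=$(pbs_ppn "$2")
    [ -n "$chk_nproc" ] && [ "$chk_nproc" = "$chk_ppn" ]
}

# Example usage with the placeholder file names used on this page.
if [ -f your_g09_commands.com ] && [ -f gaussian.pbs ]; then
    check_procs_match your_g09_commands.com gaussian.pbs ||
        echo "Warning: %nproc and ppn are missing or do not match."
fi
```

The check does not cover processor counts given only on the qsub command line.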
Sample TORQUE Script for Running G09
The script below is an example of what gaussian.pbs might look like. Modify the lines containing your_g09_commands.com to reference your own input file of Gaussian commands.
#PBS -S /bin/bash
#PBS -l mem=2000MB
#PBS -l nodes=1:ppn=2
# Adjust the mem and ppn above to match the requirements of your job
# Sample Gaussian job script
cd "$PBS_O_WORKDIR"
echo "Current working directory is `pwd`"
echo "Running on `hostname`"
echo "Starting run at: `date`"
# Set up the Gaussian environment using the module command:
module load gaussian
# Run g09
g09 < your_g09_commands.com
Note the use of the module command to set up the environment for running Gaussian. General information about modules is available on the WestGrid modules page. There are specific modules for G09 and G03, named gaussian/g09 and gaussian/g03, respectively. However, G09 is the default, so, one can shorten the module name to gaussian, as in the script above, instead of using gaussian/g09.
Sample TORQUE Script for Running G03
A batch job script for running G03 is shown below. It is the same as the G09 script except that the module name is replaced with gaussian/g03 and the executable name with g03.
#PBS -S /bin/bash
#PBS -l mem=2000MB
#PBS -l nodes=1:ppn=2
# Adjust the mem and ppn above to match the requirements of your job
# Sample Gaussian job script
cd "$PBS_O_WORKDIR"
echo "Current working directory is `pwd`"
echo "Running on `hostname`"
echo "Starting run at: `date`"
# Set up the Gaussian environment using the module command:
module load gaussian/g03
# Run g03
g03 < your_g03_commands.com
Using formchk
The formchk utility converts a binary checkpoint file to a formatted (text) checkpoint file suitable for transfer to another system. To run formchk interactively, first initialize the Gaussian environment with the module command, in the same way as in the batch job examples above, and then give the input and output file names:
formchk input.chk output.fchk
Updated 2010-02-01.