GROMACS on Glacier

Introduction

Please see the WestGrid GROMACS page for some general remarks about running GROMACS on WestGrid systems. Specific scripts for running on Glacier are shown below. Scripts and data for sample runs are available on Glacier under /global/software/gromacs-3.3/job_examples. The scripts shown below are based on the bash shell, but a C shell example is included in the job_examples directory.

In the combined example shown below, the grompp and mdrun steps are run in a single job. In subsequent sections, scripts are shown to run grompp and mdrun separately. A script to run tpbconv, which processes the output from one mdrun job to prepare input for a subsequent run, is also provided. However, one of the simplest ways to submit GROMACS jobs is to use a master script to run an alternating sequence of mdrun and tpbconv commands, as discussed below.

Chaining GROMACS jobs

One of the simplest ways to submit GROMACS jobs on Glacier is to use the chain_glacier script. To try this, copy all the files from /global/software/gromacs-3.3/job_examples/chain into your own directory and execute:

chain_glacier dppc

Details of the chain_glacier script are discussed elsewhere.
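
Although the details of chain_glacier are documented elsewhere, the idea behind it can be sketched briefly. The script below is a minimal illustration, not the actual chain_glacier script: it assumes the run_grompp_glacier, run_mdrun_glacier.pbs and run_tpbconv_glacier.pbs files described in the following sections, an arbitrary three-segment extension, and TORQUE's afterok job dependencies to enforce the ordering.

#!/bin/bash
# Hypothetical chaining sketch: submit an alternating sequence of mdrun
# and tpbconv jobs, each starting only after the previous one succeeds.
# Assumes dppc.1.tpr already exists (e.g. from: run_grompp_glacier dppc 4).

RUN_BASE_NAME=dppc
PS_TO_EXTEND=1

# Submit the first mdrun segment; qsub prints the new job ID on stdout.
JOBID=$(qsub -v MDRUN_RUN_NAME="${RUN_BASE_NAME}.1" \
    -l nodes=2:ppn=2 run_mdrun_glacier.pbs)

for SEQ in 1 2 3 ; do
    # Prepare input for the next segment once this mdrun has finished.
    JOBID=$(qsub -W depend=afterok:${JOBID} \
        -v MDRUN_RUN_NAME="${RUN_BASE_NAME}.${SEQ}",PS_TO_EXTEND="${PS_TO_EXTEND}" \
        run_tpbconv_glacier.pbs)
    # Run the next segment once tpbconv has written its .tpr file.
    JOBID=$(qsub -W depend=afterok:${JOBID} \
        -v MDRUN_RUN_NAME="${RUN_BASE_NAME}.$((SEQ + 1))" \
        -l nodes=2:ppn=2 run_mdrun_glacier.pbs)
done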

Example combining grompp and mdrun in one job

Since the grompp preprocessor must be run before the main simulation program, mdrun, it may be convenient to combine the two steps in a single job. This also makes it easy to ensure that the number of processors specified in the grompp call matches the number used for mdrun.

Here is a batch job script, gromacs_glacier.pbs, for a job that runs grompp followed by mdrun. The script and data are located on Glacier in the directory /global/software/gromacs-3.3/job_examples/dppc_pme. To run the example, first copy the files from that directory to your own subdirectory. To submit the job to run on 4 processors, use:

qsub -l nodes=2:ppn=2 gromacs_glacier.pbs
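
With nodes=2:ppn=2, TORQUE allocates four processors and writes one line per processor to $PBS_NODEFILE. The script counts those lines to set PBS_NP, so the same processor count (4 here) is passed automatically to both grompp and mdrun.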

Here is the gromacs_glacier.pbs script:

#!/bin/bash
#PBS -S /bin/bash

# Glacier version
# DSP 2006-08-10, 2008-03-13.

cd $PBS_O_WORKDIR

echo "Current working directory is `pwd`"

echo "Node file: $PBS_NODEFILE :"
echo "---------------------"
cat $PBS_NODEFILE
echo "---------------------"
PBS_NP=`/bin/awk 'END {print NR}' $PBS_NODEFILE`
echo "Running on $PBS_NP processors."

echo "Starting run at: `date`"
BINDIR=/global/software/gromacs-3.3/gcc-fftw-3.1-single/bin
# Set up the GROMACS environment.
. ${BINDIR}/GMXRC
# An empty PRECISION selects the single-precision binaries;
# PARALLEL=_mpi selects the MPI-enabled mdrun.
PRECISION=""
PARALLEL=_mpi
MDRUN_RUN_NAME="dppc"

${BINDIR}/grompp${PRECISION} -np ${PBS_NP} -shuffle -sort -f grompp.mdp \
    -p topol.top -c conf.gro -o ${MDRUN_RUN_NAME}.tpr

MY_PROG="${BINDIR}/mdrun${PRECISION}${PARALLEL} \
-deffnm ${MDRUN_RUN_NAME}"

# Using LAM-MPI
export LAMRSH="/usr/bin/ssh -x"
lamboot -v $PBS_NODEFILE

MPIRUN="/usr/bin/lamrun"

$MPIRUN -np $PBS_NP $MY_PROG

lamhalt

echo "Job finished at: `date`"

Running grompp separately

Following is a script for running the GROMACS preprocessor, grompp, on the WestGrid Glacier cluster, glacier.westgrid.ca. The script accepts several arguments. The first is a run name, for example, "dppc", to be used as part of the output file name and, subsequently, as the root of the various file names associated with an mdrun job. The second is the number of processors to be used in the mdrun job, for example, 4. A final, optional, argument is a sequence number to be appended as a suffix to the base run name, which is useful for simulations that are run as a series of jobs. The default sequence number is 1. Here is an example to show how the script can be run:

run_grompp_glacier dppc 4
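
If the optional sequence number is supplied, for example,

run_grompp_glacier dppc 4 2

the output file is named dppc.2.tpr rather than the default dppc.1.tpr.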

Here is the run_grompp_glacier script:

#!/bin/bash

# Script to run grompp to prepare for a subsequent mdrun.
# Glacier version.
# DSP 2006-08-16.

# Input arguments:

# 1 - Run base name (to which sequence number is appended to get
# the basename associated with a particular job)

# 2 - Number of processors to use for subsequent mdrun jobs

# 3 - (Optional) sequence number for the output file, which will
#     have the form run_base_name.sequence_number.tpr (default 1)

# Modify these input file names as appropriate:

INITIAL_MDP_FILE="grompp.mdp"
INITIAL_TOP_FILE="topol.top"
INITIAL_GRO_FILE="conf.gro"

# Location of grompp. PRECISION is empty for the
# single-precision binary; a double-precision build
# would use the suffix _d (grompp_d).

WGSYSTEM="_glacier"
PRECISION=""
BINDIR=/global/software/gromacs-3.3/gcc-fftw-3.1-single/bin
GROMPP=${BINDIR}/grompp${PRECISION}

echo ""
USAGE="Usage: run_grompp${WGSYSTEM} run_base_name pbs_np output_sequence_number."

if [ $# -lt 2 ] ; then
    echo $USAGE
    echo ""
    exit 1
fi

if [ $# -gt 3 ] ; then
    echo $USAGE
    echo ""
    exit 1
fi

if [ $# -eq 3 ] ; then
    SEQUENCE_NUMBER=$3
else
    SEQUENCE_NUMBER=1
fi

RUN_BASE_NAME=$1
PBS_NP=$2

# Define RUN_NAME so that RUN_NAME.tpr will be the
# output from grompp.

RUN_NAME=${RUN_BASE_NAME}.${SEQUENCE_NUMBER}

RUN_GROMPP_LOG_FILE=${RUN_BASE_NAME}_run_grompp.log

# Run grompp
echo "Starting grompp at: `date`" > ${RUN_GROMPP_LOG_FILE}
echo "Running $GROMPP" >> ${RUN_GROMPP_LOG_FILE}
echo "Running $GROMPP"

. ${BINDIR}/GMXRC
$GROMPP -np ${PBS_NP} -shuffle -sort \
-f ${INITIAL_MDP_FILE} \
-p ${INITIAL_TOP_FILE} \
-c ${INITIAL_GRO_FILE} \
-o ${RUN_NAME}.tpr >> ${RUN_GROMPP_LOG_FILE} 2>&1

EXIT_STATUS=$?

echo "Exit status from grompp run: ${EXIT_STATUS}" >> ${RUN_GROMPP_LOG_FILE}
echo "Finished grompp at: `date`" >> ${RUN_GROMPP_LOG_FILE}

if [ $EXIT_STATUS -ne 0 ] ; then
    echo "Error in grompp. Check ${RUN_GROMPP_LOG_FILE}"
    exit 64
else
    echo "Finished grompp successfully."
    exit 0
fi

Submitting an mdrun job

After grompp has been run on the login node, a batch job can be submitted to run mdrun. If the TORQUE script shown below is called run_mdrun_glacier.pbs, it can be submitted to run on 4 processors (2 processors on each of 2 nodes) with:

qsub -v MDRUN_RUN_NAME="dppc.1" -l nodes=2:ppn=2 run_mdrun_glacier.pbs

Here is the run_mdrun_glacier.pbs script:

#!/bin/bash
#PBS -S /bin/bash
#PBS -v MDRUN_RUN_NAME

# Script to initiate a GROMACS run on Glacier
# DSP 2006-08-09, 2008-03-13.

WGSYSTEM="_glacier"

# MDRUN_RUN_NAME should be defined before running the script

if [ "X" = "X${MDRUN_RUN_NAME}" ] ; then
echo "MDRUN_RUN_NAME not defined in run_mdrun${WGSYSTEM}.pbs . Quitting."
exit 64
else
echo "MDRUN_RUN_NAME=${MDRUN_RUN_NAME}"
fi

cd $PBS_O_WORKDIR

echo "Current working directory is `pwd`"

echo "Node file: $PBS_NODEFILE :"
echo "---------------------"
cat $PBS_NODEFILE
echo "---------------------"
PBS_NP=`/bin/awk 'END {print NR}' $PBS_NODEFILE`
echo "Running on $PBS_NP processors."

echo "Starting run at: `date`"
BINDIR=/global/software/gromacs-3.3/gcc-fftw-3.1-single/bin
. ${BINDIR}/GMXRC
PRECISION=""
PARALLEL=_mpi

MY_PROG="${BINDIR}/mdrun${PRECISION}${PARALLEL} \
-deffnm ${MDRUN_RUN_NAME}"

# Using LAM-MPI
export LAMRSH="/usr/bin/ssh -x"
lamboot -v $PBS_NODEFILE

MPIRUN="/usr/bin/lamrun"

$MPIRUN -np $PBS_NP $MY_PROG > run_mdrun_${MDRUN_RUN_NAME}.log 2>&1

lamhalt

echo "Job finished at: `date`"

Post-processing with tpbconv

Here is a script to run tpbconv, which processes the output from one mdrun job to prepare input for a subsequent job. It is presumed that the files from the previous run have names such as dppc.3.gro, dppc.3.trr, etc., formed from a base name (dppc in this example) and a sequence number (3). The job is submitted using a command of the form:

qsub -v MDRUN_RUN_NAME="dppc.3",PS_TO_EXTEND="1" run_tpbconv_glacier.pbs

Here, PS_TO_EXTEND is the number of picoseconds by which to extend the simulation in a subsequent job.
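
When the tpbconv job for dppc.3 completes, it writes dppc.4.tpr, so the next segment of the simulation can be submitted with, for example:

qsub -v MDRUN_RUN_NAME="dppc.4" -l nodes=2:ppn=2 run_mdrun_glacier.pbs

Here is the run_tpbconv_glacier.pbs script: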

#!/bin/bash
#PBS -S /bin/bash
#PBS -v MDRUN_RUN_NAME,PS_TO_EXTEND

# Post-processing after a GROMACS run to prepare
# files for a subsequent run.

# Glacier version
# DSP 2006-08-09

WGSYSTEM="_glacier"

# Set up parameters particular to this run:

# MDRUN_RUN_NAME should be defined before running the script
# and should end in a sequence number.

echo "MDRUN_RUN_NAME=${MDRUN_RUN_NAME}"
echo "PS_TO_EXTEND=${PS_TO_EXTEND}"

if [ "X" = "X${MDRUN_RUN_NAME}" ] ; then
echo "MDRUN_RUN_NAME not defined in run_tpbconv${WGSYSTEM}.pbs . Quitting."
exit 64
else
echo "MDRUN_RUN_NAME=${MDRUN_RUN_NAME}"
FILE_BASE_NAME=${MDRUN_RUN_NAME}
fi

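# Split MDRUN_RUN_NAME (e.g. dppc.3) into its sequence number (the part
# after the last ".") and its base name.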
SEQUENCE_NUMBER=${MDRUN_RUN_NAME##*.}
RUN_BASE_NAME=`basename ${MDRUN_RUN_NAME} .${SEQUENCE_NUMBER}`

echo "RUN_BASE_NAME=${RUN_BASE_NAME}"

echo "Input files for tpbconv have sequence number $SEQUENCE_NUMBER"

if [ "X" = "X${PS_TO_EXTEND}" ] ; then
echo "PS_TO_EXTEND not defined in run_tpbconv${WGSYSTEM}.pbs . Quitting."
exit 65
else
echo "PS_TO_EXTEND=${PS_TO_EXTEND}"
fi

NEW_SEQUENCE_NUMBER=$((SEQUENCE_NUMBER + 1))
NEW_FILE_BASE_NAME=${RUN_BASE_NAME}.${NEW_SEQUENCE_NUMBER}

echo "Starting tpbconv run at: `date`"

cd $PBS_O_WORKDIR

echo "Current working directory is `pwd`"

echo "Node file: $PBS_NODEFILE :"
echo "---------------------"
cat $PBS_NODEFILE
echo "---------------------"
PBS_NP=`/bin/awk 'END {print NR}' $PBS_NODEFILE`
echo "Running on $PBS_NP processors."

BINDIR=/global/software/gromacs-3.3/gcc-fftw-3.1-single/bin
. ${BINDIR}/GMXRC
PRECISION=""

tpbconv${PRECISION} \
    -s ${FILE_BASE_NAME}.tpr \
    -f ${FILE_BASE_NAME}.trr \
    -e ${FILE_BASE_NAME}.edr \
    -o ${NEW_FILE_BASE_NAME}.tpr \
    -extend ${PS_TO_EXTEND} \
    > create_${NEW_FILE_BASE_NAME}.tpr.log \
    2>&1

echo "tpbconv job output written to create_${NEW_FILE_BASE_NAME}.tpr.log"
echo "tpbconv job finished at: `date`"