Glacier QuickStart Guide

About this QuickStart Guide

This QuickStart guide provides a brief overview of the WestGrid Glacier facility, indicating its role within WestGrid and highlighting some of the features that distinguish it from other WestGrid resources. It is intended to be read by new WestGrid account holders and by current users considering whether to move to the Glacier system.
For more detailed information about the Glacier hardware and performance characteristics, available software, usage policies and how to log in and run jobs, follow the links given below.

Introduction

Glacier is an IBM eServer cluster of 840 nodes connected by a Gigabit Ethernet (GigE) network. The system is best suited to serial jobs and to parallel jobs that do not require a fast interconnect fabric and can run on a 32-bit architecture.

The address glacier.westgrid.ca is an alias for 3 head nodes:

nunatak1.westgrid.ca
nunatak2.westgrid.ca
nunatak3.westgrid.ca

(Nunatak is an Inuktitut word meaning "lonely peak", a rock or mountain rising above ice.)

Hardware

Processors

The Glacier cluster has 840 computational nodes, each with two 3.06 GHz Intel Xeon 32-bit processors. The nodes are assigned names of the form ice{i}_{j}, where i = [1-60] is a chassis number and j = [1-14] denotes a blade inside the chassis. Nodes in chassis 1-54 have 2 GB of RAM, and nodes in chassis 55-60 have 4 GB.
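The naming scheme can be made concrete with a short shell sketch. The function name list_glacier_nodes is hypothetical (it is not a tool installed on Glacier); it simply expands the ice{i}_{j} pattern described above.

```shell
# Hypothetical helper: print every compute node name, one per line,
# following the ice{i}_{j} pattern (60 chassis x 14 blades).
list_glacier_nodes() {
    chassis=1
    while [ "$chassis" -le 60 ]; do
        blade=1
        while [ "$blade" -le 14 ]; do
            echo "ice${chassis}_${blade}"
            blade=$((blade + 1))
        done
        chassis=$((chassis + 1))
    done
}

# Count all nodes, then only the 4 GB nodes (chassis 55-60).
list_glacier_nodes | wc -l
list_glacier_nodes | grep -c -E '^ice(5[5-9]|60)_'
```

The two counts should agree with the figures in this guide: 840 nodes in total, of which 84 are in chassis 55-60.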

Interconnect

The interconnect between blades within a chassis is Gigabit Ethernet (GigE). The chassis are connected through four GigE uplinks.

Storage

Storage space is provided through IBM's General Parallel File System (GPFS) - a high-performance shared-disk file system that can provide fast data access from all nodes. A Storage Area Network (SAN) with almost 14 TB of disk space connected directly to 8 storage nodes (moraine1,...,moraine8) is used to fulfill I/O requests from all nodes.

There are two general-access file systems available on Glacier with different characteristics and purposes, as summarized here:

/global/home

  • /global/home/username is your home directory (assigned to the HOME environment variable).
  • Disk space is limited, so please use this file system to store only your essential data (source code, processed results if "small" in size, etc.).
  • If your code creates large data sets, do not use this file system as a starting directory for your jobs. Please use /global/scratch instead.
  • We back up the /global/home file system every 36 hours, with a 14-day expiration policy.
  • Size: 11 TB.

/global/scratch

  • File system designed for fast-changing "large" data sets and as a work area.
  • Please create a subdirectory of your choice (cd /global/scratch ; mkdir username) and use it as a starting directory for your jobs.
  • Note: We do not back up this file system.
  • Size: 11 TB.

In addition to the above file systems, each compute node has a local partition of approximately 35 GB for temporary files associated with running jobs. On the compute nodes, this temporary storage is accessible as either /scratch or /tmp; both paths point to the same space. For jobs that use many small files (say, a few MB each), use this local scratch storage instead of /global/scratch, which is optimized for large files.

Software

See the main WestGrid software page for tables showing the installed software on Glacier and other WestGrid systems, including information about the operating system and compilers.

Using Glacier

To log in to Glacier, connect to glacier.westgrid.ca using an ssh (secure shell) client. For more information about connecting and setting up your environment, see Setting up Your Computer.

As on other WestGrid systems, batch jobs are handled by a combination of TORQUE and Moab software. For more information about submitting jobs, see Running Jobs. Please note that the maximum walltime limit per job on Glacier is 240 hours.

To facilitate testing and debugging, two Glacier nodes are reserved for short jobs (10 minutes or less). To request these nodes, add the debug Quality of Service (QOS) resource request to your job script:

#PBS -l qos=debug,walltime=00:10:00
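To show where the directive sits in a complete job script, here is a minimal sketch. The helper make_debug_job and the program name ./my_test_program are hypothetical; PBS_O_WORKDIR is the standard TORQUE variable holding the directory from which qsub was run.

```shell
# Hypothetical helper: print a minimal debug job script that runs COMMAND.
# Usage: make_debug_job COMMAND > debug_job.pbs
make_debug_job() {
    cat <<EOF
#!/bin/sh
#PBS -l qos=debug,walltime=00:10:00

# TORQUE sets PBS_O_WORKDIR to the directory qsub was run from.
cd "\$PBS_O_WORKDIR" || exit 1
$1
EOF
}

# Print the script for a placeholder program.
make_debug_job "./my_test_program"
```

Redirect the output to a file such as debug_job.pbs and submit it with qsub debug_job.pbs.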

To improve the startup time of parallel jobs, use the parallel QOS request:

#PBS -l qos=parallel

Please note the following details regarding the QOS requests:

debug QOS : Maximum 4 cpus; maximum walltime 10 minutes; uses nodes ice1_1 and ice1_2 - see associated memory limits below.

parallel QOS: Minimum 4 cpus; maximum walltime 240 hours; uses all nodes except those reserved for QOS:debug.

normal QOS: Maximum walltime 240 hours; uses all nodes except those reserved for QOS:debug.
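The limits above can be checked before submission with a small shell function. This is only a sketch: check_qos_request is a hypothetical name, it is not part of TORQUE or Moab, and it encodes nothing beyond the CPU and walltime limits listed above.

```shell
# Hypothetical pre-submission check (not part of TORQUE or Moab).
# Usage: check_qos_request QOS NCPUS WALLTIME_MINUTES
# Returns 0 if the request is within the limits listed above, 1 otherwise.
check_qos_request() {
    qos=$1
    ncpus=$2
    minutes=$3
    max_minutes=$((240 * 60))   # 240-hour walltime limit, in minutes
    case "$qos" in
        debug)
            [ "$ncpus" -le 4 ] && [ "$minutes" -le 10 ] ;;
        parallel)
            [ "$ncpus" -ge 4 ] && [ "$minutes" -le "$max_minutes" ] ;;
        normal)
            [ "$minutes" -le "$max_minutes" ] ;;
        *)
            echo "unknown QOS: $qos" >&2
            return 1 ;;
    esac
}

# An 8-CPU request is too large for the debug QOS but fine for parallel.
check_qos_request debug 8 10 || echo "debug: request exceeds limits"
check_qos_request parallel 8 600 && echo "parallel: request is within limits"
```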

Memory specification

A default memory limit of 768 MB is assigned to each job. To override this value use the mem resource request on the qsub command line or batch job script. For example:

#PBS -l mem=1024mb

The maximum value of the mem parameter for a single processor is:

2007mb for the 756 nodes (90% of the cluster) in racks 1-9 (ice1_1,...,ice54_14) and
4005mb for the 84 nodes (10% of the cluster) in rack 10 (ice55_1,...,ice60_14).

The mem parameter is the total memory limit for a job. For a parallel job, the pmem parameter can be used to specify a per-process memory requirement. For example:

#PBS -l nodes=10,mem=20gb,pmem=2gb

means that the submitted job needs 10 processors and 20 GB of memory in total, with 2 GB of RAM per process. Since 2gb (2048 MB) exceeds 2007mb, this job can run only on nodes ice55_1,...,ice60_14. Such a job can be expected to wait in the input queue longer than a job that could run on one of the smaller-memory nodes.
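The same comparison can be written as a short shell sketch. The function nodes_for_pmem is hypothetical (it is not a scheduler command); it only compares a per-process request in MB against the 2007mb and 4005mb limits quoted above.

```shell
# Hypothetical helper: report which nodes can satisfy a per-process
# memory request given in MB, using the limits quoted above.
nodes_for_pmem() {
    pmem_mb=$1
    if [ "$pmem_mb" -le 2007 ]; then
        echo "all nodes (ice1_1 ... ice60_14)"
    elif [ "$pmem_mb" -le 4005 ]; then
        echo "large-memory nodes only (ice55_1 ... ice60_14)"
    else
        echo "no node can satisfy ${pmem_mb}mb per process" >&2
        return 1
    fi
}

nodes_for_pmem $((2 * 1024))   # pmem=2gb is 2048 MB
nodes_for_pmem 2000            # just under the 2007mb limit
```

Requesting pmem=2000mb instead of pmem=2gb, when the application allows it, keeps the job eligible for the 756 smaller-memory nodes as well.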


Updated 2008-10-22.