Data Storage

Introduction

This document is about storing your files on tape and/or hard disk.  Where you store your data depends primarily on whether you are actively using it in association with computations, or just need it for possible future reference.

Compute systems

Each WestGrid compute system has file systems (such as /home and /global/scratch) for short-term storage of the files you need for computations on that system, for as long as you are running those computations.

There may also be local scratch space available on the compute nodes.

On most systems home volumes are backed up and scratch volumes are not. If you rely on backups on a particular system, check the QuickStart Guide for that system to make sure that backups are provided.

The RAC (Resource Allocation Committee) does not allocate storage on compute systems.

This storage space is limited, so there are quotas on disk usage.  Quotas may vary by system.
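To get a rough idea of how close you are to a quota, you can total the sizes of the files under a directory. The following is a minimal sketch in Python, not a WestGrid tool: the function name directory_usage_bytes is hypothetical, and it adds up apparent file sizes, which may differ from the way a particular system accounts for quota (for example, by allocated blocks or file counts). The quota tools described in the QuickStart Guide for your system remain the authoritative source.

```python
import os
import stat


def directory_usage_bytes(root):
    """Return the total apparent size, in bytes, of regular files under root.

    Hypothetical helper for illustration; the result may differ from the
    figure a system's own quota accounting reports.
    """
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                # lstat does not follow symbolic links, so link targets
                # outside the tree are not counted.
                info = os.lstat(path)
            except OSError:
                # The file vanished or is unreadable; skip it.
                continue
            if stat.S_ISREG(info.st_mode):
                total += info.st_size
    return total
```

For example, directory_usage_bytes(os.path.expanduser("~")) / 1024**3 gives the approximate size of your home directory in GiB.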

Please move data that you are not actively using to another location, or remove it.
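One way to find candidates for moving or removal is to list files that have not been modified for some time. The sketch below is an illustration only: the function name files_not_modified_since and the choice of modification time as the test for "not actively used" are assumptions, and files that are read often but never changed would still be listed.

```python
import os
import time


def files_not_modified_since(root, days):
    """Return (path, size_in_bytes) pairs for files under root whose
    modification time is more than `days` days in the past.

    Hypothetical helper for illustration; modification time is only a
    rough proxy for whether a file is still in active use.
    """
    cutoff = time.time() - days * 86400  # 86400 seconds per day
    stale = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                info = os.lstat(path)
            except OSError:
                # The file vanished or is unreadable; skip it.
                continue
            if info.st_mtime < cutoff:
                stale.append((path, info.st_size))
    return stale
```

Calling files_not_modified_since("/global/scratch/username", 90), where the path is a placeholder for your own directory, would return the files untouched for roughly three months. Review the list before moving or deleting anything.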

Refer to the storage section of the QuickStart Guide for the particular system on which you are working for more information on such things as the amount of local storage, quotas, and backup policies.

Storage facilities

Please keep only active files on the compute systems; store other files on the storage systems.

WestGrid has two large storage facilities: Silo at USask and Bugaboo at SFU.  The previous storage facility, Gridstore, is being repurposed in early 2010. Silo is newer and has more space available, so we recommend you use Silo for new storage. Users who actively need large amounts of disk (hundreds of GB, say) in association with their day-to-day computations should consider using Bugaboo.

To transfer files from a WestGrid compute site to a WestGrid storage site, we provide an efficient file transfer utility called gcp.

If you plan to transfer a large amount of data, please contact support first.

The storage facilities provide several levels of protection against data loss.  Different volumes have different policies on the number of backup copies kept.  Since keeping more copies is expensive in terms of storage resources used, volumes that keep more copies have more usage restrictions.  For some volumes you will need an allocation from the RAC to use them.  For all volumes, there are limits on the resources you can use without a RAC allocation.

If you store data on the storage facilities, please delete the copies on compute systems to free up those resources. 
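Before deleting a copy on a compute system, it is prudent to confirm that the stored copy is intact. The following is a minimal sketch, not a WestGrid procedure: the function names sha256_of_file and copies_match are hypothetical, and the example assumes both copies can be read from the machine where the script runs. If they cannot, the digest can be computed separately at each site and the two values compared by hand.

```python
import hashlib


def sha256_of_file(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of the file at path.

    The file is read in chunks so that large files do not have to fit
    in memory. Hypothetical helper for illustration.
    """
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops when read() returns empty bytes.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copies_match(original, copy):
    """Return True if both files have the same SHA-256 digest."""
    return sha256_of_file(original) == sha256_of_file(copy)
```

Only remove the compute-system copy once copies_match returns True for it and its stored counterpart.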

Storage Policies

There is a WestGrid Storage Policy document that we encourage you to read.  The policy document covers appropriate use, allocations, quotas, backups, archives, ownership of data, retention of data, and risks of data loss.


Updated 2010-01-07.