Orcinus System Status

Post date System Status: Update Notes
2016-10-14 - 13:37 PDT Conditions

Windstorm and Power Fluctuations/Outages

 As a result of the current windstorms, we suffered a brief power outage; consequently, some of the compute nodes powered off and some jobs were lost. Please check your data and resubmit any lost work. We are sorry for the storm's interruption. Until the storm passes, we have suspended the scheduling of new jobs, and we will do our best to keep everything else operational.

2016-10-05 - 02:15 PDT Online

Tuesday, October 4, 2016 12:00 PM (PDT)

 A UBC campus-wide power glitch caused multiple Orcinus nodes to lose
 power. Multiple jobs were lost.


 Please examine your files and resubmit the lost computations.
 We apologize for the inconvenience.

2016-09-27 - 12:36 PDT Online

Orcinus down with Lustre file system problem Monday night/Tuesday morning

2016-09-26/27 (Monday night/Tuesday morning) - Orcinus is down with what appears to be a Lustre file system problem.  A system administrator has been looking at the problem and will update this message when more information is available.  Sorry for the inconvenience.

Monday, September 26, 10:30 PM (PDT)

 Last night Orcinus experienced a Lustre file system failure (/global/scratch).
 Most of the jobs running on the system were lost. Please examine your
 files and resubmit the lost computations. We apologize for the inconvenience.

 

2016-09-27 - 03:23 PDT Offline

Orcinus down with Lustre file system problem Monday night/Tuesday morning

2016-09-26/27 (Monday night/Tuesday morning) - Orcinus is down with what appears to be a Lustre file system problem.  A system administrator has been looking at the problem and will update this message when more information is available.  Sorry for the inconvenience.

2016-06-08 - 02:23 PDT Online
2016-04-20 - 11:25 PDT Online

Rack Cooling Maintenance

System is fully operational.

2016-04-19 - 10:57 PDT Conditions

Rack Cooling Maintenance

Please note that the scheduler for Orcinus is temporarily paused while maintenance is being performed on the cooling systems for various racks in the cluster.

We will restore regular scheduling operations as soon as the maintenance has been completed.
