Orcinus System Status

Post date System Status: Update Notes
2017-12-02 - 20:00 PST Offline

Power and cooling issues

Dec. 2, 2017 10:30 AM PST

Around 10:30 AM PST, the UBC campus experienced a short power outage. As a result, the Orcinus cluster lost cooling. All running jobs were lost. Power and cooling have been restored, and we are working to return the system to operational status as soon as possible.

2017-08-15 - 02:16 PDT Online

Orcinus fully operational

Orcinus is fully operational. However, some nodes are down as a result of the last power outage. We are trying to bring these nodes back into operation. Because of the lack of warranty and the difficulty of finding spare parts, this can be a lengthy process. Sorry for any delays in scheduling. Our current count is 9340 cores.

2017-08-05 - 17:17 PDT Online

Power Outage at UBC Campus

On Thursday, Aug. 3, 2017, around 4:15 PM (PDT), Orcinus experienced a power outage.

All running jobs were lost. Please examine your intermediate data and resubmit the lost computations.

Full operation was restored on the afternoon of Friday, Aug. 4. We are sorry for any inconvenience.

2017-07-28 - 14:16 PDT Online

File system issue

This morning we experienced a file system problem. We are currently working to resolve the issue. Sorry for the inconvenience. We will try to restore full operations as quickly as possible.

1:30 PM (PDT)

The issue with the Lustre FS has been resolved. Some of the running jobs failed.

Please examine your intermediate data and resubmit the lost computations.

2017-07-28 - 10:58 PDT Offline

File system issue

This morning we experienced a file system problem. We are currently working to resolve the issue. Sorry for the inconvenience. We will try to restore full operations as quickly as possible.

2017-07-10 - 09:57 PDT Online

One of the login nodes crashed

The "seawolf3" login node has crashed.

UPDATE - the login node ("seawolf3") was rebooted and is now working.
