System Notices
Tue Dec 17 Nexus Cluster Outage - Wed Dec 18 Back in production
Tue Dec 17 Nexus Cluster Outage:
Nexus and all SGI machines at UofA are inaccessible due to hardware
problems on a disk array serving global file systems. This equipment is
no longer covered by any warranty or support agreement with the vendor.
We are trying to revive the faulty hardware and will announce
machine availability to users as soon as the problem is fixed.
UPDATE 16:29 PST 17 DEC
The file system has been fixed.
"We think that machines will be back up tomorrow around noon. We are
taking this opportunity to clean up our CXFS settings."
UPDATE 11:15 PST 18 DEC
Nexus, all SGI machines, and filesystems are back in production.
Tue Dec 22 09:00:00 MST: Scheduled outage on Checkers cluster. (Back in production)
Update Tue Dec 22 15:30:00 MST:
Checkers cluster is back in production.
Tue Dec 22 09:00:00 MST:
There will be a scheduled outage on the Checkers cluster in order to
effect changes to the network file system.
Jobs that cannot finish before this outage will not be started
until after the outage is complete.
Nov 26 10:30 - 15:00 MST, Cortex complex compute nodes Guanine and Adenine outage due to UPS hardware failure
Nov 26 10:30AM MST
A UPS powering Guanine and Adenine experienced a spectacular hardware failure.
Jobs running on Adenine and Guanine during the UPS failure were affected.
Guanine and Adenine were powered off as a result.
Update 15:00 MST:
Guanine and Adenine are back in production running on utility power.
Orcinus is back online
Orcinus is Offline for Scheduled Maintenance
Nov 14, 12:05 - Bugaboo available again
Bugaboo downtime extended until Sat, Nov 14, 12:00
Because of complications during the hardware and software upgrade, we are forced to extend the Bugaboo downtime until noon (Pacific) on Saturday, Nov. 14.
Tue, Dec 1, 2009 through Sun, Dec 6, 2009 - Matrix cluster unavailable due to major system upgrade
The Matrix cluster will be unavailable from Tuesday, December 1, 2009 through the following weekend, in order to undertake a major system upgrade. The system should be available again on Monday, December 7, 2009.
A scheduling reservation will be put in place that will prevent jobs from starting if the specified walltime would extend into the maintenance period. If there are jobs you can fit in during the days leading up to the shutdown by successively reducing the walltime to fit the shrinking window of opportunity, please do so.
If there are any jobs still running on the morning of Dec 1, they will be killed.
Jobs that are in the input queue at the time of the system shutdown should be handled normally by the batch scheduling software when the system is brought back up after the upgrade.
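As a rough illustration of the reservation rule described above, the following Python sketch checks whether a job with a given walltime would finish before the maintenance period, and computes the largest walltime that still fits. It is not the batch scheduler's actual logic. The function names are hypothetical, and the time of day used for the start of the maintenance period is an assumption, since the notice gives only the date.

```python
from datetime import datetime, timedelta

# Assumed start of the maintenance period. The notice gives only the date
# (Dec 1, 2009); the time of day used here is a placeholder.
MAINTENANCE_START = datetime(2009, 12, 1, 0, 0)


def can_start(now, walltime, maintenance_start=MAINTENANCE_START):
    """Return True if a job started at `now` would end before the outage."""
    return now + walltime <= maintenance_start


def max_walltime(now, maintenance_start=MAINTENANCE_START):
    """Largest walltime that still fits before the outage (zero if none)."""
    return max(maintenance_start - now, timedelta(0))


if __name__ == "__main__":
    now = datetime(2009, 11, 29, 12, 0)
    # A 48-hour request would run into the maintenance period.
    print(can_start(now, timedelta(hours=48)))
    # Reducing the request to the remaining window lets the job start.
    print(can_start(now, max_walltime(now)))
```

Under these assumptions, a job submitted at noon on Nov 29 fits only if its requested walltime is 36 hours or less, which is why reducing the walltime as the shutdown approaches lets shorter jobs still run.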
