
Grex System Status

Post date System Status: Update Notes
2017-03-22 - 14:34 PDT Online

System fully operational

Finished on March 22, 2017 - 21:34 GMT

2017-03-20 - 12:16 PDT Downtime Scheduled

Downtime scheduled for Grex login nodes

To install 10Gb Ethernet adapters in the Grex login nodes, we will be reinstalling the nodes on Wednesday, March 22, after 3 PM CST. This will not affect running jobs or user data, but access to a particular login node might be interrupted while it is being reinstalled.

2017-02-07 - 10:59 PST Online

Lustre file system is working again.

The Lustre file system on Grex is working again. Some of the running jobs might have expired while the file system was unavailable.

2017-02-06 - 09:12 PST Testing

Lustre file system is having problems

The Lustre filesystem that serves /global/scratch failed again: another OSS server rebooted at 9:30 AM CST.
The files located on that server's OSTs are unavailable. The job queue is stopped.

2017-02-05 - 08:15 PST Conditions

Lustre file system is having problems

The Lustre filesystem that serves /global/scratch has been restored and is available.

However, there still appears to be an issue with one of Lustre's object storage servers. We will be working on resolving it.

The system is available for user access and job queues are enabled.

2017-02-04 - 17:33 PST Testing

Lustre file system is unavailable

The Lustre filesystem that serves /global/scratch became unresponsive on Grex. We are working on restoring it and identifying the cause.

2017-01-16 - 11:10 PST Online

Grex Home filesystem is back

Most of the running jobs should not have been affected.
