Orcinus System Status

Post date System Status: Update Notes
2015-10-12 - 21:25 PDT Online

Lustre File System Issue

Today, around 12:40 pm (PDT), we experienced a Lustre file system problem. The issue has been resolved and full access to the file system has been restored. Most jobs continued to run. However, please verify your intermediate data and resubmit any lost computations if necessary. We apologize for any inconvenience.

2015-09-21 - 16:46 PDT Online

System fully operational


2015-08-31 - 10:16 PDT Online

System fully operational

Due to a wind storm, Orcinus temporarily lost cooling capability. As a result, we lost power to half of the compute nodes and a significant number of compute jobs were lost. All scheduling activities have been resumed. Please check your data and resubmit your jobs. We apologize for the inconvenience.

2015-08-29 - 17:07 PDT Conditions

Orcinus Knocked Temporarily Offline

Due to a wind storm, Orcinus temporarily lost cooling capability. As a result, we lost power to half of the compute nodes and a significant number of compute jobs were lost. Once all of the nodes have been rebooted and the system is deemed stable, we will resume scheduling.

2015-08-11 - 14:03 PDT Online

Temporary /global/scratch file system issue

This afternoon, we experienced a job-related file system issue. The problem has been resolved, although we are still investigating the cause. As a result, some compute nodes were rebooted and jobs were lost. Please check your data and resubmit your jobs. We apologize for this inconvenience.

2015-07-13 - 11:43 PDT Online

System fully operational

2015-07-11:

Orcinus is fully operational.

2015-05-29 - 15:13 PDT Online

RAC Scheduling Dedication Notice

We are required to allocate a significant portion of the Orcinus QDR partition to a RAC-approved parallel simulation (using up to 4096 cores per single job). We have consequently set a system reservation that takes effect next week.

Please expect some delays when submitting 'long' jobs.
