Cedar System Status

Post date | System status | Update notes
2018-06-04 - 08:48 PDT Online

System fully operational

Finished on June 4, 2018 - 8:48 PDT

2018-05-18 - 17:23 PDT Offline

Planned Outage May 28 - June 1

The Cedar System will be unavailable starting May 28 for system maintenance. The expected length of the downtime is 4 days. See http://status.computecanada.ca/ for more details.

2018-02-02 - 10:43 PST Online

Scheduler problem

UPDATE Feb. 02 - The issues have been resolved and Cedar is fully operational. 

The job scheduler is misbehaving, which may impair the ability to start new jobs.

Jobs that are starting or finishing are having trouble returning results to the main scheduling server, so there may be delays in jobs being scheduled or completing. Also, squeue may show incorrect job status until this is resolved.

2018-01-31 - 05:55 PST Conditions

Scheduler problem

The job scheduler is misbehaving, which may impair the ability to start new jobs.

Jobs that are starting or finishing are having trouble returning results to the main scheduling server, so there may be delays in jobs being scheduled or completing. Also, squeue may show incorrect job status until this is resolved.

2018-01-03 - 12:01 PST Online

Cedar storage problem causing I/O errors

Jan 3, 12:54 MT: The problem has been resolved.

Jan 3, 07:30 MT: One of the Cedar storage metadata servers seems to be running out of resources. Users may see the following issues:

  • being unable to log in (connection closed)
  • jobs or sessions crashing with read errors

The vendor has been contacted.

2018-01-03 - 11:54 PST Conditions

Cedar storage problem causing I/O errors

One of the Cedar storage metadata servers seems to be running out of resources. Users may see the following issues:

  • being unable to log in (connection closed)
  • jobs or sessions crashing with read errors

The vendor has been contacted.

2017-12-13 - 13:59 PST Online

Scheduling issue for jobs requesting fewer than 4 GPUs

Dec. 13, 2017: The problem has been fixed.

Dec. 11, 2017: The problem is still evident - we're working on it.

Nov. 30, 2017: Jobs submitted through sbatch should now see the correct GPU(s). Jobs submitted through salloc (interactive jobs) still have a problem.

Nov. 30, 2017: Since the latest upgrade, Slurm may assign the same GPU to more than one job when jobs request fewer than a full node's worth of GPUs (i.e. fewer than 4 GPUs).
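
For users who want to check what their own job was given, the following is a minimal sketch, not an official Cedar tool. It assumes the job requested GPUs through Slurm's GRES mechanism, in which case Slurm normally sets the CUDA_VISIBLE_DEVICES environment variable for the job. The function names visible_gpus and gpu_count_matches are hypothetical. Depending on how the cluster is configured, GPU indices may be renumbered per job, so this only confirms how many GPUs the job sees, not whether another job shares the same physical device.

```python
import os


def visible_gpus(env=None):
    """Return the GPU identifiers listed in CUDA_VISIBLE_DEVICES as strings.

    Assumption: Slurm has set CUDA_VISIBLE_DEVICES for this job. An unset
    or empty variable yields an empty list.
    """
    env = os.environ if env is None else env
    raw = env.get("CUDA_VISIBLE_DEVICES", "")
    return [token.strip() for token in raw.split(",") if token.strip()]


def gpu_count_matches(requested, env=None):
    """True if the number of visible GPUs equals the number requested."""
    return len(visible_gpus(env)) == requested


if __name__ == "__main__":
    # Run inside a job (sbatch or salloc) to see what the job was given.
    print("GPUs visible to this job:", visible_gpus())
```

Running the script at the start of a job and comparing the printed list with the number of GPUs requested gives a quick sanity check before a long computation starts.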
