2017-06-13: Resolved: PFS outage. Systems back to normal.

  • Posted on: 13 June 2017
  • By: zao

The PFS file system is having some server problems and is currently not accessible.
Due to this, it is not possible to log in to the login nodes.

The batch queues are suspended as we work on this.

We will update this news entry as we make progress and/or resolve the problem.

2017-06-13 17:45
The failing hardware has been reported to our hardware vendor.

2017-06-14 12:24
We are working with the vendor to resolve the problem.

2017-06-15 13:57
Our vendor has provided a new build of the Lustre server code and we are implementing it.
After all servers have been updated, we will test the system for a while to make sure the new build really fixes our problem.

2017-06-15 15:40
The system is now in a testing phase. We will keep it that way for about an hour to make sure it is stable.

2017-06-15 17:00
Systems are returning to normal production.

We now believe the issue has been resolved. If you encounter issues with jobs started after this point in time, please contact support@hpc2n.umu.se.

 

Updated: 2017-09-21, 11:05