Center for High Throughput Computing

User News

Below is a list of important user news updates, sorted by date. Please stay up to date with the news that is relevant to you, as CHTC policy changes may affect your jobs.

For older updates not shown on this page, see our user mailing list archives.


Brief outage of some HPC Cluster nodes yesterday evening

Wednesday, June 5, 2019

Greetings,

Following the earlier (early May) campus water outage, there was a temporary cooling issue yesterday evening in the datacenter hosting the HPC Cluster. This triggered an automated shutdown of some HPC Cluster nodes to keep the datacenter within its operating temperature. These nodes were rebooted this morning, after the campus physical plant performed some maintenance overnight. Jobs that were interrupted by the brief shutdown will need to be resubmitted.

Please email us at chtc@cs.wisc.edu with any concerns or questions.

Best,

Your CHTC Team


Ongoing CHTC Gluster semi-outage

Wednesday, June 5, 2019

Greetings CHTC users,

This message is for users of the high throughput (HTC) system who also use the Gluster file system to stage their data.

We've been unable to completely fix the issues that brought down the Gluster file system over the weekend, and fully resolving them will take some time. In the meantime, we want to give CHTC users the option to continue running work, rather than shutting down Gluster completely while we work on a long-term solution. Gluster was therefore brought back up on Monday, and jobs that use Gluster have been allowed to run. Note that some jobs that use Gluster may still fail to access files there. A typical error when this happens looks like:

"Transport endpoint is not connected"
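
For jobs that read from Gluster inside a Python script, the following is a minimal sketch of one way to recognize this error and retry a few times before giving up. It assumes the failure reaches the script as an OSError with the Linux error code ENOTCONN, which is the code this message normally corresponds to. The names `is_endpoint_disconnected` and `read_with_retry`, the retry count, and the delay are illustrative choices, not part of any CHTC tooling.

```python
import errno
import time


def is_endpoint_disconnected(exc):
    """Return True if exc is an OSError carrying ENOTCONN.

    Assumption: on Linux, ENOTCONN is the error code reported as
    "Transport endpoint is not connected".
    """
    return isinstance(exc, OSError) and exc.errno == errno.ENOTCONN


def read_with_retry(path, attempts=3, delay_seconds=30.0, opener=open):
    """Read a file as bytes, retrying only on ENOTCONN.

    `opener` is a hypothetical hook (defaults to the built-in open) so the
    retry logic can be exercised without a real Gluster mount.
    """
    for attempt in range(1, attempts + 1):
        try:
            with opener(path, "rb") as handle:
                return handle.read()
        except OSError as exc:
            # Re-raise unrelated errors, and the final ENOTCONN failure.
            if not is_endpoint_disconnected(exc) or attempt == attempts:
                raise
            time.sleep(delay_seconds)
```

A call such as `read_with_retry("/path/to/gluster/input.dat")` (a placeholder path) would then either return the file contents or raise the last error after three attempts. Retrying only helps with intermittent failures; it does not protect a job against a longer outage.
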

Thanks for your patience as we work out a long-term solution. If you have longer jobs or would lose significant progress without consistent access to Gluster, feel free to reach out to us at chtc@cs.wisc.edu to discuss potential solutions.

Best,

Your CHTC Team


CHTC Gluster Outage

Friday, May 31, 2019

Hi Everyone,

This message is for users of the high throughput (HTC) system who also use Gluster to stage their data.

The Gluster file system is currently down and will likely remain down for at least the remainder of the weekend. We have no indication that any data has been lost at this point, but jobs that depend on reading from, writing to, or running executables from Gluster will likely see failures due to the outage.

Jobs that do not depend on Gluster should not be impacted by this outage.

Please continue to direct questions to chtc@cs.wisc.edu, and have a great weekend!

Cheers,

Your CHTC Team