Known Issues on the HTC
This page documents some common and known issues encountered on the HTC system. While this page can be beneficial in troubleshooting, it does not contain a comprehensive list of errors.
Visit our Get Help page to find more resources for troubleshooting.
Table of Contents
[General] When submitting a job, it doesn't run / goes on hold and shows the error "Job credentials are not available".
Cause:
This is a complicated bug that can strike randomly. We’re working on a fix.
Solution:
To work around this issue, run the following command on the access point before resubmitting the job.
echo | condor_store_cred add-oauth -s scitokens -i -
[Container] When building an Apptainer, "apt" commands in the %post block fail to run.
Example error message:
Couldn't create temporary file /tmp/apt.conf.9vQdLs for passing config to apt-key
Cause:
The container needs global read/write permissions in order to update or install packages using the apt
command.
Solution:
Add chmod 777 /tmp
to the front of your %post
block. See the example below:
Bootstrap: docker
From: ubuntu:22.04
%post
chmod 777 /tmp
apt-get update -y
We also recommend using the -y
option to prevent installation from hanging due to interactive prompts.
[Container] When attempting to run a Docker container, it fails with the error message "[FATAL tini (7)] exec ./myExecutable.sh failed: Exec format error".
Cause:
The Docker container is likely built on an Apple computer using an ARM processor, which is incompatible with Linux machines.
Solution:
To resolve this, when building your Docker container, use the command:
docker build --platform linux/amd64 .
[GPU] My GPU job has been in the queue for a long period of time and is not starting.
Cause:
Jobs default to using CentOS9, but most GPU nodes are currently running CentOS8.
Solution:
To your submit file, add the following line and resubmit:
requirements = (OpSysMajorVer > 7)
Can’t find your issue?
Visit our Get Help page.