Known Issues on the HTC

This page documents common known issues on the HTC system. It can help with troubleshooting, but it is not a comprehensive list of errors.

Visit our Get Help page to find more resources for troubleshooting.


[General] After submission, a job doesn't run or goes on hold, and shows the error "Job credentials are not available".

Cause:

This is caused by a complicated bug that can occur at random. We’re working on a fix.

Solution:

To work around this issue, run the following command on the access point, then resubmit the job:

echo | condor_store_cred add-oauth -s scitokens -i -
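Before applying the workaround, it can help to confirm that a held job is actually affected by this issue. The following is a minimal sketch, not an official tool: the helper name is_credential_hold is hypothetical, and it assumes the error text appears in the job's HoldReason attribute.

```shell
#!/bin/sh
# Hypothetical helper: succeed if a hold reason contains the
# credentials error described in this section.
is_credential_hold() {
    case "$1" in
        *"Job credentials are not available"*) return 0 ;;
        *) return 1 ;;
    esac
}

# Only query the queue if HTCondor is available on this machine.
if command -v condor_q >/dev/null 2>&1; then
    # JobStatus 5 means "held"; print the hold reason of each held job.
    # Assumption: the error message is stored in HoldReason.
    condor_q -constraint 'JobStatus == 5' -af HoldReason |
    while IFS= read -r reason; do
        if is_credential_hold "$reason"; then
            echo "Credential hold found; apply the workaround and resubmit."
        fi
    done
fi
```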

[Container] When building an Apptainer container, "apt" commands in the %post block fail to run.

Example error message:

Couldn't create temporary file /tmp/apt.conf.9vQdLs for passing config to apt-key

Cause:

The apt command writes temporary files to /tmp, so /tmp inside the container needs global read/write permissions before packages can be updated or installed.

Solution:

Add chmod 777 /tmp at the beginning of your %post block, as in this example:

Bootstrap: docker
From: ubuntu:22.04

%post
    chmod 777 /tmp
    apt-get update -y

We also recommend passing the -y option to apt commands so that the build does not hang on interactive prompts.


[Container] A Docker container fails to run with the error message "[FATAL tini (7)] exec ./myExecutable.sh failed: Exec format error".

Cause:

The Docker container was likely built on an Apple computer with an ARM processor. The resulting ARM image is incompatible with the x86_64 (amd64) Linux machines that run the job.

Solution:

To resolve this, build your Docker container with the following command:

docker build --platform linux/amd64 .
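To check which platform an existing image targets before submitting, something like the following sketch may help. The helper name is_amd64_platform and the image name my-image:latest are placeholders, not part of any official tooling; the script only calls Docker if it is installed.

```shell
#!/bin/sh
# Hypothetical helper: succeed only for a 64-bit x86 Linux platform string.
# $1 is a string such as "linux/amd64" or "linux/arm64".
is_amd64_platform() {
    [ "$1" = "linux/amd64" ]
}

# Placeholder image name; replace with your own.
image="my-image:latest"

if command -v docker >/dev/null 2>&1; then
    # Ask Docker for the OS and architecture recorded in the image.
    platform=$(docker image inspect --format '{{.Os}}/{{.Architecture}}' "$image" 2>/dev/null || true)
    if is_amd64_platform "$platform"; then
        echo "$image targets linux/amd64"
    else
        echo "$image reports '${platform:-unknown}'; rebuild with --platform linux/amd64"
    fi
fi
```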

[GPU] My GPU job has been in the queue for a long period of time and is not starting.

Cause:

Jobs request CentOS 9 by default, but most GPU nodes are currently running CentOS 8, so the job can match only a small number of GPU nodes.

Solution:

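Add the following line to your submit file and resubmit. It allows the job to run on any node with a major OS version above 7, which includes both CentOS 8 and CentOS 9: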

requirements = (OpSysMajorVer > 7)
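As a quick check, the sketch below reports whether a submit file already contains a requirements line that mentions OpSysMajorVer. The helper name has_os_requirement and the file name gpu-job.sub are placeholders chosen for illustration.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the submit file has a requirements line
# that mentions OpSysMajorVer. The match ignores case.
has_os_requirement() {
    grep -Eiq '^[[:space:]]*requirements[[:space:]]*=.*OpSysMajorVer' "$1"
}

# Placeholder submit file name; replace with your own.
submit_file="gpu-job.sub"

if [ -f "$submit_file" ]; then
    if has_os_requirement "$submit_file"; then
        echo "$submit_file already constrains OpSysMajorVer"
    else
        echo "$submit_file has no OpSysMajorVer requirement; consider adding:"
        echo "requirements = (OpSysMajorVer > 7)"
    fi
fi
```

If your submit file already has a requirements line, you will likely need to combine the conditions with && on that line rather than adding a second one, since a later requirements line is expected to replace an earlier one. To see which part of a job's requirements fails to match, condor_q -better-analyze followed by the job ID may also be useful.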



Can’t find your issue?

Visit our Get Help page.