Troubleshooting in ChtcRun

If you use our "ChtcRun" tools, the log, output, and error files are set up in advance by the mkdag command. You can find these files for each job in that job's individual output directory. Two files are most useful for troubleshooting:

process.log

process.log is the job's log file and contains all the information described for log files above.

ChtcWrapper###.out

This is the job's output file, also described above. The ### in the name is replaced for each job (ChtcWrapperjob1.out in the example below). The file is highly structured and contains the following information:

  • Where the job runs:
    Running here: e032.chtc.wisc.edu
  • Working directory before the job runs: you should see the appropriate Matlab/R/Python package and any input files and scripts your job needs.
    *********** SANDBOX SPACE size: Before your job runs. Shows all files before 
    the job runs *************
    415M    .
    total 27228
    -rw-r--r--  1 slot1 slot1      233 Apr  1 08:32 AuditLog.job1
    -rw-r--r--  1 slot1 slot1      118 Apr  1 08:32 CURLTIME_4820
    -rw-r--r--  1 slot1 slot1  1366788 Apr  1 08:32 ChtcWrapperjob1.out
    drwxr-xr-x 15 slot1 slot1     4096 May 19  2014 R-3.1.0
    lrwxrwxrwx  1 slot1 slot1       16 Apr  1 08:32 RLIBS.tar.gz -> sl6-RLIBS.tar.gz
    drwxr-xr-x  3 slot1 slot1     4096 Apr  1  2015 RR
    -rw-r--r--  1 slot1 slot1  2677087 Apr  1 08:32 SLIBS2.tar.gz
    drwxr-xr-x  2 slot1 slot1     4096 Mar 19 10:10 SS
    -rw-r--r--  1 slot1 slot1      317 Apr  1 08:32 _condor_stderr
    -rw-r--r--  1 slot1 slot1      899 Apr  1 08:32 _condor_stdout
    -rwxr-xr-x  1 slot1 slot1    45862 Apr  1 08:32 chtcinnerwrapper
    -rwxr-xr-x  1 slot1 slot1     5393 Apr  1 08:32 condor_exec.exe
    -rw-r-----  1 slot1 slot1    81935 Apr  1 08:32 gapminderDataFiveYear.txt
    -rw-r--r--  1 slot1 slot1      137 Apr  1 08:32 sl6-R-3.1.0_INFO
    -rw-r--r--  1 slot1 slot1 23644643 Apr  1 08:32 sl6-RLIBS.tar.gz
    -rw-r--r--  1 slot1 slot1     1543 Apr  1 08:32 sleep.R
    drwxr-xr-x  2 slot1 slot1     4096 Apr  1 08:32 temp
  • Output from the job: anything your job would normally print to the screen.
    *********** YOUR JOB OUTPUT BELOW *************
    Loading required package: methods
    Saving 7 x 7 in image
    [1] TRUE
    *********** YOUR JOB OUTPUT ABOVE *************
  • Working directory after the job ran: you should see your output files alongside the files listed before the run (Rplots.pdf in this example).
    *********** SANDBOX SPACE size: After your job runs. Shows all files after 
    the job ran *************
    415M    .
    total 27M
    -rw-r--r--  1 slot1 slot1  390 Apr  1 08:33 AuditLog.job1
    -rw-r--r--  1 slot1 slot1  118 Apr  1 08:32 CURLTIME_4820
    -rw-r--r--  1 slot1 slot1 1.4M Apr  1 08:33 ChtcWrapperjob1.out
    drwxr-xr-x 15 slot1 slot1 4.0K May 19  2014 R-3.1.0
    lrwxrwxrwx  1 slot1 slot1   16 Apr  1 08:32 RLIBS.tar.gz -> sl6-RLIBS.tar.gz
    drwxr-xr-x  3 slot1 slot1 4.0K Apr  1  2015 RR
    -rw-r--r--  1 slot1 slot1 4.9K Apr  1 08:33 Rplots.pdf
    -rw-r--r--  1 slot1 slot1 2.6M Apr  1 08:32 SLIBS2.tar.gz
    drwxr-xr-x  2 slot1 slot1 4.0K Mar 19 10:10 SS
    -rw-r--r--  1 slot1 slot1  317 Apr  1 08:32 _condor_stderr
    -rw-r--r--  1 slot1 slot1  899 Apr  1 08:32 _condor_stdout
    -rwxr-xr-x  1 slot1 slot1  45K Apr  1 08:32 chtcinnerwrapper
    -rwxr-xr-x  1 slot1 slot1 5.3K Apr  1 08:32 condor_exec.exe
    -rw-r-----  1 slot1 slot1  81K Apr  1 08:32 gapminderDataFiveYear.txt
    -rw-r--r--  1 slot1 slot1  137 Apr  1 08:32 sl6-R-3.1.0_INFO
    -rw-r--r--  1 slot1 slot1  23M Apr  1 08:32 sl6-RLIBS.tar.gz
    -rw-r--r--  1 slot1 slot1 1.6K Apr  1 08:32 sleep.R
    drwxr-xr-x  2 slot1 slot1 4.0K Apr  1 08:32 temp
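
Because the ChtcWrapper output is structured around these banners, it can be scanned with a short script instead of by eye. The following is a minimal sketch, not part of ChtcRun: it assumes the banner text matches the sample above, and the helper names (extract_job_output, listed_names) and the shortened SAMPLE excerpt are hypothetical, introduced only for illustration. It pulls out what the job printed and compares the "before" and "after" listings to show which files the job created.

```python
import re

# Banner fragments copied from the sample ChtcWrapper output shown above.
# Assumption: real files use the same wording.
OUTPUT_START = "YOUR JOB OUTPUT BELOW"
OUTPUT_END = "YOUR JOB OUTPUT ABOVE"
BEFORE_MARK = "SANDBOX SPACE size: Before"
AFTER_MARK = "SANDBOX SPACE size: After"

# Matches the permission column of an `ls -l` entry, e.g. "-rw-r--r--".
LS_ENTRY = re.compile(r"^[-dl][-rwxsStT]{9}\s")


def extract_job_output(lines):
    """Return the lines printed between the two job-output banners."""
    captured, inside = [], False
    for line in lines:
        if OUTPUT_START in line:
            inside = True
        elif OUTPUT_END in line:
            break
        elif inside:
            captured.append(line.strip())
    return captured


def listed_names(lines, marker):
    """Collect file names from the directory listing that follows `marker`."""
    names, inside = set(), False
    for raw in lines:
        line = raw.strip()
        if marker in line:
            inside = True
            continue
        if not inside:
            continue
        if line.startswith("***"):
            break  # the next banner ends this listing
        if LS_ENTRY.match(line):
            fields = line.split(None, 8)
            if len(fields) == 9:
                # For symlinks, keep only the link name, not its target.
                names.add(fields[8].split(" -> ")[0])
    return names


# Shortened excerpt of the sample output above (hypothetical test input).
SAMPLE = """\
*********** SANDBOX SPACE size: Before your job runs. Shows all files before
the job runs *************
415M    .
total 27228
lrwxrwxrwx  1 slot1 slot1       16 Apr  1 08:32 RLIBS.tar.gz -> sl6-RLIBS.tar.gz
-rw-r--r--  1 slot1 slot1     1543 Apr  1 08:32 sleep.R
*********** YOUR JOB OUTPUT BELOW *************
Loading required package: methods
Saving 7 x 7 in image
[1] TRUE
*********** YOUR JOB OUTPUT ABOVE *************
*********** SANDBOX SPACE size: After your job runs. Shows all files after
the job ran *************
415M    .
total 27M
lrwxrwxrwx  1 slot1 slot1   16 Apr  1 08:32 RLIBS.tar.gz -> sl6-RLIBS.tar.gz
-rw-r--r--  1 slot1 slot1 4.9K Apr  1 08:33 Rplots.pdf
-rw-r--r--  1 slot1 slot1 1.6K Apr  1 08:32 sleep.R
"""

if __name__ == "__main__":
    sample_lines = SAMPLE.splitlines()
    print("Job output:", extract_job_output(sample_lines))
    created = listed_names(sample_lines, AFTER_MARK) - listed_names(sample_lines, BEFORE_MARK)
    print("Files created by the job:", sorted(created))
```

To use it on a real job, read the lines of that job's ChtcWrapper output file in place of SAMPLE. If the set of created files is empty, the job probably did not write the output you expected, and the job output section is the next place to look for an error message.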