Running Matlab Jobs on CHTC
The examples and information in this guide work best for the below cases*:
(*see the HTCondor software manual and online examples from other organizations for cases outside of those we cover on the CHTC website)
- Submission to an HTCondor System with file transfer (rather than a shared filesystem).
- Submission to an HTCondor System that is unix-based (Linux or Mac operating system, as Windows may have important differences).
To best understand the below information, users should already have an understanding of:
Like most programs, Matlab is not installed on CHTC's high throughput compute system.
One way to run Matlab where it isn't installed is to compile your Matlab
files into a binary file and run that binary using a set of files called the Matlab
runtime. To run Matlab in CHTC, it is therefore necessary to perform the
following steps, which are detailed in the guide below:
- Prepare your Matlab program
- Write a submit file that uses the compiled code and script
If your Matlab code depends on random number generation, using
a function like randperm, please see the section on ensuring randomness below.
1. Preparing Your Matlab Program
You can compile .m files into a Matlab binary yourself by requesting
an interactive session on one of our build machines.
The session is essentially a job without an executable;
you are the one running the commands instead (in this case, to compile the code).
Instructions for submitting an interactive build
job are here: http://chtc.cs.wisc.edu/inter-submit.shtml. You'll need
to do Step 2 (Creating Interactive Submit Files), and the first command of Step 3
(Submitting and Working Interactively).
For Step 2, you'll need to change the interactive submit file to
reflect all the .m files on which your program depends.
These files need to be uploaded to the submit server before you submit
the interactive job for compiling.
If you have many files or directories that are part of your code,
we recommend compressing them into a tarball.
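As a rough illustration of bundling code into a tarball, the sketch below packs a directory of .m files into one archive with tar. The directory name my_code, the file names, and the archive name my_code.tar.gz are all hypothetical; the dummy files exist only so the example runs on its own.

```shell
#!/bin/sh
# Sketch: bundle a directory of Matlab code into one tarball.
# All names here (my_code, helper.m, my_code.tar.gz) are hypothetical.
set -e
workdir=$(mktemp -d)
cd "$workdir"

# Stand-in code directory so the example is self-contained.
mkdir -p my_code/utils
echo "disp('hello')" > my_code/foo.m
echo "function y = helper(x); y = 2*x; end" > my_code/utils/helper.m

# c = create, z = compress with gzip, f = archive file name
tar czf my_code.tar.gz my_code

# List the archive contents to confirm what will be uploaded.
tar tzf my_code.tar.gz
```

Inside the interactive job, running tar xzf my_code.tar.gz would unpack the directory again before compiling.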
A. Compile Matlab Code
Once you've done Steps 2 and 3 of the interactive job
guide, and the interactive job has started, you can compile your code.
In this example, foo.m represents the name of the primary Matlab script;
you should replace foo.m with the name of your own primary script.
Note that if your main script references other .m files, they will
all be compiled together with the main script into one binary.
Choose one of the following compile commands, based on the version of Matlab you'd like to use:
[alice@build]$ /usr/local/MATLAB/R2015b/bin/mcc -m -R -singleCompThread -R -nodisplay -R -nojvm foo.m
[alice@build]$ /usr/local/MATLAB/R2014b/bin/mcc -m -R -singleCompThread -R -nodisplay -R -nojvm foo.m
[alice@build]$ /usr/local/MATLAB/R2013b/bin/mcc -m -R -singleCompThread -R -nodisplay -R -nojvm foo.m
[alice@build]$ /usr/local/MATLAB/R2011b/bin/mcc -m -R -singleCompThread -R -nodisplay -R -nojvm foo.m
There are other options for the mcc Matlab compiler. If you have questions about your particular code, contact a facilitator or see the Matlab documentation.
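The compile commands above differ only in the Matlab version in the path, so they can be sketched as one parameterized command. This is an illustration rather than an official CHTC script: the variable names are made up, and the compiler is only invoked if it is actually present, which is expected only on a build machine.

```shell
#!/bin/sh
# Sketch: build the mcc command for a chosen Matlab version.
# Variable names are hypothetical; the path and flags are those shown above.
set -e
matlab_version="R2015b"   # one of the versions listed above
main_script="foo.m"       # replace with your own primary script
mcc="/usr/local/MATLAB/${matlab_version}/bin/mcc"

compile_cmd="$mcc -m -R -singleCompThread -R -nodisplay -R -nojvm $main_script"
echo "Compile command: $compile_cmd"

# Only run the compiler where it exists (i.e. on a build machine).
if [ -x "$mcc" ]; then
    $compile_cmd
else
    echo "mcc not found at $mcc; run this on a build machine."
fi
```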
B. Modifying the Executable
The mcc command should have created a script called
run_*.sh (where * is the name of your Matlab script; our
example uses the name foo, so the script is run_foo.sh). This
script will be the executable for your Matlab jobs and already has
almost all the necessary commands for running your Matlab code. You'll need
to add one line at the beginning of the run_*.sh script that unpacks
the Matlab runtime. We'll also add some extra options to ensure Matlab
runs smoothly on any Linux system.
The command that needs to be added at the start of this script looks like this
(replace r2015b.tar.gz with the appropriate version
of Matlab, if you used something different to compile):
# script for execution of deployed applications
# Sets up the MATLAB Runtime environment for the current $ARCH and executes
# the specified command.
# Add these lines to run_foo.sh
tar xzf r2015b.tar.gz
# Rest of script follows
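If you would rather make this edit from the command line than in a text editor, one possible approach is sketched below. It assumes the generated script begins with a shebang line, and it builds a small stand-in for run_foo.sh so the example runs on its own; on a real build machine you would skip the stand-in and operate on the script that mcc produced.

```shell
#!/bin/sh
# Sketch: add the runtime-unpacking line to run_foo.sh right after its first line.
set -e
workdir=$(mktemp -d)
cd "$workdir"

# Stand-in for the mcc-generated script (assumed to begin with a shebang line).
cat > run_foo.sh <<'EOF'
#!/bin/sh
# script for execution of deployed applications
echo "rest of script"
EOF

runtime_tarball="r2015b.tar.gz"   # match the version used to compile

# Print line 1 unchanged, then the new tar command, then everything else.
awk -v cmd="tar xzf $runtime_tarball" \
    'NR == 1 { print; print cmd; next } { print }' \
    run_foo.sh > run_foo.sh.new
mv run_foo.sh.new run_foo.sh
chmod +x run_foo.sh

# Show the top of the edited script.
head -n 3 run_foo.sh
```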
Type exit after you have compiled your code (step A) and edited
the executable script (step B).
Condor will transfer your compiled code and its scripts back automatically.
Back on the submit node, you
should now have the following files:
[alice@submit]$ ls -l
-rw-rw-r-- 1 user user 581724 Feb 19 14:21 mccExcludedFiles.log
-rwxrw-r-- 1 user user 94858 Feb 19 14:21 foo
-rw-rw-r-- 1 user user 3092 Feb 19 14:21 readme.txt
-rw-rw-r-- 1 user user 581724 Feb 19 14:21 requiredMCRProducts.txt
-rwxrw-r-- 1 user user 1195 Feb 19 14:21 run_foo.sh
foo is the compiled Matlab binary. You will not
need readme.txt to run your jobs.
Note that sometimes the compiled Matlab binary will lose its "executable"
permissions. When that happens, they can be restored by running the following command:
[alice@submit]$ chmod +x foo
where foo is the name of your own compiled binary.
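A quick check before submitting can catch this. The sketch below tests both files used in this guide's example and restores the permission only where it is missing; the empty stand-in files are created purely so the example is self-contained.

```shell
#!/bin/sh
# Sketch: make sure the binary and its wrapper script are executable.
set -e
workdir=$(mktemp -d)
cd "$workdir"

# Stand-ins for the real files, deliberately created without execute permission.
touch foo run_foo.sh
chmod -x foo run_foo.sh

for f in foo run_foo.sh; do
    if [ ! -x "$f" ]; then
        echo "Restoring executable permission on $f"
        chmod +x "$f"
    fi
done
```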
2. Running Matlab Jobs
This section shows the important elements of creating
a submit file for Matlab jobs. The submit file for your job will be
different than the one used
to compile your code. As a starting point for a submit file,
see our "hello world" example: http://chtc.cs.wisc.edu/helloworld.shtml.
In what follows, replace our example names (foo and run_foo.sh)
with the names of your own binary and script.
- Use run_foo.sh as the executable:
executable = run_foo.sh
- In order for your Matlab code to run, you will need to use
a Matlab runtime package. This package is easily downloaded from CHTC's web proxy; the
version must match the version you used to compile
your code. Options available on our proxy include:
To send the runtime package to your jobs, list a link to the appropriate version
in your transfer_input_files line, as
well as your compiled binary and any necessary input files:
transfer_input_files = http://proxy.chtc.wisc.edu/SQUID/r2015b.tar.gz,foo,input_files
- Include the appropriate arguments for run_foo.sh
(as described in readme.txt). These will be the name of the Matlab runtime
directory and any arguments your Matlab code needs to run. The names of the Matlab
runtime directories for the different versions are as follows:
|Matlab version|Runtime directory name|
So to run a Matlab job using r2015b and no additional arguments, the
arguments line should read:
arguments = v90
If you are passing additional arguments to the script, they can
go after the first "runtime" argument:
arguments = v90 $(Cluster) $(Process)
As always, test a few jobs for disk space/memory usage in order to
make sure your requests for a large batch are accurate! The runtime package is
large (at least 1.5 GB). Disk space and
memory usage can be found in the log file after the job completes.
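Putting the pieces of this section together, a complete submit file might look roughly like the one written out below. Treat it as a sketch under assumptions: the file name foo.sub, the log/output/error names, and the request_memory and request_disk values are placeholders to adjust after testing, and input_files stands for your own input files as in the example above. The remaining lines are standard HTCondor submit commands.

```shell
#!/bin/sh
# Sketch: write an example submit file (foo.sub, hypothetical name) for a
# compiled Matlab job. Resource requests are placeholders, not recommendations.
set -e
workdir=$(mktemp -d)

# The quoted 'EOF' keeps the shell from expanding $(Cluster) and $(Process).
cat > "$workdir/foo.sub" <<'EOF'
universe = vanilla
executable = run_foo.sh
arguments = v90 $(Cluster) $(Process)

transfer_input_files = http://proxy.chtc.wisc.edu/SQUID/r2015b.tar.gz,foo,input_files
should_transfer_files = YES
when_to_transfer_output = ON_EXIT

log = foo_$(Cluster).log
output = foo_$(Cluster)_$(Process).out
error = foo_$(Cluster)_$(Process).err

request_cpus = 1
request_memory = 2GB
request_disk = 4GB

queue 1
EOF

cat "$workdir/foo.sub"
```

After a few test jobs, compare the usage reported in the log file with the request_memory and request_disk values and adjust them before queuing a large batch.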
This section is only relevant for Matlab scripts that
use Matlab's random number functions like randperm.
Whenever Matlab is started for the first time on a new computer,
the random number generator begins from the same state. When you
run multiple Matlab jobs, each job is using a copy of Matlab that
is being used for the first time -- thus, every job will start with
the same random number generator and produce identical results.
There are different ways to ensure that each job is using different
randomly generated numbers.
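As one illustration of giving each job its own seed, the sketch below writes a file with one distinct integer per job. Everything here is an assumption for the sake of example: the file name seeds.txt, the number of jobs, and the base value are hypothetical, and your Matlab code would have to be written to accept the seed as an argument.

```shell
#!/bin/sh
# Sketch: generate one distinct seed per job in seeds.txt (hypothetical name).
set -e
workdir=$(mktemp -d)
num_jobs=5        # hypothetical number of jobs
base_seed=1000    # arbitrary starting value

i=0
while [ "$i" -lt "$num_jobs" ]; do
    echo $((base_seed + i))
    i=$((i + 1))
done > "$workdir/seeds.txt"

cat "$workdir/seeds.txt"
```

One way to use such a file would be to end the submit file with "queue seed from seeds.txt" and pass $(seed) after the runtime directory in the arguments line. The Matlab code could then seed the generator with something like rng(str2double(seed)), since arguments reach compiled Matlab code as strings.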
A Mathworks page describes one way to "reset" the random number
generator so that it produces different random values when Matlab
runs for the first time. Deliberately choosing your own different
random seed values for each job can be another way to ensure different