Installing Software to Gluster

This guide describes when and how to install software to our Gluster file system.

To best understand the information below, users should already have an understanding of the basics of submitting and running HTCondor jobs in CHTC.

Overview

Using our Gluster file system to install software is always a last resort: software runs more slowly when installed to Gluster, and installing it there degrades the performance of the file system for everyone. Always email the CHTC facilitators at chtc@cs.wisc.edu before installing software to Gluster.

Once your interactive job begins on one of our compiling servers, you can find which OpenMP/MPI modules are available to you by typing:

[alice@build]$ module avail
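
The output lists the modules available on the compiling server. It will look something like the following (the module names shown here are illustrative, not CHTC's actual list):

------------------ /etc/modulefiles ------------------
mpi/gcc/openmpi-1.6.4    mpi/gcc/mvapich2-1.9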

Choose the module you want to use and load it with the following command:

[alice@build]$ module load mpi_module

where mpi_module is replaced with the name of the OpenMP or MPI module you'd like to use.
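
For example, to load the hypothetical OpenMPI module named in the listing above:

[alice@build]$ module load mpi/gcc/openmpi-1.6.4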

After loading the module, compile your program. If your program is organized in directories, make sure to create a tar.gz file of anything you want copied back to the submit server; only *files*, not directories, are copied back automatically. When you type exit, the interactive job will end, and any files created during the interactive job will be copied back to the submit location for you.
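
As a sketch, compiling an MPI program written in C and packaging a directory of outputs might look like this (mpicc is the standard MPI compiler wrapper for C; myprogram.c and output_dir are placeholder names):

[alice@build]$ mpicc -o myprogram myprogram.c
[alice@build]$ tar -czf output_dir.tar.gz output_dir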

Script For Running OpenMP/MPI Jobs

To run your newly compiled program within a job, you need to write a script that loads an OpenMP or MPI module and then runs the program, like so:

#!/bin/bash

# Command to enable modules, and then load an appropriate OpenMP/MPI module
. /etc/profile.d/modules.sh
module load mpi_module

# Command to run your OpenMP/MPI program
# (This example uses mpirun, other programs
# may use mpiexec, or other commands)
mpirun -np 8 myprogram

Replace mpi_module with the name of the module you used to compile your code, myprogram with the name of your program, and 8 with the number of CPUs you want the program to use (this should match request_cpus in your submit file). There may be additional options or flags necessary to run your particular program; make sure to check the program's documentation about running multi-core processes.

Submit File Requirements

There are several important requirements to consider when writing a submit file for multicore jobs. They are shown in the sample submit file below and include:

  • Require Gluster for the OpenMP/MPI modules. Make sure that you include a requirements statement for our Gluster file share, since this is where the OpenMP/MPI modules live.
  • Request *accurate* CPUs and memory. Run at least one test job and look at the log file produced by HTCondor to determine how much memory and disk space your multi-core jobs actually use (see the example log excerpt after this list). Requesting too much memory causes two problems: your jobs will match more slowly, and they will waste resources that could be used by others. Also, the fewer CPUs your jobs request, the sooner you'll have more jobs running. Jobs requesting 16 CPUs or fewer will do best, as nearly all of CHTC's servers have at least that many, but you can request and use up to 36 CPUs per job.
  • The script you wrote above (shown as run_mpi.sh below) should be the submit file's "executable", and your compiled program and any input files should be listed in transfer_input_files.
  • Use the getenv = true statement to set up the job's running environment.
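
For reference, the resource table near the end of each job's log file looks something like the excerpt below (the exact layout varies between HTCondor versions, and the numbers here are made up); compare the Usage column against what you requested:

Partitionable Resources :    Usage  Request Allocated
   Cpus                 :                 8         8
   Disk (KB)            :   512000  2048000   2457600
   Memory (MB)          :     3072     8192      8192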

A sample submit file for multi-core jobs is given below:

# multicore.sub
# A sample submit file for running a single multicore (8 cores) job

universe = vanilla
log = mc_$(Cluster).log
output = mc_$(Cluster).out
error = mc_$(Cluster).err

executable = run_mpi.sh
# arguments = (if you want to pass any to the shell script)
should_transfer_files = YES
when_to_transfer_output = ON_EXIT
transfer_input_files = input_files, myprogram

requirements = ( Target.HasGluster == true )
getenv = true

request_cpus = 8
request_memory = 8GB
request_disk = 2GB

queue

After the submit file is complete, you can submit your jobs using condor_submit.
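
For example, using the sample submit file above:

[alice@submit]$ condor_submit multicore.sub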

Using Gluster for Software

  • Request a Gluster directory from CHTC, by emailing chtc@cs.wisc.edu with a description of why you need it.
  • When compiling, copy all of your files into your Gluster directory before submitting the interactive job for compiling. When you submit the interactive job, don't include your source files in the transfer_input_files line. Instead, once the interactive job starts, move into your Gluster directory:
    $ cd /mnt/gluster/NetId/path/to/source_code
    From there, follow the commands for loading modules and compiling given above (a full session sketch appears after this list). Once you're finished, type exit to end the interactive job.
  • In your job's executable script, instead of listing the name of the compiled program after your mpirun or mpiexec command, list the full path to that program, like so:
    With file transfer:

    #!/bin/bash
    # Activate modules and load the appropriate module
    . /etc/profile.d/modules.sh
    module load mpi_module
    mpirun -np 8 myprogram

    From Gluster:

    #!/bin/bash
    # Activate modules and load the appropriate module
    . /etc/profile.d/modules.sh
    module load mpi_module
    mpirun -np 8 /mnt/gluster/NetId/path/to/myprogram

  • In your submit file, do not include the name of your compiled executable in "transfer_input_files"; it will be referenced directly from your script, as in the "From Gluster" example above.

    With file transfer:
    transfer_input_files = input_files,myprogram

    From Gluster:
    transfer_input_files = input_files
    
  • Once you've compiled your code within Gluster and written your script and submit file, you can submit your job from your /home/ directory as normal, using condor_submit.
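
Putting these pieces together, a full interactive compiling session in Gluster might look like the following sketch. The module name (mpi_module), the source file (myprogram.c), and the directory path are placeholders to replace with your own; mpicc is the standard MPI compiler wrapper for C code:

[alice@build]$ cd /mnt/gluster/NetId/path/to/source_code
[alice@build]$ module load mpi_module
[alice@build]$ mpicc -o myprogram myprogram.c
[alice@build]$ exit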