Using MPI


Setting up your environment


Some of the Message-Passing Interface (MPI) versions we provide use secure-shell (SSH) connections to launch parallel jobs. This means that you need to set up an SSH key pair so that you can log on from one grid node to another without a password. This is done with the ssh-keygen program; run man ssh-keygen for details on how it works. The basic approach is as follows:

mkdir .ssh                     (Don't worry if this fails)
chmod 0700 .ssh
ssh-keygen -t dsa              (Press return to accept all defaults)
cd .ssh
touch authorized_keys
cat >> authorized_keys
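
As written, the final cat command reads from standard input, so the public key still has to be supplied to it. The append step could be sketched as below. This is only a sketch: the helper name append_pubkey is hypothetical, and the key file name id_dsa.pub used in the usage line is an assumption (it is the default that ssh-keygen -t dsa normally writes, but this page does not name the file).

```shell
# Hypothetical helper: append a public key to an authorized_keys file
# unless that exact line is already present.
append_pubkey() {
    local pubkey_file="$1"   # assumed to be something like ~/.ssh/id_dsa.pub
    local auth_file="$2"     # e.g. ~/.ssh/authorized_keys

    if [ ! -r "$pubkey_file" ]; then
        echo "cannot read $pubkey_file" >&2
        return 1
    fi
    touch "$auth_file" || return 1

    # -F: fixed strings, -x: whole-line match, -f: take the pattern from the key file
    if ! grep -qxFf "$pubkey_file" "$auth_file"; then
        cat "$pubkey_file" >> "$auth_file"
    fi
}
```

It might be used as append_pubkey ~/.ssh/id_dsa.pub ~/.ssh/authorized_keys. Running it twice leaves a single copy of the key in the file.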

Also, you need to be using the bash shell to run large MPI jobs; the limits that tcsh and other shells place on the environment are too small for this to work.

Selecting an MPI environment

There is a locally built utility called pathmunge that can be used to set up your environment in several ways, one of which is to select an MPI version. To use pathmunge, you must be using bash, and you must make the command available in your shell by running:

source /usr/local/bin/

This line should be placed in your ~/.bash_profile file or executed at some point before you use the pathmunge command. The beginning of a job script is another good location.

To set the MPI version to openmpi-1.3.2, run the following command:

pathmunge usempi openmpi-1.3.2 

If you always want to use openmpi-1.3.2, you can place the previous command in your ~/.bash_profile file after the line beginning with source. If you always want to use a particular MPI version for a job, you can place the previous command at the top of the job script.

To see which MPI environments are currently available, run:

pathmunge usempi list

Available MPI Environments

As of December 4th, 2009, the following MPI environments are available:

  • mvapich_gcc-1.1.0
  • mvapich2_gcc-1.2p1
  • openmpi_gcc-1.2.8
  • openmpi-1.3.2 (*)
  • openmpi-1.3.3 (*)
  • sunhpc-8.1-gnu.x86_64 (*)
  • sunhpc-8.1-sun.x86_64 (*)


If you've set up your environment properly, library and executable paths should be configured for compiling MPI programs. To compile your program, use one of the following commands instead of your normal compiler: mpicc, mpicxx, mpiCC, mpif77, mpif90.
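
One quick way to check that an MPI environment has been selected is to see whether the compiler wrappers listed above are on your PATH. The following is a minimal sketch; the function name check_mpi_wrappers is hypothetical and is not part of pathmunge.

```shell
# Report which of the MPI compiler wrappers named above are on PATH.
check_mpi_wrappers() {
    local wrapper location
    for wrapper in mpicc mpicxx mpiCC mpif77 mpif90; do
        if location=$(command -v "$wrapper" 2>/dev/null); then
            echo "$wrapper: $location"
        else
            echo "$wrapper: not found"
        fi
    done
}
```

If the wrappers are found, a C program would typically be compiled with something like mpicc -o my_executable my_program.c, where my_program.c is a placeholder file name.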


qsub Parameters

When running your job through the grid, you need to specify that you're executing an MPI job by passing -pe <parallel environment> <# of slots> to qsub. There are two MPI parallel environments available. The MPI parallel environment tries to fill up the slots of a single node before putting anything on the next node. The rrMPI parallel environment fills nodes in a round-robin fashion, attempting to spread the slots equally across all available nodes. The -pe argument can also be placed in your job script in a #$ comment, as shown in the Howto.

Sample Scenarios

Note: These scenarios assume an empty cluster. If there are other jobs running, the grid software will attempt to schedule slots as described below.

Assuming you're running on the comp.q, which has 15 nodes available with 16 slots each, the following qsub invocation would result in one node being filled completely with instances of the job:

qsub -pe MPI 16

The next qsub invocation would result in 2 instances of the job being placed on each node.

qsub -pe rrMPI 30

The next qsub invocation would result in the first 4 available nodes being filled with instances of the job.

qsub -pe MPI 64
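
The three scenarios above can be reproduced with a small model of the two placement policies. This is an illustrative sketch only: it assumes an empty queue, it is not the scheduler's actual algorithm, and the function name distribute_slots and the node labels are hypothetical.

```shell
# Illustrative model: print how many slots land on each node of an empty queue.
#   $1 = parallel environment (MPI or rrMPI)
#   $2 = requested slots
#   $3 = number of nodes   (15 for comp.q in the text)
#   $4 = slots per node    (16 for comp.q in the text)
distribute_slots() {
    local pe="$1" requested="$2" nodes="$3" per_node="$4"
    local node=1 count

    if [ "$requested" -gt $(( nodes * per_node )) ]; then
        echo "request exceeds queue capacity" >&2
        return 1
    fi

    while [ "$node" -le "$nodes" ]; do
        case "$pe" in
            MPI)
                # Fill each node completely before moving to the next one.
                count=$(( requested - (node - 1) * per_node ))
                if [ "$count" -gt "$per_node" ]; then count=$per_node; fi
                if [ "$count" -lt 0 ]; then count=0; fi
                ;;
            rrMPI)
                # Spread evenly; the first (requested % nodes) nodes get one extra.
                count=$(( requested / nodes ))
                if [ "$node" -le $(( requested % nodes )) ]; then
                    count=$(( count + 1 ))
                fi
                ;;
            *)
                echo "unknown parallel environment: $pe" >&2
                return 1
                ;;
        esac
        echo "node$node $count"
        node=$(( node + 1 ))
    done
}
```

For example, distribute_slots rrMPI 30 15 16 prints 2 for every one of the 15 nodes, and distribute_slots MPI 64 15 16 prints 16 for the first four nodes and 0 for the rest.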

mpirun Invocation

Inside your job script somewhere, you will need to invoke mpirun to execute your job. If you're using one of the starred MPI environments (see above), then the mpirun invocation is relatively simple:

`which mpirun` -np $NSLOTS my_executable

`which mpirun` is used to ensure that the correct mpirun executable is used on all nodes across the cluster. It may work to just use mpirun, but we've had consistent results using this approach.

If you're using one of the non-starred MPI environments, you should use the following mpirun invocation:

`which mpirun` -np $NSLOTS -hostfile $TMPDIR/machines my_executable

The arguments to mpirun tell it how many instances of your executable to run and on which hosts they should be run. When using the starred MPI environments, the host information can be determined from the execution environment. Unfortunately, if the -hostfile argument is provided in the starred environments, all instances of your job will end up on one node. Make sure you're using the correct mpirun parameters for your environment.
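
A job script that has to work with more than one MPI environment could choose the arguments with a small helper. The sketch below is an assumption-laden illustration: the function name mpirun_args is hypothetical, and the list of starred environments is copied from the December 4th, 2009 list on this page, so it would need updating as environments change.

```shell
# Print the mpirun arguments for a given MPI environment name.
# NSLOTS and TMPDIR are expected to be set by the grid software inside a job.
mpirun_args() {
    local mpi_env="$1"
    case "$mpi_env" in
        openmpi-1.3.2|openmpi-1.3.3|sunhpc-8.1-gnu.x86_64|sunhpc-8.1-sun.x86_64)
            # Starred environments: no -hostfile, as in the first invocation above.
            echo "-np $NSLOTS"
            ;;
        *)
            # Non-starred environments: pass the machines file explicitly.
            echo "-np $NSLOTS -hostfile $TMPDIR/machines"
            ;;
    esac
}
```

Inside a job script this might be used as `which mpirun` $(mpirun_args openmpi-1.3.2) my_executable, with the command substitution left unquoted so that the arguments are split into separate words.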
