Slaton Lipscomb

Slaton's Tips:
Configuring IMAGIC/MPI On A Linux Cluster With A Queueing System
(SGE, Torque, PBS, ...)

Last updated November 21, 2006
Comments/corrections welcome.

IMAGIC is a product of Image Science Software GmbH in Berlin, Germany.

IMAGIC is not distributed as FORTRAN source code. Rather, it is distributed as a set of precompiled object files, which the user then links to build the executables. Because your system architecture or layout may differ from that used by Image Science, installing IMAGIC is often much more troublesome than compiling a package from source. Before continuing, make sure you have the right IMAGIC package for your platform.

For the purposes of these instructions, I will install IMAGIC version 040812 onto a machine running Red Hat Linux, with hostname toro.university.edu, that is part of a Linux cluster. Wherever these values appear, substitute the IMAGIC version you are actually installing and the hostname of the computer you are installing on. The IMAGIC version number is always indicated by an empty file named version_xxxxxx in the source directory.

It is assumed that the MPICH1 package is already installed and configured to use the Torque/PBS job scheduling system used on the cluster, as well as any special interconnect used in the cluster (Myrinet, InfiniBand, etc.). MPICH1 should also be configured to use ssh instead of rsh for launching processes. For this example, MPICH1 has been installed to /usr/local/mpich. On your system, it may be installed elsewhere.

The ssh-related configuration here is specific to OpenSSH. If you use another ssh implementation, such as the commercial product from SSH Communications Security, you will need to modify these instructions accordingly.

It is assumed that the Modules system is used for environment management. Initial installation and configuration of Modules is beyond the scope of this document. More information about Modules can be found at these sites:

   « http://modules.sourceforge.net »
   « http://hpcf.nersc.gov/software/os/modules.html » 

These instructions use the Intel Fortran compiler (ifort) to link IMAGIC. If you do not yet have ifort installed, please see these instructions.

1   Untar the IMAGIC source to a temporary directory. Rename this temporary directory to imagic-040812 where 040812 is the IMAGIC version number.

2   Move the imagic-040812 source directory to /usr/local.
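
The two steps above can be sketched as a small shell helper. This is only an illustration: the function name install_imagic_source and the tarball name in the usage line are hypothetical, and the sketch assumes the archive can be read by tar -xf and contains the version_xxxxxx marker file described earlier.

```shell
#!/bin/sh
# Hypothetical helper: unpack an IMAGIC archive into a temporary directory,
# read the version from the empty version_xxxxxx marker file, and move the
# source directory to <prefix>/imagic-<version>.
install_imagic_source() {
    tarball=$1          # path to the IMAGIC archive (name is an assumption)
    prefix=$2           # destination parent directory, e.g. /usr/local

    tmp=$(mktemp -d) || return 1
    tar -xf "$tarball" -C "$tmp" || return 1

    # Locate the marker file; its suffix is the IMAGIC version number.
    marker=$(find "$tmp" -name 'version_*' -type f | head -n 1)
    if [ -z "$marker" ]; then
        echo "no version_* marker file found in $tarball" >&2
        return 1
    fi
    version=${marker##*version_}
    srcdir=$(dirname "$marker")
    dest=$prefix/imagic-$version

    # Refuse to overwrite an existing installation.
    if [ -e "$dest" ]; then
        echo "$dest already exists" >&2
        return 1
    fi
    mv "$srcdir" "$dest" || return 1
    rmdir "$tmp" 2>/dev/null    # only succeeds if the temp dir is now empty
    echo "$dest"
}
```

For example, install_imagic_source ./imagic.tar /usr/local would print /usr/local/imagic-040812 for the version used in this document. Writing to /usr/local normally requires root privileges.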

1   Create the modulefile for the IMAGIC environment variables. The following variables and aliases should be defined.

setenv IMAGIC_ROOT      /usr/local/imagic-040812
set-alias i             /usr/local/imagic-040812/imagic.e
set-alias disp          /usr/local/imagic-040812/display/display.e
set-alias plot          /usr/local/imagic-040812/plot/plotall.e
set-alias em2em         /usr/local/imagic-040812/stand/em2em.e

2   The following MPICH1 environment variables and paths should also be defined.

set mpichroot           /usr/local/mpich
setenv MPIHOME          $mpichroot
setenv MPIBIN           $mpichroot/bin
setenv MPIF90           $mpichroot/bin/mpif90
prepend-path PATH       $mpichroot/bin
prepend-path MANPATH    $mpichroot/man
append-path PATH        .

It's important to prepend the MPICH bin directory to PATH, rather than append it, in case the system has other MPI packages installed.

Unfortunately, dot (.) must be added to PATH, because IMAGIC accumulates the commands for an MPI job in a batch file in the user's current working directory. It then attempts to execute this file without specifying its full path.
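
Put together, the settings above might be written out as a single modulefile. The sketch below generates one from the shell. The function name write_imagic_modulefile, the Tcl variable imagicroot, and the output location are assumptions made for illustration; the #%Module1.0 first line is the header that Modules expects at the top of every modulefile.

```shell
#!/bin/sh
# Hypothetical helper: write a modulefile combining the IMAGIC and MPICH1
# settings listed above. $1 = output file, $2 = IMAGIC root, $3 = MPICH root.
write_imagic_modulefile() {
    out=$1
    imagic_root=$2
    mpich_root=$3
    # \$name is left for Modules (Tcl) to expand; $name is expanded here.
    cat > "$out" <<EOF
#%Module1.0
set imagicroot          $imagic_root
set mpichroot           $mpich_root

setenv IMAGIC_ROOT      \$imagicroot
set-alias i             \$imagicroot/imagic.e
set-alias disp          \$imagicroot/display/display.e
set-alias plot          \$imagicroot/plot/plotall.e
set-alias em2em         \$imagicroot/stand/em2em.e

setenv MPIHOME          \$mpichroot
setenv MPIBIN           \$mpichroot/bin
setenv MPIF90           \$mpichroot/bin/mpif90
prepend-path PATH       \$mpichroot/bin
prepend-path MANPATH    \$mpichroot/man
append-path PATH        .
EOF
}
```

A call such as write_imagic_modulefile ./imagic-040812 /usr/local/imagic-040812 /usr/local/mpich produces the file, which can then be copied into whichever directory your site lists in MODULEPATH.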

1   Make sure the Intel compiler and IMAGIC modules are loaded. Then build the IMAGIC programs:

$ cd $IMAGIC_ROOT
$ ./install.b

Check for failures in compilation with:

$ find . -name \*.err
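
The same check can be wrapped so that it returns a failure status, which is convenient in an install script. This is a minimal sketch; the function name check_build_errors is hypothetical, and it assumes, as the find command above implies, that the presence of any .err file indicates a compilation problem.

```shell
#!/bin/sh
# Hypothetical helper: list *.err files under a directory and return
# non-zero if any exist. $1 = directory to search (default: current dir).
check_build_errors() {
    root=${1:-.}
    errs=$(find "$root" -name '*.err' -type f)
    if [ -n "$errs" ]; then
        echo "Compilation problems reported in:" >&2
        echo "$errs" >&2
        return 1
    fi
    echo "No .err files found under $root"
}
```

Usage: check_build_errors "$IMAGIC_ROOT" && echo "build looks clean".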

2   Verify that the program paths listed in $IMAGIC_ROOT/lognames.drv are correct. The mpi and MPI paths in particular may need to be fixed if you have decided to use an externally installed MPI package.

3   Make sure the IMAGIC license file is present at $IMAGIC_ROOT/imagic.drv.

Edit the license file to make sure the number of CPUs listed is correct. This is important for the MULTI-REFERENCE-ALIGNMENT program, which uses OpenMP parallelization on machines with more than one CPU.
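
The layout of imagic.drv is not described here, so the following sketch only shows one way to obtain the number to compare against. The function name cpu_count is hypothetical. Note that it reports logical CPUs, which on hyperthreaded machines is larger than the number of physical cores.

```shell
#!/bin/sh
# Hypothetical helper: print the number of online logical CPUs.
# Tries getconf first and falls back to counting entries in /proc/cpuinfo.
cpu_count() {
    getconf _NPROCESSORS_ONLN 2>/dev/null || grep -c '^processor' /proc/cpuinfo
}
```

Usage: echo "This node has $(cpu_count) CPUs".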

4   Now build the IMAGIC MPI programs:

$ cd $IMAGIC_ROOT/align
$ ../fori.b mralign mpi
$ cd ../angrec
$ ../fori.b ar mpi
$ cd ../threed
$ ../fori.b true3d mpi

IMPORTANT: fori.b runs some tests to check that the correct MPI version is indicated in the imagic.drv file. These tests may fail if you 1) have other MPI packages installed, or 2) are attempting to use a prebuilt MPI package located outside the IMAGIC directory. If this occurs, first verify that the IMAGIC module prepends the MPICH bin directory to the user's PATH, rather than appending it. If it does and the script still fails, you may need to edit fori.b to disable the tests that occur between:

# ----- Try to find the MPI (message passing interface) stuff

and

# ----- Check MPI entry in IMAGIC.DRV file

If you edit fori.b, make sure the mpidir, mpincl and mpif_h variables are still set correctly.
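
Before editing any script, it may help to confirm which mpif90 the shell actually finds first. The following is a rough sketch; the function name check_mpi_first_in_path is hypothetical, and the directory passed in is whatever your module sets MPIBIN to.

```shell
#!/bin/sh
# Hypothetical helper: verify that the first mpif90 found in PATH lives in
# the expected MPICH bin directory. $1 = expected directory (e.g. $MPIBIN).
check_mpi_first_in_path() {
    expected=$1
    found=$(command -v mpif90) || {
        echo "mpif90 is not in PATH" >&2
        return 1
    }
    if [ "$found" = "$expected/mpif90" ]; then
        echo "OK: using $found"
    else
        echo "PATH finds $found first, expected $expected/mpif90" >&2
        return 1
    fi
}
```

Usage: check_mpi_first_in_path "$MPIBIN". A failure here suggests that another MPI package precedes MPICH in PATH.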

1   Each user will need to generate an ssh key with an empty passphrase so that MPICH can use ssh instead of rsh for launching parallel processes. This key then needs to be added to authorized_keys. Red Hat Linux comes with OpenSSH, so we need to do this the OpenSSH way.

$ ssh-keygen -q -C BatchModeKey -t rsa -f ~/.ssh/batchmode -N ""
$ cat ~/.ssh/batchmode.pub >> ~/.ssh/authorized_keys
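
OpenSSH checks file permissions by default, and sshd may ignore authorized_keys if the file or its parent directories are writable by other users. The sketch below tightens the permissions after the key has been added. The function name secure_ssh_dir is hypothetical, and the modes chosen (700 and 600) are a conservative choice rather than the only ones that work.

```shell
#!/bin/sh
# Hypothetical helper: restrict permissions on the ssh directory and on the
# key files created above. $1 = ssh directory (default: ~/.ssh).
secure_ssh_dir() {
    dir=${1:-$HOME/.ssh}
    chmod 700 "$dir" || return 1
    for f in "$dir/authorized_keys" "$dir/batchmode"; do
        if [ -f "$f" ]; then
            chmod 600 "$f" || return 1
        fi
    done
}
```

Usage: secure_ssh_dir, with no arguments, operates on ~/.ssh.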

2   You also need to edit your user ssh config file, ~/.ssh/config, and specify that this batchmode key is to be used for connections to the local machine (only). This is done with the Host keyword. Add a section like the following to the TOP of the config file. It must be above the Host * wildcard section, if it exists.

Host toro.university.edu toro
  IdentityFile ~/.ssh/batchmode

IMAGIC determines the local machine's hostname according to the output of uname -n, so make sure the same name is used here. In this case toro is just a short convenience alias.
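
Because the new section has to sit at the top of the file, appending with >> is not enough. One way to prepend it is sketched below. The function name prepend_host_block is hypothetical; it rewrites the config file through a temporary copy and does not check whether the section is already present, so run it only once.

```shell
#!/bin/sh
# Hypothetical helper: insert a Host section at the TOP of an ssh config
# file, above any existing content such as a "Host *" section.
# $1 = config file, $2 = identity file, remaining args = host names.
prepend_host_block() {
    config=$1
    key=$2
    shift 2
    tmp=$(mktemp) || return 1
    {
        printf 'Host %s\n' "$*"
        printf '  IdentityFile %s\n\n' "$key"
        if [ -f "$config" ]; then
            cat "$config"
        fi
    } > "$tmp" || return 1
    mv "$tmp" "$config" && chmod 600 "$config"
}
```

Usage: prepend_host_block ~/.ssh/config '~/.ssh/batchmode' "$(uname -n)" toro. The identity path is quoted so that the literal ~ is written to the file and expanded later by ssh.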

3   To test the ssh configuration, the following command should give a listing of the user's home directory, without asking for a password or passphrase.

$ ssh toro ls

If a password is requested, or an error results, the configuration is incorrect. Start over with step one and try again.
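
For an unattended check, OpenSSH's BatchMode option makes ssh fail immediately instead of prompting for a password or passphrase. A minimal sketch follows; the function name check_passwordless_ssh and the 10-second timeout are choices made for illustration.

```shell
#!/bin/sh
# Hypothetical helper: test that ssh to a host succeeds without any prompt.
# $1 = host name (default: the local host name reported by uname -n).
check_passwordless_ssh() {
    host=${1:-$(uname -n)}
    # BatchMode=yes disables password/passphrase prompts, so a
    # misconfiguration shows up as a non-zero exit status.
    if ssh -o BatchMode=yes -o ConnectTimeout=10 "$host" true; then
        echo "passwordless ssh to $host works"
    else
        echo "passwordless ssh to $host FAILED" >&2
        return 1
    fi
}
```

Usage: check_passwordless_ssh toro.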

You are now ready to submit IMAGIC jobs to the PBS queueing system. See Using IMAGIC/MPI On A PBS Linux Cluster to learn more.