FAQ

Administrative

How do I get an account?

See Getting an account.

Compiling

Unresolved MPI/CUDA Symbols in Fortran

Newer versions of the gfortran compiler require the -fsecond-underscore option on the command line. To compile the WRF codes from NCAR, the -DF2CSTYLE flag also had to be added to ARCHFLAG. Without these options the link step reports unresolved symbols, even when you compile with mpif90, because the symbol names the compiler generates do not match the names in the library.

I can't find the CUDA libraries or binaries

The 'locate' command is useful for questions like this. In particular, 'locate cuda' shows that CUDA is installed under /usr/local/cuda. Additional libraries (libcuda.so and friends) are in /usr/local/lib64.
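
If 'locate' is unavailable or its database is out of date, a small shell check can confirm the location. This is only a sketch: the function name find_cuda_home is hypothetical, the default path is simply the one named above, and it assumes the CUDA compiler nvcc sits in the bin subdirectory, which may not hold on every system.

```shell
# Hypothetical helper: print the first candidate directory that contains bin/nvcc.
# With no arguments it checks /usr/local/cuda, the location named above.
find_cuda_home() {
    if [ "$#" -eq 0 ]; then
        set -- /usr/local/cuda
    fi
    for dir in "$@"; do
        if [ -x "$dir/bin/nvcc" ]; then
            printf '%s\n' "$dir"
            return 0
        fi
    done
    # Nothing matched: report the candidates that were tried.
    echo "no CUDA installation found in: $*" >&2
    return 1
}
```

For example, running export PATH="$(find_cuda_home)/bin:$PATH" would put the CUDA binaries on your path for the current shell, provided the check succeeds.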

No -fPIC option on MPI libraries

Some C++ codes (such as VTK) must be compiled as position-independent code (PIC), so the MPI library they link against also needs to be compiled with -fPIC. It currently is not, which causes a problem when building shared libraries: the linker complains that a relocation in libpmpich++.a cannot be used when making a shared object. We're working on it.

MPI jobs

My MPI job crashed and had all sorts of long messages about timeouts and bad system calls from a bunch of jobs.

Check earlier in the log file for error messages from your job itself, and for reports of segmentation faults or other unusual exits. If one task exits without shutting down MPI properly (that is, without calling MPI_Finalize), the remaining tasks fail as well and produce a flood of secondary error messages.
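
One way to find the first real failure is to search the log for crash messages before reading the MPI noise. The following is a minimal sketch: the function name first_errors is hypothetical, and the pattern list is illustrative rather than exhaustive, so adjust it to the messages your application prints.

```shell
# Hypothetical helper: show the earliest lines in a job log that look like
# a crash in the application itself. The patterns are illustrative only.
# Usage: first_errors LOGFILE [MAX_LINES]
first_errors() {
    logfile="$1"
    count="${2:-5}"
    # -n prefixes each match with its line number, -i ignores case.
    grep -n -i -E 'segmentation fault|sigsegv|signal 11|killed|abort' "$logfile" | head -n "$count"
}
```

Each match is printed with its line number, so you can open the log at that point and read what your job reported just before the failure.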

My multi-threaded MPI job crashes, especially when sending large blocks.

Not all MPI implementations are thread-safe. Try having a single thread do all of the MPI communication, or use MPI parallelization alone, with one thread per process.

Grid Engine

My job isn't running! My job is stuck in the *qw* state!

To figure out why your job hasn't been scheduled, run the following command:

qalter -w p <JOB_ID>

This reports whether a suitable queue was found for your job and what factors led to that decision.
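
If several of your jobs are waiting, the same check can be run for each of them. The sketch below rests on assumptions: the function name why_pending is hypothetical, and it assumes that qstat prints two header lines followed by one line per job with the job ID in the first column. Verify this against the qstat output on your system before relying on it.

```shell
# Hypothetical helper: run the scheduling check for every pending job of a user.
# Assumes qstat prints two header lines, then one line per job,
# with the job ID in the first column.
# Usage: why_pending [USERNAME]
why_pending() {
    user="${1:-$USER}"
    # -s p restricts the listing to pending jobs.
    qstat -u "$user" -s p | awk 'NR > 2 { print $1 }' | while read -r job_id; do
        echo "== job $job_id =="
        # Redirect stdin so the command cannot consume the list of job IDs.
        qalter -w p "$job_id" </dev/null
    done
}
```

Calling why_pending with no argument checks the pending jobs of the current user.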