Faq
Administrative
How do I get an account?
See Getting an account.
Compiling
Unresolved MPI/CUDA Symbols in Fortran
Newer versions of the gfortran compiler need the -fsecond-underscore option on the command line, because gfortran does not append a second underscore to external names by default. To compile the WRF code from NCAR, a -DF2CSTYLE flag also had to be added to ARCHFLAG. Without these options, the linker cannot resolve the symbols from the library, even if you compile with mpif90.
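A minimal sketch of such a compile step follows. The source file name hello_mpi.f90 is hypothetical, and the nm call is only a suggested way to see which symbols the object file still expects the libraries to provide.

```shell
# Hypothetical source file name; replace with your own.
SRC=hello_mpi.f90
FFLAGS="-fsecond-underscore"

if command -v mpif90 >/dev/null 2>&1 && [ -f "$SRC" ]; then
    # Compile only, passing the extra flag through the MPI wrapper.
    mpif90 $FFLAGS -c "$SRC" -o hello_mpi.o
    # List the symbols that are still undefined in the object file.
    nm -u hello_mpi.o
else
    echo "mpif90 or $SRC not available; nothing compiled"
fi
```

Comparing the trailing underscores in this list with the names exported by the library (nm on the library file) shows whether the naming conventions match.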
I can't find the CUDA libraries or binaries
The 'locate' command is helpful in general. Specifically, 'locate cuda' shows that CUDA is installed under /usr/local/cuda. There are also libraries in /usr/local/lib64 (libcuda.so and related files).
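Once the locations are known, they can be added to your environment. The sketch below assumes the usual toolkit layout with a bin subdirectory under /usr/local/cuda; check with ls before relying on it.

```shell
# CUDA install location as reported by 'locate cuda'.
CUDA_HOME=/usr/local/cuda

# Assumption: the toolkit binaries (such as nvcc) live in $CUDA_HOME/bin.
export PATH="$CUDA_HOME/bin:$PATH"

# Make libcuda.so and related libraries visible to the runtime linker.
export LD_LIBRARY_PATH="/usr/local/lib64${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
```

Putting these lines in your shell startup file makes the setting persistent across logins.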
No -fPIC option on MPI libraries
Some C++ codes (like VTK) need position-independent code (PIC) when building shared libraries, so the MPI library must be compiled with -fPIC. At present it is not: the linker complains that a relocation in libpmpich++.a cannot be used when making a shared object. We're working on it.
MPI jobs
My MPI job crashed and had all sorts of long messages about timeouts and bad system calls from a bunch of jobs.
Check earlier in the log file for error messages from your job itself. Also check for reports of segmentation faults or other unusual exits. If one of the tasks exits without shutting down MPI properly (for example, without calling MPI_Finalize), the remaining tasks fail as well, and the log fills with secondary error messages.
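A small helper along these lines can pull the likely root cause out of a long log. The function name scan_log and the search patterns are illustrative assumptions, not a complete list of failure messages.

```shell
# scan_log FILE: print the first few lines (with line numbers) that look
# like an original failure rather than a follow-on timeout message.
scan_log() {
    grep -n -i -E 'segmentation fault|signal 11|sigsegv|killed|abort' "$1" | head -n 5
}
```

Run it on the job's output file, for example scan_log myjob.o12345 (a hypothetical file name), and read the log from the first reported line onward.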
My multi-threaded MPI job crashes, especially when sending large blocks.
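Not all MPI implementations are thread-safe. Try having one thread do all of the MPI communication, or use MPI parallelization without multiple threads per process. If the code initializes MPI with MPI_Init_thread, the value it returns tells you which level of thread support the library actually provides.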
Grid Engine
My job isn't running! My job is stuck in the *qw* state!
To figure out why your job hasn't been scheduled, run the following command:
qalter -w p <JOB_ID>
This reports whether a suitable queue was found for your job and what factors led to that decision.
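If several jobs are waiting, the same check can be run on each of them. The sketch below assumes the default qstat listing, with two header lines and the job state in the fifth column; the helper name pending_ids is hypothetical.

```shell
# pending_ids: read qstat output on stdin and print the IDs of jobs in state qw.
# Assumes the default column layout: job-ID prior name user state ...
pending_ids() {
    awk 'NR > 2 && $5 == "qw" { print $1 }'
}

# Only contact the scheduler if Grid Engine is installed on this machine.
if command -v qstat >/dev/null 2>&1; then
    for id in $(qstat -u "$(id -un)" | pending_ids); do
        qalter -w p "$id"
    done
fi
```

The function only filters text, so you can also pipe saved qstat output through it to check the result before running qalter.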
