In most cases, you will run ADF in batch mode.
Production jobs are submitted to our systems via the
Grid Engine, a load-balancing software package. For
details, read our Grid Engine FAQ.
For an ADF batch job, this means that rather than issuing the
above commands directly, you wrap them in a Grid Engine batch
script. For an example of such a batch script, please click here.
This script needs to be altered by replacing all the relevant
items enclosed in {} with the proper values. It sets all the
necessary environment variables (make sure you issued a use
adf statement before using it) and then starts the
program. The lines in the script that start with #$ are
treated by the shell as mere comments, but are interpreted by the
Grid Engine load-balancing software as directives for the
execution of the program.
For instance, the line "#$ -m be" tells the Grid Engine to notify
the user via email when the job has started and when it has finished,
while the line beginning with "#$ -M" tells the Grid Engine the
email address of the user.
The lines starting with #$ -o and #$ -e determine
where the standard output and the standard error, respectively, are to
be redirected. Since the job is going to be executed in batch, no
terminal is available as a default for these. Note that no further
redirection using > is therefore necessary. All file names and
directory names appearing in the script should be given in full to
avoid ambiguities. One of the most common mistakes when writing Grid
Engine scripts is redirecting output to inaccessible files.
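As an illustration of the advice on full path names, the following
sketch writes a small example script and checks that the directories
named in its -o and -e lines exist and are writable before the job is
submitted. The email address, the file names, and the temporary
directory are placeholders, not values taken from our sample script.
The check only covers the machine it is run on; the compute nodes must
be able to reach the same paths.

```shell
#!/bin/bash
# Write a small example script into a temporary directory.
# All values in the directives are placeholders for illustration.
workdir=$(mktemp -d)
cat > "$workdir/adf_job.sh" <<EOF
#!/bin/bash
#\$ -M user@example.com
#\$ -m be
#\$ -o $workdir/job.out
#\$ -e $workdir/job.err
# The ADF commands from the sample script would follow here.
EOF

# Collect the targets of the -o and -e directives and check that the
# directory of each one exists and is writable.
# (This simple loop assumes that the paths contain no spaces.)
paths_ok=yes
for path in $(awk '$1 == "#$" && ($2 == "-o" || $2 == "-e") {print $3}' "$workdir/adf_job.sh"); do
    dir=$(dirname "$path")
    if [ ! -d "$dir" ] || [ ! -w "$dir" ]; then
        echo "not writable: $path"
        paths_ok=no
    fi
done
echo "output paths usable: $paths_ok"

# Remove the temporary example files again.
rm -rf "$workdir"
```

For a real job, point the awk command at your own batch script instead
of the generated example.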
The ADF package is able to execute on several processors
simultaneously in a distributed-memory fashion. This means that
some tasks, such as the calculation of a large number of matrix
elements or numerical integrations, may be done in a fraction of the
time they take on a single CPU. For this, the processors on
the cluster need to be able to communicate. To this end, the
Sun version of ADF uses MPI (Message Passing Interface), a
well-established communication system.
Because ADF uses a specific version of the parallel system MPI
(ClusterTools 7), executing the use adf command will also cause
the system to "switch" to that version, which might affect
jobs that you run from the same shell later. To undo this
effect, type use ct8 when you are finished using
ADF and want to return to the production version of MPI (ClusterTools
8).
Parallel ADF jobs that are submitted to Grid Engine use
the MPI parallel environment and queues already defined for HPCVL
users.
Our sample script contains a line that determines the number of
parallel processes to be used by ADF. The Grid Engine will start the MPI
parallel environment (PE) with a given number of slots that you
specify by modifying that line:
#$ -pe dist.pe {number of processes}
where the number of processes requested replaces the expression in {}.
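As a small sketch of this substitution, the line can also be filled in
with sed rather than by hand. The template string below is just the
single directive quoted above, and the value 8 is an arbitrary
assumption; pick the number of processes your job actually needs.

```shell
#!/bin/bash
# The directive as it appears in the sample script, placeholder included.
template='#$ -pe dist.pe {number of processes}'

# Assumed value for illustration only.
nprocs=8

# Replace the whole {...} expression, braces included, with the number.
pe_line=$(printf '%s\n' "$template" | sed "s/{number of processes}/$nprocs/")

echo "$pe_line"
```

The braces must disappear along with the text inside them; a line that
still contains { or } has not been filled in completely.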
Once properly modified, the script can be submitted to the Grid
Engine by typing
qsub batch_file_name
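If you submit from a script of your own, a few checks before the qsub
call can catch simple mistakes. The following is only a sketch; the
file name adf_job.sh is a hypothetical stand-in for your batch file.

```shell
#!/bin/bash
# Hypothetical name of the batch file to submit.
script="adf_job.sh"

if ! command -v qsub >/dev/null 2>&1; then
    # Grid Engine commands are not available in this shell.
    submit_status="qsub not found"
elif [ ! -r "$script" ]; then
    # The batch file is missing or cannot be read.
    submit_status="script not readable"
elif qsub "$script"; then
    submit_status="submitted"
else
    submit_status="qsub failed"
fi
echo "$submit_status"
```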
The advantage of submitting jobs via load-balancing software is that
the software automatically finds the required resources and puts
the job onto a node that has a low load. This helps the
job execute faster. Note that the use of Grid Engine for all production
jobs on HPCVL clusters is mandatory. Production jobs that are
submitted outside of the load-balancing software will be terminated by
the system administrator.
Luckily, there is an easier way to do all this: we
supply a small Perl script called ADFSubmit that can be called directly and
will ask a few basic questions, such as the name of the job to be
submitted and the number of processes to be used in the job. Simply
type
ADFSubmit
and answer the questions. The script expects an ADF input file with
the "file extension" .adf to be present and will do everything else
automatically. This is meant for simple ADF job submissions. More
complex job submissions are better done manually.
Where can I get further help?
ADF is a complex software package and requires some practice
to be used efficiently. In this FAQ we cannot explain its
use in detail. User's Guides for ADF and BAND can be downloaded
here. The software provider SCM operates a very informative website
with lots of information, including examples, manuals,
FAQs, etc. There is also a User Email Group, and we
encourage people who use the software regularly to join.
If you have problems with the Grid Engine, read
our Grid Engine FAQ on that subject, and perhaps consult
the manual for that software, which is accessible as a
PDF file. HPCVL also provides user support in the case
of technical problems. Contact us here; we might be able
to help, or pass you on to someone who can.