Abstract:
This is a short introduction to the HPCVL Working Template (HWT), a
toolkit that was developed by Gang Liu to facilitate parallel
programming tasks. The HWT supplies tools that allow the efficient
maintenance of multiple versions of code from a single source file,
as well as automatic relative debugging and simple timing
experiments. The Fortran, C, and C++ programming languages are
supported, as are the MPI and OpenMP parallel-programming
packages. The emphasis is on portability and simplicity. The HWT is
accessible on the HPCVL Sunfire Cluster.
Frequently Asked Questions:
What is the HPCVL Working Template (HWT)?
What functionality does the HWT offer?
How do I use the HWT on the Sunfire cluster?
How do I maintain multiple program
versions using the HWT?
How does automatic relative
debugging work in the HWT?
How do I use the HWT for timing my code?
Can I get the HWT for my computer?
Where can I learn details about the HWT?
It doesn't work. Where can I get help?
Answers:
What is the HPCVL Working Template (HWT)?
The HPCVL Working Template is a toolset
designed to facilitate code development and testing, with a specific
emphasis on parallel programming.
The HWT supports the Fortran, C, and C++ programming
languages. Most of the package is written in these languages and can be
recompiled on any (Unix-based) platform. At present, a version
generated on Solaris 9 using the native Sun compilers is installed
on the HPCVL Sunfire cluster and accessible for its users. Versions
for other platforms may be produced on request.
The HWT consists of a number of script files that can be invoked by
the user, as well as a library that contains essential routines, a
set of modules, and some auxiliary files. It is only available in
pre-compiled form, i.e. no source-code is supplied.
What functionality does the HWT offer?
The HWT was designed to facilitate common programming tasks. It has
a distinct emphasis on parallel programming. The package offers
three distinct functionalities:
- The maintenance of multiple program versions is enabled
by means of pre-processor constructs (see Question 4). Several pre-defined default
versions are available. This allows the user to keep only one
original source code, and automatically generate various versions
from it. The HWT generates the associated version source code in
separate directories, constructs a makefile and compiles the
code. The insertion of the pre-processor constructs, as well as the
definition and control of the versions is left to the user.
- Automatic Relative Debugging allows the user to produce
error reports by running a specific debugging version of the
program (see Question 5). "Automatic"
implies that the HWT performs error
checks without the involvement of the user, thus making it suitable
for the processing of large amounts of data. "Relative" refers
to the usage of user-specified reference data to determine the
correctness of a program. Commonly these data come from another
version of the program. For instance, this offers the possibility
to use a serial version of a program (which is assumed to be
correct) to debug a parallel version.
- Timing facilities are supplied in the HWT in the form
of a few simple routines (see Question
6). This allows the user to automatically generate tables of
CPU times and speedup values. This is especially useful for the
determination of scaling properties of parallel code with respect
to its serial counterpart.
We discuss these main functionalities in more detail below in this
FAQ. Please note that for the proper usage of the HWT it is
necessary to consult the User's
Manual. This FAQ is not meant as a replacement.
How do I use the HWT on the HPCVL
Sunfire Cluster?
The HPCVL Working Template is installed on the Sunfire Cluster in
the /opt/hwt directory. Access is simple.
To run the HWT it might be useful to include the install directory
in the $PATH variable, although this is not required:
setenv PATH $PATH":/opt/hwt" (for csh)
PATH=$PATH":/opt/hwt"; export PATH
(for ksh, sh, bash)
The HWT may be used simply by executing the main script,
/opt/hwt/hwt, or, if the $PATH variable is
set as above, by typing hwt.
Note that this is only the initial run of the HWT; it will create a
number of other script files in the current working
directory. These may then be used for specific
tasks. How this is done will be discussed in more detail in the
following.
The initial run of hwt generates a script called
call.hwt. This script can be used for subsequent calls of
the HWT without the need to spell out the HWT root
directory or to set the $PATH variable. Whenever a source
file has been modified, or other changes have been made, call.hwt will
regenerate all files as necessary and back up the original
ones.
How do I maintain multiple program
versions using the HWT?
The HWT's first function is to process source code. It is often
desirable to produce several versions of a program, and quite
frequently these versions have a large degree of overlap, i.e.,
parts of the code are used by several of them. The HWT allows
the user to keep code for multiple versions in a single
"original" source code file, which is then used to
generate code corresponding to each version separately.
This is done by means of so-called pre-processor
constructs. These commonly have the following form:
(...Code A...)
#ifdef KEYWORD1
(...Code Block 1...)
#else
(...Code Block 2...)
#endif
(...Code B...)
The directives #ifdef, #else, and #endif are not used by a
standard compiler, but by a pre-compiler such as
cpp for C and C++, or fpp for Fortran. The HWT
uses such pre-compilers to generate multiple versions of the
code. In the above example, Code A and Code B would appear in
all code versions. In contrast, Code Block 1 would only appear
in code for which KEYWORD1 is defined, and Code Block 2 only in
code for which it is not. The definition of which keywords
are associated with which version of the code is left to the
user.
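As a minimal sketch of such a construct in C, the fragment below keeps a serial and an OpenMP variant in a single source file. The keyword USE_OPENMP and the function names sum_squares and version_name are hypothetical, chosen only for this illustration; they are not predefined by the HWT.

```c
#include <assert.h>

/* USE_OPENMP is a hypothetical keyword chosen for this sketch; it is not
   predefined by the HWT. Defining it at pre-processing time (for example
   with -DUSE_OPENMP) selects the parallel variant. */
double sum_squares(int n)
{
    double total = 0.0;
    int i;

    assert(n >= 0);

#ifdef USE_OPENMP
    /* Appears only in versions for which USE_OPENMP is defined. */
#pragma omp parallel for reduction(+:total)
#endif
    /* Code outside the construct appears in every version. */
    for (i = 1; i <= n; i++) {
        total += (double)i * (double)i;
    }
    return total;
}

/* Same idea with an #else branch: exactly one of the two return
   statements survives pre-processing. */
const char *version_name(void)
{
#ifdef USE_OPENMP
    return "parallel";
#else
    return "serial";
#endif
}
```

Pre-processing this file once with and once without the keyword defined yields two version sources from one original. Which keywords are defined for which version remains the user's choice.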
The user is asked to specify this and other details in a file
called control. A template of such a file will be placed
into the working directory the first time the HWT is
executed. After the user has edited the control file and
inserted the necessary pre-processor constructs, the HWT
performs the following tasks when executed (by typing
call.hwt):
- It detects all source code in the current working
directory on the basis of file extensions.
- It pre-processes the code using cpp or fpp
to generate the source code corresponding to different
versions.
- It places the different version codes into separate
directories and creates a makefile for each of them.
- It compiles the versions to create object files, modules,
and executables.
Most of these actions are performed automatically. Details are
specified in the control file. For a detailed
description of these features, please consult the manual.
Note that the HWT installation on the HPCVL Sunfire Cluster
was compiled with the Sun Studio 10 compilers, which are the
current native compilers on that system. This installation will
only work properly with the Sun Studio 10 compilers. If
you are still using earlier versions, please consider
migrating, as there will be incompatibilities in the HWT
modules.
How does automatic relative debugging work in the HWT?
The second functionality of the HWT is Automatic
Relative Debugging (ARD). This is a debugging
method in which reference data, often from another version of
the program, are used to determine automatically if
intermediate data in a program execution are correct or
not. While the reference data may be taken from any source, it
is more common that this method is used to compare a
"correct" version of the program with the one that
needs to be debugged. For instance, if the goal is to
parallelize a program, the serial program version may serve as
a reference for the parallel one under development.
To achieve this it is necessary to define which data need to be
compared with which. The HWT solves this problem by means of a
set of library routines that are supplied with the
package. Calls to these routines are used to define a unique Data
Identifier, which is then printed out with the data that
need to be compared. The comparison only takes place if the
data identifier matches between the reference data and the
debugging data. The data identifiers used in the HWT are of a
specific standard form that consists of three components,
namely:
- A Principal Component which is usually a
descriptive string that is indicative of the physical
meaning of the data to be compared.
- An Instance Component that consists of both
strings and integer variables, and indicates the instance
of those data in the code. This is often used to indicate
possibly multiple loop indices.
- An additional Physical Index (integer) is used to
uniquely label simple data inside a data structure. This
enables the comparison of ordinary arrays with distributed
ones in parallel programming.
Since it is often whole arrays that need to be compared, the
Physical Index becomes important if the internal structure of
the array in the debugging version of the program differs
from the one in the reference. This is the case when
non-distributed arrays in a serial program serve as reference
for distributed arrays in a parallel program.
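To illustrate why the Physical Index matters, here is a small sketch in C of how a position in a block-distributed array can be mapped back to the index it has in the serial array. The function name physical_index and the simple block distribution are assumptions made for this example; this is not an HWT routine, and the manual describes the actual calls.

```c
#include <assert.h>

/* Hypothetical helper, not part of the HWT library. Assumes a block
   distribution in which process `rank` owns the contiguous elements
   rank*block_size ... rank*block_size + block_size - 1 of the serial
   array. */
int physical_index(int rank, int block_size, int local_index)
{
    assert(rank >= 0 && block_size > 0);
    assert(local_index >= 0 && local_index < block_size);
    return rank * block_size + local_index;
}
```

Under these assumptions, with a block size of 4, local element 1 on process 2 carries physical index 9, so it would be compared with element 9 of the serial reference array.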
Space does not allow us to explain the usage of routine calls
for the construction of data identifiers here. This is
explained in the manual. Here we
can only outline the basic steps of a debugging run:
- The user inserts calls to debugging routines into the
code. These appear in pre-processor constructs, and
therefore are specific to a debugging version of the
code. This is done in both the version to be debugged and
the reference version. These calls construct data
identifiers and initiate the output of the debugging data
(into intermediate files).
- Both the debugging version and the reference version are
executed after using the HWT to compile them (which links
in the HWT library). This causes the generation of
intermediate files that contain the data used for
debugging.
- A call to the script call.debugger.hwt causes the HWT
to compare the corresponding data items and detect
deviations that exceed a certain tolerance limit
(user-defined). Error reports are generated that contain
information necessary to locate the problems.
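The comparison step can be pictured with the following C sketch. It is not the HWT's implementation: the record layout, the function name count_deviations, and the use of an absolute tolerance are all assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical record: one data item together with its identifier. */
typedef struct {
    const char *identifier;  /* principal and instance components as text */
    int physical_index;      /* position of the item in the serial layout */
    double value;
} debug_record;

/* Count the items whose identifier and physical index match in both sets
   but whose values differ by more than `tolerance` (absolute difference). */
size_t count_deviations(const debug_record *reference, size_t n_reference,
                        const debug_record *debug, size_t n_debug,
                        double tolerance)
{
    size_t i, j, deviations = 0;

    assert(tolerance >= 0.0);

    for (i = 0; i < n_debug; i++) {
        for (j = 0; j < n_reference; j++) {
            double diff;
            if (debug[i].physical_index != reference[j].physical_index ||
                strcmp(debug[i].identifier, reference[j].identifier) != 0) {
                continue;  /* identifiers differ: no comparison */
            }
            diff = debug[i].value - reference[j].value;
            if (diff < 0.0) {
                diff = -diff;
            }
            if (diff > tolerance) {
                deviations++;
            }
        }
    }
    return deviations;
}
```

Items are compared only when their identifiers match, which mirrors the rule described above; everything else about the sketch is a simplification.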
For many cases, the use of 4 or 5 different routines is
sufficient to debug relatively large data structures and locate
errors. Since any comparison is done automatically, no
"manual" comparison of single data values is
required.
How do I use the HWT for timing my code?
The HWT also offers a simple way to introduce timing into
your code to optimize it for execution speed, and - in the
case of parallel programs - to determine its scaling
properties. This is done using calls to two routines that are
part of the HWT library. One of them indicates the beginning
of a timing region in the code, the other its end. The
second routine also serves to label a region to distinguish
it from others.
If a region of code that is contained in a timing region is
executed multiple times, timings are added up, i.e. they are
cumulative. The HWT also keeps track of how many times the
region was executed. Here is how a timing experiment
can be performed:
- The user inserts calls to timing routines into the
code, bracketing the regions that are to be timed. These
calls are usually placed within pre-processor constructs
to restrict them to specific timing versions.
- All versions that are to be included in the timing
experiment are executed. This might include a serial
version, and multiple runs of a parallel version,
differing by the number of processors employed. This will
produce several intermediate files with timing information.
- Executing the script call.cputimer.hwt will retrieve
the timing information and compare it. A report in table
format will be printed, including CPU times and speedups
(usually with respect to a serial or one-processor run).
For multiple-processor runs this information is printed
separately for each processor.
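The idea of cumulative timing regions and speedup values can be sketched in C as follows. The names timing_region, region_begin, region_end, and speedup are hypothetical and are not the HWT routines; the sketch uses the standard clock() function to measure CPU time.

```c
#include <assert.h>
#include <time.h>

/* Hypothetical bookkeeping for one timing region. */
typedef struct {
    const char *label;   /* set by the closing call */
    clock_t started;     /* CPU clock at the most recent region_begin */
    double seconds;      /* accumulated CPU time over all executions */
    long calls;          /* number of times the region was executed */
} timing_region;

/* Mark the beginning of a timing region. */
void region_begin(timing_region *r)
{
    r->started = clock();
}

/* Mark the end of a timing region, label it, and add the elapsed CPU
   time to the running total. */
void region_end(timing_region *r, const char *label)
{
    r->seconds += (double)(clock() - r->started) / CLOCKS_PER_SEC;
    r->calls += 1;
    r->label = label;
}

/* Speedup of a parallel run relative to a serial (or one-processor) run. */
double speedup(double serial_seconds, double parallel_seconds)
{
    assert(parallel_seconds > 0.0);
    return serial_seconds / parallel_seconds;
}
```

A region initialized with `timing_region r = {0};` and bracketed twice by region_begin(&r) and region_end(&r, "loop") ends up with r.calls equal to 2 and the two timings summed in r.seconds. A serial time of 10 seconds against a parallel time of 2.5 seconds gives a speedup of 4.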
Details about the usage of the timing routines may be found
in the HWT manual.
Can I get the HWT for my computer?
Yes. We are offering the HWT in a pre-compiled form
for personal use to both HPCVL members and other academic
individuals, free of charge. However, we require that a license
agreement be signed and that the use be restricted to the
individual who signed it. For details, please contact us.
Where can I learn details about the HWT?
The most important source of information about the HPCVL
Working Template is, of course, the
HWT Manual. The present documented version is 5.3.
HPCVL also provides Workshops on a regular basis, and one
of these is partly devoted to the usage of the HWT. Check
out our web page at http://www.hpcvl.org to
see if one is scheduled in the near future. For HPCVL
members, we are also supplying user support; we are always
glad to answer any questions that you might not find
answered in the manual.
It doesn't work. Where can I get
help?
For HPCVL members, we supply user support; you can call or send
email to a member of our support staff, which includes several
scientific programmers. Keep
in mind that we support many people at any given time, so we cannot
do the coding for you. But we can do our best to help you solve
your problems with our multi-processor machines.