This is a short FAQ on using the Finite-Element Analysis (FEA)
code "Abaqus" on HPCVL machines. This software
is only licensed for academic researchers who work at a
university that is already covered by an Abaqus license. The
software is only made available to persons who belong to a
specific Unix group. See details below.
Questions:
What is "Abaqus"?
Where is "Abaqus" located and how do I access it?
How do I use "Abaqus" on the workup node?
How do I set up and submit "Abaqus" batch jobs?
How do I set up and execute an "Abaqus" parallel batch job?
How do I execute an "Abaqus" job on the "mini-cluster"?
Where can I get further help?
Answers:
What is "Abaqus"?
The ABAQUS suite of software for finite element
analysis (FEA) has the ability to solve a wide variety
of simulations. The ABAQUS suite consists of three
core products - ABAQUS/Standard, ABAQUS/Explicit and
ABAQUS/CAE.
ABAQUS/Standard is designed to solve traditional
implicit finite element analyses, such as static,
dynamic, and thermal analyses. It is equipped with a wide
range of contact and nonlinear material
options. ABAQUS/Standard also has optional add-on and
interface products, as well as integration with third
party software.
ABAQUS/Explicit is focused on transient dynamics and
quasi-static analyses using an explicit approach, which is
appropriate in many applications such as drop tests,
crushing, and many manufacturing processes.
ABAQUS/CAE provides a modeling and visualization
environment for ABAQUS analysis products. It offers access
to CAD models, advanced meshing and visualization, and an
exclusive view towards ABAQUS analysis products. ABAQUS/CAE
is used mainly for pre- and post-processing. Note that this
part of the ABAQUS software does not run on HPCVL
machines, but on the user's client machine.
Where is "Abaqus" located and how do I access it?
The present version of Abaqus is 6.5. The programs in the
Abaqus package are located in the directory
/opt/abaqus-6.5
Note that only the ABAQUS/Standard and
ABAQUS/Explicit packages are installed on the HPCVL
clusters. This is because these components are the
"number crunching" elements of Abaqus, whereas the
ABAQUS/CAE component is used interactively for pre-
and post-processing. The latter runs only on PC
systems under Windows or Linux.
It is a good idea to include the directory
/opt/abaqus-6.5/Commands in your PATH or
set an environment variable before using Abaqus:
setenv ABAQUS_HOME /opt/abaqus-6.5/Commands (for csh)
export ABAQUS_HOME=/opt/abaqus-6.5/Commands (for bash)
If the directory was included in the PATH, the
software is called by simply invoking
"abaqus"; if the environment
variable was set, the call would be to
"$ABAQUS_HOME/abaqus".
Alternatively, you can take advantage of the
usepackage facility on our cluster, and simply type
use abaqus65 from the command prompt or
include that line in your .login or .bash_profile
setup file.
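The PATH variant mentioned above is not spelled out. A minimal bash sketch, assuming the installation directory given in this FAQ, could look as follows. The variable name ABAQUS_BIN is only illustrative; csh users would use "set path = ($path /opt/abaqus-6.5/Commands)" instead.

```shell
# Directory holding the Abaqus launcher, as given in this FAQ.
# ABAQUS_BIN is an illustrative name, not something Abaqus itself reads.
ABAQUS_BIN=/opt/abaqus-6.5/Commands

# Append it to PATH only if it is not already there, so that
# re-reading the setup file does not keep growing PATH.
case ":$PATH:" in
  *":$ABAQUS_BIN:"*) ;;
  *) PATH="$PATH:$ABAQUS_BIN" ;;
esac
export PATH
```

Placed in .bash_profile, this should make a plain "abaqus" call work in every new login shell.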
To use Abaqus on the HPCVL machines, you have to
be covered by an academic Abaqus
license outside of HPCVL, i.e. you have to be a
"licensed University User of Abaqus".
It is furthermore required that you read our licensing
agreement and sign a statement. Note that our license does
count as a license for Queen's University. We will confirm your
statement, and you will then be made a member of the
Unix group "abaqus", which enables
you to run the software. Contact us if you are in
doubt about whether you will be able to run Abaqus on our
system. We will also submit your name and affiliation
to Abaqus Inc. to check whether a prior university
license exists.
If you need to use the pre- and post-processing software
ABAQUS/CAE, you have two options: either use your local
university license to install and use the software on your
PC, or contact us and we will make our license available to you
on a single machine with a fixed IP address. Note
that the number of simultaneous CAE sessions supported by our
license is presently limited to 3 (three).
Our Abaqus license is "seat limited" and "process
limited". The licensing scheme utilizes so-called
"tokens". At present, there are 150 tokens available. A
single-process run of Abaqus (Standard or Explicit) uses 5 tokens;
multi-process runs use more, according to the formula:
Tokens = Int (5 * Processes^0.422)
To check how many license tokens are available, you can use the
following command:
/opt/abaqus-6.5/License/lmstat -a -c /opt/abaqus-6.5/License/license.dat
which will tell you how many of the 150 tokens are presently in use.
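To see how the token formula above plays out for different process counts, a small shell sketch can evaluate it with awk. The function name abaqus_tokens is hypothetical (it is not an Abaqus or HPCVL command), and the process counts in the loop are only illustrative.

```shell
# Tokens needed for an Abaqus run with the given number of processes,
# following the formula in this FAQ: Int(5 * Processes^0.422).
# abaqus_tokens is a hypothetical helper name, not an Abaqus command.
abaqus_tokens() {
  awk -v n="$1" 'BEGIN { print int(5 * n ^ 0.422) }'
}

# Print the token cost for a few illustrative process counts.
for n in 1 2 4 8 16 20; do
  echo "$n processes: $(abaqus_tokens "$n") tokens"
done
```

Comparing the result with the number of free tokens reported by lmstat gives a rough idea of whether a run can start right away.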
How do I run "Abaqus" on the workup node?
The following instructions assume that you are a member of the
Unix group "abaqus". They pertain only to the
Standard and Explicit components of the
software. The instructions in this section are only useful if
you want to run a test job of Abaqus on the login node
sfnode0. If you want to run a production job, please refer
to the instructions on how to start an Abaqus batch job
(see next section).
The Abaqus program uses a sophisticated syntax to set up a job
run. Instructions to the program are written into an input file
which is specified when the program is invoked. While an input
file can be written "from scratch", it is also
possible to use the ABAQUS/CAE component to generate such a
file. Both techniques are outside the scope of this FAQ. You can
also have a look at a simple example input
file here. Documentation for Abaqus is extensive, and
available both electronically and in print. There is no
substitute for consulting it.
Assuming that we have an input file called
testsys.inp, we can initiate a run (using the
environment variable ABAQUS_HOME):
export ABAQUS_HOME=/opt/abaqus-6.5/Commands
$ABAQUS_HOME/abaqus job=test001 inp=testsys.inp scratch=/scratch/hpcXXXX
The job= option specifies what the output files are to be
called. They have various different "filename
extensions" but share the name specified here (in our case
test001). With the inp= option, we specify which
input file to use. There are more options, such as cpus=
and mp_mode= for running parallel jobs, but the ones used
above should get a simple serial job running.
Note that the above sequence starts the job in the
background, i.e. after an initial setup phase, your terminal
returns although the job is still running. If you want to avoid
this, you can include the interactive option in the
command line.
Note that by default the Abaqus software uses a directory in
/tmp (which is local to the nodes on which the
software is executing) as scratch space. This
default setting causes some Abaqus jobs to
fail. It must therefore be changed to the
standard scratch space /scratch/hpcXXXX (XXXX being
the numbers in your userid). This can be done in one
of two ways:
- Either include the option "scratch=/scratch/hpcXXXX" (not
including the double quotes) in your command line.
- Or put a file called abaqus_v6.env into your home
directory ($HOME) that contains the line:
scratch="/scratch/hpcXXXX" (including the double quotes).
The second option is probably preferable. Note that the
scratch directory has to be created manually. For instance,
for user hpc1005, type:
mkdir /scratch/hpc1005
chmod 700 /scratch/hpc1005
Also, do not forget to occasionally check the contents of this scratch
directory by typing (sticking with the hpc1005 example):
ls -lt /scratch/hpc1005
and removing any files that might be left over from old
Abaqus runs. This is necessary because Abaqus will not
remove these files if a job was terminated before it ran to
completion.
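Checking the scratch directory by eye gets tedious when it holds many files. As a rough sketch, the following lists files that have not been modified for a while. The 7-day threshold and the function name list_stale_scratch are illustrative assumptions, and nothing is deleted automatically.

```shell
# Scratch directory from the example above; adjust to your own user ID.
SCRATCH_DIR="${SCRATCH_DIR:-/scratch/hpc1005}"
# Age threshold in days; 7 is an arbitrary illustrative choice.
MAX_AGE_DAYS=7

# List (but do not delete) regular files last modified more than the
# given number of days ago. list_stale_scratch is a hypothetical helper.
list_stale_scratch() {
  if [ -d "$1" ]; then
    find "$1" -type f -mtime +"$2" -print
  fi
}

list_stale_scratch "$SCRATCH_DIR" "$MAX_AGE_DAYS"
```

Review the list before removing anything, since a long-running job may still need older files in the same directory.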
The abaqus_v6.env file in the home directory
can also be used to fix a memory problem that sometimes
arises when large jobs are run. If the .msg file of an
Abaqus run shows errors because of insufficient "Standard
Memory", you can raise the limit by including the line
standard_memory="512 mb" (including the quotes), which
sets it to 512 MB. The default is 256 MB.
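Both settings discussed in this section can live in the same abaqus_v6.env file. The sketch below writes such a file from the shell. It assumes that your login name is the hpcXXXX user ID used for the scratch directory, and the variable names ENV_FILE and USER_ID are only illustrative.

```shell
# Target file; Abaqus reads abaqus_v6.env from the home directory.
ENV_FILE="${ENV_FILE:-$HOME/abaqus_v6.env}"
# Assumption: the login name is the hpcXXXX user ID used under /scratch.
USER_ID="$(id -un)"

# Note: this overwrites any existing abaqus_v6.env at that location.
cat > "$ENV_FILE" <<EOF
scratch="/scratch/$USER_ID"
standard_memory="512 mb"
EOF
```

If you already have an abaqus_v6.env file with other settings, edit it by hand instead of overwriting it.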
More about changing the Abaqus environment may be learned
from the "Installation and Licensing Guide" (chapter 4) of
the Abaqus documentation. Please contact us if you need
assistance.
How do I set up and run an Abaqus batch job?
In most cases, you will run Abaqus on the HPCVL machines in
batch mode. Since you have to have access to Abaqus outside of
the HPCVL license, most interactive work can be done elsewhere,
whereas the computationally intensive runs can be executed on
the cluster.
Production jobs are submitted on the systems via
Grid Engine, which is load-balancing
software. For details, read our Grid Engine FAQ.
For an Abaqus batch job, this means that rather
than issuing the above commands directly, you wrap them
into a Grid Engine batch script. For an example of
such a batch script, please
click here. This script needs to be altered by
replacing all the relevant items enclosed in {} with the
right values. The interactive option is
necessary; without it the program will not start
properly. The script can be submitted to
Grid Engine by typing, e.g.
qsub abaqus_serial.sh
Note that Abaqus needs to be set up correctly before
submitting this script, as it inherits the settings
of the submitting shell.
The advantage of submitting jobs via load-balancing
software is that the software will automatically find
the resources required and put the job onto a set of
processors that have a low load. This helps the job
execute faster. Note that the usage of
Grid Engine for all production jobs on the HPCVL
clusters is mandatory. Production jobs with
a running time of more than 3 hours that are submitted
outside of the load-balancing software will be
terminated by the system administrator.
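For orientation, a serial submission script might be generated as sketched below. This is not the HPCVL sample script: the "#$" directives shown are common Grid Engine options chosen as assumptions (-V passes on the environment of the submitting shell), and the items in {} are placeholders in the style used by this FAQ.

```shell
# Write a hypothetical serial submission script. The {placeholders}
# must be replaced by real values before the script is submitted.
cat > abaqus_serial.sh <<'EOF'
#!/bin/bash
# Run the job under bash, in the directory it was submitted from.
#$ -S /bin/bash
#$ -cwd
# Pass on the environment of the submitting shell (Abaqus setup).
#$ -V
#$ -o {job_name}.out
#$ -e {job_name}.err
# The interactive option keeps Abaqus in the foreground of the job.
abaqus job={job_name} inp={input_file} scratch={scratch_dir} interactive
EOF
```

The quoted 'EOF' keeps the shell from expanding anything inside the generated file, so the script is written exactly as shown.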
How do I set up and execute an "Abaqus" parallel batch job?
The Abaqus jobs that you will want to run on the HPCVL
machines are likely to be quite large. To utilize the
parallel structure of a cluster such as ours, Abaqus offers several
options to execute the solver in a parallel environment,
i.e. on several CPUs simultaneously.
HPCVL clusters consist of several interconnected
nodes, each of which is a shared-memory machine with
up to 512 cores or processors. The cluster is able to
execute both distributed-memory parallel
programs (usually employing MPI),
and shared-memory (multi-threaded)
programs. The Abaqus software achieves a certain
degree of parallel scaling using both of these
methods. The parallel portions of Abaqus are
restricted to the solver and operations on the
elements. Here is a list of operations with the
corresponding parallel mode that Abaqus supports:
Element operations - MPI only
Iterative solver - MPI or threads
Direct solver - Threads only
Lanczos solver - Threads only
Note that at present only the shared-memory
parallelism is in use on our clusters. It is
necessary to decide before a parallel Abaqus run which
parallel mode (if any) is to be used (on our clusters,
use "threads"), and how many processes are to be
started.
Production jobs on the HPCVL clusters must
be submitted via the Grid Engine scheduling
software. Since most parallel Abaqus jobs fall into
this category, we have prepared a sample script for
Grid Engine submission. Note that Grid Engine
allocates all processors on a single node.
Processes are not the only resources that need to
be allocated when a parallel Abaqus job is
submitted. Since the Abaqus license is limited, a
scheme must be applied that determines if there are
still enough license tokens available. Therefore a
special parallel environment abaqus.pe is used.
This is expressed in the "#$ -pe" line in the
above sample script. Note that the following
limitations apply to Abaqus production jobs:
- Up to two Abaqus jobs per user can be executed at any time.
- A parallel Abaqus job must use no more than 20
processes.
This is to ensure fair access to the limited number
of tokens and to avoid shared-memory problems that
occur on some nodes if too many processes are used for
a single Abaqus job.
Grid Engine is able to interact with the Abaqus
license manager to check if sufficient licenses are
available for running. This keeps the scheduler
from starting a job because enough processors are
available, only for it to be stopped again because there are
not enough licenses. Grid Engine keeps an internal
counter of available "token slots" which gets updated
frequently. Every time Grid Engine attempts to schedule
an Abaqus job and is kept from doing so because not
enough licenses are available, it will "requeue" the
job. Since each requeue triggers an email if the
email notification line (#$ -m) is present,
this line should be omitted. Instead, Grid Engine was
configured to send a notification at the beginning and
end of job execution whenever the email definition
line (#$ -M) is present. Therefore, if you
want to be notified, include the #$ -M line;
otherwise omit it. Do not include the #$ -m
line because it floods your email with
notifications.
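Putting the points of this section together, a parallel submission script could be generated roughly as follows. This is a sketch and not the HPCVL sample script: the -S, -cwd and -V directives are assumptions, $NSLOTS is the variable Grid Engine sets to the number of slots granted to a parallel job, and the items in {} are placeholders.

```shell
# Write a hypothetical parallel submission script. The {placeholders}
# must be replaced by real values before the script is submitted.
cat > abaqus_parallel.sh <<'EOF'
#!/bin/bash
#$ -S /bin/bash
#$ -cwd
#$ -V
# Request the special Abaqus parallel environment (at most 20 processes).
#$ -pe abaqus.pe {number_of_processes}
# Address for begin/end notification; there is deliberately no -m line.
#$ -M {email_address}
# Use shared-memory (threads) mode with the slots Grid Engine granted.
abaqus job={job_name} inp={input_file} scratch={scratch_dir} \
       cpus=$NSLOTS mp_mode=threads interactive
EOF
```

Taking the process count from $NSLOTS instead of repeating the number keeps the cpus= setting in step with the "#$ -pe" request.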
After altering the script by substituting the items enclosed
in {}, it can be submitted to Grid Engine by
qsub batch_file_name
from sfnode0 (which is the Grid Engine submit host). Note
that the job will appear as a parallel job in
Grid Engine's qstat or qmon. Note also that
submission of a parallel job in this way is only worthwhile
for large systems that use many CPU cycles, since the
overhead for assigning processes, preparing nodes, and
communication between them is
considerable.
How do I execute an Abaqus job on the "mini-cluster"?
HPCVL supplies a small cluster of AMD Opteron-based
Sunfire X4140 machines running on the Linux
platform to support Abaqus versions higher than
6.5. This "mini-cluster" presently consists of 4 nodes
with 8 cores each, running Abaqus 6.7. This cluster is
only to be used if the newer Abaqus version is
necessary. Because of the limited number of cores per
node, only 8-process jobs can be run on it. Also, the
total memory of each node is 32 GB, meaning that jobs
with large memory requirements cannot be submitted to
this cluster.
To submit a job to the mini-cluster,
a modified submission script
must be used. The only difference in the script is that
a special Abaqus queue is used for the mini-cluster. For
this the lines
#$ -clear
#$ -q abaqus.q
have been inserted at the top. It is also important to
use a different "usepackage" setup line to request the
correct version of Abaqus:
use abaqus67
before submitting the script. If the standard Abaqus 6.5
setup is used, jobs submitted to the mini-cluster will
fail. Likewise, standard jobs submitted to the default
clusters will fail if Abaqus was set up to run version
6.7.
Where can I get further help?
Abaqus is a very complex software package, and requires
some practice to be used efficiently. In this FAQ we
cannot explain its use in any detail. Online documentation for
the programs is available on machines where Abaqus is
installed. On the login node, it can be accessed with a
web browser under
/opt/abaqus/Documentation/docs/index.html .
Note that you have to start the browser on the
login node (use firefox) because we do not
have a web server running on the cluster (for security
reasons). A PDF version of the Abaqus
documentation can be found
in /opt/abaqus/Documentation/pdf
If you have problems with Grid Engine, read our
FAQ on that subject, and consult the manual for
that software, which is accessible as a PDF file. HPCVL also
provides user support in the case of technical problems. Contact us here; we might be able
to help, or pass you on to someone who
can.