Usage of the Sun Fire 15K machines
Our main production environment consists of 7 Sun Fire 25K machines.
When you submit jobs, by default this is the set of machines on
which your job will run.
The 25K machines contain UltraSPARC IV+ (US IV+) chips, so code built
with the Studio compilers using native optimization is often tuned
specifically for those chips to get the best performance.
We also have 3 Sun Fire 15K machines with UltraSPARC III (US III) chips
that have been retained from our previous setup. These chips are a bit
slower than the IV+ chips (1.2 GHz vs. 1.5/1.8 GHz).
It is possible that code optimized for the US IV+ chips will not
run properly on the US III chips. A job submitted to Grid Engine
can often run anywhere on the compute grid, so one day your US IV+
code will run perfectly on a 25K, but the next day it could end up
on a 15K and might crash for no obvious reason.
For this reason, the 15Ks are not included in the default production
queues.
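Because a job that lands on the wrong chip may crash without a clear
message, a job script can check the CPU type up front and stop with an
explicit error instead. The following is only a minimal sketch. It assumes
that the Solaris "psrinfo -pv" command reports the chip name as the string
"UltraSPARC-IV+" (check the actual output on your machines), and the
function name cpu_is_us4plus is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if the given text names a US IV+ chip.
# $1 is expected to be text in the style of `psrinfo -pv` output; the exact
# string "UltraSPARC-IV+" is an assumption to verify on your own hosts.
cpu_is_us4plus() {
    case "$1" in
        *"UltraSPARC-IV+"*) return 0 ;;
        *)                  return 1 ;;
    esac
}

# Possible use near the top of a job script (left commented out here):
#   cpu_is_us4plus "$(psrinfo -pv 2>/dev/null)" || {
#       echo "This binary needs a US IV+ host, but is on $(hostname)" >&2
#       exit 1
#   }
```

With a guard like this, a job that ends up on a 15K fails immediately with
a readable message rather than crashing partway through.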
Default Production Queues
All jobs start with a default request for
production.q@@us4plus
@us4plus is a hostgroup that currently contains all the machines
with US IV+ chips (right now, that means the 25Ks).
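To see which machines a hostgroup currently contains, it can be listed with
Grid Engine's qconf command. The following is a minimal sketch, assuming
qconf is on your PATH on the submit host. The helper name show_hostgroup and
the QCONF override variable are hypothetical and only there for illustration.

```shell
#!/bin/sh
# Hypothetical helper: print the definition of a Grid Engine hostgroup.
# Uses the standard "qconf -shgrp <name>" (show hostgroup) call.
# QCONF may be set to point at a different qconf binary (assumption).
show_hostgroup() {
    qconf_cmd=${QCONF:-qconf}
    if command -v "$qconf_cmd" >/dev/null 2>&1; then
        "$qconf_cmd" -shgrp "$1"
    else
        echo "$qconf_cmd not found; run this on a Grid Engine submit host" >&2
        return 1
    fi
}

# Example calls (not run here):
#   show_hostgroup @us4plus
#   show_hostgroup @us3
```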
Submitting jobs to the 15K machines
Grid Engine provides a number of ways to select potential target
machines for jobs. In particular, we have set up a "hostgroup" and
a queue.
The hostgroup @us3 is just a shorthand container name for machines
with US III chips (currently the 15Ks). The production.q queue is
also available on this hostgroup but is not part of the default
request configuration within jobs.
How to add the 15Ks to your job request
Only do this if your code can run on these US III machines!
- (simplest) the job can run on any machine that is part of production.q:
#$ ... other directives ...
#$ -q production.q
- the job can also run somewhere in the us3 hostgroup:
#$ ... other directives ...
#$ -q *@@us3
- force the job to run only in the us3 hostgroup:
#$ -clear
#$ ... other directives ...
#$ -q *@@us3
Notes:
The -clear removes any defaults for subsequent Grid Engine
directives in this job (and only in this job), in particular
the default production queue setup.
There really are 2 "@" symbols in examples #2 and #3.
The "-q" line means:
    *          @           @us3
    any queue  containing  the hostgroup
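Putting the pieces together, the third variant above (run only in the us3
hostgroup) could look like the complete job script below. This is a sketch,
not a site-provided template: the file name us3_job.sh and the "-cwd"
directive are illustrative choices, and the echo line stands in for your
real program.

```shell
#!/bin/sh
# Write a small job script that may run only in the us3 hostgroup.
# The quoted 'EOF' keeps the shell from expanding anything in the body.
cat > us3_job.sh <<'EOF'
#!/bin/sh
#$ -clear
#$ -cwd
#$ -q *@@us3
echo "Running on $(hostname)"
EOF

# Submit it where Grid Engine is available; otherwise just report.
if command -v qsub >/dev/null 2>&1; then
    qsub us3_job.sh
else
    echo "qsub not found; wrote us3_job.sh only"
fi
```

Note that "-clear" comes first, so the default production queue request is
dropped before the "-q" line is applied.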