COCOS Cluster PBS primer
DESCRIPTION
To add a job to a queue, one of two things is needed: either read the
qsub manual page and submit the job from the command line, or write a
script that does it - all the rest of this manual is about the latter
solution. Below is an example script adding a program called program to
the queue.
#!/bin/bash
#PBS -N program_name
#PBS -l cput=10:20:30
#PBS -q queue_name
#PBS -m abe
cd $HOME/path_to_the_program/
how_to_run_your_program
Instead of program_name provide a name for this application. Make it
somewhat informative, at least for yourself, as this name will appear
in the queue, but beware that it cannot contain spaces or other special
characters :) Instead of queue_name provide the name of the queue to
which the job should be added. Setting this name properly is very
important, as different queues have different properties, mainly execution
time limits. For example, in the normal queue a job intended to run for
20h will not finish its execution, as there is a 12h execution time limit
for that queue. Full information about all configured queues can be
obtained by running qstat -q, or the slightly less formal queues command.
The queue time limit is the CPU Time parameter in the table shown by
qstat -q. Another important parameter is -l cput=h:m:s. It sets our
estimated maximal execution time with the following representation:
h - time in hours, m - minutes, s - seconds. Each field should appear,
even if it is 0: a bare 2 means just 2 seconds and nothing else, while
5:0 means 5 minutes 0 seconds, which still might not be the intended
thing. If this parameter is omitted altogether, some system default
applies, which will finish the job very quickly indeed.
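To sanity-check a time specification before submitting, one can convert it the way PBS reads it, from the right (seconds, then minutes, then hours). A minimal sketch, assuming only a POSIX shell; hms_to_seconds is a hypothetical helper, not a PBS tool:

```shell
#!/bin/sh
# hms_to_seconds: interpret a cput-style spec the way PBS does,
# from the right: "2" is 2 seconds, "5:0" is 5 minutes 0 seconds.
hms_to_seconds() {
    total=0
    old_ifs=$IFS
    IFS=:
    for field in $1; do
        total=$(( total * 60 + field ))
    done
    IFS=$old_ifs
    echo $total
}

hms_to_seconds 10:20:30   # 10 h 20 min 30 s = 37230 seconds
hms_to_seconds 5:0        # 300 seconds
hms_to_seconds 2          # just 2 seconds
```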
After setting all of the PBS starting parameters, the server needs to
learn how to run the application itself (this will later be passed to
the executing node). In our small example we cd to the application
directory with cd $HOME/path_to_the_program/, where instead of
path_to_the_program a path to our application is given, relative to the
$HOME directory. The next line, how_to_run_your_program, runs the
application itself (basically ./application_name, or some variation of
it with I/O redirections and the like, will do). Of course we can also
run the application from somewhere else (useful when running the same
application for different data sets), but this is not the place to
learn such things :)
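For the curious, one common pattern is sketched below; PBS_O_WORKDIR is a standard variable that PBS sets to the directory qsub was run from, while the application and file names are hypothetical:

```shell
# Change to the directory from which the job was submitted, so the same
# script works for many data sets - just run qsub from each data directory.
cd $PBS_O_WORKDIR
./application_name < input.dat > output.dat
```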
The additional parameter set in our example, -m abe, is responsible for
sending an e-mail when the job begins, ends, or aborts with an error.
More on all such topics in man qsub.
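The letters after -m can be combined as needed; a sketch of the variants (the -M address is a hypothetical example - by default mail goes to the submitting user):

```shell
#PBS -m abe                  # mail on abort (a), begin (b) and end (e)
#PBS -m n                    # no mail at all
#PBS -M someone@example.com  # send the mail to this address instead
```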
EXAMPLE
User ukasz wants to enqueue a program located in /home/ukasz/temp; the
program name is p_npt. To do so, the user creates a script run_it.sh in
the same directory. In the console she/he writes:
ukasz@shiva> cd /home/ukasz/temp
ukasz@shiva> touch run_it.sh
ukasz@shiva> mcedit run_it.sh
Now an mc editor window opens (of course one can use whatever editor
he/she likes :) The content entered into this file is as follows:
#!/bin/sh
#PBS -N p_npt
#PBS -q long
#PBS -l cput=40:00:00
#PBS -m abe
cd $HOME/temp
./p_npt
This is basically all that is needed; do not forget to save the file,
though (in mcedit that is the F2 key :) Now the job can be added to the
queue. It can be done with a console command:
ukasz@shiva> qsub run_it.sh
And that’s all, folks. Now we can happily check whether it is already
in the queue with another console command:
ukasz@shiva> qstat
There we look for a job with the name given in our script. If it is
there, then all went well; if not, something went wrong and we need to
read this manual again, looking for clues. If we cannot manage by
ourselves, even with man qsub, we can ask for help at
root@shiva.if.uj.edu.pl - given time and good will (not to mention a
detailed description of the problem and/or directions on where to look
for all of it) one might even get the help :)
Of course one can always delete a job from a queue (this does not touch
any user files, only the system information about the job) using
qdel job_no, where job_no is the job number obtained from the queue
(with the qstat command).
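Putting submission, status check and deletion together - note that qsub prints the job identifier on submission; the one shown here is hypothetical:

```shell
ukasz@shiva> qsub run_it.sh
1234.shiva
ukasz@shiva> qstat 1234.shiva
ukasz@shiva> qdel 1234.shiva
```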
MPI
Although LAM is installed, its usage is STRONGLY discouraged for the
following reasons:
- It does not work well with our PBS/torque, due to its own interprocess
communication instead of using the PBS-native one. On top of that, it is
believed to be less computationally effective than its replacement,
Open-MPI.
- Usage of MPI with more cores/threads/processes than fit in a single
node (4 at the moment in most of our nodes) is possible but difficult,
as one needs to be able to run a job (or, more precisely, to log in via
ssh without giving a password) from one node on another - lamboot and
the rest of it. At the moment this is impossible (or is it?) without my
(root) help, so when planning such jobs please come to me for help once
you have checked that you cannot do it by yourself :) But be warned that
you should not be using it at all.
Open-MPI (the newest version) is installed (both development and
run-time parts) in the /usr/local tree. It uses the native PBS
interprocess communication mechanisms and, what is more, it is reported
by some to be more computationally effective than LAM. To use it, add
/usr/local/include to your header search path and /usr/local/lib to the
library path, and either use orterun instead of mpiexec/mpirun or
otherwise make sure /usr/local/bin is at the front of your PATH.
Open-MPI will auto-magically learn how many processes/threads it can
run from the information it gets from PBS :)
We have defined separate queues for MPI jobs - such jobs will not RUN
outside those queues, even if they are not rejected on the spot. For
all real MPI problems our queues, even those dedicated to MPI, have a
hard limit of one node per job, which means that no job can run more
threads than there are cores on a single motherboard. This also favours
using the OpenMP compiler extension over real Open-MPI. If, for some
reason or other, this is not enough, try negotiating - but NOT with the
cluster admin; go to the advisory board instead (PB will be a good
starting man) and better be prepared for a hard discussion.
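A sketch of a complete Open-MPI job script under the conventions above; the queue name mpi, the nodes/ppn line, the directory and the program name are all assumptions - check qstat -q for the real MPI queue names:

```shell
#!/bin/sh
#PBS -N mpi_job              # job name, as before
#PBS -q mpi                  # an MPI-dedicated queue (name is an assumption)
#PBS -l nodes=1:ppn=4        # one node, all 4 cores (the per-job hard limit)
#PBS -l cput=10:00:00
cd $HOME/mpi_project
export PATH=/usr/local/bin:$PATH   # pick up the /usr/local Open-MPI first
orterun ./mpi_prog                 # process count is taken from PBS automatically
```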
SOME LESS GENERAL REMARKS
A normal, well-behaved user should NOT ask for more cluster resources
than the job needs (except maybe cput, which will not be used anyway if
not needed :) but sometimes (especially when the cluster is not full or
overcrowded) this attitude might be reconsidered, profitably for all.
Such circumstances are for example:
- usage of the full machine memory (four-core nodes have 8 gigs of RAM)
- one does not need to suffer, and make others suffer the same, with
jobs swapping, no? Exclusively reserving the whole machine for oneself
will not be punishable in such a case, even with the cluster running
part time only :)
- also, running Open-MPI with 4 cores occupied by 4 threads calls for
reserving a whole single machine for that job only :)
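In PBS terms, reserving a whole four-core machine for one job can be sketched as follows; the mem syntax is an assumption, see man pbs_resources for what our server actually accepts:

```shell
#PBS -l nodes=1:ppn=4   # claim all four cores, i.e. the whole node
#PBS -l mem=8gb         # ask for the full machine memory
```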
Try to suggest some other sensible reasons and I will indicate them
here as well :)
SEE ALSO
qsub - everything that can be in the script submitted to PBS; some of
it may be inactive, see the man page for details
qdel - how to remove the job from a queue
qstat - how to view server/queue statistics and other PBS info
qmgr - PBS configuration reader, queues etc.
pestat - a script external to PBS showing similar info in a different
way, basically grouped by hosts/nodes rather than by other parameters
http://shiva.if.uj.edu.pl - accessible only from allowed hosts; shows
node and job statistics on a web page