Parameter Passing

Passing job parameters from the JDL, such as requirements on memory or available CPU time, benefits both users and sites. See the presentation by Douglas McNab at the GDB of 2 December 2009.

These requirements have to pass through several systems, such as the WMS and the CE, before they reach the batch system. Below you will find the specifics for CREAM.

CREAM passes JDL requirements to the batch system through the CERequirements field. A batch-system-specific filter script then translates selected requirements into batch system submission statements.
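As an illustration, a job description carrying such requirements might look like the sketch below. The Glue attribute names, the thresholds and the executable are illustrative assumptions, not values prescribed by this page; the shell function only prints the JDL text.

```shell
#!/bin/sh
# Print a minimal JDL with a CERequirements expression.
# Assumption: the attribute names and thresholds are examples only
# (memory in MB, wall clock time in minutes, as in Glue 1.3).
example_jdl() {
    cat <<'EOF'
[
  Executable = "/bin/hostname";
  CERequirements = "other.GlueHostMainMemoryRAMSize >= 2048 && other.GlueCEPolicyMaxWallClockTime >= 120";
]
EOF
}
```

Running `example_jdl > job.jdl` would write the text to a file (job.jdl is a hypothetical name) that could then be adapted for submission.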

An example filter script (or BLAH hook) has been developed for Torque and will be included by default in an upcoming release of the TORQUE_utils meta-package. Currently (13-Jan-2010) only Torque and LSF support this hooking mechanism; Condor and SGE should add it as well (the developers have been contacted).

Parameters to pass along

The following parameters from the Glue 1.3 schema are deemed both generic and useful. With the arrival of Glue 2.0, the names will change somewhat.

| Glue parameter | Description | Unit | Torque field | Torque unit |
| MainMemoryRAMSize | The amount of RAM | MB | mem | MB |
| MaxWallClockTime | The default maximum wall clock time allowed to each job by the batch system if no limit is requested. Once this time has expired the job will most likely be killed or removed from the queue | minutes | walltime | seconds |
| MaxObtainableWallClockTime | The maximum obtainable wall clock time that can be granted to the job upon user request | minutes | walltime | seconds |
| MaxCPUTime | The default maximum CPU time allowed to each job by the batch system | minutes | cput | seconds |
| MaxObtainableCPUTime | The maximum obtainable CPU time that can be granted to the job upon user request | minutes | cput | seconds |
| SMPGranularity | A special parameter (not actually a Glue parameter) that indicates how many processes per node an MPI job wants | # | ppn | # |
| WholeNodes | Indicates that the job wants exclusive access to the node(s) it is scheduled on | boolean | ? | |
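The unit conversions in the table can be sketched as a small shell function of the kind a Torque filter script might contain. This is a minimal sketch, not the actual pbs_local_submit_attributes.sh: the variable names REQ_MEMORY_MB, REQ_WALLTIME_MIN and REQ_CPUTIME_MIN are hypothetical stand-ins for however the requirements reach the script.

```shell
#!/bin/sh
# Sketch: turn selected requirements into Torque "#PBS -l" directives.
# Assumption: the REQ_* variable names are hypothetical; the real hook
# may receive the requirements under different names.
torque_directives() {
    # Memory: MB in Glue, MB in Torque, so no conversion is needed.
    if [ -n "${REQ_MEMORY_MB:-}" ]; then
        echo "#PBS -l mem=${REQ_MEMORY_MB}mb"
    fi
    # Wall clock time: minutes in Glue, seconds in Torque.
    if [ -n "${REQ_WALLTIME_MIN:-}" ]; then
        echo "#PBS -l walltime=$((REQ_WALLTIME_MIN * 60))"
    fi
    # CPU time: minutes in Glue, seconds in Torque.
    if [ -n "${REQ_CPUTIME_MIN:-}" ]; then
        echo "#PBS -l cput=$((REQ_CPUTIME_MIN * 60))"
    fi
}
```

SMPGranularity and WholeNodes are left out of the sketch: ppn has to be combined with a node count in the Torque nodes specification, and the Torque field for WholeNodes is still open in the table.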

The SMPGranularity and WholeNodes parameters come from the MPI working group recommendations; see also bug #58968 and bug #58878.

Deployment

For Torque, the pbs_local_submit_attributes.sh script will be packaged in an RPM and included in the next TORQUE_utils patch.

The same should be done for the meta-packages of the other batch systems, such as LSF_utils.

A new YAIM variable, e.g. INCLUDE_BLAH_HOOK=yes/no, will toggle the installation of a symbolic link at /opt/glite/bin/pbs_local_submit_attributes.sh, which will be picked up automatically by pbs_submit.sh.
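The toggle could be sketched as the shell function below. This is an assumption about how such a configuration step might behave, not the YAIM implementation: the function name is hypothetical, the location of the packaged script is not given on this page, and removing the link on "no" is a design choice made for the sketch.

```shell
#!/bin/sh
# Sketch: install or remove the hook symlink depending on a yes/no toggle.
# Assumptions: install_blah_hook is a hypothetical helper; hook_source
# stands for wherever the RPM places the packaged script.
install_blah_hook() {
    toggle=$1        # value of INCLUDE_BLAH_HOOK: yes or no
    hook_source=$2   # packaged filter script
    link_path=$3     # where the submit script looks for the hook
    case "$toggle" in
        yes)
            ln -sf "$hook_source" "$link_path"
            ;;
        no)
            # Only remove a symlink, never a regular file.
            if [ -L "$link_path" ]; then
                rm -f "$link_path"
            fi
            ;;
        *)
            echo "INCLUDE_BLAH_HOOK must be yes or no" >&2
            return 1
            ;;
    esac
}
```

A configuration step would call it as `install_blah_hook "$INCLUDE_BLAH_HOOK" <packaged script> <link path>`, with the link path being the one named above.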

A point of discussion is whether this variable should default to yes or no.

The same needs to be done for the other LRMSs (LSF, Condor and SGE). Savannah feature requests have been submitted or will be submitted shortly.

Open issues

For direct submission to CREAM the above two points are no issue.

  • Condor has no hook mechanism (bug #57307, fixed with CREAM 1.6) and no condor_local_submit_attributes.sh (bug #61359, ready for test).
  • SGE has no hook mechanism (bug #61355) and no sge_local_submit_attributes.sh (bug #61353).
  • LSF has no lsf_local_submit_attributes.sh (bug #61358).

These should be relatively easy to add, as the examples in pbs_submit.sh and lsf_submit.sh show.

  • With Glue 2.0 the list of parameters to pass is going to change; it is not clear at the moment how to treat this.

-- DVanDok - 13-Jan-2010

Topic revision: r4 - 2010-03-23 - unknown
 