Job Priority Preliminary Test

This wiki page describes how to deploy job priorities using the preliminary YAIM release, in which the job priority configuration is integrated.

Torque + Maui

The preliminary YAIM is available from http://grid-deployment.web.cern.ch/grid-deployment/yaim/devel; please use the latest version. You need YAIM core (glite-yaim-core) and the lcg-CE module (glite-yaim-lcg-ce). Because YAIM version 4 is fully modularized, you also need further YAIM modules to configure your CE correctly, depending on the deployment:

  • CE: glite-yaim-core, glite-yaim-lcg-ce and glite-yaim-torque-utils
  • CE+Torque server: glite-yaim-core, glite-yaim-lcg-ce, glite-yaim-torque-utils and glite-yaim-torque-server
  • CE+Torque server+BDII site: glite-yaim-core, glite-yaim-lcg-ce, glite-yaim-torque-utils, glite-yaim-torque-server and glite-yaim-bdii
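
Before running YAIM it can help to confirm that the modules for your deployment are actually installed. The following is a minimal sketch, assuming an RPM-based node; the scenario labels (ce, ce-torque, ce-torque-bdii) and the function names are hypothetical and not part of YAIM, while the package names are taken from the list above.

```shell
#!/bin/sh
# Print the YAIM packages required for a deployment scenario.
# Scenario labels are hypothetical; package names come from the list above.
yaim_modules_for() {
    case "$1" in
        ce)             echo "glite-yaim-core glite-yaim-lcg-ce glite-yaim-torque-utils" ;;
        ce-torque)      echo "glite-yaim-core glite-yaim-lcg-ce glite-yaim-torque-utils glite-yaim-torque-server" ;;
        ce-torque-bdii) echo "glite-yaim-core glite-yaim-lcg-ce glite-yaim-torque-utils glite-yaim-torque-server glite-yaim-bdii" ;;
        *)              echo "unknown scenario: $1" >&2; return 1 ;;
    esac
}

# Report every package of the scenario that rpm does not find installed.
# Returns 0 when all packages are present, 1 otherwise.
check_yaim_modules() {
    modules=$(yaim_modules_for "$1") || return 1
    missing=0
    for pkg in $modules; do
        if ! rpm -q "$pkg" >/dev/null 2>&1; then
            echo "missing: $pkg"
            missing=1
        fi
    done
    return $missing
}
```

For example, running check_yaim_modules ce-torque on the CE lists any of the four required packages that are not yet installed.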

The other YAIM modules listed above are available from http://grid-deployment.web.cern.ch/grid-deployment/glite/public/pps/R3.1/generic/sl4/i386/RPMS.release/ . On an SL3 LCG CE, the LCG CE meta package still depends on glite-yaim, so glite-yaim must first be uninstalled with "force". You also need the latest dynamic scheduler packages to publish VOview information correctly; they are available from http://www.nikhef.nl/user/templon/lcg-info-dynamic-scheduler-generic-2.2.2-1.noarch.rpm and http://www.nikhef.nl/user/templon/lcg-info-dynamic-scheduler-pbs-2.0.1-1.noarch.rpm .

"VONAME_GROUP_ENABLE" must be defined in site-info.def before running YAIM. For example, to configure the production and lcgadmin (or sgm) roles for atlas, you can define:

ATLAS_GROUP_ENABLE="atlas /VO=atlas/GROUP=/atlas/ROLE=production /VO=atlas/GROUP=/atlas/ROLE=lcgadmin"
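
The pattern in the example above can be generated for other VOs and roles. This is only a sketch: the function names are hypothetical, and it assumes that the variable prefix is the upper-cased VO name and that every role sits directly under the VO's top-level group, as in the ATLAS example. VO names containing characters that are not valid in shell variable names are not handled.

```shell
#!/bin/sh
# Build a VONAME_GROUP_ENABLE value: the VO name followed by one entry per
# role, in the format shown in the ATLAS example above.
# Function names are hypothetical and not part of YAIM.
group_enable_value() {
    vo=$1
    shift
    value=$vo
    for role in "$@"; do
        value="$value /VO=$vo/GROUP=/$vo/ROLE=$role"
    done
    echo "$value"
}

# Print the complete site-info.def line.
# Assumption: the variable prefix is the upper-cased VO name.
group_enable_line() {
    prefix=$(echo "$1" | tr '[:lower:]' '[:upper:]')
    echo "${prefix}_GROUP_ENABLE=\"$(group_enable_value "$@")\""
}

# Reproduces the ATLAS line shown above.
group_enable_line atlas production lcgadmin
```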

The CE can be configured as follows:

For the lcg CE only:

/opt/glite/yaim/bin/yaim -c -s site-info.def -n lcg-CE -n TORQUE_utils

For lcg CE + TORQUE_server:

/opt/glite/yaim/bin/yaim -c -s site-info.def -n lcg-CE -n TORQUE_server -n TORQUE_utils

For lcg CE + TORQUE_server + BDII site:

/opt/glite/yaim/bin/yaim -c -s site-info.def -n lcg-CE -n TORQUE_server -n TORQUE_utils -n BDII_site
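
The three commands above differ only in the list of node types passed with -n. As an illustration, a small wrapper could assemble the command line from a scenario label. This is a sketch that only prints the command and does not run YAIM; the scenario labels and the function name are hypothetical, while the yaim path, options, and node types are copied from the commands above.

```shell
#!/bin/sh
# Print the YAIM command for a scenario; nothing is executed here.
# Scenario labels are hypothetical; the path and node types are taken
# from the commands above.
yaim_command_for() {
    scenario=$1
    siteinfo=${2:-site-info.def}
    case "$scenario" in
        ce)             nodes="lcg-CE TORQUE_utils" ;;
        ce-torque)      nodes="lcg-CE TORQUE_server TORQUE_utils" ;;
        ce-torque-bdii) nodes="lcg-CE TORQUE_server TORQUE_utils BDII_site" ;;
        *)              echo "unknown scenario: $scenario" >&2; return 1 ;;
    esac
    cmd="/opt/glite/yaim/bin/yaim -c -s $siteinfo"
    for node in $nodes; do
        cmd="$cmd -n $node"
    done
    echo "$cmd"
}

# Print the command for a CE that also hosts the TORQUE server.
yaim_command_for ce-torque
```

Printing the command first makes it easy to review the node types before running it by hand.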

Fair-share configuration of the batch system is the responsibility of the site administrators. The following is one example of a Maui configuration for fair sharing:

QUEUETIMEWEIGHT       2
XFACTORWEIGHT        10
XFACTORCAP       100000
RESWEIGHT            10

CREDWEIGHT           30
USERWEIGHT           10
GROUPWEIGHT          10

FSWEIGHT             20
FSUSERWEIGHT          1
FSGROUPWEIGHT        10
FSQOSWEIGHT         100

FSPOLICY             DEDICATEDPES%
FSDEPTH              24
FSINTERVAL           24:00:00
FSDECAY              0.99
FSCAP                100000

USERCFG[DEFAULT]     FSTARGET=7       MAXJOBQUEUED=350

GROUPCFG[atlas]      FSTARGET=10      PRIORITY=100   MAXPROC=15   QDEF=lhcatlas
GROUPCFG[atlasprd]   FSTARGET=40      PRIORITY=200   MAXPROC=15   QDEF=lhcatlas
GROUPCFG[atlassgm]                    PRIORITY=300   MAXPROC=1   QDEF=lhcatlas
QOSCFG[lhcatlas]     FSTARGET=50                     MAXPROC=15

The lines above should be appended to the end of /var/spool/maui.cfg on the TORQUE server. In this example, the ATLAS share of the entire farm is 50%. General ATLAS users get 20% of that share, or 10% of the whole farm, and ATLAS production gets 80% of the ATLAS share, or 40% of the farm. ATLAS can use at most 15 CPUs. lcgadmin (sgm) can use at most 1 CPU, but it has the highest priority. If you also configure the TORQUE server with YAIM, you must append the fair-share configuration again after every YAIM run, because YAIM overwrites it.
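
Because the fair-share lines have to be re-appended after each YAIM run, it may be convenient to keep them in a separate file and append that file with a small helper. The following is a sketch under stated assumptions: the marker comment, the function name, and the fragment file are hypothetical, and the default target path is the one given above. The helper skips the append when the marker is already present, so running it twice does not duplicate the block.

```shell
#!/bin/sh
# Append a fair-share fragment to maui.cfg unless it is already present.
# The marker text and the fragment file are hypothetical choices.
MARKER="# BEGIN site fair-share configuration"

append_fairshare() {
    fragment=$1                          # file holding the fair-share lines
    target=${2:-/var/spool/maui.cfg}     # path as given in the text above
    if grep -qF -- "$MARKER" "$target" 2>/dev/null; then
        echo "fair-share block already present in $target"
        return 0
    fi
    {
        echo
        echo "$MARKER"
        cat "$fragment"
    } >> "$target"
    echo "fair-share block appended to $target"
}
```

A typical call after a YAIM run would be append_fairshare /root/maui-fairshare.cfg, where the fragment path is only an example.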

Please note that YAIM currently works for both SL3 and SL4 LCG CEs.

PBSpro

In principle, the information system can still be configured with the same YAIM as for "Torque + Maui", but the scheduler configuration must be changed; please refer to the PBSpro scheduler documentation.

LSF

-- DiQing - 22 Oct 2007

Topic revision: r4 - 2007-11-12 - DiQing
 