Job Scheduler for Panda

Introduction

The PandaJobScheduler is a component of PanDA, the OSG executor being developed to replace CaponE in the ATLAS production system ProdSys. Please refer to the PanDA or ProdSys wiki pages for further details about these projects. This document describes the status and progress of the job scheduler for the new system.

An explanation of the words and abbreviations used on this page can be found here.

News

Releases

Goals

  • Send Panda pilots (PandaPilot) to local or grid resources
  • The resource (cluster) should always have its queue filled with pilot jobs
  • The pilot jobs should return an exit code; based on these codes, the scheduler will slow down submission if too many problems are found at a site and notify the appropriate people by automatic email

Implementation

  1. use Condor-G to submit pilot jobs
  2. alternatively, use a local script to submit pilot jobs to a batch queue
  3. pilot jobs will post to and get from an HTTPS service provided by PandaJobDispatcher
  4. pilot jobs will run scripts sent by PandaJobDispatcher in a subprocess
  5. pilot jobs will return an error code to the job scheduler
  6. the job scheduler will throttle submission if the failure rate is too high (see the sketch after this list)
  7. the job scheduler will enforce a maximum limit on concurrent queued jobs at a site
  8. the job scheduler will send notifications by email when site problems are found
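
Items 6-8 could be implemented along the lines of the following minimal sketch. The class name, thresholds, and notification stub are illustrative assumptions, not the actual scheduler code.

    # Hypothetical sketch of the throttling and queue-limit logic in items 6-8;
    # class name, thresholds, and the notification stub are assumptions only.
    import time

    MAX_QUEUED = 50          # assumed per-site cap on concurrently queued pilots
    FAILURE_THRESHOLD = 0.5  # assumed failure rate above which submission pauses
    BACKOFF_SECONDS = 1800   # assumed pause before trying a problematic site again

    class SiteState:
        def __init__(self, name):
            self.name = name
            self.queued = 0
            self.recent_exit_codes = []
            self.paused_until = 0

        def record_exit_code(self, code):
            # Called when a pilot returns; keep only the most recent results.
            self.recent_exit_codes = (self.recent_exit_codes + [code])[-20:]

        def failure_rate(self):
            if not self.recent_exit_codes:
                return 0.0
            failures = sum(1 for c in self.recent_exit_codes if c != 0)
            return float(failures) / len(self.recent_exit_codes)

        def may_submit(self):
            if time.time() < self.paused_until:
                return False                      # still backing off after failures
            if self.queued >= MAX_QUEUED:
                return False                      # item 7: per-site queue limit
            if self.failure_rate() > FAILURE_THRESHOLD:
                self.paused_until = time.time() + BACKOFF_SECONDS
                notify_admins(self.name)          # item 8: automatic email (stubbed)
                return False                      # item 6: throttle on high failure rate
            return True

    def notify_admins(site_name):
        # Placeholder: the real scheduler would send an automatic email here.
        print("WARNING: high pilot failure rate at %s" % site_name)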

Design

Current design ideas.

Sending out pilots

Pilots are sent out by a program called pusher, part of the PandaJobScheduler. The underlying mechanism used to send the pilots is Condor-G. Installation and usage instructions for the current release (v0.3.1) are available at PandaJobSchedulerV000301.

Pusher sends pilots to all the available CEs. Information about the CEs is stored in a configuration file called siteinfo.py and loaded into proxy classes defined in Site.py. In Scheduler.py the user can choose the scheduling algorithm (WRR with static or load-based weights, periodic scheduling) and its parameters. Pusher supports different kinds of pilots, from unknown scripts to better-documented pilots that can have a proxy class in Job.py, allowing more complex interactions and constraint verification. A weighted round robin sketch is shown below.
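
As an illustration of the WRR option mentioned above, here is a minimal sketch of weighted round robin site selection. The site names, weights, and the submit_pilot stub are placeholders and do not reproduce the actual Scheduler.py code.

    # Minimal weighted round robin (WRR) sketch; site names, weights and the
    # submit_pilot() stub are placeholders, not the actual Scheduler.py code.
    import itertools

    # Static weights: a site with weight 3 receives three pilots per WRR cycle.
    SITES = {
        "CE_A": 3,
        "CE_B": 2,
        "CE_C": 1,
    }

    def wrr_cycle(sites):
        """Yield site names in proportion to their weights, repeating forever."""
        expanded = [name for name, weight in sites.items() for _ in range(weight)]
        return itertools.cycle(expanded)

    def submit_pilot(site):
        # Placeholder: the real pusher builds and submits a Condor-G job here.
        print("submitting pilot to %s" % site)

    if __name__ == "__main__":
        cycle = wrr_cycle(SITES)
        for _ in range(6):       # one full cycle with the weights above (3+2+1)
            submit_pilot(next(cycle))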

Steps for setting up the ATLAS environment before running ATLAS jobs:

This section is obsolete. The basic principles are the same, but the new pilot, pilot2, is more complex and is described in PandaPilot.

1) copy input files to the workdir;

2) create PoolFileCatalog.xml with the PFN-GUID pairs of the input files (the above two steps are expected to be handled by the dq2 tool later); a catalog-writing sketch is shown after step 6 below

3) set up some environment variables:

RELEASE SITEROOT T_RELEASE T_DISTREL WORKDIR

4) check the T_MYSQLSERVERNAME environment variable for a local MySQL replica on the site and change the my.cnf file accordingly, so ATLAS jobs running on this site will use this MySQL server. This is for sites whose worker nodes don't have access to the CERN DB. (The pilot job does not perform this step yet.)

5) source setup files

source /usatlas/projects/OSG/atlas_app/atlas_rel/9.0.4/setup.sh; source /usatlas/projects/OSG/atlas_app/atlas_rel/9.0.4/dist/9.0.4/AtlasRelease/*/cmt/setup.sh -tag_add=DC2;

6) run executable:

$APP/atlas_app/atlas_rel/kitval/KitValidation/JobTransforms/JobTransforms-09-00-04-05/share/rome.g4sim.standard.trf rome.004100.evgen.T1_McAtNLO_top._00143.pool.root rome.004100.simul.T1_McAtNLO_top._14268.pool.root.1 1 2 14268
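
As a concrete illustration of step 2 above, the following sketch writes a minimal PoolFileCatalog.xml from PFN-GUID pairs. The catalog layout shown here should be checked against the POOL documentation for the release in use, and the example values are placeholders.

    # Hypothetical sketch for step 2: write a minimal PoolFileCatalog.xml from
    # the PFN-GUID pairs of the input files. Check the exact layout against the
    # POOL documentation for the ATLAS release in use; values are placeholders.

    CATALOG_HEADER = (
        '<?xml version="1.0" encoding="UTF-8" standalone="no" ?>\n'
        '<!DOCTYPE POOLFILECATALOG SYSTEM "InMemory">\n'
        '<POOLFILECATALOG>\n'
    )

    FILE_ENTRY = (
        '  <File ID="%(guid)s">\n'
        '    <physical>\n'
        '      <pfn filetype="ROOT_All" name="%(pfn)s"/>\n'
        '    </physical>\n'
        '    <logical>\n'
        '      <lfn name="%(lfn)s"/>\n'
        '    </logical>\n'
        '  </File>\n'
    )

    def write_catalog(pairs, path="PoolFileCatalog.xml"):
        """pairs: list of (pfn, guid) tuples for the job's input files."""
        with open(path, "w") as out:
            out.write(CATALOG_HEADER)
            for pfn, guid in pairs:
                lfn = pfn.split("/")[-1]
                out.write(FILE_ENTRY % {"guid": guid, "pfn": pfn, "lfn": lfn})
            out.write("</POOLFILECATALOG>\n")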

Right now, I check the "release=Atlas-9.0.4" tag in the message passed from the dispatcher; if it does not contain "Atlas", I assume this is not an ATLAS job and just run whatever is passed in the jobPars string. We may want to add another tag to the schema describing whether the job needs the ATLAS environment or not. Of course, we won't need any of this if we can make the messages completely neutral. A minimal sketch of this check is shown below.
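
The check could look like the following minimal sketch, assuming the job definition has already been parsed into a dict (see the parsing sketch further down this page); the helper names are illustrative only.

    # Hypothetical sketch of the release-tag check described above; jobspec is
    # assumed to be a dict built from the dispatcher message, and the helper
    # names are illustrative only.
    import subprocess

    def run_payload(jobspec):
        release = jobspec.get("swRelease", "")
        if "Atlas" in release:
            # ATLAS job: set up the release environment, then run the transform.
            setup_atlas_environment(release)              # see steps 3-5 above
            cmd = [jobspec["trfName"]] + jobspec["jobPars"].split()
        else:
            # Not an ATLAS job: just run whatever is passed in jobPars.
            cmd = jobspec["jobPars"].split()
        return subprocess.call(cmd)

    def setup_atlas_environment(release):
        # Placeholder: the real pilot sources the release setup scripts (step 5).
        print("setting up ATLAS environment for release %s" % release)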

Design Q and A

This section contains questions and answers about the PandaJobScheduler that helped the project development. Most of the questions are now obsolete and this section is kept mainly for historical reasons. A better description of the pilots is now available at PandaPilot. Anyway, feel free to add answers and comments. Thank you.
  • Q: The term "pandaPilotJob" has been used freely in the PANDA twiki pages, but every reference I've seen has a question mark "?" after it. Where do I find this description? - Jerry - 06-sep-2005
  • A: Here is some information supplied via email. I will extend those twiki pages for more information on the pilot job. - Xin - 02-sep-2005

After more discussion with Wensheng and Yuri, here is another version of the "job definition" message I would expect jobdispatcher to send to the pilot job:
            PandaID=214234
            swRelease=9.0.4
            trfName=JobTransforms-09-00-04-05/share/rome.g4sim.standard.trf
            inFiles=inf1,inf2,inf3
            outFiles=outf1,outf2
            jobPars=inf1 inf2 inf3 outf1 outf2 outf3 50 1550 9432
where:
    1. inFiles and outFiles are LFNs of the input and output files for one job;
    2. jobPars is the exact argument list to invoke the .trf executable. Yuri will add jobPars into the schema, so jobdispatcher can retrieve it.
    3. the only item above that is not directly available from the DB schema is outFiles. The DB schema gives outputJobDBlock; somehow the brokerage/dispatcher/DDM will need to figure out the outFiles from outputJobDBlock. We should add outFiles to the DB schema, exactly the same way as the input files. After receiving the message, the pilot job will call the DDM client to move the inFiles from the local DDM area to the workdir of the job (most likely on the local worker node). After the job is done, the DDM client will be called again to move the outFiles back to the local DDM area. This assumes we have DDM client tools that can hide all the details of the site-specific SE information and take command line arguments like:
             ddm-get-file <LFN> /localnode/tmp/PandaID/<LFN>
             ddm-put-file /localnode/tmp/PandaID/<LFN> <LFN>
The format may be dsn:lfn. We need to discuss this with Miguel; in principle the functionality is already there in DDM, and we will have to check with him about implementing the specific version needed by Panda. Then the pilot job doesn't need to know the location of the local DDM area (SRM SE or NFS); it just uses the LFN to contact the local DDM "LRC". We will have to install the DDM clients on $APP so that they are accessible from the worker nodes. A sketch of how the pilot could parse the message and stage files is shown below.
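
The following minimal sketch parses the job-definition message above and stages files with the proposed ddm-get-file / ddm-put-file tools. The message fields come from this page; the function names, and the existence of those command-line tools, are assumptions that are still under discussion.

    # Hypothetical sketch: parse the job-definition message and stage files with
    # the proposed ddm-get-file / ddm-put-file tools. The tools themselves are
    # still under discussion; function names are illustrative only.
    import subprocess

    def parse_job_message(message):
        """Turn the 'key=value' lines of the dispatcher message into a dict."""
        job = {}
        for line in message.strip().splitlines():
            key, _, value = line.partition("=")
            job[key.strip()] = value.strip()
        return job

    def stage_in(job, workdir):
        for lfn in job["inFiles"].split(","):
            subprocess.check_call(["ddm-get-file", lfn, "%s/%s" % (workdir, lfn)])

    def stage_out(job, workdir):
        for lfn in job["outFiles"].split(","):
            subprocess.check_call(["ddm-put-file", "%s/%s" % (workdir, lfn), lfn])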
  • Q: What is a PilotJob supposed to do? - Jerry - 06-sep-2005
  • A: It will get site information such as $APP and the workdir from the jobscheduler when it is launched, then check the CPU/memory/disk space on the worker node, send this information to the jobdispatcher, and get a new job definition from the jobdispatcher:
        StatusCode=0
        PandaID=903
        swRelease=Atlas-10.0.1
        trfName=JobTransforms-10.0.1.5/share/rome.1001.reco.MuonDigit.OverrideRE.trf
        inFiles=inf1,inf2,inf3
        outFiles=outf1,outf2
        jobPars=inf1 inf2 inf3 outf1,outf2 50 1023 233
Note: jobPars is the exact argument list for the trf to run; it is a string, converted from the jobDef XML by the task buffer and put into the DB. The jobdispatcher will retrieve it from the DB and pass it to the pilot job, and the pilot job then does a "run". It will then probably do the following steps:
    1. transfer the input files to the local workdir
    2. fork a child process to run the ATLAS job
    3. transfer the output files back to the local DDM area

It will then send a message to the jobdispatcher reporting the job status. During job execution, it will periodically update the job status to the jobdispatcher or the heartbeat monitor. A minimal sketch of this flow is shown below. - Xin - 06-sep-2005
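
A minimal sketch of this flow, reusing the parse_job_message / stage_in / stage_out helpers from the earlier sketch, could look like the following; the dispatcher interaction is stubbed and all names are illustrative only.

    # Minimal sketch of the pilot flow described in this answer, reusing the
    # parse_job_message / stage_in / stage_out helpers from the sketch above.
    # The dispatcher interaction is stubbed; all names are illustrative only.
    import os
    import subprocess

    def collect_node_info():
        # Gather worker-node information (disk space here; CPU/memory analogous).
        stat = os.statvfs(os.getcwd())
        return {"workdir": os.getcwd(),
                "diskFree": stat.f_bavail * stat.f_frsize}

    def get_job_from_dispatcher(info):
        # Placeholder: the real pilot posts the node info to the jobdispatcher
        # over HTTPS and receives a 'key=value' job-definition message back.
        raise NotImplementedError

    def report_status(panda_id, status):
        # Placeholder: the real pilot reports to the jobdispatcher / heartbeat monitor.
        print("job %s: %s" % (panda_id, status))

    def main():
        info = collect_node_info()
        job = parse_job_message(get_job_from_dispatcher(info))
        stage_in(job, info["workdir"])                 # 1. input files to local workdir
        rc = subprocess.call(job["jobPars"].split())   # 2. fork a child to run the job
        stage_out(job, info["workdir"])                # 3. output files to local DDM area
        report_status(job["PandaID"],
                      "finished" if rc == 0 else "failed")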

  • Q: Does the PilotJob have any external dependencies? - Jerry - 06-sep-2005
  • A: Since it's a script sent by Condor-G, it should have little dependence on external tools. But DDM client tools should be installed in the site's $APP area, and they will be invoked by the pilot job. - Xin - 06-sep-2005
  • Q: How does the PilotJob get launched? - Jerry - 06-sep-2005
  • A: By jobscheduler via condor-g. - Xin - 06-sep-2005
  • Q: What jobmanager does it run under? - Jerry - 06-sep-2005
  • A: Any - Marco
  • Q: What environment does it expect to see on the compute node? - Jerry - 06-sep-2005
  • A: The one set up by Globus - Marco
  • Q: Does the PilotJob expect any type of site wrapper script to be executed if an ATLAS release environment is expected? - Jerry - 06-sep-2005
  • A: It will run the .trf files installed in the kitval area, as we are doing right now. - Xin - 06-sep-2005
    • Q: In the case of ATLAS scripts, the CaponE wrapper script "wnexe" was responsible for sourcing all of the appropriate ATLAS setup scripts, generating the file PoolFileCatalog.xml, and generating a local version of my.cnf when necessary. This will still need to be done, since the installed "trfs" don't do any of this. Will we try to use most of what the old wnexe script did before calling the designated trf? - Jerry - 07-sep-2005
    • A: [no answer yet]
  • Q: Do we really want DDM to move the input files to the working directory for every job? I think we should do a copy (not a move) since other jobs may need the same input files. - Jerry - 07-sep-2005
  • A: Move - Marco

Design Team

MarcoMambelli, XinZhao


Major updates:
-- KaushikDe - 01 Aug 2005
-- MarcoMambelli - 24 Aug 2005



Responsible: KaushikDe
