WMAgent Scripts

Setting Up Scripts

Download the scripts

  • The easiest way to download the WmAgentScripts is to use git on lxplus or your own machine:
    git clone https://github.com/CMSCompOps/WmAgentScripts.git

Creating Proxy

Most of the scripts need to load a proxy, so first you need to set up a certificate: New Operator Setup

  • Generate your proxy:
     voms-proxy-init -voms cms 
  • Type your key password; this should display something like this:
     Contacting voms.cern.ch:15002 [/DC=ch/DC=cern/OU=computers/CN=voms.cern.ch] "cms"... Remote VOMS server contacted successfully. Created proxy in /tmp/x509up_uXXXX. Your proxy is valid until Thu Oct 09 21:53:28 CEST 2014 
  • Export the X509_USER_PROXY variable to the environment (so it can be used by python), using the proxy location from the previous step:
     export X509_USER_PROXY=/tmp/x509up_uXXXX 
  • The whole procedure as a one-line command:
     export X509_USER_PROXY=$(voms-proxy-init -voms cms | grep Created | cut -c18- | tr -d '.') 
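Most scripts simply read this variable; a minimal Python sketch of the lookup order (the /tmp/x509up_u<uid> fallback is the conventional grid default, an assumption here rather than something every script implements):

```python
import os


def find_proxy(environ=None, uid=None):
    """Locate the proxy a script would use: X509_USER_PROXY if exported,
    otherwise the conventional /tmp/x509up_u<uid> location."""
    environ = os.environ if environ is None else environ
    uid = os.getuid() if uid is None else uid
    return environ.get("X509_USER_PROXY", "/tmp/x509up_u%d" % uid)
```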

Loading WMAgent Environment

  • Some of the scripts need the WMAgent libraries.
  • These scripts must run from a WMAgent machine (e.g. vocms049 or vocms174).
  • Log in to the machine and type:
    source /data/srv/wmagent/current/apps/wmagent/etc/profile.d/init.sh

Scripts that interact with the Request Manager


reject.py

This script rejects or aborts a workflow, or a set of them, depending on the workflow's current state.


python reject.py [options] [WORKFLOW_NAME]
WORKFLOW_NAME: if the list file is provided this should be empty

  -h, --help            show this help message and exit
  -f FILE, --file=FILE  Text file of workflows to Reject and Clone
  -c, --clone           Are the workflows going to be cloned? The default
                        value is False
  -i, --invalidate      Invalidate datasets? The default value is False
  -u USER, --user=USER  The user for creating the clone, if empty it will use
                        the OS user running the script
  -g GROUP, --group=GROUP
                        The group for creating the clone, if empty it will
                        use 'DATAOPS' by default
  -m MEMORY, --memory=MEMORY
                        Set max memory for the clone. At assignment, this will
                        be used to calculate maxRSS = memory*1024
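As a quick reference, the -m value (memory, in MB) maps to maxRSS as stated above; a one-line sketch of that formula (the MB-to-KB unit reading is an assumption):

```python
def max_rss_kb(memory_mb):
    """maxRSS as computed at assignment time: memory * 1024."""
    return memory_mb * 1024
```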

Some examples:

 python reject.py -c -i WORKFLOW_NAME
 python reject.py -c -f workflows.txt


setCascadeStatus.py

 python setCascadeStatus.py -h
Usage: setCascadeStatus.py [options]

  -h, --help            show this help message and exit
  -u URL, --url=URL     Which server to communicate with
  -w WF, --wf=WF        File list or comma-separated list of workflows
  -s STATUS, --status=STATUS   The new status
  • example: python setCascadeStatus.py -w dmason_BoogaBoogaBooga_151218_185755_7621 -s announced



This script encapsulates all the requests that can be made to the Request Manager through its API. All the other scripts must use this script to maintain consistency.


reqmgr.py

This is one of the most powerful and flexible scripts that interact with ReqMgr through the REST API. Using this script you can:

  • create new requests
  • clone an existing request
  • approve and assign requests already created.
  • change splitting
  • delete requests

This script is originally maintained in the WMCore code: https://github.com/dmwm/WMCore/blob/master/test/data/ReqMgr/reqmgr.py ; however, we keep a copy in WmAgentScripts. This script uses an input json file with the request parameters, which can also be overridden from the command line. You can see examples of different templates here: WMCore. If you need to use this script, check the script help:

[vocms174] /afs/cern.ch/user/j/jbadillo > python WmAgentScripts/reqmgr.py --help
Processing command line arguments: '['WmAgentScripts/reqmgr.py', '--help']' ...
  reqmgr.py options

--help, -h              Display this help
--cert=CERT, -c CERT    User cert file (or cert proxy file). If not defined,
                        tries X509_USER_CERT then X509_USER_PROXY env.
                        variables. And lastly /tmp/x509up_uUID.
--key=KEY, -k KEY       User key file (or cert proxy file). If not defined,
                        tries X509_USER_KEY then X509_USER_PROXY env.
                        variables. And lastly /tmp/x509up_uUID.
--url=URL, -u URL       Request Manager service address (if no other option is
                        supplied, returns a list of the requests in ReqMgr)
                        e.g.: https://maxareqmgr01.cern.ch
--configFile=CONFIGFILE, -f CONFIGFILE
                        Request create and/or assign arguments config file.
--json=JSON, -j JSON    JSON string to override values from --configFile. e.g.
                        --json='{"createRequest": {"Requestor": "efajardo"},
                        "assignRequest": {"FirstLumi": 1}}'
--requestNames=REQUESTNAMES, -r REQUESTNAMES
                        Request name or list of comma-separated names to
                        perform actions upon.
--queryRequests, -q     Action: Query request(s) specified by --requestNames.
--deleteRequests, -d    Action: Delete request(s) specified by --requestNames.
--createRequest, -i     Action: Create and inject a request. Any of the
                        config file defined arguments can be overridden from
                        the command line, and a few have to be (*-OVERRIDE-ME
                        ending). Depends on --configFile. This request can
                        be 'approved' and 'assigned' if --assignRequests.
--changeSplitting, -p   Action: Change splitting parameters for tasks in a
                        request.
--assignRequests, -g    Action: Approve and assign request(s) specified by
                        --requestNames or a new request when used with
                        --createRequest. Depends on --requestNames and
                        --configFile when used without --createRequest.
--cloneRequest          Action: Clone request, the request name is mandatory.
--allTests, -a          Action: Perform all operations this script allows.
                        --configFile and possibly --json must be present for
                        initial request injection and assignment.
--userGroup, -s         Action: List groups and users registered with ReqMgr
--team, -t              Action: List teams registered with a Request Manager
--verbose, -v           Verbose console output.

Some examples:

  • Create a request using the file julian.json
python WmAgentScripts/reqmgr.py -u https://cmsweb.cern.ch -i -f julian.json 
  • Assigning an existing request in ReqMgr (jbadillo_StoreResults_51816_v1_140826_100602_3071) changing splitting according to julian.json:
python WmAgentScripts/reqmgr.py -u https://cmsweb.cern.ch -p -g -f julian.json -r jbadillo_StoreResults_51816_v1_140826_100602_3071
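The --json option overrides values from --configFile; conceptually this is a deep merge of two dictionaries. A sketch of such a merge (an illustration of the idea, not the actual WMCore implementation):

```python
import json


def apply_overrides(config, json_string):
    """Deep-merge a JSON override string (as passed via --json) into the
    config dict loaded from --configFile. Nested dicts are merged;
    scalar values are replaced."""
    def merge(base, extra):
        for key, value in extra.items():
            if isinstance(value, dict) and isinstance(base.get(key), dict):
                merge(base[key], value)
            else:
                base[key] = value
    merge(config, json.loads(json_string))
    return config
```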


changeSplittingWorkflow.py

This script changes the splitting of a request, for a given task name.

Usage: python changeSplittingWorkflow.py [-t TYPE | -e | -l | -a | -m] WORKFLOW TASKPATH VALUE
TASKPATH: The full task path.
VALUE: Should be an integer value allowed for splitting.
  -h, --help            show this help message and exit
  -t TYPE, --type=TYPE  Type of splitting (event, lumi, eventaware or merge),
                        or use the other options
  -e                    Use EventBased splitting
  -l, --lumi            Use LumiBased splitting
  -a, --eventaware      Use EventAwareLumiBased
  -m, --merge           Splitting for Merge tasks


  • The TASKPATH should be the full task path in which you want to change the splitting, e.g. StepOneProc, StepOneProc/StepOneProcMerge, Production, etc.
  • The TYPE is the splitting algorithm.


forceCompleteWorkflows.py

Moves a workflow or list of workflows from running-closed to force-complete. This causes every production job to be aborted, leaving only log-collect and cleanup jobs.

  • Usage:
Usage: python forceCompleteWorkflows.py [WF1 WF2 ... | -f FILE]

  -h, --help            show this help message and exit
  -f FILE, --file=FILE  Text file


getInputLocation.py

Gives the list of sites that contain the input of a given workflow. It has additional options to locate the pileup and to list the sites in which any block of the input is present:

  • Usage:
Usage: python getInputLocation.py [OPTIONS] [WORKFLOW]

  -h, --help            show this help message and exit
  -f FILE, --file=FILE  Text file with several workflows
  -a, --any             Any block replica
  -d, --dataset         A single dataset
  -p, --pileup          Look also for pileup location
  -c, --clean           Print ready to use site list (No _Disk or _Buffer)


dbsTest.py

Note: This script is DEPRECATED; you should use WorkflowPercentage.py instead if you want the percentage, or dbs3Client.py for independent functions. This script shows the percentage of input events vs. output events (or lumis, in the case of ReReco) for the output datasets of a given workflow.

Usage: python2.6 dbsTest.py workflow_name

Example:

[lxplus302] /afs/cern.ch/user/e/efajardo/scripts/cmsweb/WmAgentScripts > python2.6 dbsTest.py ceballos_HIG-Fall11_TSG-00009_T1_TW_ASGC_MSS_v1_111126_204011
/GluGluToHToZZTo4L_M-130_7TeV-powheg-pythia6/Fall11-E7TeV_Ave23_50ns-v1/GEN-SIM-RAW-HLTDEBUG-RECO match: 44.3281525868%
/GluGluToHToZZTo4L_M-130_7TeV-powheg-pythia6/Fall11-E7TeV_Ave23_50ns-v1/DQM match: 44.3281525868%
/GluGluToHToZZTo4L_M-130_7TeV-powheg-pythia6/Fall11-E7TeV_Ave23_50ns-v1/AODSIM match: 44.3281525868%


phedexClient.py

A collection of python calls to access some PhEDEx functionalities, such as:

  • Query if a given dataset has a subscription.
  • Make subscription requests to sites.
  • Make deletion requests to sites.
  • Create a transfer request.
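As an illustration, checking whether a dataset has a subscription amounts to parsing the PhEDEx datasvc 'subscriptions' reply; a sketch of that parsing (the field layout phedex -> dataset[] -> subscription[] is assumed from the datasvc JSON format, not taken from phedexClient.py itself, so verify it against a live response):

```python
def parse_subscriptions(datasvc_json):
    """Extract (node, custodial) pairs from a decoded PhEDEx datasvc
    'subscriptions' response. Returns an empty list when the dataset
    has no subscriptions."""
    pairs = []
    for dataset in datasvc_json.get("phedex", {}).get("dataset", []):
        for sub in dataset.get("subscription", []):
            pairs.append((sub["node"], sub["custodial"] == "y"))
    return pairs
```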


phedexSubscriptionTest.py

Note: This script is DEPRECATED; you should use phedexClient.py functions instead. This script tests whether a custodial subscription for the output datasets of a given workflow has been made and approved, using the following ReqMgr REST APIs:

  • /reqMgr/request?requestName
  • /reqMgr/outputDatasetsByRequestName?requestName=
Usage: python2.6 phedexSubscriptionTest.py workflow_name

Example:

[lxplus302] /afs/cern.ch/user/e/efajardo/scripts/cmsweb/WmAgentScripts > python2.6 phedexSubscriptionTest.py ceballos_HIG-Fall11_TSG-00009_T1_TW_ASGC_MSS_v1_111126_204011
This dataset is subscribed : /GluGluToHToZZTo4L_M-130_7TeV-powheg-pythia6/Fall11-E7TeV_Ave23_50ns-v1/GEN-SIM-RAW-HLTDEBUG-RECO
Custodial: y
Request page: https://cmsweb.cern.ch/phedex/prod/Request::View?request=354308
This dataset is subscribed : /GluGluToHToZZTo4L_M-130_7TeV-powheg-pythia6/Fall11-E7TeV_Ave23_50ns-v1/DQM
Custodial: y
Request page: https://cmsweb.cern.ch/phedex/prod/Request::View?request=358152
This dataset is subscribed : /GluGluToHToZZTo4L_M-130_7TeV-powheg-pythia6/Fall11-E7TeV_Ave23_50ns-v1/AODSIM
Custodial: y
Request page: https://cmsweb.cern.ch/phedex/prod/Request::View?request=358152


closeOutWorkflows.py

This script is used to close out several requests at once. It checks:

  1. Event completion
  2. The custodial subscription is accepted
  3. Transfer percentage
  4. There are no duplicated events
The list of workflows is obtained through the following ReqMgr REST API:
  • /reqmgr/monitorSvc/requestmonitor
This script uses DBS3; if you want to run the previous version (using dbsTest and DAS), use closeOutWorkflows_leg.py.

Example output:
   | *Request Name*| *Dataset Name*| *Event Progress*|*Custodial subscription*|*Transfer Percentage*|*Dup Events*|*Ready for Closeout*|
   | spinoso_B2G-Summer12-00373_R2134_B231_11_LHE_121209_162010_1851 |        /TprimeTprimeToTgammaTgammainc_M-850_TuneZ2star_8TeV-madgraph/Summer12-   START53_V7C-v1/GEN-SIM |  102 |  True| 100 | False| True| 
   | spinoso_B2G-Summer12-00377_R2134_B231_07_LHE_121209_161954_5843 |       /TprimeTprimeToTgammaTgammainc_M-1100_TuneZ2star_8TeV-madgraph/Summer12-START53_V7C-v1/GEN-SIM |   99 |  True| 100 | False| True| 
   | spinoso_HIG-Summer11-01259_R2204_B130_05_LHE_130113_234043_4441 |          /Graviton2PMToZZTo4L_M-126_7TeV_ext-JHUgenV2-PYTHIA6_Tauola/Summer11-START311_V2-v1/GEN-SIM |   99 |  True|  50 | False|False| 
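The "Ready for Closeout" column combines the four checks; a sketch of that decision (the thresholds are illustrative, chosen to match the table above, not the script's exact values):

```python
def ready_for_closeout(event_progress, custodial_ok, transfer_pct, has_duplicates):
    """Combine the four closeout checks: event completion, accepted
    custodial subscription, transfer percentage, and no duplicate
    events. Thresholds are illustrative only."""
    return (event_progress >= 99 and custodial_ok
            and transfer_pct >= 100 and not has_duplicates)
```

With these thresholds the function reproduces the three table rows: the first two workflows are ready, the third (transfer at 50%) is not.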


closeOutWorkflowsFiltered.py

Does the same as closeOutWorkflows.py, but uses a text file to filter specific workflows. This is useful for checking a small set of workflows without traversing the whole set of completed workflows. Input args: a text list of the desired workflows to close out. Usage:

   python2.6 closeOutWorkflowsFiltered.py wfs.txt


closeOutWorkflows_leg.py

This is the previous version of closeOutWorkflows.py, used before the DBS2 to DBS3 migration in February 2014.


NOTE: this script is deprecated. It does the same as closeOutWorkflows.py, but publishes the output to a web page with the same content as the standard console output. It's useful for sharing the output of the script. For now the output is published here: http://jbadillo.web.cern.ch/jbadillo/closeout.html


resubmit.py

This script clones and resubmits a workflow using the following ReqMgr REST API:

  • /reqmgr/monitorSvc/requestmonitor
Input Args:
  • workflow to resubmit (clone)
  • user name
  • group (DATAOPS)


  1. Create a proxy and load WMAgent environment first.
    jbadillo@vocms049:~/WmAgentScripts $ python resubmit.py -h
           python resubmit.py [options] [WORKFLOW_NAME] [USER GROUP]
    WORKFLOW_NAME: if the list file is provided this should be empty
    USER: the user for creating the clone, if empty it will
          use the OS user running the script
    GROUP: the group for creating the clone, if empty it will
          use 'DATAOPS' by default
      -h, --help            show this help message and exit
      -b, --backfill        Creates a clone for backfill test purposes.
      -v, --verbose         Prints all query information.
      -f TEAM, --file=TEAM  Text file with a list of workflows
       Cloned workflow: spinoso_EXO-Summer12-01733_R1608_B145_34_LHE_120802_151433_4590
  2. Take note of the name of the new workflow.
  3. The workflow is created but NOT assigned; if you need to get it running, follow the instructions here: AssigningWorkflows

Note: When you use the -b option, the script will append the string "Backfill" to the requestString, AcquisitionEra, Campaign and ProcessingString, so the clone can be correctly identified as backfill.


makeACDC.py

Creates an ACDC in ReqMgr, given the name of the workflow and the task path.

  • Before creation, the ACDC documents should already be in couch (this usually happens when the workflow is completed).
  • You need the full task path (not just the last part), e.g. for a workflow with StepOneProc and StepTwoProc:
    • If you want to create an ACDC on StepTwo, the task name is StepOneProc/StepOneProcMerge/StepTwoProc (or something similar).
  • Remember to create a proxy and load the WMAgent environment first.
Usage: makeACDC.py [options] [WORKFLOW] TASK

  -h, --help            show this help message and exit
  -f FILE, --file=FILE  Text file of a list of workflows
  -m MEMORY, --memory=MEMORY
                        Memory to override the original request memory


makeAllACDCs.py

Creates one ACDC in ReqMgr for every task of a given workflow that has a failed job (i.e. has an ACDC document in couch). It basically calls makeACDC.py once for every task in the couch view. Usage:

jbadillo@vocms049:~ $ python WmAgentScripts/makeAllACDCs.py -h
Usage: makeAllACDCs.py [options] [WORKFLOW]

  -h, --help            show this help message and exit
  -f FILE, --file=FILE  Text file of a list of workflows
  -m MEMORY, --memory=MEMORY
                        Memory to override the original request memory


extendWorkflow.py

This script calculates how many events are needed to complete the request and creates a new request with only those missing events. It's useful for creating a clone that:

  • Writes to the same dataset as the original workflow
  • Creates new events (needed to complete the request)
It can only be used on MonteCarlo-from-scratch workflows (workflows that have no input dataset) that have NOT been extended before. Input args:
  • workflow to resubmit (clone)
  • user name
  • group (DATAOPS)


  1. Create a proxy and load WMAgent environment first.
  2. type:
    python2.6 extendWorkflow.py [workflow name] [user] [group]
  3. Take note of the name of the new workflow.
  4. The workflow is created but NOT assigned; if you need to get it running, follow the instructions here: AssigningWorkflows
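The core of the calculation is simple; a sketch of the basic arithmetic (an illustration, not the script's exact computation):

```python
def events_for_extension(requested_events, produced_events):
    """Number of new events the extension clone must request so the
    original MonteCarlo request reaches its requested statistics."""
    return max(requested_events - produced_events, 0)
```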


For recovering a list of missing lumis on a workflow with an input dataset. For further information check: RecoveringWorkflows


resubmitUnprocessedBlocks.py

If blocks were inserted into an input dataset after the workflow moved to running-closed, the output datasets will be missing those blocks. To run just these missing blocks, without having to re-run the entire dataset, do the following:

  1. Run these commands
     cd ~/WmAgentScripts 
     source setenvscripts.sh 
     source /data/srv/wmagent/current/apps/wmagent/etc/profile.d/init.sh 
  2. Resubmit workflow
    python resubmitUnprocessedBlocks.py [workflow name] [username] [group]
    * Result:
    • python resubmitUnprocessedBlocks.py  pdmvserv_B2G-Summer12DR53X-00586_T1_ES_PIC_MSS_00070_v0__131021_083438_8848 jen_a DATAOPS
      303 See Other
      This resource can be found at <a href='https://cmsweb.cern.ch/reqmgr/view/details/jen_a_B2G-Summer12DR53X-00586_T1_ES_PIC_MSS_00070_v0__131122_201448_5707'>https://cmsweb.cern.ch/reqmgr/view/details/jen_a_B2G-Summer12DR53X-00586_T1_ES_PIC_MSS_00070_v0__131122_201448_5707</a>.
      Cloned workflow: jen_a_B2G-Summer12DR53X-00586_T1_ES_PIC_MSS_00070_v0__131122_201448_5707
      200 OK
      Successfully updated splitting parameters for /jen_a_B2G-Summer12DR53X-00586_T1_ES_PIC_MSS_00070_v0__131122_201448_5707/StepOneProc  <a href="details/jen_a_B2G-Summer12DR53X-00586_T1_ES_PIC_MSS_00070_v0__131122_201448_5707">Details</A> <a href=".">Browse</A><BR>
      Assigned workflow: jen_a_B2G-Summer12DR53X-00586_T1_ES_PIC_MSS_00070_v0__131122_201448_5707 to site: [u'T1_ES_PIC'] custodial site: [] acquisition era: Summer12_DR53X team reproc_lowprio processin string: PU_S10_START53_V19 processing version: 1 lfn: /store/mc maxmergeevents: 50000 maxRSS: 2300000 maxVSize: 4100000000
  3. Note: this script actually submits the workflow for you, but if you have a list of blocks you know it should be running over, you may want to quickly double-check that list before the workflow begins to run.
  4. Go to WMStats and search for the cloned workflow

python stuckRequestDiagnosis.py

  • This script checks a request that is listed as stuck and, depending on the options you give it, tells you which agent it is stuck on.
  • Case one: find which agents have stuck WFs.

  • Once you know what machine a WF is stuck on, log onto that machine and you can check to see why it is stuck:
    • It needs the following environment (as the cmst1 user):
      source /data/admin/wmagent/env.sh
      source /data/srv/wmagent/current/apps/wmagent/etc/profile.d/init.sh
      python stuckRequestDiagnosis.py agent spinoso_SUS-Summer12_FS53-00059_R2877_B20_13_LHE_130711_190230_9824
      Task /spinoso_SUS-Summer12_FS53-00059_R2877_B20_13_LHE_130711_190230_9824/MonteCarloFromGEN/MonteCarloFromGENMergeAODSIMoutput/MonteCarloFromGENAODSIMoutputMergeLogCollect has 2 jobs executing 
    • If you see the following:
      There are 0 files available and 0 files acquired in the subscription XXXX. 
      This usually means that TaskArchiver has too many requests and is taking time to finish the subscriptions, so just let it continue on its way.


  • name: WFComplete_main.py
  • dependencies: WFComplete_submitter.sh & dbsTest.py
  • easiest when run from a WmAgentScripts directory on lxplus (from svn)
  • run with: python WFComplete_main.py [options]
  • You need a grid proxy to run that script (do: voms-proxy-init --voms cms)


  • You can put a text file with a list of workflow names (one per line) with the -f flag. If you don't, the script will ask you to input workflow names interactively [end input with double enter]

  • Flags can be combined, e.g. putting -das works fine.
  • Lookup in the request monitor might be a bit slow at some times.

  • Info options:
    • With -d you get the 'percentage complete' from dbs.
    • With -a you lookup on which agent workflows are (from the request monitor)
    • With -s you check the status of workflows (from the request monitor)
    • Log output is written to /afs/cern.ch/user/c/cmst1/scratch0/WFFinalize/WFComplete.log for the general requests.

  • With -c you ask to (force) complete the workflows.
    • This sends by ssh the commands you would normally put interactively in the matching agent:
      ./config/wmagent/manage execute-agent wmagent-workqueue -i
      [at prompt] workqueue.doneWork(WorkflowName = 'name')
    • Log output is written to /afs/cern.ch/user/c/cmst1/scratch0/WFFinalize/WFComplete.log for the general requests.
    • Python output of the complete command is written to /afs/cern.ch/user/c/cmst1/scratch0/WFFinalize/WFComplete_output.log. If there is an error with the completion, it will be written here, but not give you an extra error message in the closing script. Please check the log file each time you complete workflows.
    • After you give this completion command, it might take a while (sometimes up to a couple of hours), but the workflows should move to completed automatically. You can just check the list again with the -s option.

help output

python WFComplete_main.py -h
usage: WFComplete_main.py [options]

  -h, --help            show this help message and exit

  Essential options:
    Nothing will happen if you don't provide one or more of these options.

    -d, --dbstest       get progress from dbs
    -c, --complete      complete workflows
    -s, --status        wf status from reqmon
    -a, --agent         wf agent from reqmon

  Additional options:
    Pick at will.

    -v, --verbose       maximize output
    -n, --norun         don't run the created complete scripts
    -t, --testconnection
                        don't run anything but some dummy commands to test
    -f FILENAME, --file=FILENAME
                        load workflow names from file


Priorities.py

This script gives a summary of all ReDigi workflows waiting to be assigned, arranged by site, along with their priorities, time-per-event, and average number of events per lumi.

python Priorities.py --status assignment-approved --type ReDigi

The output can be used to calculate the best splitting of the workflows in lumis/job, which can then be set using changeSplittingWorkflow.py.
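One way to turn those numbers into a LumiBased splitting: pick lumis/job so that a job lasts roughly a target wall time. A sketch of that calculation (the eight-hour target and the formula are assumptions, not something Priorities.py prescribes):

```python
def lumis_per_job(time_per_event, events_per_lumi, target_hours=8.0):
    """Suggest a LumiBased splitting so a job lasts about target_hours:
    target job seconds divided by seconds per lumi, with a floor of 1."""
    seconds_per_lumi = time_per_event * events_per_lumi
    return max(int(target_hours * 3600 / seconds_per_lumi), 1)
```

For example, with 10 s/event and 100 events/lumi, an eight-hour job fits about 28 lumis.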


listReqTapeFamilies.py

NOTE: this script is deprecated since we don't request tape family creation anymore; T1s should create them by default. This script is used to produce a list of required tape families at each site. Usage:

listReqTapeFamilies.py file_name

where file_name is a list of workflows (one per line). The output consists of a list of directories and associated sites, for example:

/store/mc/Summer12_DR53X/BdToMuMuPi0_EtaPtFilter_8TeV-pythia6-evtgen/GEN-SIM-RECO   T1_ES_PIC
/store/mc/Summer12_DR53X/BdToMuMuPi0_EtaPtFilter_8TeV-pythia6-evtgen/AODSIM   T1_ES_PIC
/store/mc/Summer12_DR53X/BdToMuMuPi0_EtaPtFilter_8TeV-pythia6-evtgen/DQM   T1_ES_PIC
/store/mc/Summer12_DR53X/BsToKK_EtaPtFilter_8TeV-pythia6-evtgen/GEN-SIM-RECO   T1_FR_CCIN2P3
/store/mc/Summer12_DR53X/BsToKK_EtaPtFilter_8TeV-pythia6-evtgen/AODSIM   T1_FR_CCIN2P3
/store/mc/Summer12_DR53X/BsToKK_EtaPtFilter_8TeV-pythia6-evtgen/DQM   T1_FR_CCIN2P3

The first column of the output should be included in the tape family request Savannah ticket. For each campaign it gives the standard possible data tiers required (e.g. GEN-SIM-RECO, AODSIM, DQM), in addition to any others (e.g. GEN-SIM-RECODEBUG). The acquisition era is chosen appropriately.


assign.py

Quick request assignment, useful if you want to avoid assigning via the web interface (non-task-chain case) and if reqmgr.py is too inflexible (e.g. the workflows have different processing versions or processing strings). Usage:

  1. Create a proxy and load WMAgent environment first.
  2. You must provide the -l LFN parameter (mandatory); be careful to assign the proper one (MergedLFNBase).
  3. You can use this script to assign any kind of workflow.
  4. You can use a text file to assign multiple workflows at the same time.
  5. You may use additional options to:
    • enforce disk replica subscription
    • change dashboard activity
    • change processing version
    • fix a processing string or acquisition era
    • You can also provide a list of sites separated by commas (no spaces) T1_US_FNAL,T2_US_UCSD,...
    • You can use -s all: It will assign to all sites available (Works for any taskchain acdc).
    • You can skip -s option: It will assign to the "good site" list (Works for any clone you need).
Usage: assign.py [options] [WORKFLOW]

  -h, --help            show this help message and exit
  -t TEAM, --team=TEAM  Type of Requests
  -s SITES, --sites=SITES
                        "t1" for Tier-1's and "t2" for Tier-2's
  --special=SPECIAL     Use it for special workflows. You also have to change
                        the code according to the type of WF
  -r, --replica         Adds a _Disk Non-Custodial Replica parameter
                        Processing Version; if empty it will leave the
                        processing version that comes by default in the request
  -a ACTIVITY, --activity=ACTIVITY
                        Dashboard Activity (reprocessing, production or test);
                        if empty it will set reprocessing as default
  -x, --xrootd          Assign with trustSiteLocation=True (allows xrootd
                        capabilities)
  --secondary_xrootd    Assign with TrustPUSitelists=True (allows xrootd
                        capabilities)
  -l LFN, --lfn=LFN     Merged LFN base
  -v, --verbose         Verbose
  --testbed             Assign in testbed
  --test                Nothing is injected, only prints information about the
                        workflow and era
  -f FILE, --file=FILE  Text file with a list of workflows. If this option is
                        used, the same settings will be applied to all of them
  -w WORKFLOW, --workflow=WORKFLOW
                        Workflow Name
  -e ERA, --era=ERA     Acquisition era
  --procstr=PROCSTRING  Overrides Processing String with a single string


assignWorkflowsAuto.py

NOTE: This script is deprecated. It is used to assign MC reprocessing workflows. The following information is determined automatically:

  • team (selected according to the priority of the workflow)
  • acquisition era
  • site (selected according to the site name in the workflow name; if there is any confusion the custodial location of the input dataset is used instead)
  • processing version, including the pileup scenario, global tag and version number (the version number is incremented by 1 if there is an existing dataset with the same name)
Workflows will not be assigned if: (1) the input dataset is not VALID or PRODUCTION, or (2) if the pileup scenario cannot be determined.


  • -f, --filename : name of file containing a list of workflows, one per line
  • -w, --workflow : name of a single workflow
  • -e, --execute : actually assign the workflows (without this it will just print to the screen what options will be used when assigning)
  • -p, --procstring : processing string (prepended to automatically generated processing string)
  • -m, --procversion : processing version (overrides the automatically determined processing version)
  • -n, --specialstring : special string name, which is appended to automatically generated processing string
  • -a, --extension : extension dataset ( _ext is appended to the processing version)
  • -s, --site : force workflow to run at the specified site. Use T2_US for T2_US_Caltech, T2_US_Florida, T2_US_MIT, T2_US_Nebraska, T3_US_Omaha, T2_US_Purdue, T2_US_UCSD, T2_US_Vanderbilt, and T2_US_Wisconsin.
  • -c, --custodial : custodial site. If not specified, the default is to use the same site where the workflow is run.
  • -t, --team : force workflow to be assigned to the specified team
  • -x, --restrict : only assign workflows which have custodial input datasets at this site (others are ignored)
  • -r, --rssmax : specify max RSS
  • -v, --vsizemax : specify max VMem
  • -o, --xrootd : input datasets to be read using xrootd
Additional notes:
  • Known pileup scenarios: PU_S10, PU_S9, PU_S8, PU_S7, PU_S4, PU_S0, PU50, PU50bx25, PU35, NoPileUp.
  • Max Merge Events is set to 50000, except for Fall11_R1 workflows where it is set to 6000.
The recommended way of using assignWorkflowsAuto.py is to run it first without the -e option, glance at the results to see how it will assign the workflows to make sure everything is fine, then run it again using -e.

Example usage:

$ python assignWorkflowsAuto.py -f assigns_20130114_01_Upgrade_wfs
Would assign  casarsa_SUS-UpgradeL1TDR_DR6X-00031_T1_US_FNAL_MSS_4_v1_PU35_130114_144214_8546  with  acquisition era: Summer12 version: PU35_POSTLS161_V12-v1 lfn: /store/mc site: T1_US_FNAL team: processing maxmergeevents: 50000
Would assign  casarsa_SUS-UpgradeL1TDR_DR6X-00035_T1_US_FNAL_MSS_4_v1_PU50_130114_144150_628  with  acquisition era: Summer12 version: PU50_POSTLS161_V12-v1 lfn: /store/mc site: T1_US_FNAL team: processing maxmergeevents: 50000
Would assign  casarsa_SUS-UpgradeL1TDR_DR6X-00036_T1_US_FNAL_MSS_4_v1_PU50bx25_130114_144143_2479  with  acquisition era: Summer12 version: PU50bx25_POSTLS161_V12-v1 lfn: /store/mc site: T1_US_FNAL team: processing maxmergeevents: 50000

ReqMgr REST API used

  • /reqmgr/monitorSvc/requestmonitor
  • /reqMgr/outputDatasetsByRequestName?requestName
  • /reqmgr/view/showWorkload?requestName=


changePriorityWorkflow.py

Changes the priority of a workflow or a list of workflows.

  • Usage:
Usage: python changePriorityWorkflow.py [WF1 WF2 ... | -f FILE] PRIO
  -h, --help            show this help message and exit
  -f FILE, --file=FILE  Text file


mcassign.py

This script makes the assignment for MonteCarlo/MonteCarloFromGEN/LHEStepZero requests. It interacts with ReqMgr and DBS, making useful checks before assigning (output dataset not already in DBS, status really assignment-approved, etc.). Then it creates the whitelist for the assignment, interacting with the T2s->T1 PhEDEx matrix. Team selection is also automatic.

Basic usage:

python WmAgentScripts/mc/mcassign.py -l 135113_in2p3.txt -a Summer12 --assign 

For more options:

python WmAgentScripts/mc/mcassign.py --help

ReqMgr REST API used

  • /reqmgr/monitorSvc/requestmonitor
  • /reqMgr/outputDatasetsByRequestName?requestName


mctapefamilies.py

This script prepares the Savannah request for tape families, starting from a list of requests. Interaction is only with ReqMgr.


python WmAgentScripts/mc/mctapefamilies.py -l fnal.txt -a Summer12 -t FNAL -b "R2181_B240 Set1"


announceWorkflows.py

Announces a given list of workflows.

  • Usage:
python announceWorkflows.py [WF1 WF2 ... | -f FILE]


reqinfo.py

This script retrieves all possible information on a given request, interacting with ReqMgr, DBS, and PhEDEx.

Basic usage:

python WmAgentScripts/mc/reqinfo.py -w  etorassa_BPH-Summer12-00081_batch226_v1__121130_200132_2586

For more options:

python WmAgentScripts/mc/reqinfo.py --help


setrequeststatus.py

This sets the status of a request. Usage:

python WmAgentScripts/mc/setrequeststatus.py request status

Scripts that interact with DBS


dbs3Client.py

This script encapsulates all the calls to DBS3, using the DBS3 API provided here: https://cms-http-group.web.cern.ch/cms-http-group/apidoc/dbs3-client/current/dbs.apis.html If called directly, it shows the number of events and lumis for the output datasets of a given workflow. Input:

  • Workflow name


  • Create a proxy and load WMAgent environment first.
  • Type:
    python WmAgentScripts/dbs3Client.py workflowName
  • It should show something like this:
     /DATASET1/WWW Events: 100000 Lumis: 2000 
     /DATASET2/WWW Events: 200000 Lumis: 3000 


DBS3SetDatasetStatus.py

This is imported from the DBS client: https://github.com/dmwm/DBS/blob/master/Client/utils/DataOpsScripts/DBS3SetDatasetStatus.py

Given the dataset path, the new status and the DBS instance URL (writer), it sets the new status. If the children option is used, the status of all the dataset's children will be changed as well.

Usage: DBS3SetDatasetStatus.py --dataset=<dataset> --status=<newStatus> --url=<DBS_Instance_URL> + optional options

Options:
  -h, --help            show this help message and exit
  -u DBS_Instance_URL, --url=DBS_Instance_URL
                        DBS Instance url
  -r True/False, --recursive=True/False
                        Invalidate all children datasets, too?
  -d /specify/dataset/path, --dataset=/specify/dataset/path
                        Dataset to change status
  -s newStatus, --status=newStatus
                        New status of the dataset
  -v, --verbose         Increase verbosity
  -p socks5://, --proxy=socks5://
                        Use Socks5 proxy to connect to server


python2.6 WmAgentScripts/DBS3SetDatasetStatus.py --dataset=/RelValProdTTbar/CMSSW_3_5_1-MC_3XY_V21_HcalCalIsoTrk-v1/ALCARECO --status=INVALID --recursive=True


Looks for duplicated lumis in the output datasets of a given workflow. A lumisection is considered "duplicated" when it appears in two or more different files for the same run. For example, suppose a dataset has two files and two runs:

  • File1:
    run1: lumis = {1,2,3,4,7}
    run2: lumis = {1,2,3}
  • File2:
    run1: lumis = {4,5,6}
    run2: lumis = {5,6,7}
Lumisection #4 in run 1 is considered "duplicated". However, Lumisection #7 is NOT considered duplicated because it is in a different run in each file (in File1 it is in run 1, in File2 it is in run 2).
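The duplicate check can be sketched in a few lines of Python. This is a hypothetical standalone helper illustrating the rule above, not the actual duplicateEvents.py code:

```python
from collections import defaultdict

def find_duplicate_lumis(files):
    """files: dict mapping file name -> {run: set of lumis}.
    Returns {(run, lumi): [files]} for every (run, lumi) pair
    that appears in more than one file."""
    seen = defaultdict(list)
    for fname, runs in files.items():
        for run, lumis in runs.items():
            for lumi in lumis:
                seen[(run, lumi)].append(fname)
    return {pair: names for pair, names in seen.items() if len(names) > 1}

# The example above: lumi 4 is in both files for run 1, so it is
# duplicated; lumi 7 appears twice, but in different runs.
files = {
    "File1": {1: {1, 2, 3, 4, 7}, 2: {1, 2, 3}},
    "File2": {1: {4, 5, 6}, 2: {5, 6, 7}},
}
print(find_duplicate_lumis(files))  # {(1, 4): ['File1', 'File2']}
```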

When a dataset has duplicated lumis, it is generally invalidated and the producing workflow is run again. Input:

  • A text file with the workflows to look into
  • A text file in which the workflows with duplicate events will be saved.
Usage: note that if the option -v is used, the script will print out every run/lumi that is duplicated and the files in which it is present. This can be used as input for file invalidation.


Analyzes the output of duplicateEvents.py and calculates the minimum set of files to invalidate.

  • Usage: python analyzeDuplicates.py FILE
  • FILE: A text file with the lumi output of duplicateEvents.py run with the -v (verbose) option. It should be in this format:
      dataset : /DATASET_NAME
      lumi x is in these files
      file 1
      file 2
      lumi y is in these files
      file 3
      file 4
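A minimal sketch of how such a file could be parsed and a small invalidation set chosen greedily. This is hypothetical code, assuming the format above; the real analyzeDuplicates.py may use a different strategy:

```python
from collections import defaultdict

def parse_duplicates(text):
    """Parse the verbose duplicateEvents.py output sketched above
    into a dict: lumi label -> list of files containing it."""
    lumis, current = {}, None
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("dataset"):
            continue
        if line.startswith("lumi") and line.endswith("files"):
            current = line.split()[1]
            lumis[current] = []
        elif current is not None:
            lumis[current].append(line)
    return lumis

def files_to_invalidate(lumis):
    """Greedy choice: while some lumi is still present in two or more
    valid files, invalidate the file appearing in the most such lumis."""
    invalid = set()
    while True:
        count = defaultdict(int)
        for fnames in lumis.values():
            valid = [f for f in fnames if f not in invalid]
            if len(valid) > 1:
                for f in valid:
                    count[f] += 1
        if not count:
            return invalid
        invalid.add(max(count, key=count.get))

SAMPLE = """dataset : /DATASET_NAME
lumi x is in these files
file 1
file 2
lumi y is in these files
file 3
file 4"""
print(sorted(files_to_invalidate(parse_duplicates(SAMPLE))))
```

Invalidating the file that appears in the most duplicated lumis first tends to keep the invalidation set small.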


This replaces the use of dbsTest.py, calculating a workflow's completion percentage using dbs3 only. Usage:

  • Create a proxy and load WMAgent environment first.
  • Type:
    python2.6 WmAgentScripts/WorkflowPercentage.py [options] [workflow_name]
  • options include:
    • -l Displays percentage of lumis instead of percentage of events.
    • -v Displays more detailed information
    • -f file Calculates the percentage on the list of workflows inside a text file.
    • -k Discounts invalid files.
  • Example:
    jbadillo@vocms049:~ $ python WmAgentScripts/WorkflowPercentage.py jbadillo_TOP-Summer11LegwmLHE-00010_00005_v0__141103_115430_5369
    jbadillo_TOP-Summer11LegwmLHE-00010_00005_v0__141103_115430_5369 /TT_weights_CT10_7TeV-powheg/Summer11LegwmLHE-START53_LV4-v2/GEN 66.321%
  • With details:
    jbadillo@vocms049:~ $ python WmAgentScripts/WorkflowPercentage.py -v pdmvserv_SUS-Summer12FS53-00135_00043_v0__140404_121722_8710
    pdmvserv_SUS-Summer12FS53-00135_00043_v0__140404_121722_8710 /CMSSM_tanb-10_m0-1100to2000_m12-400to575_TuneZ2star_8TeV-pythia6/Summer12-START53_V19_FSIM_PU_S12-v1/AODSIM
    Input events: 12000000 Output events: 10814529 (90.121075%)(filter=1.0)
  • Lumis instead of events:
    jbadillo@vocms049:~ $ python WmAgentScripts/WorkflowPercentage.py -l -v pdmvserv_SUS-Summer12FS53-00135_00043_v0__140404_121722_8710
    pdmvserv_SUS-Summer12FS53-00135_00043_v0__140404_121722_8710 /CMSSM_tanb-10_m0-1100to2000_m12-400to575_TuneZ2star_8TeV-pythia6/Summer12-START53_V19_FSIM_PU_S12-v1/AODSIM
    Input lumis: 120000 Output lumis: 108155 (90.1291666667%)
  • With a text file (multiple workflows):
    jbadillo@vocms049:~ $ python WmAgentScripts/WorkflowPercentage.py -f wfs
    pdmvserv_BTV-RunIIFall14GS-00003_00005_v0__141031_161554_2881 /QCD_Pt_30to50_TuneCUETP8M1_13TeV_pythia8/RunIIFall14GS-MCRUN2_71_V1-v1/GEN-SIM 99.6406666667%
    pdmvserv_BTV-RunIIFall14GS-00005_00005_v0__141031_161604_1266 /QCD_Pt_80to120_TuneCUETP8M1_13TeV_pythia8/RunIIFall14GS-MCRUN2_71_V1-v1/GEN-SIM 99.475%
    pdmvserv_BTV-RunIIFall14GS-00006_00005_v0__141031_161559_3749 ....
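The reported percentage is essentially produced output over expected output (input events times the filter efficiency). A minimal sketch of the arithmetic, using a hypothetical helper rather than the actual script code:

```python
def workflow_percentage(input_events, output_events, filter_eff=1.0):
    """Completion percentage as in the examples above:
    expected output = input events * filter efficiency."""
    return 100.0 * output_events / (input_events * filter_eff)

# Second example above: 12000000 input events, 10814529 output events
print("%.6f%%" % workflow_percentage(12000000, 10814529))  # 90.121075%
```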

StoreResults /createStoreResults.py

Handles a StoreResults request; for further information see the StoreResults twiki.


This script changes the status of a dataset (and of its files, if the option is given) in DBS3. Here are the parameters to be used:

  • --dataset=DATASET_NAME: the complete dataset name
  • --status=NEW_STATUS: new status that the dataset will acquire (usually: DELETED, DEPRECATED, INVALID, PRODUCTION, VALID)
  • (optional) --files: use this option when you want to change the files status as well. Files will be set to INVALID (or 0) when deleting/deprecating/invalidating a dataset. Files will be set to VALID (or 1) when moving the dataset to production/valid.
You need to run it (as yourself, with a valid proxy) on a machine that has WMAgent 0.9.82, sourcing the following files:
source /data/srv/wmagent/current/sw/slc5_amd64_gcc461/cms/dbs3-client/3.1.7b/etc/profile.d/init.sh
source /data/srv/wmagent/current/apps/wmagent/etc/profile.d/init.sh
source /afs/cern.ch/cms/LCG/LCG-2/UI/cms_ui_env.sh

Also be careful to adapt the first command to the dbs3-client version available on that agent.


python2.6 RelVal/setDatasetStatusDBS_2_3.py --dataset=<dataset> --status=<new_status> {--files}


python2.6 RelVal/setDatasetStatusDBS_2_3.py --dataset=/RelValProdTTbar/CMSSW_3_5_1-MC_3XY_V21_HcalCalIsoTrk-v1/ALCARECO --status=INVALID --files 

Scripts that interact with the WMAgents

Slot Configuration

~/www/wmaconfig/slot-limits.conf This file has a list of sites and their thresholds, and it is used by every agent to decide how many jobs should be submitted to a given site. It can also be used to set a given site to "down" or "drain" mode; here is the difference between them:

  • "down": no jobs at all are submitted to the site.
  • "drain": no processing jobs are submitted, but all the others are allowed.
Note that you only need to change it in one CERN machine/place; it will then be downloaded via HTTP by the other agents (at FNAL).


T2_US_Purdue 10000
T2_FR_GRIF_LLR 1000 down
T2_PT_NCG_Lisbon 800 drain
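The format is simply one line per site: SITE THRESHOLD [down|drain]. A sketch of how an agent might parse it (hypothetical helper, assuming the three-column format shown above):

```python
def parse_slot_limits(text):
    """Parse slot-limits.conf lines of the form
    'SITE THRESHOLD [down|drain]' into a dict per site."""
    sites = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) < 2:
            continue  # skip blank or malformed lines
        mode = parts[2] if len(parts) > 2 else "normal"
        sites[parts[0]] = {"threshold": int(parts[1]), "mode": mode}
    return sites

CONF = """T2_US_Purdue 10000
T2_FR_GRIF_LLR 1000 down
T2_PT_NCG_Lisbon 800 drain"""
limits = parse_slot_limits(CONF)
print(limits["T2_FR_GRIF_LLR"])  # {'threshold': 1000, 'mode': 'down'}
```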

Condor Overview

Displays a summary of the condor jobs in a WMAgent, classified by Site and Task.

  • It also shows a summary of jobs that have been running for more than 48 hours, jobs with MaxWallTime > 42 hours, jobs that failed more than 3 times, and removed jobs.
  • It must be run from a WMAgent machine.
  • You need to load the WMAgent environment first.
  • If you run it from a production machine, this script is available under the alias condor_overview

An example of the output:

cmst1@vocms0308:/data/srv/wmagent/current $ python ~jbadillo/WmAgentScripts/condor_overview.py
| Running              | Processing | Production | Merge      | Cleanup    | LogCollect | Total      |
| T1_DE_KIT            |        268 |          0 |          0 |          0 |          0 |        268 |
| T1_ES_PIC            |        199 |          0 |          0 |          0 |          0 |        199 |
| T1_IT_CNAF           |        481 |          0 |          9 |          0 |          4 |        494 |
| T1_UK_RAL            |         98 |          0 |          0 |          0 |          0 |         98 |
| T1_US_FNAL           |         16 |          0 |          0 |          0 |          0 |         16 |
| T2_CH_CERN           |          4 |          0 |          0 |          0 |          0 |          4 |
| T2_IT_Pisa           |          1 |          0 |          0 |          0 |          0 |          1 |
| T2_US_Caltech        |          5 |          0 |          0 |          0 |         12 |         17 |
| T2_US_Purdue         |          0 |          0 |          0 |          0 |          1 |          1 |
| T2_US_Vanderbilt     |          0 |          0 |          0 |          0 |         27 |         27 |
| Total                |       1072 |          0 |          9 |          0 |         44 |       1125 |

| Pending              | Processing | Production | Merge      | Cleanup    | LogCollect | Total      |
| T1_DE_KIT            |          1 |          0 |          1 |          0 |          0 |          2 |
| T1_ES_PIC            |        185 |          0 |          0 |          0 |          0 |        185 |
| T1_IT_CNAF           |        197 |          0 |          0 |          0 |          1 |        198 |
| T1_UK_RAL            |        187 |          0 |          0 |          0 |          0 |        187 |
| T2_US_Caltech        |         19 |          0 |          0 |          0 |          0 |         19 |
| Total                |        589 |          0 |          1 |          0 |          1 |        591 |
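The tables above are site-by-task counts with a Total row and column. A sketch of the aggregation, assuming the jobs have already been reduced to (site, task type) pairs extracted from condor_q (the real script queries condor directly):

```python
from collections import defaultdict

def summarize_jobs(jobs):
    """Build a site x task-type count table, with Total row and
    column, from (site, task_type) pairs."""
    table = defaultdict(lambda: defaultdict(int))
    for site, task in jobs:
        table[site][task] += 1
        table[site]["Total"] += 1
        table["Total"][task] += 1
        table["Total"]["Total"] += 1
    return table

jobs = [("T1_DE_KIT", "Processing"), ("T1_DE_KIT", "Processing"),
        ("T1_IT_CNAF", "Merge")]
table = summarize_jobs(jobs)
print(table["T1_DE_KIT"]["Processing"], table["Total"]["Total"])  # 2 3
```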

Condor Global Overview

Displays results similar to Condor Overview, but collecting job information from all the agents connected to the same pool. Since you are only interested in production agents (not Crab3 or test agents), you should check that the schedds list at the beginning of the script contains the agents you're interested in.


cmst1@vocms0308:/data/srv/wmagent/current $ python ~jbadillo/WmAgentScripts/condor_global_overview.py 
getting jobs from cmssrv217.fnal.gov
getting jobs from cmssrv218.fnal.gov
| Running              | Processing | Production | Merge      | Cleanup    | LogCollect | Total      |
| T1_DE_KIT            |        932 |          0 |          8 |          0 |          0 |        940 |
| T1_ES_PIC            |       1575 |          0 |         93 |          0 |          2 |       1670 |
| T2_US_Wisconsin      |       1780 |          0 |          9 |          0 |          0 |       1789 |
| Total                |      20844 |          0 |        234 |          0 |        174 |      21252 |

| Pending              | Processing | Production | Merge      | Cleanup    | LogCollect | Total      |
| T1_ES_PIC            |        456 |          0 |         25 |          0 |          0 |        481 |
| T1_US_FNAL           |         43 |          0 |          2 |          0 |          0 |         45 |
| T2_US_Caltech        |          1 |          0 |          0 |          0 |          0 |          1 |
| Total                |       1549 |          0 |         53 |          0 |          0 |       1602 |

Jobs that have MaxWall > 42 hours by workflow:

jbadillo_ACDC_B2G-RunIISpring15DR74-00637_00301_v0__150824_092009_3558 : 4960.35

| Removed              | Processing | Production | Merge      | Cleanup    | LogCollect | Total      |
| T2_US_Purdue         |         18 |          0 |          0 |          0 |          0 |         18 |
| Total                |         18 |          0 |          0 |          0 |          0 |         18 |

Jobs with RemoveReason!=UNDEFINED

jbadillo_B2G-RunIISpring15DR74-00648_00296_v0__150824_090658_6414 : 5046.2 5046.6 5046.7 5046.8 5046.9 5046.10 ...

| Restarted            | Processing | Production | Merge      | Cleanup    | LogCollect | Total      |
| T1_IT_CNAF           |          1 |          0 |          0 |          0 |          0 |          1 |
| T2_US_Caltech        |          1 |          0 |          0 |          0 |          0 |          1 |
| Total                |          2 |          0 |          0 |          0 |          0 |          2 |

Jobs with NumJobStart > 3

pdmvserv_JME-2019GEMUpg14DR-00044_00103_v0_hcal_150821_153934_4927 : 327814.83

Other Scripts

bash copyLFNfromSite.sh

First, set up a proxy with the agent certificate, because it is the easiest:

voms-proxy-init --key=$X509_HOST_KEY --cert=$X509_HOST_CERT -voms cms

If you cannot run this, source the following instead:

source /afs/cern.ch/cms/LCG/LCG-2/UI/cms_ui_env.sh

Then run the script, specifying the site, LFN, and destination file (relative to the current directory):

[vocms15] /data/srv/wmagent/current > bash copyLFNfromSite.sh T1_CH_CERN /store/data/Run2012D/MultiJet/RAW/v1/000/206/484/DCB3A637-ED24-E211-83B3-5404A63886A2.root file.name
Using grid catalog type: UNKNOWN
[BDII][][] lcg-bdii.cern.ch:2170: No LFC Endpoint found
Using grid catalog : (null)
VO name: cms
Checksum type: None
Trying SURL srm://srm-cms.cern.ch:8443/srm/managerv2?SFN=/castor/cern.ch/cms/store/data/Run2012D/MultiJet/RAW/v1/000/206/484/DCB3A637-ED24-E211-83B3-5404A63886A2.root ...
Source SE type: SRMv2
Source SRM Request Token: 135960845
Source URL: srm://srm-cms.cern.ch:8443/srm/managerv2?SFN=/castor/cern.ch/cms/store/data/Run2012D/MultiJet/RAW/v1/000/206/484/DCB3A637-ED24-E211-83B3-5404A63886A2.root
File size: 3992285282
Source URL for copy: gsiftp://lxfsrf01c04.cern.ch:20753/cc331edd-7a64-1d01-e043-6ea18a890c5d
Destination URL: file:/tmp/delete.me
# streams: 1
   3938451456 bytes  71224.64 KB/sec avg  75263.40 KB/sec inst
Transfer took 55680 ms

[vocms15] /data/srv/wmagent/current > bash copyLFNfromSite.sh T2_US_MIT /store/unmerged/logs/prod/2013/4/4/jen_a_ACDC234Pro_Winter532012DMETParked_FNALPrio4_537p6_130403_042115_5913/DataProcessing/DataProcessingMergeRECOoutput/skim_2012D_METParked/0001/1/00710a8c-9c5e-11e2-9408-003048f02d38-1-1-logArchive.tar.gz MIT.tar.gz

How to access scripts in the WmAgent GitHub Repository

For first time

If this is the first time you are accessing the repository, do the following:

  1. Log into any lxplus machine
     ssh lxplus.cern.ch 
  2. Run the following command
    git clone https://github.com/CMSCompOps/WmAgentScripts.git 
This will create the WmAgentScripts directory.

Not first time

  1. Log into any lxplus machine
     ssh lxplus.cern.ch 
  2. cd to the WmAgentScripts
     cd WmAgentScripts
  3. Run the following command
    source setenvscripts.sh
This last script ensures you have a kerberos token, creates a proxy, and guarantees the repository is up to date.

You will see something like this:

[lxplus420] /afs/cern.ch/user/e/efajardo/WmAgentScripts > source setenvscripts.sh 
Password for efajardo@CERN.CH: 
Already up-to-date.
Enter GRID pass phrase:
Your identity: /DC=ch/DC=cern/OU=Organic Units/OU=Users/CN=efajardo/CN=722781/CN=Edgar Fajardo Hernandez
Creating temporary proxy .......................................................... Done
Contacting  lcg-voms.cern.ch:15002 [/DC=ch/DC=cern/OU=computers/CN=lcg-voms.cern.ch] "cms" Done
Creating proxy ........................... Done
Your proxy is valid until Wed Sep 18 22:50:51 2013
[lxplus420] /afs/cern.ch/user/e/efajardo/WmAgentScripts 

Once this is done you should be able to run the scripts described above.

-Main.JulianBadillo - 2015-10-09

Topic revision: r75 - 2020-05-05 - DmytroKovalskyi