ReDigi workflows

Prestaging
The first step is to subscribe the input dataset to a Tier-1 disk endpoint, if necessary. Usually this is the same site that holds the GEN-SIM on tape, but a different Tier-1 can be used, for example if the custodial Tier-1 has too much work. If there is no custodial site, subscribe the GEN-SIM to an appropriate Tier-1 based on the current workload at each site.
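
The site choice described above can be sketched in a few lines of Python. This is only an illustration, not part of WmAgentScripts: the function name, the load figures, the busy_threshold value and the placeholder site names T1_XX_A and T1_XX_B are all assumptions, and how "too much work" is measured is left open.

```python
def choose_prestage_site(custodial_site, site_load, busy_threshold=10000):
    """Pick the Tier-1 whose disk endpoint should receive the GEN-SIM input.

    custodial_site: Tier-1 holding the tape copy, or None if there is none.
    site_load: dict mapping Tier-1 name -> current amount of work (hypothetical metric).
    busy_threshold: assumed load above which a site counts as having too much work.
    """
    if not site_load:
        raise ValueError("no candidate Tier-1 sites given")
    # Prefer the custodial site unless it is overloaded or unknown.
    if custodial_site in site_load and site_load[custodial_site] <= busy_threshold:
        return custodial_site
    # Otherwise fall back to the least loaded Tier-1 (ties broken alphabetically).
    return min(sorted(site_load), key=site_load.get)


# Illustrative numbers only.
load = {"T1_ES_PIC": 500, "T1_XX_A": 20000, "T1_XX_B": 3000}
print(choose_prestage_site("T1_XX_A", load))  # custodial site is too busy
print(choose_prestage_site(None, load))       # no custodial site
```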

I use the script https://github.com/alahiff/WmAgentScripts/blob/AndrewFixes/listWorkflows.py to produce a list of workflows and input datasets in the assignment-approved state:

python listWorkflows.py

In testing: script which automatically assigns GEN-SIM datasets to disk on custodial Tier-1s.

Assigning ReDigi workflows
The script https://github.com/alahiff/WmAgentScripts/blob/AndrewFixes/assignWorkflowsAuto.py is used to assign workflows. It is designed to be run twice: first as a "dry run" to check that everything is fine (e.g. acquisition era, ProcessingString, etc.), then again for real.

Example checking assignment of a single workflow:

bash-3.2$ python assignWorkflowsAuto.py -w pdmvserv_B2G-Summer12DR53X-00799_00332_v0__141022_160538_7286 -s T1_ES_PIC
Would assign  pdmvserv_B2G-Summer12DR53X-00799_00332_v0__141022_160538_7286  with  Acquisition Era: Summer12DR53X ProcessingString: PU_S10_START53_V19 ProcessingVersion: 1 lfn: /store/mc Site(s): T1_ES_PIC Custodial Site: T1_ES_PIC team: reproc_lowprio

The script then needs to be run again with the -e option in order to actually assign the workflow.
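
As a small convenience, the two invocations (dry run, then the real assignment) can be generated together so that they always use the same workflow and site. This is a sketch: the helper name is hypothetical, and it uses only the -w, -s and -e options shown above. It prints the commands rather than running them.

```python
def build_assign_commands(workflow, site):
    """Return (dry_run, for_real) argument lists for assignWorkflowsAuto.py."""
    dry_run = ["python", "assignWorkflowsAuto.py", "-w", workflow, "-s", site]
    # The real assignment is the same command with -e appended.
    for_real = dry_run + ["-e"]
    return dry_run, for_real


dry_run, for_real = build_assign_commands(
    "pdmvserv_B2G-Summer12DR53X-00799_00332_v0__141022_160538_7286", "T1_ES_PIC"
)
print(" ".join(dry_run))   # check the "Would assign ..." output first
print(" ".join(for_real))  # then run this one to assign
```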

In progress: setting the event splitting for run-dependent MC workflows, where necessary, to guarantee the required number of jobs. Some tidying up is also still needed after the removal of the code that generated the processing string.

In testing: a script that automatically assigns workflows to the appropriate sites once their input datasets are 100% complete (or above a certain threshold).
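
The completeness check behind such a script might look like the following. This is a minimal sketch, not the script under test: the function name, the workflow names and the percentages are made up, and how the completeness of an input dataset at the site is obtained is not covered here.

```python
def ready_to_assign(input_completeness, threshold=100.0):
    """Return the workflows whose input dataset is complete enough to assign.

    input_completeness: dict mapping workflow name -> percentage of the input
        dataset already on disk at the chosen site (hypothetical values).
    threshold: minimum percentage required; 100.0 means fully complete.
    """
    return [wf for wf, pct in sorted(input_completeness.items()) if pct >= threshold]


# Illustrative values only.
status = {"workflow_a": 100.0, "workflow_b": 97.5, "workflow_c": 42.0}
print(ready_to_assign(status))                  # only fully staged inputs
print(ready_to_assign(status, threshold=95.0))  # relaxed threshold
```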

Announcing workflows
I use the script https://github.com/alahiff/WmAgentScripts/blob/AndrewFixes/listWorkflows.py to produce a list of workflows and input datasets in the closed-out state:

python listWorkflows.py closed-out

The script https://github.com/CMSCompOps/WmAgentScripts/blob/master/WorkflowPercentage.py can be used both to generate a list of the output datasets and to check how complete they are.

In testing: a script that gets the list of closed-out workflows and, if everything is ok, sets them to announced and sets their output datasets to VALID.

Topic revision: r3 - 2014-11-05 - AndrewLahiff
 