Evolution of interactions between McM and ReqMgr/WMAgent

Discussed on 10. October 2013

Issue reporting

  • Description: The general idea is to tie computing issue reports to the request in ReqMgr (the link with McM via the PrepID is just a remapping), so as to expose issue reports and ongoing issue resolution to the requesters via McM (or any other interface we can link to)
  • general request: add link to every workflow to quickly and automatically see all communication concerning a workflow
    • ideas:
      • mark all elogs with the affected workflow name(s), and present, for every workflow, a link to the elog search for its name
  • general development request:
    • need metadata field (free text field) per workflow
    • need url field (list of urls) per workflow to link ticket/TWiki/elog/etc.
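
As an illustration of the two requested fields, here is a minimal sketch of a per-workflow record carrying a free-text note and a list of URLs, plus a generated elog search link. The class name, the field names and the elog search URL are all hypothetical; nothing here reflects an existing ReqMgr or elog interface.

```python
from dataclasses import dataclass, field
from urllib.parse import urlencode

# Hypothetical elog search endpoint, used only for illustration.
ELOG_SEARCH_URL = "https://elog.example.org/search"


@dataclass
class WorkflowAnnotations:
    """Hypothetical per-workflow annotations: free text plus a list of URLs."""
    workflow_name: str
    notes: str = ""                             # free-text metadata field
    links: list = field(default_factory=list)   # tickets, TWiki pages, elogs, ...

    def elog_search_link(self):
        """Build a search link for elogs marked with this workflow name."""
        return ELOG_SEARCH_URL + "?" + urlencode({"q": self.workflow_name})

    def all_links(self):
        """Return the generated elog search link followed by the stored URLs."""
        return [self.elog_search_link()] + list(self.links)
```

With such a record, an interface like McM would only need to render `all_links()` next to each workflow.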

Processing String and Acquisition Era

  • confirmed again: campaign name is acquisition era
    • Ops is setting the era according to the campaign name
    • McM is doing validation on their side (no '-' and '_' in the name) and applies common sense
  • McM is handing over to ReqMgr2:
    • campaign/era name
    • processing string
    • flag whether the workflow is an extension or not
  • Assignment scripts are checking for already existing versions of the output dataset names of a workflow
    • query logic
      • first querying DBS
      • to cover for the case that a workflow has not yet registered a file in DBS, query ReqMgr if the output dataset name was not found in DBS
    • then increase the version counter if necessary, otherwise start with v1
  • McM is going to query ReqMgr (needs api?) for the version number to keep McM version number in sync (example: Ops needs to clone and resubmit, version number could be increased)
    • annotation: what was said is that McM is going to try to keep the McM version number in sync; McM cannot query ReqMgr directly
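
The name validation and the version lookup described above can be sketched as follows. This is only a sketch under stated assumptions: it assumes output dataset names of the form /Primary/Era-ProcessingString-vN/TIER, and the two lookup callables (`list_dbs`, `list_reqmgr`) are hypothetical stand-ins for the real DBS and ReqMgr queries, which are not shown here.

```python
import re

# Assumed layout of the middle part of a dataset name: Era-ProcessingString-vN.
# The era must not contain '-', which is what makes this split unambiguous.
_PROCESSED = re.compile(r"^(?P<era>[^-]+)-(?P<proc>.+)-v(?P<version>\d+)$")


def is_valid_campaign_name(name):
    """Reject empty names and names containing '-' or '_'."""
    return bool(name) and "-" not in name and "_" not in name


def parse_dataset(name):
    """Split a dataset name into (primary, era, proc, version, tier), or None."""
    parts = name.split("/")
    if len(parts) != 4 or parts[0] != "":
        return None
    match = _PROCESSED.match(parts[2])
    if match is None:
        return None
    return (parts[1], match.group("era"), match.group("proc"),
            int(match.group("version")), parts[3])


def next_version(primary, era, proc, tier, list_dbs, list_reqmgr):
    """Return the version to assign: highest existing version + 1, else 1.

    list_dbs and list_reqmgr are hypothetical callables returning dataset
    names; ReqMgr is consulted only if DBS has no matching dataset.
    """
    def versions(names):
        found = []
        for name in names:
            parsed = parse_dataset(name)
            if parsed and parsed[:3] == (primary, era, proc) and parsed[4] == tier:
                found.append(parsed[3])
        return found

    found = versions(list_dbs())
    if not found:
        found = versions(list_reqmgr())
    return max(found) + 1 if found else 1
```

Passing the lookups in as callables keeps the sketch independent of any particular client library and makes the DBS-first, ReqMgr-second order easy to test.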

Next discussion: 24 October 2013

  • Request type Monte Carlo with flag lhe false (i.e. produce LHE events from scratch and store them in /GEN datasets)
    • McM has to set a minimum number of events per job to be sure to cover the phase space in one job, this is a physics requirement that cannot be changed
    • But Ops can run more events per job if the time/event allows the jobs to fit in the standard queues
    • The job splitting and the queue choice is done on assignment level => optimization of job splitting has to be done on assignment level as well
    • The following checks need to be done at assignment, where the splitting needs to be reset by default:
      1. If minimum number of events per job times time/event exceeds 36 hours, a special queue has to be chosen, and the job splitting is set to the minimum number of events with an optimized lumi section size (usually 300 events)
      2. If minimum number of events per job times time/event is less than 8 hours, increase the number of events per job to fit in at least 8 hour jobs and do the splitting as normal
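
The two checks can be sketched as a small function. This is an illustration only: the function name, the returned dictionary and the queue labels "special" and "standard" are hypothetical, and the 300 events per lumi section is just the usual value quoted above.

```python
import math

SECONDS_PER_HOUR = 3600.0


def plan_splitting(min_events_per_job, time_per_event_s, events_per_lumi=300):
    """Choose queue and events per job from the minimum job size.

    min_events_per_job: physics-driven lower bound on events per job
    time_per_event_s:   estimated time per event, in seconds
    """
    if min_events_per_job <= 0 or time_per_event_s <= 0:
        raise ValueError("events per job and time per event must be positive")

    runtime_s = min_events_per_job * time_per_event_s

    # Check 1: the minimum job does not fit in 36 hours, so use a special
    # queue, keep the minimum number of events and set the lumi section size.
    if runtime_s > 36 * SECONDS_PER_HOUR:
        return {"queue": "special",
                "events_per_job": min_events_per_job,
                "events_per_lumi": events_per_lumi}

    # Check 2: the minimum job is shorter than 8 hours, so raise the number
    # of events per job until the job lasts at least 8 hours.
    if runtime_s < 8 * SECONDS_PER_HOUR:
        events = math.ceil(8 * SECONDS_PER_HOUR / time_per_event_s)
        return {"queue": "standard",
                "events_per_job": max(min_events_per_job, events),
                "events_per_lumi": None}

    # Otherwise the minimum number of events already gives a job of 8 to 36 hours.
    return {"queue": "standard",
            "events_per_job": min_events_per_job,
            "events_per_lumi": None}
```

For example, 100 events per job at 10 s/event would last under 8 hours, so the sketch raises the job size to 2880 events.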

Wish List

Uploading of Configuration Files

The uploading of configuration files depends on PSetTweaks.WMTweak and WMConfigCache, although there is no real need for such a dependency. These dependencies have repeatedly created problems, and we are using a very outdated version of wmclient for the upload. The aim is to have a configCache API to which one can just upload the .py file with the other required parameters, without pre-processing.

  • Six months ago, Eric provided a simplified version of the script JR is using
    • No dependency on WMCore, still depends on PSetTweaks. Simple "fetch and install" script could be provided if useful.
    • JR will try to find manpower to test
  • Longer term could consider a service which did this from a config
    • Code is not difficult but requires parts of WMCore and CMSSW installed in the same place
    • Precludes installation on CMS Web

Uploading of Request Dictionary

Requests are built with "wmcontrol", a third-party tool that translates the McM request content into a ReqMgr request content. It would be great to be able to upload the McM request dictionary to ReqMgr, which in turn would perform the translation internally. For example, here is an McM request dictionary: https://cms-pdmv.cern.ch/mcm//public/restapi/requests/get/TOP-Summer12-00191

  • JR will open a ticket on GitHub against ReqMgr2. We will re-discuss when Oli is available and possibly consider for ReqMgr 2

Proposal: ReqMgr Status and acknowledgements

  • holding off on assignment is not necessary anymore, the prioritization mechanisms have been improved so that we can re-prioritize even when the workflow is running
  • proposal is to remove the active acknowledgement step via mail
    • status: assignment-approved means not acknowledged yet, waiting for preparation of tape families or similar
    • status >= assigned: acknowledged, in the system and waiting for processing
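
Under this proposal the acknowledgement can be derived from the status alone, as in the sketch below. The status list is an assumed, simplified ordering of the normal request lifecycle and the function name is hypothetical; only "assignment-approved", "assigned" and "announced" are taken from this page.

```python
# Assumed, simplified ordering of request statuses (illustration only;
# failure and archive states are deliberately left out).
STATUS_ORDER = [
    "new",
    "assignment-approved",
    "assigned",
    "acquired",
    "running-open",
    "running-closed",
    "completed",
    "closed-out",
    "announced",
]


def is_acknowledged(status):
    """A request counts as acknowledged once its status is >= assigned.

    Raises ValueError for a status that is not in STATUS_ORDER.
    """
    return STATUS_ORDER.index(status) >= STATUS_ORDER.index("assigned")
```

A requester-facing interface could then show "waiting for preparation" for assignment-approved and "acknowledged" for anything from assigned onwards, without any mail being sent.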

Proposal: no announcement emails

Events Per Lumi

See this conclusion which does not seem to be enforced, or understood.

ACDC status toggling

See for a recurrent issue. If ACDC workflows are not moved to announced when the dataset is moved to VALID, the next step of the chain cannot move forward. A solution needs to be found to unambiguously trigger on the full completion of a request (with its associated others). Current example with /W3Jets_TuneZ2_7TeV-madgraph-tauola/Summer11Leg-START53_LV4-v1/GEN-SIM status in McM and Screen_Shot_2013-10-31_at_12.02.20_PM.png
Topic revision: r13 - 2013-11-14 - JeanrochVlimant
 