Agent tweaks

  • AgentStatusWatcher disabled ==> OK
  • Set maxRetries to 0 ==> OK
  • Restart the necessary components ==> OK
    $manage execute-agent wmcoreD --restart --components=RetryManager,ErrorHandler,AgentStatusWatcher
  • Enable ALL sites and set high slot thresholds for them ==> OK
    $manage db-prompt wmagent
    UPDATE wmbs_location SET state=(SELECT id from wmbs_location_state where name='Normal') WHERE state!=(SELECT id from wmbs_location_state where name='Normal');
    UPDATE wmbs_location SET running_slots=2000, pending_slots=1000;
    UPDATE rc_threshold SET max_slots=2000, pending_slots=1000;
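
The effect of the three UPDATE statements above can be tried out safely before touching a real agent. The following is a minimal sketch that runs them against in-memory SQLite stand-in tables. Only the table and column names that appear in the statements come from this page; the `site_name` column, the site names, the 'Down' state and the starting slot values are assumptions made for illustration, and the real agent database has a richer schema.

```python
import sqlite3

# In-memory stand-in for the agent database (assumed, simplified schema).
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.executescript("""
CREATE TABLE wmbs_location_state (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE wmbs_location (site_name TEXT, state INTEGER,
                            running_slots INTEGER, pending_slots INTEGER);
CREATE TABLE rc_threshold (site_name TEXT, max_slots INTEGER,
                           pending_slots INTEGER);
INSERT INTO wmbs_location_state VALUES (1, 'Normal'), (2, 'Down');
-- Hypothetical sites: one already Normal, one not.
INSERT INTO wmbs_location VALUES ('SITE_A', 1, 100, 50), ('SITE_B', 2, 0, 0);
INSERT INTO rc_threshold VALUES ('SITE_A', 100, 50), ('SITE_B', 0, 0);
""")

# The three statements from the list above, unchanged.
cur.execute(
    "UPDATE wmbs_location SET state=(SELECT id from wmbs_location_state "
    "where name='Normal') WHERE state!=(SELECT id from wmbs_location_state "
    "where name='Normal')"
)
cur.execute("UPDATE wmbs_location SET running_slots=2000, pending_slots=1000")
cur.execute("UPDATE rc_threshold SET max_slots=2000, pending_slots=1000")
conn.commit()

# Inspect the result: every site should be Normal with the new slot values.
locations = cur.execute(
    "SELECT l.site_name, s.name, l.running_slots, l.pending_slots "
    "FROM wmbs_location l JOIN wmbs_location_state s ON l.state = s.id "
    "ORDER BY l.site_name"
).fetchall()
thresholds = cur.execute(
    "SELECT site_name, max_slots, pending_slots FROM rc_threshold "
    "ORDER BY site_name"
).fetchall()
print(locations)
print(thresholds)
```

Note that the second and third statements have no WHERE clause, so they overwrite the slot values of every row, which is the intent here.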

Draining logs

cmst1@vocms0311:/data/srv/wmagent/current $ python drainAgent.py 

*** Amount of jobs in condor per workflow, sorted by condor job status:

*** WORKFLOWS: found 10 distinct workflows in this agent.
prozober_ACDC0_SUS-RunIISpring16FSPremix-00083_00047_v0__170210_085036_9266                                                  	completed
pdmvserv_task_HIG-RunIISummer16DR80Premix-01881__v1_T_170113_162606_4964                                                     	running-closed
pdmvserv_task_SMP-PhaseIIFall16GS82-00004__v1_T_170213_041355_8715                                                           	running-closed
pdmvserv_task_HIG-RunIISummer16DR80Premix-01911__v1_T_170116_133411_9741                                                     	completed
pdmvserv_task_HIG-PhaseIFall16wmLHEGS-00003__v1_T_170203_174534_3479                                                         	completed
pdmvserv_task_TOP-RunIISummer16DR80Premix-00083__v1_T_161219_123711_1390                                                     	completed
pdmvserv_task_SMP-PhaseIIFall16GS82-00005__v1_T_170213_041344_640                                                            	running-closed
pdmvserv_task_HIG-RunIISummer16DR80Premix-01880__v1_T_170113_162606_6082                                                     	running-closed
pdmvserv_task_B2G-RunIISummer16DR80Premix-01472__v1_T_170120_005831_9210                                                     	completed
pdmvserv_task_TOP-RunIISummer16DR80Premix-00130__v1_T_170208_203239_6372                                                     	running-closed

*** WORKFLOWS: there are 0 distinct workflows not completed.

*** WORKFLOWS: found 0 workflows not fully injected.

*** WMBS: amount of wmbs jobs in each status:
[{'count': 528992, 'name': 'cleanout'}]

*** SUBSCRIPTIONS: subscriptions not finished: 0

*** SUBSCRIPTIONS: found 0 files available in WMBS (waiting for job creation):

*** SUBSCRIPTIONS: found 0 files acquired in WMBS (waiting for jobs to finish):

*** DBS: found 0 blocks open in DBS. Printing the first 20 blocks only:

*** DBS: found 0 files not uploaded to DBS.

*** PHEDEX: found 0 files not injected in PhEDEx, with valid block id (recoverable).

*** PHEDEX: found 466047 files not injected in PhEDEx, with valid block id (unrecoverable).
==> Which maps to 4032 unique datasets:
... that were NOT produced by any agent-known workflow. OR, the wfs are gone already.

I'm done!

Then double-check the status of the local workqueue and workqueue_inbox databases:

cmst1@vocms0311:/data/srv/wmagent/current $ python localWorkQueueStatus.py 
INFO:root:************* LOCAL workqueue elements summary ************
INFO:root:Found a total of 600 elements in the 'workqueue_inbox' db
INFO:root:{u'Done': 600}
INFO:root:Found a total of 600 elements in the 'workqueue' db
INFO:root:{u'Done': 600}

This agent is GOOD TO GO!!!
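
The visual check of the workqueue summary above could also be scripted. This is a minimal sketch, assuming the output format shown on this page (a status dictionary printed after the `INFO:root:` prefix); the helper name `all_elements_done` is hypothetical and is not part of localWorkQueueStatus.py, and 'Running' is used only as an example of a status other than 'Done'.

```python
import ast

PREFIX = "INFO:root:"


def all_elements_done(log_text):
    """Return True if every status summary in the log reports only 'Done'.

    Returns False when no summary line is found at all, so that an empty
    or unexpected log is never mistaken for a drained agent.
    """
    summaries = []
    for line in log_text.splitlines():
        if not line.startswith(PREFIX):
            continue
        payload = line[len(PREFIX):].strip()
        # Summary lines look like "{u'Done': 600}"; other lines are skipped.
        if payload.startswith("{"):
            summaries.append(ast.literal_eval(payload))
    return bool(summaries) and all(set(s) == {"Done"} for s in summaries)


# Output copied from the run above.
sample = """\
INFO:root:************* LOCAL workqueue elements summary ************
INFO:root:Found a total of 600 elements in the 'workqueue_inbox' db
INFO:root:{u'Done': 600}
INFO:root:Found a total of 600 elements in the 'workqueue' db
INFO:root:{u'Done': 600}
"""
print(all_elements_done(sample))
```

`ast.literal_eval` is used instead of `eval` so that only plain Python literals in the log are evaluated.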

Topic revision: r23 - 2017-02-27 - AlanMalta