Standard Corrective Actions for known problems

TW not running

DBS Migrations fail causing Publisher failures

  • Diagnosis:
    • The Publication Acquired plot in https://monit-grafana.cern.ch/d/zEhXoSxMk/crab-asometrics develops a significant (>1k) baseline, indicating that some files are stuck in Acquired and never get published. This can have many causes, but failed migrations are a known one and should be checked first.
    • log in to crab-prod-tw02 (or wherever appropriate), do sudo su crab3, then do:
      • cd ~/Utils
      • source SETUP.sh
      • optionally do a git pull on the CRABServer directory there if the scripts indicated here need updating
      • python  CRABServer/scripts/Utils/FindFailedMigrations.py --file /data/container/Publisher_schedd/logs/migrations/TerminallyFailedLog.txt
      • typical output
        [crab3@crab-prod-tw02 Utils]$ python  CRABServer/scripts/Utils/FindFailedMigrations.py --file /data/container/Publisher_schedd/logs/migrations/TerminallyFailedLog.txt 
        Found 4 unique migration IDs logged as terminally failed
         2987329
         2987219
         2987260
         2987235
        Check current status
         2987219 has been removed
         2987260 has been removed
         2987235 has been removed
        Found 1 terminally failed migrations
           ID      created         block
         2987329   2021-01-02 06:19:22   /Spin0ToBB_2j_scalar_g1_M200_pT300_TuneCP5_13TeV_madgraphMLM_pythia8/RunIIFall17MiniAODv2-PU2017_12Apr2018_94X_mc2017_realistic_v14-v1/MINIAODSIM#31455a75-885d-4726-bb6f-0fa1357d0cbf
        [crab3@crab-prod-tw02 Utils]$
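
When several migrations are still terminally failed, it can help to extract the IDs and blocks from the report before contacting the DBS experts. The following is a minimal sketch, assuming the report layout is exactly the one shown in the typical output above (a final table with ID, creation time and block name). The function name parse_failed_migrations and the regular expression are illustrative and not part of CRABServer.

```python
import re

# One row of the final table: migration ID, creation timestamp, block name.
# Assumption: the layout matches the sample output of FindFailedMigrations.py.
ROW = re.compile(
    r"^\s*(\d+)\s+(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\s+(/\S+)\s*$"
)


def parse_failed_migrations(report):
    """Return (id, created, block) tuples for the rows of the final table."""
    rows = []
    for line in report.splitlines():
        match = ROW.match(line)
        if match:
            rows.append((int(match.group(1)), match.group(2), match.group(3)))
    return rows


# Report text copied from the typical output shown above.
SAMPLE = """\
Found 4 unique migration IDs logged as terminally failed
 2987329
 2987219
 2987260
 2987235
Check current status
 2987219 has been removed
 2987260 has been removed
 2987235 has been removed
Found 1 terminally failed migrations
   ID      created         block
 2987329   2021-01-02 06:19:22   /Spin0ToBB_2j_scalar_g1_M200_pT300_TuneCP5_13TeV_madgraphMLM_pythia8/RunIIFall17MiniAODv2-PU2017_12Apr2018_94X_mc2017_realistic_v14-v1/MINIAODSIM#31455a75-885d-4726-bb6f-0fa1357d0cbf
"""

if __name__ == "__main__":
    for mig_id, created, block in parse_failed_migrations(SAMPLE):
        # Summary line to paste into the report for the DBS experts.
        print("migration %d created %s for block %s" % (mig_id, created, block))
```

In practice the report text would come from saving the script output to a file and reading it back, rather than from the embedded SAMPLE string.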
        

  • Cure:
    • failed migrations must be reported to DBS experts for investigation
    • if the DBS expert says "go ahead and remove it", or we simply want/need to get rid of the backlog without waiting for the expert, we can remove the failed migration ourselves with:
    • python  CRABServer/scripts/Utils/RemoveFailedMigration.py --id 2987329 (or whatever the correct ID is)
    • the command checks the migration ID, makes sure that it is one which makes sense to remove, and asks for confirmation before removing it. CRAB Publisher will issue a new migration request as/when needed, but if we want to speed things up it can be done right away as well: the RemoveFailedMigration.py command prints out instructions for that.
    • Example
       [crab3@crab-prod-tw02 Utils]$ python  CRABServer/scripts/Utils/RemoveFailedMigration.py --id 2987329
      migrationId: 2987329 was created on 2021-01-02 06:19:22 by service@crab-prod-tw02.cern.ch for block:
       /Spin0ToBB_2j_scalar_g1_M200_pT300_TuneCP5_13TeV_madgraphMLM_pythia8/RunIIFall17MiniAODv2-PU2017_12Apr2018_94X_mc2017_realistic_v14-v1/MINIAODSIM#31455a75-885d-4726-bb6f-0fa1357d0cbf
      Do you want to remove it ? Yes/[No]: N
      [crab3@crab-prod-tw02 Utils]$ python  CRABServer/scripts/Utils/RemoveFailedMigration.py --id 2987329
      migrationId: 2987329 was created on 2021-01-02 06:19:22 by service@crab-prod-tw02.cern.ch for block:
       /Spin0ToBB_2j_scalar_g1_M200_pT300_TuneCP5_13TeV_madgraphMLM_pythia8/RunIIFall17MiniAODv2-PU2017_12Apr2018_94X_mc2017_realistic_v14-v1/MINIAODSIM#31455a75-885d-4726-bb6f-0fa1357d0cbf
      Do you want to remove it ? Yes/[No]: Y
      
      Removing it...
      Migration 2987329 successfully removed
      
      CRAB Publisher will issue such a migration request again as/when needed
      but if you want to recreated it now, you can do it  with this python fragment
      
        ===============
      
      import CRABClient
      from dbs.apis.dbsClient import DbsApi
      globUrl='https://cmsweb.cern.ch/dbs/prod/global/DBSReader'
      migUrl='https://cmsweb.cern.ch/dbs/prod/phys03/DBSMigrate'
      apiMig = DbsApi(url=migUrl)
      block='/Spin0ToBB_2j_scalar_g1_M200_pT300_TuneCP5_13TeV_madgraphMLM_pythia8/RunIIFall17MiniAODv2-PU2017_12Apr2018_94X_mc2017_realistic_v14-v1/MINIAODSIM#31455a75-885d-4726-bb6f-0fa1357d0cbf'
      data= {'migration_url': globUrl, 'migration_input': block}
      result = apiMig.submitMigration(data)
      newId = result.get('migration_details', {}).get('migration_request_id')
      print('new migration created: %d' % newId)
      status = apiMig.statusMigration(migration_rqst_id=newId)
      print(status)
      
        ===============
      
      [crab3@crab-prod-tw02 Utils]$ 
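
The fragment printed by RemoveFailedMigration.py checks the status of the new migration only once. A minimal sketch of how one might keep checking until the migration reaches a final state is given below. It makes two assumptions that should be verified against the DBS documentation: that statusMigration returns a list of dictionaries carrying a migration_status key, and that the numeric codes mean what the STATUS_NAMES table says. The names wait_for_migration, describe_status and STATUS_NAMES are illustrative and not part of CRABServer or the DBS client.

```python
import time

# Assumed meaning of the DBS migration_status codes; verify before relying on it.
STATUS_NAMES = {
    0: "PENDING",
    1: "IN PROGRESS",
    2: "COMPLETED",
    3: "FAILED (will be retried)",
    9: "TERMINALLY FAILED",
}
FINAL_CODES = (2, 9)


def describe_status(code):
    """Translate a numeric status code into a readable label."""
    return STATUS_NAMES.get(code, "UNKNOWN (%s)" % code)


def wait_for_migration(api, migration_id, poll_seconds=60, max_polls=30):
    """Poll api.statusMigration until a final code is seen or polls run out.

    'api' is any object with a statusMigration(migration_rqst_id=...) method,
    e.g. the apiMig object built in the fragment above. Returns the last
    status code seen, or None if no status was ever returned.
    """
    code = None
    for _ in range(max_polls):
        reply = api.statusMigration(migration_rqst_id=migration_id)
        # Assumption: the reply is a list with one dict per migration request.
        code = reply[0].get("migration_status") if reply else None
        if code in FINAL_CODES:
            break
        time.sleep(poll_seconds)
    return code
```

With the apiMig and newId variables from the fragment above, a call would look like print(describe_status(wait_for_migration(apiMig, newId))). If the result is TERMINALLY FAILED again, the migration should go back to the DBS experts rather than being removed and resubmitted repeatedly.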
      

-- StefanoBelforte - 2021-01-02

Topic revision: r3 - 2021-02-04 - StefanoBelforte
 