Shifter Checklist

This is the procedure to follow at the end of a production.

When a production has ended

  • "Stop" the production
  • Check that the files exist in the bookkeeping. They should appear, at the very latest, the day after the job has finished, unless there are errors with the bookkeeping.
    • Check that the number of events generated is correct (roughly equal to, or greater than, the number requested).
  • Output files for the production should exist in the SE (storage element) and the LFC (LCG File Catalogue). You can check this for a representative sample of jobs in the production.
    • To check a file in the LFC, run lfc-ls -l on the JobOutputLFN prefixed with /grid (after a "SetupProject Dirac ; lhcb-proxy-init"). You can obtain the JobOutputLFN from the tag "ProductionOutputData" in the JDL information of the job. For example:
[lxplus253] ~ > lfc-ls -l /grid/lhcb/production/DC06/phys-v4-lumi5/00001908/RDST/0000/00001908_00000115_1.rdst
-rw-rw-r--   1 19028    2695              377251657 Sep 13  2007 /grid/lhcb/production/DC06/phys-v4-lumi5/00001908/RDST/0000/00001908_00000115_1.rdst
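In the example above, the LFC path is the LFN with /grid in front of it, and the fifth column of the lfc-ls -l output is the file size in bytes. A minimal Python sketch of that bookkeeping, assuming the nine-column layout shown in the example; the helper names lfc_path and parse_lfc_ls_line are hypothetical and are not part of DIRAC or the LFC tools:

```python
def lfc_path(lfn, prefix="/grid"):
    """Return the LFC path for an absolute LFN such as /lhcb/production/..."""
    if not lfn.startswith("/"):
        raise ValueError("LFN must be absolute: %r" % lfn)
    return prefix + lfn


def parse_lfc_ls_line(line):
    """Split one line of 'lfc-ls -l' output into mode, size and path.

    Assumes the column layout of the example above:
    mode, links, uid, gid, size, month, day, year-or-time, path.
    Paths containing spaces are not handled by this sketch.
    """
    fields = line.split()
    if len(fields) != 9:
        raise ValueError("unexpected lfc-ls -l line: %r" % line)
    return {"mode": fields[0], "size": int(fields[4]), "path": fields[8]}


if __name__ == "__main__":
    lfn = ("/lhcb/production/DC06/phys-v4-lumi5/00001908/RDST/0000/"
           "00001908_00000115_1.rdst")
    print(lfc_path(lfn))
```

When checking a sample of jobs, comparing the parsed path against lfc_path(lfn) confirms that the listing refers to the file that was asked for.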
    • To check whether a file is in the SE, run the DIRAC command "dirac-dms-lfn-accessURL" with the LFN and the SE name, for example:
[lxplus253] ~ > dirac-dms-lfn-accessURL /lhcb/production/DC06/phys-v4-lumi5/00001908/RDST/0000/00001908_00000115_1.rdst CERN-tape
2008-12-12 12:47:03 UTC dirac-dms-lfn-accessURL.py  INFO: ReplicaManager.getReplicaAccessUrl: Attempting to get access urls for 1 replicas.
2008-12-12 12:47:03 UTC dirac-dms-lfn-accessURL.py  INFO: ReplicaManager.getReplicaAccessUrl: Resolving replicas for supplied LFNs.
2008-12-12 12:47:03 UTC dirac-dms-lfn-accessURL.py  INFO: ReplicaManager.__getPhysicalFileAccessUrl: Attempting to get access urls for 1 files.
2008-12-12 12:47:03 UTC dirac-dms-lfn-accessURL.py  INFO: Using lcg_util from: /afs/cern.ch/lhcb/software/releases/DIRAC/DIRAC_v4r3p1/Linux_x86_64_glibc-2.3.4/lib/python2.4/site-packages/lcg_util.py
2008-12-12 12:47:03 UTC dirac-dms-lfn-accessURL.py  INFO: The version of lcg_utils is 1.6.13
2008-12-12 12:47:04 UTC dirac-dms-lfn-accessURL.py  INFO: Using gfalthr from: /afs/cern.ch/lhcb/software/releases/DIRAC/DIRAC_v4r3p1/Linux_x86_64_glibc-2.3.4/lib/python2.4/site-packages/gfalthr.py
2008-12-12 12:47:04 UTC dirac-dms-lfn-accessURL.py  INFO: The version of gfalthr is 1.10.15
2008-12-12 12:47:04 UTC dirac-dms-lfn-accessURL.py  INFO: StorageElement.isValid: Determining whether the StorageElement CERN-tape is valid for use.
2008-12-12 12:47:04 UTC dirac-dms-lfn-accessURL.py  INFO: StorageElement.isLocalSE: Determining whether CERN-tape is a local SE.
2008-12-12 12:47:04 UTC dirac-dms-lfn-accessURL.py  INFO: StorageElement.getAccessUrl: Generating protocol PFNs for SRM2.
2008-12-12 12:47:04 UTC dirac-dms-lfn-accessURL.py  INFO: StorageElement.getAccessUrl: Attempting to get access urls for 1 physical files.
2008-12-12 12:47:36 UTC dirac-dms-lfn-accessURL.py ERROR: SRM2Storage.__gfal_turlsfromsurls: Failed to perform gfal_turlsfromsurls: [SE][StatusOfGetRequest] httpg://srm-lhcb.cern.ch:8443/srm/managerv2: User timeout over Unknown error 18446744073709551615
2008-12-12 12:47:37 UTC dirac-dms-lfn-accessURL.py  INFO: StorageElement.getAccessUrl: Generating protocol PFNs for RFIO.
2008-12-12 12:47:37 UTC dirac-dms-lfn-accessURL.py  INFO: StorageElement.getAccessUrl: Attempting to get access urls for 1 physical files.
{'Failed': {},
 'Successful': {'/lhcb/production/DC06/phys-v4-lumi5/00001908/RDST/0000/00001908_00000115_1.rdst': {'RFIO': 'rfio://castorlhcb:9002/?svcClass=default&castorVersion=2&path=/castor/cern.ch/grid/lhcb/production/DC06/phys-v4-lumi5/00001908/RDST/0000/00001908_00000115_1.rdst'}}}
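The command ends by printing a dictionary with 'Failed' and 'Successful' keys. In the example above the SRM2 attempt timed out, but an RFIO URL was still returned, so the LFN is listed under 'Successful'. When checking many files it may help to read that final dictionary rather than the log lines. A minimal Python sketch, assuming the captured output has the same shape as the example (log lines first, then the dictionary starting on a line that begins with "{"); the function name split_access_result is hypothetical:

```python
import ast


def split_access_result(output):
    """Return (successful_lfns, failed_lfns) from captured command output.

    Assumes the result dictionary is the last thing printed and starts
    on the first line beginning with '{', as in the example above.
    """
    lines = output.splitlines()
    for index, line in enumerate(lines):
        if line.startswith("{"):
            # literal_eval only accepts Python literals, so no code is executed
            result = ast.literal_eval("\n".join(lines[index:]).strip())
            break
    else:
        raise ValueError("no result dictionary found in output")
    successful = sorted(result.get("Successful", {}))
    failed = sorted(result.get("Failed", {}))
    return successful, failed
```

An LFN that appears in the second list, or in neither list, is a candidate for follow-up.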
  • Check the log files, and check that the log files of successful jobs correspond to the entries in the bookkeeping / LFC. Jobs which have stalled, or failed due to problems in the application, will usually not have their output data stored (except for the log files). If output data from such jobs do exist, please contact the grid operator on duty to clarify the situation.
  • Mark the production as "Done"
    • The production tools are not yet up to date on this. For now, "stop"ping the production is enough.
  • Inform the requester that the production is completed, and record in the elogger that this production is finished.
    • Include a reference to the bookkeeping in both the message to the requester and the elogger entry. The format could be along the lines below:
"CONFIGNAME"    "CONFIGVERSION" "SIMDESCRIPTION"        "PRODUCTION"    "EVENTSTAT"
"MC"    "2008"  "Beam450GeV-VeloOpen-BfieldZero"        "3066"  "4996"
"MC"    "2008"  "Beam450GeV-VeloOpen-BfieldZero"        "3066"  "4997"
Topic revision: r1 - 2008-12-12 - RajaNandakumar
 