Difference: ClosingProcedure (5 vs. 6)

Revision 6 - 2016-10-28 - MarcoCorvo

Line: 1 to 1
 
META TOPICPARENT name="LHCbComputing"

Closing procedure for productions

Line: 138 to 138
  If everything's ok, the final steps are to go to the Dirac web portal and mark the production "Complete", then set it "Done" in the Production Requests tab. These actions leave the production in a "suspended" status for a week, giving the experts time to resume it or to fix other issues.
Added:
>
>

Particularly odd situation

It can happen that a job is killed, or finishes in a very bad state, while performing the last operations on its output files, that is, while moving them to their final destination. In this case the job is marked as Failed, but some files may still have been replicated. In principle the system should issue a Removal request for the output files that made it through, but this does not always happen.

The best thing to do in this case is to remove the output files and reset the input files to Unused.

First, check the descendants of the output files:

[localhost] ~ $ dirac-bookkeeping-job-input-output --Output 142393519 | dirac-bookkeeping-get-file-descendants
Got 8 LFNs
Getting descendants for 8 files (depth 1) : completed in 0.1 seconds
NotProcessed :
    /lhcb/LHCb/Collision16/CHARMCHARGED.MDST/00053884/0000/00053884_00003714_1.charmcharged.mdst
    /lhcb/LHCb/Collision16/CHARMKSHH.MDST/00053884/0000/00053884_00003714_1.charmkshh.mdst
    /lhcb/LHCb/Collision16/CHARMMULTIBODY.MDST/00053884/0000/00053884_00003714_1.charmmultibody.mdst
    /lhcb/LHCb/Collision16/CHARMSPECPARKED.MDST/00053884/0000/00053884_00003714_1.charmspecparked.mdst
    /lhcb/LHCb/Collision16/CHARMSPECPRESCALED.MDST/00053884/0000/00053884_00003714_1.charmspecprescaled.mdst
    /lhcb/LHCb/Collision16/CHARMTWOBODY.MDST/00053884/0000/00053884_00003714_1.charmtwobody.mdst
    /lhcb/LHCb/Collision16/LEPTONS.MDST/00053884/0000/00053884_00003714_1.leptons.mdst
    /lhcb/LHCb/Collision16/LOG/00053884/0000/00003714/DaVinci_00053884_00003714_1.log
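
To work on these LFNs later without copying them by hand, the listing can be filtered down to the bare paths. The following is only a sketch: the function name `extract_lfns` and the file name in the usage comment are hypothetical, and it assumes that every LFN line starts (after indentation) with `/lhcb/`, as in the output above.

```shell
# Hypothetical helper (not part of Dirac): read command output on stdin and
# print only the LFNs, one per line, with the leading indentation removed.
# Assumption: every LFN starts with /lhcb/, as in the listing above.
extract_lfns() {
    sed -n 's|^[[:space:]]*\(/lhcb/[^[:space:]]*\).*$|\1|p'
}

# Possible usage (the file name job_lfns.txt is an arbitrary choice):
#   dirac-bookkeeping-job-input-output --Output 142393519 \
#     | dirac-bookkeeping-get-file-descendants | extract_lfns > job_lfns.txt
```

Header lines such as "Got 8 LFNs" and "NotProcessed :" do not match the pattern and are dropped.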

Then check the status of the replicas:


[localhost] ~ $ dirac-dms-replica-stats --Last
Got 8 LFNs
Getting replicas for 8 LFNs : completed in 0.2 seconds
6 files found without a replica
2 files found with replicas

Replica statistics:
0 archives: 2 files
0 replicas: 6 files
1 replicas: 2 files

SE statistics:
     CNAF-BUFFER: 2 files

Sites statistics:
     LCG.CNAF.it: 2 files

In this example, two of the eight files were copied even though the system did not set the replica flag correctly. Remove these two files:

dirac-dms-remove-files --LFNs=<list of lfns>

and reset the input files to Unused:

[localhost] ~ $ dirac-transformation-reset-files <ProdID> --LFNs=<list of lfns>
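
If the LFNs are kept in a text file, one per line, the `<list of lfns>` argument can be built from it. This is a minimal sketch under an assumption the text above does not confirm: that `--LFNs` accepts a comma-separated list. The function name `join_lfns` and the file names in the usage comments are hypothetical.

```shell
# Hypothetical helper (not part of Dirac): read one LFN per line on stdin
# and print them joined by commas on a single line.
# Assumption: --LFNs accepts a comma-separated list; check the command's
# help output before relying on this.
join_lfns() {
    paste -sd, -
}

# Possible usage (file names are arbitrary choices):
#   dirac-dms-remove-files --LFNs="$(join_lfns < lfns_to_remove.txt)"
#   dirac-transformation-reset-files <ProdID> --LFNs="$(join_lfns < input_lfns.txt)"
```

Note that the two commands take different lists: the first takes the output files that were actually copied, the second the input files to be reset.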
 -- MarcoCorvo - 2016-10-07
 