PandaPilotFAX
Introduction
The PanDA Pilot is equipped with several retry mechanisms. If stage-in fails due to a temporary problem with the storage element (SE), the pilot re-attempts the stage-in after a few minutes. If that attempt fails as well, the pilot has the option to attempt stage-in from a remote SE using FAX (federated xrootd).
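The retry order described above can be sketched as follows. This is only an illustration, not the pilot's actual code: the function name `stage_in_with_retries`, the callables `local_copy` and `fax_copy`, and the `wait_seconds` parameter are all hypothetical, and the delay is left as a parameter because the text only says "a few minutes".

```python
import time


def stage_in_with_retries(local_copy, fax_copy, wait_seconds, sleep=time.sleep):
    """Try local stage-in twice, then fall back to FAX if it is available.

    local_copy and fax_copy are hypothetical callables that return True on
    success. Pass fax_copy=None when FAX retries are not enabled.
    Returns (success, list of attempted sources in order).
    """
    attempts = []

    # First attempt from the local SE.
    attempts.append("local")
    if local_copy():
        return True, attempts

    # Wait before the second local attempt.
    sleep(wait_seconds)
    attempts.append("local")
    if local_copy():
        return True, attempts

    # Last resort: stage-in from a remote SE via FAX, if enabled.
    if fax_copy is not None:
        attempts.append("fax")
        if fax_copy():
            return True, attempts

    return False, attempts
```

For example, `stage_in_with_retries(lambda: False, lambda: True, 0)` reports success after two local attempts and one FAX attempt.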
Activation
FAX retries by the pilot are activated mainly by the schedconfig fields allowfax and faxredirector. allowfax is currently a boolean and takes the values True and False. Setting allowfax to True enables FAX retries, but the pilot also needs to know which FAX redirector to use; this is specified by the faxredirector field. The current list of FAX redirectors can be found here.
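As a rough illustration of the two fields working together, the check below treats FAX retries as enabled only when allowfax is True and a faxredirector is set. The function name `fax_retry_enabled`, the dictionary representation of the schedconfig fields, and the redirector host name in the usage note are assumptions made for this sketch; the pilot may apply further conditions.

```python
def fax_retry_enabled(queuedata):
    """Return True if both allowfax and faxredirector permit a FAX retry.

    queuedata is a hypothetical dict holding schedconfig fields. Values may
    arrive as booleans or as strings such as "True", so both are accepted.
    """
    allowfax = str(queuedata.get("allowfax", "")).strip().lower() == "true"
    redirector = str(queuedata.get("faxredirector") or "").strip()
    # Both conditions are needed: permission and a redirector to contact.
    return allowfax and bool(redirector)
```

With `{"allowfax": True, "faxredirector": "redirector.example.org"}` the check passes; with allowfax set to True but no redirector it does not.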
FAX retries can also be enabled through the job parameters option --overwriteQueuedata={allowfax=True,faxredirector=..}. This option is set either directly in the jobParameters (for production jobs) or via the prun/runAthena option --queueData={allowfax=True,faxredirector=..}. prun/runAthena forwards this information via the jobParameters to the pilot, which receives it as --overwriteQueuedata; note the difference in the option names.
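To make the option format concrete, the sketch below extracts the key=value pairs from an --overwriteQueuedata={...} option inside a job parameters string. The function name `parse_overwrite_queuedata` and the redirector host name in the usage note are made up for illustration, and the pilot's own parsing may differ (for instance, this version does not handle values that contain commas or braces).

```python
import re


def parse_overwrite_queuedata(job_parameters):
    """Extract overrides from --overwriteQueuedata={k1=v1,k2=v2} as a dict.

    Returns an empty dict if the option is absent. Values are kept as
    strings; no type conversion is attempted.
    """
    match = re.search(r"--overwriteQueuedata=\{([^}]*)\}", job_parameters)
    if match is None:
        return {}

    overrides = {}
    for item in match.group(1).split(","):
        # Split on the first "=" only, so the value may itself contain "=".
        key, sep, value = item.partition("=")
        if sep:
            overrides[key.strip()] = value.strip()
    return overrides
```

Given the string `--overwriteQueuedata={allowfax=True,faxredirector=redirector.example.org}`, the function returns a dict with the keys allowfax and faxredirector, which could then replace the corresponding schedconfig values for that job.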
It is also possible to use FAX as a primary copytool (internally there is a FAXSiteMover). If copytoolin=fax, the pilot will use FAX for all stage-ins. Note: FAX should not be attempted for stage-out (not implemented), so do not set copytool=fax. Use only copytoolin=fax and set copytool to the copytool relevant for the site.
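The split between copytoolin and copytool can be pictured with the sketch below. It is not the pilot's implementation: the function name `select_copytool`, the dictionary of fields, and the fallback from an unset copytoolin to copytool are assumptions made for this example, and `site_copytool` in the usage note is a placeholder for whatever copytool the site actually uses.

```python
def select_copytool(queuedata, direction):
    """Pick the copytool name for a transfer direction ("in" or "out").

    queuedata is a hypothetical dict with the fields copytoolin and copytool.
    Assumption: an unset copytoolin falls back to copytool.
    """
    if direction == "in":
        return queuedata.get("copytoolin") or queuedata.get("copytool")

    if direction == "out":
        tool = queuedata.get("copytool")
        # FAX stage-out is not implemented, so reject this configuration.
        if tool == "fax":
            raise ValueError("copytool=fax is not supported for stage-out")
        return tool

    raise ValueError("direction must be 'in' or 'out'")
```

With `{"copytoolin": "fax", "copytool": "site_copytool"}`, stage-in resolves to fax and stage-out resolves to site_copytool, which matches the recommended configuration.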
Major updates:
-- PaulNilsson - 05-Jun-2013
Responsible: PaulNilsson
Never reviewed