Pilot 2.0 Task and Wish List

This is a collection of special requests, wishes, suggestions, recommendations, etc. for the Pilot 2.0 Project.

Task and wish list for new developments (including tasks common between Pilot 1 and Pilot 2)

| Feature | Comments | Status |
| Auto API documentation | Generate API documentation from GitHub automatically. Assigned to Daniel. | Not completed |
| Harvester support | For all workflows, but initially for jumbo-job and event service on HPCs using old PanDA Pilot code. | In progress |
| PilotDB | Keep a local DB file (e.g. sqlite3) to record job status, workflow status, job metrics, etc. instead of dump files. Suggested by W. Guan (August 2016). Files/info to consolidate: 1) job metrics, 2) job state files, 3) jobReport json file, 4) pilot error report json file, 5) .. Assigned to Wen Guan (September 2016). | Being discussed |
| Information service component | Design a pilot component responsible for contact with information services. Assigned to Alexey Anisenkov (August 2016). | Done |
| GFAL2 site mover | Primarily for non-rucio users. Implemented by Alexey Anisenkov (September 2016). | Done |
| Objectstore site mover | Inheriting from rucio site mover. Assigned to Wen Guan (September 2016). | Done |
| Traceability and pilot logging | See JIRA ticket ATLASPANDA-295 for more info (October 2016). | Not done |
| getJob(), updateJob() | All job types. Message body for the final heartbeat. Assigned to Daniel and Danila. | Not done |
| getEventRanges(), updateEventRanges() | Mainly for event service. getEventRanges() is also used for normal jobs to make sure the events are still available when the payload is executed, i.e. relevant for clone jobs. Assigned to Wen. | Not done |
| Extraction of information from jobReport.json | Experiment-specific json created by the payload and interpreted by the pilot. | In progress |
| Construction of jobMetrics | Sent to server on every heartbeat. Assigned to Danila. | Not done |
| Construction of OutputFiles.xml | Metadata info relevant for Nordugrid. | Not done |
| Conversion of worker profiles to job profiles | E.g. adjustment of start and end time. Relevant for event service on HPC. | Not done |
| Plug-in mechanism | Discuss different implementations: 1) Wen's plug-in manager, 2) Harvester plug-in mechanism, 3) plug-in handling in Pilot 1. | Done |
| Data Control component interface to copy tools | Assigned to Tobias. | In progress |
| Implementation of copy tools | Assigned to Tobias/Pavlo. | In progress |
| Support for containers | Pilot should be able to execute the payload in its own container. Container planning. Common task, Pilot 1+2. Assigned to Paul. Supported in Pilot 1, v 69.0 and Pilot 2 | In progress |
| Better debug mode capability | Pilot should do the tail of the active log in debug mode, not necessarily the payload stdout. See mail from Rod, Feb 14. Pilot 2. | To be done |
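
As an illustration of the PilotDB idea above (a local sqlite3 file instead of separate dump files), the following is a minimal sketch only. The table name `job_state`, its columns, and the helper functions `open_pilot_db`, `record_job_state` and `get_job_state` are hypothetical and not taken from any Pilot code; the actual schema is still being discussed.

```python
import json
import sqlite3
import time


def open_pilot_db(path=":memory:"):
    """Open (or create) the local DB file; the schema is a hypothetical example."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS job_state ("
        "job_id TEXT PRIMARY KEY, "
        "state TEXT NOT NULL, "
        "metrics TEXT NOT NULL, "
        "updated REAL NOT NULL)"
    )
    return conn


def record_job_state(conn, job_id, state, metrics=None):
    """Insert or overwrite the row for one job; metrics are stored as JSON text."""
    with conn:  # commits on success, rolls back on exception
        conn.execute(
            "INSERT OR REPLACE INTO job_state (job_id, state, metrics, updated) "
            "VALUES (?, ?, ?, ?)",
            (job_id, state, json.dumps(metrics or {}), time.time()),
        )


def get_job_state(conn, job_id):
    """Return (state, metrics dict) for a job, or None if the job is unknown."""
    row = conn.execute(
        "SELECT state, metrics FROM job_state WHERE job_id = ?", (job_id,)
    ).fetchone()
    if row is None:
        return None
    return row[0], json.loads(row[1])
```

A single table keyed on the job id keeps the latest state and metrics in one place, so a restarted pilot could read them back with one query instead of parsing several state files.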

Pilot 1.0 feature requests

This section contains major feature requests for Pilot 1.0 that arrived during the development of Pilot 2.0 and should be added to Pilot 2.0 as well.

| Feature | Pilot 1.0 version | Comments | Status |
| Benchmarking | 68.0 | Support for cern-benchmark suite. Assigned to Paul Nilsson. | Done |
| Benchmarking 2 | 68.1 | Optimizations for cern-benchmark suite. Assigned to Paul Nilsson. | Done |
| Using payload release to set up memory monitor | 67.6/68.0 | Fallback to release 21 in case of setup problems (will happen with older releases). To be done by mid-February. Assigned to Danila Oleynik. Reassigned to Daniel Drizhuk. | Done |
| Insufficient adler32 exception handling | 69.X | See ATLASPANDA-329. New adler32 function added, to be tested. | In progress |
| HC testing for preproduction versions of asetup and xrootd | 68.0 | Extension of HC controls in the pilot (a la --overwriteQueuedata={..} instruction). See ATLASPANDA-322. | Done |
| Failed job showing secondary error in panda monitor rather than actual important error | ? | To be solved by using error priority (mechanism exists). See ATLASPANDA-282. | Not done (low priority) |
| Support for Event Streaming Service V1 | 68.0+ | Event Service using Prefetcher tool. See EventServiceDataPrefetching. Assigned to Paul Nilsson. Prefetcher updated, not yet released (21.0.21). | In progress |
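
For the adler32 item above, the sketch below shows one way a checksum function with explicit exception handling could look. It is not the function added for ATLASPANDA-329: the names `adler32_of_file` and `ChecksumError` are hypothetical, and only the standard-library call `zlib.adler32` is assumed.

```python
import zlib


class ChecksumError(Exception):
    """Hypothetical error type raised when a checksum cannot be computed."""


def adler32_of_file(path, chunk_size=1024 * 1024):
    """Return the adler32 checksum of a file as an 8-digit hex string."""
    value = 1  # adler32 starts from 1, not 0
    try:
        with open(path, "rb") as handle:
            while True:
                chunk = handle.read(chunk_size)
                if not chunk:
                    break
                # Feed the running value back in so chunks are combined correctly
                value = zlib.adler32(chunk, value)
    except (IOError, OSError) as exc:
        # Report unreadable or missing files as one well-defined error
        raise ChecksumError("failed to checksum %s: %s" % (path, exc))
    # Mask to an unsigned 32-bit value for a stable hex representation
    return "%08x" % (value & 0xFFFFFFFF)
```

Reading in chunks keeps memory use bounded for large files, and converting I/O failures into a single error type lets the caller map them to one pilot error code.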

