Pilot 2.0 Task and Wish List
This is a collection of special requests, wishes, suggestions, recommendations, etc. for the Pilot 2.0 Project.
Task and wish list for new developments (including tasks common between Pilot 1 and Pilot 2)
| Feature | Comments | Status |
| Auto API documentation | Generate API documentation from GitHub automatically. Assigned to Daniel. | Not completed |
| Harvester support | For all workflows, but initially for jumbo-job and event service on HPCs using old PanDA Pilot code. | In progress |
| PilotDB | Keep a local DB file, such as sqlite3, to record job status, workflow status, job metrics, etc. instead of dump files. Suggested by W. Guan (August 2016). Files/info to consolidate: 1) job metrics, 2) job state files, 3) jobReport json file, 4) pilot error report json file, 5) .. Assigned to Wen Guan (September 2016). | Being discussed |
| Information service component | Design a pilot component responsible for contact with information services. Assigned to Alexey Anisenkov (August 2016). | Done |
| GFAL2 site mover | Primarily for non-rucio users. Implemented by Alexey Anisenkov (September 2016). | Done |
| Objectstore site mover | Inheriting from rucio site mover. Assigned to Wen Guan (September 2016). | Done |
| Traceability and pilot logging | See JIRA ticket ATLASPANDA-295 for more info (October 2016). | Not done |
| getJob(), updateJob() | All job types. Message body for the final heartbeat. Assigned to Daniel and Danila. | Not done |
| getEventRanges(), updateEventRanges() | Mainly for event service. getEventRanges() is also used for normal jobs to make sure the events are still available when the payload is executed, i.e. relevant for clone jobs. Assigned to Wen. | Not done |
| Extraction of information from jobReport.json | Experiment-specific json created by payload and interpreted by pilot. | In progress |
| Construction of jobMetrics | Sent to server on every heartbeat. Assigned to Danila. | Not done |
| Construction of OutputFiles.xml | Metadata info relevant for Nordugrid. | Not done |
| Conversion of worker profiles to job profiles | E.g. adjustment of start and end time. Relevant for event service on HPC. | Not done |
| Plug-in mechanism | Discuss different implementations: 1) Wen's plug-in manager, 2) Harvester plug-in mechanism, 3) plug-in handling in Pilot 1. | Done |
| Data Control component interface to copy tools | Assigned to Tobias. | In progress |
| Implementation of copy tools | Assigned to Tobias/Pavlo. | In progress |
| Support for containers | Pilot should be able to execute the payload in its own container. See Container planning. Common task, Pilot 1+2. Assigned to Paul. Supported in Pilot 1, v 69.0 and Pilot 2. | In progress |
| Better debug mode capability | Pilot should tail the active log in debug mode, not necessarily the payload stdout. See mail from Rod, Feb 14. Pilot 2. | To be done |
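
The PilotDB item above proposes replacing separate dump files with a single local sqlite3 file. A minimal sketch of that idea, using Python's standard sqlite3 module, could look as follows. The table names, column names and helper functions are hypothetical and chosen only for illustration; they are not taken from the Pilot code or from the design under discussion.

```python
import json
import sqlite3
import time


def open_pilot_db(path=":memory:"):
    """Open (or create) the local DB file and make sure the tables exist.

    The schema is an assumption made for this sketch.
    """
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS job_state ("
        " job_id TEXT PRIMARY KEY,"
        " state TEXT NOT NULL,"
        " updated REAL NOT NULL)"
    )
    conn.execute(
        "CREATE TABLE IF NOT EXISTS job_metrics ("
        " job_id TEXT NOT NULL,"
        " name TEXT NOT NULL,"
        " value TEXT NOT NULL,"
        " PRIMARY KEY (job_id, name))"
    )
    conn.commit()
    return conn


def set_job_state(conn, job_id, state):
    """Insert or overwrite the current state of a job."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT OR REPLACE INTO job_state (job_id, state, updated)"
            " VALUES (?, ?, ?)",
            (job_id, state, time.time()),
        )


def set_job_metric(conn, job_id, name, value):
    """Store one metric for a job; the value is serialized as JSON text."""
    with conn:
        conn.execute(
            "INSERT OR REPLACE INTO job_metrics (job_id, name, value)"
            " VALUES (?, ?, ?)",
            (job_id, name, json.dumps(value)),
        )


def get_job_state(conn, job_id):
    """Return the stored state of a job, or None if the job is unknown."""
    row = conn.execute(
        "SELECT state FROM job_state WHERE job_id = ?", (job_id,)
    ).fetchone()
    return row[0] if row else None


def get_job_metrics(conn, job_id):
    """Return all metrics of a job as a dictionary."""
    rows = conn.execute(
        "SELECT name, value FROM job_metrics WHERE job_id = ?", (job_id,)
    )
    return {name: json.loads(value) for name, value in rows}
```

Passing a file path instead of the default ":memory:" keeps the records on disk, which is what would let them survive a pilot restart in the way the dump files do today.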
Pilot 1.0 feature requests
This section contains major feature requests for Pilot 1.0 that arrive during the development of Pilot 2.0 and should be added to Pilot 2.0 as well.
| Feature | Pilot 1.0 version | Comments | Status |
| Benchmarking | 68.0 | Support for cern-benchmark suite. Assigned to Paul Nilsson. | Done |
| Benchmarking 2 | 68.1 | Optimizations for cern-benchmark suite. Assigned to Paul Nilsson. | Done |
| Using payload release to set up memory monitor | 67.6/68.0 | Fallback to release 21 in case of setup problems (will happen with older releases). To be done by mid-February. Assigned to Danila Oleynik. Reassigned to Daniel Drizhuk. | Done |
| Insufficient adler32 exception handling | 69.X | See ATLASPANDA-329. New adler32 function added, to be tested. | In progress |
| HC testing for preproduction versions of asetup and xrootd | 68.0 | Extension of HC controls in the pilot (a la --overwriteQueuedata={..} instruction). See ATLASPANDA-322. | Done |
| Failed job showing secondary error in panda monitor rather than actual important error | ? | To be solved by using error priority (mechanism exists). See ATLASPANDA-282. | Not done (low priority) |
| Support for Event Streaming Service V1 | 68.0+ | Event Service using Prefetcher tool. See EventServiceDataPrefetching. Assigned to Paul Nilsson. Prefetcher updated, not yet released (21.0.21). | In progress |
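
The adler32 item above concerns exception handling around checksum calculation. The sketch below is not the new Pilot function mentioned in the table, and the details of ATLASPANDA-329 are not reproduced here. It only illustrates, under stated assumptions, how a file checksum can be computed with Python's standard zlib.adler32 while turning I/O failures into one explicit error. The names ChecksumError and adler32_of_file are hypothetical.

```python
import zlib


class ChecksumError(Exception):
    """Hypothetical error type raised when a checksum cannot be computed."""


def adler32_of_file(path, chunk_size=1024 * 1024):
    """Return the adler32 checksum of a file as an 8-character hex string.

    The file is read in chunks so that large files do not have to fit in
    memory. Any I/O problem is reported as ChecksumError.
    """
    checksum = 1  # defined starting value of the adler32 algorithm
    try:
        with open(path, "rb") as handle:
            for chunk in iter(lambda: handle.read(chunk_size), b""):
                checksum = zlib.adler32(chunk, checksum)
    except OSError as error:
        raise ChecksumError("cannot checksum %s: %s" % (path, error))
    # mask to 32 bits and zero-pad so the string always has 8 hex digits
    return "%08x" % (checksum & 0xFFFFFFFF)
```

Raising a dedicated exception, rather than returning an empty or default checksum, is one way to keep a failed calculation from being mistaken for a genuine checksum mismatch.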
-- PaulNilsson - 2016-06-15