Activity Reports 2017 - Paul Nilsson

December

TODO:

  • Create new wiki for AGIS/schedconfig usage in Pilot 2 (more detailed than before). To be merged with main schedconfig at a later time.
  • Create new wiki for Pilot 2 error codes and explanations (more detailed than before).
  • Add proper error handling to Pilot 2, including new error related modules.
  • Pilot 2 HammerCloud testing.
  • Begin Pilot 2 testing on grid - specifically for testing containers. Rework usage of containers in pilot to cover more complicated use cases (e.g. only use container for stage-in/out in case of missing local copytools).
  • LSM copy tool (pending delivery of Information Service component).
  • Completing HPC MiniPilot integration with Pilot 2 (no progress since early December).

Pilot 1

  • Released pilot versions 72.0 and 72.1. Worked on 72.2.
  • Maintenance updates related to containers (following recent changes to cmtconfig and cvmfs).
  • Proper support for direct_access_wan/lan schedconfig fields (following recent rucio updates, especially to geoip-sorting support, which allow for proper WAN handling). In collaboration with A. Anisenkov and M. Lassnig.
  • Final implementation for Prefetcher support in pilot. In collaboration with N. Magini.

Pilot 2

  • Successfully ran a complete Pilot 2 job on the grid, a major milestone for 2018. It means that the main functionality is now largely in place.
  • Planned for integrating HPC MiniPilot workflow into Pilot 2
  • Planned for Pilot 2 use on HPCs (other than integrated MiniPilot workflow)
  • Added preliminary container support.
  • Initial implementation of job report dictionary sent to server at the end of the job (not fully populated, but contains the fields required for the job to be finished by the server).
  • Updates to Pilot 2 Architecture document.

Other

  • Improved HammerCloud testing by moving from constant functional to time limited stress testing (incl. conversion of rc_test templates). PN is now HammerCloud admin and can control test suites by himself. In collaboration with J. Schovancova.
  • Two presentations in ATLAS Computing & Software Week (Pilot 2 update and container plans)
  • Prepared [[https://docs.google.com/document/d/1y7tzlWkgkMYvKesHvM7FxaJJ3-yAVCOqj7m7ysJcUKM/edit#heading=h.dtyyyqfg91m][CHEP 2018 abstract about Pilot 2]].

November (not updated regularly..)

Pilot 1

Pilot 2

  • Added real payload setup (VO specific ATLAS code)
  • Created example of semi-auto generated code documentation

Other

  • Two Pilot Developer meetings
  • Presentations in two WFMS meetings (Pilot updates)

October (in progress)

Pilot 1

  • Released pilot version 71.0.
  • Support for new schedconfig fields related to containers (container_type, container_name)

Pilot 2

  • Added plug-in handling e.g. used for VO specific code
  • Created mv copytool
  • Preparing for minipilot migration into Pilot 2 workflow

Other

  • One Pilot Developer meeting
  • Wrote ACAT 2017 proceedings paper about Pilot 2
  • Discussed pilot options for wrappers in dedicated TCB meeting. Gave heads-up for Pilot 2 changes.
  • Review of activity settings for movers, storages and protocols in AGIS and pilot. In collaboration with A. Anisenkov.

September (see USReporting for more news)

PanDA Pilot

  • Released pilot version 70.4. Presented by Wen Guan at the ADC Weekly

Pilot 2

  • Lots of progress, including bug fixes, new pilot options (plus implementation) and function development
  • Prepared for container support

Other

  • Created plans for container support for both Pilot 1 and 2 (discussions with Alessandra and Andrej - to be discussed at the TIM)
  • (Distributed Analysis tutorial at CERN)
  • (Presented Pilot 2 project at CERN TIM)

August

PanDA Pilot

  • Presented pilot version 70.3 in ADC Weekly
  • Added setup command verification and introduced a new error code, especially to improve debugging of release setup failures

Pilot 2

  • Prepared Pilot 2 poster for ACAT 2017 conference together with Daniel Drizhuk and Danila Oleynik

Other

  • Prepared for slides about Pilot 2 and long term developments presented by Kaushik De at the annual US ATLAS Computing week
  • Planned for WAN and LAN discussion at CERN TIM in September with Ilija Vukotic and Mario Lassnig (created Google Doc), two meetings
  • Planned for benchmarking on HPCs and Tier-0, both in Pilot 2 API and with new Benchmark_tf.py transform (with Graeme Stewart)
  • Discussed future AES development with Wen Guan, both for Pilot 1 and 2 (UML sequence diagrams created for both pilot versions)

July

PanDA Pilot

  • Released two pilot patches (70.1 and 70.2) with urgent changes
  • Presented pilot version 70.0 in ADC Weekly
  • Additional work for direct access in Event Service (further development needed incl. on rucio server side)
  • Greatly improved handling and setting of ATHENA_PROC_NUMBER (replaced years of )

Pilot 2

  • Presented Pilot 2 update in WFMS meeting
  • Created work plan for Daniel Drizhuk for his stay at CERN in July. Supervised Drizhuk during his 4-week stay at CERN with regular meetings
  • Worked on Pilot 2 documentation and function descriptions; learned about the Sphinx tool used for documentation

Other

  • Distributed Analysis Tutorial at CERN
  • Created test site BNL_PROD_MCORE_TEST for testing direct access

June

PanDA Pilot

  • Developed, tested and released pilot versions 69.0, 69.1 (hot fix). Presented at ADC Weekly
  • Initial support for direct access in Event Service
  • Simplified Nordugrid payload setup
  • Implemented zip map support
  • Implemented support for event streaming service

Pilot 2

  • Worked on Pilot 2 wrapper, updated wrapper and pilot with more options

Other

  • Participated in Software & Computing Week in Valencia, Spain
  • Participated in checkpointing meetings

May

PanDA Pilot

  • Development, testing and release of pilot version 68.3 (hot fix, so no ADC Weekly presentation)
  • Development, testing of pilot version 69.0
  • Testing Singularity

Pilot 2

  • Created new Google doc discussing Pilot 2 APIs (internal)
  • Finished Google doc about Pilot 1+2 payload setup (internal)
  • Organised two Pilot Development meetings

Other

April

PanDA Pilot

  • Development, testing of pilot version 69.0
  • Development, testing and release of pilot version 68.2. Presented at ADC Weekly meeting
  • Benchmarking; from 68.2 the pilot is sending an optimised and updated dictionary to an intermediary ES service provided by Ilija
  • Simplification of payload setup functions. In collaboration with Johannes, Graeme and Tadashi.
  • Implemented zipping of output files

Pilot 2.0

Other Activities

  • Developed a new version of the cachePilots script that copies the pilot tarball from cvmfs to the PanDA servers. Asked Alessandro de Salvo to create pilot tarballs and store them in cvmfs
  • Debugging broken/new site RAL-AZURE_VAC

March

PanDA Pilot

  • Development, test and release of pilot version 68.0. Presented at ADC Weekly meeting
  • Development, test and release of pilot version 68.1. Presented at ADC Weekly meeting
  • Testing of event streaming service implementation (pre-fetching) [currently problematic - problem identified, pending fix]
  • Testing and improving implementation of benchmarking suite, released with version 68.0. Additional fixes released with version 68.1. Identified new problems with running the benchmark suite on the grid (it uses the argparse module, which is not always available since it was not introduced until Python 2.7)
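
The argparse availability problem noted above can be illustrated with a minimal sketch, assuming a fallback to the older optparse module is acceptable. This is not the fix that was applied to the benchmark suite; the function name and the --cores and --output options are hypothetical and only serve the illustration.

```python
def parse_benchmark_args(argv):
    """Parse command-line options, preferring argparse when it can be imported.

    Hypothetical helper: the option names below are illustrative only and are
    not taken from the actual benchmark suite.
    """
    try:
        import argparse  # part of the standard library from Python 2.7 / 3.2
    except ImportError:
        argparse = None  # older interpreter: fall back to optparse below

    if argparse is not None:
        parser = argparse.ArgumentParser(description="hypothetical benchmark runner")
        parser.add_argument("--cores", type=int, default=1)
        parser.add_argument("--output", default="benchmark.json")
        args = parser.parse_args(argv)
        return {"cores": args.cores, "output": args.output}

    # optparse predates argparse and is present on older Python installations
    from optparse import OptionParser
    parser = OptionParser(description="hypothetical benchmark runner")
    parser.add_option("--cores", type="int", default=1)
    parser.add_option("--output", default="benchmark.json")
    options, _ = parser.parse_args(argv)
    return {"cores": options.cores, "output": options.output}
```

Returning a plain dictionary from both branches keeps the calling code independent of which parser was actually used, e.g. parse_benchmark_args(["--cores", "4"]).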

Pilot 2.0

Other Activities

  • Participated in mini-workshop about containers. Work plan being discussed with Andrej. Tests can begin soon (~April) pending details to be worked out regarding AGIS and PanDA changes.
  • Participated in Computing and Software Week, presented Pilot 1+2 update
  • Discussed follow up implementation for benchmarking with Tadashi, Fernando, Ilija (about populating ES).
  • Discovered badly formatted release strings in some HC test jobs (later fixed by HC team)

February

PanDA Pilot

  • Released pilot versions 67.5, 67.6 (hot fix)
  • Development of pilot version 68.0
  • Implementing major event service pilot updates to support pre-fetching (68.X). Discussing implementation details with Vakhtang Tsulaia.

Pilot 2.0

  • Identified functions to be used by Harvester (“Tadashi’s request list”), now in active development
  • Activated Travis CI support for Pilot2 GitHub

Other Activities

  • Assigned additional and new tasks to Pilot Developers (for Pilot 1 and 2)
  • Presented Pilot 1+2 in WFMS meeting
  • Participated in discussions (email + dedicated WFMS meeting) about benchmarking (on/off HPCs)

January

PanDA Pilot

  • Development of pilot version 67.5, presented at ADC Weekly, Jan 31
  • Planning for major event service pilot updates to support pre-fetching (67.X). Discussing implementation details with Vakhtang Tsulaia
  • Tested the CERN Benchmark suite; whether it should be executed by the pilot or by a TRF is under discussion (TBD in meeting in February)

Pilot 2.0

Other Activities

-- PaulNilsson - 2017-02-09

Topic revision: r22 - 2018-01-17 - PaulNilsson