HpcYoda

Yoda is the Event Service implementation for HPC. The overall architecture is shown below.

yoda-structure.png

Yoda is composed of several parts:

  • pilot RunJobHpcEvent - the front end, which downloads jobs and gets event ranges from Panda, stages outputs out to the objectstore, and updates event status in Panda.
  • YodaDroid - the HPC MPI job that runs the events.
  • EventServerJobManager - the main part of Droid, which manages AthenaMP, TokenExtractor and Yampl messaging. It injects events into AthenaMP for processing and retrieves the outputs.

RunJobHpcEvent

RunJobHpcEvent is the part of the pilot that starts HPC ES. After the pilot gets a job from Panda, it sets up the environment and prepares the job files and commands. It then uses HPCManager to call getHPCResource (free cores in backfill mode, the default resource defined in schedconfig in normal mode), submit the HPC jobs and poll them. HPCManager is the interface between the pilot and the HPC. It is currently implemented for PBS/Torque clusters. It can be extended.

RunJobHpcEvent.png
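
The get-resource, submit and poll cycle described above can be sketched as follows. This is a minimal illustration, not the pilot's actual code: the class names, the Python method names (get_hpc_resource, submit, poll), the job states and the in-memory back end are all assumptions made for the example.

```python
from abc import ABC, abstractmethod


class HPCManagerSketch(ABC):
    """Hypothetical interface between the pilot and a batch system."""

    @abstractmethod
    def get_hpc_resource(self, mode, default_cores):
        """Return the number of cores to request."""

    @abstractmethod
    def submit(self, cores):
        """Submit an HPC job and return its identifier."""

    @abstractmethod
    def poll(self, job_id):
        """Return the current state of the job."""


class FakeBatchManager(HPCManagerSketch):
    """In-memory stand-in for a PBS/Torque back end, for illustration only."""

    def __init__(self, free_cores):
        self.free_cores = free_cores
        self._polls = {}

    def get_hpc_resource(self, mode, default_cores):
        # Backfill mode takes the cores that are free right now;
        # normal mode takes the configured default.
        return self.free_cores if mode == "backfill" else default_cores

    def submit(self, cores):
        job_id = "job-%d" % (len(self._polls) + 1)
        self._polls[job_id] = 0
        return job_id

    def poll(self, job_id):
        # Pretend the job completes on the third poll.
        self._polls[job_id] += 1
        return "Complete" if self._polls[job_id] >= 3 else "Running"


def run_hpc_job(manager, mode, default_cores):
    """Get a resource, submit one job, and poll it until it completes."""
    cores = manager.get_hpc_resource(mode, default_cores)
    job_id = manager.submit(cores)
    states = []
    while True:
        state = manager.poll(job_id)
        states.append(state)
        if state == "Complete":
            break
    return cores, job_id, states
```

Keeping the batch-system calls behind one small interface is what would let a back end other than PBS/Torque be added without touching the calling code.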

Yoda-Droid

Yoda-Droid is the HPC MPI job.

  • Yoda is the part running on MPI rank 0. It manages the job and events tables centrally. It uses the MPI interface to distribute jobs and events to the Droids. Outputs received from the Droids through the MPI interface are recorded in the events table and dumped to the pilot periodically.
  • Droid is the part running on MPI ranks greater than 0. It gets a job from Yoda, then starts EventServerJobManager to run the job. When EventServerJobManager is ready (AthenaMP is set up), Droid gets event ranges from Yoda and injects them into ESJobManager. Droid then polls ESJobManager for the outputs and sends them to Yoda.

HPC-Yoda.png
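
The division of work between Yoda and the Droids can be illustrated with a small in-process simulation. It is only a sketch of the bookkeeping: there is no MPI here, the Droids are called in turn from a plain loop, and the names YodaTable, droid_step and run, as well as the output string format, are hypothetical.

```python
from collections import deque


class YodaTable:
    """Rank-0 bookkeeping: hands out event ranges and records outputs."""

    def __init__(self, event_ranges):
        self.pending = deque(event_ranges)
        self.assigned = {}   # event range -> Droid rank working on it
        self.finished = {}   # event range -> output reported by the Droid

    def get_event_ranges(self, rank, n):
        """Assign up to n pending event ranges to the given Droid rank."""
        batch = []
        while self.pending and len(batch) < n:
            event_range = self.pending.popleft()
            self.assigned[event_range] = rank
            batch.append(event_range)
        return batch

    def update_output(self, event_range, output):
        """Record the output a Droid sent back for one event range."""
        self.assigned.pop(event_range)
        self.finished[event_range] = output

    def done(self):
        return not self.pending and not self.assigned


def droid_step(yoda, rank, n=2):
    """One Droid cycle: request ranges, 'process' them, report outputs."""
    ranges = yoda.get_event_ranges(rank, n)
    for event_range in ranges:
        # Stand-in for injecting into ESJobManager and polling for the output.
        yoda.update_output(event_range, "out-%s-rank%d" % (event_range, rank))
    return len(ranges)


def run(event_ranges, n_droids):
    """Let the Droids (ranks 1..n_droids) drain the events table in turn."""
    if n_droids < 1:
        raise ValueError("at least one Droid is needed")
    yoda = YodaTable(event_ranges)
    while not yoda.done():
        for rank in range(1, n_droids + 1):
            droid_step(yoda, rank)
    return yoda.finished
```

In a real MPI job the calls to get_event_ranges and update_output would be messages between rank 0 and the other ranks rather than direct method calls.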

EventServerJobManager

EventServerJobManager is the main part of Droid. It manages AthenaMP, TokenExtractor and the Yampl messaging thread. AthenaMP and TokenExtractor are components of Athena. Clients communicate with AthenaMP over a Yampl messaging channel, so ESJobManager is the part that handles the messages on that channel.

EventServerJobManager.png
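
The message-handling role can be sketched with a standard-library queue standing in for the Yampl channel. Everything specific here is an assumption for illustration: the function name handle_messages, the "READY" and "OUTPUT:" message strings, and the one-range-per-readiness-message policy are not taken from AthenaMP or Yampl.

```python
import queue


def handle_messages(channel, event_ranges):
    """Drain `channel`, answering each readiness message with one event range.

    `channel` is a queue.Queue used as a stand-in for the messaging channel.
    Returns the event ranges that were injected and the outputs received.
    """
    to_send = list(event_ranges)
    injected, outputs = [], []
    while True:
        try:
            msg = channel.get_nowait()
        except queue.Empty:
            break                            # nothing left to handle
        if msg == "READY":                   # hypothetical readiness message
            if to_send:
                injected.append(to_send.pop(0))
        elif msg.startswith("OUTPUT:"):      # hypothetical output message
            outputs.append(msg[len("OUTPUT:"):])
    return injected, outputs
```

A short usage example: put a few messages on the queue, then call handle_messages(channel, ["r1", "r2"]) to see which ranges would be injected and which outputs would be collected.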

How to Run Yoda ES pilot

If you are interested in running Yoda ES jobs, you can follow these steps:

-- WenGuan - 2015-02-24
