BigPanDA Maintenance HowTos

Introduction

  • This page is evolving; please check back for updates.
  • This page contains information related to BigPanDA instance maintenance.
  • Feel free to contact JaroslavaSchovancova and TorreWenaus with questions.

EC2 instance

OSG host certificates

OSG GridAdmin

  • An OSG GridAdmin is the contact person for a domain of machines. For the BigPanDA EC2 instance, the domain is pandawms.org.
  • A GridAdmin can issue and revoke host and service certificates in OSG.
  • List of OSG GridAdmins: https://oim.grid.iu.edu/oim/gridadmin
  • A request for GridAdmin Enrolment can be submitted via the form at https://oim.grid.iu.edu/oim/gridadmin
    • Prerequisites: the prospective GridAdmin has a grid certificate known to oim.grid.iu.edu
    • Procedure: fill in the Enrolment form and wait for approval.
      • Filling in the Enrolment form triggers the creation of an OSG ticket.

Issue and install a host certificate

Reboot of the instances

pandawms.org

  • as root:
    • service mysqld restart
    • service httpd restart
    • make sure that the auto-increments for pandadb1 are OK; if they are not, alter the tables to set them properly, e.g.
      -- what is the actual max pandaid?
      select max(pandaid) from jobsarchived4;
      -- now max(pandaid) is 2789, thus 2795 is larger than that:
      ALTER TABLE jobsdefined4 AUTO_INCREMENT = 2795;
      ALTER TABLE subcounter_subid_seq AUTO_INCREMENT = 2795;
      
    • Check: http://pandawms.org/lsst/jobs/?display_limit=100
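
The auto-increment step above can be sketched as a small shell helper. This is a minimal sketch, not the procedure used on the instance: the function names next_auto_increment and print_alter_statements and the default margin of 6 are assumptions chosen to reproduce the 2789 to 2795 example, while the table names are taken from the statements above. The helper only prints SQL for review and does not connect to the database.

```shell
#!/bin/sh
# Hypothetical helpers (not part of the PanDA installation).

# Print a "safe" AUTO_INCREMENT value: the current maximum pandaid
# plus a margin. The default margin of 6 is an assumption that
# mirrors the example above (2789 -> 2795).
next_auto_increment() {
    max_id=$1
    margin=${2:-6}
    echo $((max_id + margin))
}

# Print the ALTER TABLE statements for the two tables named above.
# Nothing is executed; review the output before running it in mysql.
print_alter_statements() {
    next=$(next_auto_increment "$1" "${2:-6}")
    for table in jobsdefined4 subcounter_subid_seq; do
        echo "ALTER TABLE $table AUTO_INCREMENT = $next;"
    done
}
```

For example, after reading the value of max(pandaid) by hand, print_alter_statements 2789 prints the two ALTER TABLE statements shown above with the value 2795.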

pilots1.pandawms.org

  • as root:
    • service proxymanager restart
    • Check that the log is being updated: tail -f /var/log/apf/proxymanager.log
    • service factory restart
    • Check that the log is being updated: tail -f /var/log/apf/apf.log
  • as jschovan:
    • /data/jschovan/work/panda-lsst/lsstserver.sh
    • Check: /data/jschovan/pandaservice.out, e.g.
      jschovan  2257  0.0  0.0 103244   840 pts/4    S+   06:32   0:00 grep PandaJobControl
      Sep 25, 2014 6:32:27 AM org.srs.jobcontrol.panda.PandaJobControlService main
      
    • Check: run /data/jschovan/work/panda-lsst/lsstjob.sh and verify that pandaservice.out is updated, e.g.
      Sep 25, 2014 6:53:02 AM org.srs.jobcontrol.panda.PandaJobControlService submit
      INFO: submitting: command from 10.249.34.181
      Sep 25, 2014 6:53:02 AM org.srs.jobcontrol.panda.PandaJobControlService submitInternal
      INFO: Submit request received
      Sep 25, 2014 6:53:02 AM org.srs.jobcontrol.panda.PandaJobControlService submitInternal
      INFO: Submit: /data/jschovan/work/panda-lsst/panda_submit panda.submit
      Sep 25, 2014 6:53:04 AM org.srs.jobcontrol.panda.PandaJobControlService status
      INFO: status: 2795 from 10.249.34.181
      Sep 25, 2014 6:53:04 AM org.srs.jobcontrol.panda.PandaStatus updateStatus
      INFO: Status returned 1 lines
      Sep 25, 2014 6:53:04 AM org.srs.jobcontrol.panda.PandaStatus updateStatus
      INFO: id=2795 status=defined start=None end=None submit=1411653184 host=107.22.166.93 queue=ANALY_BNL-LSST user='unknown-user' comment=''
      
      • If you see any Java error traceback in pandaservice.out, something is wrong with the PandaService client and the error needs to be addressed.
      • If lsstjob.sh returns PanDA ID None or a Python traceback, the job submission failed: something is wrong either with the job submission environment (e.g. an expired proxy) or with the PanDA server (auto-increment on the jobsdefined4 table, server certificate, etc.).
      • The submitted job can be monitored, e.g., here: http://pandawms.org/lsst/jobs/?display_limit=100&produsername=unknown-user
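
The log and output checks in this section can be partly automated. The following is a minimal sketch under stated assumptions, not a tool that exists on pilots1.pandawms.org: the function names log_is_fresh and has_traceback, the 5-minute default freshness window, and the traceback patterns are illustrative choices. It relies on find -mmin (available in GNU and BSD find, not in strict POSIX) to test the modification time, and on the usual shape of a Python traceback header and of Java stack trace lines.

```shell
#!/bin/sh
# Hypothetical helpers (not part of APF or the PandaService client).

# Succeed if FILE exists and was modified within the last MINUTES
# minutes (default 5, an assumed window), i.e. the log is being updated.
log_is_fresh() {
    file=$1
    minutes=${2:-5}
    [ -f "$file" ] && [ -n "$(find "$file" -mmin "-$minutes" 2>/dev/null)" ]
}

# Succeed if FILE contains something that looks like a Python
# traceback or a Java stack trace. Heuristic only: it can miss
# errors that are logged without a trace.
has_traceback() {
    grep -Eq '^Traceback \(most recent call last\)|^Exception in thread|^[[:space:]]+at [[:alnum:]_.$]+\(' "$1"
}

# Example usage with the paths from this page:
#   log_is_fresh /var/log/apf/proxymanager.log || echo "proxymanager.log is stale"
#   log_is_fresh /var/log/apf/apf.log || echo "apf.log is stale"
#   has_traceback /data/jschovan/pandaservice.out && echo "inspect pandaservice.out"
```

A stale log or a detected traceback only indicates where to look; the manual checks listed above still apply.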

OASIS SW installation

APF2 instance






Major updates:

-- JaroslavaSchovancova - 15 Sep 2014






Responsible: JaroslavaSchovancova

Topic revision: r4 - 2014-10-08 - JaroslavaSchovancova
 