Testing BigPanDA Instance

Introduction

  • This page is evolving. Please keep checking.
  • This page describes tests one can perform to smoke-test a BigPanDA instance.

Test that PanDA server accepts jobs

  • Prerequisites: The PanDA server is installed, at least one PanDA site ID is configured in the schedconfig table, and the cloudconfig table is configured. The user can use the (big)panda-client, i.e. has a voms-proxy created and panda-client set up.
  • As part of the PanDA server code, there is a directory, pandaserver/test, which contains various test job options.
  • One can submit a test job to a PanDA instance by following instructions from the BigPanDA client: https://github.com/PanDAWMS/panda-server/tree/mysqloraclemerge/bigpanda-client (lsstSubmit.py is one of the test job options).
  • Expected outcome:
    • When the user submits a job to the PanDA instance, the CLI response of the client is PandaID=XYZ (XYZ is an integer).
      • When XYZ is null, there is something wrong with the PanDA server. Possible causes:
        • An improperly configured host certificate, so that the SSL part of PanDA does not respond properly. Solution: check that the host certificate is issued for this machine (the machine's DNS name appears in the DN of the host certificate), check the permissions on the hostcert files, and check the httpd configuration of the PanDA server to see that it points to the correct .pem files.
        • Issues with the schema or duplicate primary key values. Solution: increase the autoincrement value of the jobsdefined4 table, or address other errors you may see in panda-DBProxy.log.
    • After job submission, the job status changes very quickly to activated.
      • If it stays in defined, something is wrong with your PanDA instance. Check the brokerage and DDM logs. Check the DDM configuration. To avoid DDM in brokerage, configure your panda-server.cfg with:
##########################
#
# Plugin setup
#
# plugins for Adder. format=vo1(|vo2(vo3...):moduleName:className(,vo4...)
adder_plugins = local:dataservice.AdderDummyPlugin:AdderDummyPlugin
# plugins for Setupper. format=vo1(|vo2(vo3...):moduleName:className(,vo4...)
setupper_plugins = local:dataservice.SetupperDummyPlugin:SetupperDummyPlugin
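
Returning to the expected outcome of the submission test: the check that the client reports an integer PandaID can be scripted. The following is a minimal sketch, assuming only that the client output contains a token of the form PandaID=XYZ as described above. The function name extract_panda_id and the sample output strings are hypothetical and are not part of the panda-client.

```python
import re

# Matches "PandaID=<digits>" only when the digits are followed by
# whitespace or the end of the text, so "PandaID=12abc" is rejected.
PANDA_ID_PATTERN = re.compile(r"PandaID=(\d+)(?!\S)")


def extract_panda_id(client_output):
    """Return the PandaID as an int, or None if no integer ID was reported."""
    match = PANDA_ID_PATTERN.search(client_output)
    if match is None:
        return None
    return int(match.group(1))


if __name__ == "__main__":
    # Hypothetical client outputs, for illustration only.
    for sample in ("PandaID=4242", "PandaID=None"):
        panda_id = extract_panda_id(sample)
        if panda_id is None:
            print(sample, "-> no integer PandaID, check the PanDA server logs")
        else:
            print(sample, "-> job accepted with PandaID", panda_id)
```

A None result corresponds to the "XYZ is null" case above, where the host certificate setup and panda-DBProxy.log are the first things to check.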

Test that PanDA server communicates with Pilot jobs

  • Prerequisites: The PanDA server is installed (and a PanDA site ID is configured in the schedconfig and cloudconfig tables). APF2 is installed and configured. "Pilot jsons" are placed in the location known to the Pilot code. The user can submit a job with the (big)panda-client, i.e. has a voms-proxy created and panda-client set up.
  • Expected outcome:
    • A Pilot job running on the WN can issue a GET request, getJobs, to the PanDA server and receive a response (either that there is no job for this site, or a response with the job).
      • If the getJobs request does not go through, check the voms proxy used by the pilot (expired? missing?) and check the panda_error and panda_ssl logs.
    • The job status is being updated in the PanDA server.
      • If the job status remains sent even after the job finishes (you can check that the job finished in APF monitoring, from the condor logs), the Pilot job cannot communicate with your PanDA server. Check the SSL settings of the PanDA server and check the logs (e.g. DBProxy, or PanDA's error/ssl/access logs).
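
The "expired? missing?" proxy check mentioned above can be scripted. The following is a minimal sketch, assuming that the voms-proxy-info command from the VOMS clients is on the PATH and that its --timeleft option prints the remaining proxy lifetime in seconds. The function names and the one-hour threshold are illustrative assumptions and are not part of PanDA or the Pilot.

```python
import subprocess


def parse_timeleft(output):
    """Parse the remaining seconds from command output; None if unparsable."""
    lines = [line.strip() for line in output.splitlines() if line.strip()]
    if not lines:
        return None
    try:
        return int(lines[-1])
    except ValueError:
        return None


def proxy_seconds_left():
    """Ask voms-proxy-info for the remaining lifetime; None if unavailable."""
    try:
        result = subprocess.run(
            ["voms-proxy-info", "--timeleft"],
            capture_output=True,
            text=True,
            check=False,
        )
    except FileNotFoundError:
        # The VOMS clients are not installed on this machine.
        return None
    if result.returncode != 0:
        # Typically means that no proxy was found.
        return None
    return parse_timeleft(result.stdout)


def proxy_is_usable(seconds_left, minimum_seconds=3600):
    """True if a proxy exists and lives at least minimum_seconds (assumed threshold)."""
    return seconds_left is not None and seconds_left >= minimum_seconds


if __name__ == "__main__":
    seconds = proxy_seconds_left()
    print("proxy usable:", proxy_is_usable(seconds), "seconds left:", seconds)
```

Run this as the same user, and with the same environment, as the pilot, since a proxy visible to an interactive shell is not necessarily the one the pilot uses.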

Test that PanDA server post-run crons are enabled

  • Prerequisites: The PanDA server is installed and configured, APF2 is installed, the Pilot job can communicate with the PanDA server, the user can submit jobs, and jobs are executed on a WN.
  • The INSTALL.txt file refers to two cron jobs to set up:
0-59/5 * * * * INSTALLDIR/usr/bin/panda_server-add.sh > /dev/null 2>&1
15 0-21/3 * * * INSTALLDIR/usr/bin/panda_server-copyArchive.sh > /dev/null 2>&1
  • Expected outcome:
    • Jobs change status to either finished or failed. Jobs do not stay hanging in the holding status for a long time (more than 5 minutes).
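
To read the two cron schedules above, it can help to expand their minute and hour fields. The following is a minimal sketch of a field expander that covers only the syntax used here (single values, ranges, steps, lists, and the asterisk). The function name expand_cron_field is hypothetical, and this is not a full cron parser.

```python
def expand_cron_field(field, minimum, maximum):
    """Expand one cron field such as '0-59/5', '15' or '*' into sorted values."""
    values = set()
    for part in field.split(","):
        base, _, step_text = part.partition("/")
        step = int(step_text) if step_text else 1
        if base == "*":
            start, end = minimum, maximum
        elif "-" in base:
            start_text, end_text = base.split("-", 1)
            start, end = int(start_text), int(end_text)
        else:
            start = int(base)
            # A bare value with a step runs up to the field maximum.
            end = maximum if step_text else start
        values.update(range(start, end + 1, step))
    return sorted(values)


if __name__ == "__main__":
    # panda_server-add.sh: minute field "0-59/5", every hour.
    print("add.sh minutes:", expand_cron_field("0-59/5", 0, 59))
    # panda_server-copyArchive.sh: minute 15 of the hours in "0-21/3".
    print("copyArchive.sh hours:", expand_cron_field("0-21/3", 0, 23))
```

In other words, panda_server-add.sh runs every five minutes, and panda_server-copyArchive.sh runs at minute 15 of every third hour from 00 to 21.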

Testing BigPanDAmon

Testing JEDI installation

  • Summary from Tadashi:
First you need to define work queues, e.g.,

insert into jedi_work_queue (QUEUE_ID,QUEUE_NAME,queue_type,vo,QUEUE_SHARE,QUEUE_ORDER) values(300,'test','test',yourVO,100,10);

where queue_type corresponds to prodSourceLabel.
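
The same insert can be issued from a script with bound parameters, which makes it explicit that the VO is passed as a string value. The following is a minimal sketch that uses an in-memory SQLite table as a stand-in for the real JEDI database. The stand-in table holds only the six columns named above, and the function names and the VO value "myvo" are hypothetical.

```python
import sqlite3


def make_standin_database():
    """Create an in-memory stand-in with only the columns used above."""
    connection = sqlite3.connect(":memory:")
    connection.execute(
        "CREATE TABLE jedi_work_queue ("
        "QUEUE_ID INTEGER PRIMARY KEY, QUEUE_NAME TEXT, queue_type TEXT, "
        "vo TEXT, QUEUE_SHARE INTEGER, QUEUE_ORDER INTEGER)"
    )
    return connection


def insert_test_work_queue(connection, vo, queue_type="test"):
    """Insert the test work queue from the example, binding vo as a string."""
    connection.execute(
        "INSERT INTO jedi_work_queue "
        "(QUEUE_ID, QUEUE_NAME, queue_type, vo, QUEUE_SHARE, QUEUE_ORDER) "
        "VALUES (?, ?, ?, ?, ?, ?)",
        (300, "test", queue_type, vo, 100, 10),
    )
    connection.commit()


if __name__ == "__main__":
    database = make_standin_database()
    insert_test_work_queue(database, vo="myvo")  # "myvo" is a placeholder VO
    print(database.execute("SELECT * FROM jedi_work_queue").fetchall())
```

Against the real database, the connection object and the parameter placeholder style depend on the database driver in use.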

Next, edit panda_jedi.cfg, e.g.,

modConfig = yourVO:1:pandajedi.jediddm.GenDDMClient:GenDDMClient
...

Then, there are some test scripts in 

https://github.com/PanDAWMS/panda-jedi/tree/master/pandajedi/jeditest

* Submit a test task after changing taskParamMap['vo'] and taskParamMap['prodSourceLabel']

$ python -i addNonAtlasTask.py

* Generate jobs
$ python -i taskRefinerTest.py [prodSourceLabel] [yourVO]
$ python -i contentsFeederTest.py [prodSourceLabel] [yourVO]
$ python -i jobGeneratorTest.py [prodSourceLabel] [yourVO] y

* Finish the task
$ python -i postProcessorTest.py
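
The sequence above can be summarised as an ordered list of command lines for a given prodSourceLabel and VO. The following is a minimal sketch that only builds and prints the commands and does not run them, since the scripts are started interactively with python -i. The function name jedi_test_commands and the example values "test" and "myvo" are hypothetical.

```python
import shlex


def jedi_test_commands(prod_source_label, vo):
    """Return the test commands from the steps above, in execution order."""
    label_and_vo = [prod_source_label, vo]
    return [
        ["python", "-i", "addNonAtlasTask.py"],
        ["python", "-i", "taskRefinerTest.py"] + label_and_vo,
        ["python", "-i", "contentsFeederTest.py"] + label_and_vo,
        ["python", "-i", "jobGeneratorTest.py"] + label_and_vo + ["y"],
        ["python", "-i", "postProcessorTest.py"],
    ]


if __name__ == "__main__":
    # Placeholder values; use your own prodSourceLabel and VO.
    for command in jedi_test_commands("test", "myvo"):
        print("$", shlex.join(command))
```

The prodSourceLabel passed here should match the queue_type of the work queue defined earlier, and taskParamMap in addNonAtlasTask.py still has to be edited by hand before the first step.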

Major updates:

-- JaroslavaSchovancova - 16 Oct 2014

Responsible: n/a

Topic revision: r1 - 2014-10-16 - JaroslavaSchovancova