Central Harvester Instances

Technical documentation

For a technical description of the Harvester components, please visit the Harvester GitHub wiki.

Monitoring

Harvester machines

You can connect with the usual atlpan user.

| Name | HarvesterID | Description |
| aipanda170 | cern_cloud | Will soon be fully migrated to aipanda175. |
| aipanda171 | CERN_aipanda171 | Currently runs the main Grid PQs. Will be moved into CERN_central_A in the future. Submits via remote schedds. |
| aipanda172 | CERN_central_A | Not running yet. Will run part of the main Grid PQs in the future. Submits via remote schedds. |
| aipanda173, aipanda174 | CERN_central_B | Currently runs a small number of jobs for Grid PQs. Will run part of the main Grid PQs in the future. Submits via remote schedds. |
| aipanda175 | CERN_central_0 | Submits to special resources such as GCE and P1. Hosts a local MySQL database and a local schedd. |

Important paths and files

Files under /cephfs are shared between the Harvester nodes and the schedd nodes. They are used when Harvester submits via remote schedds.

| Filename | Description |
| /var/log/harvester | All Harvester logs of the various agents |
| /usr/etc/panda/panda_harvester.cfg | General configuration of subcomponents, DB connection, etc. |
| /usr/etc/panda/panda_queueconfig.json | Queue configuration |
| /data/atlpan/harvester_common/ , /cephfs/atlpan/harvester/harvester_common/ | Condor sdf templates and other files needed by Harvester |
| /data/atlpan/harvester_wdirs/${harvesterID}/XX/YY/${workerID} , /cephfs/atlpan/harvester/harvester_wdirs/${harvesterID}/XX/YY/${workerID}/ | Worker directories: the sdf file submitted to Condor for each job, plus other files. XXYY are the last 4 digits of workerID. |
| /data/atlpan/harvester_worker_dir/ , /cephfs/atlpan/harvester/harvester_worker_dir/ | Deprecated worker directory |
| /data/atlpan/condor_logs/ | Local condor and pilot logs for each job |
| /data1/atlpan/condor_logs/ | On schedd nodes: condor and pilot logs for each job. On harvester nodes: dummy folders and files, which must nevertheless exist. |
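
When looking for the directory of a specific worker, the XX/YY layout described above can be computed from the worker ID. The following Python snippet is a minimal sketch, not Harvester's own code: the function name worker_dir is hypothetical, and zero-padding worker IDs shorter than four digits is an assumption of this sketch.

```python
from pathlib import PurePosixPath


def worker_dir(base: str, harvester_id: str, worker_id: int) -> str:
    """Build <base>/<harvesterID>/XX/YY/<workerID>.

    XXYY are the last 4 digits of the worker ID. Zero-padding short IDs
    is an assumption of this sketch, not something the page states.
    """
    last4 = f"{worker_id % 10000:04d}"  # last four digits, zero-padded
    xx, yy = last4[:2], last4[2:]
    return str(PurePosixPath(base) / harvester_id / xx / yy / str(worker_id))


# Example with a made-up worker ID on the shared CephFS area.
print(worker_dir("/cephfs/atlpan/harvester/harvester_wdirs", "CERN_central_B", 1234567))
```

For the made-up worker ID 1234567 this yields a path ending in CERN_central_B/45/67/1234567.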

Restarting Harvester

[root@]# /usr/etc/rc.d/init.d/panda_harvester-uwsgi reload

MySQL

The most important tables in the DB structure can be found here, although the schema is under constant evolution. The DB configuration can always be found in the panda_harvester.cfg file, but currently you can connect to the databases as follows:

Read-only account atlas-ro for debugging:

  • On aipanda171,172: # mysql -h dbod-harv.cern.ch -P 5501 -u atlas-ro -p HARVESTER

  • On aipanda173,174: # mysql -h dbod-harv2.cern.ch -P 5500 -u atlas-ro -p HARVESTER

  • On aipanda175: # mysql -u atlas-ro -p harvester

The password is in /cephfs/atlpan/harvester/mysql-passwd.

Note that the database name, DB hostname, and port differ between nodes. Check the [db] section of panda_harvester.cfg for the values used on a given node.

You need the password to connect. Run only queries that you fully understand.

If write permission is really needed, use the account of the Harvester service. Check panda_harvester.cfg for its user name and password.
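
Since the connection parameters differ per node, it can help to derive the mysql command line from the [db] section rather than remembering it. The snippet below is a hedged sketch only: the key names host, port, and schema are assumptions about the configuration file, the sample values are copied from the aipanda171 example above, and mysql_command is a hypothetical helper.

```python
import configparser
import shlex

# Inline sample standing in for the [db] section of panda_harvester.cfg.
# Key names are assumptions of this sketch; check the real file.
SAMPLE_CFG = """
[db]
host = dbod-harv.cern.ch
port = 5501
schema = HARVESTER
"""


def mysql_command(cfg_text: str, user: str = "atlas-ro") -> str:
    """Return a mysql client command line built from a [db] section."""
    # interpolation=None keeps literal '%' characters in values intact.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(cfg_text)
    db = parser["db"]

    args = ["mysql"]
    host = db.get("host")
    if host:  # omitted for a local database, as on aipanda175
        args += ["-h", host]
    port = db.get("port")
    if port:
        args += ["-P", port]
    # -p without a value makes the client prompt for the password.
    args += ["-u", user, "-p", db["schema"]]
    return " ".join(shlex.quote(a) for a in args)


print(mysql_command(SAMPLE_CFG))
```

To use the real configuration, replace read_string with parser.read("/usr/etc/panda/panda_harvester.cfg"). The password is deliberately never placed on the command line.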

Condor Schedd machines

These are the external condor schedd nodes that the Harvester nodes submit through. You can connect with the usual atlpan user.
Name
aipanda023
aipanda024

They are interchangeable.

Restarting condor

[root@]# systemctl restart condor

Important paths and files

| Filename | Description |
| /etc/condor/ | Condor configuration. Our own configuration usually goes under config.d. |
| /var/log/condor/ | Condor logs, one per agent |

Troubleshooting

Condor schedd related


FahuiLin - 2018-09-24

Topic revision: r8 - 2018-09-24 - FahuiLin
 