Central Harvester Instances

Technical documentation

For a technical description of Harvester components, please visit the Harvester GitHub wiki.

Monitoring

Harvester machines

You can connect with the usual atlpan user.

| Name | HarvesterID | Description |
| aipanda170 | cern_cloud | Pre-production node. |
| aipanda171, aipanda172 | CERN_central_A | Currently running part of the main Grid PQs. Submits via remote schedds. |
| aipanda173, aipanda174 | CERN_central_B | Currently running part of the main Grid PQs. Submits via remote schedds. |
| aipanda175 | CERN_central_0 | Submits to P1. Contains a local MySQL database and a local schedd. |
| aipanda177, aipanda178 | CERN_central_1 | Submits to special resources such as CloudSchedulers. Submits to a local schedd. |
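To log in, for example (a minimal sketch; assumes your personal account is authorized to use the atlpan account on these nodes, and aipanda171 is just one node picked from the table above):

  $ ssh atlpan@aipanda171.cern.ch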

Important paths and files

Files under /cephfs are shared between the harvester nodes and the schedd nodes; they are used when harvester submits via remote schedds.

| Filename | Description |
| /var/log/harvester | All harvester logs of the various agents |
| /usr/etc/panda/panda_harvester.cfg | General configuration of subcomponents, DB connection, etc. |
| /usr/etc/panda/panda_queueconfig.json | Queue configuration |
| /data/atlpan/harvester_common/ , /cephfs/atlpan/harvester/harvester_common/ | Condor sdf templates and other files needed by Harvester |
| /data/atlpan/harvester_wdirs/${harvesterID}/XX/YY/${workerID} , /cephfs/atlpan/harvester/harvester_wdirs/${harvesterID}/XX/YY/${workerID}/ | Worker directories: the sdf file submitted to Condor for each job and other files. XX/YY are the last 4 digits of the workerID. |
| /data/atlpan/harvester_worker_dir/ , /cephfs/atlpan/harvester/harvester_worker_dir/ | Deprecated (old worker directory) |
| /data/atlpan/condor_logs/ | Local condor and pilot logs for each job |
| /data1/atlpan/condor_logs/ | On schedd nodes: condor and pilot logs for each job. On harvester nodes: dummy folders/files, but the path must exist. |
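For example, to find the worker directory of a given worker (a sketch only; workerID 12345678 and harvesterID CERN_central_A are made-up values, and XX/YY are taken from the last 4 digits of the workerID as described above):

  # last 4 digits of workerID 12345678 are 5678, so XX=56 and YY=78
  $ ls /cephfs/atlpan/harvester/harvester_wdirs/CERN_central_A/56/78/12345678/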

Restarting Harvester

[root@]# /usr/etc/rc.d/init.d/panda_harvester-uwsgi reload
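After a reload it is worth checking that the service came back, e.g. (a quick sanity check; the exact log file names depend on which agents run on the node):

  [root@]# ps aux | grep uwsgi
  [root@]# ls -lrt /var/log/harvester/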

MySQL

The most important tables in the DB structure can be found here, although the schema is under constant evolution. The DB configuration can always be found in the panda_harvester.cfg file, but currently you can connect to the databases like this:

Read-only account atlas-ro for debugging:

  • On aipanda171,172: # mysql -h dbod-harv.cern.ch -P 5501 -u atlas-ro -p HARVESTER

  • On aipanda173,174: # mysql -h dbod-harv2.cern.ch -P 5500 -u atlas-ro -p HARVESTER

  • On aipanda175: # mysql -u atlas-ro -p harvester

  • On aipanda177,178: # mysql -h dbod-harv-c1.cern.ch -P 5500 -u atlas-ro -p harvester

The password is in /cephfs/atlpan/harvester/mysql-passwd.

Note that the database name, DB hostname, and port differ between nodes; check the [db] section in panda_harvester.cfg.

You will need the password to connect. Only run queries that you understand and whose effect you know.

If write permission is really needed, use the harvester service account; check panda_harvester.cfg for its user name and password.
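As a first sanity check with the read-only account, you can list the tables and inspect one of them (table names vary with the harvester version, so <table_name> below is a placeholder):

  mysql> SHOW TABLES;
  mysql> DESCRIBE <table_name>;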

Condor Schedd machines

These are the external condor schedd nodes that the harvester nodes submit through. You can connect with the usual atlpan user.

| Name |
| aipanda023 |
| aipanda024 |

They are interchangeable.

Restarting condor

[root@]# systemctl restart condor
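Before or after a restart it is useful to check that the schedd is responsive and how many jobs it holds (standard HTCondor commands):

  [root@]# condor_status -schedd
  [root@]# condor_q -totals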

Important paths and files

| Filename | Description |
| /etc/condor/ | Condor configuration; our configuration usually goes under config.d |
| /var/log/condor/ | Condor logs, one per daemon |

Troubleshooting

Condor schedd related

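Some useful first checks on a schedd node (standard HTCondor commands; <job_id> is a placeholder for the cluster/proc ID of the affected job):

  # why is a job idle or not matching?
  [root@]# condor_q -better-analyze <job_id>
  # recent schedd activity
  [root@]# tail -f /var/log/condor/SchedLog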

FahuiLin - 2018-09-24
