Task/Job monitoring

Description of Dashboard API

Development server

The task monitoring dashboard development server runs at http://pcadc01.cern.ch/

Front page: List of users

Lists the users who ran jobs during the last month.

URL: http://pcadc01.cern.ch/client/index.html

Action name: gangataskmonitoring

URL request:

http://pcadc01.cern.ch/dashboard/request.py/gangataskmonitoring

Output JSON object format:

{"basicData": [{"GridName": "UserName1"}, ..., {"GridName": "UserNameN"}]}
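A minimal sketch of how a client might build the request URL for an action and read the user names out of the response. The helper names are hypothetical, and the parser assumes that basicData holds a list of {"GridName": ...} objects; the page does not confirm the exact container type.

```python
import json

# Base URL taken from the request shown above; helper names are hypothetical.
BASE_URL = "http://pcadc01.cern.ch/dashboard/request.py"


def action_url(action):
    """Return the request URL for a dashboard action name."""
    return f"{BASE_URL}/{action}"


def user_names(response_text):
    """Extract GridName values, ASSUMING basicData is a list of objects."""
    payload = json.loads(response_text)
    return [entry["GridName"] for entry in payload["basicData"]]


# Placeholder response in the assumed shape (not real server output).
sample = '{"basicData": [{"GridName": "UserName1"}, {"GridName": "UserNameN"}]}'
print(action_url("gangataskmonitoring"))
print(user_names(sample))
```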

List of tasks

Lists the tasks of a given user, either within an explicit time period (from, to) or within a timerange.

Action name: gangataskstable

Parameters:

    • usergridname;
    • either timerange, or from and to, which bound the time period in the format used in the example below (e.g. 2010-04-02 18:20);
    • typeofrequest=A (currently mandatory).

Example:

  • gangataskstable?usergridname=%22KonstantinosKousouris%22&from=2010-04-02%2018:20&to=2010-04-07%2012:30&typeofrequest=A
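The query string in the example can be assembled programmatically. The sketch below is one possible way to do it with the Python standard library; the function name is hypothetical, and both the time format and the double quotes around the user name are inferred from the example rather than documented.

```python
from datetime import datetime
from urllib.parse import quote, urlencode

# Time format inferred from the example (2010-04-02 18:20); an assumption.
TIME_FORMAT = "%Y-%m-%d %H:%M"


def tasks_table_query(user, start, end):
    """Build a gangataskstable query for a user and a from/to period."""
    params = {
        # The example wraps the grid name in double quotes (%22...%22).
        "usergridname": f'"{user}"',
        "from": start.strftime(TIME_FORMAT),
        "to": end.strftime(TIME_FORMAT),
        "typeofrequest": "A",  # currently mandatory
    }
    # quote_via=quote encodes spaces as %20; ':' is kept as in the example.
    return "gangataskstable?" + urlencode(params, safe=":", quote_via=quote)


query = tasks_table_query(
    "KonstantinosKousouris",
    datetime(2010, 4, 2, 18, 20),
    datetime(2010, 4, 7, 12, 30),
)
print(query)
```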

JSON output example:

{"user_taskstable": [{"Executable": "cmsRun", "UNKNOWN": 239, "SubmissionType": "direct", "Application": "CMSSW",  "NUMOFJOBS": 264,  "TargetCE": "15_Selected_SE", "SubmissionTool": "crab", "SubmissionUI": "T1_US_FNAL",   "PENDING": 0, "TASKMONID": "kkousour_crab_0_100411_030027_1mv54a", "TaskType": "analysis",   "ApplicationVersion": "CMSSW_3_5_6", "TaskId": 3338662, "SUCCESS": 0, "TaskCreatedTimeStamp": "2010-04-11 10:08",   "SchedulerName": "LOCALFNAL", "TaskMonitorId": "kkousour_crab_0_100411_030027_1mv54a",   "FAILED": 25, "RUNNING": 0, "NEventsPerJob": 35761, "InputCollection": "/MinimumBias/Commissioning10-PromptReco-v8/RECO", "SubToolVersion": "2.7.1"}]}
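As an illustration of how a client might consume this output, the sketch below collects the per-status job counts of each task. The function name is hypothetical, and the sample is a trimmed copy of the record above; it assumes the five status keys are always present.

```python
import json

# Status counters that appear in each task record of the example output.
STATUS_KEYS = ("PENDING", "RUNNING", "SUCCESS", "FAILED", "UNKNOWN")


def status_summary(response_text):
    """Map each TaskMonitorId to its per-status job counts."""
    payload = json.loads(response_text)
    return {
        task["TaskMonitorId"]: {key: task[key] for key in STATUS_KEYS}
        for task in payload["user_taskstable"]
    }


# Trimmed copy of the record shown above.
sample = json.dumps({"user_taskstable": [{
    "TaskMonitorId": "kkousour_crab_0_100411_030027_1mv54a",
    "NUMOFJOBS": 264, "UNKNOWN": 239, "PENDING": 0,
    "SUCCESS": 0, "FAILED": 25, "RUNNING": 0,
}]})

summary = status_summary(sample)
for task_id, counts in summary.items():
    print(task_id, counts)
```

In the example record the five counters add up to NUMOFJOBS (239 + 25 = 264), which suggests, but does not prove, that the states are mutually exclusive.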

Jobs of chosen tasks

Action name: gangataskjobs

URL request:

Parameters:

  • taskmonid (the task monitor ID, spelled as in the example below);
  • what, one of:
    • all - all job states;
    • P - pending;
    • R - running;
    • U - unknown;
    • S - successful;
    • F - failed.

Example:

gangataskjobs?taskmonid=kkousour_crab_0_100411_030027_1mv54a&what=all
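A small sketch of building this request while rejecting unsupported values of what. The function name is hypothetical; the parameter name taskmonid follows the example above, and whether the server treats the values case-sensitively is not stated on this page.

```python
from urllib.parse import urlencode

# Values of "what" listed in the parameter description above.
WHAT_VALUES = {"all", "P", "R", "U", "S", "F"}


def task_jobs_query(taskmonid, what="all"):
    """Build a gangataskjobs query, validating the job-state filter."""
    if what not in WHAT_VALUES:
        raise ValueError(f"unsupported value for 'what': {what!r}")
    return "gangataskjobs?" + urlencode({"taskmonid": taskmonid, "what": what})


print(task_jobs_query("kkousour_crab_0_100411_030027_1mv54a"))
```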

JSON output example for what=all:

{"taskjobs": [[{"STATUS": "U", "resubmissions": 1, "EventRange": "1", "started": "2010-04-11 08:03", "GridEndId": "U",   "AppGenericStatusReasonValue": "Error return without specification",  "finished": "2010-04-12 08:03",  "submitted": "2010-04-11 08:03", "Site": "T3_US_FNALLPC",  "TaskJobId": 153301646, "JobExecExitCode": null,   "SchedulerJobId": "https://cmslpc16.fnal.gov/be0adc69f8cfa9f7a184ad7ce27dd2b2c81c68fa/1", "GridEndReason": "unknown"}], {"username": "\"KonstantinosKousouris\"", "what": "ALL", "taskmonid": "kkousour_crab_0_100411_030027_1mv54a"}]}
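In this example, taskjobs is a two-element list: the job records first, then an object echoing the request parameters. The sketch below separates the two under that assumption; the function name is hypothetical and the sample is a trimmed copy of the example output.

```python
import json


def split_task_jobs(response_text):
    """Return (job records, echoed request parameters) from a response.

    ASSUMES taskjobs is [list of jobs, request-info object], as in the
    example on this page.
    """
    jobs, request_info = json.loads(response_text)["taskjobs"]
    return jobs, request_info


# Trimmed copy of the example output.
sample = json.dumps({"taskjobs": [
    [{"STATUS": "U", "EventRange": "1", "Site": "T3_US_FNALLPC",
      "TaskJobId": 153301646, "JobExecExitCode": None}],
    {"username": "\"KonstantinosKousouris\"", "what": "ALL",
     "taskmonid": "kkousour_crab_0_100411_030027_1mv54a"},
]})

jobs, request_info = split_task_jobs(sample)
print(len(jobs), request_info["what"])
```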

Page details for what=all

Plots

  • Terminated Jobs by Site (grouped by the "Site" and "STATUS" values);
  • Graphical Overview (uses the status counts, e.g. # running, # pending, taken from Page2 for the corresponding task).
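The grouping behind the first plot could be computed along these lines. This is only a sketch: the function name is hypothetical, the job records are invented for illustration (only the Site and STATUS keys come from this page), and treating S and F as the terminated states is an assumption.

```python
from collections import Counter

# ASSUMPTION: "terminated" means successful (S) or failed (F) jobs.
TERMINATED = {"S", "F"}


def terminated_by_site(jobs):
    """Count terminated jobs per (Site, STATUS) pair."""
    return Counter(
        (job["Site"], job["STATUS"])
        for job in jobs
        if job["STATUS"] in TERMINATED
    )


# Invented records, for illustration only.
jobs = [
    {"Site": "T3_US_FNALLPC", "STATUS": "F"},
    {"Site": "T3_US_FNALLPC", "STATUS": "F"},
    {"Site": "T3_US_FNALLPC", "STATUS": "S"},
    {"Site": "T2_UK_London_Brunel", "STATUS": "R"},
]

for (site, status), count in sorted(terminated_by_site(jobs).items()):
    print(site, status, count)
```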

Jobs' table

| Column name | Key |
| SchedulerJobId | SchedulerJobId |
| Id in Task | EventRange |
| Status | STATUS |
| Appl Exit Code | If STATUS is "P" or "R", display "Not yet"; otherwise JobExecExitCode > -1 ? JobExecExitCode : Unknown. toolTipText: AppGenericStatusReasonValue |
| Grid End Status | GridEndReason |
| Retries | resubmissions |
| Site | Site |
| Submitted | submitted |
| Started | started |
| Finished | finished |
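The display rule for the Appl Exit Code column can be written as a small function. This is a sketch with a hypothetical function name; showing "Unknown" for a null JobExecExitCode (as in the what=all example) is an assumption, since the page only gives the "> -1" test.

```python
def appl_exit_code_cell(job):
    """Return the text shown in the Appl Exit Code column for one job."""
    # Pending and running jobs have no exit code yet.
    if job["STATUS"] in ("P", "R"):
        return "Not yet"
    code = job.get("JobExecExitCode")
    # ASSUMPTION: a missing/null exit code is shown as "Unknown".
    if code is not None and code > -1:
        return str(code)
    return "Unknown"


print(appl_exit_code_cell({"STATUS": "R", "JobExecExitCode": None}))
print(appl_exit_code_cell({"STATUS": "F", "JobExecExitCode": 60302}))
print(appl_exit_code_cell({"STATUS": "U", "JobExecExitCode": None}))
```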

JSON output example for what=F:

{"taskjobs": [[{"JobExitReason": "  Output file(s) not found", "resubmissions": 1, "EventRange": "23",  "started": "2010-04-11 08:03", "GridEndId": "U",  "finished": "2010-04-11 08:26", "submitted": "2010-04-11 08:03", "Site": "T3_US_FNALLPC",  "TaskJobId": 153301734,  "AppStatusReason": "unknown", "JobExitCode": 60302, "SchedulerJobId": "https://cmslpc16.fnal.gov/be0adc69f8cfa9f7a184ad7ce27dd2b2c81c68fa/23",   "GridEndReason": "unknown"}],  {"username": "\"KonstantinosKousouris\"", "what": "F", "taskmonid": "kkousour_crab_0_100411_030027_1mv54a"}]}

Resubmitted jobs

Action name: resubmittedjobsAtl

URL request:

Parameters (as used in the example below):

  • what;
  • taskjobid;
  • taskmonid.

Jobs' table

| Column name | Key |
| Appl Exit Code | JobExitCode |
| Appl Exit Reason | AppStatusReason |
| Finished | finished |
| Grid End Status | GridEndId |
| Id in Task | EventRange |
| JobExitReason | JobExitReason |
| Site | Site |
| Started | started |
| Submitted | submitted |

Example:

resubmittedjobsAtl?what=ALL&taskjobid=152060487&taskmonid=aproskur_crab_0_100406_145411_tl7n64

JSON output example:

{"rsJobs": [{"JobExitReason": "CMS exception (CMSSW)", "EventRange": "1", "started": "2010-04-06 13:00:05", "GridEndId": "D", "Site": "T2_UK_London_Brunel", "submitted": "2010-04-06 12:55:10", "finished": "2010-04-06 13:10:06", "AppStatusReason": "unknown", "JobExitCode": 8001, "SchedulerJobId":"https://wms218.cern.ch:9000/p4J3bxUEvPlLICjIXb1olg", "GridEndReason": "unknown"}, ...]}
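A sketch of turning the rsJobs records into table rows using the column-to-key mapping of the resubmitted jobs table. The function name is hypothetical, and the sample is a trimmed copy of the record above; missing keys are rendered as None, which is a choice made here, not documented behaviour.

```python
import json

# (column name, JSON key) pairs, in the order listed for the jobs' table.
COLUMNS = [
    ("Appl Exit Code", "JobExitCode"),
    ("Appl Exit Reason", "AppStatusReason"),
    ("Finished", "finished"),
    ("Grid End Status", "GridEndId"),
    ("Id in Task", "EventRange"),
    ("JobExitReason", "JobExitReason"),
    ("Site", "Site"),
    ("Started", "started"),
    ("Submitted", "submitted"),
]


def resubmitted_rows(response_text):
    """Return one list of cell values per resubmitted job."""
    jobs = json.loads(response_text)["rsJobs"]
    return [[job.get(key) for _, key in COLUMNS] for job in jobs]


# Trimmed copy of the record shown above.
sample = json.dumps({"rsJobs": [{
    "JobExitReason": "CMS exception (CMSSW)", "EventRange": "1",
    "started": "2010-04-06 13:00:05", "GridEndId": "D",
    "Site": "T2_UK_London_Brunel", "submitted": "2010-04-06 12:55:10",
    "finished": "2010-04-06 13:10:06", "AppStatusReason": "unknown",
    "JobExitCode": 8001,
}]})

rows = resubmitted_rows(sample)
print([name for name, _ in COLUMNS])
print(rows[0])
```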

-- LauraSargsyan - 21-Apr-2010


Topic revision: r9 - 2010-04-23 - LauraSargsyan
 