-- JakubMoscicki - 15 Jun 2006

GangaPlotter

GangaPlotter: a graphical summary of job statistics, built on the matplotlib package

GangaPlotter_diagram.png

Piecharts

plotter.piechart(jobs, attr, **options)

  • Creates a piechart of jobs, using attr as the piechart value. attr may be a string naming the desired attribute or a one-argument function mapping a job to an arbitrary value.

Examples:

  • plotter.piechart(jobs, "backend")
backend_piechart.png

  • plotter.piechart(jobs.select(1,300), "status")
status_piechart.png

  • plotter.piechart(jobs, "backend.actualCE")
backendCE_piechart.png

  • plotter.piechart(jobs,lambda j: len(j.subjobs),title='number of subjobs')
lensubjobs_piechart.png

  • plotter.piechart(jobs,lambda j: len(j.subjobs)==10, title='len(j.subjobs)==10')
len10subjobs_piechart.png
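The two forms of attr used in the examples above (a dotted string such as "backend.actualCE", or a function of the job) can be illustrated with a minimal sketch. The helper resolve_attr and the stand-in job object are hypothetical and are not part of GangaPlotter; they only show one way such a specifier could be resolved.

```python
from types import SimpleNamespace

def resolve_attr(job, attr):
    """Return the value selected by attr: a callable or a dotted attribute path."""
    if callable(attr):
        return attr(job)
    value = job
    for name in attr.split("."):
        # Walk one attribute level per dotted component.
        value = getattr(value, name)
    return value

# Stand-in for a Ganga job (assumption); real jobs come from the Ganga job registry.
job = SimpleNamespace(
    status="completed",
    backend=SimpleNamespace(actualCE="ce.example.org:2119/jobmanager"),
    subjobs=[1, 2, 3],
)

by_name = resolve_attr(job, "backend.actualCE")
by_func = resolve_attr(job, lambda j: len(j.subjobs))
```

A function is the more flexible form: it can derive values, such as the number of subjobs, that are not stored as a single attribute.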

Barcharts

plotter.barchart(jobs, xattr, yattr, **options)

  • Creates a barchart of jobs, using xattr as the barchart x value and yattr as the barchart y value. xattr and yattr may each be a string naming the desired attribute or a one-argument function mapping a job to an arbitrary value.

Examples:

  • plotter.barchart(jobs,'backend','status',title='Job efficiency on each backend')
barchart_backend_jobefficiency.png

  • plotter.barchart(jobs,lambda j:j.backend.CE.split(':')[0],'backend.status',title='Job efficiency on each site',xlabel='CE')
barchart_ce_jobefficiency.png
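How the plotter aggregates the (xattr, yattr) pairs internally is not described here. As one plausible reading of the "job efficiency" examples above, the following sketch counts statuses per backend in plain Python. The pairs list is made-up sample data, and the efficiency definition (completed jobs over all jobs) is an assumption.

```python
from collections import Counter, defaultdict

# Hypothetical (backend, status) pairs, as might be extracted from a job list.
pairs = [
    ("LCG", "completed"), ("LCG", "failed"), ("LCG", "completed"),
    ("Local", "completed"), ("LSF", "failed"),
]

# One Counter of y values per x value.
counts = defaultdict(Counter)
for x, y in pairs:
    counts[x][y] += 1

# Fraction of completed jobs per backend (assumed meaning of "efficiency").
efficiency = {x: c["completed"] / sum(c.values()) for x, c in counts.items()}
```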

Histograms

plotter.histogram(jobs, attr, **options)

Because the data for a histogram must be numeric (for non-numeric data, use a barchart instead), few of Ganga's job attributes can be histogrammed directly.

Nevertheless, the dataproc option (also available in the other chart generators) makes one useful case possible: a histogram of the elapsed time of LCG jobs.

Example: application runtime histogram of the LCG jobs
In this example we parse the file j.outputdir/__jobscript__.log to get the start and stop times of the application executable. First, define a parser as follows:

import os.path
import re
import time

def get_app_runtime(filepath):
    """Return the application runtime in seconds parsed from a job script log (0 if unavailable)."""

    # Raw strings keep the regex escapes (\s, \[) intact.
    re_timebeg = re.compile(r'^(.*)\s+(\[Info\])\s+(Load application executable).*$')
    re_timeend = re.compile(r'^(.*)\s+(\[Info\])\s+(GZipping stdout and stderr).*$')
    time_format = '%a %b %d %H:%M:%S %Y'

    timebeg_str = ''
    timeend_str = ''

    if os.path.exists(filepath):
        with open(filepath) as f:
            for line in f:
                line = line.strip()
                matches = re_timebeg.match(line)
                if matches:
                    timebeg_str = matches.group(1)
                    continue
                matches = re_timeend.match(line)
                if matches:
                    # The end marker closes the interval; stop reading.
                    timeend_str = matches.group(1)
                    break

    timebeg_sec = 0
    timeend_sec = 0
    # Convert only when both timestamps were found; otherwise the result is 0.
    if timebeg_str and timeend_str:
        timebeg_sec = time.mktime(time.strptime(timebeg_str.strip(), time_format))
        timeend_sec = time.mktime(time.strptime(timeend_str.strip(), time_format))

    return timeend_sec - timebeg_sec

It is handy to save this function in a script and load it into Ganga via execfile() each time you start a new Ganga session.
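To check the timestamp handling used by the parser above without a real job, the same regular-expression and strptime steps can be run on sample lines. The two log lines below are hypothetical; the actual layout of __jobscript__.log may differ, so treat this as a sketch of the technique only.

```python
import re
import time

# Hypothetical log lines (assumption); the real log layout may differ.
sample = [
    "Mon Jun 19 12:04:00 2006 [Info] Load application executable",
    "Mon Jun 19 12:09:30 2006 [Info] GZipping stdout and stderr",
]

# Everything before "[Info]" is captured as the timestamp string.
pattern = re.compile(
    r"^(.*)\s+\[Info\]\s+(Load application executable|GZipping stdout and stderr).*$"
)
fmt = "%a %b %d %H:%M:%S %Y"

stamps = []
for line in sample:
    m = pattern.match(line)
    stamps.append(time.mktime(time.strptime(m.group(1).strip(), fmt)))

runtime = stamps[1] - stamps[0]  # seconds between the two markers
```

Note that the %a and %b directives depend on the current locale, so English day and month names are assumed.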

Now we can generate a job elapsed time histogram:

plotter.histogram(jobs.select(status='completed'), attr=lambda j:j.outputdir+'/__jobscript__.log', dataproc=get_app_runtime, title='Application Runtime', xlabel='second', label='runtime')

and this is the result:

job_etime_histogram.png
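The histogram call above combines attr and dataproc: attr selects a raw value per job (here a log file path) and dataproc turns it into the number to be plotted. A minimal sketch of that two-step flow follows. The function collect_values, the dict-based jobs, and the lookup table standing in for get_app_runtime are all hypothetical and do not reflect GangaPlotter's actual implementation.

```python
def collect_values(jobs, attr, dataproc=None):
    """Extract attr from each job, then optionally post-process it with dataproc."""
    raw = [attr(j) for j in jobs]
    return [dataproc(v) for v in raw] if dataproc else raw

# Stand-in jobs as plain dicts, and a stand-in dataproc that looks up a runtime.
jobs = [{"outputdir": "/tmp/job1"}, {"outputdir": "/tmp/job2"}]
fake_runtimes = {
    "/tmp/job1/__jobscript__.log": 120.0,
    "/tmp/job2/__jobscript__.log": 45.0,
}

values = collect_values(
    jobs,
    attr=lambda j: j["outputdir"] + "/__jobscript__.log",
    dataproc=fake_runtimes.get,
)
```

Keeping the two steps separate means the same parser can be reused with different attr selectors, as the scatter plot example does through xdataproc.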

Scatter plots

plotter.scatter(jobs, xattr, yattr, **options)

  • Creates a scatter plot of jobs, using xattr as the x value and yattr as the y value. xattr and yattr may each be a string naming the desired attribute or a one-argument function mapping a job to an arbitrary value. The data can also be grouped by an optional attribute cattr; data belonging to different groups are distinguished by color and marker.

Example: application runtime (on the X-axis) vs. job turnaround time (on the Y-axis) of the LCG jobs, with data grouped by country

We will reuse the function get_app_runtime given above to modify the job data to make the plot.

plotter.scatter(jobs.select(status='completed'), xattr=lambda j:j.outputdir+'/__jobscript__.log', yattr=lambda j:(j.time.final() - j.time.submitted()).seconds, cattr='backend.actualCE', xdataproc=get_app_runtime, cattrext='by_country', title='Application Runtime v.s. Job Turnaround Time', xlabel='app runtime (sec)', ylabel='job turnaround (sec)', deep=True)

and this is the result:

scatter_example.png
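One detail of the yattr function above deserves care. Assuming j.time.final() and j.time.submitted() return datetime objects (which the subtraction suggests), their difference is a timedelta, and its .seconds attribute holds only the part of the duration below one day. For turnaround times longer than 24 hours, total_seconds() (available from Python 2.7) gives the full duration. The timestamps below are made up for illustration.

```python
from datetime import datetime

submitted = datetime(2010, 9, 6, 10, 0, 0)
final = datetime(2010, 9, 7, 11, 30, 0)  # 1 day, 1 h 30 min later

delta = final - submitted
wrapped = delta.seconds       # seconds component only; whole days are dropped
full = delta.total_seconds()  # whole duration in seconds
```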

Deep looping on sub-sub-sub-...-jobs

By default, the plotter loops over all sub-job levels of the given (top-level) jobs to extract the information for generating the plots. This can be disabled by passing the optional keyword argument deep=False to the plotter commands.

plotter.barchart(jobs,'status',deep=False)

Examples:

  • plotter.piechart(jobs,'status',title='with subjob deep looping')
piechart_with_subjob_looping.png

  • plotter.piechart(jobs,'status',title='without subjob deep looping',deep=False)
piechart_without_subjob_looping.png
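The effect of the deep option can be pictured as a recursive walk over subjobs. The sketch below is an assumption about the behaviour, not GangaPlotter's code: with deep=True it yields the leaf jobs at any depth, and with deep=False only the top-level jobs. The Job class is a minimal stand-in for a Ganga job.

```python
class Job:
    """Minimal stand-in for a Ganga job: a status plus optional subjobs."""
    def __init__(self, status, subjobs=()):
        self.status = status
        self.subjobs = list(subjobs)

def iter_jobs(jobs, deep=True):
    """Yield top-level jobs, or with deep=True the leaf jobs at any depth."""
    for job in jobs:
        if deep and job.subjobs:
            # Descend into the subjobs instead of yielding the master job.
            yield from iter_jobs(job.subjobs, deep=True)
        else:
            yield job

master = Job("running", [Job("completed"), Job("failed"), Job("completed")])
single = Job("completed")

deep_statuses = [j.status for j in iter_jobs([master, single])]
flat_statuses = [j.status for j in iter_jobs([master, single], deep=False)]
```

Under this reading, a status piechart with deep looping counts individual subjobs, while the one without counts each master job once.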
Topic revision: r8 - 2010-09-08 - HurngChunLee
 