
Monitoring plots for LHCb computing production shifts

Here we gather a collection of plots suggested to help the LHCb computing operations shifter. Please suggest alternatives or additions.

Thanks to Andrew Smith and Elisa Lanciotti for lots of helpful information and for pointing me to many of these plots which I have just copied in here.

ProductionShiftMonitoringOverwiewPlots

ProductionShiftMonitoringDiagnosticDataProduction

ProductionShiftMonitoringDiagnosticUserJobs

ProductionShiftMonitoringDiagnosticDataTransfers

ProductionShiftMonitoringDiagnosticSites

ProductionShiftMonitoringDiagnosticMC

ProductionShiftMonitoringDiagnosticWeek

ProductionShiftMonitoringQuickOverview

NewProductionPlotPage

The DIRAC System Monitoring page

The DIRAC system monitoring page has many links to plots: https://twiki.cern.ch/twiki/bin/view/LHCb/DIRACSystemMonitoring



WLCG QR Plots

These are for the WLCG QR report.


Overview Plots

These first plots simply set the scene by showing the split between user groups over 7 days and over 1 day (in these plots red does not indicate a problem).

These next plots show an overview of the final status of data reconstruction (left), merging (middle) and user (right) jobs in the last 24h. [Note: if a plot is empty, no jobs of the relevant type ran in the 24h period.]

These show an overview of data transfers in the last 24h, but this part is not fleshed out yet.


Diagnostic: Data Production (Reconstruction and Merging)

The format is always a 7-day view on the left and a 1-day view on the right. [Note: if a plot is empty, no jobs of the relevant type ran in that period.]

All lhcb_prod Jobs:

This set of plots shows the success/failure rate for data reconstruction and stripping (lhcb_prod) only.

This set of plots shows the success/failure rate for data merging (lhcb_prod) only.

Failed lhcb_prod Jobs:

The rest of these plots are for the failed lhcb_prod jobs only, to diagnose where and why they failed.
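As a rough illustration of what the failed-job plots summarise, the sketch below computes a failure rate for one job group and counts the failures by site and by minor status. The record layout, field names and sample values are assumptions made for this example; they are not taken from the DIRAC accounting schema.

```python
from collections import Counter

# Hypothetical job records; field names and values are illustrative only.
JOBS = [
    {"site": "LCG.CERN.ch", "group": "lhcb_prod", "status": "Done", "minor": "Execution Complete"},
    {"site": "LCG.CERN.ch", "group": "lhcb_prod", "status": "Failed", "minor": "Application Finished With Errors"},
    {"site": "LCG.RAL.uk", "group": "lhcb_prod", "status": "Failed", "minor": "Input Data Resolution"},
    {"site": "LCG.RAL.uk", "group": "lhcb_prod", "status": "Failed", "minor": "Application Finished With Errors"},
    {"site": "LCG.CNAF.it", "group": "lhcb_user", "status": "Failed", "minor": "Stalled"},
    {"site": "LCG.PIC.es", "group": "lhcb_prod", "status": "Done", "minor": "Execution Complete"},
]


def failure_summary(jobs, group):
    """Return (failure rate, failures per site, failures per minor status)."""
    selected = [j for j in jobs if j["group"] == group]
    failed = [j for j in selected if j["status"] == "Failed"]
    # No jobs of this type in the period: the equivalent of an empty plot.
    rate = len(failed) / len(selected) if selected else None
    by_site = Counter(j["site"] for j in failed)
    by_reason = Counter(j["minor"] for j in failed)
    return rate, by_site, by_reason


rate, by_site, by_reason = failure_summary(JOBS, "lhcb_prod")
print(rate)                      # fraction of lhcb_prod jobs that failed
print(by_site.most_common())     # where they failed
print(by_reason.most_common())   # why they failed
```

A group with no jobs in the period returns a rate of None rather than zero, which mirrors the empty plots mentioned in the note above.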

Diagnostic: User Jobs

This set of plots is similar to those above, but this time for LHCb user jobs only (filtered on lhcb_user). The first row shows the number of failed jobs.

The rest of these plots are for the failed user jobs only, to diagnose where and why they failed.

Diagnostic: Data Transfers

This next set of plots shows the results of data upload quality monitoring. These plots are very "busy", but the key point is that a lot of red indicates a problem. If you spot a problem, you may want to look at http://sls.cern.ch/sls/service.php?id=LHCb-Storage to investigate further.

Data upload quality at each Tier-1 site, listed by source location.

These plots only show that there may be a space problem; they don't tell you which space token is affected. The next plots are for that.

Data upload quality listed by space token.

These plots show the data upload quality for different space tokens listed by destination site.

These plots show the data upload quality for different space tokens listed by source site.
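To make the "lots of red" reading of the upload quality plots concrete, here is a minimal sketch that flags low-quality channels. It assumes quality is the percentage of successful uploads out of all attempts, and it uses an arbitrary 80% threshold; the channel names and counts are invented for the example.

```python
# Hypothetical (source site, destination space token) -> (succeeded, failed).
COUNTS = {
    ("LCG.CERN.ch", "RAL-RAW"): (95, 5),
    ("LCG.CNAF.it", "CERN-USER"): (10, 30),
    ("LCG.PIC.es", "PIC-DST"): (0, 0),
    ("LCG.RAL.uk", "RAL-DST"): (70, 30),
}


def upload_quality(succeeded, failed):
    """Percentage of successful uploads, or None if nothing was attempted."""
    total = succeeded + failed
    return None if total == 0 else 100.0 * succeeded / total


def flag_channels(counts, threshold=80.0):
    """Return (channel, quality) pairs below the threshold, worst first."""
    flagged = []
    for channel, (ok, bad) in counts.items():
        quality = upload_quality(ok, bad)
        # Channels with no transfers are skipped, not reported as bad.
        if quality is not None and quality < threshold:
            flagged.append((channel, round(quality, 1)))
    return sorted(flagged, key=lambda item: item[1])


for channel, quality in flag_channels(COUNTS):
    print(channel, quality)
```

Keying the counts by both source site and destination space token follows the two views above: grouping by the first element gives the per-source picture, and grouping by the second gives the per-token one.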


Diagnostic: Site View

Sorry that these are not here for all sites yet. I am gradually completing them, but it takes a lot of time to create and copy the link for each site so it is a slow process.

Important note: Many plots will say "no data for this selection" and be empty. This means that no jobs of the relevant type ran at that site in the period.

CERN

These three plots show the final major status of Reconstruction, then Merging, then User jobs.

These three plots show the final minor status of Reconstruction, then Merging, then User jobs.

For all failed jobs: Reconstruction by job group, Merging by job group, User jobs by user.

These show Pilot job information.


CNAF

These three plots show the final major status of Reconstruction, then Merging, then User jobs.

These three plots show the final minor status of Reconstruction, then Merging, then User jobs.

For all failed jobs: Reconstruction by job group, Merging by job group, User jobs by user.

These show Pilot job information.

GRIDKA

IN2P3

PIC

RAL

These three plots show the final major status of Reco+Stripping, then Merging, then User jobs.

These three plots show the final minor status of Reconstruction, then Merging, then User jobs.

For all failed jobs:

These show Pilot job information.


SARA

-- PeterClarke - 04-Oct-2010

Topic revision: r20 - 2011-10-05 - RobCurrie