The new portal is available for testing and validation

https://accounting-next.egi.eu/

The views we are mostly interested in are the 'WLCG' view and 'Reports'

Portal

Draft of the specification for WLCG-dedicated accounting portal

The draft can be found here

Detected problems

  • Double counting of the DESY statistics because DESY belongs to several per-VO German federations (Fixed)
  • "Wall Time (h)" is lower than "CPU Time (h)", "Normalised Wall (h)" is also lower than "Normalised CPU (h)". Problem in number of cores? Something else? (Fixed)
    The naming schema on the WLCG tab should probably be re-used on the landing page portal.png
  • The "CPU Efficiency" table sometimes shows 1% differences when compared with the CSV/JSON available on the same page. Is this a rounding problem? Would better rounding help, or is one more significant digit needed?

Suggestions

  • Everything relevant for WLCG should be available under the WLCG view, for example aggregation by country, or the VO manager view for WLCG VOs (Julia)

To enable the country view under WLCG: at the beginning of August, Ivan confirmed that he will enable a completely separate entry point for WLCG, a dedicated WLCG portal. He needs a document describing in detail what the portal should look like. Julia will work on a draft document.

  • Add an option to get all T1 and T2 sites, including OSG, together

Second half of August

  • I would suggest that the very first plot on the page (just below the table) can be shown in a different form: line or stacked bar. I would suggest that stacked bar is the default. In some cases a cumulative plot also makes sense. The possibility to get a plot in a different form should be provided in the upper part of the menu (Julia)

Beginning of August

  • Would it be possible to select the number of series shown separately in the plot, while all the others are summed up? For example, in this view, the plot at the bottom of the page shows 6 results: the top 5 sites and all the others grouped. If I would like to have the top 15 sites and all the others grouped, what should I do? I suggest adding a possibility to define the number of time series shown separately for every plot where the number of items is higher than, let's say, 5.
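The grouping requested above could be sketched as follows. This is only an illustration of the idea, not the portal's implementation: the function name `top_n_series`, the input layout (one list of values per site, one value per time bin) and the label "Others" are all assumptions made for the example.

```python
def top_n_series(usage, n):
    """Keep the n sites with the largest total and sum the rest into 'Others'.

    usage: dict mapping a site name to a list of values, one per time bin
           (hypothetical layout, all lists assumed to have equal length).
    """
    # Rank sites by their total over the whole period, largest first.
    ranked = sorted(usage, key=lambda site: sum(usage[site]), reverse=True)
    top, rest = ranked[:n], ranked[n:]

    result = {site: list(usage[site]) for site in top}
    if rest:
        bins = len(usage[rest[0]])
        # Sum the remaining sites bin by bin into a single series.
        result["Others"] = [sum(usage[site][i] for site in rest) for i in range(bins)]
    return result


# Made-up numbers for four hypothetical sites over two time bins.
usage = {
    "SiteA": [10, 12],
    "SiteB": [5, 4],
    "SiteC": [1, 2],
    "SiteD": [3, 3],
}
grouped = top_n_series(usage, 2)
```

With `n` exposed as a user setting, the same function covers both the current "top 5 plus others" plot and the requested "top 15 plus others" one.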

  • From the input from Simone (below) as well as from Ian Bird and some site admins: it would be very useful to have a bar plot where consumption is shown with bars and, on the same plot, the pledge is shown as a line. Since it is not relevant for all kinds of selection, it can be shown under a dedicated selection option under the WLCG view, something like "consumption vs pledges". (Julia)

This request as well as a request to compare a particular instance with an average among all instances in the category will be implemented together. Second half of August

  • We cannot make plots or show percentage shares of measurements which have different scales. This relates to non-normalized CPU. Pepe suggests completely removing the table and the plots which show shares. An alternative would be to leave the table with numbers, but remove the column with percentage shares and all plots, and add a warning on the table that the measurements have different scales. We need to agree among the task force members which option we would like to be implemented.

Agreed to completely remove the non-normalized CPU distributions

  • Average numbers in the rows and columns in the plot with the CPU efficiency have to be changed to weighted averages (SUM(CPU)/SUM(WallClock))

Done
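To illustrate why the weighted form SUM(CPU)/SUM(WallClock) matters, here is a minimal sketch comparing it with a plain mean of per-entry efficiencies. The function names and the sample numbers are made up for the example and are not taken from the portal.

```python
def weighted_efficiency(records):
    """CPU efficiency as SUM(CPU) / SUM(WallClock).

    records: list of (cpu_hours, wall_hours) tuples (hypothetical layout).
    Returns None when there is no wallclock time at all.
    """
    total_cpu = sum(cpu for cpu, _ in records)
    total_wall = sum(wall for _, wall in records)
    if total_wall == 0:
        return None
    return total_cpu / total_wall


def simple_mean_efficiency(records):
    """Unweighted mean of per-entry CPU/Wall ratios, shown for comparison only."""
    ratios = [cpu / wall for cpu, wall in records if wall > 0]
    return sum(ratios) / len(ratios) if ratios else None


# One large entry at 90% efficiency and one small entry at 50%.
records = [(90.0, 100.0), (5.0, 10.0)]
weighted = weighted_efficiency(records)    # 95 / 110
simple = simple_mean_efficiency(records)   # (0.9 + 0.5) / 2
```

The plain mean gives the small entry the same weight as the large one, while the weighted average reflects how the wallclock time was actually spent.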

  • Monetary cost to be removed from the cloud view. Numbers in the cloud view need to be validated.

  • We need to review the labels of the axes and the units so that they are correct on all plots. We need to discuss whether we keep the term 'normalized time' or 'normalized CPU', or rather call it 'work based on elapsed time' etc. For normalized metrics the units should be HS06 hours, rather than hours

  • Not urgent, but 'nice to have' functionality: add a plot showing any kind of metric for a particular instance (site or VO) compared to the average among instances in this category. For example, show normalized wallclock multiplied by number of cores for PIC as a function of time in the form of a bar plot, compared to the average among all T1 sites shown as a line.

Second half of August

Reports

  • T1 report generation which is currently implemented by REBUS should move to the EGI accounting portal. REBUS code can be re-used by the EGI accounting portal. Eddie shared code with Ivan.

Ivan's estimation for implementation is that it might be done by the 15th of August.

  • Changes to be implemented in the reports:
    • In T1 reports: change units to HS06 hours instead of days. Drop installed capacity metrics
    • In T1 and T2 reports: everywhere (apart from the CPU/wallclock ratio), use the 'normalized elapsed time multiplied by number of cores' metric instead of normalized CPU. Since pledges are for wallclock (average efficiency factor already applied), we should compare them with normalized wallclock multiplied by number of cores, rather than with normalized CPU.
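The comparison described above can be sketched as follows. This is a rough illustration under stated assumptions: the pledge is taken to be a capacity in HS06, so it is multiplied by the length of the reporting period to obtain HS06 hours, and each record is assumed to carry its own HS06-per-core normalization factor. The function names and numbers are hypothetical.

```python
def normalized_wall_work(jobs):
    """Normalized elapsed time multiplied by number of cores, in HS06 hours.

    jobs: iterable of (wall_hours, cores, hs06_per_core) tuples
          (hypothetical layout for this example).
    """
    return sum(wall * cores * hs06 for wall, cores, hs06 in jobs)


def fraction_of_pledge(work_hs06_hours, pledge_hs06, period_hours):
    """Share of the pledge used, assuming the pledge is a capacity in HS06."""
    pledged_hs06_hours = pledge_hs06 * period_hours
    if pledged_hs06_hours <= 0:
        raise ValueError("pledge and period must be positive")
    return work_hs06_hours / pledged_hs06_hours


# Made-up records: an 8-core job and a single-core job.
jobs = [(10.0, 8, 10.0), (5.0, 1, 12.0)]
work = normalized_wall_work(jobs)               # 800 + 60 HS06 hours
used = fraction_of_pledge(work, 100.0, 24.0)    # pledge of 100 HS06 over one day
```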

What experiments need from the portal for preparation of the scrutiny reports

ATLAS, Input from Simone

We take all pledges from REBUS. This has nothing to do with the accounting portal, but it is a reminder that we do need pledges from REBUS. The total Used for CPUs refers to WallTime. The efficiency is CPU/Wall and therefore in general we need both CPU and Wall time.

CERN CPU: a can of worms. It is a combination of T0 resources + local batch + machines for central services. What we need here from WLCG accounting is the WallTime and CPU time of T0 resources (they are in a dedicated ATLAS LSF instance). That will account for those resources when they run T0 jobs and also Grid jobs.

T1 and T2 CPUs: we need the CPU time and WallTime consumed by ATLAS. It needs to be separate for T1s and T2s of course. Not critical, but sometimes used, is the same information at the level of the site and country. We use this for debugging purposes and to answer special questions from the CRSG. So we do not strictly use it to write the report, but we use the information often. We need this information with daily granularity (as sometimes we need to report on e.g. the 3 months from 15 February to 15 May). It is very useful to have monthly summaries as well, which is what we mostly use, for example to generate the attached plot of monthly consumption with respect to the pledge.

The HLT CPUs are monitored only in the dashboard and we use that one. HPCs are also monitored only in the dashboard, and that is what we use there too.

We take all the “Total Used” disk information from SRM, so there is no need for this from the WLCG accounting. We take all the tape information from the dashboard, so there is no need to do anything in WLCG accounting there either.

A special mention for HS06. We need an “as reliable as possible” average HS06/core slot at the level of the site (we do not use machine-level granularity). This is useful when you want to compare the number of running jobs you observe with what you expect from the pledge (which is in HS06). We assume an average of 11 HS06/core, which is the average we get from REBUS and which is, I guess, about right, as it is the global WLCG average and mistakes at the level of the single site average out. But it would be nice to have the number as reliable as possible. Again, it is not strictly used in the report, but it is used rather often to answer questions.
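As a small worked example of the conversion mentioned above, a pledge in HS06 can be turned into an expected number of core slots by dividing by the average HS06 per core. The sketch below uses the 11 HS06/core average quoted in the text as its default; the function name and the pledge value are hypothetical.

```python
# Average taken from the text above; a per-site value would replace it if available.
ASSUMED_HS06_PER_CORE = 11.0


def expected_core_slots(pledge_hs06, hs06_per_core=ASSUMED_HS06_PER_CORE):
    """Number of core slots one would expect to see busy for a given pledge."""
    if hs06_per_core <= 0:
        raise ValueError("hs06_per_core must be positive")
    return pledge_hs06 / hs06_per_core


# Made-up pledge of 110000 HS06.
slots = expected_core_slots(110_000.0)
# The same pledge with a hypothetical site-specific average of 10 HS06/core.
slots_site = expected_core_slots(110_000.0, hs06_per_core=10.0)
```

The difference between the two results shows how sensitive the comparison with observed running jobs is to the per-site average.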

Table with data for the scrutiny report

Plot would be useful to get on the portal

CMS, Input from Pepe

CMS Presentation

ALICE input

LHCb input

-- JuliaAndreeva

Topic attachments
  • 20160622_CMS_ResourceUtilization.pdf (PDF, r1, 363.5 K, 2016-06-29 16:16, JuliaAndreeva)
  • ProposedPortalChanges-v4.pdf (PDF, r1, 520.8 K, 2016-11-30 15:54, JuliaAndreeva)
  • Screen_Shot_2016-06-29_at_1.00.33_PM.png (PNG, r1, 221.3 K, 2016-06-29 13:00, JuliaAndreeva)
  • Screen_Shot_2016-06-29_at_1.01.58_PM.png (PNG, r1, 168.1 K, 2016-06-29 13:02, JuliaAndreeva)
  • WLCGPortalSpecification.docx (docx, r1, 1372.2 K, 2016-10-06 14:14, JuliaAndreeva)
  • portal.png (PNG, r1, 212.6 K, 2016-07-06 13:33, MiguelSantos)
Topic revision: r14 - 2016-11-30 - JuliaAndreeva
 