Using the accounting validation view in APEL, we need to address sites which have problems publishing to APEL, or which do not correctly publish to BDII
the attributes required to convert time to work. For this purpose APEL uses
,
while Dashboard uses REBUS. From REBUS, Dashboard takes the overall HEPSPEC and divides it by the number of logical CPUs. REBUS in turn takes
from BDII
.
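The Dashboard side of the conversion described above can be sketched as follows. This is only an illustration, assuming that work is the wall clock time multiplied by the per-core benchmark value (overall HEPSPEC divided by the number of logical CPUs). The function names and the sample numbers are hypothetical and are not taken from Dashboard or REBUS.

```python
def hepspec_per_core(total_hepspec: float, logical_cpus: int) -> float:
    """Per-core benchmark value: overall HEPSPEC divided by logical CPUs."""
    if logical_cpus <= 0:
        raise ValueError("logical_cpus must be positive")
    return total_hepspec / logical_cpus


def time_to_work(wallclock_hours: float, total_hepspec: float, logical_cpus: int) -> float:
    """Convert raw wall clock time to work (assumed: time x per-core HEPSPEC)."""
    return wallclock_hours * hepspec_per_core(total_hepspec, logical_cpus)


# Hypothetical site: 10000 HEPSPEC over 1000 logical CPUs, 500 wall clock hours.
work_correct = time_to_work(500.0, 10000.0, 1000)

# Same raw time, but the site publishes only half of its logical CPUs:
# the work doubles although the time metric is unchanged.
work_wrong_cpus = time_to_work(500.0, 10000.0, 500)
```

Under this assumption, an error in the published HEPSPEC or logical CPU count changes the work metric but leaves the raw wall clock time untouched, which is the pattern seen for sites where time agrees and work does not.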
There is a slight difference in how the SSB application can be used for ATLAS, ALICE and CMS.
For ATLAS, one might first check the metric which shows EGI clock time work versus Dashboard clock time work.
If these disagree, look at the ratio of the EGI raw wall clock time to the Dashboard raw wall clock time.
The clock time work metric is not very reliable in Dashboard, so the raw wall clock time ratio is the better one to look at.
For ALICE, only the raw wall clock time metric is available.
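The two-step check described above (work metric first, then raw wall clock time) can be sketched as below. This is a minimal sketch: the 20% tolerance, the function names and the result labels are assumptions made for illustration and are not part of SSB, Dashboard or APEL.

```python
def ratio(egi: float, dashboard: float) -> float:
    """EGI value divided by the Dashboard value for the same site and period."""
    if dashboard == 0:
        raise ValueError("Dashboard value is zero, ratio is undefined")
    return egi / dashboard


def classify(egi_time: float, dash_time: float,
             egi_work: float, dash_work: float,
             tolerance: float = 0.2) -> str:
    """Classify a site from its EGI/Dashboard ratios (tolerance is an assumption)."""
    time_ok = abs(ratio(egi_time, dash_time) - 1.0) <= tolerance
    work_ok = abs(ratio(egi_work, dash_work) - 1.0) <= tolerance
    if time_ok and work_ok:
        return "consistent"
    if time_ok:
        # Raw time agrees, only work differs: the time-to-work inputs are suspect.
        return "work_mismatch"
    # Raw time itself disagrees: the accounting records are suspect.
    return "time_mismatch"


# Hypothetical numbers: raw time agrees, work is 5 times higher in EGI.
status = classify(egi_time=100.0, dash_time=100.0, egi_work=500.0, dash_work=100.0)
```

A `work_mismatch` result points towards the benchmark information used for the conversion, while a `time_mismatch` result points towards publishing problems; both are only hints for where to start looking.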
| Site name | Inconsistency | Impacted VOs | Time range | Problem description | Ticket | Status |
| UKI-SOUTHGRID-RALPP | All experiments see ~50% lower consumption in EGI than in their own systems | ATLAS, CMS, LHCb | September | - | https://its.cern.ch/jira/browse/ADCINFR-28 | To be done |
| UKI-NORTHGRID-MAN-HEP | APEL is 45% higher than ATLAS | ATLAS | June | Batch system agrees with APEL; something is missing in Dashboard. Event service jobs may not be correctly accounted there | https://its.cern.ch/jira/browse/ADCINFR-28 | Resolved for APEL accounting |
| UKI-LT2-RHUL | Good consistency for time, but ~50% discrepancy for work | ATLAS | May-July | Possibly wrong info in BDII for the time-to-work conversion | https://its.cern.ch/jira/browse/ADCINFR-28 | Resolved, turned green in August; previous months need to be republished |
| UNIBE-LHEP | Lower or completely missing consumption | ATLAS | Since the beginning of the year | Had the wrong DN in GOCDB and APEL services undeclared; wrong info in BDII | https://its.cern.ch/jira/browse/ADCINFR-28 | BDII info is still missing. GOCDB is fixed. Data still needs to be republished |
| CSCS-LCG2 | EGI is 2 times lower than Dashboard for both work and time | ATLAS, CMS, LHCb | Starting from April | Wrong DN in GOCDB | https://its.cern.ch/jira/browse/ADCINFR-28 | Info in GOCDB and BDII fixed. Need to republish |
| Cyfronet | EGI is 2 times lower than Dashboard for both work and time | ATLAS, CMS | From the beginning of the year | - | https://its.cern.ch/jira/browse/ADCINFR-28 | Under investigation. Alessandra follows up with site admins |
| MPPMU | EGI is considerably (8 times) lower than Dashboard for both work and time | ATLAS | From the beginning of the year | Possibly a missing ARC CE in GOCDB; also, HPC resources may not be accounted for correctly | https://its.cern.ch/jira/browse/ADCINFR-28 | Under investigation |
| BEIJING-LCG2 | EGI is considerably lower than Dashboard for both work and time; ATLAS only, CMS is fine | ATLAS | Starting from May | - | https://its.cern.ch/jira/browse/ADCINFR-28 | To be done |
| TR-10-ULAKBIM | Completely wrong numbers for both time and work metrics. For raw wall clock time EGI is 2 times higher than Dashboard; for work it is ~90 times higher | ATLAS | January was fine, then degraded | - | https://its.cern.ch/jira/browse/ADCINFR-28 | To be done |
| RC-KI | EGI shows only 7% of usage for both time and work metrics | ATLAS only; ALICE is fine | August | - | - | To be done |
| Brunel | EGI shows 2 times lower consumption than Dashboard for ATLAS and 5 times lower for CMS, for both work and time | ATLAS, CMS | August-September | Apparently the problem is in the ARC-Condor interface, which also affects CMS job submission | https://ggus.eu/index.php?mode=ticket_info&ticket_id=123947 | In progress |
| FZK-LCG2 | Work is 5 times higher in EGI than in Dashboard; raw wall clock time is fine | All VOs | September | Wrong info published in BDII | - | BDII info is fixed; September and part of October data to be republished |
| Begrid-ULB-VUB | EGI shows 2 times higher numbers for both time and work metrics | CMS | July-September | "we looked at or site_bdii and indeed there were information that were not reported anymore. This has been fixed" | https://ggus.eu/index.php?mode=ticket_info&ticket_id=125016 | Reported to CMS site support, solved? |
| INDIACMS-TIFR | EGI shows 3 times lower numbers than Dashboard for time, while for work EGI shows 60% higher numbers than Dashboard | CMS | September; the first 5 months of the year look fine | "acknowledge that the accounting data was not getting uploaded to EGI. Solved" | https://ggus.eu/index.php?mode=ticket_info&ticket_id=125017 | Reported to CMS site support, solved? |
| NCP-LCG2 | EGI shows 2.5 times higher values than Dashboard for both time and work | CMS; ALICE is fine | First 4 months of the year are fine, then the situation degraded | TBD | https://ggus.eu/index.php?mode=ticket_info&ticket_id=125018 | Reported to CMS site support, ongoing |
| T2 Estonia | EGI shows 1.5-2 times higher consumption than Dashboard for both time and work | CMS | September | "one error has been fixed and sending should be normal now" | https://ggus.eu/index.php?mode=ticket_info&ticket_id=125019 | Reported to CMS site support, solved? |
| Ru-PNPI | EGI shows 30 times higher consumption than Dashboard (Are you sure this is the correct way round? The CMS CPU numbers in APEL look very small.) | CMS only; ALICE, ATLAS and LHCb are fine | Starting from the beginning of the year | Jobs running at this site not correctly reported to Dashboard? | https://ggus.eu/index.php?mode=ticket_info&ticket_id=125020 | Reported to CMS site support, ongoing |
| Ru-SPbSU | Time metric is fine, while work is shown 10 times higher in EGI than in DIRAC | LHCb | September (no data before September) | Apparently wrong info in BDII | - | To be done |