SSB Shifter Instructions
Instructions for ADCoS shifter
A full description of the ATLAS SSB is available on the SSB twiki.
Open the shifter view of the ATLAS Site Status Board.
By default, the sites are listed in alphabetical order. To sort by a different field, click on the small arrow next to the field name.
If you want to display only a subset of sites, use the "search" option (top-right). For example, to select only the sites belonging to a cloud, click on the "search" button, select the field CloudInfo and set it equal to the cloud you want to display (fr, it, us, ...).
Before checking all the listed sites, please send an email to atlas-adc-ssb-notifications @ cern.ch if:
- a column is not updated
- there are many "n/a" displayed in the same column/row
- a site that you expect to be monitored is missing from the list
The following issues have already been reported:
- many n/a for sites: RO-14-ITIM, RO-16-UAIC
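The reporting conditions above can be sketched as a small check over a snapshot of the table. This is only an illustration: the data layout (a dictionary of site name to column values), the function name and the 50% threshold for "many" are assumptions, not part of the SSB. The "column is not updated" condition is not covered, since it would need per-column timestamps.

```python
NA = "n/a"

def find_reportable_issues(table, expected_sites, na_fraction=0.5):
    """Return a list of issues worth reporting to the SSB notification list.

    table          -- hypothetical snapshot: {site_name: {column_name: value}}
    expected_sites -- sites the shifter expects to see in the list
    na_fraction    -- assumed threshold for "many n/a" (not defined by the SSB)
    """
    issues = []

    # Sites that should be monitored but are missing from the list.
    for site in sorted(set(expected_sites) - set(table)):
        issues.append(f"site missing from list: {site}")

    # Columns in which a large fraction of sites show "n/a".
    columns = sorted({col for row in table.values() for col in row})
    for col in columns:
        values = [row.get(col, NA) for row in table.values()]
        if values and values.count(NA) / len(values) >= na_fraction:
            issues.append(f"many n/a in column: {col}")

    # Rows (sites) in which a large fraction of columns show "n/a".
    for site, row in sorted(table.items()):
        values = list(row.values())
        if values and values.count(NA) / len(values) >= na_fraction:
            issues.append(f"many n/a in row: {site}")

    return issues
```

Sites already listed as known issues (for example RO-14-ITIM and RO-16-UAIC) can be filtered out of the result before sending the email.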
For each site in the list:
- check whether the site is on Scheduled Downtime. In that case, please check which services are affected and ignore the related errors reported on the dashboard (as an alternative, follow the procedures for sites in downtime).
- check the DDM 4h column. It shows the DDM status in the last 4 hours for the activities of production, T0-exp, FT and data consolidation, filtering only the source errors. If the box is NOT green, click on it and you will be redirected to the DDM dashboard. Please follow the instructions reported in the ADCoS twiki.
- cross-check whether there are blacklisted space-tokens using the DDM Status column; consider whether to exclude or include the site/space-token in DDM activities
- check the SRM SAM 12 column: in the case of failures, please cross-check the DDM dashboard information: if there are no transfers and no activity, consider whether to ticket the site
- check the panda efficiency (column to be implemented)
- check the Analysis Functional Test on the LCG backend (AFT_lcg column) and the Analysis Functional Test on the PANDA backend (AFT_Panda column). The site is automatically excluded in HammerCloud.
- cross-check the panda exclusion column; consider whether to set the site online/offline
- check the Panda analysis efficiency and the Panda production efficiency (still to be implemented).
- cross-check the panda exclusion column; consider whether to set the site online/offline
Hint: clicking on the site name will show the sensor history
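As a rough aid, the per-site checks can be written down as a small triage function that maps the state of one row to the follow-up actions listed above. This is a sketch only: the dictionary keys and the values "green", "blacklisted" and "failed" are hypothetical labels chosen for the example, not the names used by the SSB, and the efficiency columns that are still to be implemented are left out.

```python
def triage_site(status):
    """Map one site's row (hypothetical dict) to the follow-up actions.

    All keys and values used here are assumptions made for illustration.
    """
    actions = []

    if status.get("scheduled_downtime", False):
        actions.append("check affected services and ignore related dashboard errors")

    # Any colour other than green on DDM 4h needs a look at the DDM dashboard.
    if status.get("ddm_4h", "green") != "green":
        actions.append("open the DDM dashboard and follow the ADCoS twiki instructions")

    if status.get("ddm_status") == "blacklisted":
        actions.append("review space-token exclusion in DDM activities")

    if status.get("srm_sam_12") == "failed":
        actions.append("cross-check the DDM dashboard; consider ticketing the site")

    if "failed" in (status.get("aft_lcg"), status.get("aft_panda")):
        actions.append("cross-check the panda exclusion column; consider online/offline")

    return actions


# Example: a site in downtime with a non-green DDM 4h box.
for action in triage_site({"scheduled_downtime": True, "ddm_4h": "red"}):
    print("-", action)
```

The function only lists what to look at; the decision to exclude a site, set it offline or open a ticket stays with the shifter.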
Instructions for Cloud Support and Site Shifters
Note: the SSB team is working on the implementation of the cloud view.
Subscribe to the SSB alert mailing list, selecting the cloud for which you want to receive notifications.
Open the site status view of the SSB.
- DDM status: if a site is "blacklisted", it means that at least one space-token has been centrally excluded in writing/reading mode. Clicking on any box will show the two columns for source and destination exclusion. Clicking again on any box will show the web page listing the sites centrally excluded in DDM, with all the details on site exclusion.
- Panda production/analysis status: the status of the panda queues (online, offline, test, brokeroff) is reported. The panda production/analysis site name helps in the case where the panda name differs from the conventional site name.
- Scheduled Downtime: site downtime schedule according to AtlasGridDowntime
If you need to display the status in terms of DDM and PANDA exclusion in a given time interval, click on the site name. You can customize the time interval by changing the time=xx value in the URL bar.
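If the same history page has to be opened for several intervals, the time=xx value can also be set programmatically. The sketch below assumes that time appears as an ordinary query-string parameter; the example URL and the function name are placeholders, and the units of the value are whatever the SSB page already uses.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

def with_time_value(url, value):
    """Return `url` with its `time` query parameter set to `value`.

    Assumes `time` is a plain query-string parameter; any existing
    `time` entry is replaced and all other parameters are preserved.
    """
    parts = urlsplit(url)
    query = [(k, v) for k, v in parse_qsl(parts.query, keep_blank_values=True)
             if k != "time"]
    query.append(("time", str(value)))
    return urlunsplit(parts._replace(query=urlencode(query)))


# Placeholder URL, not the real SSB address.
print(with_time_value("https://ssb.example.invalid/history?site=SITE-A&time=24", 48))
```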
If your site is up and running but is still excluded from some activities, please ask the ADCoS shifter for the reason for the exclusion and request that your site be set online.
The Site Shifter can check the status of their site on the shifter view (see the ADCoS shifter instructions above for details on the metrics).
Major updates:
-- LorenzoRinaldi - 15-Dec-2010
Responsible: LorenzoRinaldi1
Last reviewed by: Never reviewed