Quarterly plan for November 2011-January 2012

Job monitoring area

Historical view

ATLAS

Multiple bug fixes and feature requests implemented. See report from Eddie for details.

CMS

  • Adapt new version of the historical view redesigned for ATLAS to CMS

This includes: 1) changing the smry (summary) tables to add new sorting attributes (data type, CMSSW version); 2) adding a number-of-users metric; 3) adding site/application failures to the interactive-like plot; 4) redoing the UI (deploy prototype for validation: middle of February).

Interactive interface

CMS

  • Develop new version of the interactive interface in the hBrowser framework (end of January) (done mid December, currently under validation)
  • Evaluate interactive UI performance when querying a denormalized DB object (table or materialized view) (mid-January) (first checks done; next steps postponed pending migration to 11g and new hardware at the end of January)
  • If the evaluation (point 2) gives positive results, rewrite the DAO part of the interactive UI to use the denormalized DB object (end of January)

Task monitoring

ATLAS

  • Develop stored procedures for aggregating analysis task data to improve UI performance (?)
  • Prototype first version of the analysis task monitoring (end of February?)

Transfer monitoring area

ATLAS DDM Dashboard

  • 2.0 M3.1 release (DELIVERED 15-Dec-2011)
    • Features:
      • Details links are now all real (non-JavaScript) and can be opened in a new tab or window.
      • Dates are locked by default when selecting details and user is alerted.
      • Details queries tuned to achieve acceptable response times for full details time range (3 months).
    • Bug fixes:
      • Special characters in SURLs from the details page prevent effective copying and pasting.
      • Tabs fail to refresh correctly when resizing browser window.
    • Other:
      • Google Analytics set up.
      • Data model refactored to simplify development.

  • 2.0 M3.2 release (POSTPONED awaiting modularization of UI code; see below)
    • Links to monitoring data via Web API in JSON/XML formats.
    • Help page for Web API.
    • Live preview of endpoint selection.

  • Modularization of UI code (ONGOING expected Feb-2012)
    • Allow parallel development of UI for DDM Dashboard and Global Transfer Monitoring.

Global Transfer Monitoring System

  • Deploy first prototype covering the complete data flow from the FTS data publisher to the UI. (DELIVERED 16-Nov-2011)
  • Perform consistency checks between new WLCG Transfer Dashboard and PhEDEx and DDM Dashboard systems (DELIVERED 16-Dec-2011 and ongoing)
  • Following deployment of the FTS 2.2.8 to all T1 sites make sure that information from all FTS instances is collected in the WLCG Transfer Dashboard (POSTPONED awaiting FTS 2.2.8 deployment)
  • Enable alarms in case information of any FTS instance is missing or delayed (DELIVERED 23-Jan-2012)
  • Development of the new features in the UI following the feedback of the LHC experiments
    • Add filtering / grouping by country (DELIVERED 27-Jan-2012)
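
The alarm on missing or delayed FTS information can be illustrated with a minimal sketch. It assumes the collector keeps the timestamp of the latest message per FTS instance. The function name, the instance names and the one-hour threshold are hypothetical and are not taken from the actual WLCG Transfer Dashboard code.

```python
from datetime import datetime, timedelta


def find_stale_instances(last_seen, expected, now, max_delay=timedelta(hours=1)):
    """Return (missing, delayed) lists of instance names.

    last_seen: dict mapping instance name -> datetime of its latest message
    expected:  iterable of instance names that should be reporting
    max_delay: hypothetical threshold after which an instance counts as delayed
    """
    # Instances that have never reported at all
    missing = sorted(name for name in expected if name not in last_seen)
    # Instances whose latest message is older than the allowed delay
    delayed = sorted(
        name for name in expected
        if name in last_seen and now - last_seen[name] > max_delay
    )
    return missing, delayed


# Illustrative data only: instance names are placeholders
now = datetime(2012, 1, 23, 12, 0)
last_seen = {
    "fts-a": datetime(2012, 1, 23, 11, 50),
    "fts-b": datetime(2012, 1, 23, 9, 0),
}
missing, delayed = find_stale_instances(last_seen, ["fts-a", "fts-b", "fts-c"], now)
print(missing, delayed)  # ['fts-c'] ['fts-b']
```

Keeping "missing" and "delayed" separate lets an alarm distinguish an instance that never published from one that stopped publishing.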

Google Earth

Try Kinect sensor and software for Google Earth (end of September). Initial testing performed; ongoing.

Handling of the Dashboard cluster

SSB

Tasks performed but not foreseen in the quarterly plan

Dashboard

  • Netvibes tutorial.

ATLAS DDM Dashboard

  • Testing for 11g database upgrade.

Global Transfer Monitoring System

  • Switch to production MSG brokers with virtual queues. (DELIVERED 16-Dec-2011)
  • Set up dashboard46 as showcase integration service with best-effort support. (DELIVERED 27-Jan-2012)

SiteView

  • Many improvements and bug fixes in the SiteView collectors and monitoring display; see details in Eddie's input. (January)
  • SiteView was upgraded to the latest SSB version, which makes it possible to enable another display with historical distribution (SSB-like), as requested by the operations TEG working group. (January)

Detailed input from team members

Eddie

Google Earth

Fixed a problem with Austrian grid sites not appearing in Google Earth (GGUS #76061).

Job monitoring area

CMS

Started cleaning the old CMS Dashboard job monitoring data. Initial cleaning done; work is ongoing.

- Job Summary / Interactive View: Created a denormalised table with one week's data for testing and benchmarking. Created bitmap and regular indices on that table and changed the application's Database Access Object to query that table.

ATLAS

- Fixed problems with ATLAS analysis build jobs (#88851 & #88887).

- Fixed bug #89936: Jobs with unresolved site

- Regular meetings with the DBAs about performance problems

- Optimised a procedure which calculates the hourly summaries. Created a procedure that adds daily partitions on the JOB table for up to a month.
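
As an illustration of adding daily partitions ahead of time, the sketch below only computes the partition names and upper bounds that such a procedure would need. The naming scheme (`JOB_P` plus the date) and the 30-day horizon are assumptions made for the example. The real procedure runs in the database and is not reproduced here.

```python
from datetime import date, timedelta


def daily_partitions(start, days_ahead=30, prefix="JOB_P"):
    """Yield (partition_name, upper_bound) for each day starting at `start`.

    prefix and days_ahead are hypothetical defaults, not the production values.
    """
    for offset in range(days_ahead):
        day = start + timedelta(days=offset)
        # A daily range partition holds rows strictly below the next midnight
        yield f"{prefix}{day:%Y%m%d}", day + timedelta(days=1)


# Three days across a month boundary, for illustration
parts = list(daily_partitions(date(2012, 1, 30), days_ahead=3))
for name, upper_bound in parts:
    print(name, upper_bound.isoformat())
```

Creating the partitions in advance avoids inserts failing when a new day starts and no partition exists for it yet.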

Historical view

ATLAS

New version ready for testing, with many bug fixes and new features:

#124503 thumbnail of successes/failures for gangarobot

#88668 ATLAS Hist.Views Prototype: Normalisation of running/pending jobs with monthly granularity is wrong.

#88696 ATLAS Hist.Views Prototype: Exception on the Resource Utilisation page

sr#124658 success/all efficiency > 1

sr#124739 legends too small with all sites selected

sr #124784: average efficiency > 1.

#89282: legends too large/difficult to read.

#89234: HEPSPEC Average Coefficient is supposed to fluctuate over time.

#89276: Increase the thickness of the 'pledges' line.

#89290: Wrong individual CPU consumption plot when grouping by ADC Activity.

#89296: Minor problem with the pledges summary table.

#89487: A user should be able to adjust the total number of legends in the plots.

#89814 - plot doesn't take into account the selected granularity on the resource utilisation page, running jobs plot.

#90204: Historical Views: Wrong sites<->pattern association for Romania federation

#90244: Add 3 missing gangarobot activities on the ATLAS Job Monitoring

#90394: ATLAS Historical Views: Generate a url hash for every user selection
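
The URL hash for user selections (#90394) can be sketched as a round trip between a selection and a URL fragment. This is a minimal illustration in Python and not the dashboard's own client-side code. The parameter names `sites` and `granularity` and the site names are hypothetical.

```python
from urllib.parse import urlencode, parse_qs


def selection_to_hash(selection):
    """Encode a selection (dict of name -> list of values) as a URL fragment."""
    # Sort keys so the same selection always produces the same fragment
    pairs = [(key, value) for key in sorted(selection) for value in selection[key]]
    return "#" + urlencode(pairs)


def hash_to_selection(fragment):
    """Decode a URL fragment back into a dict of name -> list of values."""
    return parse_qs(fragment.lstrip("#"))


# Placeholder selection, for illustration only
selection = {"sites": ["SITE_A", "SITE_B"], "granularity": ["daily"]}
fragment = selection_to_hash(selection)
print(fragment)  # #granularity=daily&sites=SITE_A&sites=SITE_B
restored = hash_to_selection(fragment)
```

Because the fragment fully describes the selection, a user can bookmark or share the link and get the same view back.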

- Fixed the HEPSPEC06 values of the IL-HEP Tier-2 Federation sites: GGUS 77079.

- Tweaked and improved the ATLAS HEPSPECs and Pledges collector.

Also added the possibility to select All T2s+T1s and All T2s+T1s+T0 from the selection menu.

CMS

Fixed bug #89516: Plots of the CPU Consumption should follow the selected granularity and present the results in 'hours/hour', 'days/day','weeks/week' and 'months/month'.
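
One way to read units such as 'hours/hour' or 'days/day' is as CPU time consumed in a bin divided by the wall-clock length of that bin. The sketch below shows this interpretation. The function name and the input values are assumptions for illustration and do not come from the dashboard code.

```python
from datetime import datetime


def normalised_cpu(cpu_seconds, bin_start, bin_end):
    """Return CPU time consumed in a bin divided by the bin's wall-clock length."""
    wall_seconds = (bin_end - bin_start).total_seconds()
    if wall_seconds <= 0:
        raise ValueError("bin_end must be after bin_start")
    return cpu_seconds / wall_seconds


# 7200 CPU-seconds consumed within one hour: 2.0 hours/hour
hourly = normalised_cpu(7200, datetime(2012, 1, 1, 0), datetime(2012, 1, 1, 1))
# 172800 CPU-seconds consumed within one day: 2.0 days/day
daily = normalised_cpu(172800, datetime(2012, 1, 1), datetime(2012, 1, 2))
print(hourly, daily)  # 2.0 2.0
```

Dividing by the actual bin length also handles months of different lengths, which a fixed divisor would not.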

Task monitoring

New minor release for CMS targeting bug #89568: issue with non-UTC time.

GridMap SiteView

Fixed broken ATLAS and CMS collectors for the GridMap SiteView application (#88584 & #88633).

Fixed #89802 LHCb Site Status collector does not work
Fixed #89889:
- LHCb collector was using an outdated topology.
- ALICE collector for job processing was collecting the wrong number from MonALISA.
- ATLAS collector for job processing was counting submitted jobs instead of terminated ones.
- CMS collector for job processing was counting submitted jobs instead of terminated ones.
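
The ATLAS and CMS collector fix amounts to counting terminated jobs rather than submitted ones. A minimal sketch of that counting follows. The status names, the job record layout and the site names are hypothetical and do not reflect the real collector schema.

```python
from collections import Counter

# Hypothetical status names that mark a job as terminated
TERMINATED = {"finished", "failed", "cancelled"}


def count_terminated_by_site(jobs):
    """Count jobs per site, considering only jobs in a terminated state."""
    counts = Counter()
    for job in jobs:
        if job["status"] in TERMINATED:
            counts[job["site"]] += 1
    return counts


# Placeholder job records, for illustration only
jobs = [
    {"site": "SITE_A", "status": "submitted"},
    {"site": "SITE_A", "status": "finished"},
    {"site": "SITE_A", "status": "failed"},
    {"site": "SITE_B", "status": "submitted"},
]
counts = count_terminated_by_site(jobs)
print(counts["SITE_A"], counts["SITE_B"])  # 2 0
```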

Fixed #90348: Direct links to ATLAS DDM 2 Dashboard.

Others

- EGI CF2011 dashboard abstract.

Topic revision: r8 - 2012-02-23 - DavidTuckett
 