CMS Space Monitoring Visualization

This page tracks development work on CMS Space Monitoring visualization, which is based on the unified infrastructure of the CERN IT Monitoring Services. It may be of interest to other CMS monitoring visualization projects.

Contacts

Meetings

Vidyo meetings are called as necessary, usually on Fridays at 16:00 CERN time.

30 September 2016, 4pm CERN / 9am Fermilab

Follow-up: CERN IT Monitoring and CMS ELK Use Cases

Agenda and connection details: https://indico.cern.ch/event/572911/

Attended:

  • From CERN: Alberto, Borja, Pedro
  • From Fermilab: Natalia, Yuyi

Borja showed SpaceMon data in Kibana (follow the link from the Indico page above).
We discussed further steps; see Summary of Tasks.

26 August 2016

Technical discussion Follow-up: CERN IT Monitoring and CMS ELK Use Cases

Agenda: https://indico.cern.ch/event/565536/

Attended:

  • From CERN: Alberto, Pedro
  • From FNAL: Natalia, Alessandro (no mic)

19 August 2016

Pilot meeting on CERN IT Monitoring and CMS ELK Use Cases

Agenda: https://indico.cern.ch/event/563910/

Attended:

  • From CERN: Alberto, Pablo
  • From FNAL: Natalia, Eric, Kevin, Seangchan, Yuyi, Marco

Summary of Tasks

As discussed at the 30 September meeting:

- Could be done now (next couple of weeks)
1. Move from monitqa to monit, so that it is accessible to any authenticated user (MONIT)
2. Send the documentation on how to build dashboards and the basics of Kibana (MONIT)
3. Use one call to get all the SPACEMON data instead of the current multiple site-by-site calls (MONIT and SPACEMON)
4. Use the 1h time interval to get only what has changed in that interval, instead of receiving the whole data set every time (MONIT and SPACEMON)

- Mid-term period (few more weeks)
5. Get new aggregated data (e.g. weekly) produced by scripts and delivered via the HTTP feeds, instead of internal processing with Spark (SPACEMON and MONIT)

- Longer period (couple of months)
6. Define a MONIT workflow that can be extended and adapted to Flume data sources and processing by external projects (like SPACEMON), without us doing the Flume development (MONIT)
7. Move SPACEMON to this model, where data sources, processing and visualization can be done and changed without MONIT intervention (MONIT and SPACEMON)
8. Enrich with the pledges from REBUS data already in MONIT (site names probably do not match) (SPACEMON helped by MONIT)
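
Task 4 above (receiving only what changed within the 1h interval) could be approached roughly as follows. This is a minimal sketch under stated assumptions, not the agreed implementation: the function name changed_records, the site names and the record fields are all hypothetical, and the snapshots are assumed to be dictionaries that map a site name to its latest record.

```python
def changed_records(previous, current):
    """Return the entries of `current` that are new or differ from `previous`.

    Both arguments map a site name to its latest record. This shape is an
    assumption made for illustration, not the data service format.
    """
    return {
        site: record
        for site, record in current.items()
        if previous.get(site) != record
    }


# Hypothetical snapshots taken one hour apart; names and fields are illustrative.
previous = {
    "T1_EXAMPLE_A": {"timestamp": 1475000000, "space": 100},
    "T2_EXAMPLE_B": {"timestamp": 1475000000, "space": 50},
}
current = {
    "T1_EXAMPLE_A": {"timestamp": 1475000000, "space": 100},
    "T2_EXAMPLE_B": {"timestamp": 1475003600, "space": 55},
    "T2_EXAMPLE_C": {"timestamp": 1475003600, "space": 10},
}

delta = changed_records(previous, current)
print(sorted(delta))
```

Only the updated and the newly seen sites end up in delta, so an unchanged site would not be sent again. A server-side filter on the data service would avoid transferring the full snapshot at all, but whether such a filter exists is not stated here.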

Documentation

  1. CERN IT Monitoring Service documentation (under construction):
    http://monitdocs.web.cern.ch/monitdocs/ - top page
    http://monitdocs.web.cern.ch/monitdocs/data_access.html - direct link to data access docs
  2. CERN IT Monitoring Service support:
    https://cern.service-now.com/service-portal?id=functional_element&?name=monitoring
  3. CMS storage space monitoring data:
    https://cmsweb.cern.ch/dmwmmon/datasvc/doc
    - certificate-based authentication
  4. Elasticsearch official documentation:
    https://www.elastic.co/guide/index.html
  5. Grid Monitoring at CERN with Elastic:
    https://www.elastic.co/blog/grid-monitoring-at-cern-with-elastic
    blog by Pablo Saiz
  6. Kibana official documentation:
    https://www.elastic.co/guide/en/kibana/current/index.html
  7. Grafana official documentation:
    http://docs.grafana.org/guides/gettingstarted/
  8. Grafana/Kibana evaluation:
    Excel table by Alessandro in Google Docs
  9. Apache Flume documentation:
    https://flume.apache.org/

Deployments

Elastic-Kibana-Grafana testbed at Fermilab

  • fermicloud318
  • OS: SLF 7.2
  • Set up by Kevin
  • Fermi-wide access with KCA-based authentication
  • Used by Alessandro for his evaluation project; download [presentation], [report].

Natalia's Elastic-Kibana instance at Fermilab

The server was decommissioned during the 17 September power outage; all steps are documented in Fermilab Redmine (#12066, #13443, #13481).
  • cmsdev33
  • OS: SLF 6.8
  • Set up by Natalia
  • Behind FNAL and CMS firewalls. Only local access.
  • Used by Natalia for testing and reproducing visualizations developed by Alessandro

Prototype on monitqa server at CERN

  • Restricted access, open to a few developers on request
  • In https://monitqa.cern.ch/app/kibana look for the monit_qa_spacemon_raw_metric_* data source.
  • Initial configuration for the kafka agent:
    spaceMon:
        type: exec
        channels: ckafka cerror
        shell: /bin/sh -c
        command: curl -k -o - "https://cmsweb.cern.ch/dmwmmon/datasvc/json/nodes" | python /etc/flume-ng/agent-httpsource/spacemon_parser.py
        restart: true
        restartThrottle: 86400000
        logStdErr: true
    
  • The contents of spacemon_parser.py are attached as spacemon_parser.py.txt.
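
In the agent configuration above, the output of curl is piped into a parser whose standard output is read by the Flume exec source, which treats each output line as one event. A minimal sketch of such a line-oriented parser is given below. It is not the attached spacemon_parser.py: the function names, the key path ("result", "nodes") and the sample record fields are hypothetical, and the real layout of the data service response is not reproduced.

```python
import json
import sys


def records_to_lines(payload, path):
    """Yield one compact JSON string per record found under `path`.

    `path` is the sequence of keys leading to the list of records. The key
    names used below are assumptions, not the real response layout.
    """
    node = payload
    for key in path:
        node = node[key]
    for record in node:
        yield json.dumps(record, sort_keys=True)


def main(stream=sys.stdin, out=sys.stdout, path=("result", "nodes")):
    # Read the whole response piped in on stdin, then emit one line per record.
    payload = json.load(stream)
    for line in records_to_lines(payload, path):
        out.write(line + "\n")


# Hypothetical response used only to exercise the function.
sample = {
    "result": {
        "nodes": [
            {"node": "T2_EXAMPLE_B", "space": 55},
            {"node": "T2_EXAMPLE_C", "space": 10},
        ]
    }
}
lines = list(records_to_lines(sample, ("result", "nodes")))
print(lines[0])
```

To use the sketch in a pipe, main() would be called under an `if __name__ == "__main__":` guard. Emitting one record per line keeps each event small and lets each record be indexed on its own.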

-- NataliaRatnikova - 2016-09-14

Topic attachments
spacemon_parser.py.txt (r1, 1.0 K, 2016-10-13 20:32, NataliaRatnikova): spacemon_parser.py to get spacemon data from the data service and feed it to kafka.
Topic revision: r7 - 2016-10-13 - NataliaRatnikova
 