Last Page Update
SteveTraylen
2007-10-10

Objective

  • Install Patch #1232 on the prod and tiertwo services.
  • Reboot for a long-overdue kernel upgrade.

Status

Completed Thursday August 23rd 2007.

Broadcast

Date: Thursday August 23rd 2007
Affected Service: CERN FTS prod-fts-ws.cern.ch and tiertwo-fts-ws.cern.ch.
Impact: Service unavailable for atlas, alice, lhcb, cms, dteam and ops VOs.

On Thursday August 23rd from 06:30 UTC (08:30 CEST) the CERN T0 export and tiertwo FTS
services will be unavailable for a software upgrade. Full service is expected to
be restored by 10:00 UTC (12:00 CEST). No further broadcast will be sent provided
the service is restored by this time.

Questions: fts-support@cern.ch.

Preparation Steps

  • Check defragmentation of Database - Request made to DBAs.
  • Prepare CDB templates and validate. Same as pilot plus addition of yaim 4.

Intervention

  • Set hosts to SMS status maintenance. Done
  • Switch channels to inactive, wait to drain. Done

  • Stop scripts running on fts102 and lxb2091
  • Disable history pack on prod and tiertwo.
    • exec fts_stats.stop_hourly_job;
    • exec fts_history.stop_job;
    • exec fts_statecount.stop_job;
  • Verify no queries are running.
    • select * from user_jobs;
  • Stop the web services.
  • Stop the agent nodes.
  • Ask DBAs to back up tables and check fragmentation.
  • Upgrade the schema.
  • Commit CDB files.
  • Update RPMS.
  • Run yaim.
    • For some unknown reason, /opt/glite/yaim/bin//yaim -i -s /etc/lcg-quattor-site-info.def -m glite-FTA2 works, but the more correct ncm-ncd --configure yaim fails. This is not a problem and can easily be worked around.
    • The failure was in fact fixed by replacing tail_pid=$? with tail_pid=$! in yaim. This yaim version is not released, and the fix is included in what will be the release.
  • Start web servers by rebooting.
    • Try a few commands.
  • Start agent nodes by rebooting.
    • Check they are running.
  • Start monitoring jobs on fts102.
  • Enable jobs
    • exec fts_history.submit_job;
    • exec fts_stats.submit_job;
    • exec fts_statecount.submit_job;
  • Enable production status for nodes in SMS.
  • Have lunch.
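
The yaim fix noted under "Run yaim" turns on the difference between two shell special parameters: $? holds the exit status of the last command, while $! holds the process ID of the most recent background job. The following is a minimal sketch of that difference only, not the actual yaim source. The sleep command is a stand-in for whatever background process yaim tracks, and wrong_pid is a hypothetical name used for illustration.

```shell
#!/bin/sh
# Illustrative only: this is not the yaim code.
# Start a background process (a stand-in for a log-following tail).
sleep 30 &

wrong_pid=$?   # $? is the exit status of launching the job (0), not a PID
tail_pid=$!    # $! is the PID of the most recent background job

echo "value captured with \$?: $wrong_pid"

# With the real PID, the background process can be stopped and reaped.
kill "$tail_pid"
wait "$tail_pid" 2>/dev/null || true
echo "background job stopped"
```

A script that stores $? in place of $! ends up with 0 in the variable, so a later kill or wait on that value does not act on the intended process.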

-- SteveTraylen - 15 Aug 2007

Topic revision: r6 - 2007-10-10 - SteveTraylen
 