WARNING: This web is not used anymore. Please use PDBService.MinuteS7June07 instead!
 

Minutes, 7 June 2007

  • Phone:
    • FNAL: Eric
    • BNL: Carlos
    • SARA: Alexander
    • CNAF: Barbara
    • NDGF: Olli

  • CERN:
    • 3D: Dirk, Eva, Zbigniew, Dawid, Jacek
    • ATLAS: Gancho, Florbela
    • Taiwan: Dave
    • LHCb: Marco

  • Sites status:
    • Tier0 - CERN:
      • 2 sites were out of sync during the last week. The RAL apply process aborted; a character set conversion was needed. The SARA apply process aborted with the error "No data found": Eva found that one record at SARA does not match the record registered in the LCR. The cause could not be identified.
    • SARA: nothing to report
    • CNAF: problem with the GRID cluster: services are running on 1 instance and the CPU load is very high. There was a software change on FTS, which is under investigation as a possible cause. CERN has not observed any problem with the new FTS software so far.
    • NDGF: ready to join the ATLAS setup. No news from the new production setup.
    • RAL: the upgrade of the ATLAS database to change the character set finished successfully. It would be good to do the same for the LHCb database; Marco agrees to schedule the intervention for tomorrow. The availability of the LFC replica at RAL will be discussed at the workshop next week.
    • BNL: no interventions planned
    • Taiwan: nothing to report

  • Preparation for the workshop:
    • Recovery procedure at the destination sites.
    • Every site administrator should have a laptop with access to all the information needed to run the recovery; a wireless connection will be available.
    • Phone conference details already in the Agenda.
    • Sites which will be up during the exercise: SARA, TRIUMF, RAL, CNAF
    • Other sites will run a full recovery after some problem is introduced.
    • We will test recovery only on the destination site, because Streams will need some time to catch up with the current load.

  • Experiments status:
    • ATLAS:
      • PVSS tests: a meeting was held yesterday to schedule the next tests. Next test: table switch (create a new table in a new tablespace when the limit is reached). As replication is configured at schema level, it is necessary to check whether the create tablespace operation will be replicated. Another test is to add new indexes, which will create additional load on the apply side.
      • SW release for the scalability tests: Andrea is working on it, with an estimated delay of 2 or 3 weeks. IN2P3 has priority; CNAF will be prepared afterwards.
    • LHCb:
      • Marco has had a cron job running since Friday that inserts a few hundred new conditions every 2 hours. A problem with CNAF was observed during the weekend: replication to CNAF was stuck. The apply process was moved from one instance to another. Barbara has not observed any problem with the nodes.
      • Scalability tests will start as soon as resources are available.
      • A single-instance Oracle server is being prepared at the PIT.
      • RAC server for ONLINE: no estimate yet.

  • Zbigniew has updated the monitoring tool: he has added new images, and there is a new tab with reports (as suggested at the last workshop). Please take a look and send your feedback to Zbigniew.

  • IMPORTANT: Please register service outages using the normal procedure.
Topic revision: r1 - 2007-06-08 - EvaDafonte