Week of 061030

Open Actions from last week: Put new LFCs into production. This was done on Friday.

Chair: H.Renshall

Gmod: K.Skaburskas

Smod: M.dos Santos

Monday:

Log: Nothing

New Actions: Renumbering of gdrb07 to gdrb11 tomorrow morning. LSF grid queues will be paused.

Discussion: Several dips in both ALICE and CMS FTS transfers over the weekend. Different causes - to be followed up individually. ALICE noted that the FTS hourly statistics are not being refreshed. P.Badino answered that they are waiting for new DB indexes, as the query is currently too expensive. CMS bulk job submission is expected to ramp up today/tomorrow.

Tuesday:

Log: lfc101 had a raid-tw alarm. rb108 froze and was rebooted. Nothing obvious in the logs.

New Actions: Atlas castor stager upgrade for this morning - 9.30 to 10.30. gdrb renumbering this morning. GD to inform Ulrich when he can restart the grid queues.

Discussion: Alice and GSSDLHCB stager upgrades for tomorrow.

Wednesday:

Log:

New Actions:

Discussion: rb102 has developed a bug - the L&B process dies after a few minutes. This is now with the developers. After the Atlas upgrade, the gridftp daemons on the disk servers had to be restarted to pick up the new castor library.

Thursday:

Log: The move of the gdrb machines from 137.138 to 128.142 blocked their high ports. Fixed by ML during Wed afternoon. After the Oracle patches an FTS query was running much slower - this was in fact a problem on the validation cluster.

New Actions: Hand over voalice01 and 02. dteam to run simple tests over srmv22.

Discussion: CMS are ramping up the volume of jobs being submitted towards 50k/day. Currently at 25k/day, of which 60% go to OSG. They have solved the output retrieval performance problem by increasing to 10 job robots. ALICE are seeing very cyclic FTS transfer rates, with peaks and troughs every 30 minutes. Miguel suspects Castor issues and will follow up. GSSDLHCB have old root3 applications which do not work with the new stager - they are deciding whether to roll back. GSSDLHCB LFC reconfiguration to replicate to CNAF is scheduled for today. SRMV22 is now in pre-production for basic testing.

Friday:

Log:

New Actions: Ulrich is testing new LSF RPMs (improved eexec and external mailing wrapper). Costin and Miguel to try to understand cyclic FTS transfer rates for ALICE. CMS and ALICE both dropped their transfer queues overnight.

Discussion: gdrb security scans are ok, but the machines are to stay out of the landb set. GSSDLHCB LFC replication is now set up and looking promising.

Topic revision: r6 - 2007-02-02 - FlaviaDonno