Week of 060109
Chair: Harry Renshall
Open Actions from last week:
On Call: Maarten and Sophie
Monday:
Log: BNL is seeing some FTS transfers fail with "cannot allocate memory" errors. These have been seen before (reported by Gavin).
New Actions: Gavin to follow up on the BNL FTS problem.
Discussion: By Friday afternoon the SC3 WAN and FTS clusters had been integrated into the LCG network and the previous static routes had been removed (by ML). The SC3 disk-disk rerun ramp-up should start today. There will be a conference call at 16:00 to kick off this activity.
Tuesday:
Log: Nothing.
New Actions: Ramp-up SC3 to FZK (late morning) and IN2P3 (initially only up to 100 MB/s).
Discussion: Thanks to Sophie for initiating a new on-call list for the first 3 months of 2006. By the end of that period many services should be run by FIO, so we will revisit the issue then. Maarten attributed the BNL malloc problem to an old 'transient' problem in the CASTOR SRM. He found 10 places in the code that could be responsible. No further action for the moment. H.Renshall reported the ramp-up site schedule following the conference call (put in as daily actions).
J.v.Eldik reported there is a new CASTOR2 c2sc3 cluster for the SC3 rerun. Overnight some 120 files were put on each of 16 new disk servers (GM and ML). Loading should now continue to at least 8000 files in total.
Wednesday:
Log:
Actions: Ramp-up SC3 to FNAL and BNL. Get new gridftp servers into GRIDVIEW.
Discussion: The gridftp logs from the new c2sc3 disk servers are not being reported back to the GRIDVIEW reporting infrastructure. H.Renshall to get this done urgently.
Thursday:
Log:
Actions: Ramp-up SC3 to RAL and CNAF.
Discussion:
Friday:
Log:
Actions: Ramp-up SC3 to NIKHEF and other smaller sites.
Discussion: