Week of 050725
Open Actions from last week:
- tape tests
- testing of smoke tests
- new FTS release / deployment plans.
On Shift:
Maarten/Patricia
Monday:
Log:
- Transfers fairly stable. Lost SARA at midnight Sat/Sun (technical problems with tape silo). Problems again last night with BNL. RAL had 2-3 hours of downtime.
- Need to work out schedule with sites to debug / tune transfers.
- FTS: 1% of transfers abort in gridftp. The aborts are grouped by site. Maarten will take a look.
- New FTS version - still testing. Issue with MyProxy. Workaround... Need to agree on / resolve release schedule.
- James has populated the ia32 disk servers. Olof to move them to the WAN pool. (18 ia64 + 8-9 ia32)
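Note on the gridftp aborts: a minimal sketch of how the per-site abort fraction could be tallied, to check whether the aborts really cluster by site. The record layout (site, status), the "ABORTED" status string and the site names are assumptions for illustration only, not the actual FTS log format.

```python
from collections import Counter

def abort_rate_by_site(transfers):
    """Return {site: fraction of transfers that aborted}.

    `transfers` is an iterable of (site, status) pairs. This layout is
    hypothetical and does not reflect the real FTS log format.
    """
    totals = Counter()
    aborted = Counter()
    for site, status in transfers:
        totals[site] += 1
        if status == "ABORTED":  # assumed status label
            aborted[site] += 1
    return {site: aborted[site] / totals[site] for site in totals}

# Made-up sample records, not real transfer data.
sample = [
    ("SITE_A", "DONE"), ("SITE_A", "ABORTED"),
    ("SITE_B", "DONE"), ("SITE_B", "DONE"),
]
rates = abort_rate_by_site(sample)
```

Sites whose fraction sits well above the overall 1% would be the first candidates for debugging.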
New Actions:
Tuesday:
Log:
- One problem with FTS006 - ASIS_OR_SPMA error - understood
- LFC002 - GSSDATLAS Server - error in log due to list_replicax - fixed in 1.3.6
- One of the new IA32 nodes still in closed state - Olof to check.
- QF version of config service updated with config lockfile fix.
- Wed/Thur - Olof/Sebastien/Ben at RAL. Jan on SMOD - will only arrive at 9:15
New Actions:
Wednesday:
Log:
- GRIDVIEW not stable - was an RGMA problem, but not sure this explains everything
- LFC for LHCb deployed and LHCb gave OK for features
- QF deployed already on FTS nodes (Gavin)
- At PEB, experiments agreed to provide storage rates, capacity and number of CPUs for T0/1/2.
Actions:
- Maarten: Look into GRIDVIEW numbers
- Jan/J-P: deploy new LFC on all pilot nodes - DONE
Thursday:
Log:
- Still problems with Gridview - imported some of the outstanding data; RGMA team will import the rest once the system is stable
- Waiting for IA64 version of RGMA - not urgent, have workaround (fri)
- Testing new version with load-gen to castorsrm
Actions:
- Wait on Olof/... to return from RAL to reconfigure phedex nodes for DPM
- Discussion needed on resources for service phase at T1/T2 (for Tuesday)
Friday:
Log:
- Problem with lfc002 (atlas) - NO_CONTACT - appears not to be a real problem
- Tracking down problems in lcg-mon-gridftp and R-GMA - got a new version of the monitor that works with the edg-rgma RPMs
- Andreas Unterkircher will work 50% on SC work until he joins in April
- CMS have submitted the test jobs they can run
- Korean site for CMS joining the challenge. Need to start the process now.
Actions:
- update LSF config and new version of lcg-mon-gridftp (Jan)
- 'canned emails' for interventions (Patricia/Maarten)
- start setting up Korean networking (David/James)
Todo:
- new version of SRM (Ben/Vlado on Tuesday)
- QF for myproxy/oracle issue in FTS (install new RPM + restart)