Attendees: Olof (chair), Seb, Charles, Gordon, Vlado, Jan, Hugo (sec)
Review of actions: (only changes noted)
point 12) Olof solved the slc3 issues on lxfs4501 by renaming the mount point. Vlado will talk to David about changing his script to use the FQN as the filesystem mount point.
Central services:
vmgr:
vdqm:
Stagers:
NA60 started to use its standby stager with one disk server. No rootd is started or configured on the disk server; Jan will look into running a rootd on the machine.
Diskservers:
Lxshare2[14-20]d added to castor2 set-up.
Tapeservers:
Olof created a new file class for NA60 with nbcopies=0.
Charles started the labeling of 100 LTO3 tapes. No problems seen.
4 failures on both Titanium drives (seems related to the 32K block size) with I/O errors. An IPL brings the drive back but doesn't solve the problem; so far the cause is not understood. Charles/Hugo tried to use the VOP to get logs/dumps/perms from the drives to files, but the files were created with size 0. The VOP displays an alert on one of the drives, but the status LED shows OK. The IP addresses of the Titanium drives seem to appear on the SDP box but are still not configured.
Robotics:
SL8500 robot installation was finished on Friday. All LTO3 drives installed in the robot are visible. STK didn't provide a user/password to log in on the 8500 robot console. The SL8500 was left in a bad, shambled (?) state, with many vols missing, absent ...; an audit performed by Charles fixed the problem.
Grid:
Vlado will move 7 disk servers from stagergridsc to CASTOR2, provided the 7 disk servers have been drained (procedure started by Tony yesterday afternoon).
Castor2 should be ported to the IA64 platform and deployed on Oplapro[71-79]. Vlado will try to get a machine (opladev) for castor2 from Andreas Hirstius, where Jean Damien can try to compile castor2. Sebastien foresees some problems in the port, especially with marshalling/demarshalling when clients and servers run on different platforms.
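Illustration of the marshalling concern (a minimal sketch, not taken from the castor2 code): if a message is packed using the native byte order and struct padding of the sender, a receiver on another platform may decode different values. Packing with an explicit byte order and fixed field sizes avoids this. The message layout and function names below are hypothetical.

```python
import struct

# Hypothetical message layout, for illustration only:
# 32-bit magic number, 32-bit request type, 64-bit file size.
# "!" selects network (big-endian) byte order with standard sizes and
# no padding, so the bytes are identical on every platform.
WIRE_FORMAT = "!IIQ"


def marshal(magic, reqtype, filesize):
    """Pack the three fields into a platform-independent byte string."""
    return struct.pack(WIRE_FORMAT, magic, reqtype, filesize)


def unmarshal(payload):
    """Unpack a byte string produced by marshal() into a tuple of ints."""
    return struct.unpack(WIRE_FORMAT, payload)


if __name__ == "__main__":
    wire = marshal(1, 2, 3)
    print(len(wire), unmarshal(wire))
```

Using the native format ("@" or no prefix) instead would make the encoded bytes depend on the byte order and alignment rules of the machine doing the packing, which is the kind of mismatch to check for in the port.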
Rota for writing the minutes for the coming three weeks:
Week 4-8 June: Hugo
Week 11-15 June: Jan (to be reconfirmed with him next week)
Week 18-22 June: Sebastien
NOTE
Minute writers are asked to try to retain Tony's format. In particular, it is important to include the action list.
Scheduled events:
Action list:
6) Keep track of Cocotime allocations for diskservers as we move to the new stager. (action on X..)
9) Review of operator and sys-admin procedures for Castor machines (David)
11) Move databases off JTT machines (David and Jan)
12) Vlado and Jean-Damien to solve slc3 issues on lxfs4501 etc - also to consider other cases: slc3 stager running slc3 diskservers, and redhat stager running redhat+slc3 diskservers. This is important for the castor production service and Charles' machines.
13) Vlado - this week - will make sure that there are operational procedures known to us/operators (maybe sysadmins) for all machines in stagegridsc.
16) Protect tapeservers against unauthorized read/write access ('ihep02' incident) castor-dev (Ben to investigate ??).
17) Ops think it's best to disable a tape that is stuck on a tapedrive that is DOWN (rather than have castor try to mount it elsewhere endlessly); Ben will investigate where such a fix could be introduced.