-- Main.delgadop - 16 Aug 2005
Operator's Guide for LFC nodes
LFC = LCG File Catalog
Applies to nodes: lfc001 - lfc011
For the following alarms:
NO_CONTACT
SWAP_FULL
TMP_FULL
VAR_FULL
ROOT_FS_FULL
HIGH_LOAD
Any other alarm:
Operator's Guide for FTS nodes
FTS = File Transfer Service
Applies to nodes:
fts001 fts002 fts003 fts004 fts005 fts006
lxb1386 lxb1387 lxb1388 lxb1389
lxshare021d lxshare026d
For the following alarms:
NO_CONTACT
HIGH_LOAD
VM_KILL
KERNELPANIC
SWAP_FULL
TMP_FULL
VAR_FULL
ROOT_FS_FULL
- make a log-only entry
- send mail to hep-service-sc-level2@cern.ch
- if the alarm occurred on LXSHARE026D, call 164111 or 164222 at any time (24x7) and inform them about the problem
- otherwise, call 164111 or 164222 between 9:00 and 18:00 on working days and inform them about the problem
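The calling rule above can be sketched as a small decision function. This is only an illustration, not part of any existing operator tooling: the function name `should_call` and the constant `ALWAYS_CALL_NODES` are hypothetical, working days are assumed to be Monday to Friday (public holidays are ignored), and 18:00 sharp is assumed to fall outside the calling window.

```python
from datetime import datetime

# Nodes for which the guide asks for a phone call at any time (24x7).
ALWAYS_CALL_NODES = {"lxshare026d"}


def should_call(node: str, when: datetime) -> bool:
    """Return True if the operator should phone 164111 or 164222.

    The log entry and the mail are always made; this only covers the call.
    """
    if node.lower() in ALWAYS_CALL_NODES:
        return True
    # Assumption: working days are Monday (0) to Friday (4); holidays ignored.
    is_working_day = when.weekday() < 5
    # Assumption: the window is 9:00 inclusive to 18:00 exclusive.
    in_working_hours = 9 <= when.hour < 18
    return is_working_day and in_working_hours
```

For example, `should_call("fts001", datetime(2005, 8, 20, 10, 0))` is false because that date is a Saturday, while the same call for `lxshare026d` is true.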
Any other alarm:
Full list of possible alarms:
high_load
swap_full
tmp_full
var_full
root_fs_full
CrashDump_found
UnmountedSwaps
DMA_disabled
snmpd_wrong
klogd_wrong
sshd_wrong
atd_wrong
beniced_wrong
sendmail_wrong
portmap_wrong
syslogd_wrong
postfix_wrong
nfsd_wrong
xfs_wrong
tftpd_wrong
cpu_wrong
Spma_or_Asis_Error
edg_cdp_listend_wrong
crond_wrong
named_wrong
var_lock
iss_nologin
nomorestage
nomorerfio
nomoremigr
xinetd_wrong
rpc_statd_wrong
nscd_wrong
ntpd_wrong
ssh_bin_wrong
var_unwriteable
notd_wrong
KernelPanic
VM_kill
ExtFsWarning
UncorrectableError
MachineCheckException
IO_ERROR
FILESYSTEM_ERROR
3ware_error
3wareTimeout
+ NO_CONTACT (= machine did not send a heartbeat)
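The full list spells some alarms differently from the per-service lists (for example KernelPanic versus KERNELPANIC, and VM_kill versus VM_KILL). A minimal sketch of a case-insensitive lookup, assuming the names otherwise match exactly; `FTS_HANDLED` and `is_handled` are hypothetical names used only for this illustration:

```python
# Alarm names with a documented procedure in the FTS section of this guide.
FTS_HANDLED = [
    "NO_CONTACT", "HIGH_LOAD", "VM_KILL", "KERNELPANIC",
    "SWAP_FULL", "TMP_FULL", "VAR_FULL", "ROOT_FS_FULL",
]

# Pre-compute an upper-cased set so lookups ignore letter case.
_HANDLED_UPPER = {name.upper() for name in FTS_HANDLED}


def is_handled(alarm: str) -> bool:
    """Return True if the alarm has a documented FTS procedure."""
    return alarm.strip().upper() in _HANDLED_UPPER
```

With this, `is_handled("KernelPanic")` and `is_handled("VM_kill")` are both true, while `is_handled("sshd_wrong")` is false and falls under "Any other alarm".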