Virtual Nodes On Demand Documentation
This twiki was outdated and is now deleted. Go to the first link instead.
Troubleshooting
Guide to resolve problems with the management of virtual machines
Issuing the 'Terminate' command in the portal doesn't work
Check whether the host running the VM is alive: ping "machine"
1. If the machine does not respond to ping, an alarm is automatically raised and someone in the CC will check and resolve it.
2. If the machine responds to ping, connect by ssh to it and execute the following commands to terminate the VM:
Execute: xm list, check that the respective machine is listed, and take note of its ID number.
Execute: xm destroy ID-number and confirm that the machine is no longer in xm list.
Execute: lvs and check the logical volumes of the machine that you just destroyed.
The lvs output has the form: "xen-root-username_ctb-generic-number vg1 -wi-ao 5.00G" and "xen-swap-username_ctb-generic-number.cern.ch vg1 -wi-ao 512.00M"
Execute: lvremove /dev/vg1/xen-root-username_ctb-generic-number
Execute: lvremove /dev/vg1/xen-swap-username_ctb-generic-number
Use the volume names exactly as lvs prints them, then recheck with the lvs command that both volumes have been removed.
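The volume lookup in the steps above can be sketched as a small shell helper. This is only a sketch under assumptions: the function name vm_lv_paths is hypothetical, the volume group is taken to be vg1 as in the example output, and the helper only prints candidate paths so that each one can be reviewed before it is passed to lvremove by hand.

```shell
# Hypothetical helper: reads logical volume names on stdin (one per line)
# and prints the /dev/vg1 paths of the root and swap volumes of VM "$1".
vm_lv_paths() {
    awk -v vm="$1" '
        {
            name = $1
            if (name !~ /^xen-(root|swap)-/) next
            i = index(name, "_" vm)
            if (i == 0) next
            # Accept only an exact VM name, optionally followed by a domain
            # suffix, so that ctb-generic-1 does not match ctb-generic-12.
            rest = substr(name, i + length(vm) + 1)
            if (rest == "" || substr(rest, 1, 1) == ".")
                print "/dev/vg1/" name
        }'
}

# Possible usage on the host (assumes the volume group is vg1):
#   lvs --noheadings -o lv_name vg1 | vm_lv_paths ctb-generic-5
```

Printing the paths instead of removing them directly keeps the destructive lvremove step manual, which matches the recheck that the procedure asks for.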
Check in the portal that the machine is set to available. If it is not, execute the following steps:
Connect by ssh to the server, locate the file vnodevmstate.cfg (for example with find), find the entry [ctb-generic-number] for the machine that you just destroyed, and set its state to notDeployed.
Example:
[ctb-generic-number]
state = notDeployed
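A minimal sketch of the state reset described above, assuming vnodevmstate.cfg is a plain INI-style file laid out as in the example. The function name set_not_deployed and the .bak backup copy are illustrative and not part of the original procedure.

```shell
# Hypothetical helper: sets "state = notDeployed" in section [$1] of file $2.
set_not_deployed() {
    vm=$1
    cfg=$2
    cp -p "$cfg" "$cfg.bak" || return 1      # keep a backup (assumption)
    tmp=$(mktemp) || return 1
    awk -v sect="[$vm]" '
        /^\[/ { in_sect = ($0 == sect) }     # track the current section
        in_sect && /^[ \t]*state[ \t]*=/ { print "state = notDeployed"; next }
        { print }
    ' "$cfg" > "$tmp" || { rm -f "$tmp"; return 1; }
    # Write back in place so the owner and permissions of the file are kept.
    cat "$tmp" > "$cfg"
    rm -f "$tmp"
}

# Possible usage, with the path found beforehand:
#   set_not_deployed ctb-generic-5 /path/to/vnodevmstate.cfg
```

Only the state line inside the named section is rewritten; all other sections are copied through unchanged.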
Portal is down
Check whether the server is alive: ping "machine"
1. If the machine does not respond to ping, an alarm is automatically raised and someone in the CC will check and resolve it.
2. If the machine responds to ping, connect by ssh to it and execute the following command to restart the web server:
Execute: /etc/init.d/httpd restart and check if the portal is up and running.
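The ping check and restart above could be wrapped roughly as follows. This is a sketch, not the portal's own tooling: the function name portal_next_step, the overridable PING variable, and the ping options (one packet, two-second timeout, as on Linux) are assumptions. The helper only prints the suggested next step and does not run it.

```shell
# PING can be overridden, which also allows the logic to be tried offline.
PING=${PING:-ping -c 1 -W 2}

# Hypothetical helper: prints the suggested next step for portal server "$1".
portal_next_step() {
    if $PING "$1" >/dev/null 2>&1; then
        # Host is alive: restart the web server over ssh.
        echo "ssh $1 /etc/init.d/httpd restart"
    else
        # Host is down: the alarm is raised automatically, nothing to do here.
        echo "wait: alarm raised, the CC will check $1"
    fi
}
```

After running the printed restart command, check in a browser that the portal is up again.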
Cannot deploy virtual machines
Check whether the host is alive: ping "machine"
1. If the machine does not respond to ping, an alarm is automatically raised and someone in the CC will check and resolve it. In the meantime, choose another machine:
ctb-generic-[1-27] can only be deployed on lcgctb[3-8].cern.ch
vtb-generic-[1-30] can only be deployed on lxxen104.cern.ch
ctb-generic-[40-79] can only be deployed on lxxen007.cern.ch, lxxen009.cern.ch, lxxen010.cern.ch, lxxen011.cern.ch, lxxen012.cern.ch, lxxen013.cern.ch, lxxen015.cern.ch, lxxen018.cern.ch, lxxen024.cern.ch, lxxen025.cern.ch, lxxen026.cern.ch
2. The physical machine may not have enough memory or disk space. Try reducing these values or choose another machine.
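The placement rules above can be encoded as a lookup, sketched here in shell. The function name allowed_hosts is hypothetical, and the sketch assumes that lcgctb[3-8] expands to lcgctb3 through lcgctb8 without zero padding.

```shell
# Hypothetical helper: prints the hosts on which VM "$1" may be deployed,
# one per line, and returns non-zero if the name matches no known range.
allowed_hosts() {
    name=$1
    prefix=${name%-*}     # e.g. ctb-generic
    n=${name##*-}         # e.g. 5
    case $n in
        ''|*[!0-9]*) return 1 ;;   # the suffix must be a number
    esac
    if [ "$prefix" = ctb-generic ] && [ "$n" -ge 1 ] && [ "$n" -le 27 ]; then
        for i in 3 4 5 6 7 8; do echo "lcgctb$i.cern.ch"; done
    elif [ "$prefix" = vtb-generic ] && [ "$n" -ge 1 ] && [ "$n" -le 30 ]; then
        echo "lxxen104.cern.ch"
    elif [ "$prefix" = ctb-generic ] && [ "$n" -ge 40 ] && [ "$n" -le 79 ]; then
        for i in 007 009 010 011 012 013 015 018 024 025 026; do
            echo "lxxen$i.cern.ch"
        done
    else
        return 1
    fi
}

# Possible usage:
#   allowed_hosts ctb-generic-42
```

A name outside the listed ranges, such as ctb-generic-30, produces no output and a non-zero exit status, which signals that no documented host exists for it.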
For more serious problems, contact:
Ricardo.MendesNOSPAM@cernNOSPAMPLEASE.ch
--
RicardoMendes - 12 Apr 2008