Virtual Nodes On Demand Documentation

New link: http://vnode.web.cern.ch/vnode/index.html

This TWiki page was outdated and has now been deleted. Use the link above instead.

Troubleshooting

Guide to resolve problems with the management of virtual machines

Issuing the 'Terminate' command in the portal doesn't work

Check if the respective host is alive - ping "machine"

1. If the machine does not respond to ping, an alarm is automatically raised and someone in the CC will check and resolve it.

2. If the machine responds to ping, connect by ssh to it and execute the following commands to terminate the VM:

Execute: xm list, check that the respective machine is listed, and note its ID number.

Execute: xm destroy ID-number and confirm that the machine no longer appears in xm list.

Execute: lvs and check the respective logical volumes of the machine that you just destroyed.

The lvs output has entries of the form: "xen-root-username_ctb-generic-number vg1 -wi-ao 5.00G" and "xen-swap-username_ctb-generic-number.cern.ch vg1 -wi-ao 512.00M"

Execute: lvremove /dev/vg1/xen-root-username_ctb-generic-number

Execute: lvremove /dev/vg1/xen-swap-username_ctb-generic-number

Recheck with the lvs command that both partitions have been removed.
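
The manual steps above can be sketched as two small shell helpers. This is only an illustration, not part of the original procedure: the function names domain_id and cleanup_commands are hypothetical, and the sketch assumes that xm list prints a header line followed by one line per domain with the name in the first column and the ID in the second, and that the domain name matches the username_ctb-generic-number part of the volume names. The commands are printed rather than executed so that they can be reviewed first.

```shell
#!/bin/sh
# Hypothetical helper: read `xm list` output on stdin and print the ID
# of the domain whose name is given as $1.
# Assumes a header line, then name in column 1 and ID in column 2.
domain_id() {
    awk -v name="$1" 'NR > 1 && $1 == name { print $2 }'
}

# Hypothetical helper: print (do not run) the cleanup commands for a VM.
# $1 is the name part used in the logical volume names, $2 is the domain ID.
# The volume paths follow the lvremove commands given in this guide.
cleanup_commands() {
    echo "xm destroy $2"
    echo "lvremove /dev/vg1/xen-root-$1"
    echo "lvremove /dev/vg1/xen-swap-$1"
}

# Possible use on the host (the VM name here is a placeholder):
#   vm=username_ctb-generic-number
#   id=$(xm list | domain_id "$vm")
#   [ -n "$id" ] && cleanup_commands "$vm" "$id"
```

After running the printed commands, xm list and lvs should still be checked by hand as described above.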

Check in the portal that the machine is set to available. If it is not, execute the following steps:

Connect by ssh to the server, locate the file vnodevmstate.cfg (for example with find), find the entry [ctb-generic-number] for the machine that you just destroyed, and set its state to notDeployed.

Example:

[ctb-generic-number]
state = notDeployed
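
As a rough sketch, the edit above could also be done with awk instead of a text editor. The function name set_not_deployed is hypothetical, and the sketch assumes that vnodevmstate.cfg is INI-like, with each [name] header on its own line followed by a "state = ..." line; if the real file is laid out differently, this will not apply. The function writes the updated file to standard output and leaves the original untouched.

```shell
#!/bin/sh
# Hypothetical helper: print the config file $1 with the state of
# section [$2] set to notDeployed. Other sections are left unchanged.
# Assumes "[name]" headers on their own line and a "state = ..." line
# inside each section.
set_not_deployed() {
    awk -v sec="[$2]" '
        /^\[/ { in_sec = ($0 == sec) }          # track the current section
        in_sec && /^[[:space:]]*state[[:space:]]*=/ {
            print "state = notDeployed"         # rewrite only this entry
            next
        }
        { print }
    ' "$1"
}

# Possible use on the server (the path is whatever find reports):
#   cfg=$(find / -name vnodevmstate.cfg 2>/dev/null | head -n 1)
#   set_not_deployed "$cfg" ctb-generic-number > /tmp/vnodevmstate.cfg.new
#   diff "$cfg" /tmp/vnodevmstate.cfg.new    # review before replacing
```

Writing to a separate file and comparing with diff keeps the change reviewable before the original is replaced.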

Portal is down

Check if the server is alive - ping "machine"

1. If the machine does not respond to ping, an alarm is automatically raised and someone in the CC will check and resolve it.

2. If the machine responds to ping, connect by ssh to it and execute the following command to restart the web server:

Execute: /etc/init.d/httpd restart and check if the portal is up and running.
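
To check whether the portal is up without opening a browser, something like the following could be used. It is a minimal sketch: the function name portal_is_up is hypothetical, the portal URL is not given in this guide and must be supplied by the caller, and treating any 2xx or 3xx response as "up" is an assumption.

```shell
#!/bin/sh
# Hypothetical helper: return 0 if the URL in $1 answers with an
# HTTP 2xx or 3xx status within 10 seconds, 1 otherwise.
portal_is_up() {
    # -s: silent, -o /dev/null: discard the body, -m 10: overall timeout,
    # -w '%{http_code}': print only the status code (000 on failure).
    code=$(curl -s -o /dev/null -m 10 -w '%{http_code}' "$1") || code=000
    case "$code" in
        2??|3??) return 0 ;;
        *)       return 1 ;;
    esac
}

# Possible use (PORTAL_URL is a placeholder for the real portal address):
#   portal_is_up "$PORTAL_URL" || echo "portal still down after restart"
```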

Cannot deploy virtual machines

Check if the host is alive - ping "machine"

1. If the machine does not respond to ping, an alarm is automatically raised and someone in the CC will check and resolve it. Meanwhile, choose another machine, subject to these constraints:

ctb-generic-[1-27] can only be deployed on lcgctb[3-8].cern.ch

vtb-generic-[1-30] can only be deployed on lxxen104.cern.ch

ctb-generic-[40-79] can only be deployed on lxxen007.cern.ch, lxxen009.cern.ch, lxxen010.cern.ch, lxxen011.cern.ch, lxxen012.cern.ch, lxxen013.cern.ch, lxxen015.cern.ch, lxxen018.cern.ch, lxxen024.cern.ch, lxxen025.cern.ch, lxxen026.cern.ch

2. The physical machine may not have enough memory or disk space - try reducing these values or choose another machine.
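
The name-to-host constraints listed above can be written down as a small lookup, which may help when choosing another machine. This is a sketch only: the function name allowed_hosts is hypothetical, and it assumes that lcgctb[3-8] stands for the six hosts lcgctb3 through lcgctb8.

```shell
#!/bin/sh
# Hypothetical helper: print the hosts on which the VM named in $1
# (for example ctb-generic-12) may be deployed, one per line.
# Returns 1 if the name does not fall in any range listed in this guide.
allowed_hosts() {
    prefix=${1%-*}    # part before the last dash, e.g. ctb-generic
    num=${1##*-}      # part after the last dash, e.g. 12
    case "$num" in
        ''|*[!0-9]*) return 1 ;;    # not a plain number
    esac
    if [ "$prefix" = "ctb-generic" ] && [ "$num" -ge 1 ] && [ "$num" -le 27 ]; then
        # Assumption: lcgctb[3-8] means lcgctb3 ... lcgctb8
        for i in 3 4 5 6 7 8; do echo "lcgctb$i.cern.ch"; done
    elif [ "$prefix" = "vtb-generic" ] && [ "$num" -ge 1 ] && [ "$num" -le 30 ]; then
        echo "lxxen104.cern.ch"
    elif [ "$prefix" = "ctb-generic" ] && [ "$num" -ge 40 ] && [ "$num" -le 79 ]; then
        for i in 007 009 010 011 012 013 015 018 024 025 026; do
            echo "lxxen$i.cern.ch"
        done
    else
        return 1
    fi
}

# Possible use:
#   allowed_hosts ctb-generic-45
```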

For more serious problems contact Ricardo.MendesNOSPAM@cernNOSPAMPLEASE.ch

-- RicardoMendes - 12 Apr 2008

Topic revision: r25 - 2008-09-12 - RicardoMendes
 