Saturday, September 19, 2009

VMotion keeps failing at 78% with error bad0007

VMotion was failing at 78% with the error message "A general system error occurred: Failed waiting for data. Error bad0007."

This was a brand new cluster, and everything was standard and matched our other clusters (ESX version, host physical config, etc.).

VMotion was failing at 78% for all VMs within the cluster, so I tried the following:

  • Tried to VMotion other VMs within this cluster; all of them failed at 78%
  • Implemented KB 1003577 by installing Update Manager
  • Created a baseline and updated the 2 ESX hosts involved; both ESX servers are now compliant with the latest patches
  • Created a new cluster called "Test Cluster" and added the two ESX hosts to it
  • Cold-migrated a VM over and tried a VMotion; it still failed at the same stage (78%)
  • Checked/changed the following variables according to KB 1003577:

Migrate.PageInTimeoutResetOnProgress: set the value to 1.

Migrate.PageInProgress: set the value to 30 if you get an error after configuring the Migrate.PageInTimeoutResetOnProgress variable.

Toggle the Migrate.Enabled setting from 1 to 0, click OK, then flip the value back to 1 and click OK.

Both variables were already identical to what the KB recommended.

Tried a VMotion again; it failed at 78%.

  • Tried VMotioning servers on different datastores; this failed too.
  • Swapped the VMotion network to rule out a problem with the type of NIC being used; it made no difference. It still failed at 78%.
  • Reduced the amount of RAM assigned to the VM from 4GB to 2GB; it still kept failing.
  • Installed VMware Tools on this VM.

I created a brand new test VM and it VMotioned fine, so it looked like something was fishy about the VM itself. We started looking at the logs and found the following.

The vmware.log for the virtual machine being VMotioned had entries similar to:

---------------------------------------------------------------------------------------------------------------------------------------

May 26 12:06:08.162: vmx Migrate_SetFailure: Now in new log file.

May 26 12:06:08.167: vmx Migrate_SetFailure: Failed to write checkpoint data (offset 33558528, size 16384): Limit exceeded

May 26 12:06:08.186: vmx Msg_Post: Error

May 26 12:06:08.186: vmx [vob.vmotion.write.outofbounds] VMotion [c0a8644e:1243364928717250] failed due to out of bounds write: offset 33558528 or size 16384 is greater than expected

May 26 12:06:08.186: vmx [msg.checkpoint.migration.openfail] Failed to write checkpoint data (offset 33558528, size 16384): Limit exceeded.

May 26 12:06:08.187: vmx ----------------------------------------

May 26 12:06:08.190: vmx MigrateWrite: failed: Limit exceeded

--------------------------------------------------------------------------------------------------------------------------------------

When the video RAM setting was commented out, the following error appeared in the log file for the VM:

---------------------------------------------------------------------------------------------------------------------------------------

Aug 18 08:35:38.194: vmx MKS REMOTE Loading VNC Configuration from VM config file

Aug 18 08:35:38.196: vmx DVGA: Full screen VGA will not be available.

Aug 18 08:35:38.196: vmx Msg_Post: Warning

Aug 18 08:35:38.196: vmx [msg.svgaUI.badLimits] The size of video RAM is currently limited to 4194304 bytes, which is insufficient

for the configured maximum resolution of 3840x1200 at 16 bits per pixel.

Aug 18 08:35:38.196: vmx

Aug 18 08:35:38.196: vmx The maximum resolution is therefore being limited to 1180x885 at 16 bits per pixel.

Aug 18 08:35:38.196: vmx

Aug 18 08:35:38.196: vmx ----------------------------------------

Aug 18 08:35:38.199: vmx SVGA: Truncated max res to VRAM size: 4194304 bytes VRAM, 1180x885

--------------------------------------------------------------------------------------------------------------------------------------

I then checked the .vmx for the problematic server and found:

svga.vramSize = 4194304

whereas a newly created VM has the following setting in its .vmx file:

svga.vramSize=31457280
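Those two numbers line up with the resolution complaint in the log. A quick sanity check (the width × height × bytes-per-pixel framebuffer estimate is my own back-of-the-envelope assumption; the resolutions and sizes come straight from the log and .vmx entries above):

```python
# Framebuffer estimate: width * height * bytes-per-pixel (my assumption).
def framebuffer_bytes(width, height, bits_per_pixel):
    return width * height * bits_per_pixel // 8

# The resolution the VM was configured for, per the log: 3840x1200 @ 16 bpp.
needed = framebuffer_bytes(3840, 1200, 16)
print(needed)                              # 9216000 -- more than the 4194304-byte vramSize

# The fallback resolution the VMX chose, 1180x885 @ 16 bpp, does fit in 4 MB:
print(framebuffer_bytes(1180, 885, 16))    # 2088600

# The freshly created VM's svga.vramSize, in MB:
print(31457280 // (1024 * 1024))           # 30
```

9,216,000 bytes needed versus a 4,194,304-byte vramSize is why the VMX truncated the maximum resolution, and why a VM created with the default 30MB setting behaved.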

After investigating with the application owner, we found that these are desktop modeling workstations running software from HP, and HP had recommended this setting for their software to function.

One of the KB articles quotes this condition: "Video RAM (VRAM) assigned to the virtual machine is 30MB or less."

I checked with HP support and they asked me to contact VMware. When we contacted VMware, they said it is a limitation of ESX 3.5 that has been taken care of in ESX 4.0 U1.

........................................................................Crap …................................................................................

How to present a 6TB LUN to a Windows server

I attached a 6TB iSCSI LUN to my VCB backup server, and this is what I found under diskmgmt.msc.

I checked whether the volume had been split by the SAN admin, but that was not the case; it was all part of the same volume. So why did Windows 2003 show it as two volumes of 2TB and 4TB?

Then I realized that with MBR I can have only 2TB of a LUN, and to get all 6TB of the LUN I needed to convert the disk to GPT (GUID Partition Table).

How to do it? Just right-click on the disk and you will get the option "Convert to GPT Disk".

What I have read and understood is this:

MBR is the standard partitioning scheme that has been used on hard disks since the PC first came out. It supports 4 primary partitions per hard drive and a maximum partition size of 2TB.

GPT disks are newer and are readable only by Windows Server 2003 SP1, Windows Vista (all versions), and Windows XP x64 Edition. A GPT disk can support a volume up to 2^64 blocks in length (for 512-byte blocks, this is 9.44 ZB - zettabytes; 1 ZB is 1 billion terabytes). It can also support a theoretically unlimited number of partitions.

Windows restricts these limits further to 256 TB for a single partition (NTFS limit), and 128 partitions.

Only Itanium systems running Windows Server 2003, and Windows Vista systems with EFI firmware, can boot from a GPT disk. The other operating systems mentioned above can use GPT disks as data disks but not as boot disks.
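The limits above, and the odd 2TB/4TB split Windows showed, follow directly from the sector math. A quick sketch (assuming 512-byte sectors, standard at the time; sizes in binary TB for simplicity):

```python
SECTOR = 512  # bytes per sector

# MBR records partition sizes in 32-bit sector counts, so its ceiling is:
mbr_limit = (2**32) * SECTOR
print(mbr_limit // 2**40)              # 2 (TB) per MBR disk

# Which is why a 6TB LUN shows up as 2TB usable plus a 4TB remainder:
six_tb = 6 * 2**40
print((six_tb - mbr_limit) // 2**40)   # 4 (TB)

# GPT uses 64-bit block addresses, so the on-disk ceiling is:
gpt_limit = (2**64) * SECTOR
print(round(gpt_limit / 10**21, 2))    # 9.44 (decimal zettabytes)
```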

To find out more about GPT, read this article from MS.

Tuesday, September 15, 2009

How to reset the root password on Ubuntu

Yes, I have started learning Ubuntu. I installed the Ubuntu 8.10 desktop, and during installation it asked me to create a user account but never prompted me for a root password. So once I logged in, I did not know what the root password was. (Ubuntu does not set a root password by default; administrative tasks go through sudo.)

When I did su, it asked me for a password, so now I was in a fix: how do I change the password for root?

To change/reset the password:

:~$ sudo sh

It will ask for the currently logged-in user's password, and then you will be at a # (root) prompt.

Type passwd there and it will set the password for root.

Monday, August 17, 2009

How to find out which HBA is connected on an ESX host

I wanted to figure out which HBA port is connected/cabled. I had been using QLogic QLE406x cards, and I would usually press Ctrl+S during boot to configure them through the BIOS; that way I could ping the target to find out which port was cabled.

The reason is that we don't cable all of the available ports, and ESX is not kind enough to tell you which port is which; it just enumerates them as vmhba0, 1, 2, 3, 4.

I asked VMware support the same question but got no clue. I finally found the answer on the QLogic download site: go to http://www.qlogic.com, open the Downloads section, select your iSCSI HBA, and choose Linux Red Hat (32-bit) from the select menu.


Download the SANsurfer iscli (x86/Intel 64) file (the latest available from the site) to your local machine. Unzip the file to a folder, then browse a datastore and upload the entire folder.



SSH into the host and cd to the location where the folder was uploaded. You will find the following files under it:



We have to install "iscli-1.2.00-15_linux_i386.rpm". To install it, use the following command:

[root@xxxx]# rpm -ivh iscli-1.2.00-15_linux_i386.rpm

Preparing... ########################################### [100%]

1:iscli ########################################### [100%]

Installation completed successfully.

Now type the command iscli; this will start a nice interactive menu for configuring, troubleshooting, and updating firmware on the HBA.


From the menu, select option 7 and choose among the HBAs detected by the ESX host. Whichever HBA is cabled will show its link as UP. Select that HBA, configure an IP on it, and perform the ping test.

Enable/disable maintenance mode from the command line

To enable maintenance mode:

vimsh -n -e /hostsvc/maintenance_mode_enter

and to disable it:

vimsh -n -e /hostsvc/maintenance_mode_exit