Monday, April 5, 2010

Upgrading Virtual Center from 2.5 to 4.0

Upgrading from Virtual Center 2.5 to Virtual Center 4.0 U1 was one of the best upgrades I have done. I have done many upgrades where you have to go through the pain of redoing work after the upgrade, but the 4.0 U1 upgrade was the smoothest and cleanest one. Just pop in the CD and click Next, Next, and you are done.

Here is how I performed it:

1. Make sure you have exported all the relevant information from VC so that you can recover if any issue comes up during the upgrade. Also make sure you have a recent backup of the SQL Server database used by VC. Then start from the VC ISO menu and choose the appropriate option.

2. The installer then detects the VC instance that is already running.

3. Accept the license agreement and proceed.

4. Fill in the correct information.

5. It may show a message like this; let your SQL admin know about it.

6. Type a user name with which it can authenticate. This can be your own user name, since it is used only for authentication.

7. Choose the option as shown and check the box mentioned, or else it will not let you move forward.

8. This is the option for choosing the account that the service runs as. Best practice is to run it with the system account.

9. Leave this at the default.

10. Relax and sit tight until it is done.

11. It does numerous things during this process.

12. Finally it shows a screen like this.

Congratulations, you have completed the VC upgrade successfully. In my environment of 100+ hosts, I only came across one host that was left in a disconnected state. The upgrade also understands VC 2.5 licensing.

Wednesday, March 31, 2010

Adding additional vmfs volume on ESX host

We often come across the situation where the local VMFS volume is exhausted and you have to add an additional VMFS volume. I have written two different blog posts on the same topic.

One covered how to use ACU from the web, and the other covered how to create an additional VMFS volume if you are not able to see the logical volume.

In this one we add additional drives to an available DL380 and then create an additional VMFS partition.

1. Reboot the ESX host and enter the ACU (Array Configuration Utility) BIOS by pressing F8 on the keyboard. This brings up a screen with the menu for configuring the array. Select the first option, as shown below.

clip_image002

Once we select that, the next screen shows all the available disks. Make sure that you select RAID 5, and then press Enter to create the logical drive.

clip_image004

It then shows a summary of the total logical space and asks you to press F8 to save the configuration.

clip_image006

Continue to see the logical volume that was created.

clip_image008

It then shows the summary along with the newly created logical volume. Here you can also see the previous logical volume.

clip_image010

Reboot the host and go to Virtual Center. Select the host's Configuration tab and then choose Add Storage.

clip_image012

This runs the wizard for creating a new VMFS partition. Make sure that you select Disk/LUN.

clip_image014

It then shows the newly created logical disk.

clip_image016

It shows a warning, but continue anyway.

clip_image018

Give it a name such as _storage.

clip_image020

Go with the default.

clip_image022

At the end it provides a summary.

clip_image024

Now you will have two VMFS volumes: the old one and the newly created one.

clip_image026

Tuesday, March 30, 2010

Step by Step: Installing HP SIM on ESX4.0

Today I installed HP SIM on a BL460c G6. I followed my earlier blog post, with a few changes to the answers given to the install script.

First of all, you need to download the correct SIM agent for your hardware. Select your server model, and then from the “Software – System Management” page download the correct tar file (hpmgmt-8.3.1-vmware4x.tgz). Follow the earlier blog post up to step 13.

The answers to the installer then look like this:

[root@xxxx]# ./install831vibs.sh --install

HP Insight Manager Agent 8.3.1-01 Installer for VMware ESX

Target System is VMware ESX 4.0.0 build-208167

Server:  ProLiant BL460c G6

1. This script will now attempt to install the HP Insight Manager Agents.

Do you wish to continue? (y/n)    yes

2. For accessing the System Management Homepage, the port for hpim service (2381) should be enabled in the firewall. Do you want to enable this port? <y/n> (default is y) yes

3. For allowing discovery by HP System Insight Manager, the port (2301) should be enabled in the firewall. Do you want to enable this port? <y/n> (default is y) yes

4. For adding the HP Systems Insight Manager Certificate in SMH, the port [280] should be enabled in the firewall. Do you want to enable this port? <y/n> (default is y) yes

5. Do you wish to use an existing snmpd.conf (y/n) (Blank is n): no

6. Enter the localhost SNMP Read/Write community string (one word, required, no default):  public

7. Enter localhost SNMP Read Only community string (one word, Blank to skip): public

8. Enter Read/Write Authorized Management Station IP or DNS name (Blank to skip): <IP of SIM server>

9. Enter SNMP Read/Write community string for Management Station "<IP of SIM server>" (one word, required, no default): public

10. Enter Read Only Authorized Management Station IP or DNS name (Blank to skip): <Blank if you don’t have one>

11. Enter default SNMP trap community string (One word; Blank to skip): <Blank if you don’t have one>

12. Enter SNMP trap destination IP or DNS name (One word; Blank to skip): <Blank if you don’t have one>

13. Enter system contact information (Name, phone, room, etc; Blank to skip): <Blank if you don’t have one>

14. Enter system location information (Building, room, etc; Blank to skip): <Blank if you don’t have one>
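Once the installer has opened the firewall ports, it can be handy to confirm from another machine that they are actually reachable. The following is only a minimal sketch of such a check, assuming plain TCP reachability is a good enough signal; the function names and the port descriptions are my own, and only the port numbers (2381, 2301, 280) come from the prompts above.

```python
import socket

# Port numbers taken from the installer prompts above; the labels are informal.
SMH_PORTS = {
    2381: "System Management Homepage",
    2301: "discovery by HP SIM",
    280: "HP SIM certificate exchange",
}


def port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused, filtered, unreachable or timed out: treat all as "not open".
        return False


def check_host(host):
    """Map each agent port to True/False for the given ESX host."""
    return {port: port_open(host, port) for port in SMH_PORTS}
```

Calling `check_host("esx01.example.com")` (a hypothetical host name) returns a dictionary keyed by port number, so a `False` entry points at the firewall prompt that may need to be answered again.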

The System page has changed from the earlier version:

clip_image002

And this is how it looks inside:

image

If you want to do this for a 3.5 host, then follow this blog.

Friday, March 26, 2010

Troubleshooting: Set retry timeout for failed TaskMgmt abort for CmdSN

1. We were having an issue with one of the ESX hosts, which had 3 VMs with multiple RDM LUNs on it. The host was running fine, but the VMs were getting a BSOD with the following error message.

The host was running fine, but the vmkernel log had the following messages:

vmkernel: 0:00:55:23.452 cpu4:1066)LinSCSI: 3201: Abort failed for cmd with serial=0, status=bad0001, retval=bad0001

Mar 25 22:01:57 xxx vmkernel: 0:00:55:23.458 cpu4:1066)WARNING: ScsiPath: 3802: Set retry timeout for failed TaskMgmt abort for CmdSN 0x0, status Failure, path vmhba1:C0:T0:L2

Mar 25 22:02:37 xxxx vmkernel: 0:00:56:03.465 cpu4:1066)LinSCSI: 3201: Abort failed for cmd with serial=0, status=bad0001, retval=bad0001

Mar 25 22:02:37 xxx vmkernel: 0:00:56:03.471 cpu4:1066)WARNING: ScsiPath: 3802: Set retry timeout for failed TaskMgmt abort for CmdSN 0x0, status Failure, path vmhba1:C0:T0:L2

Mar 25 22:02:41 xxxx vmkernel: 0:00:56:06.931 cpu4:1062)VSCSI: 3183: Retry 0 on handle 8202 still in progress after 62 seconds
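When the log is long, it helps to pull out just the paths named in these warnings. Below is a rough sketch that counts the failed-abort warnings per path; the regular expression is written only against the message format shown above, and the function names are my own, so treat it as an illustration and not a general vmkernel log parser.

```python
import re
from collections import Counter

# Matches the ScsiPath warning shown above and captures the affected path.
ABORT_WARNING = re.compile(
    r"WARNING: ScsiPath: \d+: Set retry timeout for failed TaskMgmt abort "
    r"for CmdSN (?P<cmdsn>0x[0-9a-fA-F]+), status (?P<status>\w+), "
    r"path (?P<path>vmhba\d+:C\d+:T\d+:L\d+)"
)


def failing_paths(log_lines):
    """Count failed-abort warnings per storage path."""
    counts = Counter()
    for line in log_lines:
        match = ABORT_WARNING.search(line)
        if match:
            counts[match.group("path")] += 1
    return counts


def lun_of(path):
    """Return the LUN number from a path such as vmhba1:C0:T0:L2."""
    return int(path.rsplit(":L", 1)[1])


sample = [
    "vmkernel: 0:00:55:23.452 cpu4:1066)LinSCSI: 3201: Abort failed for cmd "
    "with serial=0, status=bad0001, retval=bad0001",
    "Mar 25 22:01:57 xxx vmkernel: 0:00:55:23.458 cpu4:1066)WARNING: ScsiPath: "
    "3802: Set retry timeout for failed TaskMgmt abort for CmdSN 0x0, "
    "status Failure, path vmhba1:C0:T0:L2",
    "Mar 25 22:02:37 xxx vmkernel: 0:00:56:03.471 cpu4:1066)WARNING: ScsiPath: "
    "3802: Set retry timeout for failed TaskMgmt abort for CmdSN 0x0, "
    "status Failure, path vmhba1:C0:T0:L2",
]

for path, hits in failing_paths(sample).items():
    print(path, "LUN", lun_of(path), "warnings:", hits)
```

For the sample lines this reports a single path, vmhba1:C0:T0:L2, which is the LUN 2 we went looking for in the next steps.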

2. We tried to find out which LUN it was, using the SCSI HBA tool. This tool shows all the VMFS partitions; since all the LUNs were configured as RDMs, it did not list any of them.

[root@xxxx log]# esxcfg-vmhbadevs -m

vmhba0:0:0:3 /dev/cciss/c0d0p3 496a5f4e-dda2c50a-1326-00237d5adda0

3. We then listed the managed paths for the LUNs:

Disk vmhba3:0:6 /dev/sdf (25600MB) has 1 paths and policy of Fixed

iScsi 26:1.1 iqn.2000-04.com.qlogic:qle4062c.lfc0908h85049.1<->iqn.1992-08.com.netapp:sn.xxx vmhba3:0:6 On active preferred

Disk vmhba3:0:5 /dev/sde (71687MB) has 1 paths and policy of Fixed

iScsi 26:1.1 iqn.2000-04.com.qlogic:qle4062c.lfc0908h85049.1<->iqn.1992-08.com.netapp:sn.xxx vmhba3:0:5 On active preferred

Disk vmhba3:0:10 /dev/sdj (5120MB) has 1 paths and policy of Fixed

iScsi 26:1.1 iqn.2000-04.com.qlogic:qle4062c.lfc0908h85049.1<->iqn.1992-08.com.netapp:sn.xxx vmhba3:0:10 On active preferred

Disk vmhba3:0:16 /dev/sdp (1399988MB) has 1 paths and policy of Fixed

iScsi 26:1.1 iqn.2000-04.com.qlogic:qle4062c.lfc0908h85049.1<->iqn.1992-08.com.netapp:sn.xxx vmhba3:0:16 On active preferred
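A listing like this gets hard to scan once there are many LUNs. As a small sketch, the "Disk" lines can be turned into records so that LUN number, device node, size and path count are easy to filter. The field layout is assumed from the output above only, and `parse_disks` is a name I made up for this example.

```python
import re

# One "Disk ..." summary line per LUN, in the format shown in the output above.
DISK_LINE = re.compile(
    r"^Disk (?P<name>vmhba\d+:\d+:\d+) (?P<device>\S+) "
    r"\((?P<size_mb>\d+)MB\) has (?P<paths>\d+) paths and policy of (?P<policy>\w+)"
)


def parse_disks(text):
    """Return one dict per 'Disk' line; the per-path detail lines are ignored."""
    disks = []
    for line in text.splitlines():
        match = DISK_LINE.match(line.strip())
        if not match:
            continue
        name = match.group("name")
        disks.append({
            "name": name,
            # Third field of the name (vmhba3:0:6 gives 6), taken as the LUN.
            "lun": int(name.split(":")[2]),
            "device": match.group("device"),
            "size_mb": int(match.group("size_mb")),
            "paths": int(match.group("paths")),
            "policy": match.group("policy"),
        })
    return disks


sample = """\
Disk vmhba3:0:6 /dev/sdf (25600MB) has 1 paths and policy of Fixed
iScsi 26:1.1 iqn.2000-04.com.qlogic:qle4062c.lfc0908h85049.1<->iqn.1992-08.com.netapp:sn.xxx vmhba3:0:6 On active preferred
Disk vmhba3:0:16 /dev/sdp (1399988MB) has 1 paths and policy of Fixed
iScsi 26:1.1 iqn.2000-04.com.qlogic:qle4062c.lfc0908h85049.1<->iqn.1992-08.com.netapp:sn.xxx vmhba3:0:16 On active preferred
"""

for disk in parse_disks(sample):
    if disk["paths"] == 1:
        print(disk["name"], disk["device"], "has a single path")
```

Filtering on `paths == 1` is just one example; the same records can be searched for a particular LUN number from the vmkernel warnings.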

4. We then looked in /proc/scsi/ and read the file named scsi.

If you look at the vmkernel error message above, it mentions the vmhba name along with the LUN number. To clarify further, the scsi file is very helpful. It gives the NetApp version that is running and also shows what kind of access each LUN has.

Host: scsi3 Channel: 00 Id: 00 Lun: 01

Vendor: NETAPP Model: LUN Rev: 7310

Type: Direct-Access ANSI SCSI revision: 04

Host: scsi3 Channel: 00 Id: 00 Lun: 02

Vendor: NETAPP Model: LUN Rev: 7310

Type: Direct-Access ANSI SCSI revision: 04

Host: scsi3 Channel: 00 Id: 00 Lun: 03

Vendor: NETAPP Model: LUN Rev: 7310

Type: Direct-Access ANSI SCSI revision: 04

Host: scsi3 Channel: 00 Id: 00 Lun: 04

Vendor: NETAPP Model: LUN Rev: 7310

Type: Direct-Access ANSI SCSI revision: 04

Host: scsi3 Channel: 00 Id: 00 Lun: 05

Vendor: NETAPP Model: LUN Rev: 7310

Type: Direct-Access ANSI SCSI revision: 04
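To match the LUN number from the vmkernel warning against this file without reading it by eye, the entries can be parsed into records. This is a sketch under the assumption that every entry follows the three-line Host / Vendor / Type layout shown above; `parse_proc_scsi` is a hypothetical helper name.

```python
import re

HOST_LINE = re.compile(
    r"Host:\s*(?P<host>\S+)\s+Channel:\s*(?P<channel>\d+)\s+"
    r"Id:\s*(?P<id>\d+)\s+Lun:\s*(?P<lun>\d+)"
)
VENDOR_LINE = re.compile(
    r"Vendor:\s*(?P<vendor>.*?)\s+Model:\s*(?P<model>.*?)\s+Rev:\s*(?P<rev>\S+)"
)
TYPE_LINE = re.compile(
    r"Type:\s*(?P<type>.*?)\s+ANSI\s+SCSI\s+revision:\s*(?P<ansi>\S+)"
)


def parse_proc_scsi(text):
    """Parse /proc/scsi/scsi style text into a list of device dicts."""
    devices = []
    for line in text.splitlines():
        host = HOST_LINE.search(line)
        if host:
            # A Host line starts a new device entry.
            devices.append({
                "host": host.group("host"),
                "channel": int(host.group("channel")),
                "id": int(host.group("id")),
                "lun": int(host.group("lun")),
            })
            continue
        if not devices:
            continue
        # Vendor and Type lines add detail to the most recent Host entry.
        for pattern in (VENDOR_LINE, TYPE_LINE):
            detail = pattern.search(line)
            if detail:
                devices[-1].update(detail.groupdict())
    return devices


sample = """\
Host: scsi3 Channel: 00 Id: 00 Lun: 01
Vendor: NETAPP Model: LUN Rev: 7310
Type: Direct-Access ANSI SCSI revision: 04
Host: scsi3 Channel: 00 Id: 00 Lun: 02
Vendor: NETAPP Model: LUN Rev: 7310
Type: Direct-Access ANSI SCSI revision: 04
"""

lun2 = [d for d in parse_proc_scsi(sample) if d["lun"] == 2]
print(lun2)
```

On a live host the text would come from reading /proc/scsi/scsi instead of the inline sample.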

5. We also checked the following location for the HBA firmware version:

cd /proc/scsi/qla4022/

[root@xxx]# ls

1 2 3 4 HbaApiNode

6. We also suspected HP SIM, which had been installed as described in my earlier post, and finally we uninstalled it. Go to the hpmgmt folder and run ./installvm811.sh --uninstall

We then listed the installed software bundles:

[root@xzxxx qla4022]# esxupdate -l query

Installed software bundles:

------ Name ------ --- Install Date --- --- Summary ---

3.5.0-64607 20:40:59 12/31/08 Full bundle of ESX 3.5.0-64607

ESX350-200802303-SG 20:41:00 12/31/08 util-linux security update

ESX350-200802408-SG 20:41:00 12/31/08 Security Updates to the Python Package.

ESX350-200803212-UG 20:41:00 12/31/08 Update VMware qla4010/qla4022 drivers

ESX350-200803213-UG 20:41:00 12/31/08 Driver Versioning Method Changes

ESX350-200803214-UG 20:41:01 12/31/08 Update to Third Party Code Libraries

ESX350-200804405-BG 20:41:01 12/31/08 Update to VMware-esx-drivers-scsi-megara

ESX350-200805504-SG 20:41:01 12/31/08 Security Update to Cyrus SASL

ESX350-200805505-SG 20:41:01 12/31/08 Security Update to unzip

ESX350-200805506-SG 20:41:01 12/31/08 Security Update to Tcl/Tk

ESX350-200808206-UG 20:41:02 12/31/08 Update to vmware-hwdata

7. We then went to the console of the ESX host and pressed Alt+F12 to view the VMkernel log.

After doing all of the above, we decided to swap the HBA cable. The HBA, a dual-port QLE4032, was plugged directly into the FAS 2020. We changed to a different HBA, and that seems to have fixed the problem. During the course of troubleshooting, VMware told us that we cannot have two dual-port QLE4032 cards, as officially only one is supported. I was surprised when they shared the Configuration Maximums document as well. I told them that this might be an honest mistake in that part of the statement. Let's see what VMware has to say.

Create additional VMFS volume if you have more than one RAID disk

Suppose you have configured an ESX 3.5 host with 2 disks as RAID 1+0 and all the rest as RAID 5 for the VMFS partition. If you installed ESX by just clicking Next, Next, Finish, then you will not see the RAID 5 volume as a VMFS partition. In VC you can see it mounted as a different target, but when you try to add it as a datastore you won't be able to do so.

A colleague of mine had installed ESX like that and was struggling to create an additional VMFS partition on the RAID 5 volume. When we tried to add it using the Add Storage wizard, we were not able to see anything in the wizard.

How do we then create the additional VMFS partition? We cannot add it as an extent, as discussed in my previous blog post. I then found a beautiful KB article and asked him to follow it step by step. (You never know if VMware will make it paid, so I am copying it on the blog.)

To create a new VMFS volume from the command line:

1. Locate the LUN you wish to format. For example, vmhba1:2:0.

2. Log in to the ESX console, either directly or through an SSH client.

3. Rescan the adapter to ensure that ESX is updated with the latest storage information. Run the command:

esxcfg-rescan vmhba<X>

where <X> is the adapter number

4. Locate the SCSI device from the console in order to find the device node for the LUN, and make note of the identifier.

o For versions of ESX earlier than 4.0, run the command:

esxcfg-vmhbadevs -m

Note: For ESX 3.x, the identifier is in the form of vmhba<C>:<T>:<L>:<P>.

o For ESX 4.0 and later, run the command:

esxcfg-scsidevs -c

Note: For ESX 4.0, the identifier is in the form of naa.<NAA>.

5. Choose either the Linux or the VMkernel device name to open with fdisk.

6. Open the device with fdisk:

o For a Linux device, run the command:

fdisk /dev/sd<X>

where <X> is the device node letter

o For a VMkernel device, run the command:

fdisk /vmfs/devices/disks/<device>

where <device> is the device reported in the output of step 4

7. Type p and then Enter to determine if any VMFS partitions already exist.

Note: VMFS partitions are identified by a partition system ID of fb.

8. Type n and then Enter to create a new partition.

9. Type p and then Enter to create a primary partition.

10. Type 1 and then Enter to create partition number 1.

Note: If partitions already exist but you want to use the free space, type 2, 3 or 4. You cannot have more than 4 primary partitions.

11. Select the defaults to use the complete disk.

12. Type t and then Enter to set the partition's system ID.

13. Type fb and then Enter to set the partition system ID to fb (VMware VMFS volume).

14. Skip to step 19 if the partition you created in step 10 is not the first partition.

15. Type x and then Enter to go into expert mode.

16. Type b and then Enter to adjust the starting block number.

17. Type 1 and then Enter to choose partition 1.

18. Type 128 and then Enter to set the offset to 128.

19. Type w and then Enter to write label and partition information to the disk.

20. Use vmkfstools to format the partition.

o For ESX 3.x, run the command:

# vmkfstools -C vmfs3 -b <Block_Size> -S <VMFS_Name> vmhba<C>:<T>:<L>:<P>

Note: Refer to the applicable identifier in step 4. The last number is the partition number, which must match the partition you created with fdisk.

For example:

# vmkfstools -C vmfs3 -b 8m -S LocalVMFS /vmfs/devices/disks/vmhba1:2:0:1

This creates a new VMFS3 volume named LocalVMFS on the target vmhba1:2:0:1 with an 8 MB block size.

o For ESX 4.x, run the command:

# vmkfstools -C vmfs3 -b <Block_Size> -S <VMFS_Name> naa.<NAA>:<partition>

Note: Please refer to the applicable identifier in step 4. The last number is the partition number, which must match the partition you created with fdisk.

For example:

# vmkfstools -C vmfs3 -b 8m -S LocalVMFS /vmfs/devices/disks/naa.6090a038f0cd6e5165a344460000909b:1

This creates a new VMFS3 volume named LocalVMFS on the target naa.6090a038f0cd6e5165a344460000909b:1 with an 8 MB block size.

21. Rescan the HBAs on all of the ESX hosts to update them with the new information.
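Two details of the procedure above are easy to get wrong: the starting offset of 128 and the exact target string passed to vmkfstools. The sketch below works through both. It assumes 512-byte sectors (so 128 sectors is a 64 KB offset), and the helper names are my own; it only builds the command line and does not run anything on the host.

```python
SECTOR_BYTES = 512  # assumption: the disk uses 512-byte sectors

# Block sizes accepted by "vmkfstools -C vmfs3 -b".
VMFS3_BLOCK_SIZES = ("1m", "2m", "4m", "8m")


def start_offset_bytes(start_sector=128):
    """Byte offset of the partition start set in fdisk expert mode."""
    return start_sector * SECTOR_BYTES


def vmkfstools_command(device, partition, label, block_size="8m"):
    """Build the format command for a device identifier and partition number.

    device is the identifier from step 4, for example vmhba1:2:0 on ESX 3.x
    or naa.<NAA> on ESX 4.x; partition is the number created in fdisk.
    """
    if block_size not in VMFS3_BLOCK_SIZES:
        raise ValueError("block size must be one of " + ", ".join(VMFS3_BLOCK_SIZES))
    target = "/vmfs/devices/disks/{}:{}".format(device, partition)
    return ["vmkfstools", "-C", "vmfs3", "-b", block_size, "-S", label, target]


print(start_offset_bytes())  # 128 sectors * 512 bytes
print(" ".join(vmkfstools_command("vmhba1:2:0", 1, "LocalVMFS")))
print(" ".join(vmkfstools_command(
    "naa.6090a038f0cd6e5165a344460000909b", 1, "LocalVMFS")))
```

The two printed commands correspond to the ESX 3.x and ESX 4.x examples above; the partition number at the end has to match the partition created with fdisk.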