Tuesday, April 7, 2009

The parent virtual disk has been modified since the child was created

We have come across many instances where we expanded a virtual disk that had a snapshot and then, on trying to power on the VM, got the error “The parent virtual disk has been modified since the child was created”. One of my former colleagues and gurus has blogged about it. I found a nice KB article, 1007849, which I am copying here for fear of losing track of it.

This error can occur under the following circumstances:

  • The base disk of the virtual machine has been changed

  • Running out of space on the LUN that contains the snapshot

  • Virtual machine suffers a blue screen exception when a snapshot is taken

For more details about snapshots and missing header files, see Cannot power on a virtual machine because the virtual disk cannot be opened (1004232)

Before implementing the solution, ensure:

  • You know the file names of all virtual disks associated with the virtual machine. Use the following command and record the output:
    #grep -i filename /vmfs/volumes/old_datastore/vmdir/*.vmx | grep -i vmdk

  • That there is enough space on the target datastore to receive the cloned virtual disks. Use the following command to assess how much space is needed:
    #ls -lah /vmfs/volumes/old_datastore/vmdir/*flat.vmdk

    Note: This example assumes that all of the virtual disks are contained in the directory with the virtual machine configuration file (.vmx).

  • A complete snapshot chain with matching CIDs and Parent CIDs (obtained from step 2 in the procedure).

  • The virtual machine is in a powered off state.

  • There is enough time to commit the snapshots. This procedure can take an extended amount of
    time depending on the size of the delta files.
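The space check in the list above can be scripted. This is only a rough sketch: the function names flat_kb, free_kb and enough_space are my own, and it assumes df can report on the target datastore path (on an ESX service console you may need vdf instead).

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical, not part of any VMware tool.

# Total size, in KB (rounded down), of every *-flat.vmdk in directory $1.
flat_kb() {
    ls -l "$1"/*-flat.vmdk 2>/dev/null |
        awk '{ bytes += $5 } END { printf "%.0f\n", int((bytes + 1023) / 1024) }'
}

# Free space, in KB, on the filesystem that holds path $1.
# Assumption: df understands the path; on ESX, vdf may be required instead.
free_kb() {
    df -P -k "$1" | awk 'NR == 2 { print $4 }'
}

# Succeeds when the base disks in $1 would fit on the filesystem holding $2.
enough_space() {
    need=$(flat_kb "$1")
    free=$(free_kb "$2")
    echo "need ${need} KB, free ${free} KB"
    [ "$need" -le "$free" ]
}
```

For example: enough_space /vmfs/volumes/old_datastore/vmdir /vmfs/volumes/new_datastore && echo "OK to clone". Leave yourself some headroom beyond the raw number.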

To match the CID of the snapshot to the base disk:


  1. Find out what disks the virtual machine is using. Log in to the ESX host and navigate to the directory that contains the virtual machine. Run the following command to identify which virtual disk files are being
    used:

    # grep -i filename *.vmx


    The output appears similar to:
    scsi0:0.fileName = "vm-000002.vmdk"
    scsi0:1.fileName = "vm_1-000002.vmdk"
    This virtual machine has two virtual disks.

  2. Check the CID of the base disks and compare them to the snapshots to see if they match. Run the following command to obtain the information about the virtual disk:
    #cat vm_1-000002.vmdk

    The output appears similar to:
    CID=929c1b7d
    parentCID=20d215cd
    createType="vmfsSparse"
    parentFileNameHint="vm_1-000001.vmdk"
    This indicates that the parent file vm_1-000001.vmdk is expected to have the CID 20d215cd.
    Do this for each file in the snapshot chain until you get to the base disk.

  3. Clone the virtual disk to a new base disk. To clone the virtual disk:

    1. Run the vmkfstools -i command to copy out and consolidate the snapshots. This requires enough space for the base disk. The command ls -l *-flat.vmdk gives the size of each base disk in the current directory.

    2. Create a directory on the target datastore if you are not cloning to the current directory:
      #mkdir /vmfs/volumes/new_datastore/recover

    3. Run the vmkfstools -i command to clone from the descriptor (pointer) VMDK of the last snapshot delta disk:
      #vmkfstools -i /vmfs/volumes/old_datastore/vmdir/vm_1-000002.vmdk /vmfs/volumes/new_datastore/recover/vm_1.vmdk

      The output appears similar to:
      Destination disk format: VMFS thick
      Cloning disk '/vmfs/volumes/old_datastore/vmdir/vm_1-000002.vmdk'...
      Clone:100% done.


      Note: If the above process does not work, the snapshot you are cloning from (or one of those higher in the chain) may be corrupt. Try again from the next snapshot up the tree.

    4. Detach the disk from the virtual machine using the VI Client.

    5. Attach the new disk, /vmfs/volumes/new_datastore/recover/vm_1.vmdk, to the virtual machine.

    6. Power up the virtual machine and ensure data integrity (Event Viewer, SQL, E-mail, etc).

  4. Delete the old base disk and only its snapshot disks. First identify which files each pointer uses:
    #grep -A2 parentFile vm_1-000002.vmdk | grep -v "#"

    The output appears similar to:

    parentFileNameHint="vm_1-000001.vmdk"
    RW 41943040 VMFSSPARSE "vm_1-000002-delta.vmdk"

    This indicates that the pointer vm_1-000002.vmdk points to vm_1-000002-delta.vmdk and is using vm_1-000001.vmdk as its parent disk.
    You may run the same command on the parent pointer to determine what files it uses, until you locate them all.
    # rm /vmfs/volumes/old_datastore/vmdir/vm_1-000002.vmdk
    # rm /vmfs/volumes/old_datastore/vmdir/vm_1-000002-delta.vmdk
    # rm /vmfs/volumes/old_datastore/vmdir/vm_1-000001.vmdk
    # rm /vmfs/volumes/old_datastore/vmdir/vm_1-000001-delta.vmdk
    # rm /vmfs/volumes/old_datastore/vmdir/vm_1.vmdk
    # rm /vmfs/volumes/old_datastore/vmdir/vm_1-flat.vmdk

    Note: If you are going to use rm with a wildcard, echo the command first. This allows you to see which files are targeted for deletion.

    #echo rm *.vmdk

  5. Move the VMDKs back to the original location and re-associate the virtual machine with the new disks.
    To move the VMDKs back to the original location:

    1. Run the following command to clone the disk back to the original datastore:

      #vmkfstools -i /vmfs/volumes/new_datastore/recover/vm_1.vmdk /vmfs/volumes/old_datastore/vmdir/vm_1.vmdk

    2. Detach the recovered disk from the virtual machine using the VI Client.

    3. Attach the new disk in the original location, /vmfs/volumes/old_datastore/vmdir/vm_1.vmdk, to the virtual machine.

    4. Power up the virtual machine and ensure data integrity (Event Viewer, SQL, E-mail, etc).

  6. Clean up the snapshot database. Rename the .vmsd file, remove the virtual machine from inventory, and re-add the virtual machine to clear the snapshot database.
    #mv vm.vmsd vm.oldvmsd
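
The CID comparison in step 2 gets tedious by hand when the chain is long. Here is a minimal sketch that walks the chain for you. The function names descriptor_field and check_chain are my own, and it assumes each descriptor has its CID, parentCID and parentFileNameHint at the start of a line, with no parentFileNameHint in the base disk descriptor.

```shell
#!/bin/sh
# Sketch only: walk a snapshot chain from the newest descriptor down to the
# base disk and compare each parentCID with the CID recorded in the parent.
# Function names are hypothetical; nothing here modifies any file.

# Print the value of one descriptor field (CID, parentCID, parentFileNameHint).
descriptor_field() {
    # $1 = descriptor file, $2 = field name
    grep -m 1 "^$2=" "$1" 2>/dev/null | cut -d= -f2- | tr -d '"\r'
}

# Return 0 if every link in the chain matches, 1 otherwise.
check_chain() {
    disk=$1
    status=0
    while :; do
        hint=$(descriptor_field "$disk" parentFileNameHint)
        # Assumption: a base disk has no parentFileNameHint, so the walk ends.
        [ -z "$hint" ] && break
        case $hint in
            /*) parent=$hint ;;
            *)  parent=$(dirname "$disk")/$hint ;;
        esac
        expected=$(descriptor_field "$disk" parentCID)
        actual=$(descriptor_field "$parent" CID)
        if [ -n "$actual" ] && [ "$expected" = "$actual" ]; then
            echo "OK       $(basename "$disk") -> $(basename "$parent") (CID=$actual)"
        else
            echo "MISMATCH $(basename "$disk") expects $expected, $(basename "$parent") has ${actual:-no readable CID}"
            status=1
            # Stop when the parent descriptor is missing or unreadable.
            [ -z "$actual" ] && break
        fi
        disk=$parent
    done
    return $status
}
```

Run it from the virtual machine directory against the descriptor named in the .vmx, for example check_chain vm_1-000002.vmdk. It only reads the descriptors, so it is safe to run before you decide what to clone or delete.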

Thursday, April 2, 2009

vSRM2: Database consideration

For Microsoft SQL Server, the database should be configured as follows:
1. The schema name should be the same as the user name, and the user account must have a default schema associated with it
2. The user needs bulk insert administrator privileges
3. If Windows authentication is used, install the SRM service on a host that is in the same domain as the database server
4. If SQL Server is installed locally, you might need to disable the shared memory network setting on the database server

vSRM 1: Network Consideration

I have done a demo in the past for vSRM 1.0 using NetApp SIM. I thought of starting a fresh topic on my blog about Site Recovery Manager.

Option 1: Use the Same IP
When we start powering on VMs at the Recovery Site – they may be on totally different networks requiring different IP addresses and DNS updates to allow for user connectivity. The good news is SRM can control and automate this process. One very easy way to simplify this for SRM is to implement “stretched VLANs” where two geographically different locations appear to be on the same VLAN/subnet. However, you may not have the authority to implement this – and unless it is already in place it is a major change to your physical switch configuration, to say the least. It’s worth making it clear that even if you do implement stretched VLANs you may still have to create inventory mappings because of port group differences. For example, there may indeed be a VLAN 101 in Boston and a VLAN 101 in Amsterdam. But if the administrative team in Boston call their port groups on a virtual switch BOS-101 and the guys in Amsterdam call theirs AMS-101 then you would still need a port group mapping in the Inventory Mappings tab.

(Source : Administering VMware's Site Recovery Manager chapter 4)

Option 2: Re-IP all the VMs at the recovery site

This can be done using “dr-ip-customizer.exe”. The basic process is to export the current NIC settings to a CSV, edit the CSV with your new IP addresses and then import it back into SRM. In reality this tool is creating the Customization Specifications for you, so you can see them in the Edit - Customization Specifications window after you follow the process.

(Source : Site Recovery Manager Administration Guide, page 54)

Duncan: If you want to keep it simple in terms of failover, I would always prefer stretched VLANs. Dependency-wise especially, re-IP'ing can be a huge problem (hardcoded IPs within apps, databases, etc.).

Lee Dilworth: From the x86 side of things I sometimes get the feeling that, for historic reasons, a lot of x86 teams were forced to re-IP in DR since
A) Networks didn't support transparent failover
B) Not that many workloads existed on x86 that were classed as so critical for the business that they needed to be included in any DR strategy. As we all know, in the last 5-10 years this has shifted massively, and now the vast majority of workloads run on x86-based OSes. So maybe it's time for a re-think?
What you need to consider are today's "modern" multi-tier applications (web/msg/RDBMS): how many in your portfolio would you expect to work unaided if you simply re-IP'd them? Probably not many, and confidence is low. For example, would you really want to fail over lots of Oracle/SQL Server/Exchange VMs and then give them all brand new IP addresses? They may work if you are VERY confident that the apps respond well to that and that you have been very strict company-wide in your use of FQDNs and aliases; if not, there are always those hardcoded IP addresses that can bring your application stacks to their knees if you forget (or miss) to change one. Bottom line: changing the IP addresses could (not always) introduce a whole extra set of tests that need to be run once the VMs are recovered. The nice thing about SRM is that at least now you can perform these tests in a non-disruptive way, which is great, but if you feel that in real DR situations you still need to run them no matter how well your testing went then any extra tests

1. What's our strategy for failover? Change IP addresses or keep the same addresses?
2. Do we stretch layer 2 VLANs?
3. Do we fail over layer 3?
4. How do we update DNS if we make changes? Is that scripted?
5. Do we need a secure set of test VLANs at the test site that can be used for DR testing? Do these exist?
6. If we do re-IP, do we have an existing mechanism/solution/set of scripts/DHCP reservations that does this for us? (If you do, these same techniques will almost certainly work unaltered against your VMs.)
7. Does SRM offer any other options for re-IP if we don't have an existing mechanism? (Yes it does!)

Once you can answer the above you will have a better idea of what your solution should look like. Customers I am working with now who are deploying SRM and have answered these questions usually come back with a simple addition to their network setup. They create the required VLANs at the network layer, ensure they have the correct routing/ACLs set, and then present these down the same trunk ports that the recovery site ESX hosts are using. Once this is done they create port groups for these new test VLANs, and finally, in their SRM recovery plans, they select these "test" port groups as the ones to use for the test network rather than the "Auto" default.

Friday, March 27, 2009

Application Virtualization a Different Approach

I have started working on multi-layer virtualization. What exactly does that mean?

Well, we have all seen, talked about and discussed application virtualization. But how are we deploying it using ThinApp/XenApp/V-App? How do you combine it with your current VMware environment?

How about this multi layer virtualization?

1. XenApp, which allows you to publish applications: Layer 1

2. MS App-V, which provides application virtualization: Layer 2

3. Then you have the third layer, server virtualization, where your MS App-V server is running.

How does it sound? You are running MS application virtualization on top of VMware server virtualization. Well, this is a competitive age, and if you want to survive you should learn to respect each other.

What I have learned so far about MS App-V is that it lets you execute applications locally on the PC, unlike Citrix. App-V runs each application in its own bubble (address space), including its DLLs and executables. What does that mean? Say, for example, a DLL stored in system/system32 is used by two different applications: each application will try to overwrite it with its own version. With App-V, each application runs in its own virtual partition.

So the benefit of such a design is more control over deployment, since you don't have to perform the deployment on 1000 different servers. Again, there are other deployment methodologies, one I am aware of being HP Radia, but as I mentioned this is a completely different approach to application virtualization.

You virtualize the application and then publish it through XenApp, with both your application virtualization and XenApp running on ESX hosts. How does it sound?

I am currently busy learning this whole new application virtualization stuff. After SRM/ESX it's App-V, and next Ubuntu/Data ONTAP.

Thursday, March 26, 2009

A General System Error occurred: Invalid Fault

I was trying to create a VM on a host and it kept throwing the error message “A general system error occurred: Invalid fault”.



We just had to restart the hostd service on the host: service mgmt-vmware restart