Migrating large VMs to vVol fails

Posted: Mon Nov 20, 2023 6:49 am
by celoxgroup
Dear community,

I'm hoping that someone has faced a similar problem and has a solution for it :)

Specs: 3PAR 8400 3.3.2.159 (MU1)+P10
Hypervisor: vCenter and ESXi 7.0.3 (with the latest patches applied).

We're facing a challenge when migrating (or copying or deploying) virtual machines that have many large disks attached. The clone itself completes, but the subsequent task that reconfigures the virtual machine fails (I suppose VMware is changing the storage policy on the VM at that point) with a generic error:
Storage policy change failure: The VVol target encountered a vendor specific error.

I pulled vvold.log from the ESXi host, and it looks like the 3PAR started deduplication processing on that VM and locked something, so ESXi couldn't update the configuration.

I have opened a support ticket with VMware, but I do not have a support subscription with HPE. Hopefully someone can help me with this, as a Google search didn't turn up anything.

Kind Regards, Matt

Partial log attached:

2023-11-19T06:19:11.077Z info vvold[2100776] [Originator@6876 sub=Default opID=vcd-98c10ff2-618f-4bdd-a7ad-f0ab751b480d;activity=urn:uuid:8217ce0e-4357-49d5-894d-af5ebc16015a-9-01-9c-7e53] VasaOp::UpdateProfileForVirtualVolume [#13160]: ===> Issuing 'updateStorageProfileForVirtualVolume' to VP [3par:Connected (Outstanding 0/4)]
2023-11-19T06:19:12.393Z error vvold[2100776] [Originator@6876 sub=Default opID=vcd-98c10ff2-618f-4bdd-a7ad-f0ab751b480d;activity=urn:uuid:8217ce0e-4357-49d5-894d-af5ebc16015a-9-01-9c-7e53] VasaOp::IsSuccessful [#13160]: updateStorageProfileForVirtualVolume transient failure: 22 (STORAGE_FAULT / Error has occurred. Details: Storage fault : error: VV dat-Virtuali-81d7a9f2 has dedup accounting in progress. Please retry later. / )
2023-11-19T06:19:12.393Z warning vvold[2100776] [Originator@6876 sub=Default opID=vcd-98c10ff2-618f-4bdd-a7ad-f0ab751b480d;activity=urn:uuid:8217ce0e-4357-49d5-894d-af5ebc16015a-9-01-9c-7e53] VasaOp[#13160] ===> Transient failure updateStorageProfileForVirtualVolume VP (3par) retry=false, batchOp=false container=0c0b4e72-c4ff-432b-b730-0529876f97ee timeElapsed=1316 msecs (#outstanding 0)
2023-11-19T06:19:12.393Z error vvold[2100776] [Originator@6876 sub=Default opID=vcd-98c10ff2-618f-4bdd-a7ad-f0ab751b480d;activity=urn:uuid:8217ce0e-4357-49d5-894d-af5ebc16015a-9-01-9c-7e53] VasaOp[#13160] ===> FINAL FAILURE updateStorageProfileForVirtualVolume, error (STORAGE_FAULT / Error has occurred. Details: Storage fault : error: VV dat-Virtuali-81d7a9f2 has dedup accounting in progress. Please retry later. / ) VP (3par) Container (0c0b4e72-c4ff-432b-b730-0529876f97ee) timeElapsed=1316 msecs (#outstanding 0)
2023-11-19T06:19:12.396Z info vvold[2100781] [Originator@6876 sub=Default opID=vcd-98c10ff2-618f-4bdd-a7ad-f0ab751b480d;activity=urn:uuid:8217ce0e-4357-49d5-894d-af5ebc16015a-9-01-9c-7e53] Came to SI::UpdateVirtualVolumeMetadata: vvolUuid naa.60002AC00000000000002FA20001F878 esxContainerId 0c0b4e72c4ff432b-b7300529876f97ee
2023-11-19T06:19:12.396Z info vvold[2100781] [Originator@6876 sub=Default opID=vcd-98c10ff2-618f-4bdd-a7ad-f0ab751b480d;activity=urn:uuid:8217ce0e-4357-49d5-894d-af5ebc16015a-9-01-9c-7e53] SI::UpdateVirtualVolumeMetadata replaced metadata key VMW_ContainerId (0c0b4e72-c4ff-432b-b730-0529876f97ee -> 0c0b4e72-c4ff-432b-b730-0529876f97ee)
2023-11-19T06:19:12.396Z info vvold[2100781] [Originator@6876 sub=Default opID=vcd-98c10ff2-618f-4bdd-a7ad-f0ab751b480d;activity=urn:uuid:8217ce0e-4357-49d5-894d-af5ebc16015a-9-01-9c-7e53] getProfileFromXml: Error processing xml buffer: fee7b662-db8d-4cf1-97e4-b49e1b86abe5:0
2023-11-19T06:19:12.396Z info vvold[2100781] [Originator@6876 sub=Default opID=vcd-98c10ff2-618f-4bdd-a7ad-f0ab751b480d;activity=urn:uuid:8217ce0e-4357-49d5-894d-af5ebc16015a-9-01-9c-7e53] UpdateVirtualVolumeMetadata: kv
--> key[0]: [VMW_ContainerId] = [0c0b4e72-c4ff-432b-b730-0529876f97ee]
--> key[1]: [VMW_GosType] = [vmkernel65Guest]
--> key[2]: [VMW_VVolName] = [Virtualisation Host-LkUI_2.vmdk]
--> key[3]: [VMW_VVolNamespace] = [/vmfs/volumes/vvol:0c0b4e72c4ff432b-b7300529876f97ee/naa.60002AC00000000000002F9D0001F878]
--> key[4]: [VMW_VVolType] = [Data]
--> key[5]: [VMW_VmID] = [5031363e-a218-ce31-0600-b7bd555ca220]
--> key[6]: [VMW_VmID_5031363e-a218-ce31-0600-b7bd555ca220] = [2023-11-19T06:19:06.330921Z]
--> key[7]: [VMW_VvolAllocationType] = [4]
--> key[8]: [VMW_VvolProfile] = [fee7b662-db8d-4cf1-97e4-b49e1b86abe5:0]
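
For anyone digging through a longer vvold.log, here is a minimal Python sketch that pulls out the "retry later" storage faults and counts them per 3PAR volume. The regular expression is inferred only from the lines quoted above, so it is an assumption that other vvold versions or VASA providers format faults the same way; the names VasaFault, parse_faults and faults_per_volume are made up for this example.

```python
import re
from collections import Counter
from typing import Iterable, Iterator, NamedTuple


class VasaFault(NamedTuple):
    timestamp: str  # timestamp at the start of the log line
    level: str      # vvold log level, e.g. "error"
    op_id: int      # VasaOp sequence number, e.g. 13160
    volume: str     # 3PAR VV name reported in the fault
    reason: str     # text between the VV name and "Please retry later"


# Pattern inferred from the log excerpt in this thread only (assumption):
# "<timestamp> <level> vvold[pid] ... #<op> ... error: VV <name> <reason>. Please retry later"
FAULT_RE = re.compile(
    r"^(?P<ts>\S+)\s+(?P<level>\w+)\s+vvold\[\d+\]"
    r".*?#(?P<op>\d+)"
    r".*?error: VV (?P<vv>\S+) (?P<reason>.+?)\. Please retry later"
)


def parse_faults(lines: Iterable[str]) -> Iterator[VasaFault]:
    """Yield one VasaFault per log line that carries a 'retry later' fault."""
    for line in lines:
        m = FAULT_RE.search(line)
        if m:
            yield VasaFault(m["ts"], m["level"], int(m["op"]), m["vv"], m["reason"])


def faults_per_volume(lines: Iterable[str]) -> Counter:
    """Count matching fault lines per 3PAR volume name."""
    return Counter(fault.volume for fault in parse_faults(lines))
```

Passing an open file handle for vvold.log to faults_per_volume (the file location on your host is for you to fill in) gives a quick view of which volumes keep reporting dedup accounting in progress. Note that one failed operation can log more than one matching line, as in the excerpt above, so the counts are per log line, not per operation.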

Re: Migrating large VMs to vVol fails

Posted: Wed Nov 22, 2023 4:51 am
by MammaGutt
What is the storage policy for the volumes you try to deploy and what is the size of the volumes (and total size of all combined)?

Re: Migrating large VMs to vVol fails

Posted: Wed Nov 22, 2023 5:01 am
by MammaGutt
Reading again, I see this:
2023-11-19T06:19:12.393Z error vvold[2100776] [Originator@6876 sub=Default opID=vcd-98c10ff2-618f-4bdd-a7ad-f0ab751b480d;activity=urn:uuid:8217ce0e-4357-49d5-894d-af5ebc16015a-9-01-9c-7e53] VasaOp::IsSuccessful [#13160]: updateStorageProfileForVirtualVolume transient failure: 22 (STORAGE_FAULT / Error has occurred. Details: Storage fault : error: VV dat-Virtuali-81d7a9f2 has dedup accounting in progress. Please retry later. / )

2023-11-19T06:19:12.393Z warning vvold[2100776] [Originator@6876 sub=Default opID=vcd-98c10ff2-618f-4bdd-a7ad-f0ab751b480d;activity=urn:uuid:8217ce0e-4357-49d5-894d-af5ebc16015a-9-01-9c-7e53] VasaOp[#13160] ===> Transient failure updateStorageProfileForVirtualVolume VP (3par) retry=false, batchOp=false container=0c0b4e72-c4ff-432b-b730-0529876f97ee timeElapsed=1316 msecs (#outstanding 0)
2023-11-19T06:19:12.393Z error vvold[2100776] [Originator@6876 sub=Default opID=vcd-98c10ff2-618f-4bdd-a7ad-f0ab751b480d;activity=urn:uuid:8217ce0e-4357-49d5-894d-af5ebc16015a-9-01-9c-7e53] VasaOp[#13160] ===> FINAL FAILURE updateStorageProfileForVirtualVolume, error (STORAGE_FAULT / Error has occurred. Details: Storage fault : error: VV dat-Virtuali-81d7a9f2 has dedup accounting in progress. Please retry later. / ) VP (3par) Container (0c0b4e72-c4ff-432b-b730-0529876f97ee) timeElapsed=1316 msecs (#outstanding 0)

Re: Migrating large VMs to vVol fails

Posted: Thu Nov 23, 2023 8:26 am
by celoxgroup
As you mentioned, the error says dedup accounting is in progress. I don't know what "accounting" means in this context.

My storage policy is:
- Storage Type: HPE 3PAR StoreServ
- Thin Persistence: Enabled
- Thin Deduplication: Enabled
- CPG: SSD_r5
- Snap CPG: SSD_r5
- Tagged with: Platinum Storage

I tested just now: if I remove Thin Deduplication, it works :/ However, I want to keep deduplication enabled.

Re: Migrating large VMs to vVol fails

Posted: Thu Nov 23, 2023 10:50 am
by MammaGutt
Okay, so you have an 8400. You’re doing dedupe on 3.3.2, so I assume it’s a 2-node system.

How big is the system (number of volumes, number of CPGs, amount of data)?

Are you doing snapshots?

What version of dedupe are you using?

If dedupe accounting is running all the time, the system probably isn’t feeling very well :(

Re: Migrating large VMs to vVol fails

Posted: Fri Nov 24, 2023 2:58 am
by celoxgroup
I suppose the system is not so huge :)

I have 3 virtual volumes, two of which I'm draining to vVols. As for storage containers, I have two.

Although it is not so big size-wise, I do have a large number of objects: 670 virtual machines and 1600 "files" per vVol. As the VMs are linked clones, the dedup ratio is 25:1. 13x 7.68 TB SSDs, with 5 more coming soon. No extra cage.

I have one CPG, RAID5. Allocated capacity is 43TB out of 100TB. No snapshots.

I don't know which component is the dedup one, but they are all at 3.3.2.168 (P10) or 3.3.2.159 (MU1).

Is there a command that would run a health check and fix any problems it finds?

Kind Regards, Matt

Re: Migrating large VMs to vVol fails

Posted: Fri Nov 24, 2023 1:47 pm
by MammaGutt
13 physical drives is an unsupported and untested configuration, so that might cause some funny stuff.

A 3PAR VV can’t be used for VMware vVols. A CPG can be used for vVols through a storage container, but a 3PAR VV is a block volume that can only be used as a VMware datastore.

If you have one datastore (3PAR VV) with 650 VMs, I would really, really, really recommend splitting it into smaller volumes with fewer VMs. Big volumes with a lot of changes will constantly do dedupe accounting, as there is only one process per volume. You also only have one queue per VMware host, and one per VV on the 3PAR, so from both a performance perspective and a risk perspective (noisy neighbour) you should divide your workload across more volumes.

RAID5 is also risky but I guess that is known.

If you go into the CLI and run showvv -s, do you have a volume called _sysvv…….. or _shared….?