I need to run those same tests in my environment. I am curious about block sizes and how they translate up the storage chain. A 4k I/O issued by Iometer is likely getting converted to a 64k I/O by the VMware storage driver, and perhaps converted again by VMFS on the LUN. Are you benchmarking random 4k or sequential? Based on the low result, I assume random. How much better or worse does it get when you switch to 8k?
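To compare runs at different block sizes, it helps to put the numbers on a common footing. Here is a rough Python sketch, nothing official: the function names are my own, the 1000 IOPS figure is a placeholder rather than a measured result, and the 64k backend size is only my guess from above.

```python
def throughput_mib_s(iops, block_bytes):
    """Convert an IOPS figure at a given block size to MiB/s."""
    return iops * block_bytes / (1024 * 1024)


def amplification(guest_io_bytes, backend_io_bytes):
    """Backend bytes moved per guest byte if each guest I/O is widened."""
    return backend_io_bytes / guest_io_bytes


if __name__ == "__main__":
    placeholder_iops = 1000  # hypothetical, not a benchmark result
    for block in (4 * 1024, 8 * 1024):
        print(block, throughput_mib_s(placeholder_iops, block))
    # Assumes the speculated 4k -> 64k conversion really happens.
    print(amplification(4 * 1024, 64 * 1024))
```

If the IOPS figure stays flat going from 4k to 8k, the MiB/s doubles, which would suggest the per-I/O cost dominates rather than bandwidth.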
Did you create your VMFS partition during the ESX install or with the GUI-based client afterward? Partitions created by the installer are not properly aligned, which causes blocks to overlap tracks. Also, VMware recommends formatting the NTFS partition with a 32k allocation unit size.
http://communities.vmware.com/docs/DOC-11458
Also, ESX 4 has proven a tricky beast when it comes to setting the LUNs up for Round Robin. After painstakingly using the GUI to set all 10 LUNs on 5 ESX servers to Round Robin (50 changes total), we applied some ESX patches and everything reverted to fixed path. The 3PAR implementation guide for ESX 4 has the commands to change the ESX defaults to RR.
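Going back to the alignment question for a moment, the check itself is just arithmetic on the partition's starting sector. A minimal Python sketch, assuming 512-byte sectors and a 64k boundary; the start sectors 63 and 128 are illustrative values I picked, not read from anyone's host, and the function names are hypothetical.

```python
SECTOR_BYTES = 512  # assumed logical sector size


def is_aligned(start_sector, boundary_bytes=64 * 1024, sector_bytes=SECTOR_BYTES):
    """True if the partition's first byte falls on a multiple of the boundary."""
    return (start_sector * sector_bytes) % boundary_bytes == 0


def chunks_touched(start_sector, io_offset_bytes, io_size_bytes,
                   chunk_bytes=64 * 1024, sector_bytes=SECTOR_BYTES):
    """Count how many backend chunks a single I/O spans, given the partition start."""
    first = start_sector * sector_bytes + io_offset_bytes
    last = first + io_size_bytes - 1
    return last // chunk_bytes - first // chunk_bytes + 1


if __name__ == "__main__":
    for start in (63, 128):  # illustrative start sectors
        print(start, is_aligned(start), chunks_touched(start, 0, 64 * 1024))
```

With a start at sector 63, a 64k I/O at the beginning of the partition straddles two 64k chunks; with a start at sector 128 it fits in one. That extra backend I/O is the cost of misalignment.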
Verify that your ESX hosts are set up on the 3PAR as persona 6, and that ESX has QFullSampleSize=32 and QFullThreshold=4 (per the 3PAR implementation guide for ESX 4).
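If you have several hosts to check, a small comparison helper saves some eyeballing. This is only a sketch in Python: it assumes you have already collected the current values by whatever means you prefer, and the dictionary keys, the function name, and the sample host values are hypothetical.

```python
# Values quoted above, attributed to the 3PAR implementation guide for ESX 4.
EXPECTED = {"persona": 6, "QFullSampleSize": 32, "QFullThreshold": 4}


def find_mismatches(observed, expected=EXPECTED):
    """Return {name: (observed, expected)} for each setting that differs or is missing."""
    return {
        name: (observed.get(name), want)
        for name, want in expected.items()
        if observed.get(name) != want
    }


if __name__ == "__main__":
    # Hypothetical host values, typed in by hand for illustration.
    host = {"persona": 6, "QFullSampleSize": 0, "QFullThreshold": 4}
    print(find_mismatches(host))
```

An empty result means the host matches all three values; anything else lists what to fix.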
Will have more to say once I get my own tests in order.