Hi,
I'm writing to you today because we are suddenly having huge performance problems on our 3PAR and we are struggling to find the root cause.
Our setup is the following:
3PAR 7400 with a 3-tier AO setup (SSD/FC/NL), 64 disks in total. Connected to it we have an HP C7000 with many ESX hosts, mounting gold/silver/bronze LUNs that correspond to the 3PAR VVs we created using different AO configs.
Performance has been OK for 3 years, but a few days ago we suddenly had massive complaints from different teams reporting huge timeouts, disconnected services, and applications disconnected from databases (as you would see with network issues). We do have significant network issues, but we suspect the problem is coming from the storage, because on the VMware side we are getting dozens of "disk latency" alerts.
In the VMware logs, we suddenly see dozens of "performance was degraded" messages on 3PAR LUNs, with ms going from 5000 to 5000000.
We stopped half of our VMs (all non-prod) to free up some performance. Now, when looking at the 3PAR disks with statpd, the total I/O per second across the disks is around 13000, and every disk averages about 200.
In terms of queue length, all FC and SSD disks are at 0, while all NL disks have a Qlen of about 20.
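As a rough sanity check on these statpd figures, here is a minimal sketch of the arithmetic. The per-disk ceiling used for NL drives (75 IOPS) is only an assumed rule of thumb for 7.2k drives, not a value taken from our array or from 3PAR documentation, and `load_ratio` is just a helper name made up for this illustration.

```python
# Figures reported above (from statpd).
total_iops = 13000
total_disks = 64
reported_iops_per_disk = 200

# Assumption: rough rule-of-thumb ceiling for a 7.2k NL drive.
# This is NOT a vendor figure and NOT measured on our array.
ASSUMED_NL_IOPS_CEILING = 75


def load_ratio(iops: float, ceiling: float) -> float:
    """Return how many times the observed IOPS exceed the assumed ceiling."""
    return iops / ceiling


avg_iops_per_disk = total_iops / total_disks

print(f"average IOPS per disk: {avg_iops_per_disk:.0f}")
print(
    "NL load vs assumed ceiling: "
    f"{load_ratio(reported_iops_per_disk, ASSUMED_NL_IOPS_CEILING):.1f}x"
)
```

If that assumed ceiling is anywhere near realistic, an NL disk serving about 200 I/O per second would be well past it, which would be consistent with the Qlen of about 20 we only see on the NL tier.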
On the VMware side, it looks like all VMs using the BRONZE configuration are having problems (so mostly NL disks).
We suspect a problem with one disk or another, but the health check didn't report anything and all disks are still marked as OK. The "service time" of all disks is around 30 to 50 (in statpd).
May I ask for your help investigating this problem? What should we check to find out what the cause might be?
Thanks again, regards