Richard Siemers wrote:
Try using Hi-res to view port perf, for DISK ports, compared by N:S:P... see if any particular loops spike.
Very interesting. Looking at the 00:21 sample from yesterday morning, yes indeed: 3:2:3 and 2:2:3 are at 1874 and 1901 IOPS respectively, whereas the remaining six ports are each under 1000 IOPS (970, 960, 947, 964, 947, 950).
Bandwidth on 3:2:3 and 2:2:3 is 92,000 and 95,000; the remaining six are around 38,000 each.
Service time on 3:2:3 and 2:2:3 is 36 ms and 45 ms respectively; the others average 5 ms each.
Average Busy on 3:2:3 and 2:2:3 is nearly 100% (96% and 97%); the other six are in the low 80% range.
A glance at the colors and I see RED 3:2:3 and GREEN 2:2:3 consistently above the rest. Is this a rebalance issue?
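For anyone who wants to check the same thing outside the report graphs, here is a rough Python sketch of the comparison I did by eye: flag any disk port whose IOPS sits well above the median of all ports. The 1.5x cutoff and the function name are my own assumptions, and since I only noted the names of the two hot ports, the "other-N" labels are placeholders for the remaining six.

```python
from statistics import median

def flag_hot_ports(iops_by_port, ratio=1.5):
    """Return ports whose IOPS exceed `ratio` times the median, sorted by name.

    `ratio` is an arbitrary cutoff chosen for illustration, not a 3PAR threshold.
    """
    mid = median(iops_by_port.values())
    return sorted(port for port, iops in iops_by_port.items() if iops > ratio * mid)

# IOPS from the 00:21 sample. Only 3:2:3 and 2:2:3 are real port names;
# the "other-N" keys are placeholders for the six ports not named above.
iops = {
    "3:2:3": 1874,
    "2:2:3": 1901,
    "other-1": 970,
    "other-2": 960,
    "other-3": 947,
    "other-4": 964,
    "other-5": 947,
    "other-6": 950,
}

print(flag_hot_ports(iops))  # ['2:2:3', '3:2:3']
```

The median is used rather than the mean so that the two hot ports do not drag the baseline up with them.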
Edit: adding to this,
cage4, loop A 3:2:3, loop B, 2:2:3 have both FC and NL disk
cage10, loop A 3:2:3, loop B, 2:2:3 have both FC and NL disk
Of the 12 cages shown, our NL disk is only in those two cages, 4 and 10. Related???
Question: the database lives on a CPG that is currently RAID 1, and it seems to be the most affected. Would a TUNE to move it to a RAID 5 CPG help?
Richard Siemers wrote:
Do the same for front-end ports, compared by N:S:P... finding hosts that are NOT using round robin correctly is pretty difficult.
Is that the host port type, versus the disk port type in the previous report?
Port Types : host ; Port Rates : --All Port Rates-- ; Ports (n:s:p) : --All Ports-- ; Compare : n:s:p
Select Peak : total_iops
Those all look fairly balanced...only seeing 2:1:1, 2:1:2, 3:1:1, and 3:1:2
Richard Siemers wrote:
If you find a spike on the back end, you can track it down to a shelf, and probably to a PD, with a PD perf report limited to the PDs on that spiked loop. You should be able to ssh to the InServ and pull detailed logs to see if there were LESB errors, bad chunklets being swapped, etc.
If you find a spike on the front end, you can build a list of hosts zoned to that FE port and work your way down... VLUN perf, limited to one host, compared by N:S:P. The lines should be close to on top of each other if round robin is set up properly.
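The "lines on top of each other" test can also be put into numbers. Below is a minimal Python sketch, assuming you have copied one host's per-path IOPS out of the VLUN perf report by hand; the function name, the sample figures, and the idea of using (max - min) / mean as the spread are all illustrative assumptions, not anything the array reports.

```python
def path_imbalance(iops_by_path):
    """Return (max - min) / mean of per-path IOPS; 0.0 means perfectly even."""
    values = list(iops_by_path.values())
    mean = sum(values) / len(values)
    if mean == 0:
        return 0.0  # idle host: nothing to compare
    return (max(values) - min(values)) / mean

# Hypothetical per-path IOPS for two hosts, keyed by front-end port (N:S:P).
balanced_host = {"2:1:1": 500, "2:1:2": 500, "3:1:1": 500, "3:1:2": 500}
skewed_host = {"2:1:1": 1000, "2:1:2": 0, "3:1:1": 0, "3:1:2": 0}

print(path_imbalance(balanced_host))  # 0.0
print(path_imbalance(skewed_host))    # 4.0
```

A value near zero matches lines that overlap in the graph; a large value suggests the host is favoring one path and is worth a closer look at its multipath settings.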