Hello everybody,
I need some advice about the best way forward and what else I can check. We recently acquired an 8200 with 8x 1.9TB SSDs. (I know it's not the best setup. It was a sales thing, as others in the forum have experienced; we should have ordered smaller disks and more of them. But anyway.) We have a couple of ESX hosts running on it and everything was running fine (there are only a few VMs and they are not generating much traffic).
We recently introduced a physical host running a custom application with a 30/70 read/write ratio and a random access pattern: writes come in spikes of 5 minutes at 250MB/s, and an intensive read pattern every 30 minutes pulls 600MB/s.
We have no control over the application itself, so we made some changes on the storage side, which helped a bit:
1) Moved the LUN from R5 to R1.
2) Reduced the IO size from 64KB to 16KB. This improved latency and service time by a huge factor (before we were seeing around 4 ms, now it is below 1 ms). IOPS on this LUN spike to around 30K.
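As a sanity check on those numbers, here is a rough sketch of how the stated throughput translates into IOPS at each IO size. It assumes every IO is exactly the stated size and that 1 MB = 1024 KB; the function name is just for illustration and is not anything from the array or SSMC.

```python
def iops_for_throughput(mb_per_s: float, io_size_kb: float) -> float:
    """IOPS needed to move mb_per_s at a fixed I/O size (assumes 1 MB = 1024 KB)."""
    return mb_per_s * 1024 / io_size_kb


# Throughput figures from the post: 250 MB/s write spikes, 600 MB/s read bursts.
for label, mb_s in (("write spike", 250), ("read burst", 600)):
    for io_kb in (64, 16):
        iops = iops_for_throughput(mb_s, io_kb)
        print(f"{label}: {mb_s} MB/s at {io_kb} KB = {iops:,.0f} IOPS")
```

Under those assumptions the 600MB/s read burst at 16KB works out to about 38K IOPS, which is in the same range as the 30K spikes we see.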
At this point service time and latency are within acceptable parameters, but we are still seeing high saturation levels of around 80% (from SSMC). First question: how is saturation calculated? We are aware that the PDs are being hit rather hard and are constantly busy. We are now considering two options: either increase the number of disks (we are trying to work out how best to forecast the quantity that will lower saturation to acceptable levels while allowing for growth), or move this application to dedicated local disks.
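To show what I mean by forecasting, this is the kind of back-of-envelope estimate we have in mind. It is only a sketch under big assumptions: that saturation scales linearly with the load per PD, that the load spreads evenly over all disks after a rebalance, and that disks are added in pairs. The 50% target and 20% growth are placeholder values, and the function name and parameters are hypothetical, not how SSMC actually computes saturation.

```python
import math


def disks_needed(current_disks: int, current_sat: float, target_sat: float,
                 growth: float = 0.0, step: int = 2) -> int:
    """Estimate the total disk count needed to bring saturation down to target_sat.

    Assumes saturation is proportional to per-disk load and that the load
    is spread evenly across all disks (both are assumptions, not SSMC's formula).
    """
    # Total busy time expressed in "fully busy disk" equivalents, plus growth.
    load = current_disks * current_sat * (1 + growth)
    needed = math.ceil(load / target_sat)
    # Round up to a multiple of `step` (assumption: disks are added in pairs).
    needed = math.ceil(needed / step) * step
    return max(needed, current_disks)


# Current state from the post: 8 SSDs at about 80% saturation.
print(disks_needed(8, 0.80, 0.50))              # no growth allowance
print(disks_needed(8, 0.80, 0.50, growth=0.2))  # with 20% growth allowance
```

With these placeholder numbers the estimate comes out at 14 disks in total, or 16 with the growth allowance, but it is only as good as the linear assumption behind it.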
Any ideas or insights? Thank you.