HPE Storage Users Group

A Storage Administrator Community




 Post subject: 3PAR VV and dedup
PostPosted: Tue Sep 25, 2018 8:50 am 

Joined: Thu Jun 28, 2018 8:26 am
Posts: 13
Hello everybody,

Is there a way to evaluate the real dedup ratio for a VV, independent of its snapshots and unallocated space? The "Capacity Efficiency" values shown in SSMC are spectacular, but they are mostly useful as a marketing pitch.

Multiple snapshots, zeroed space, and unallocated space (virtual size vs. used) in a VV should not be counted when we talk about real space savings.

See this very interesting article:
http://www.joshodgers.com/2016/02/08/th ... cy-ratios/


 Post subject: Re: 3PAR VV and dedup
PostPosted: Wed Sep 26, 2018 3:46 pm 

Joined: Wed Nov 19, 2014 5:14 am
Posts: 505
It sounds like you might be confusing compaction with data reduction ratios. Dedupe is calculated and displayed at the CPG level, where multiple volumes share common data.

In previous releases this could be displayed at the VV level, but it could get confusing due to the need to load-factor the ratios for shared common data. So it was removed and is now more accurately displayed at the CPG level.

The white paper below may answer your questions in terms of which values are used and how each is calculated.

See "capacity efficiency", page 12 onwards.

https://h20195.www2.hpe.com/v2/getdocument.aspx?docname=a00006358enw

You could possibly get an approximate per-VV ratio by comparing host-written data vs. data stored in the VV, but that wouldn't be entirely accurate, as you're not factoring in the shared space.
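As a rough sketch of that approximation: divide host-written data by the space actually stored for the volume. The sizes below are hypothetical examples, not output from a real array (on a 3PAR you would read them from the volume's space reporting), and the result ignores space shared with other volumes in the CPG, so it can under- or over-state reality.

```python
def approx_vv_ratio(host_written_mib: float, stored_mib: float) -> float:
    """Approximate per-VV reduction ratio: host-written vs. stored.

    Ignores inter-volume shared space, so it is only an estimate.
    """
    if stored_mib <= 0:
        raise ValueError("stored size must be positive")
    return host_written_mib / stored_mib

# Hypothetical volume: host wrote 1 TiB, array stores 600 GiB for it.
ratio = approx_vv_ratio(1024 * 1024, 600 * 1024)
print(f"approximate ratio {ratio:.2f}:1")   # about 1.71:1
```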


 Post subject: Re: 3PAR VV and dedup
PostPosted: Thu Sep 27, 2018 1:41 am 

Joined: Thu Jun 28, 2018 8:26 am
Posts: 13
Hello,

Thanks for your reply,

I find it abnormal that we cannot measure "real" deduplication at the volume level, because it can be very useful when we have to evaluate accurate sizing for the future. Global dedup at the CPG level is OK, but not enough in certain circumstances.

For example: thin provisioning should not be included in the compaction ratio; sorry, but that's nonsense.

Below is a good article that sums up what I think about the storage data efficiency ratios boasted by some manufacturers.

http://www.joshodgers.com/2016/02/08/th ... cy-ratios/

Regards

Daniel


 Post subject: Re: 3PAR VV and dedup
PostPosted: Thu Sep 27, 2018 5:41 am 

Joined: Mon Sep 21, 2015 2:11 pm
Posts: 1570
Location: Europe
speedypizz75 wrote:
Hello,

Thanks for your reply,

I find it abnormal that we cannot measure "real" deduplication at the volume level, because it can be very useful when we have to evaluate accurate sizing for the future. Global dedup at the CPG level is OK, but not enough in certain circumstances.


I would like to challenge you on this. If you have a volume where 50% of the data dedupes (2:1), but all of that data dedupes against one or more other volumes in the CPG, how should it be presented? If you dedupe only that volume, the ratio is 1:1, but within the CPG it is 2:1.

If you want numbers for a single volume, you could simply do a dedupe estimate (checkvv with the dedup dry-run option, or something similar) to get the value for only that volume, but it will require the 3PAR to read the entire volume from disk to produce the estimate.

_________________
The views and opinions expressed are my own and do not necessarily reflect those of my current or previous employers.


 Post subject: Re: 3PAR VV and dedup
PostPosted: Thu Sep 27, 2018 2:17 pm 

Joined: Wed Nov 19, 2014 5:14 am
Posts: 505
Variants of that article have been doing the rounds for a few years from many different vendors, for as many different reasons :-) Usually it's linked to competitive guarantees based on data reduction ratios and attempts to undermine the competition's numbers. However, only a few vendors continue to rely on thin provisioning etc. to reach their stated numbers, though they do sometimes play fast and loose with other measurements. If you don't trust the numbers, simply compare the front-end host data to the back-end stored data.

As described in the white paper linked earlier, compaction is a measurement that includes thin provisioning and, by inference, zero detect. Dedupe and compression ratios don't include these, as the resulting ratios would start off artificially high and degrade rapidly. However, you might still be interested in thin efficiency if you aren't 100% flash and so can't take advantage of those features.

Quote:
• The compaction ratio is how much logical storage space a volume consumes compared to its virtual size and applies to all thin volume types.
• The dedup ratio is how much storage space is being saved by deduplication on deduplicated or deduplicated-compressed volumes.
• The compression ratio is how much storage space is being saved by compression on compressed or deduplicated-compressed volumes.
• The data reduction ratio is how much storage space is being saved by the combination of both deduplication and compression.
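Those four definitions can be sketched as simple before/after divisions. The capacity figures below are invented round numbers purely to show how the ratios relate to one another; the array's real accounting is more involved.

```python
# Minimal sketch of the four ratios quoted above, using hypothetical
# capacity figures (all in GiB).

def ratio(before: float, after: float) -> float:
    return before / after

virtual_size   = 10_000   # exported (virtual) size of the volume
used           = 2_000    # logical space actually consumed (thin)
after_dedup    = 1_000    # what remains once duplicates are removed
after_compress = 500      # what remains once that is also compressed

compaction     = ratio(virtual_size, used)     # includes thin savings
dedup          = ratio(used, after_dedup)
compression    = ratio(after_dedup, after_compress)
data_reduction = ratio(used, after_compress)   # dedup * compression

print(compaction, dedup, compression, data_reduction)  # 5.0 2.0 2.0 4.0
```

Note how compaction (5:1) is driven mostly by thin provisioning here, while data reduction (4:1) reflects only dedupe and compression, which is exactly the distinction the quote draws.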

Similar to MammaGutt's example, this is an extreme case, but it illustrates the point:

Consider two identical volumes in the same CPG. Neither contains intra-volume data that can be deduped, so each volume individually would report a 1:1 ratio. However, inter-volume dedupe would occur, as the two contain identical data. As such, the CPG dedupe ratio reports 2:1: you are saving 50% across both volumes rather than within the individual volumes. Delete one of the volumes and you're back to 1:1 for both measurements. Add in a few more partial copies, overwrite some of the data, etc., and the picture gets even more confusing. Load factoring attempted to represent these savings at the volume level, but it could lead to wide variations based on what was happening on the array at any given time.
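The two-identical-volumes case can be modelled as a toy: treat each volume as a list of page fingerprints (16 KiB pages, hashed), with dedupe keeping one copy per unique fingerprint. This is an illustration of the accounting, not how the array actually implements it.

```python
import hashlib

def page_fingerprints(data: bytes, page: int = 16 * 1024):
    """Split data into fixed-size pages and fingerprint each one."""
    return [hashlib.sha256(data[i:i + page]).digest()
            for i in range(0, len(data), page)]

# One volume made of 16 distinct pages (no internal duplicates),
# and an identical second volume.
vol_a = b"".join(bytes([i]) * (16 * 1024) for i in range(16))
vol_b = vol_a

pages_a = page_fingerprints(vol_a)
pages_b = page_fingerprints(vol_b)

# Per-volume: every page is unique, so the ratio is 1:1.
per_vol_ratio = len(pages_a) / len(set(pages_a))

# "CPG" level: the two volumes share every page, so stored space halves.
cpg_ratio = (len(pages_a) + len(pages_b)) / len(set(pages_a + pages_b))

print(per_vol_ratio, cpg_ratio)   # 1.0 per volume, 2.0 across the CPG
```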

Unless you understood the inter-volume dependencies in detail (which is impossible to track outside of a carefully controlled test environment), the numbers were open to interpretation. That is why it was removed: trying to load-factor each volume meant the estimated space per volume could change drastically, and it caused lots of confusion around space reporting, as volume and CPG figures often didn't tie up.

Edited for clarity - first draft was from my phone :-)


 Post subject: Re: 3PAR VV and dedup
PostPosted: Wed Oct 03, 2018 8:03 am 

Joined: Thu Jun 28, 2018 8:26 am
Posts: 13
Hello,

Thank you very much for your example and opinion.

I think it's often a question of interpretation. :)

In the same context, let me lay out my question:
we host almost twenty PostgreSQL VMs in our ESXi clusters --> 20 environments: 1 for prod, 1 for test, 1 for dev, and so on.

Each VM reaches about 500 GB of data used (datastore view). These VMs are currently hosted on three different datastores, and each VMware datastore corresponds to one thin/dedup 3PAR VV (no compression enabled).
These 3 volumes are in the same CPG (all-flash / RAID 6).

What do you think? Regarding the dedup ratio, is it better to host all the PostgreSQL VMs on the same volume, or doesn't it matter?

Thanks for your reply

Daniel


 Post subject: Re: 3PAR VV and dedup
PostPosted: Wed Oct 03, 2018 10:04 am 

Joined: Mon Sep 21, 2015 2:11 pm
Posts: 1570
Location: Europe
The dedupe mechanism in 3PAR works at the CPG level (the same as all thin LDs). As long as the volumes are in the same CPG, the 3PAR couldn't care less (from a dedupe perspective) whether the VMs are stored on the same VV.

Adding to this, VMware might care from a performance point of view, as each datastore has one IO queue.

And databases in general (unless stored as flat files) don't dedupe well, but they do offer high compression ratios.

_________________
The views and opinions expressed are my own and do not necessarily reflect those of my current or previous employers.


 Post subject: Re: 3PAR VV and dedup
PostPosted: Fri Oct 05, 2018 4:14 am 

Joined: Thu Jun 28, 2018 8:26 am
Posts: 13
Thanks Mammagutt,

We have 20 PostgreSQL VMs with almost 90% of the same DB data in every VM; do you think the dedup ratio wouldn't be efficient?

What do you think about the 3PAR compression feature since 3.3.1 MU2 (we use 2x 3PAR 8400 with 4 nodes each)? A few months ago I think there were big problems when compression was enabled. We have never enabled compression on our VVs, because we have only owned HPE 3PAR for a year, no more. But we want to use compression without risk.

Best regards

Daniel Pizzolante


 Post subject: Re: 3PAR VV and dedup
PostPosted: Fri Oct 05, 2018 12:32 pm 

Joined: Mon Sep 21, 2015 2:11 pm
Posts: 1570
Location: Europe
My understanding is that they seem to have gotten there with MU2. Data compaction (dedupe, compression and whatever's next) is complex stuff. The 8400 has a lot less CPU power than the flash-optimized models (8440/8450), and compression is CPU intensive.

You should check your databases. My experience is that there are tiny differences in every 16kB block. Not sure why or what, but it's there.
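That observation matters because page-granular dedupe is all-or-nothing: one changed byte anywhere in a 16 KiB page changes its fingerprint, so the whole page is stored again even though the other 16,383 bytes match. A small sketch of the effect (the single-byte change standing in for something like a database hint bit or timestamp is an assumption for illustration):

```python
import hashlib

PAGE = 16 * 1024

page_a = bytes(PAGE)            # a 16 KiB page of zeros
page_b = bytearray(page_a)
page_b[100] = 1                 # one-byte difference somewhere in the page

fp_a = hashlib.sha256(page_a).digest()
fp_b = hashlib.sha256(bytes(page_b)).digest()

# The fingerprints differ, so the pages cannot dedupe against each other,
# despite being 99.99% identical. Compression, by contrast, still works
# well within each page, which fits the "databases compress but don't
# dedupe" observation above.
print(fp_a == fp_b)   # False
```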

_________________
The views and opinions expressed are my own and do not necessarily reflect those of my current or previous employers.

