INTERNAL - How to maintain the health of SmartDisk and improve general (dedupe and GC) performance
Striking the balance between time spent deduplicating and time spent performing GC is key to the health of NVSD. The following was included in the NVSD 1.6 release notes:
"Changed the default Garbage Collection window to occur overnight (18:00 through 06:00). (This default only applies to new installations of NVSD.) (NSD-183)", and "Note: If Garbage Collection has work to do, which occurs whenever data has been retired, it retains priority over deduplication. Beginning with v1.6, Garbage Collection also retains priority if a deduplication process has been run. For new and upgraded installations of NVSD, Quest Software recommends that you update the Garbage Collection window to match the backup window. You might also consider setting Garbage Collection to start approximately 30 minutes before backups start, which discourages new deduplication processes from starting and overlapping with the backup window."
We recommend that GC have priority for at least a third of the week (56 of the 168 hours), ideally during the backup window.
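As a quick sanity check on a proposed window (the window itself is configured in NVSD, not by any script), the hypothetical Python sketch below works out how many hours per week a daily window covers and compares it against the 56-hour guideline. The function name and logic are illustrative only.

def weekly_window_hours(start_hour, end_hour, days_per_week=7):
    # Hours per week covered by a daily window; (end - start) % 24 handles
    # windows that cross midnight, e.g. 18:00 through 06:00 -> 12 hours/day.
    daily = (end_hour - start_hour) % 24
    return daily * days_per_week

hours = weekly_window_hours(18, 6)  # the NVSD 1.6 default window
print(hours, "hours/week;", "meets" if hours >= 56 else "below", "the 56-hour guideline")
# prints: 84 hours/week; meets the 56-hour guideline

The default 18:00-06:00 window therefore comfortably exceeds the one-third-of-the-week minimum; the same check can be applied to any custom window you are considering.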
Ways to improve dedupe and GC performance:
Dedupe technologies work by taking advantage of runs of duplicate chunks to save time and space. This comes at the cost of breaking each stream into small chunks and then managing the many chunks. When there is a high amount of duplication (between generations and between different backups), the benefits outweigh the costs; when there is little duplication, the costs outweigh the benefits. Fairly similar streams generate little churn and little new work for dedupe and GC.
Dissimilar streams create a lot of churn and work, both for dedupe (writing new unique chunks) and for the later GC (freeing chunks that were never seen again). One small unique stream can take as long to dedupe and GC as a much larger, mostly-duplicate stream. Where dedupe ratios are low, it is best to just use staging and spend the money that would have been spent on dedupe technology licensing on more disk.
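To illustrate the cost model (this is not how NVSD's engine works internally; NVSD's chunking is its own), the toy Python sketch below counts duplicate fixed-size chunks within a single stream. It gives a rough feel for how dedupable a given file is; real engines typically use variable-length, content-defined chunking, so treat the numbers as indicative only.

import hashlib
import sys

CHUNK_SIZE = 64 * 1024  # fixed-size chunks for illustration only; real dedupe
                        # engines typically use variable, content-defined chunking

def dedupe_estimate(path):
    # Return (total_chunks, unique_chunks) for a single stream/file.
    seen = set()
    total = 0
    with open(path, "rb") as f:
        while True:
            chunk = f.read(CHUNK_SIZE)
            if not chunk:
                break
            total += 1
            seen.add(hashlib.sha256(chunk).digest())
    return total, len(seen)

if __name__ == "__main__":
    total, unique = dedupe_estimate(sys.argv[1])
    duplicate_pct = 100.0 * (total - unique) / max(total, 1)
    print(f"{total} chunks, {unique} unique ({duplicate_pct:.1f}% duplicate)")

A stream reporting close to 0% duplicate chunks here would, in the terms above, be all cost and no benefit: every chunk must be written fresh and later freed by GC.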
To use dedupe effectively, a very good understanding of the backup contents and how they relate to each other is needed. How well do we expect different generations of each backup to dedupe against each other? Why? Are they fulls or incrementals?
How well do we expect different backups going to the same store to dedupe against each other? Do the dedupe ratios in the stats match expectations, or is something broken? For example, accidental encryption or compression can make every backup stream completely unique and non-dedupable. Because incrementals already contain only changed files, they sometimes dedupe very poorly, depending on the nature of the changes; it can be better to just stage incrementals and only dedupe fulls. We have recently opened Bug 21586 to make it easier to analyse dedupe ratios for individual streams.
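As a rough, hypothetical check for accidental encryption or compression (nothing NVSD-specific), the Python sketch below samples the first 1 MiB of a file and flags near-random byte entropy. The 7.5 bits/byte threshold and the function names are illustrative assumptions, not product settings.

import math
import sys
from collections import Counter

def shannon_entropy(data):
    # Shannon entropy of the byte distribution, in bits per byte (0.0 to 8.0).
    counts = Counter(data)
    n = len(data)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

def looks_encrypted_or_compressed(path, sample_bytes=1 << 20, threshold=7.5):
    # Heuristic: near-random entropy in the first 1 MiB suggests the stream
    # is encrypted or compressed, and will therefore dedupe poorly.
    with open(path, "rb") as f:
        sample = f.read(sample_bytes)
    return bool(sample) and shannon_entropy(sample) >= threshold

if __name__ == "__main__":
    for path in sys.argv[1:]:
        if looks_encrypted_or_compressed(path):
            print(f"{path}: HIGH entropy - poor dedupe likely")
        else:
            print(f"{path}: ok")

A stream that trips this check is a good candidate for staging rather than dedupe, or for investigating where the unexpected encryption or compression is being applied.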