Performance issues with SmartDisk: very low CPU usage and high disk wait times. Deduplication and restores/consolidations are slow, although backup performance is good.
The Chunk Index is a key component of deduplication. It is used to locate chunks whilst a stream is being deduplicated and to rehydrate a stream that has been deduplicated. Garbage collection also locates chunks via the Chunk Index.
It is important to tune the disks where the Chunk Index resides to facilitate many fast random I/Os. The use of mirrored disks or solid state disks is advised. Contact BakBone Support if you need to move the index to a different location.
Deduplication splits a backup into chunks of approximately 2K each, so a typical 100G backup results in about 50 million chunks. For each chunk a hash value is calculated, and the index is queried with this hash value. Hash values are by nature effectively random, so for one 100G backup about 50 million random reads are performed against an index that may be as large as 500G.
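The arithmetic above can be checked with a short sketch. It is only a rough estimate, assuming exactly 2K chunks, one disk read per index lookup and no caching; the IOPS figures are illustrative assumptions, not measured SmartDisk numbers.

```python
GIB = 1024 ** 3

def chunk_count(backup_bytes, chunk_bytes=2 * 1024):
    """Number of chunks produced when a backup is split into ~2K chunks."""
    return backup_bytes // chunk_bytes

def lookup_hours(lookups, iops):
    """Hours needed to serve `lookups` random reads at a sustained `iops`.

    Assumes one read per lookup and no cache hits (a simplification).
    """
    return lookups / iops / 3600

chunks = chunk_count(100 * GIB)          # 52,428,800, i.e. about 50 million
hdd_hours = lookup_hours(chunks, 200)    # assumed ~200 random IOPS (single spindle)
ssd_hours = lookup_hours(chunks, 50000)  # assumed ~50,000 random IOPS (solid state)
print(chunks, round(hdd_hours, 1), round(ssd_hours, 2))
```

Under these assumptions the difference between the two devices is measured in days versus minutes, which is why random I/O capability matters far more than sequential throughput for the index volume.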
The disk array that holds the index will need to be carefully selected to get the best performance under these conditions. Transfer rates for large sequential data streams are not applicable to this type of usage.
Rehydrating this backup stream involves reading the manifest to get the list of chunks; the Chunk Index is then read again for the 50 million chunk locations before the chunks can be read from the Chunk Store.
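The role of the index in both directions can be illustrated with a toy model. This is a minimal sketch and not SmartDisk's implementation: the fixed-size chunking, the SHA-256 hash, and the in-memory dict and list standing in for the Chunk Index and Chunk Store are all assumptions made for illustration.

```python
import hashlib

CHUNK_SIZE = 2048  # assumed fixed 2K chunks; a real product may chunk differently

def deduplicate(stream, chunk_index, chunk_store):
    """Split `stream` into chunks, store new ones, and return the manifest."""
    manifest = []
    for offset in range(0, len(stream), CHUNK_SIZE):
        chunk = stream[offset:offset + CHUNK_SIZE]
        digest = hashlib.sha256(chunk).digest()
        # One index lookup per chunk: this is the random read discussed above.
        if digest not in chunk_index:
            chunk_index[digest] = len(chunk_store)  # location of the new chunk
            chunk_store.append(chunk)
        manifest.append(digest)
    return manifest

def rehydrate(manifest, chunk_index, chunk_store):
    """Rebuild the stream: manifest -> index lookup -> chunk store read."""
    return b"".join(chunk_store[chunk_index[digest]] for digest in manifest)

index, store = {}, []
data = b"A" * 4096 + b"B" * 2048 + b"A" * 2048
manifest = deduplicate(data, index, store)
restored = rehydrate(manifest, index, store)
```

Here the four-chunk stream is stored as two unique chunks, and every chunk in the manifest costs one index lookup on the way in and another on the way out.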
The Chunk Index is typically 3% of the size of the Chunk Store. Given that the largest supported Chunk Store is currently 15T, a Chunk Index can be as large as 500G.
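As a sizing aid, the 3% rule of thumb can be written as a small helper. This is a sketch only: the 3% ratio is the typical figure quoted above, the function name is hypothetical, and binary units (1T = 1024G) are assumed.

```python
def index_size_gib(chunk_store_tib, ratio=0.03):
    """Estimated Chunk Index size in G for a Chunk Store of the given size in T."""
    return chunk_store_tib * 1024 * ratio

largest = index_size_gib(15)  # about 460G for the 15T maximum
print(round(largest, 1))
```

At the typical ratio the 15T maximum works out to roughly 460G, so the 500G figure is best read as a rounded upper bound to plan the index volume against.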