What disks should be used for SmartDisk for top performance?
The answer is often overlooked, but it is in the installation guide, in section 2.2.2a.
The sections below consider each role a volume can perform, drawing on the information in the installation guide and on field experience.
The content index (also called the disk index) is a collection of files that describe the backup segments stored in SmartDisk. The documentation calls for “fault-tolerant disks with good random-access performance”. Fault tolerant, because loss of the index leaves SmartDisk with no record of its contents. As for random-access performance, this index is not queried that often and its files are small, so it is not too critical.
The Staging Pool is where backups are received into SmartDisk, and if they are not going to be deduplicated, this is where they stay. Files here are typically large and accessed sequentially. The documentation says this about the disks: “fault-tolerant disks with good streaming performance”. Fault tolerant so backups are not lost if a disk fails; good streaming performance because files will be written and then read sequentially. RAID 5 usually handles this sort of I/O well; it is possible the array writes sequentially on each head, which gives good streaming performance. Note that almost all published RAID 5 benchmarks measure large sequential access rather than random I/O, because the random I/O figures would look much worse.
The Chunk Index provides a means to determine which page a chunk resides in, if it is stored at all. It is queried randomly: a hash is generated from the chunk details, and the block addressed by that hash is read. Hash algorithms are designed to spread their output as evenly as possible across the result range, so the resulting disk seeks are effectively random. In an ideal world this would all be done in memory, but with the index sized at 3% of the Chunk Store and a maximum supported Chunk Store size of 15 TB, the index can reach 472 GB, which is not something you can cache. The documentation says “fault-tolerant disks with good random-access performance”. A number of customers have implemented a dedicated set of mirrored disks for this purpose, and the results have been very good every time. Note that the currently supported Chunk Store size of 15 TB may increase, so some allowance for growth of the Chunk Index is advisable. Note also that RAID 5 handles this workload poorly; this will be explained.
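The hash-to-offset lookup described above can be sketched as follows. This is a minimal illustration only: the fingerprint hashing, bucket size, and index capacity are assumptions made for the example, not SmartDisk's actual on-disk format.

```python
import hashlib

BUCKET_SIZE = 4096        # assumed size of one index block
NUM_BUCKETS = 1_000_000   # assumed number of buckets in the index file

def bucket_offset(chunk_fingerprint: bytes) -> int:
    """Hash the chunk details to a bucket number, then to a byte offset.

    Because a good hash spreads its output evenly across the range, the
    offsets (and therefore the disk seeks) are effectively random."""
    digest = hashlib.sha256(chunk_fingerprint).digest()
    bucket = int.from_bytes(digest[:8], "big") % NUM_BUCKETS
    return bucket * BUCKET_SIZE

def lookup(index_file, chunk_fingerprint: bytes) -> bytes:
    """One random read per query: seek to the hashed offset, read the block."""
    index_file.seek(bucket_offset(chunk_fingerprint))
    return index_file.read(BUCKET_SIZE)
```

Every query lands on an unpredictable offset, which is exactly why the documentation asks for good random-access performance here.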
The Chunk Store holds the deduplicated data, which by definition will almost certainly be fragmented; the reuse of chunks saved by earlier backups is one reason. Because each chunk is located by first querying the Chunk Index for its page, reading deduplicated data takes twice the I/O of reading it directly. The documentation says “fault-tolerant disks”; good random I/O performance will help here as well.
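The double-I/O read path can be sketched as below. The dictionaries are purely illustrative in-memory stand-ins for what are really random reads against the on-disk Chunk Index and Chunk Store.

```python
# Each deduplicated chunk read costs two I/Os: one Chunk Index lookup to
# find the page, then one Chunk Store read of that page.
io_count = {"index": 0, "store": 0}

def read_chunk(fingerprint, chunk_index, chunk_store):
    io_count["index"] += 1                 # I/O 1: which page holds the chunk?
    page_id = chunk_index[fingerprint]
    io_count["store"] += 1                 # I/O 2: fetch the page itself
    return chunk_store[page_id]

chunk_index = {"fp-1": 0, "fp-2": 1}       # hypothetical fingerprints and pages
chunk_store = {0: b"chunk one", 1: b"chunk two"}
data = [read_chunk(fp, chunk_index, chunk_store) for fp in ("fp-1", "fp-2")]
# Two chunks restored -> four I/Os, double what a plain sequential read needs.
```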
By default the installer places the Chunk Index on the Storage volume. It can simply be relocated by denying the Storage volume this role and assigning it to another volume, but this only works between installation and the first backup being received, because the index objects are not created until they are needed. Once the Chunk Index exists there is no official method for moving it. Support can help you do it, but there is a slight risk of losing the index and having to rebuild it if mistakes are made during the move.
RAID levels
RAID 5 is very popular because it provides redundancy at the lowest price point. The true cost is performance.
Under RAID 5 with ‘n’ disks in an array, n-1 disks are involved in every I/O. During sequential I/O, with minimal head movement and data written sequentially to disk, performance can be good. But random I/O (like the Chunk Index or Chunk Store workloads) means randomly moving n-1 heads for every read, and the I/O is not complete until the head that moved furthest has arrived. As a one-off operation this is not much of an issue, but deduplicating or restoring a single 150 GB backup can generate a million Chunk Index lookups. Under these conditions it is not unusual to observe the array 100% utilised while returning about 3 MB/s rather than the 300 MB/s it is rated at. Scale that to 12 parallel deduplications, put the Chunk Store on the same array, and poor performance is very likely.
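As a back-of-envelope check on the figures above: the 10 ms average per random I/O below is an assumed value for a RAID 5 array gated by its slowest head, not a measured one, but it shows how the million lookups dominate the job.

```python
# Model the million Chunk Index lookups a 150 GB job can generate.
backup_bytes = 150 * 10**9
lookups = 1_000_000
seconds_per_io = 0.010   # assumed average random read time on the array

index_time_s = lookups * seconds_per_io               # ~10,000 s of seeking
effective_mb_s = backup_bytes / index_time_s / 10**6  # ~15 MB/s at best
print(f"{index_time_s:,.0f} s of index I/O alone, {effective_mb_s:.0f} MB/s effective")
```

Even before a single Chunk Store read, the index seeks alone cap the job at a small fraction of the array's rated streaming throughput, consistent with the single-digit MB/s figures observed in the field.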
SmartDisk is performance tested by development on RAID 1+0 arrays (mirrored stripes) because, unlike RAID 5, the controller needs only one I/O per read and can use whichever mirror’s head is closer to the data. The array still provides redundancy and performs better than a single disk, but it uses more disks than RAID 5.
Mirrored RAID 5 (RAID 50)
This setup seems to be becoming more popular, but taking the array with the poorest random-I/O performance and mirroring it may be a false economy. RAID 50 still moves as many heads as RAID 5 did, though each head should now have a choice of a shorter seek from one of the two copies. You are still moving lots of heads, and the slowest head still sets the limit. The redundancy is better (provided the controller is not the single point of failure), but the performance is only marginally better, and with that many disks you could implement RAID 10 for far superior performance.
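The seek argument can be checked with a toy simulation. Head travel is modelled as uniform in [0, 1], so 0.5 is the expected seek for a single independently addressed disk; all figures come from this model, not from any real array.

```python
import random

random.seed(42)  # deterministic toy model

def raid5_seek(heads: int) -> float:
    """A striped read finishes when the furthest-travelling head arrives."""
    return max(random.random() for _ in range(heads))

def mirrored_raid5_seek(heads: int) -> float:
    """Each stripe member can be served by the closer of two mirrored copies,
    but the read still waits on the worst of all the heads."""
    return max(min(random.random(), random.random()) for _ in range(heads))

def raid10_seek() -> float:
    """One read touches one mirrored pair; the closer head serves it."""
    return min(random.random(), random.random())

TRIALS, HEADS = 100_000, 7
r5 = sum(raid5_seek(HEADS) for _ in range(TRIALS)) / TRIALS
r51 = sum(mirrored_raid5_seek(HEADS) for _ in range(TRIALS)) / TRIALS
r10 = sum(raid10_seek() for _ in range(TRIALS)) / TRIALS
print(f"RAID 5: {r5:.2f}, mirrored RAID 5: {r51:.2f}, RAID 10: {r10:.2f}")
```

In this model mirroring RAID 5 shaves the average worst-case seek from roughly 0.88 to roughly 0.68 (a single disk averages 0.5), while a RAID 10 read averages about 0.33: the mirroring gain is marginal, and RAID 10 is far faster.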
Putting this into perspective: for data loss with RAID 10, the node being backed up has to fail and two matching disks in the array have to fail before data is lost. By observation, it is more likely that the RAID controller fails and all the backed-up data is lost. If there are enough disks to implement RAID 50, there are enough to implement RAID 10, and RAID 10 seems to be much faster.
Garbage Collection
Do not underestimate the importance of GC, or the load it places on the system. If not enough time is allocated, or GC does not complete, the Storage volume will eventually fill, the Staging Pool will then quickly fill, and the result is an outage. If this builds up over months, recovery will be slow and painful. A full discussion of GC is beyond the scope of this article, but GC performance is critical and depends heavily on the disk performance issues discussed here. Expect your SmartDisk node to be garbage collecting more than 50% of the time, because GC is more intensive than deduplication. Note that GC cleans the Chunk Store, so it only affects deduplicated backup streams. GC was improved in version 1.6, which for this reason should be considered the minimum recommended version for large installations.
In summary, the four roles and the suggested type of array for each are:
The Disk Index (content index) has a low footprint and can be located on any type of disk.
The Staging Pool needs an array with good sequential I/O; RAID 5 is suitable for this.
The Chunk Index should be on its own mirrored array, not RAID 5. Placing it on a volume performing another role will seriously impact that role.
The Chunk Store can be on RAID 5 (or RAID 50 if that is perceived to be better), but RAID 10 is my recommendation.
Note that the official answers to these questions are as quoted here from the install guide, and they still apply; the recommendations in this article are suggestions based on a Senior Engineer’s observations and experience supporting SmartDisk since release.