What is Data Deduplication?
AppAssure and Rapid Recovery (RR) perform Global Deduplication across the backup data from all protected machines. The deduplication process begins when an agent sends data from a protected machine to the Core. The Core compresses, optionally encrypts, and deduplicates incoming data before writing it to the Repository. To do this, the Core calculates a checksum for each incoming block after compression and optional encryption. The checksum determines whether the Core has “seen” the block before from any of the servers AppAssure/RR is protecting on the network, hence the name Global Deduplication.
If the block is unique, it is written to storage. If the block already exists in the Core’s repository, the Core discards the duplicate and stores a pointer to the identical block already in storage. Deduplication in conjunction with compression can result in more than an 80% reduction in storage requirements.
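The write path described above can be illustrated with a minimal sketch. This is not the product's implementation: the `BlockStore` class, the use of SHA-256 as the checksum, and zlib as the compressor are all assumptions chosen for illustration, and the optional encryption step is omitted.

```python
import hashlib
import zlib


class BlockStore:
    """Hypothetical repository that keeps one copy of each unique block."""

    def __init__(self):
        # checksum -> compressed block bytes (the single stored copy)
        self.blocks = {}

    def write_block(self, data: bytes) -> str:
        """Store a block if it is new; always return its pointer (checksum)."""
        compressed = zlib.compress(data)
        # The checksum is taken after compression, as in the description above.
        # SHA-256 is an assumption; the article does not name the algorithm.
        checksum = hashlib.sha256(compressed).hexdigest()
        if checksum not in self.blocks:
            # Unique block: write it to storage.
            self.blocks[checksum] = compressed
        # Duplicate block: nothing is written; the caller keeps only the pointer.
        return checksum

    def read_block(self, checksum: str) -> bytes:
        """Follow a pointer back to the original block contents."""
        return zlib.decompress(self.blocks[checksum])


store = BlockStore()
# Two protected machines send the same block; a third block differs.
ptr_a = store.write_block(b"A" * 4096)  # from machine 1
ptr_b = store.write_block(b"A" * 4096)  # from machine 2, identical content
ptr_c = store.write_block(b"B" * 4096)  # different content
```

Because the lookup table is shared by every protected machine, the identical block from the second machine is stored only once, which is what makes the deduplication global.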
Global Deduplication Benefits:
For more information on this topic, please see:
Many factors impact data deduplication. The following table provides factors to consider that may yield higher/lower compression ratios based on the type of data:
Sizing your Deduplication Cache:
To achieve the best deduplication ratio, which optimizes repository storage and disk I/O, size the Deduplication Cache properly when you first deploy the AppAssure Core Server. The Deduplication Cache (Dedupe Cache) is a table that stores the hashes used while deduplicating new incoming data. If you resize the Dedupe Cache after backups have been taken, the added size benefits only new incoming data, because it only makes room for new hashes. For this reason, set a sufficient size during the deployment phase.
The default Dedupe Cache size of 1.5 GB is enough to cover 800 GB of deduplicated (unique) data. Assuming a total deduplication/compression ratio of 50%, 1.5 GB is estimated to be sufficient for 1.5 TB of raw protected data. If your total protected data exceeds 1.5 TB, use the following calculations to determine the required Dedupe Cache size:
The storage space for Dedupe Cache files should be a minimum of 40 GB.
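As a rough illustration of the sizing figures above, the sketch below extrapolates linearly from the stated default (1.5 GB of cache per 800 GB of unique data). The linear scaling and the function name `estimate_dedupe_cache_gb` are assumptions made for this example, not the vendor's official formula, so treat the result as a starting estimate only.

```python
DEFAULT_CACHE_GB = 1.5            # default Dedupe Cache size stated above
UNIQUE_GB_PER_DEFAULT_CACHE = 800  # unique data covered by the default cache


def estimate_dedupe_cache_gb(raw_protected_gb: float,
                             reduction_ratio: float = 0.5) -> float:
    """Estimate a Dedupe Cache size in GB (assumed linear extrapolation).

    raw_protected_gb: total raw protected data, in GB.
    reduction_ratio:  assumed deduplication/compression ratio (0.5 = 50%).
    """
    if raw_protected_gb < 0:
        raise ValueError("raw_protected_gb must not be negative")
    if not 0 <= reduction_ratio < 1:
        raise ValueError("reduction_ratio must be in the range [0, 1)")
    # Unique data that remains after deduplication and compression.
    unique_gb = raw_protected_gb * (1 - reduction_ratio)
    # Scale the default cache size in proportion to the unique data.
    estimate = unique_gb / UNIQUE_GB_PER_DEFAULT_CACHE * DEFAULT_CACHE_GB
    # Never suggest less than the default size.
    return max(DEFAULT_CACHE_GB, estimate)


small_site = estimate_dedupe_cache_gb(1500)  # about 1.5 TB raw
large_site = estimate_dedupe_cache_gb(4000)  # about 4 TB raw
```

Under these assumptions, a site at or below roughly 1.5 TB of raw data stays at the default size, while larger sites scale up in proportion to their expected unique data.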
For more details on changing the Dedupe Cache size and settings, refer to Configuring Deduplication Cache Settings in the User Guide.