Standard backups of protected machines are taken based on the protection schedule established. For example, every hour, snapshots are taken of each agent machine and the Exchange mail server. The information flows through the switch or router to the Core, and the Core transmits the data to the repository on the NAS, where information is deduplicated across all agent machines before it is saved. This standard flow is depicted in the diagram above.
NAS Performance Issues versus Direct-Attached Storage or Storage Area Network
If the repository is located on Direct-Attached Storage (DAS), the two potential points of failure include the hardware and the software. If the repository is on network-attached storage, troubleshooting complexity increases. When issues arise, the network itself is another potential culprit, and hardware troubleshooting complexity is increased with the addition of another family of hardware devices to investigate (the NAS device itself).
All of these operations involve reads or writes, which have a more significant impact on network performance when the repository is on a NAS than when it uses DAS or a Storage Area Network (SAN). As such, putting the repository on a NAS places a substantial load on the NAS, with the network acting as a bottleneck for the many required read and write operations. Total throughput is lower than can be achieved using DAS or SAN. Plan for this accordingly, and treat it as another avenue of investigation when issues are evident.
Considering the heavy load of read/write operations, the NAS device used as a repository in a Replay 4 or AppAssure 5 environment should be dedicated to this purpose alone. If you experience NAS performance or connection issues, consider re-tasking any other operations, such as file sharing, to other devices.
Quest recommends system memory of at least 8 gigabytes.
If NAS issues are intermittent, check whether mount failures are occurring at times of particularly high I/O activity. For example, multiple processes could be running at once, such as agent backups during a VM export, or a nightly job running while rollups are also being performed. Intermittent failures caused by too much I/O traffic are difficult to diagnose and may appear as unrelated issues, such as replication failures or Exchange log truncations, but are actually symptoms of an overtaxed NAS with too many I/O operations being attempted simultaneously. Changing the order of some I/O operations or otherwise reducing I/O is appropriate in these cases. Refer to the troubleshooting steps. After trying other steps, consider reducing the transfer rate to allow the NAS to catch up, as described below.
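One way to check whether mount failures line up with periods of heavy I/O is to compare failure timestamps against the windows in which jobs were running. The following is a minimal sketch of that comparison, not an AppAssure tool. The job names, times, and the `Job` structure are hypothetical and are assumed to have been transcribed by hand from your own logs and schedules.

```python
from datetime import datetime
from typing import NamedTuple


class Job(NamedTuple):
    name: str        # hypothetical label, for example "rollup"
    start: datetime  # when the job began
    end: datetime    # when the job finished


def jobs_running_at(moment: datetime, jobs: list[Job]) -> list[str]:
    """Return the names of all jobs whose time window contains `moment`."""
    return [job.name for job in jobs if job.start <= moment <= job.end]


def correlate(failures: list[datetime], jobs: list[Job]) -> dict[datetime, list[str]]:
    """Map each failure time to the jobs that were active when it occurred."""
    return {failure: jobs_running_at(failure, jobs) for failure in failures}


# Hypothetical sample data for illustration only.
jobs = [
    Job("agent backup", datetime(2024, 1, 10, 1, 0), datetime(2024, 1, 10, 1, 40)),
    Job("VM export", datetime(2024, 1, 10, 1, 15), datetime(2024, 1, 10, 2, 30)),
    Job("rollup", datetime(2024, 1, 10, 3, 0), datetime(2024, 1, 10, 3, 20)),
]
failures = [datetime(2024, 1, 10, 1, 30), datetime(2024, 1, 10, 4, 0)]

for moment, active in correlate(failures, jobs).items():
    print(moment.isoformat(), "->", active or "no overlapping jobs")
```

Failures that repeatedly coincide with two or more active jobs suggest an overtaxed NAS. Failures with no overlapping jobs point toward other causes, such as hardware or the network.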
The quality of the NAS is a major factor for ongoing success in the enterprise. When choosing a NAS, consider that up-front price may not be as important as the capability of the device. Lower-end NAS devices may have a much higher total cost of ownership when considering downtime, future upgrades, or lackluster performance. If using a NAS, Quest recommends enterprise-grade network attached storage for best performance. The higher-end the NAS hardware you use, the less likely you are to encounter NAS hardware issues (provided network load and environmental factors are reasonable). Consider a device with features such as redundant Gigabit Ethernet connections or 10Gbit Ethernet connections, and consider whether you need access to a Fibre Channel storage-area network (SAN). Also consider whether the NAS appliance allows you to upgrade capacity. Research before purchasing a NAS if possible; consider searching the internet using a phrase such as Guide to Network Storage.
For a NAS device to be supportable, the data saved to the repository must remain in the exact state in which the Core stored it. For this reason, for AppAssure 5, Quest does not support NAS devices that have their own built-in deduplication features if those features are enabled.
Sufficient input/output (I/O) transfer speed will yield the best results for backing up to the repository. Quest recommends hard drives of at least 7200 RPM with good access speeds. Quest recommends transfer speeds of at least 30 megabytes (MB) per second, with an absolute minimum of 10 MB per second. If transfer speeds appear to be below 10 MB per second, the cause is most likely (a) insufficient hardware, (b) hardware that would be sufficient but is too heavily tasked (multi-purposed, or poor at handling multiple operations), or (c) a saturated network that is acting as a bottleneck for the transfer.
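As an illustration of how the two thresholds above might be applied to a measured transfer, the sketch below converts a byte count and duration into MB per second and compares it with the 30 MB and 10 MB figures. The function name and the result labels are hypothetical, and the sketch assumes decimal megabytes (1 MB = 1,000,000 bytes), which the text does not specify.

```python
RECOMMENDED_MB_PER_S = 30.0  # recommended transfer speed stated in the text
MINIMUM_MB_PER_S = 10.0      # minimum transfer speed stated in the text


def classify_transfer_speed(bytes_transferred: int, seconds: float) -> str:
    """Classify an observed transfer against the recommended and minimum speeds."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    # Assumption: decimal megabytes (10**6 bytes).
    mb_per_s = bytes_transferred / 1_000_000 / seconds
    if mb_per_s >= RECOMMENDED_MB_PER_S:
        return "meets recommendation"
    if mb_per_s >= MINIMUM_MB_PER_S:
        return "above minimum, below recommendation"
    return "below minimum"


# 1.2 GB moved in 60 seconds is 20 MB per second.
print(classify_transfer_speed(1_200_000_000, 60))
```

A result of "below minimum" is the point at which the three likely causes listed above (insufficient hardware, over-tasked hardware, or a saturated network) are worth investigating.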
AppAssure 5 administrators should be aware that NAS devices are susceptible to the same environmental stresses as other systems on the network. Factors that affect network performance include number of concurrent users, network load, number of operations, frequency of backups, and other issues familiar to network administrators. You may consider optimizing your retention policy.
About Storage in the Cloud
A repository is used to store the snapshots that are captured from your protected workstations and servers. The repository can reside on different storage technologies such as Direct Attached Storage (DAS), Storage Area Network (SAN), or Network Attached Storage (NAS). However, the primary repository should never be stored on NAS devices that tier to the cloud. These devices tend to have performance limitations when used as primary storage.
Cloud storage can be used for a replicated core in the cloud. The source core is typically located within an enterprise, with its repository stored on a DAS, SAN, or NAS. The replicated core (target) can be hosted in the cloud (for example, using a service such as eFolder or Amazon Cloud).
The following recommendations suggest optimal configurations depending on the size of the environment:
Troubleshooting NAS Issues
When experiencing NAS issues, follow the procedures below:
Modifying Transfer Settings
In AppAssure 5, you can modify the settings to manage the data transfer processes for a protected machine.
General NAS Troubleshooting Tips
Many NAS issues are difficult to diagnose, often seeming to indicate other errors. Generally, NAS issues present as connection issues between the Core and the NAS device. Investigating such issues can then reveal environmental factors (a slow network, saturated network utilization, and so on) that help pinpoint the problem. Because errors differ dramatically with each NAS, try to test and isolate results.
Ensure that the NAS is dedicated exclusively to holding the repository. Because saving Replay 4 or AppAssure 5 snapshots to the Core is I/O-intensive, and because a NAS requires this traffic to flow through the network, a key step in resolving dropped connections when using a NAS is confirming that the NAS is strictly dedicated to supporting the repository.
Reboot the NAS, and if connection problems recur, consider evaluating whether the NAS device is robust enough for your environment.
To test if the issue is the NAS device, you can create a second repository on a network share on a different server. If you see the primary repository go offline again, but the new repository does not, this is an indicator that the issue is the NAS device.
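When comparing the primary repository against a second repository on a different server, it can help to record exactly when each location becomes unreachable. The following is a minimal sketch of such a probe, not an AppAssure feature. It only checks whether each path can be seen as a directory from the machine running the script, and the function names and any paths you pass in are assumptions. Note that a check against an unresponsive network share may block for some time before it returns.

```python
import os
import time
from datetime import datetime


def probe(paths: list[str]) -> dict[str, bool]:
    """Report whether each path is currently reachable as a directory."""
    return {path: os.path.isdir(path) for path in paths}


def watch(paths: list[str], rounds: int, interval_s: float = 60.0) -> list[tuple[str, str]]:
    """Probe the paths `rounds` times and return (timestamp, path) for every miss."""
    misses = []
    for i in range(rounds):
        for path, reachable in probe(paths).items():
            if not reachable:
                stamp = datetime.now().isoformat(timespec="seconds")
                misses.append((stamp, path))
        if i < rounds - 1:
            time.sleep(interval_s)  # wait before the next round
    return misses
```

For example, calling `watch` with the path of the NAS share and the path of the test share on the other server (both hypothetical) returns a list of timestamps at which either one was unreachable. Misses that appear only for the NAS path are consistent with the NAS device being the cause.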
You should also investigate the network as a potential cause of dropped connections or slow performance.
You may consider using a SAN instead of a NAS as your storage approach for the repository. Note also that many NAS devices can be used as direct-attached storage if network problems persist.
Frequently Asked Questions
Q: Why does my NAS device sometimes drop its connection?
A1: Flooding the NAS with read/write requests can cause it to drop from the network. Ensure the only function of the NAS is to serve as the primary repository and ensure that it is not tiered to the cloud.
A2: Another cause is defective hardware. The NAS may be suffering from a degrading disk or a faulty network card which causes these drops. This can be difficult to detect when everything else looks to be in operation, and small writes or transfers that users may be performing are not affected. If large transfers fail and small ones do not, look to the hardware as a potential cause.
A3: If the amount of I/O traffic is overwhelming your NAS device, you can try updating the transfer setting queue depth to slow the transfer rates, which may allow the NAS to catch up.
Q: How can I verify if my NAS device itself is the cause for slow performance or dropped mounts/repository connections?
A: Check the system and hardware logs to verify the cause of dropped connections.
One test you can run is to perform a single backup job, and then perform multiple simultaneous jobs. Some devices appear to be functioning perfectly fine with a single read/write task, but as soon as you add a second task, the I/O rates plummet.
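Outside of the product, a rough approximation of this single-task versus multiple-task comparison is to time file writes to the share with one writer and then with several. The sketch below is an assumption-laden stand-in for real backup jobs, not a reproduction of them: it writes temporary files of random data from parallel threads, forces them to disk, and reports aggregate MiB per second. The function names are hypothetical.

```python
import os
import tempfile
import time
from concurrent.futures import ThreadPoolExecutor

CHUNK = os.urandom(1024 * 1024)  # 1 MiB of random data, reused for every write


def write_file(directory: str, size_mb: int) -> None:
    """Write size_mb MiB to a temporary file in `directory`, sync it, then delete it."""
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as handle:
            for _ in range(size_mb):
                handle.write(CHUNK)
            handle.flush()
            os.fsync(handle.fileno())  # make sure the data reaches the device
    finally:
        os.remove(path)


def throughput_mb_s(directory: str, writers: int, size_mb: int) -> float:
    """Return aggregate MiB per second when `writers` threads each write size_mb MiB."""
    started = time.perf_counter()
    with ThreadPoolExecutor(max_workers=writers) as pool:
        # list() forces completion and re-raises any error from a writer.
        list(pool.map(lambda _: write_file(directory, size_mb), range(writers)))
    elapsed = time.perf_counter() - started
    return writers * size_mb / elapsed
```

Calling `throughput_mb_s` on a directory on the share with `writers=1` and then with a higher value such as `writers=4` gives two figures to compare. A device that handles concurrent I/O well should show a similar or higher aggregate rate with more writers, whereas a sharp drop matches the behavior described above.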
You can also create a second repository on another network share on a different server. If the primary repository goes offline again but the new repository does not, that points to the NAS device as the cause. In that case, consider upgrading the NAS or using SAN or DAS instead of NAS for your repository.
Q: If I am still experiencing problems, what is the most efficient way to get support for my NAS issue?
A: Contact your Quest Support representative by email or the web. Include the following information: