Only two backups have this issue, the NDMP and Filesystem backups, and both run on the NetVault Server itself. All other backups are from NetVault Clients and do not have this issue. The two affected backups fail about 90% of the time.
The backups hang either at the beginning or at the end of the job. If a backup hangs at the beginning, its Index does not appear in the Restore window.
These two backups are smaller than the NetVault Client backups:
NDMP: 5GB
FS: 1.1GB
The backups are written to a supported HP MSL 6000 tape library that is attached to the Filer via an FC switch. The tape library and the Filer are in the same FC zone, and neither is connected to the NetVault Server.
The NetVault Server was upgraded from 8.5.3 to 8.6.1 Build 9 and was rebooted. The NetApp Filer was upgraded from 7.3.6 to 8.0.2 7-Mode and was also rebooted. The NDMP Plugin is on 7.6.3. All versions are supported. The problem only appeared after the upgrades.
The following messages were logged for NDMP Job ID 422 before the job was aborted because no data was being transferred:
Job Message 2012/02/01 09:14:59 422 Media Netvault (Koetz) Media in 'DRIVE 3:s1800stp001' assigned to job ready for data transfer
Error 2012/02/01 09:16:51 422 Jobs Netvault Fatal error: Aborted by user
The NetVault Server Job Manager trace (nvjobmgr305) shows the following, but at a different time:
2 JOBMGR ??? 4692 100 0 91500086551 Report job status 'Writing to media' to schedule manager
2 JOBMGR ??? 4692 62 0 91500086551 Plugin is now in state Writing to media
2 JOBMGR ??? 4692 100 0 91651189039 Report job status 'Aborting' to schedule manager
2 JOBMGR ??? 4692 221 0 91651189039 Set exit status to 3
2 JOBMGR ??? 4692 168 1 91651189039 Fatal error: Aborted by user
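To judge how long the job sat in 'Writing to media' before it was aborted, the two trace timestamps can be subtracted. The following is a minimal Python sketch, not a NetVault tool. It assumes that the seventh whitespace-separated field of a trace line is a time of day written as H[H]MMSS followed by six microsecond digits; this layout is inferred from the excerpt above and is not confirmed by any documentation. The function names are hypothetical.

```python
from datetime import timedelta


def parse_trace_time(field: str) -> timedelta:
    """Convert a trace time field to a time-of-day offset.

    ASSUMPTION: the field is H[H]MMSS plus 6 microsecond digits,
    e.g. '91500086551' is read as 09:15:00.086551.
    """
    padded = field.zfill(12)  # restore a dropped leading zero in the hour
    hours = int(padded[0:2])
    minutes = int(padded[2:4])
    seconds = int(padded[4:6])
    micros = int(padded[6:12])
    return timedelta(hours=hours, minutes=minutes,
                     seconds=seconds, microseconds=micros)


def trace_time(line: str) -> timedelta:
    """Extract the time from a full trace line.

    ASSUMPTION: the time is the seventh whitespace-separated field.
    """
    return parse_trace_time(line.split()[6])


# Two lines copied from the Job Manager trace excerpt.
start_line = "2 JOBMGR ??? 4692 62 0 91500086551 Plugin is now in state Writing to media"
abort_line = "2 JOBMGR ??? 4692 100 0 91651189039 Report job status 'Aborting' to schedule manager"

elapsed = trace_time(abort_line) - trace_time(start_line)
print(f"Time in 'Writing to media' before abort: {elapsed.total_seconds():.6f} s")
```

Under that assumption, the job spent a little under two minutes in 'Writing to media' without transferring data before it was aborted.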
From the NetVault Server NDMP Trace we see the following, but is
The problem was caused by the NIC Teaming configuration. This is an HP Server with two NICs, and the HP Teaming Agent (Windows > Control Panel) offers a choice of how the Teaming is set up.
While the backups from the NetVault Server were failing, the Teaming was set to "Automatic", which allowed the server to send data at 2 Gb/s but receive at only 1 Gb/s. This is what restricted the NetVault messaging and data channels.
Once the Teaming was set to "Switch assisted Load Balancing" (SLB), the two backups from the NetVault Server worked well.
The Teaming setting was changed by an engineer when the NVBU and ONTAP software were upgraded, but the customer was not told that it had also been changed.