INTERNAL - IPC timeouts lead to NVSD crash (4045269)


Title

INTERNAL - IPC timeouts lead to NVSD crash
Description

SmartDisk appears to "crash" or go offline and needs to be restarted. The failure is characterised by the following messages appearing near the end of the monitortrace file:
Slave has exited unexpectedly with status 0
Slave seems to have exited unexpectedly!
These are coupled with corresponding messages in the PercolatorSlave trace file indicating that the monitor exited at the same time.
This pair of messages across the two processes indicates a timeout scenario, rather than one process actually exiting.
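As a rough aid for triage, the check for the monitor-side signature can be sketched in Python. This is only an illustration: the two message strings are copied from this article, while the function name, the `tail_lines` parameter and its default of 200 are assumptions made for the example.

```python
# Message text is copied from the article; everything else is illustrative.
MONITOR_SIGNATURES = (
    "Slave has exited unexpectedly with status 0",
    "Slave seems to have exited unexpectedly!",
)


def tail_has_signature(trace_text, tail_lines=200):
    """Return True if every signature message appears in the last tail_lines lines.

    tail_lines is an assumed window for "near the end of the file".
    """
    tail = trace_text.splitlines()[-tail_lines:]
    return all(any(sig in line for line in tail) for sig in MONITOR_SIGNATURES)
```

The function takes the trace contents as a string, so it can be fed from wherever the monitortrace file is stored on the system in question. The matching messages in the PercolatorSlave trace file still need to be confirmed by hand, because their exact wording is not quoted here.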
Cause

NVSD regularly checks that all processes are running. If the check reports that a process has died, NVSD shuts down all other processes. The check also has a timeout of 120 seconds: if it does not complete within that time, NVSD shuts down all processes.
A timeout may be due to a failure of the communications channel between the PercolatorMonitor and PercolatorSlave processes, or it may be caused by external system conditions blocking the SmartDisk processes for a long time. This can happen, for example, under high system load, particularly when there are a large number of items on the dedupe queue or a large backlog of retirement requests in the NVBU database.
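The decision described above can be modelled in a few lines of Python. This is a simplified sketch of the behaviour as this article describes it, not NVSD's actual code. The function name, its arguments and the returned strings are hypothetical; only the 120-second timeout comes from the text.

```python
CHECK_TIMEOUT_SECONDS = 120  # timeout stated in the article


def check_outcome(all_alive, elapsed_seconds):
    """Classify one liveness check (hypothetical model, not NVSD's code).

    all_alive: result of the check, ignored if the check timed out.
    elapsed_seconds: how long the check took to complete.
    """
    if elapsed_seconds >= CHECK_TIMEOUT_SECONDS:
        # The check did not complete in time, so it is treated as a failure
        # even though no process has necessarily exited.
        return "shutdown: check timed out"
    if not all_alive:
        return "shutdown: process died"
    return "ok"
```

The point of the model is that both failure paths end in a shutdown, which is why a blocked but otherwise healthy system can look like a crash.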
Resolution

If this is purely an IPC timeout, it may be possible to work around the problem temporarily by adding the following lines to $IDP_ROOT/foundation/etc/percolator.cfg and restarting SmartDisk:
[CheckExitedProcess]
MaxFails=N
This causes the PercolatorSlave to survive N consecutive timeouts before shutting down. N may be set in the range 1 to 20; the default is 1. Start with a low value such as 5, because higher values could delay a shutdown in conditions where NetVault really should shut down.
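The effect of MaxFails can be illustrated with a small Python sketch. Both helpers are hypothetical and exist only for this example. The sketch assumes that a shutdown happens once the number of consecutive timeouts reaches MaxFails, and that a successful check resets the count; that is one reading of the description above and has not been confirmed against the product.

```python
def render_fragment(max_fails):
    """Return the percolator.cfg lines for a MaxFails value in the documented range."""
    if not 1 <= max_fails <= 20:
        raise ValueError("MaxFails must be in the range 1 to 20")
    return f"[CheckExitedProcess]\nMaxFails={max_fails}\n"


def shuts_down(timeouts, max_fails=1):
    """Return True if a sequence of check results would trigger a shutdown.

    timeouts: iterable of bools, True meaning that check timed out.
    Assumption: a successful check resets the consecutive-timeout count.
    """
    consecutive = 0
    for timed_out in timeouts:
        consecutive = consecutive + 1 if timed_out else 0
        if consecutive >= max_fails:
            return True
    return False
```

Under this model, the default of 1 shuts down on the first timeout, while a value of 5 rides out short bursts of timeouts and still shuts down if five occur in a row.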
If it is unclear whether a crash is a timeout or an exited process, escalate to R&D for guidance.
