The symptom is that a defunct sp_tconf process may appear after activation finishes and when any command is issued in sp_ctrl. Further debugging shows that sp_cop is stuck in lockf and that the vardir is on a NetApp Filer (NFS).
The issue with NFS-mounted vardirs is that SharePlex uses file locking, and file locking on an NFS-mounted filesystem involves the RPC lock daemon (lockd) on both the NFS client and the NFS server. Because lockd communication uses UDP, the individual packets that make up the lock requests and releases can be lost. A lost packet results in a lock request that hangs forever. Since the loss can occur on either the NFS server or the NFS client, it is difficult to determine where the packet was lost or how to recover without stopping and restarting the lock daemon.
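To check whether lock requests on the vardir hang, a small probe can help. The following is a minimal sketch, not part of SharePlex: the function name probe_lock and the scratch file name .lock_probe are hypothetical, and it assumes a Unix host with Python 3 and that the probe runs in the main thread. It requests a blocking lock the same way a stuck process would and gives up after a timeout. On a hard mount without the intr option, the call may not be interruptible, in which case the probe itself will hang.

```python
import contextlib
import fcntl
import os
import signal


class LockTimeout(Exception):
    """Raised by the alarm handler when the lock request takes too long."""


def _on_alarm(signum, frame):
    raise LockTimeout()


def probe_lock(directory, timeout_s=10):
    """Return True if a blocking lock in `directory` completes within timeout_s seconds."""
    # Hypothetical scratch file; it is removed again before returning.
    probe_path = os.path.join(directory, ".lock_probe")
    fd = os.open(probe_path, os.O_RDWR | os.O_CREAT, 0o600)
    old_handler = signal.signal(signal.SIGALRM, _on_alarm)
    signal.alarm(timeout_s)
    try:
        fcntl.lockf(fd, fcntl.LOCK_EX)  # blocking request for an exclusive lock
        fcntl.lockf(fd, fcntl.LOCK_UN)  # release it again
        return True
    except LockTimeout:
        return False
    finally:
        signal.alarm(0)  # cancel any pending alarm
        signal.signal(signal.SIGALRM, old_handler)
        os.close(fd)
        with contextlib.suppress(OSError):
            os.unlink(probe_path)
```

Calling probe_lock with the path of the vardir returns False when the lock request does not come back in time, which is consistent with the hang described above. A True result only shows that locking worked at that moment.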
RESOLUTION 1:
The llock option controls the behavior of the NFS client during file-locking operations. Typically, NFS uses the Network Lock Manager (NLM) to handle file locking rather than the local file-locking capabilities of the host OS. On Solaris, when NLM is in use, the NFS client disables all NFS file caching for a file as soon as an application locks it. This severely degrades I/O performance on the locked file, since no I/O is cached in the file system buffer cache. Applications that rely on the file system cache for performance, as is the case with most 32-bit applications, are negatively impacted when file locking is used with NFS. The 'llock' NFS mount option was introduced to address this performance issue by using local file locking rather than NLM. Because local file locking arbitrates the lock requests, the file system cache can again be used to satisfy I/O requests.
To enable local file locking on an NFS mount:
On Linux, use the 'nolock' NFS mount option for the SharePlex var dir mount point.
or
On Solaris, use the 'llock' NFS mount option for the SharePlex var dir mount point.
Please consider using the following mount options if placing the SharePlex var dir on an NFS-mounted file system on Linux:
rw,noatime,hard,tcp,rsize=32768,wsize=32768,intr,timeo=600,retrans=2,nfsvers=3,nolock,suid
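As a quick sanity check, the option string can be inspected for a local-locking option before it is used. This is a minimal sketch under stated assumptions: the function names parse_options and uses_local_locking are hypothetical helpers, and the check only looks at the option string as typed (for example in /etc/fstab), not at what the kernel reports for a live mount.

```python
# The option string recommended above for Linux.
RECOMMENDED = ("rw,noatime,hard,tcp,rsize=32768,wsize=32768,intr,"
               "timeo=600,retrans=2,nfsvers=3,nolock,suid")


def parse_options(option_string):
    """Split a comma-separated mount option string into a dict.

    Flags such as 'hard' map to True; 'key=value' items map to the value string.
    """
    options = {}
    for item in option_string.split(","):
        item = item.strip()
        if not item:
            continue
        key, _, value = item.partition("=")
        options[key] = value if value else True
    return options


def uses_local_locking(option_string):
    """Return True if 'nolock' (Linux) or 'llock' (Solaris) is present."""
    options = parse_options(option_string)
    return "nolock" in options or "llock" in options


if __name__ == "__main__":
    print(uses_local_locking(RECOMMENDED))
```

Parsing into a dict rather than searching the raw string avoids false matches on substrings of longer option names.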
RESOLUTION 2:
Place the SharePlex prod dir and var dir on a regular (local) file system instead of an NFS-mounted file system. At a minimum, the SharePlex var dir subdirectories data, config, and state need to be on a local file system; you can use soft links (symbolic links) to accomplish this.
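The soft-link approach can be sketched as follows. This is an illustration only: the function name relocate_subdirs and its arguments are hypothetical, and it assumes that SharePlex has been shut down before any directory is moved and that the local target directory has enough space.

```python
import os
import shutil


def relocate_subdirs(vardir, local_root, names=("data", "config", "state")):
    """Move the named subdirectories of vardir to local_root and leave soft links behind.

    Returns the list of subdirectory names that were actually moved.
    """
    local_root = os.path.abspath(local_root)
    os.makedirs(local_root, exist_ok=True)
    moved = []
    for name in names:
        src = os.path.join(vardir, name)
        dst = os.path.join(local_root, name)
        # Skip entries that are already soft links or do not exist as directories.
        if os.path.islink(src) or not os.path.isdir(src):
            continue
        if os.path.exists(dst):
            raise FileExistsError(dst)
        shutil.move(src, dst)  # copies and deletes when src and dst are on different file systems
        os.symlink(dst, src)   # the original path now points at the local copy
        moved.append(name)
    return moved
```

Running the function a second time moves nothing, because the three paths are then already soft links. Other subdirectories of the var dir are left where they are.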