When starting Shareplex on a server that already has other Shareplex instances running, Shareplex fails to start and the following errors appear in the event_log. Any of the Shareplex processes may report semaphore errors, depending on whether the instance is a source or a target:
System call error: Software caused connection abort tcp server accept [sp_cop/8884]
Error: tcp error - exiting [sp_cop/8884]
Notice: Result (22) Invalid argument calling semop to get semaphore (3008) [sp_xport(shs)/8904]
Error: sem_enter: sem_num = 2 semval = -1 semncnt = -1 semzcnt = -1 pid = -1 ctr = 126 [sp_xport(que)/8904]
Error: sp_xport: failed to read export queue - exiting [sp_xport/8904]
Notice: Result (22) Invalid argument calling semop to get semaphore (3008) [sp_ocap(shs)/8899]
Error: sem_enter: sem_num = 1 semval = -1 semncnt = -1 semzcnt = -1 pid = -1 ctr = 126 [sp_ocap(que)/8899]
Internal error: 10403 - que_write() que_SEMERR: Semaphore error [sp_ocap/8899]
Notice: dumping data to /splex/varcred2500/dump/opc-0 [sp_ocap/8899]
Notice: Result (22) Invalid argument calling semop to get semaphore (3008) [sp_ordr(shs)/8905]
The error can also occur while Shareplex is already running, when an activity such as activation is carried out. The activity may appear to get no response, but the event log entries reveal the problem with shared memory.
The problem is caused either by insufficient shared memory or by multiple Shareplex instances on the same server interfering with each other when acquiring shared memory and semaphores.
WORKAROUND 1:
Kill any orphaned Shareplex processes and then try to launch Shareplex again.
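One way to spot leftover processes is sketched below. The helper name list_sp_procs is hypothetical, and the sketch assumes that every Shareplex process name starts with sp_, as the names in the event log above do (sp_cop, sp_xport, sp_ocap, sp_ordr).

```shell
# Filter `ps -ef`-style output on stdin, keeping only lines that contain a
# word or path component starting with "sp_" (sp_cop, sp_xport, ...).
# Usage: ps -ef | list_sp_procs
list_sp_procs() {
    grep -E '(^|[ /])sp_[a-z]+'
}
```

Review the listed PIDs by hand and make sure they belong to the instance being restarted before killing them, since other Shareplex instances on the same server run processes with the same names.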
WORKAROUND 2:
Remove or rename the shmaddr.loc and shstinfo.ipc files located in $SP_SYS_VARDIR/rim and restart Shareplex.
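A minimal sketch of the rename step follows. The function name rename_rim_files and the .old suffix are arbitrary choices for illustration. Run it only after Shareplex has been shut down.

```shell
# Rename shmaddr.loc and shstinfo.ipc in the given rim directory so that
# they are kept for reference but no longer picked up under their old names.
# Usage: rename_rim_files "$SP_SYS_VARDIR/rim"
rename_rim_files() {
    rim_dir=$1
    for f in shmaddr.loc shstinfo.ipc; do
        if [ -f "$rim_dir/$f" ]; then
            mv "$rim_dir/$f" "$rim_dir/$f.old"
        fi
    done
}
```

Renaming rather than deleting keeps the old files available in case Support asks for them.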
WORKAROUND 3:
Remove hung semaphores as root using the following as a guideline before starting Shareplex again:
1. sp_ctrl> shutdown
Ensure that all the sp_ processes have stopped before continuing.
2. cd to the $SP_SYS_VARDIR/rim directory (the rim directory under the Shareplex variable-data directory) and run:
# od -x shmaddr.loc
# od -x shstinfo.ipc
This will generate a set of numbers, like below:
od -x shmaddr.loc
0000000 0000 1556 fb00 0000 4452 17ca 0200 0000
0000020 0009 0014
0000024
od -x shstinfo.ipc
0000000 0000 1555 fd40 0000 4152 17ca 0040 0000
0000020 0009 0013
0000024
Identify the IPC resources to remove by looking at the 7th column on the first line of the od -x output, counting the leading offset (0000000) as the first column. In the example above, that value is 17ca for both files. Run ipcs -a | grep 17ca to list the shared memory segments (m) and semaphores (s) whose keys contain that value. The output resembles the following:
m 5461 0x415217ca --rw-r--r-- root dbaerp root dbaerp
5 4194304 25086 2465 10:30:01 10:30:01 5:41:13
m 5462 0x445217ca --rw-r--r-- root dbaerp root dbaerp
5 33554432 25086 2465 10:30:01 10:30:01 5:41:13
s 589843 0x415217ca --ra-ra-ra- root dbaerp root dbaerp
12 10:30:01 5:41:13
s 589844 0x445217ca --ra-ra-ra- root dbaerp root dbaerp
2 10:30:22 5:41:13
Use the ipcrm command to remove the unwanted shared memory segments (-m) and semaphores (-s). In this example, one would run the following:
ipcrm -m 5461
ipcrm -m 5462
ipcrm -s 589843
ipcrm -s 589844
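The lookup above can be sketched as two small helpers. The function names are hypothetical. The sketch assumes the od -x layout and the ipcs -a column order (type, id, key, mode, ...) shown in this example, and ipcs output differs between platforms. It only prints the ipcrm commands so they can be reviewed before anything is removed.

```shell
# Print the 7th whitespace-separated field of the first line of `od -x`
# output read from stdin (the leading offset counts as field 1).
ipc_key_suffix() {
    awk 'NR == 1 { print $7 }'
}

# Read `ipcs -a` output on stdin and print (not run) one ipcrm command for
# each shared memory segment (m) or semaphore (s) whose key ends in the
# given suffix. Wrapped continuation lines are ignored because their first
# field is neither "m" nor "s".
print_ipcrm_commands() {
    awk -v sfx="$1" '
        ($1 == "m" || $1 == "s") && $3 ~ (sfx "$") { print "ipcrm -" $1, $2 }
    '
}
```

For example, suffix=$(od -x shmaddr.loc | ipc_key_suffix) followed by ipcs -a | print_ipcrm_commands "$suffix" prints the candidate commands. Check each one against the ipcs listing before running it as root.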
WORKAROUND 4:
Check the paramdb file located in $SP_SYS_VARDIR/data to see whether the parameters SP_SHS_IPCKEY and SP_QUE_IPCKEY are configured. If they are not, configure them as described below.
The parameters SP_QUE_IPCKEY and SP_SHS_IPCKEY are configured to ensure each sp_cop has its own unique shared memory and semaphores. When one runs multiple instances of Shareplex on the same server, these parameters may be helpful. Here is the description of the parameters:
SP_QUE_IPCKEY
This parameter creates a unique key for the SharePlex queues. This value must be different from the value set for the SP_SHS_IPCKEY parameter.
Default: D
Range of valid values: a character string
Takes effect: as soon as it is activated
SP_SHS_IPCKEY
This parameter creates a unique key for the statistics shared memory segment and semaphore. This value must be different from the value set for the SP_QUE_IPCKEY parameter.
Default: A
Range of valid values: a character string
Takes effect: as soon as it is activated
Either shut down Shareplex and enter the parameters directly in the paramdb file using a text editor, or follow the procedure below:
1. Within sp_ctrl, set the following parameters:
sp_ctrl> set param SP_QUE_IPCKEY A
sp_ctrl> set param SP_SHS_IPCKEY a
2. Shut down Shareplex.
3. Verify that no Shareplex processes remain and that no stale shared memory segments or semaphores are left behind.
4. Start Shareplex.
These two parameters can be configured in the same way on each Shareplex instance. Make sure that the values set for one instance are not reused by another. For example, if the values A and a are set for the first instance, set a different pair for the second instance, say B and b, and so on.
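To check what each instance currently has in its paramdb, a simple text search is usually enough. The helper below is a hypothetical sketch. It makes no assumption about the exact paramdb line format and only prints the lines that mention each parameter name.

```shell
# Print every line of the given file that mentions each named parameter,
# or a note when a parameter does not appear in the file at all.
# Usage: show_params "$SP_SYS_VARDIR/data/paramdb" SP_QUE_IPCKEY SP_SHS_IPCKEY
show_params() {
    file=$1
    shift
    for name in "$@"; do
        grep -- "$name" "$file" || echo "$name: not set in $file"
    done
}
```

Running it against the paramdb of each instance in turn makes it easy to compare the key values side by side. The same helper can be used for other parameter names.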
WORKAROUND 5:
Check the paramdb file to see whether the parameters SP_QUE_SHMDBUF and SP_SHS_SHMSIZE are set higher than the defaults for one's version of Shareplex. If so, set them back to their default values and try starting Shareplex again.
WORKAROUND 6:
If none of the above workarounds succeeds, reboot the server if feasible and then try starting Shareplex again. If the problem persists, contact Support for further options.
These workarounds can be tried for any shared memory issue. They do no harm, because sp_cop simply re-creates the shared memory. When removing shared memory segments or semaphores, however, take care to remove only the ones that belong to the affected Shareplex instance.