How to resolve disk space shortage
This topic helps you resolve disk space issues that can occur when something interferes with replication. For possible causes, see Solve replication problems.
How to conserve disk space on the target
SharePlex captures and processes data much faster than the Post process can apply it with SQL statements on the target system, so the target is where most disk problems occur, assuming the network is operational and data is being sent from the source. If you think the post queue may exceed its disk space, there may be enough free space on the source system to store the data temporarily until the post queue clears out.
- Stop the Import process.
- Let the data accumulate on the source system until Post processes enough messages to clear the post queue.
- Start Import.
- Continue to stop and start Import until the amount of data accumulating in the post queue levels out.
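The stop-and-start cycle above amounts to a simple high-water/low-water rule. The following is a minimal sketch of that decision logic only, not a SharePlex feature: the function name, the thresholds, and the backlog samples are hypothetical, and how you read the post queue backlog and stop or start Import is left to the steps above.

```python
def next_import_state(import_running: bool, post_backlog: int,
                      high_water: int, low_water: int) -> bool:
    """Return True if Import should be running after this check.

    All names and thresholds here are illustrative assumptions.
    """
    if import_running and post_backlog >= high_water:
        return False  # post queue is filling: hold data on the source
    if not import_running and post_backlog <= low_water:
        return True   # Post has caught up: let data flow again
    return import_running  # inside the band: leave Import as it is


# Hypothetical backlog samples (messages in the post queue) over time.
samples = [100, 600, 900, 700, 300, 150, 400]
running = True
history = []
for backlog in samples:
    running = next_import_state(running, backlog,
                                high_water=800, low_water=200)
    history.append(running)
print(history)
```

Using two thresholds instead of one keeps Import from being toggled on every check when the backlog hovers near a single limit.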
When you implement this method, monitor the replication services and disk usage on the source system. On Unix and Linux systems, you can use the sp_ps script to monitor processes and the sp_qstatmon monitoring script to monitor the queues. On Windows systems, you can use the Sp_Nt_Mon utility to monitor those components. See Monitor SharePlex for more information.
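As a complement to the SharePlex monitoring scripts, free space on the source file system can be checked with a few lines of standard-library Python. This is a sketch under assumptions: the 10 percent threshold is arbitrary, and the path you pass should be whichever directory holds the queues on your system (the current directory is used here only so the example runs anywhere).

```python
import shutil


def free_fraction(path: str) -> float:
    """Return the fraction of the file system holding `path` that is free."""
    usage = shutil.disk_usage(path)
    if usage.total == 0:
        return 0.0  # defensive: avoid dividing by zero on unusual mounts
    return usage.free / usage.total


def is_low_on_space(path: str, min_free: float = 0.10) -> bool:
    """True when free space drops below `min_free` (assumed threshold)."""
    return free_fraction(path) < min_free


# "." stands in for the queue directory, which is site-specific.
fraction = free_fraction(".")
print(f"free: {fraction:.1%}, low: {is_low_on_space('.')}")
```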
How to restore disk space
If a queue disk runs out of disk space, you may see messages similar to this in the Event Log:
11/22/07 14:14 System call error: No space left on device bu_wt.write [sp_mport(que)/1937472]
11/22/07 14:14 System call error: No space left on device bu_rls.bu_wt [sp_mport(que)/1937472]
11/22/07 14:14 Error: que_BUFWRTERR: Error writing buffer to file que_writecommit(irvspxuz+P+o.a920a64z-o.a102a64z) [sp_mport(rim)/1937472]
11/22/07 14:14 Error: sp_mport: rim_writecommit failed 30 - exiting [sp_mport/ 1937472]
11/22/07 14:14 Process exited sp_mport (from irvspxuz.domain.com queue irvspxuz) [pid = 1937472] - exit(1)
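To spot these messages without reading the log by hand, a short script can filter for the "No space left on device" text. The sketch below assumes each entry starts with a date and time in the layout shown in the sample above; the actual Event Log layout on your system may differ, so treat the pattern as illustrative.

```python
import re

# Assumed layout, based only on the sample messages above:
# "MM/DD/YY HH:MM <message text>"
LINE_RE = re.compile(r"^(\d{2}/\d{2}/\d{2} \d{2}:\d{2}) (.*)$")


def disk_full_events(lines):
    """Return (timestamp, message) pairs for out-of-space errors."""
    hits = []
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match and "No space left on device" in match.group(2):
            hits.append((match.group(1), match.group(2)))
    return hits


sample = [
    "11/22/07 14:14 System call error: No space left on device "
    "bu_wt.write [sp_mport(que)/1937472]",
    "11/22/07 14:14 System call error: No space left on device "
    "bu_rls.bu_wt [sp_mport(que)/1937472]",
    "11/22/07 14:14 Process exited sp_mport (from irvspxuz.domain.com "
    "queue irvspxuz) [pid = 1937472] - exit(1)",
]
events = disk_full_events(sample)
for timestamp, message in events:
    print(timestamp, "|", message)
```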
If a queue disk is almost out of free space, you might be able to add disk space without the need to resynchronize the data.
To restore disk space
- Stop SharePlex on the affected system.
- Add more disk space.
- Start SharePlex.
View the Event Log and look for the messages "queue recovery started" and "queue recovery complete."
- If both messages are there, SharePlex resumes processing where it stopped and the recovery succeeded. If your applications generate high volumes of transactions, there may be numerous backlogged messages in the queues. Depending on the nature of the transactions, how well the target database and the Post process are tuned, and your tolerance for latency, it might be more practical to resynchronize the data instead of waiting for replication to regain parity with transactional activity.
- If one or more queues are corrupted, the Event Log records a message such as Bad header magic... or peekahead failure. Alternatively, you see the queue recovery started message but not the queue recovery complete message that signifies successful queue recovery. In either case, you must restore replication to an initial state.
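The two outcomes above can be expressed as a small classification over the Event Log lines written after the restart. This is a sketch, not a SharePlex utility: the function name and return labels are hypothetical, the sample lines are invented for illustration, and the match relies only on the message fragments quoted in this topic.

```python
def classify_recovery(log_lines):
    """Classify queue recovery from Event Log lines written after restart.

    Returns one of three hypothetical labels:
    "recovered", "restore-initial-state", or "unknown".
    """
    text = "\n".join(log_lines).lower()
    corrupt = "bad header magic" in text or "peekahead failure" in text
    started = "queue recovery started" in text
    complete = "queue recovery complete" in text

    if corrupt or (started and not complete):
        return "restore-initial-state"
    if started and complete:
        return "recovered"
    return "unknown"  # neither message found: inspect the log manually


# Invented example lines; real entries carry timestamps and process details.
ok_log = ["queue recovery started", "queue recovery complete"]
stuck_log = ["queue recovery started"]
bad_log = ["queue recovery started", "peekahead failure",
           "queue recovery complete"]

print(classify_recovery(ok_log))
print(classify_recovery(stuck_log))
print(classify_recovery(bad_log))
```

Pass only the lines logged since SharePlex was restarted; otherwise messages from an earlier recovery could mask the current result.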
To restore replication to an initial state
- Run db_cleansp to restore the variable-data directory and SharePlex tables. You must run it on all systems in the affected replication configuration. See the utilities documentation in the SharePlex Reference Guide.
- Synchronize the data using your method of choice, then reactivate the configuration. See Activate replication in your production environment.
- You can prevent this problem from occurring again by using the SharePlex monitoring utilities to start unattended monitoring of key replication events, including queue volume alerts. See Monitor SharePlex.