SharePlex replication is used to keep a high-availability secondary server ready for planned or unplanned failover. An unplanned failover can occur without notice, whereas a planned failover is, as the name suggests, scheduled in advance. Meeting certain conditions helps make a planned failover seamless. Although an unplanned failover comes with an element of surprise, the failover preparedness discussed in the Resolution section should also help mitigate some of the issues that can delay it.
Best practices when failing over to an HA environment.
To perform a planned failover to the secondary, the following conditions should ideally be met:
1. The tables on the secondary should not be out of sync with those on the primary. If any tables are out of sync before the failover, then the former secondary (now the primary) cannot be treated as a trusted source after the failover. Once the applications fail over to the new primary and start sending data to the new secondary, the lost data can no longer be recovered. For this reason, out-of-sync tables need to be resynchronized before the failover. If the config file is not very large, or only a few tables are in replication, one can proactively run a compare to identify and fix any out-of-sync tables.
2. Make sure that all processes for replication from the primary to the secondary are running and that the queues are empty before making the switch. Otherwise, the failover will be delayed.
3. Make sure that there are no issues with any of the SharePlex processes in the reverse replication path, that is, from the secondary to the primary. The stopped processes, such as Post on the primary or Export on the secondary, cannot be started to verify this, but processes such as Capture can be checked to confirm that they are running and current with Oracle. If Capture is not current, wait until it catches up before failing over to the secondary. Use "show capture detail" to see whether Capture is current with Oracle.
4. If the Export queue on the secondary or the Post queue on the primary contains messages before the failover to the secondary, those messages must be removed from the queue. Such messages do not belong there and are most likely the result of direct user activity on the secondary before the failover. Remove them using the qview utility (see solution # SOL540 for details or contact Support), because allowing them to post to the former primary after the failover will result in out-of-sync data.
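
The four conditions above can be expressed as a small readiness check. The following is a minimal sketch in Python and is not part of SharePlex: the ReplicationState class, its field names, and the failover_blockers function are all hypothetical, and the values are assumed to be gathered by hand or by your own tooling (for example, from the output of "show capture detail"). The sketch does not run any SharePlex command itself.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class ReplicationState:
    """Hypothetical snapshot of both replication directions before a failover."""

    # Condition 1: tables known to be out of sync on the secondary.
    out_of_sync_tables: list[str] = field(default_factory=list)
    # Condition 2 (primary -> secondary): process name -> is it running?
    forward_processes: dict[str, bool] = field(default_factory=dict)
    # Condition 2 (primary -> secondary): queue name -> messages still waiting.
    forward_queue_backlog: dict[str, int] = field(default_factory=dict)
    # Condition 3 (secondary -> primary): state of the reverse Capture process.
    reverse_capture_running: bool = True
    reverse_capture_current: bool = True
    # Condition 4: Export queue on secondary / Post queue on primary -> messages.
    reverse_queue_messages: dict[str, int] = field(default_factory=dict)


def failover_blockers(state: ReplicationState) -> list[str]:
    """Return one message per unmet condition; an empty list means ready."""
    blockers: list[str] = []

    if state.out_of_sync_tables:
        tables = ", ".join(sorted(state.out_of_sync_tables))
        blockers.append(f"condition 1: out-of-sync tables: {tables}")

    stopped = sorted(
        name for name, running in state.forward_processes.items() if not running
    )
    if stopped:
        blockers.append(f"condition 2: stopped processes: {', '.join(stopped)}")

    backlog = sorted(
        name for name, count in state.forward_queue_backlog.items() if count > 0
    )
    if backlog:
        blockers.append(f"condition 2: queues not empty: {', '.join(backlog)}")

    if not state.reverse_capture_running:
        blockers.append("condition 3: reverse Capture is not running")
    elif not state.reverse_capture_current:
        blockers.append("condition 3: reverse Capture is not current with Oracle")

    stray = sorted(
        name for name, count in state.reverse_queue_messages.items() if count > 0
    )
    if stray:
        blockers.append(f"condition 4: stray messages in: {', '.join(stray)}")

    return blockers
```

Calling failover_blockers on a snapshot with no out-of-sync tables, all forward processes running, empty queues, and a current reverse Capture returns an empty list. Any other result lists what to resolve before making the switch. Returning every unmet condition, rather than stopping at the first, lets the whole checklist be reviewed in one pass.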