Both HA (High Availability) FMSs in the same cluster start up as primary HA servers and remain primaries, or a secondary server starts as primary and takes over the cluster while the primary server is still running.
The server starts as primary, with only a single server identified in the cluster:
YYYY-MM-DD hh:mm:ss.SSS INFO [forge-startup] com.quest.nitro.service.ServiceMBeanUtil - Cluster Manager Service Starting...
YYYY-MM-DD hh:mm:ss.SSS VERBOSE [forge-startup] STDOUT -
-------------------------------------------------------------------
GMS: address=FMS-NODE02-33659, cluster=HA_PARTITION, physical address=IP_ADDRESS:7800
-------------------------------------------------------------------
YYYY-MM-DD hh:mm:ss.SSS VERBOSE [forge-startup] com.quest.nitro.service.cluster.jgroups.JGroupsClusterManager - Received new cluster view: [FMS-NODE02-33659|0] (1) [FMS-NODE02-33659]
YYYY-MM-DD hh:mm:ss.SSS INFO [forge-startup] com.quest.nitro.service.ha.HAMembershipService - Running as the primary server of the partition HA_PARTITION with 1 servers in total.
YYYY-MM-DD hh:mm:ss.SSS INFO [forge-startup] com.quest.nitro.service.ServiceMBeanUtil - Cluster Manager Service Started.
Depending on the configuration, the server may recognize the other node after a few minutes and take over the cluster, triggering a fail-over of the current primary:
YYYY-MM-DD hh:mm:ss.SSS VERBOSE [jgroups-15,HA_PARTITION,FMS-NODE02-33659] com.quest.nitro.service.cluster.jgroups.JGroupsClusterManager - Received new cluster view: MergeView::[FMS-NODE02-33659|1] (2) [FMS-NODE02-33659, FMS-NODE01-45279], 2 subgroups: [FMS-NODE02-33659|0] (1) [FMS-NODE02-33659], [FMS-NODE01-45279|0] (1) [FMS-NODE01-45279]
YYYY-MM-DD hh:mm:ss.SSS INFO [jgroups-15,HA_PARTITION,FMS-NODE02-33659] com.quest.nitro.service.ha.HAMembershipService - Running as the primary server of the partition HA_PARTITION with 2 servers in total.
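To spot this pattern in a server log, the two messages shown above can be searched for programmatically. The following is a minimal sketch, assuming the log lines keep the wording of the excerpts above. The function name `scan_startup_log` and the sample list are illustrative, not part of Foglight.

```python
import re

# Matches the HAMembershipService message shown in the log excerpts above.
PRIMARY_RE = re.compile(
    r"Running as the primary server of the partition (\S+) with (\d+) servers in total"
)

# Marker that JGroups merged two previously separate subgroups.
MERGE_MARKER = "Received new cluster view: MergeView::"


def scan_startup_log(lines):
    """Return (primary_claims, merge_seen) for an iterable of log lines.

    primary_claims is a list of (partition, server_count) tuples, one per
    "Running as the primary server" message found.
    """
    primary_claims = []
    merge_seen = False
    for line in lines:
        match = PRIMARY_RE.search(line)
        if match:
            primary_claims.append((match.group(1), int(match.group(2))))
        if MERGE_MARKER in line:
            merge_seen = True
    return primary_claims, merge_seen


# Sample lines shortened from the excerpts above (timestamps are placeholders).
SAMPLE_LINES = [
    "YYYY-MM-DD hh:mm:ss.SSS INFO [forge-startup] com.quest.nitro.service.ha.HAMembershipService - "
    "Running as the primary server of the partition HA_PARTITION with 1 servers in total.",
    "YYYY-MM-DD hh:mm:ss.SSS VERBOSE [jgroups-15,HA_PARTITION,FMS-NODE02-33659] "
    "com.quest.nitro.service.cluster.jgroups.JGroupsClusterManager - "
    "Received new cluster view: MergeView::[FMS-NODE02-33659|1] (2) [FMS-NODE02-33659, FMS-NODE01-45279]",
    "YYYY-MM-DD hh:mm:ss.SSS INFO [jgroups-15,HA_PARTITION,FMS-NODE02-33659] "
    "com.quest.nitro.service.ha.HAMembershipService - "
    "Running as the primary server of the partition HA_PARTITION with 2 servers in total.",
]

claims, merged = scan_startup_log(SAMPLE_LINES)
print(claims, merged)
```

A primary claim with 1 server followed by a MergeView and a second primary claim with 2 servers corresponds to the take-over sequence described above.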
CAUSE 1
Incorrect configuration.
CAUSE 2
The cluster nodes are not able to communicate with each other.
CAUSE 3
Delays in communication can cause a secondary node to take over the primary role in the cluster during startup and then trigger a fail-over once communication is re-established.
RESOLUTION 1
Review configurations under $FMS_HOME/config/jgroups-config.xml for both the HA Management Server Primary and Secondary.
For additional information, refer to the Foglight - High Availability Field Guide.
RESOLUTION 2
If using TCP, confirm that bidirectional communication is possible on port 7800 (the default port) between the HA Management Server Primary and Secondary servers.
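One way to test the TCP path is a short connection check run from each server against the other. This is a minimal sketch using only the Python standard library. The function name `can_connect` and the host name in the comment are hypothetical, and a successful connect only shows that the port is reachable in that direction, not that JGroups itself is healthy.

```python
import socket


def can_connect(host, port=7800, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


# Example call (hypothetical host name; replace with the other HA server):
# print(can_connect("fms-node01.example.com", 7800))
```

Run the check on the primary against the secondary and again on the secondary against the primary, since a firewall may allow traffic in only one direction.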
RESOLUTION 3
Increase the join_timeout property from 2000 to 5000 (milliseconds) in $FMS_HOME/config/jgroups-config.xml.
Change the following line:
<pbcast.GMS print_local_addr="true" join_timeout="2000" view_bundling="true"/>
To:
<pbcast.GMS print_local_addr="true" join_timeout="5000" view_bundling="true"/>
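If the change has to be applied on several servers, the edit can be scripted. The sketch below is one possible approach, assuming the file contains exactly one pbcast.GMS element with a numeric join_timeout attribute, as in the line above. The function name `set_join_timeout` is illustrative, and the file should be backed up before it is overwritten.

```python
import re

# Captures the text before and after the join_timeout value of pbcast.GMS.
GMS_RE = re.compile(r'(<pbcast\.GMS\b[^>]*?\bjoin_timeout=")(\d+)(")')


def set_join_timeout(xml_text, new_value):
    """Return xml_text with the pbcast.GMS join_timeout set to new_value.

    Raises ValueError unless exactly one matching attribute is found, so an
    unexpected file layout is reported instead of being silently skipped.
    """
    updated, count = GMS_RE.subn(
        lambda m: f"{m.group(1)}{new_value}{m.group(3)}", xml_text
    )
    if count != 1:
        raise ValueError(f"expected one pbcast.GMS join_timeout, found {count}")
    return updated


before = '<pbcast.GMS print_local_addr="true" join_timeout="2000" view_bundling="true"/>'
print(set_join_timeout(before, 5000))
```

Working on the raw text rather than re-serializing the XML keeps the rest of the file, including comments and formatting, unchanged.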