When running Foglight in HA mode, frequent unexpected HA failovers occur. The FMSes have a large number of Oracle/SQL Server agents. Could this be related to the problem?
The database agents can generate a large number of topology changes. The Foglight HA mechanism is also used to propagate topology changes from the primary FMS to the secondary FMS. If the HA mechanism is busy propagating these changes, the FMSes can fail to respond to liveness checks from each other, which leads to failovers.
Disable the Top SQL collections in the Oracle and SQL Server agents. These collections are a major source of topology changes.
Those experienced in reviewing an FMS diagnostic snapshot can use it to confirm whether there are a large number of changes in the DBO_Top_SQL* and/or DBSS_Top_SQL* objects.
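As a rough aid for that review, the sketch below tallies how many change records fall under the two Top SQL patterns named above. It assumes you have already extracted a plain list of topology type names, one per change record, from the snapshot by hand. The snapshot layout is not described here, so the input format, the function names, and the sample type names are all hypothetical; only the two patterns come from this article.

```python
from collections import Counter
from fnmatch import fnmatchcase

# The two patterns are taken from this article; everything else is illustrative.
TOP_SQL_PATTERNS = ("DBO_Top_SQL*", "DBSS_Top_SQL*")


def tally_top_sql_changes(type_names):
    """Count change records per Top SQL type name.

    `type_names` is an iterable of topology type names, one per change
    record, extracted manually from a snapshot (hypothetical input format).
    """
    counts = Counter()
    for raw in type_names:
        name = raw.strip()
        # Case-sensitive glob match against either Top SQL pattern.
        if any(fnmatchcase(name, pattern) for pattern in TOP_SQL_PATTERNS):
            counts[name] += 1
    return counts


def top_sql_share(type_names):
    """Return the fraction of all change records that are Top SQL types."""
    names = [n.strip() for n in type_names if n.strip()]
    if not names:
        return 0.0
    matched = sum(tally_top_sql_changes(names).values())
    return matched / len(names)


if __name__ == "__main__":
    # Hypothetical type names, made up for illustration only.
    sample = [
        "DBO_Top_SQL_Statement",
        "DBSS_Top_SQL_Batch",
        "Host",
        "DBO_Top_SQL_Statement",
    ]
    print(tally_top_sql_changes(sample))
    print(top_sql_share(sample))
```

A high share of Top SQL types among the recorded changes would be consistent with the cause described above, although what counts as "high" depends on the environment.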
Enhancement request FGL-13161 has been submitted to Development for consideration in a future release of Foglight.