Foglight 5.9.2 - Federation Field Guide

Topology synchronization

The Federation Master periodically retrieves topology information from Federated Children and merges it into the federation model, a unified model that represents the environment as a whole (see Model union rules). Because the model in the Federation Master is a union of the models on the other servers, all of the normal UI components in Foglight work as if they were on a non-federated server.

In earlier releases, topology information was retrieved in small chunks, so each synchronization cycle took significant time to complete in large environments. Starting with version 5.9.1, the Management Server synchronizes incrementally. The initial synchronization still takes considerable time to complete, but in subsequent cycles a topology object is synchronized from the Federated Children only when its version differs from the copy on the Federation Master. For example, suppose topology objects are refreshed automatically every 5 minutes (for more information about how to change the topology refresh interval, see Step 9 in the Setting up a federated environment procedure). After each 5-minute interval, the Management Server checks the topology objects on the Federated Children to see whether they differ from the copies on the Federation Master. If an object's version has changed, the Management Server synchronizes only the changed objects rather than copying all topology objects from the Federated Children.
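The version comparison described above can be illustrated with a short Python sketch. This is not Foglight code: the `TopologyObject` class, the `objects_to_sync` and `run_cycle` functions, and the integer version numbers are hypothetical names chosen for illustration, and the real server may track versions differently.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TopologyObject:
    """Hypothetical stand-in for a topology object on a Federated Child."""
    object_id: str
    version: int


def objects_to_sync(child_objects, master_versions):
    """Return the child objects whose version differs from the master's copy.

    Objects the master has never seen have no recorded version, so they
    always count as changed.
    """
    return [
        obj for obj in child_objects
        if master_versions.get(obj.object_id) != obj.version
    ]


def run_cycle(child_objects, master_versions):
    """Run one synchronization cycle and return the IDs that were copied."""
    changed = objects_to_sync(child_objects, master_versions)
    for obj in changed:
        master_versions[obj.object_id] = obj.version
    return [obj.object_id for obj in changed]


master = {}  # object_id -> version held on the Federation Master
child = [TopologyObject("host-1", 1), TopologyObject("host-2", 1)]

first = run_cycle(child, master)    # initial cycle copies every object
child[1] = TopologyObject("host-2", 2)
second = run_cycle(child, master)   # later cycle copies only the changed object
```

In this sketch the first cycle copies both objects, while the second copies only `host-2`, which mirrors why the initial synchronization stays expensive and later cycles are cheap.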

The Federation Master automatically synchronizes certain topology types with Federated Children. For example: property, metric and observation definitions are synchronized, but type and property annotations are not. Topology types are now pulled from Federated Children at the beginning of each synchronization cycle.

The extent of the topology to be synchronized is controlled by the TopologyQueries section in the federation.config file (the default is everything).

The maximum acceptable difference in system time between Federated Children and the Federation Master (in milliseconds) is controlled by the MaxSystemTimeDifference parameter in the federation.config file (the default is 60000 milliseconds, or one minute). Large system time differences can lead to inconsistencies in metrics and alarm data on the Federation Master. Foglight does not provide time synchronization services. System time among servers in a federation must therefore be kept synchronized by other means, for example by using the Network Time Protocol (NTP).
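As a rough illustration of what the MaxSystemTimeDifference threshold means, the following Python sketch compares two clock readings against the default of 60000 milliseconds. The function names are hypothetical, and treating a difference exactly equal to the limit as acceptable is an assumption; the source does not say how Foglight handles the boundary.

```python
# Default taken from the text above; the constant name is illustrative.
MAX_SYSTEM_TIME_DIFFERENCE_MS = 60_000


def time_difference_ms(master_epoch_ms, child_epoch_ms):
    """Absolute clock difference between the two servers, in milliseconds."""
    return abs(master_epoch_ms - child_epoch_ms)


def within_tolerance(master_epoch_ms, child_epoch_ms,
                     max_difference_ms=MAX_SYSTEM_TIME_DIFFERENCE_MS):
    """True if the clocks differ by no more than the allowed maximum.

    Assumption: a difference exactly equal to the limit is accepted.
    """
    return time_difference_ms(master_epoch_ms, child_epoch_ms) <= max_difference_ms


# A child clock 30 seconds ahead is inside the default one-minute window;
# a child clock 90 seconds ahead is not.
ok = within_tolerance(1_000_000, 1_030_000)
too_far = within_tolerance(1_000_000, 1_090_000)
```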

Model union rules

Objects that exist on two or more Federated Children are merged when pulled by the Federation Master. The union of these object instances includes all contained objects of each respective remote instance.
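The merge rule can be sketched in Python as a set union that preserves first-seen order. The data layout (a dictionary per Federated Child that maps an object name to the names of its contained objects) and the names `child-a`, `Service-X`, and so on are assumptions made for illustration, not Foglight's internal representation.

```python
def union_instances(instances):
    """Union the contained objects of several instances of one object.

    Duplicates are dropped and first-seen order is kept.
    """
    merged = []
    seen = set()
    for contained in instances:
        for name in contained:
            if name not in seen:
                seen.add(name)
                merged.append(name)
    return merged


def merge_children(children):
    """Build a federation model from per-child models.

    children: {child_name: {object_name: [contained_object, ...]}}
    returns:  {object_name: [contained_object, ...]}
    """
    grouped = {}
    for model in children.values():
        for object_name, contained in model.items():
            grouped.setdefault(object_name, []).append(contained)
    return {name: union_instances(inst) for name, inst in grouped.items()}


children = {
    "child-a": {"Service-X": ["host-1", "host-2"]},
    "child-b": {"Service-X": ["host-2", "host-3"], "Service-Y": ["host-4"]},
}
federation_model = merge_children(children)
```

Here `Service-X` exists on both children, so the merged instance contains `host-1`, `host-2`, and `host-3`, while `Service-Y` is carried over unchanged.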

The following is an example of the model union rules for a given service object defined in two Federated Children:

Metric pull on demand

When you request data for a particular metric through the Foglight browser interface (for example, a time-driven rule or derivation, or a script), the Federation Master searches the Federated Children for the requested data, and retrieves the data for the specified time period.

Federation supports merging metrics from several sources:

The same set of servers is used for servicing subsequent requests for a certain period of time unless the state of the distributed data service is reset in the Federation Master (for example, by a change in the federation.config file). This period of time is (by default) set to three minutes, and is configurable via the instruction:

When the UI sends a new request more than three minutes after the original request, the Federation Master determines the source of the metric again by contacting all Federated Children and by checking the local database for data.
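The reuse of a server set for a limited period resembles a cache with a time-to-live. The Python sketch below models that behavior under stated assumptions: the `MetricSourceCache` class, the `discover` callback, and the use of seconds for timestamps are all hypothetical, and only the three-minute default comes from the text.

```python
SOURCE_TTL_SECONDS = 180  # three minutes, the default stated in the text


class MetricSourceCache:
    """Remember which servers hold a metric for a limited period."""

    def __init__(self, discover, ttl_seconds=SOURCE_TTL_SECONDS):
        self._discover = discover   # callable(metric) -> list of server names
        self._ttl = ttl_seconds
        self._entries = {}          # metric -> (determined_at, sources)

    def sources_for(self, metric, now):
        """Return cached sources, or determine them again once the period has passed."""
        entry = self._entries.get(metric)
        if entry is not None and now - entry[0] <= self._ttl:
            return entry[1]
        sources = self._discover(metric)
        self._entries[metric] = (now, sources)
        return sources

    def reset(self):
        """Forget all sources, as a change to federation.config would."""
        self._entries.clear()


calls = []


def discover(metric):
    """Stand-in for contacting every Federated Child and the local database."""
    calls.append(metric)
    return ["child-a", "local-db"]


cache = MetricSourceCache(discover)
cache.sources_for("cpu", now=0)     # first request: sources are determined
cache.sources_for("cpu", now=120)   # within three minutes: cached set reused
cache.sources_for("cpu", now=200)   # after three minutes: determined again
```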

Alarm service communication channel

Alarm data from the Federated Children uses a communication channel that is separate from the one used for metric data. While metric data is pulled directly from the Federated Children, an intermediate alarm service gathers and queues alarm data for use by the Federation Master. Therefore, when the Federation Master requests alarm data, it reads that data from the alarm service queue rather than from each of the Federated Children directly. The alarm service helps prevent the excessive queries of the Federated Children that would otherwise be needed to calculate time-driven rules and derived metrics (on the Federated Child) that depend on alarm data.

The impact of this is that the refresh interval of the alarm service dictates how long it takes for Federated Child alarms to appear on Federation Masters.

In Foglight 5.9.1, the refresh interval is configurable. This leads to delays like the one illustrated in the following example.

Example:

The alarm service refresh time can be adjusted.

In Foglight 5.9.1, the recommended way to configure the alarm refresh rate is to set the MaxAlarmUpdateDelay parameter in the federation.config file. This property specifies the maximum delay (in seconds) that is allowed in the Federation Master before it checks all Federated Children for alarm changes.

The Federation Master does not run periodic checks; it only retrieves alarm data from the Federated Children, on demand. The following two factors control when the retrieval operation occurs:

The value set for the MaxAlarmUpdateDelay parameter in the federation.config file—this parameter is programmable and has a default value of 50 (seconds).
and has a default value of 15 minutes (therefore, it is recommended to set the MaxAlarmUpdateDelay to a value less than 15 minutes).

Alarm data is refreshed on the Federation Master when all of the following conditions occur:

The last refresh was done more than MaxAlarmUpdateDelay seconds ago.
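The on-demand check against MaxAlarmUpdateDelay can be sketched in Python as follows. This models only the single condition listed above (the last refresh happened more than MaxAlarmUpdateDelay seconds ago); any other conditions are not represented. The `AlarmRefreshGate` class and its methods are hypothetical, and treating the very first request as due is an assumption.

```python
MAX_ALARM_UPDATE_DELAY_SECONDS = 50  # default stated in the text


class AlarmRefreshGate:
    """Decide, on each incoming request, whether alarm data should be refreshed."""

    def __init__(self, max_delay_seconds=MAX_ALARM_UPDATE_DELAY_SECONDS):
        self.max_delay_seconds = max_delay_seconds
        self.last_refresh = None  # time of the last refresh, in seconds

    def refresh_due(self, now):
        """True if no refresh has happened yet (assumption) or the last one
        was more than max_delay_seconds ago."""
        if self.last_refresh is None:
            return True
        return now - self.last_refresh > self.max_delay_seconds

    def on_request(self, now):
        """Handle a request for alarm data; return True if a refresh ran.

        There is no background timer: the check only happens when a
        request arrives, matching the on-demand behavior described above.
        """
        if self.refresh_due(now):
            self.last_refresh = now
            return True
        return False


gate = AlarmRefreshGate()
results = [gate.on_request(t) for t in (0, 30, 50, 51)]
```

With the default of 50 seconds, requests at 30 and 50 seconds reuse the existing alarm data, and the request at 51 seconds triggers a refresh.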