How to configure the alarm and action behavior of a rule
Description
Foglight allows you to configure rule behavior to ensure it does not fire repeatedly. Defining the behavior of rule alarms and actions can help to avoid being overwhelmed with alerts when a rule condition is met many times within a short period.
Resolution
The following options are available:
Fire action if x consecutive evaluations are true. Some observations are volatile, so a single matching evaluation can lead to more actions than desired. This option enforces a certain number of consecutive positive evaluations before a rule finally fires. Setting this option to x causes the rule’s actions (simple and multiple-severity rules) and alarms (multiple-severity rules only) to execute only when x consecutive evaluations are true. Example: A rule that checks the current CPU utilization is evaluated every 5 minutes. To keep short spikes from firing the rule, you can configure a damper mechanism that prevents actions and alarms from occurring until a certain number of consecutive matches is reached, for example three. That means the evaluation has to resolve to true three times in a row, over 15 minutes because the rule is evaluated every five minutes, before the rule fires.
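The consecutive-evaluation damper can be illustrated with a short Python sketch. This is not Foglight code: the class name ConsecutiveDamper, the 90% CPU threshold, and the sample values are assumptions made for illustration, and the sketch assumes the rule keeps firing for as long as the streak continues.

```python
class ConsecutiveDamper:
    """Hypothetical helper: true only after `required` consecutive true evaluations."""

    def __init__(self, required: int) -> None:
        if required < 1:
            raise ValueError("required must be at least 1")
        self.required = required
        self.streak = 0  # length of the current run of true evaluations

    def update(self, condition_met: bool) -> bool:
        # A false evaluation resets the streak; a true one extends it.
        self.streak = self.streak + 1 if condition_met else 0
        return self.streak >= self.required


def cpu_example() -> list:
    damper = ConsecutiveDamper(required=3)
    # Assumed CPU utilization samples (percent), one per 5-minute evaluation.
    samples = [95, 40, 92, 96, 97, 50]
    # Assumed rule condition: utilization above 90 percent.
    return [damper.update(cpu > 90) for cpu in samples]


print(cpu_example())
```

In this sketch the isolated spike at the first sample does not fire the rule; only the third high reading in a row does.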
Fire actions if x out of y evaluations are true. Similar to the above setting, this option lets you enforce a pattern in the evaluation behavior. When it matters less that positive evaluations are consecutive than that they keep recurring, this option provides a less strict damper mechanism. Setting this option to x out of y causes the rule’s actions (simple and multiple-severity rules) and alarms (multiple-severity rules only) to execute after x out of y evaluations are true. Example: Requiring a fixed number of positive evaluations in a row catches sustained problems, but it does not identify an intermittent pattern such as 2-0-2-1-0-2, where positive evaluations keep recurring in short runs. Setting this option to 4 out of 8 identifies a problem across multiple evaluations without relying on them being consecutive.
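A sliding-window version of the damper can be sketched in Python as follows. This is an illustration only, not Foglight's implementation: the class XOutOfYDamper, the helper longest_streak, and the sample sequence are hypothetical, and the sketch assumes the rule may fire as soon as x true results are present among the most recent y evaluations, even before y evaluations have been collected.

```python
from collections import deque


class XOutOfYDamper:
    """Hypothetical helper: true once `required` of the last `window` evaluations were true."""

    def __init__(self, required: int, window: int) -> None:
        if not 1 <= required <= window:
            raise ValueError("need 1 <= required <= window")
        self.required = required
        # A deque with maxlen drops the oldest result automatically when full.
        self.history = deque(maxlen=window)

    def update(self, condition_met: bool) -> bool:
        self.history.append(condition_met)
        return sum(self.history) >= self.required


def longest_streak(results) -> int:
    """Length of the longest run of consecutive true results."""
    best = current = 0
    for result in results:
        current = current + 1 if result else 0
        best = max(best, current)
    return best


# Assumed intermittent pattern: positives recur but never more than two in a row.
evaluations = [True, True, False, True, True, False, True, False]
damper = XOutOfYDamper(required=4, window=8)
fired = [damper.update(e) for e in evaluations]

print(longest_streak(evaluations))
print(fired)
```

With these assumed inputs, a requirement of three consecutive positives would never be met, while the 4 out of 8 setting is satisfied from the fifth evaluation onward.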
Wait at least [hh:mm:ss] after first evaluation. When evaluations are performed immediately after a topology object is created, or after a Foglight Management Server restart, a number of false positives can occur. There might be a startup-time factor, or the monitored entity simply needs a minute or two to settle before reliable analysis can be performed. With this option, the first evaluation and all evaluations within the given time frame after it are ignored; only evaluation results after that period are considered. Setting this option causes the rule’s actions (simple and multiple-severity rules) and alarms (multiple-severity rules only) to execute only after the specified period. Example: When a new host comes online, high network utilization is expected while updates are downloaded or application synchronization commences. To keep newly discovered hosts from generating alarms on their bandwidth usage, you can specify a time window, for example 15 minutes, during which the rule evaluating the network conditions is on hold.
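The hold-off period can be sketched in Python like this. The class StartupDelay, the timestamps, and the 5-minute evaluation interval are assumptions for illustration, not Foglight internals. The sketch also assumes that an evaluation exactly at the end of the window already counts, in line with the "at least" wording.

```python
from datetime import datetime, timedelta
from typing import Optional


class StartupDelay:
    """Hypothetical helper: ignores results until `wait` has passed since the first evaluation."""

    def __init__(self, wait: timedelta) -> None:
        self.wait = wait
        self.first_seen: Optional[datetime] = None

    def update(self, now: datetime, condition_met: bool) -> bool:
        if self.first_seen is None:
            self.first_seen = now  # remember when the first evaluation happened
        if now - self.first_seen < self.wait:
            return False  # still inside the hold-off window
        return condition_met


# Assumed scenario: a new host is first evaluated at 09:00 and then every
# 5 minutes, with bandwidth usage above the threshold the whole time.
start = datetime(2024, 1, 1, 9, 0)  # arbitrary example date
delay = StartupDelay(wait=timedelta(minutes=15))
results = [delay.update(start + timedelta(minutes=5 * i), True) for i in range(5)]

print(results)
```

Here the evaluations at 09:00, 09:05, and 09:10 are suppressed, and the rule can fire from 09:15 onward.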
Configure the desired options for the behavior of rule alarms and actions by selecting the corresponding check boxes. If multiple options are selected, the actions associated with the rule execute only if all of the selected options are true.
For example, suppose two options are selected and the second one is a one-hour wait after the first evaluation. Even though the first behavior condition is met at 1:00, the second behavior condition dictates that the rule waits an hour from the first time it is evaluated before it fires. The rule fires at 2:00, only after both behavior conditions are met.
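The way selected options combine can be sketched in Python. The settings behind the example are not given in the text, so the values here are assumptions chosen to reproduce the same timeline: a first evaluation at 1:00, an evaluation every 15 minutes, a consecutive-evaluation requirement of 1, and a one-hour wait. The function first_fire_time is hypothetical and treats each behavior as an independent check that must be true at the same evaluation.

```python
from datetime import datetime, timedelta


def first_fire_time(evaluations, required_streak, wait):
    """Return the time of the first evaluation at which both behaviors hold, or None."""
    first_seen = None
    streak = 0
    for when, condition_met in evaluations:
        if first_seen is None:
            first_seen = when
        streak = streak + 1 if condition_met else 0
        streak_ok = streak >= required_streak  # consecutive-evaluations behavior
        wait_ok = when - first_seen >= wait    # wait-after-first-evaluation behavior
        if streak_ok and wait_ok:              # all selected behaviors must be true
            return when
    return None


# Assumed timeline: condition true at every evaluation from 1:00 to 2:30.
start = datetime(2024, 1, 1, 1, 0)  # arbitrary example date
timeline = [(start + timedelta(minutes=15 * i), True) for i in range(7)]
fire_at = first_fire_time(timeline, required_streak=1, wait=timedelta(hours=1))

print(fire_at.strftime("%H:%M"))
```

Under these assumptions the consecutive-evaluation check already passes at 1:00, but the combined result only becomes true at 2:00.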
When changing the behavior, also consider the solution in KB 4247278, "Foglight is ignoring rule firing behavior as defined in the Alarm & Action Behaviors tab".