Threshold is too high for a rule to fire.
Email is not configured or is disabled in the database agent administration panel (Databases | Settings | Administration | Alarms)
Disabled rule(s) in the Rules Management dashboard
Alarms Service is stopped.
FMS lacks sufficient resources to process monitoring data or generate Alarms or Email Notifications.
No matching data has been collected to trigger an alarm.
Blackout is configured
High number of alarms in the alarms table
Email received is from a different FMS
Customized or multiple (custom) copies of the same rule.
Unsuccessful connection to host or database
Collection interval is set longer than normal (the agent collects less often), or the event occurs between agent collections
Email server settings not configured
Oracle Alert log filtering or SQL Server Error log filtering for individual and summary alarms is set to fire only for Fatal alarms or is set to OFF.
Rule is cloned from another rule and is misconfigured or cannot be managed using the Database Administration UI.
Data issues from the query used by collection
Some issues may have been temporary, such as due to a long running query, intensive use of TEMP tablespace, or a tablespace autoextended after the alarm fired.
Alarm is disabled in the alarm template that is assigned to the target.
When data is shown incorrectly in a dashboard or an alarm misfires, the first step in the investigation is to determine whether the timeouts or missing data are caused by a collection issue, a configuration setting, the rule, or the FMS itself.
The general flow would be:
All metrics are collected from the monitored host, submitted to the FMS, and registered to Metric objects.
A Metric is an object in the FMS that holds the data in several data segments:
The Current data segment and the Latest data segment entries are the same when the user chooses to see the "Last...", like "Last hour" for the selected time period.
INVESTIGATION: To investigate "raw" metrics, users can drill down into the FMS topology through various means including Configuration | Data or Administration | Tooling | Script Console
A collection is typically a SQL statement or OS command(s) run by a database agent from the FglAM against the monitored host.
Agent Managers (and some agent types) can be run in debug mode to write additional details to the agent log files.
INVESTIGATION: Review the database agent log file for any query errors that correspond to the collection being investigated (e.g. Top SQL). Analogous SQL or OS queries can also be run directly against the monitored host to see the results outside of Foglight.
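The log review described above can be partly automated. The following is a minimal sketch, assuming a plain-text agent log in which error lines mention the collection name; the sample log lines and the function name `find_collection_errors` are hypothetical and do not reflect the actual Foglight log format.

```python
import re
from typing import Iterable, List

# Keywords that usually indicate a problem; adjust to the real log format.
PROBLEM_PATTERN = re.compile(r"error|warn|exception", re.IGNORECASE)


def find_collection_errors(lines: Iterable[str], collection: str) -> List[str]:
    """Return log lines that mention the collection and look like errors."""
    needle = collection.lower()
    return [
        line.rstrip("\n")
        for line in lines
        if needle in line.lower() and PROBLEM_PATTERN.search(line)
    ]


# Hypothetical log excerpt, for illustration only.
sample_log = [
    "2024-01-01 10:00:00 INFO Top SQL collection completed",
    "2024-01-01 10:05:00 ERROR Top SQL collection failed: query timed out",
    "2024-01-01 10:06:00 ERROR Tablespace collection failed",
]

matches = find_collection_errors(sample_log, "Top SQL")
```

To scan a real file, pass an open file handle instead of the list, for example `find_collection_errors(open(path), "Top SQL")`, where `path` points to the agent log in question.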
Each agent is managed by the settings configured in the Agent Status Properties for each agent. These settings include
Many database agents (e.g. Oracle, SQL Server, DB2) list all of the agent configuration settings each time a database agent is restarted.
INVESTIGATION: Review the current settings for the agent in question and make relevant adjustments such as
If a rule processes the Metric data and determines that a threshold has been exceeded (for example, CPU usage is over 80, where 80 is the threshold beyond which an Alarm should be fired), an Alarm is triggered.
According to each Rule definition (thresholds, Baseline thresholds, Boolean), an Alarm will be fired.
A rule with thresholds or Baseline thresholds can trigger an Alarm with the following severities:
Color | Severity |
---|---|
Red | Fatal |
Orange | Critical |
Yellow | Warning |
Those are Multiple Severity Alarms.
Baseline Alarms are considered as Multiple Severity Alarms.
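As an illustration of how a Multiple Severity rule maps a metric value to one of these severities, here is a minimal sketch. The Warning threshold of 80 follows the CPU example above; the Critical and Fatal values and the function name `evaluate_severity` are assumptions made for the example, not Foglight defaults or its actual rule engine.

```python
from typing import List, Optional, Tuple

# Hypothetical thresholds, ordered from most to least severe.
THRESHOLDS: List[Tuple[str, float]] = [
    ("Fatal", 95.0),
    ("Critical", 90.0),
    ("Warning", 80.0),
]


def evaluate_severity(
    value: float, thresholds: List[Tuple[str, float]] = THRESHOLDS
) -> Optional[str]:
    """Return the highest severity whose threshold the value exceeds.

    Returns None when no threshold is exceeded, meaning no alarm fires.
    """
    for severity, limit in thresholds:
        if value > limit:
            return severity
    return None
```

With these assumed values, a CPU reading of 85 maps to Warning, while a reading of exactly 80 does not fire because the comparison is strictly "over" the threshold.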
INVESTIGATION: Review the details of the alarm message itself and look at the history of the alarms to identify any patterns.
The data gathered in each of the Metrics and its data segments is constantly being checked by the Rules.
A rule is defined for each metric, and the rule checks the metric data in order to alert the user, if required.
The alert takes the form of an Alarm and a mail notification (if configured).
In order for the Rule to "know" whether an Alarm should be triggered, the Rule uses either thresholds or a Boolean condition.
The types of Rules are:
INVESTIGATION: Review the design of the rule and compare it to an "out of the box" copy of the rule to identify whether the rule has been modified by a user. If the rule has been customized, restore it to its original design to confirm whether the issue still exists. Also review and adjust the thresholds used by rules to trigger alarm conditions.
Review the thresholds set for the alarm to fire in the Alarm Template that has been assigned to this target agent.
* For testing purposes, temporarily set the threshold to a very low value (for example 1%) to confirm that the alarm is working and fires.
Enable the Alarms Email Notifications as per KB 4310657.
Enable the rules in the Rules Management dashboard by navigating to Administration | Rules & Notifications | Rules
Restart the FMS process
or
Options include:
On very under-utilized systems, it is possible that no alarms fire because of the configured Rule/Alarm criteria or simply because the Alarm thresholds are not met. Also, if the agents are stopped or are unable to collect data, alarms will never fire, except for Agent Health, Credential, Availability or Connection specific rules. Review the Agent dashboards for current expected data.
Confirm if there is an active Blackout set for the instance or Agent
Please refer to KB 4295903 on counting and purging alarms
If the user receives alarms and the alarm or rule has been disabled, please confirm if there is a second Foglight Management Server (FMS) environment.
Rule has been customized (emailaction, fire multiple times, copied rule). Reviewing the rule name and the alarm format and email text to look for non-default text can be an indication that a rule has been customized.
Compare the rule conditions to the same rule on an FMS system that has not been modified and uses the same cartridge version.
Validate connectivity to the host as per KB 4235896 (Oracle), 4229902 (SQL Server) and 4289409 (DB2, see Resolution #1)
Set the database agent collection for the alarm metric to query the host more often. Refer to KB 4308784 for details on changing database agent collection frequencies.
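To see why an event can fall between agent collections, consider this minimal sketch. It assumes the agent samples at fixed intervals starting at time 0; the function names and the timings are hypothetical and chosen only for illustration.

```python
def sample_times(interval_s: int, horizon_s: int) -> range:
    """Collection timestamps (in seconds) from 0 up to and including horizon_s."""
    return range(0, horizon_s + 1, interval_s)


def event_is_observed(
    event_start_s: int, event_end_s: int, interval_s: int, horizon_s: int
) -> bool:
    """True if at least one collection falls inside the event window."""
    return any(
        event_start_s <= t <= event_end_s
        for t in sample_times(interval_s, horizon_s)
    )


# A 40-second spike between 130 s and 170 s after the first collection.
missed = event_is_observed(130, 170, interval_s=300, horizon_s=600)
caught = event_is_observed(130, 170, interval_s=30, horizon_s=600)
```

With a 300-second interval the collections at 0, 300 and 600 seconds all miss the spike, so no data exists for a rule to evaluate; with a 30-second interval the collection at 150 seconds captures it. Shorter intervals increase load on the monitored host, so the interval is a trade-off.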
Check the SMTP and email configuration on the FMS server as per KB 4352966.
Set the SQL Server Error Log Filter or Oracle Alert Log Filtering to fire for Warning alarms.
Enable and configure the original "out of the box" rule using the database agent administration panel.
Run a query similar to the database agent's collection and compare the results to the alarm message.
Access the database agent's raw topology metrics using the Configuration | Data pages, choose a time range in the zonar that corresponds to when the alarm was fired, and review the actual data to identify any temporary changes in the host system (e.g. after-hours downtime, network issues, high IO due to backups).
Enable the alarm in the alarm template; refer to Editing Alarm Templates (4226517) for more information on how to disable or enable the alarms.