Alarm is disabled in the database agent administration panel
Threshold is too high for a rule to fire.
Email is not configured or is disabled in the agent administration panel
Alarm sensitivity is set so that some alarm severities are not displayed.
Disabled rule(s) in the Rules Management dashboard
Alarms Service is stopped.
FMS lacks sufficient resources to process monitoring data or generate Alarms or Email Notifications.
No matching data has been collected to trigger an alarm.
Blackout is configured
High number of alarms in the alarms table
Email received is from a different FMS
Customized or multiple (custom) copies of the same rule.
Unsuccessful connection to host or database
Collection interval is set longer than normal or the event occurs between agent collections
Email server settings not configured
Oracle Alert log filtering or SQL Server Error log filtering for individual and summary alarms is set to fire only Fatal alarms or is set to OFF.
Defect ORAFOG-681
Rule is cloned from another rule and is misconfigured or cannot be managed using the Database Administration UI.
Data issues from the query used by collection
Some issues may have been temporary, such as due to a long running query, intensive use of TEMP tablespace, or a tablespace autoextended after the alarm fired.
Alarm is disabled in the alarm template that is assigned to the target.
All metrics are collected from the Monitored Host and submitted to the FMS, where they are registered to Metric objects.
A Metric is an object in the FMS that holds the data in several data segments:
The Current data segment and the Latest data segment entries are the same when the user selects a "Last..." time range (for example, "Last hour") for the selected time period.
INVESTIGATION: To investigate "raw" metrics, users can drill down into the FMS topology through various means including Configuration | Data or Administration | Tooling | Script Console
Each agent is managed by the settings configured in the Agent Status Properties for each agent. These settings include
If a rule processes the Metric data and determines that a threshold has been exceeded (for example, CPU usage is over 80, where 80 is the threshold beyond which an Alarm should be fired), an Alarm is triggered.
According to each Rule definition (thresholds, Baseline thresholds, Boolean), an Alarm will be fired.
Rule with thresholds/Baseline thresholds can trigger an Alarm with the below severities:
Color | Severity |
---|---|
Red | Fatal |
Orange | Critical |
Yellow | Warning |
Those are Multiple Severity Alarms.
Baseline Alarms are considered as Multiple Severity Alarms.
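The way a multiple-severity rule maps a metric value to one of the severities above can be sketched as follows. This is only an illustration, not the FMS rule engine: the function name `evaluate_severity` and the threshold values 95, 90 and 80 are assumptions chosen for the example (only the 80 value echoes the CPU example in this article).

```python
from typing import List, Optional, Tuple

# Hypothetical thresholds for illustration only; real values are set per rule.
# Ordered from most severe to least severe.
THRESHOLDS: List[Tuple[str, float]] = [
    ("Fatal", 95.0),
    ("Critical", 90.0),
    ("Warning", 80.0),
]


def evaluate_severity(value: float,
                      thresholds: List[Tuple[str, float]] = THRESHOLDS) -> Optional[str]:
    """Return the most severe level whose threshold the value exceeds, or None."""
    for severity, limit in thresholds:
        if value > limit:  # an alarm fires only when the value is over the threshold
            return severity
    return None  # no threshold exceeded, so no alarm


if __name__ == "__main__":
    for cpu in (75.0, 85.0, 92.0, 99.0):
        print(cpu, evaluate_severity(cpu))
```

A value that never crosses the lowest threshold returns `None`, which corresponds to the "Threshold is too high for a rule to fire" cause listed at the top of this article.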
The data gathered in each of the Metrics and its data segments is constantly being checked by the Rules.
A rule is defined per metric, and the rule checks the metric data in order to alert the user, if required.
The alert is in the form of an Alarm and an email notification (if configured).
In order for the Rule to "know" if an Alarm should be triggered, the Rule uses either thresholds or a Boolean condition.
The types of Rules are:
INVESTIGATION: Review the design of the rule and compare it to an "out of the box" copy of the rule to identify whether the rule has been modified by a user. If the rule has been customized, restore it to its original design to confirm whether any issues still exist with the rule. Also review and adjust the thresholds used by rules to trigger alarm conditions.
Enable the alarm as per KB 103065
Refer to KB 93998 for details on changing thresholds for database monitoring agents.
* For testing purposes, temporarily set the threshold to a very low value (for example 1%) to confirm that the alarm is working and fires.
Enable the Alarms Email Notifications as per KB 90935 and KB 261762 (video).
Review KB 232091 to configure the alarm sensitivity to fire the appropriate severity levels.
* For testing purposes, set the agent to tuning, and confirm that alarms are fired.
Enable the rules in the Rules Management dashboard by navigating to Administration | Rules & Notifications | Rules
Restart the FMS process
or
Options include:
On very under-utilized systems, it is possible that no alarms are fired because of the configured Rule/Alarm criteria or simply because Alarm thresholds are not met. Also, if the agents are stopped or are unable to collect data, alarms will never fire, except for Agent Health, Credential, Availability or Connection specific rules. Review the Agent dashboards for current expected data.
Confirm if there is an active Blackout set for the instance or Agent as per KB 129995 and 155621
As a test, drill down on the instance from the Database dashboard | click on Activity | Real Time | check the Availability to see if there is blackout alert.
Please refer to KB 80646 for more information on counting alarms in the alarms table.
Please refer to KB 66577 for more information on purging the alarms table.
If the customer receives alarms and the alarm or rule has been disabled, please confirm if there is a second Foglight Management Server (FMS) environment.
Rule has been customized (emailaction, fire multiple times, copied rule). Non-default text in the rule name, the alarm format, or the email text can indicate that a rule has been customized.
Compare the rule conditions to the same rule on an FMS system that has not been modified and uses the same cartridge version.
Validate connectivity to the host as per KB 114121 (Oracle), 234887 (SQL Server) and 234154 (DB2, see Resolution #1)
Set the database agent collection for the alarm metric to query the host more often. Refer to KB 177914 for details on changing database agent collection frequencies.
Check the SMTP and email configuration on the FMS server as per KB 69979.
Set the SQL Server Error Log Filter or Oracle Alert Log Filtering to fire for Warning alarms.
A fix specific for the "DBO Usability Connect Availability" rule / alarm is available in the Oracle 5.9.2.1 and higher cartridge versions. Please consult KB 259302 for more details.
Enable and configure the original "out of the box" rule using the database agent administration panel.
Run a query similar to the database agent's collection and compare the results to the alarm message.
Access the database agent's raw topology metrics using the Configuration | Data pages, choose a time range in the zonar that corresponds to when the alarm was fired, and review the actual data to identify any temporary changes in the host system (e.g. after-hours downtime, network issues, high IO due to backups).
Enable the alarm in the alarm template; refer to Editing Alarm Templates (4226517) for more information on how to disable or enable the alarms.
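The cause listed earlier, where an event occurs between agent collections, can be illustrated with a small sketch. It is not how the agent collects data; the per-minute readings, the interval values, and the helper names `sampled_values` and `alarm_fires` are all hypothetical and exist only to show why a short spike can be missed when the host is queried infrequently.

```python
from typing import List


def sampled_values(series: List[float], interval: int) -> List[float]:
    """Return the readings seen when collecting every `interval` ticks."""
    return series[::interval]


def alarm_fires(series: List[float], interval: int, threshold: float) -> bool:
    """True if any collected reading is over the threshold."""
    return any(value > threshold for value in sampled_values(series, interval))


# Hypothetical per-minute CPU readings: a two-minute spike at minutes 7 and 8.
cpu = [40.0] * 7 + [95.0, 95.0] + [40.0] * 11

if __name__ == "__main__":
    for interval in (5, 2, 1):
        print(f"collect every {interval} min -> alarm fires: "
              f"{alarm_fires(cpu, interval, 80.0)}")
```

With a five-minute interval the collections land on minutes 0, 5, 10 and 15, so the spike is never sampled and no alarm is raised; collecting every two minutes samples minute 8 and the alarm fires.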
© 2024 Quest Software Inc. ALL RIGHTS RESERVED.