What are the different usability rules for SQL Server and Oracle monitoring?
What do the Usability and Availability rules check?
What rule does the Oracle Listener use?
The main rules covering usability and availability can be confusing without a clear explanation.
The following usability and availability alarms are fired based on the usability collection data. The agent uses the usability collection to indicate whether there are any usability issues that require a usability alarm to fire. If one of these issues exists, the agent runs no other collections, and no session is supposed to fail following a connectivity or availability issue.
Therefore, information such as session ID, PID, username, or process name is not relevant. The alarm does not report a specific session or process failure; it is the agent's indication of whether the database is available.
For Oracle:
DBO - Instance Availability - This rule checks if the database is running, testing only the instance, not the listener. See the Usability Connect Availability rule below for the listener. This rule alerts when an Oracle database instance is not up and running. A fatal alarm results.
DBO - Usability Availability - This rule is raised for RAC installations and is based on the aggregation of the Usability Availability Single rule. It tests the percentage of single instances that are not running and is not related to the listener. This rule alerts when the percentage of available RAC nodes is less than 80 percent (warning), less than 50 percent (critical), or zero percent (fatal). The thresholds are configurable.
DBO - Usability Availability Single - This rule fires upon detecting that an Oracle database instance is shut down. It checks if the database is running, testing only the instance, not the listener. An Oracle database instance is indicated as being unavailable if the instance's PMON process (UNIX) or Oracle service (Windows) is not running. A fatal alarm results.
DBO - Usability Conn Time - The amount of time it takes to establish a connection. This rule fires when the average connection time (in milliseconds) to the database instance exceeds a predefined registry variable value. Data samples are kept for a predefined period of time, at the end of which the aggregated samples are averaged, and an alarm is raised if the resulting average exceeds the configured value. In effect, this rule checks whether the time taken to connect to the database instance is acceptable. It alerts when the average connection time to the database instance is more than 5,000 milliseconds (warning) or 10,000 milliseconds (critical).
DBO - Usability Connect Availability - This rule is raised if the database is running but a connection cannot be established for any reason. The reason can also be listener related, in which case the alarm message states that the connection is not available because of the listener. This rule raises an alarm when the connection to the database instance has failed.
DBO - Usability OS Connect Availability - This rule fires when the connection to the operating system has failed.
DBO - Usability Response Time - This rule fires when the database average response time exceeds a predefined registry variable value.
Usability Response Time is the amount of time (in microseconds) used by the active cursor for carrying out parsing, executing, and fetching operations (the statement is controlled by a query in the agent properties, e.g. select 1 from dual). Data samples are kept for a predefined period of time, at the end of which the aggregated samples are averaged, and an alarm is raised if the resulting average exceeds the configured value. In effect, this rule checks whether the time taken to run a simple statement, after the connection to the database has succeeded, is acceptable. It alerts when the database average response time is more than 10 milliseconds (warning) or 50 milliseconds (critical).
DBO - Listener Status - This rule checks if the listener defined in Global Configuration is running. This rule alerts when a listener is not available. A fatal alarm results.
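The way the RAC-level DBO - Usability Availability rule turns a percentage of available nodes into a severity can be sketched as follows. This is only an illustration of the thresholds quoted above (80, 50, and zero percent), not the agent's actual implementation; the function name, its parameters, and the reading of "zero percent" as "no nodes available" are assumptions.

```python
def rac_availability_severity(nodes_up, nodes_total, warning_pct=80.0, critical_pct=50.0):
    """Map the share of available RAC nodes to an alarm severity.

    Hypothetical helper: the default thresholds mirror the values quoted
    in the rule description and are configurable there.
    """
    if nodes_total <= 0:
        raise ValueError("nodes_total must be positive")
    available_pct = 100.0 * nodes_up / nodes_total
    if available_pct == 0:
        return "fatal"      # no node is available
    if available_pct < critical_pct:
        return "critical"
    if available_pct < warning_pct:
        return "warning"
    return "normal"


if __name__ == "__main__":
    # Example: a four-node cluster with one node down (75 percent available).
    print(rac_availability_severity(3, 4))
```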
For SQL Server:
Instance Availability - This rule checks if the SQL Server instance is running, testing only the instance. See the Usability Connect Availability rule below. This rule alerts when a SQL Server instance is not up and running. A fatal alarm is fired.
Usability Availability - This rule checks if the SQL Server instance is running, testing only the instance. This rule alerts when a SQL Server instance has shut down. A fatal alarm is fired.
Usability Conn Time - This rule checks if the time taken to connect to the SQL Server instance is acceptable. This rule alerts when the average connection time to the SQL Server instance is more than 5,000 milliseconds (warning) or 10,000 milliseconds (critical).
Usability Connect Availability - This rule fires when the connection to the instance fails (the check returns a record set that includes data regarding network, IO, and start time of the SQL Server instance). It is raised if the SQL Server instance is running but a connection cannot be established for any reason. Note that the connection is attempted with the credentials defined in the agent's ASP, and that it is checked from the FGLAM machine.
Usability OS Connect Availability - This rule fires an alarm when the connection to the operating system fails. It checks the state from Win32_Service.
Response Time - This rule checks if the time taken to run a simple statement, after the connection to the database has succeeded, is acceptable. This rule alerts when the SQL Server instance response time is longer than the last successful connection.
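Both the Oracle and SQL Server Usability Conn Time rules compare an averaged connection time against warning and critical thresholds. A minimal sketch of that evaluation, assuming the samples collected over the retention period are available as a list of millisecond values, could look like this. The function name and parameters are hypothetical; the defaults are the 5,000 and 10,000 millisecond values quoted above.

```python
from statistics import mean


def conn_time_severity(samples_ms, warning_ms=5000, critical_ms=10000):
    """Average the connection-time samples and classify the result.

    Hypothetical helper, not the agent's code. Returns None when there
    are no samples to evaluate.
    """
    if not samples_ms:
        return None
    average_ms = mean(samples_ms)
    if average_ms > critical_ms:
        return "critical"
    if average_ms > warning_ms:
        return "warning"
    return "normal"


if __name__ == "__main__":
    # Example: one slow connection among several fast ones.
    print(conn_time_severity([300, 450, 12000, 500]))
```

Because the rule works on the average, a single slow connection does not necessarily raise an alarm; it only does so if it pulls the average for the period over a threshold.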
Connection Time vs. Response Time
The time required to create the connection is often much higher than the time required for the database to resolve the query and return results. This is why application servers often manage "pools" of connections that can be re-used by different parts of the application code, which saves each part from having to re-create a connection for every SQL request.
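The distinction between the two measurements can be sketched by timing the connection step and the statement step separately. The example below is an illustration only: the function name is hypothetical, and an in-memory SQLite database stands in for a real Oracle or SQL Server connection so that the code runs anywhere. With a networked database, the first number is typically the larger one.

```python
import sqlite3
import time


def measure_conn_and_response(connect, statement):
    """Return (connection_ms, response_ms) for one connect-and-query cycle.

    `connect` is any callable returning a DB-API connection; `statement`
    is a simple probe query (for Oracle, e.g. "select 1 from dual").
    """
    start = time.perf_counter()
    conn = connect()
    connected = time.perf_counter()
    try:
        cursor = conn.cursor()
        cursor.execute(statement)   # parse and execute
        cursor.fetchall()           # fetch
        finished = time.perf_counter()
    finally:
        conn.close()
    return (connected - start) * 1000.0, (finished - connected) * 1000.0


if __name__ == "__main__":
    # SQLite is only a stand-in here; it has no network or login overhead.
    conn_ms, resp_ms = measure_conn_and_response(
        lambda: sqlite3.connect(":memory:"), "SELECT 1"
    )
    print(f"connection: {conn_ms:.3f} ms, response: {resp_ms:.3f} ms")
```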