The Database is Healthy if...
• The number of JDBC connections available is close to the maximum. This means that queries are not getting stuck waiting for the database to return.
• The CPU utilization on the database machine is substantial, but not greater than 90 percent on average, indicating a non-blocked system that has enough resources available.
• The browser interface is performing well on metric queries. This can be tested with any dashboard that graphs a metric.
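The first two criteria above can be expressed as a simple check. The following is a minimal Python sketch, not part of the product. The function name, its parameters, and the 0.8 default for what counts as "close to the maximum" are illustrative assumptions; only the 90 percent CPU ceiling comes from the criteria above. The browser interface criterion still has to be tested by hand.

```python
def database_looks_healthy(available_connections, max_connections,
                           avg_cpu_percent, min_available_ratio=0.8):
    """Apply the JDBC and CPU criteria to values you have collected yourself.

    min_available_ratio is a hypothetical cut-off for "close to the maximum";
    adjust it to suit your environment.
    """
    if max_connections <= 0:
        raise ValueError("max_connections must be positive")
    # Most of the connection pool should be free, not stuck waiting on queries.
    connections_ok = available_connections / max_connections >= min_available_ratio
    # Average CPU utilization should not exceed 90 percent.
    cpu_ok = avg_cpu_percent <= 90.0
    return connections_ok and cpu_ok


print(database_looks_healthy(45, 50, 60.0))
print(database_looks_healthy(10, 50, 60.0))
```

In this sketch, the first call passes both criteria and the second fails because only 10 of 50 connections are available.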
Database Checks
• First, perform the Hosts dashboard performance test from the CPU section of the health check. Poor browser interface performance coupled with low CPU utilization indicates either a server block or a database problem. To determine whether the database is the cause, proceed to the next check.
• Check the available JDBC connections.
• Check CPU and I/O utilization on the database machine.
• Check the amount of memory allocated to the database:
• On an Oracle® database, check the SGA size:
SELECT SUM(bytes) FROM v$sgainfo WHERE resizeable = 'Yes'
• On a MySQL database, check the InnoDB buffer pool size:
SHOW VARIABLES LIKE 'innodb_buffer_pool_size'
• Check the database timing metrics.
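Both memory queries above report sizes in bytes, which are hard to read at a glance. The following is a small Python sketch for converting such a value into binary units. The function name format_bytes is a hypothetical helper, not part of any database or Foglight tooling.

```python
def format_bytes(num_bytes):
    """Convert a byte count into a human-readable string using binary units."""
    units = ["B", "KiB", "MiB", "GiB", "TiB"]
    value = float(num_bytes)
    for unit in units:
        # Stop at the first unit where the value drops below 1024,
        # or at the largest unit in the list.
        if value < 1024 or unit == units[-1]:
            return f"{value:.1f} {unit}"
        value /= 1024


# Example: an innodb_buffer_pool_size of 134217728 bytes.
print(format_bytes(134217728))  # prints "128.0 MiB"
```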
Possible Actions
• Tune the database cache parameters. If I/O utilization is high, allocate more memory to the database.
• Move the database to a more powerful machine.
• Tune the database optimization.
• Run the following commands to check the database timing metrics:
./fglcmd.sh -usr foglight -pwd foglight -cmd util:topologyexport -f handlers.xml -topology_query CatalystPersistenceHandler
./fglcmd.sh -usr foglight -pwd foglight -cmd util:metricexport -f retrieve-last-n-values-time.csv -metric_query "retrieveLastNValuesTime from CatalystPersistenceHandler for 1 week" -output_format csv
./fglcmd.sh -usr foglight -pwd foglight -cmd util:metricexport -f retrieve-time.csv -metric_query "retrieveTime from CatalystPersistenceHandler for 1 week" -output_format csv
./fglcmd.sh -usr foglight -pwd foglight -cmd util:metricexport -f retrieve-earliest-time.csv -metric_query "retrieveEarliestTimeTime from CatalystPersistenceHandler for 1 week" -output_format csv
• The metrics in the exported CSV files are in milliseconds. Look for values of 500 or greater in AVERAGE, or greater than 10000 in MAX. If these values are consistently high, the database is failing to keep up with the load, which makes the Management Server slow.
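Scanning the exported files for these thresholds can be scripted. The following is a minimal Python sketch. It assumes that each CSV file has a header row with columns named AVERAGE and MAX; the exact layout of the export is not shown here, so verify the column names against your own files. The sample data, including the START and END columns, is invented for illustration.

```python
import csv
import io

AVERAGE_LIMIT_MS = 500    # 500 ms or greater in AVERAGE is considered slow
MAX_LIMIT_MS = 10000      # more than 10000 ms in MAX is considered slow


def slow_rows(csv_text, avg_col="AVERAGE", max_col="MAX"):
    """Return the rows whose AVERAGE or MAX value crosses a threshold.

    The column names are assumptions about the export format.
    """
    flagged = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        try:
            average = float(row[avg_col])
            maximum = float(row[max_col])
        except (KeyError, TypeError, ValueError):
            continue  # skip rows without usable numeric values
        if average >= AVERAGE_LIMIT_MS or maximum > MAX_LIMIT_MS:
            flagged.append(row)
    return flagged


# Hypothetical sample data; real exports may use different columns.
sample = """START,END,AVERAGE,MAX
t0,t1,120.5,900
t1,t2,640.0,2100
t2,t3,480.0,15000
"""

print(len(slow_rows(sample)))  # prints 2
```

To check a real export, read the file contents first, for example with open("retrieve-time.csv").read(), and pass the text to slow_rows. A handful of flagged rows over a week is not necessarily a problem; the guideline above is about values that are consistently high.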