A Foglight Agent Manager (FglAM) running database agents—without a Performance Investigator (PI) repository installed on the same server—exhibits high CPU utilization, sometimes reaching 100%.
This article outlines potential causes and resolutions to mitigate CPU resource consumption on affected FglAM hosts.
⚠️ Note: Sizing issues are the most common root cause of FglAM performance problems.
Cause 1 – Insufficient Sizing
The FglAM is undersized in terms of CPU, memory, or disk resources.
Cause 2 – Outdated or Inactive VMware Tools
VMware Tools are either not installed, outdated, or not running. In one instance, the FglAM was using only 2.5 GB of memory despite having more assigned.
Cause 3 – Resource-Intensive Agent Collection
A specific agent or collection (e.g., Filesystem IO in Oracle agents) may cause spikes. For example, agents monitoring hosts with ~16K disks consume high CPU when running iostat.
Cause 4 – DataDirect JDBC Driver and Java 8 SSL Conflict
Conflicts between the legacy DataDirect JDBC driver and SSL/TLS settings in Java 8 environments can cause high CPU usage.
Cause 5 – Simultaneous Agent Collection Execution
All agents may start and run collections at the same time, especially after FglAM restarts.
Cause 6 – Specific Agent Types Impacting Performance
Certain agent types such as MongoDB, MySQL, PostgreSQL, or SSAS may degrade performance when co-located on the same FglAM as other database agents.
Cause 7 – Antivirus Interference
Antivirus software running on the FglAM may cause CPU contention.
Cause 8 – Undetermined Microsoft JDBC Driver Issue
A Microsoft driver issue may result in elevated CPU usage.
Cause 9 – Defect FOG-3307
A known defect contributing to performance issues.
Cause 10 – FglAM Version 6.1.0 Specific Issue
FglAM version 6.1.0 contains an issue requiring a hotfix.
Resolution 1 – Review Sizing
Review the Foglight for Databases Deployment and Sizing Guide to ensure the FglAM meets the minimum recommended CPU, memory, and disk requirements for the number of agents deployed.
Resolution 2 – Update VMware Tools
Ensure VMware Tools are:
Installed
Running
Up to date on the virtual machine hosting the FglAM
Resolution 3 – Isolate the Problem Agent
Disable agents incrementally or in groups to identify which is causing the spike.
Recreate the problematic agent or adjust its properties.
For Oracle agents:
Refer to KB 4308784 to:
Increase the Filesystem IO collection interval to 600–900 seconds
Or disable Filesystem IO collection entirely
Resolution 4 – Remove DataDirect Driver Override
Edit the baseline.jvmargs.config file to remove legacy references:
-Dsqlserver.driver.class.name=com.dell.dsgi.jdbc.sqlserver.SQLServerDriver
See KB 4295822 for guidance on TLS/SSL configuration for SQL Server agents.
Resolution 5 – Stagger Agent Startup and Collection Timing
Stagger agent startup by groups (20–30 agents every ~17 seconds)
Alternatively, add the following JVM parameter to the baseline.jvmargs.config file and restart the FglAM:
vmparameter.0 = "-Dagent.collector.schedule.load.max.delay.millis=300000";
This spreads agent activation over 5 minutes instead of the default 2 minutes, reducing collection overlap.
Resolution 6 – Reassign Resource-Intensive Agents
Deactivate and move the following agent types to a separate, dedicated FglAM:
SQL Server Analysis Services (SSAS)
MySQL
PostgreSQL
MongoDB
Resolution 7 – Temporarily Disable Antivirus
Temporarily disable antivirus software on the FglAM for ~5 minutes and observe whether CPU usage improves.
Resolution 8 – Revert to DataDirect Driver
Stop the FglAM.
Edit baseline.jvmargs.config in {fglam}\state\default\config:
vmparameter.X = "-Dsqlserver.driver.class.name=com.dell.dsgi.jdbc.sqlserver.SQLServerDriver";
Replace X with the next sequential number after the highest existing vmparameter.X value.
Restart the FglAM.
Wait 5–10 minutes and monitor CPU and metric collection.
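For example, if the highest existing entry is vmparameter.0 (shown here with an illustrative heap setting), the reverted driver entry takes the next index:

```
vmparameter.0 = "-Xmx2048m";
vmparameter.1 = "-Dsqlserver.driver.class.name=com.dell.dsgi.jdbc.sqlserver.SQLServerDriver";
```

The index values above are examples only; always check your own file for the highest existing number before adding the new line.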
Resolution 9 – Update SQL Server Cartridge
Upgrade to SQL Server cartridge version 6.1.0.10 or higher to address known issues.
Resolution 10 – Contact Support for Hotfix
Contact Quest Support for a hotfix if using FglAM version 6.1.0.
Disable SQL PI on a single FglAM to confirm whether the issue stems from PI-related overhead.
Binary agent reduction:
Shut down half of the agents and observe CPU usage.
Repeat until performance improves.
Alternatively, enable agents in groups of 20–30 and monitor CPU after each batch.
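The binary agent reduction described above is effectively a binary search over the agent set. A minimal sketch of the idea, assuming a single problematic agent and a hypothetical cpu_spikes(subset) probe (not part of Foglight) that reports whether CPU stays high with only that subset of agents enabled:

```python
def find_problem_agent(agents, cpu_spikes):
    """Bisect the agent list to isolate the one agent driving CPU high.

    cpu_spikes(subset) is a hypothetical check: it returns True if CPU
    usage remains high when only `subset` of the agents is enabled.
    Assumes exactly one problematic agent exists in `agents`.
    """
    candidates = list(agents)
    while len(candidates) > 1:
        half = candidates[: len(candidates) // 2]
        # Keep whichever half still reproduces the CPU spike.
        candidates = half if cpu_spikes(half) else candidates[len(half):]
    return candidates[0]

# Example with a simulated culprit among 16 agents:
agents = [f"oracle-agent-{i}" for i in range(16)]
print(find_problem_agent(agents, lambda subset: "oracle-agent-11" in subset))
```

In practice each "probe" is a manual step (deactivate the other agents, wait for a few collection cycles, watch CPU), so halving the set each round keeps the number of test cycles to roughly log2 of the agent count.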
FOG-3307 – High CPU usage under specific configurations
Enhancement Request – N/A