Foglight Evolve 7.1.0 - Performance Tuning Field Guide

Topology Changes and Topology Churn

Table 1. Topology churn
Motivation	The server is optimized for stable topology models, so the Management Server is expected to be configured to minimize topology changes.
Symptom	The browser interface responds slowly. Topology queries (in the browser interface as well as in Groovy scripts) are slow. The Data Service is dropping data.
Diagnosis / Verification	Check the System Changes chart on the Alarms System dashboard. Check the batchesDiscarded metric that the Data Service produces, located in Dashboards > Foglight > Servers > Management Server View > Data Service. Explore the topology model using the Data dashboard located in Dashboards > Configuration > Data and look for noisy (that is, potentially redundant) or unexpected objects.
Tuning	In some cases, the agent defaults are too broad and the agent monitors everything that is visible to it. Configure agents to monitor only what is required. Agent or CDT changes (or both) may be required; in such cases, support bundles are very helpful in the investigation. Allocate more system resources to the Management Server.
Note	The number of topology changes that the server can process depends greatly on the overall system configuration (host OS, database, and hardware).
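
As a rough illustration of the batchesDiscarded check in Table 1, the following Python sketch lists the collection intervals in which data was dropped. It assumes the metric values have been copied out of the dashboard by hand as one number per interval; the function name and the sample values are hypothetical and are not part of any Foglight API or export format.

```python
def intervals_with_discards(samples):
    """Return the indexes of intervals in which at least one batch was discarded.

    `samples` is a hypothetical, manually exported list of batchesDiscarded
    values, one per collection interval (an assumption, not a Foglight format).
    """
    return [index for index, count in enumerate(samples) if count > 0]


# Illustrative values only, not real measurements.
example_samples = [0, 0, 3, 0, 12, 7]
print(intervals_with_discards(example_samples))
```

Intervals that show discards are the ones worth correlating with the System Changes chart to see whether topology churn coincides with the dropped data.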

Canonical Data Transformations (CDTs)

Theoretically, CDTs can be a performance bottleneck. Unfortunately, it is not easy to tune them. In most cases, a cartridge update is required.

Table 2. Canonical Data Transformations (CDTs)
Motivation	CDTs convert data received from agents into the server’s internal representation (that is, into the Canonical Data Form). Although this process is usually fast, it can be computation-intensive and therefore may cause performance issues.
Symptom	The server is slow overall. The CDT transformTime metrics are high. Typical values for OS Cartridge agents are in the range of 0.01 to 0.1 seconds, measured over 15-minute intervals. To view the CDT transformTime metrics in the browser interface, go to Dashboards > Configuration > Data > Foglight > All Data > AllTypeInstances > TopologyObject > subTypes > CanonicalDataTransform > instances > ... > transformTime.
Diagnosis / Verification	CDT tasks are visible in thread dumps.
Tuning	Tuning will probably have to be done by the agent development team. A support bundle along with the thread dumps will be very helpful in the investigation.
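
The transformTime comparison in Table 2 can be sketched in Python as follows. This is only a minimal illustration: it assumes the transformTime values have been collected by hand into a dictionary keyed by CDT name, and it uses the upper end of the typical OS Cartridge range (0.1 seconds) as the limit. The CDT names and values are hypothetical.

```python
# Upper end of the typical range quoted for OS Cartridge agents, in seconds.
TYPICAL_MAX_SECONDS = 0.1


def slow_transforms(transform_times, limit=TYPICAL_MAX_SECONDS):
    """Return (name, seconds) pairs whose transformTime exceeds `limit`, slowest first.

    `transform_times` is a hypothetical mapping of CDT name to transformTime
    in seconds, gathered manually from the Data dashboard.
    """
    over_limit = [(name, seconds) for name, seconds in transform_times.items()
                  if seconds > limit]
    return sorted(over_limit, key=lambda item: item[1], reverse=True)


# Illustrative values only, not real measurements.
example_times = {"cdtA": 0.02, "cdtB": 0.45, "cdtC": 0.11}
for name, seconds in slow_transforms(example_times):
    print(f"{name}: {seconds} s")
```

Other cartridges may have different typical ranges, so the limit is a parameter rather than a fixed rule.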

Agent Weight / Environment Complexity

The Foglight® Management Server maintains an internal metric that represents, roughly, the amount of work the server has to do to process the data collected by the agents.

This metric is called aggregateAgentWeight. It is available from the EnvComplexityEstimator service in the Management Server Metrics dashboard: Dashboards > Foglight > Servers > Management Server Metrics > <CatalystServer> > Services > com.quest.nitro:service=EnvComplexityEstimator > aggregateAgentWeight.

This metric is derived from the number and types of connected agents, according to the configuration in <foglight_home>/config/agent-weight.config.

The value of the metric is typically expressed in agent units. Recent server builds generally work well with up to 4000 agent units connected.


NOTE: The maximum agent weight a server can support depends greatly on the host system configuration and hardware capacity.

The agent-weight.config setup is based on Quality Assurance (QA) capacity test results. Generally, it should not be changed. However, if new data on relative agent weights becomes available, the configuration file can be adjusted manually. You must restart the server after you change this configuration file.
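
To make the idea of agent units concrete, the Python sketch below estimates an aggregate weight as a weighted sum of agent counts and compares it with the 4000-unit guideline mentioned above. The weighted-sum formula is an assumption made for illustration; the guide does not state how the server actually computes aggregateAgentWeight. The agent type names and per-type weights are also hypothetical and are not taken from agent-weight.config.

```python
# Guideline from the text: recent builds generally work well up to 4000 agent units.
GUIDELINE_AGENT_UNITS = 4000


def estimate_aggregate_weight(agent_counts, weights, default_weight=1.0):
    """Estimate total agent units as sum(count * weight) per agent type.

    This weighted sum is an assumed model, not the server's documented formula.
    Agent types missing from `weights` fall back to `default_weight`.
    """
    return sum(count * weights.get(agent_type, default_weight)
               for agent_type, count in agent_counts.items())


# Hypothetical agent types, counts, and weights for illustration only.
example_counts = {"HostAgent": 1000, "DBAgent": 200}
example_weights = {"HostAgent": 1.0, "DBAgent": 5.0}

total = estimate_aggregate_weight(example_counts, example_weights)
print(total, total <= GUIDELINE_AGENT_UNITS)
```

The value reported by the EnvComplexityEstimator service remains the authoritative figure; an estimate like this is only useful for rough planning before agents are connected.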

Large Topologies

Table 3. Large Topologies
Motivation	On occasion, agents may send so much observation data that the resulting topology model becomes too large and causes performance problems in the server.
Note	Java EE Technologies agents are the most likely to cause this type of situation, if they are not carefully pre-configured.
Symptom	The server is in an overload condition. Agent data is being dropped. The browser interface responds slowly.
Diagnosis / Verification	Create a support bundle and check the topology size breakdown. A topology size breakdown by type is available in the diagnostic snapshot files that are part of each Management Server support bundle. Look for large topology object instance counts. NOTE: Acceptable count ranges will differ depending on the amount of resources allocated to the Management Server.
Tuning	1. Stop data collection on agents that produce an excessive number of topology objects. 2. Re-configure those agents to produce reasonable amounts of topology data. 3. Delete any excess topology objects. 4. Resume data collection on the affected agents.
Example	There is a large number of JavaEEAppServerRequestType topology objects. NOTE: When that is the case, there will also be a large number of objects for other topology types. You can reduce the number of JavaEEAppServerRequestType topology objects by adjusting the FilteringRules parameter in recording.config. NOTE: This is a Cartridge for Java EE Technologies topology object example. In principle, any cartridge may cause this kind of performance problem.
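
The "look for large topology object instance counts" step can be sketched in Python as below. The sketch assumes the per-type counts have already been transcribed from a diagnostic snapshot into a dictionary; it does not parse the snapshot file itself, whose format is not described here. All type names other than JavaEEAppServerRequestType, and all counts, are hypothetical.

```python
def largest_types(counts_by_type, top_n=3):
    """Return the `top_n` (type, count) pairs with the highest instance counts."""
    ranked = sorted(counts_by_type.items(), key=lambda item: item[1], reverse=True)
    return ranked[:top_n]


# Illustrative counts only, not taken from a real support bundle.
example_counts = {
    "Host": 120,
    "JavaEEAppServerRequestType": 250000,
    "Agent": 300,
}

for type_name, count in largest_types(example_counts, top_n=2):
    print(f"{type_name}: {count}")
```

Whether a given count is excessive still depends on the resources allocated to the Management Server, so the ranking only shows where to look first.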
