Foglight 5.9.1 - Performance Tuning Field Guide

Management Server Automatically Restarted

Sometimes, after what appears to be a normal and successful startup, an HA server is automatically shut down and restarted. The most likely cause is a misconfigured health check URL in the restart_monitor.config file.

For example, you may have reconfigured the HTTP port of the Foglight™ Management Server or the IP address that the server is bound to, but forgotten to update the health check URL of the restarter. If the restarter cannot contact the Management Server for the health check, it treats the Management Server as unresponsive and restarts it.

You can check the server_restarter_xxxx-xx-xx_xxxxx_xxx.log file to determine whether this scenario caused the restart. If so, edit the restart_monitor.config file: locate the line beginning with health.check.url and set the URL so that it matches the port and IP address the server currently uses.
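The check described above can be sketched in Python. This is a minimal sketch, not a Foglight tool: it assumes the health.check.url line uses a key=value (or key: value) layout, and the function names find_health_check_url and is_reachable are hypothetical helpers introduced only for illustration.

```python
import urllib.error
import urllib.request

KEY = "health.check.url"


def find_health_check_url(config_text):
    """Return the value of the first line beginning with health.check.url, or None."""
    for raw_line in config_text.splitlines():
        line = raw_line.strip()
        if not line.startswith(KEY):
            continue
        remainder = line[len(KEY):].lstrip()
        # Assumption: the key is followed by '=' or ':' and then the URL.
        if remainder[:1] in ("=", ":"):
            return remainder[1:].strip()
    return None


def is_reachable(url, timeout=5.0):
    """Return True if an HTTP request to url succeeds within the timeout."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 400
    except (urllib.error.URLError, OSError, ValueError):
        # Connection refused, timeout, HTTP error status, or malformed URL.
        return False
```

To use it, read the contents of restart_monitor.config into a string, pass it to find_health_check_url, and pass the result to is_reachable from the machine that runs the restarter. A False result while the Management Server is up suggests that the URL no longer matches the configured port or bind address.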

Other JGroups-Related Issues and Information

ERROR org.jgroups.protocols.UDP max_bundle_size (64000) is greater than the largest TP fragmentation size (8000):

https://jira.jboss.org/browse/JGRP-798

“Cross-talk” can occur between servers on different clusters:

https://jira.jboss.org/browse/JGRP-614

Agent Tuning

This chapter describes the agent-related options that can affect performance. It covers the following topics:

• Topology Changes and Topology Churn
• Canonical Data Transformations (CDTs)
• Agent Weight / Environment Complexity
• Large Topologies
• Sampling Frequency
• XML-HTTP Agent Adapter
• Dropped Agent Manager Log Messages
• Java EE Technologies

Topology Changes and Topology Churn

Monitored data in the Foglight™ Management Server can be sub-divided into two areas with distinct properties:

Topology—data representing monitored entities

Observations—data (including metrics) observed about monitored entities

It is generally assumed that topology objects change very little over time. Observations are expected to be highly volatile over time.

The decision of whether to add a particular piece of data to the topology model or to treat it as an observation is made during cartridge development. This decision is expressed in the software using CDT configuration.

The server is generally optimized to handle stable topology models where topology changes are infrequent. When topology changes occur frequently, this is known as Topology Churn, and it usually results in diminished server performance.
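The effect of the topology-versus-observation decision can be illustrated with a toy model. The sketch below does not use any Foglight API or CDT syntax; the sample fields (host, process, pid, cpu_pct) and the function count_topology_objects are hypothetical. It only shows that when a volatile value becomes part of an object's identity, every sample appears to introduce a new topology object.

```python
def count_topology_objects(samples, identity_fields):
    """Count the distinct objects that result when identity is built from identity_fields."""
    seen = set()
    for sample in samples:
        seen.add(tuple(sample[field] for field in identity_fields))
    return len(seen)


# Five hypothetical samples of one monitored process; pid and cpu_pct vary per sample.
samples = [
    {"host": "db01", "process": "worker", "pid": 4100 + i, "cpu_pct": 10 + i}
    for i in range(5)
]

# Stable identity: volatile values are treated as observations.
stable_count = count_topology_objects(samples, ["host", "process"])

# Volatile identity: pid is part of the topology, so each sample adds an object.
churning_count = count_topology_objects(samples, ["host", "process", "pid"])

print(stable_count, churning_count)
```

In this toy model the stable identity yields a single object across all samples, while the identity that includes pid yields one object per sample, which is the pattern the text calls Topology Churn.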

Table 1. Topology churn
Motivation	The server is optimized to handle stable topology models. Therefore, the Management Server is expected to be configured so that topology changes are minimized.
Symptom	• The browser interface performance is poor (that is, it responds slowly).
	• Topology queries (in the browser interface as well as in Groovy scripts) are slow.
	• Data is being dropped by the Data Service.
Diagnosis / Verification	• Check the System Changes chart on the Alarms System dashboard.
	• Check the batchesDiscarded metric produced by the Data Service, located in Dashboards > Foglight > Servers > Management Server View > Data Service.
	• Explore the topology model using the Data dashboard, located in Dashboards > Configuration > Data, and try to locate any noisy (that is, potentially redundant) or unexpected objects.
Tuning	• In some cases, the agent defaults may be too broad, and the agent may be monitoring everything that is visible to it. Configure agents to monitor only what is required.
	• Agent and/or CDT changes may be required. In such cases, support bundles are very helpful in the investigation.
	• Allocate more system resources to the Management Server.
Note	The number of topology changes that the server can process depends greatly on the overall system configuration (host OS, database, and hardware).
