The Foglight console is slow, overload messages are reported, the CPU usage on the FMS host is high, and a script associated with the Event Log rule is taking a lot of invocation time.
The Foglight console is slow.
The CPU usage on the FMS host has increased markedly. On the Linux host, it ranges from about 100% to 400%.
The problem was noted after deploying WindowsAgents.
"Overload" messages like the following are reported in the Management Server logs:
2016-04-25 20:38:12.919 WARN [QuartzScheduler.Monitor_Worker-4] com.quest.nitro.service.data.DataService - The server has reached an overload state. 4,717 batches of data have been discarded.
The "Total Top Script Invocation Time" graph in the Performance Report indicates that a script associated with the Event Log rule is consuming a large amount of invocation time (on the order of days).
Following is an excerpt of an example thread that references the script:
"Data-3-thread-1285184" #2442783 prio=5 os_prio=0 tid=0x00007f93c400f800 nid=0x4ef2 runnable [0x00007f93a587a000]
   java.lang.Thread.State: RUNNABLE
    at java.util.Arrays.copyOfRange(Arrays.java:3664)
    at java.lang.String.<init>(String.java:201)
    at java.lang.StringBuilder.toString(StringBuilder.java:407)
    at org.codehaus.groovy.runtime.StringGroovyMethods.plus(StringGroovyMethods.java:2450)
    at org.codehaus.groovy.runtime.dgm$1009.invoke(Unknown Source)
    at org.codehaus.groovy.runtime.callsite.PojoMetaMethodSite$PojoMetaMethodSiteNoUnwrapNoCoerce.invoke(PojoMetaMethodSite.java:271)
    at org.codehaus.groovy.runtime.callsite.PojoMetaMethodSite.call(PojoMetaMethodSite.java:53)
    at org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:116)
    at Cartridge_Infrastructure_Rule_Event_Log_Severity_Fire_Expression_message_script_b61c2769d4cf9e50be08bff5058d766e.run(Cartridge_Infrastructure_Rule_Event_Log_Severity_Fire_Expression_message_script_b61c2769d4cf9e50be08bff5058d766e.groovy:18)
    at com.quest.nitro.service.scripting.groovy.GroovyScript.exec(GroovyScript.java:149)
    ...
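To gauge how many threads are inside the rule script at a given moment, a thread dump can be scanned for the generated script class name. The following Java sketch is an illustration only, not a Foglight utility. The class ThreadDumpScan and the method threadsRunning are hypothetical names. It assumes a standard JVM thread dump layout in which each thread entry starts with a quoted thread name and entries are separated by blank lines.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.List;

final class ThreadDumpScan {
    // Prefix of the generated script class seen in the excerpt above;
    // the hash suffix is omitted so the match does not depend on it.
    static final String EVENT_LOG_MARKER =
        "Cartridge_Infrastructure_Rule_Event_Log_Severity_Fire_Expression_message_script_";

    // Counts thread entries whose stack contains at least one frame
    // with the given marker. Each thread is counted once.
    static int threadsRunning(List<String> dumpLines, String frameMarker) {
        int count = 0;
        boolean inThread = false;
        boolean counted = false;
        for (String line : dumpLines) {
            if (line.startsWith("\"")) {          // header line of a new thread entry
                inThread = true;
                counted = false;
            } else if (line.trim().isEmpty()) {   // blank line ends the entry
                inThread = false;
            } else if (inThread && !counted && line.contains(frameMarker)) {
                count++;
                counted = true;
            }
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java ThreadDumpScan <thread-dump-file>");
            return;
        }
        List<String> lines = Files.readAllLines(Paths.get(args[0]));
        System.out.println(threadsRunning(lines, EVENT_LOG_MARKER)
            + " thread(s) executing the Event Log message script");
    }
}
```

If most of the "Data-" threads in several consecutive dumps are counted, that is consistent with the rule script monopolizing the data processing threads.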
The "Event Log" rule caused the overload. All of the data processing threads were invoking the Groovy script for the rule's "message" expression, and all of them were concatenating strings to assemble the message.
The likely trigger is a large number of events, or events containing a lot of text, which caused the rule to generate a very large message that was expensive to assemble.
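The cost of this pattern can be shown outside Foglight. Each string "plus" builds a new string by copying everything accumulated so far, which matches the Arrays.copyOfRange frame at the top of the stack trace, so assembling one large message from many pieces copies far more characters than the final length. The Java sketch below is hypothetical and is not the rule's actual script. The class BoundedMessage, the method join, and the maxChars limit are illustrative names. It shows one way to keep assembly cost bounded: append into a single StringBuilder and stop once a size cap is reached.

```java
import java.util.List;

final class BoundedMessage {
    // Joins event texts with newlines, never producing more than
    // maxChars characters. Text beyond the cap is truncated.
    static String join(List<String> eventTexts, int maxChars) {
        StringBuilder sb = new StringBuilder();
        for (String text : eventTexts) {
            if (sb.length() >= maxChars) {
                break;                      // cap reached, ignore remaining events
            }
            if (sb.length() > 0) {
                sb.append('\n');            // separator between events
            }
            int remaining = maxChars - sb.length();
            // Append only as much of this event as still fits under the cap.
            sb.append(text, 0, Math.min(text.length(), remaining));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        List<String> events = List.of("disk warning on C:", "service restarted", "login failure");
        System.out.println(join(events, 30));
    }
}
```

Capping the message size limits the work per rule evaluation regardless of how many events arrive. Within Foglight itself, the supported remedies remain the resolutions listed below.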
RESOLUTION 1: Disable the Event Log rule.
RESOLUTION 2: Filter the events collected by the WindowsAgent so that only events of interest are collected.