In my Foglight Experience Viewer (FxV), many of the FxV Hits are being discarded. Almost as many hits are being discarded as are being captured. So, I know to look upstream to the Foglight Experience Monitor (FxM) appliance for possible problems, since that is where FxV gets its feed. So, I've got the FxM All Metrics View for System Health (see attached AllMetricsView_SystemHealth.pdf).
Examining that, the biggest problem I see is in FxV metrics. There are almost as many FxV Discards as there are FxV Hits Transmitted.
We also see problems in the SSL Metrics--the error rate is 30% and there are almost as many SSL Connections Dropped as there are SSL Connections Started. So, the high FxV discard rate is due to FxM monitoring SSL traffic which it cannot decrypt--as discussed in SOL35035. I do see that 3 or 4% of the Segments are missing. I know that will certainly cause SSL Errors and that if we fix FxM's SSL monitoring, the FxV problem will likely be resolved.
Unlike the Solution mentioned above, there is a 6% Packet Drop Rate. So, that means that FxM itself is dropping the packets, correct? Not all the packets are being dropped in the *Switch*, correct?
The underlying cause of the problem is likely that too much traffic is trying to come into the FxM machine and it is running out of memory. Both metrics (Packets Dropped, Missing Segments Rate) being non-zero can be caused by an overloaded appliance. Your traffic is probably not corrupted before it gets to FxM--it gets corrupted when it reaches FxM due to overload. If you were to check the ecrit log file, you would likely find statements such as this:
Oct 10 16:13:45 questfxm statmon: (3) ERROR: Memory consumption has exceeded threshold.
Indeed, from the Health Metrics screenshot, we can see that your FxM machine captured about 157,000,000 packets during that hour. That works out to over 43,000 packets captured per second. (Also, the Memory Utilization % is 80%.)
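As a quick check of that arithmetic, the sketch below converts the hourly packet count into an average per-second rate. The function name is made up for illustration; the only inputs are the packet count and the one-hour window read off the screenshot.

```python
def packets_per_second(packets_captured: int, interval_seconds: int) -> float:
    """Average capture rate over the given interval."""
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    return packets_captured / interval_seconds


# About 157,000,000 packets captured in one hour (3600 seconds).
hourly_rate = packets_per_second(157_000_000, 3600)
print(f"{hourly_rate:,.0f} packets/second")  # prints "43,611 packets/second"
```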
If you looked in the ecrit log file, you'd find a statement like the one below. It shows over 18 million packets captured in that five-minute interval. This tells the same story as the Health Metrics screenshot--too much traffic trying to come into FxM.
Oct 8 09:35:02 questfxm agent: (6) pkt delta: cap/proc/drop 18685254 5382120 1284908 6.877%
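If you want to pull these numbers out of the log yourself, something like the following sketch may help. The field layout (captured, processed, dropped, drop percentage) is an assumption inferred from the single sample line above, not from FxM documentation, and the names `PKT_DELTA` and `parse_pkt_delta` are hypothetical.

```python
import re

# Assumed layout, inferred from the one sample line in this article:
# "pkt delta: cap/proc/drop <captured> <processed> <dropped> <drop %>"
PKT_DELTA = re.compile(
    r"pkt delta:\s+cap/proc/drop\s+(\d+)\s+(\d+)\s+(\d+)\s+([\d.]+)%"
)


def parse_pkt_delta(line: str):
    """Return (captured, processed, dropped, reported_pct), or None if no match."""
    match = PKT_DELTA.search(line)
    if match is None:
        return None
    captured, processed, dropped = (int(g) for g in match.groups()[:3])
    return captured, processed, dropped, float(match.group(4))


sample = ("Oct 8 09:35:02 questfxm agent: (6) pkt delta: "
          "cap/proc/drop 18685254 5382120 1284908 6.877%")
captured, processed, dropped, reported_pct = parse_pkt_delta(sample)

# dropped / captured reproduces the percentage printed in the log line.
computed_pct = 100 * dropped / captured
# Five-minute interval (300 seconds), as described in the text above.
rate = captured / 300
print(f"drop rate {computed_pct:.3f}% (log reports {reported_pct}%), "
      f"{rate:,.0f} packets/second")
```

For the sample line, the computed drop percentage agrees with the 6.877% in the log, and the capture rate comes out above 62,000 packets per second for that interval.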
If the Packet Drop Rate is non-zero, that could be causing the Missing Segments Rate to go up. The Packet Drop Rate reflects packets dropped at the appliance, and those drops can feed into the missing segments calculation.
Packet Drop Rate means we are dropping at the appliance. If Packet Drop Rate is non-zero, then the Missing Segments Rate is likely to be non-zero as well. In this case, the Missing Segments Rate is occurring because we are dropping packets at the appliance. If the Packet Drop Rate were zero, then you could conclude that the Missing Segments are being dropped before the traffic gets to the appliance (not the case here).
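The reasoning above can be written down as a small rule of thumb. This is only a sketch of the logic described here, not anything FxM itself runs: the function name and return strings are hypothetical, and a non-zero Packet Drop Rate does not rule out additional loss upstream.

```python
def likely_loss_location(packet_drop_rate_pct: float,
                         missing_segments_rate_pct: float) -> str:
    """Rough guess at where packets are being lost, per the rule above."""
    if packet_drop_rate_pct > 0:
        # The appliance itself is dropping packets, which can also
        # explain a non-zero Missing Segments Rate.
        return "appliance"
    if missing_segments_rate_pct > 0:
        # Nothing dropped at the appliance, yet segments are missing,
        # so the loss happened before the traffic reached it.
        return "upstream of appliance"
    return "no loss indicated"


# Figures from this case: 6% Packet Drop Rate, 3 to 4% Missing Segments.
print(likely_loss_location(6.0, 3.5))  # prints "appliance"
```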
In this case, this behavior is being caused by too much traffic trying to come into FxM, so the traffic volume needs to be reduced. Refer to this Solution / Knowledgebase article (SOL31887) about the maximum recommended traffic volume the appliance can handle: