Metrics are not displayed in the APM 'Websites and Endpoints' dashboard or in the 'System Health' dashboard. We have some Sniffer and Archiver appliances in this Capture Group that are broken and no longer reachable by the FMS. We also cannot collect an APM support bundle; the collection times out.
We are tracking this issue as APMDATA-1643: "Agent blocking on queries to offline archivers". Metric queries are initiated but never return. The query to the Archiver (when it is queried for appliance metrics) is never initiated, and no metric delta queries are sent to that Archiver.
You may eventually encounter this problem when you run multiple appliances in one Capture Group and several of them have been in a broken, unreachable state for a long time (though you seem to be able to get away with a couple of broken appliances indefinitely).
You may have left the appliances in that broken state because you are trying to treat APM appliances as burstable/elastic components. That is not supported.
WORKAROUND
An appliance that is down usually indicates a hardware failure. We advise removing the failed appliance from the system configuration so that it is not used for capture.
That said, APM ought to always use reasonable timeouts when trying to contact remote appliances. The agent shouldn't be blocking on metric queries to offline archivers.
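To illustrate the behavior described above, here is a minimal Python sketch of a bounded fan-out query. It is not APM's implementation: `query_archiver`, `collect_metrics`, the archiver names, and the simulated delays are all hypothetical stand-ins. The point is only that an overall timeout lets the caller return results from healthy archivers while reporting slow or offline ones, instead of blocking on them.

```python
import concurrent.futures as cf
import time


def query_archiver(name, delay_s):
    """Hypothetical stand-in for a metric query; delay_s simulates response time."""
    time.sleep(delay_s)
    return f"metrics from {name}"


def collect_metrics(archivers, timeout_s):
    """Query every archiver in parallel and wait at most timeout_s in total.

    archivers maps a (hypothetical) archiver name to its simulated delay.
    Returns (results, unreachable): results holds answers that arrived in
    time; unreachable lists archivers that did not answer before the timeout.
    """
    pool = cf.ThreadPoolExecutor(max_workers=max(1, len(archivers)))
    futures = {
        pool.submit(query_archiver, name, delay_s): name
        for name, delay_s in archivers.items()
    }
    # wait() returns after timeout_s even if some queries are still pending.
    done, not_done = cf.wait(futures, timeout=timeout_s)
    results = {futures[f]: f.result() for f in done}
    unreachable = sorted(futures[f] for f in not_done)
    # Do not block on the stragglers; let them finish in the background.
    pool.shutdown(wait=False)
    return results, unreachable
```

Calling `collect_metrics({"archiver-a": 0.01, "archiver-b": 1.0}, timeout_s=0.3)` returns the result for `archiver-a` and lists `archiver-b` as unreachable, rather than stalling the whole query on the slow appliance.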
We are working on making this area of APM more robust so that failed appliances don't affect the entire Capture Group. In the meantime, don't let any Archiver appliances "fall off the map" and become unreachable, and don't keep running for long with broken appliances in the Capture Group. Keep on top of it: either fix them or remove them from the Capture Group.
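One way to keep on top of unreachable appliances is a periodic TCP reachability check run from a host that can reach them. The following is a minimal sketch using only the Python standard library. It is not an APM tool, and the host list and port you pass in are assumptions: this article does not state which port the FMS uses to contact appliances, so substitute the management port that applies to your deployment.

```python
import socket


def is_reachable(host, port, timeout_s=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout_s."""
    try:
        with socket.create_connection((host, port), timeout=timeout_s):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


def find_unreachable(appliances, port, timeout_s=3.0):
    """Return the appliances (hypothetical host names or IPs) that did not answer."""
    return [host for host in appliances if not is_reachable(host, port, timeout_s)]
```

Any host returned by `find_unreachable` is a candidate to repair or remove from the Capture Group. A successful TCP connection only shows that the port is open, not that the appliance is healthy, so treat the result as a first-pass signal.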
STATUS
Waiting for fix in a future version or patch of APM.