To feed data into APM or Foglight Experience Monitor (FxM) and Foglight Experience Viewer (FxV), we strongly recommend using a full-duplex Network TAP rather than a SPAN port or an Aggregation TAP, particularly when there is SSL traffic to be decoded. We've heard that not using a full-duplex Network TAP will likely produce poor results. Why is this?
In our FxM system health metrics, we have a non-zero 'Client Segments Missing Rate %' and 'Server Segments Missing Rate %'. Our 'Packet Drop Rate %' is zero.
Network switches are built to prioritize forwarding packets, not copying them. SPANning a port is a secondary function, and because it is the less important function, SPAN ports tend to drop replicated packets. This gives the APM Sniffer or FxM a "dirty feed" of traffic (a feed with data gaps). In FxM, 'Missing Client Segments' and 'Missing Server Segments' will appear in the system health metrics even if the appliance itself is dropping zero segments ('Packet Drop Rate %' is zero).
SPAN ports do not scale well and will never replicate all packets successfully; they always have some rate of loss. On low-traffic switches that rate can be quite low, but loss still occurs. On high-traffic switches the rate of loss will be much higher, because the switch's priority is to forward traffic, not replicate it. As load increases, packet replication on the SPAN ports is the first thing to be sacrificed.
Although Client Segments Missing and Server Segments Missing can also be a side effect of the APM Sniffer or FxM appliance itself dropping packets due to overload, that is not the situation here because your 'Packet Drop Rate %' is zero. The SPAN port is dropping packets in the SWITCH. That characteristic of SPAN has nothing to do with FxM. The packets are being lost outside of the appliance.
With SSL, even one dropped packet can make the entire session undecryptable. Decrypting each packet in an SSL conversation depends on state carried over from the packets before it, so if ONE packet is lost anywhere in the conversation, the rest of it can no longer be decrypted. In the right circumstances (e.g., long SSL sessions of, say, over 15 minutes), a small drop rate of 1% or less could cause 100% of SSL traffic to become unreadable. To sum up, missing segments result in SSL errors, which prevent FxM from decrypting SSL sessions, which in turn causes FxM (and FxV) to lose Hits and User Sessions and report inaccurate metrics.
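To see why a small drop rate can ruin most long SSL sessions, here is a minimal sketch of the arithmetic. It assumes each packet is dropped independently with the same probability, which is a simplification of ours (real SPAN loss tends to come in bursts), and the function name and the packet counts are purely illustrative.

```python
def prob_session_undecryptable(drop_rate: float, packets_in_session: int) -> float:
    """Probability that at least one packet of a session is lost.

    Assumption (illustrative only): every packet is dropped independently
    with probability drop_rate, and a single lost packet makes the whole
    session undecryptable.
    """
    if not 0.0 <= drop_rate <= 1.0:
        raise ValueError("drop_rate must be between 0 and 1")
    if packets_in_session < 0:
        raise ValueError("packets_in_session must not be negative")
    # P(at least one loss) = 1 - P(no loss at all)
    return 1.0 - (1.0 - drop_rate) ** packets_in_session


if __name__ == "__main__":
    # Hypothetical session sizes; longer sessions carry more packets.
    for packets in (10, 100, 1000, 10000):
        p = prob_session_undecryptable(0.01, packets)
        print(f"{packets:>6} packets at 1% loss: {p:.1%} chance the session is lost")
```

Under these assumptions, the chance of losing a session climbs toward 100% as the number of packets in the session grows, even though the per-packet drop rate stays at 1%.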
There are different types of Network TAPs. "Aggregation" Network TAPs take a full-duplex link and merge the ingress and egress streams into one half-duplex stream so that only one monitoring NIC is needed on a monitoring device. This type of TAP effectively works just like a SPAN port and drops packets. So we do NOT recommend Aggregation TAPs.
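One common reason a merged stream loses packets is simple oversubscription: the two directions of a full-duplex link can together carry more traffic than a single monitor port of the same speed. The sketch below illustrates that arithmetic only. The link speeds and traffic figures are hypothetical, and the function is ours, not part of any TAP vendor's tooling.

```python
def aggregation_overflow_mbps(ingress_mbps: float,
                              egress_mbps: float,
                              monitor_port_mbps: float) -> float:
    """Traffic (in Mbps) that cannot fit on a single monitor port.

    Assumption (illustrative only): sustained rates with no buffering in
    the TAP, so anything above the monitor port's capacity is dropped.
    """
    combined = ingress_mbps + egress_mbps
    return max(0.0, combined - monitor_port_mbps)


if __name__ == "__main__":
    # Hypothetical 1 Gbps full-duplex link feeding one 1 Gbps monitor port.
    print(aggregation_overflow_mbps(200, 300, 1000))  # light load fits
    print(aggregation_overflow_mbps(600, 700, 1000))  # combined load exceeds the port
```

A full-duplex TAP avoids this because each direction gets its own monitoring NIC.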
True full-duplex Network TAPs are strongly recommended because they do not drop network packets. These TAPs split the ingress and egress traffic into two separate streams that feed into two monitoring NICs on a monitoring device.
We have a free trial program so that you can evaluate a Network TAP at no cost or risk. Refer to this Solution for details:
https://support.quest.com/foglight/kb/157922
Note: Some high-end smart/intelligent/buffered TAPs have many additional capabilities, such as real-time pre-filtering of the traffic before the packets are sent to APM or FxM. These TAPs can either split or aggregate a full-duplex link, so if you are using a smart TAP, make sure that it is actually splitting the streams.
Note: Even full-duplex Network TAPs will drop packets if that TAP is tapping a link that is connected to a SPAN port or Aggregation TAP somewhere upstream on your network.
For more discussion of TAPs and SPAN ports:
http://communities.quest.com/community/foglight/experts/blog/2009/10/22/foglight-eum-using-network-taps-versus-a-switch-span-port
To check a capture for missing segments, open the tcpdump in Wireshark, go to 'Edit' | 'Find Packet', click the 'By String' radio button, and enter either of the strings below in the 'filter' field.
[TCP ACKed unseen segment]
[TCP Previous segment not captured]
Then run the search. Any packet where the search finds either of the messages above indicates missing segments.
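To illustrate what those messages mean, here is a minimal sketch of how a gap in TCP sequence numbers reveals a segment that never reached the capture. It is not how Wireshark or FxM implements the check. The function name and the sample data are hypothetical, and the input is assumed to be the (sequence number, payload length) pairs for one direction of one TCP connection, in capture order.

```python
from typing import List, Tuple


def find_sequence_gaps(segments: List[Tuple[int, int]]) -> List[Tuple[int, int]]:
    """Return (expected_seq, actual_seq) pairs where bytes are missing.

    Simplifications (illustrative only): ignores SYN/FIN sequence usage,
    packet reordering, and 32-bit sequence number wraparound.
    """
    gaps = []
    expected = None  # next sequence number we expect to see
    for seq, length in segments:
        if expected is not None and seq > expected:
            # Bytes expected..seq-1 were never seen in the capture.
            gaps.append((expected, seq))
        next_seq = seq + length
        # Retransmissions of old data must not move the expectation backwards.
        if expected is None or next_seq > expected:
            expected = next_seq
    return gaps


if __name__ == "__main__":
    # Hypothetical capture: the segment covering bytes 2000-2499 is missing.
    capture = [(1000, 500), (1500, 500), (2500, 500), (3000, 100)]
    print(find_sequence_gaps(capture))
```

A clean feed from a full-duplex TAP should produce no gaps for a healthy connection, while a lossy SPAN feed will show gaps even when the appliance itself drops nothing.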
© 2021 Quest Software Inc. ALL RIGHTS RESERVED.