Customer was testing hit/session search based on a custom field but found that some sessions/hits were not showing in the search results. The symptom was inconsistent: some sessions could be found without issue, while others could not.
From the collector metrics, we could see that both collectors have a high count of "INTERCEPTOR_BATCHES_DISCARDED_QUEUE_FULL" and that "INTERCEPTOR_QUEUE_MEMORY_USED_PERCENT" is at almost 100%.
They are using an FxM probe + FxV (archiver) pair at each of their 2 sites, plus a standalone FxV appliance acting as a server (so 2 FxM + 3 FxV appliances in total across 2 physical locations). However, there is only one valid capture group, which includes 2 collectors and 2 archivers, and each FxM is actually using eth0 to communicate with its FxV even though they seem to have a cross-over cable connection between them.
Their problem is due to the collector group configuration. Each collector is trying to balance the traffic between both archivers, but it can only access its local archiver, so half the traffic on each collector is being discarded.
They should update their collector group configuration so that they have 2 collector groups with one collector and one archiver in each group.
Also, as you pointed out, they are using eth0 to upload traffic from the collector to the archiver. If possible, they should use eth1.
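To illustrate the collector group reasoning above, here is a minimal Python sketch. It is not product configuration or product code: the names (collector_a, archiver_a, discarded_fraction) are hypothetical, and it assumes the collector balances batches round-robin across every archiver in its group, which may not match the actual balancing algorithm.

```python
from itertools import cycle


def discarded_fraction(groups, reachable, batches_per_collector=1000):
    """Return the fraction of batches sent to an archiver the collector cannot reach.

    groups    -- list of (collectors, archivers) tuples, one per collector group
    reachable -- dict mapping each collector to the set of archivers it can access
    """
    sent = 0
    dropped = 0
    for collectors, archivers in groups:
        for collector in collectors:
            # Assumption: simple round-robin over all archivers in the group.
            targets = cycle(archivers)
            for _ in range(batches_per_collector):
                archiver = next(targets)
                sent += 1
                if archiver not in reachable[collector]:
                    dropped += 1
    return dropped / sent


# Each collector can only access the archiver at its own site.
reachable = {
    "collector_a": {"archiver_a"},
    "collector_b": {"archiver_b"},
}

# Current setup: one group containing both collectors and both archivers.
current = [(["collector_a", "collector_b"], ["archiver_a", "archiver_b"])]

# Suggested setup: one group per site, each with one collector and one archiver.
proposed = [
    (["collector_a"], ["archiver_a"]),
    (["collector_b"], ["archiver_b"]),
]

print(discarded_fraction(current, reachable))   # 0.5
print(discarded_fraction(proposed, reachable))  # 0.0
```

Under these assumptions, the single shared group loses half of each collector's batches, while the per-site groups lose none, which is consistent with only some sessions being searchable.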