User seeing many errors as such:
"The SQL query has been executing for longer than its timeout limit but has not timed out. It may be hung."
Are there any plans to improve the performance of the Fragmented indexes drilldown?
Version 11.5 added detection of scripts that have executed for longer than the timeout time without timing out. It appears that SQL Server doesn't always honor the script timeout setting and allows a script to continue executing past the point where it should be terminated with a timeout error. In past versions of Spotlight these scripts would just have blocked data collection, possibly forever, or caused other collection errors such as 'There is already an open DataReader associated with this Command which must be closed first'.
If these timeout violations persist the Diagnostic Server restarts the OOPCollector which creates a completely new set of connections to the monitored instances. This however may leave orphaned Spotlight on SQL Server sessions on those instances.
WORKAROUND:
Fragmentation Overview is potentially a very long running script because SQL Server calculates fragmentation by reading some portion of all the index files. If there are very large indexes or I/O performance is poor or both then it can run for hours and it seems the particular DMV that is called doesn't honor the script timeout setting. The workaround it is disable the Fragmentation Overview collection for those connections that are experiencing this error. Spotlight design will change the way this collection works so that it only focuses on a key subset of the indexes and hence executes in less time. Increasing the timeout time is problematical. The intent of this collection is to show index fragmentation that might cause more I/O that necessary. If one allows the script to run for longer this is causing more I/O which is the very thing the collection is trying to assist in reducing.
The other collections appear to be simply blocked by other SQL Server processes. The Diagnostic Server has terminated those collections and monitoring should continue. The alarms are raised only to show that some collections failed and some data is missing. If these alarms clear at the next data collection then the problem is transient and can be ignored however if the alarms continue to be raised they will need to find out why the blocking is happening. This might be tricky because the Diagnostic Server terminates the collection when it detects this problem and that may cause the blocking to end. So there might be only a few minutes where the blocking is visible. Given that Spotlight itself is blocked during this time one can't use Spotlight to find the blocking so user will need to execute the blocking script manually. To do this locate the file {DS install dir}\Agent\conf\PACKAGE\sqlserver_spotlight\2005\QS_BlockingList.sql and execute it against the instance/s in question. One may have to execute it repeatedly over a period of time before they see the blocking.
If this collection still have this error after block has been resolved, then the future version of Spotlight will resolve this issue.
STATUS:
Issue fixed in the current release of Spotlight on SQL Server Enterprise. Latest version of Spotlight can be downloaded from here.