What is "lock timeout rate"? How is the "Long Lock Running" metric calculated in SQL Server environment?
The #lock_timeouts# (for rule “Lock Timeout Rate”) is taken from the Locks object as follows:
SELECT *
FROM sys.dm_os_performance_counters
WHERE LOWER ( [object_name] ) LIKE '%locks%'
AND LOWER ( [counter_name] ) LIKE '%lock%timeouts/sec%'
AND LOWER ( [instance_name] ) = '_total'
Calculating a rate means: the time interval (an interval which is using the same duration as the starting and ending point of the metric) divided by the metric.
For example: in 60 seconds time interval, the sum of lock timeouts were 30. Therefore the rate would be 0.5 lock timeouts through the graph (30/60).
"Lock Timeout Rate" will fire upon the number of times the instance timed out that exceeded the threshold defined. By default, the threshold is 5 so if the number of times the instance timed out exceeded these 5, an alarm will fire.
Looking at SQL Activity - Blocking History will show all blocks who exceeded a pre-defined period of time (90 seconds by default). If a block exceeded this period, data is collected and an alarm is raised (“Long Lock Running”).
“Long Lock Running” shows that a new long lock was discovered. It is called long for a reason. if it is long, it my start to cause bottlenecks where the “Lock Timeout Rate” is saying many actions on the database are getting timed out due to blocking.
The Foglight rule is based on perfornamce counter "Lock Timeouts/sec ", but there is also another counter called "Lock Timeouts (timeout>0)/sec ". According to the following Microsoft document, "Lock Timeouts/sec " records some internal lock probes, so it may not show that there is a problem.
"Lock:Timeout" events with a duration of 0 are commonly the result of internal lock probes and are not necessarily an indication of a problem. The Lock:Timeout (timeout > 0) event can be used to ignore time-outs with a duration of 0."
http://msdn.microsoft.com/en-us/library/ms189107.aspx
Microsoft documentation on the Lock Timeout(>0):"Number of lock requests per second that timed out, but excluding requests for NOWAIT locks.", as referenced in http://msdn.microsoft.com/en-us/library/ms189107.aspx. We are also monitoring the Lock Timeout(>0) in our cartridge but we do not use it for the application but for customer use.
"Lock:Timeout" events with a duration of 0 are commonly the result of internal lock probes and are not necessarily an indication of a problem. The Lock:Timeout (timeout > 0) event can be used to ignore time-outs with a duration of 0."
Sometimes the "SET LOCK_TIMEOUT 0" setting can be used, which means if any wait occurs then an error is returned. This is reason why this data should be monitored. There might be applications that should not wait, as Microsoft indicates, but there is a reason as to why this data should be monitored.
The data for these metrics at 5.5.8 version is retrieved from WMI.
© 2024 Quest Software Inc. ALL RIGHTS RESERVED. Feedback Terms of Use Privacy Cookie Preference Center