
Foglight for SQL Server (Cartridge) - User Guide


Alarms Displayed in the Database Details Panel

Several alarms can be investigated using the Database Details panel of the Databases drilldown, as follows:

The Recent Backups alarm becomes active when Foglight for SQL Server detects that any SQL Server database has not been backed up in the last few days.
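The check behind this alarm can be pictured as a simple age comparison. The following is a minimal sketch, not Foglight's implementation: the function name and the three-day default are hypothetical, because the guide says only "the last few days".

```python
from datetime import datetime, timedelta

def backup_overdue(last_backup, now, max_age_days=3):
    """Return True if the last backup is older than max_age_days.

    max_age_days=3 is a hypothetical stand-in for "the last few days".
    A database that has never been backed up (None) counts as overdue.
    """
    if last_backup is None:
        return True
    return now - last_backup > timedelta(days=max_age_days)

# A backup taken nine days before "now" is overdue under the assumed threshold.
print(backup_overdue(datetime(2024, 1, 1), datetime(2024, 1, 10)))
```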

The Database Unavailable alarm becomes active when Foglight for SQL Server detects that a SQL Server database is not available for reading. Users attempting to access an unavailable database receive an error message.

This alarm detects unusual database statuses, including Suspect, Offline, Recovering, Loading, Restoring, Emergency Mode, and others.

When this alarm occurs, you should:

Determine which databases are unavailable. Check the Databases table on the Databases drilldown. The Status column shows which databases are unavailable.
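As an illustration of this filtering step, the sketch below picks out databases whose status is one of those named above. The row format and function name are assumptions made for the example, and the status set is not exhaustive, since the alarm also covers other statuses.

```python
# Statuses named in this section; the alarm also covers "others",
# so this set is illustrative rather than complete.
UNAVAILABLE_STATUSES = {
    "suspect", "offline", "recovering", "inrecovery",
    "loading", "restoring", "emergency mode",
}

def unavailable_databases(rows):
    """Return the names of databases whose status is in the set above.

    `rows` is a hypothetical list of (name, status) pairs standing in
    for the Databases table on the Databases drilldown.
    """
    return [name for name, status in rows
            if status.strip().lower() in UNAVAILABLE_STATUSES]

rows = [("Sales", "Online"), ("Archive", "Restoring"), ("HR", "Suspect")]
print(unavailable_databases(rows))  # ['Archive', 'HR']
```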

Some of the more common unavailable statuses are detailed in the following sections:

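Databases are set Offline only by a manual action, using the sp_dboption procedure or ALTER DATABASE. If any databases are Offline, consider using ALTER DATABASE to bring the database online again. The sp_dboption procedure can be used for this only on older versions, because it was removed in SQL Server 2012.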

Databases marked as Loading or Restoring are currently being restored by a RESTORE DATABASE or RESTORE LOG command. The database cannot be accessed by anyone while these commands are executed.

This status is also assigned to databases that have been restored using the NORECOVERY option. Specifying this option on a RESTORE statement tells SQL Server that additional transaction logs still need to be restored, and that no access to the database is permitted until those logs have been restored.

Check the Sessions panel on the SQL Activity drilldown for active sessions that are processing a RESTORE command (where the Last Command column contains Restore). If no sessions are processing a RESTORE command, the most likely reason for the database’s unavailability is that the last restore was carried out using the NORECOVERY keyword.

Removing the Loading/Restoring status requires completing the RESTORE process. This can involve either waiting for the active RESTORE command to complete, or restoring the remaining transaction logs. The last transaction log should be restored without the NORECOVERY keyword. If the database is mirrored, a Restoring status is shown on the mirror.
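The decision described above can be sketched as a small function. This is only an illustration: the function name and input format are hypothetical, and the mirrored-database case is not modeled.

```python
def restoring_next_step(last_commands):
    """Suggest a next step for a database shown as Loading or Restoring.

    `last_commands` is a hypothetical list of Last Command values copied
    from the Sessions panel on the SQL Activity drilldown.
    """
    if any("restore" in cmd.lower() for cmd in last_commands):
        return "wait for the active RESTORE command to complete"
    # No session is restoring, so the last restore most likely used NORECOVERY.
    return ("restore the remaining transaction logs; "
            "restore the last one without NORECOVERY")

print(restoring_next_step(["Select", "Restore Database"]))
print(restoring_next_step(["Select", "Insert"]))
```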

Databases are Recovering (or InRecovery) for a while when SQL Server is restarted, or the database is first set online. This is the status SQL Server uses for indicating that it is re-applying committed transactions, or removing uncommitted transactions after a SQL Server failure.

Normally, re-applying these transactions should take only a short time; however, if any long-running transactions were open when SQL Server ended abnormally, this procedure can take an extended period.

In some cases, it is advisable to bypass the SQL Server recovery process. For example, it makes sense to skip a lengthy recovery when you plan to drop the database as soon as recovery completes. For details on skipping the recovery process, see Bypassing SQL Server recovery.

Databases can be Suspect if they fail SQL Server's automatic recovery. This status most commonly appears after a SQL Server restart, when the automatic recovery process carried out during restart has failed. Databases can also be marked as Suspect when serious database corruption is detected.

The first measure that should be taken when a Suspect database is detected is to check the SQL Server error log, and look for error messages indicating recovery failure or database corruption. These messages should indicate the problem’s cause.

To correct a suspect database, consider taking the following measures:

Use the sp_resetstatus stored procedure (documented in Microsoft SQL Server’s Books Online) to reset the database status.
If the Suspect status was caused by a full disk during recovery, free up disk space and use the sp_resetstatus stored procedure (documented in Microsoft SQL Server’s Books Online) to reset the database status. SQL Server should then be restarted to initiate recovery.

In most cases, a suspect database is best handled by restoring the database from the last good full database backup and transaction logs.

Emergency mode is a special status, which can be set on an individual database, thereby causing SQL Server to skip recovery for this specific database. In some cases, taking this measure can make the corrupt database available in order to extract data that cannot be retrieved in any other way.

Activating emergency mode causes SQL Server to skip the recovery of this database, thereby preventing the database being made suspect. However, the database may contain partially-complete transactions, and there may be inconsistencies between data and indexes (logical and physical corruptions). Do not carry out any database changes or updates when SQL Server is started in this way. Emergency Mode is documented at:

Another high-risk option for accessing a suspect database is to start SQL Server with Trace Flag 3608. This trace flag causes SQL Server to skip its automatic recovery process on ALL DATABASES except master when it starts. Again, this procedure may be sufficient for extracting data that cannot be retrieved in any other way.

Use the sp_resetstatus stored procedure (documented in Microsoft SQL Server’s Books Online) to reset the database status of any Suspect databases.
Stop SQL Server, and then start it from a command line with Trace Flag 3608 and minimal startup (sqlservr.exe -f -c -T3608). This setting causes SQL Server to skip its automatic recovery at startup, thereby preventing the database from being made suspect. However, the database may contain partially-complete transactions, and there may be inconsistencies between data and indexes (logical and physical corruptions). Do not carry out any database changes or updates when SQL Server is started in this way.

With both Emergency Mode and Bypassing SQL Server Recovery, you may then be able to extract your data using BCP.EXE and/or script the database to get the latest database definitions. The extracted data can then be loaded into a new database using BCP.EXE or BULK INSERT. Be aware that the extracted data may not be complete.

File Group Utilization Alarm

The File Group Utilization alarm becomes active when a non-fixed size data file (that belongs to the file group) in any database is in danger of running out of space to grow.

This alarm is invoked whenever the space utilization percentage of a specific file group exceeds a predefined threshold value.
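As a rough illustration of a percentage-versus-threshold check, consider the sketch below. It rests on assumptions: the guide does not state how Foglight computes utilization, so "capacity" here means the total space the file group can grow to, and the 90 percent default threshold is hypothetical.

```python
def file_group_utilization_pct(used_mb, capacity_mb):
    """Percentage of the file group's capacity that is in use.

    capacity_mb is assumed to be the space the file group can grow to;
    the guide does not define the exact formula.
    """
    if capacity_mb <= 0:
        raise ValueError("capacity_mb must be positive")
    return 100.0 * used_mb / capacity_mb

def utilization_alarm(used_mb, capacity_mb, threshold_pct=90.0):
    """True when utilization exceeds the (hypothetical) threshold."""
    return file_group_utilization_pct(used_mb, capacity_mb) > threshold_pct

print(file_group_utilization_pct(50, 200))  # 25.0
print(utilization_alarm(950, 1000))         # True
```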

Under the Databases drilldown, click the Data Files panel.

Resolve this issue by freeing up disk space on the disk on which the file resides.

The File Group Utilization alarm is raised when the following scenario takes place:

Log Flush Wait Time Alarm

The Log Flush Wait Time alarm becomes active when the duration of the last log flush for a database exceeds a threshold.

As users make modifications to SQL Server databases, SQL Server records these changes in a memory structure called the Log Cache. Each SQL Server database has its own log cache.

When a user transaction is committed (either explicitly, by means of a COMMIT statement, or implicitly), SQL Server writes all changes from the Log Cache out to the log files on disk. This process is called a log flush. The user that issued the commit must wait until the log flush is complete before they can continue. If the log flush takes a long time, this degrades the user's response time.

Foglight for SQL Server checks the log flush wait time for the last log flush performed for each database. If a database has a slow log flush, and then has no update activity (and therefore no more log flushes) for a long time, Foglight for SQL Server continues to report this as an alarm until another log flush is performed for that database.
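The "sticky" behavior described above can be illustrated with a short simulation. This is a sketch under stated assumptions, not Foglight's code: the sample format, function name, and threshold value are all hypothetical.

```python
def log_flush_alarm_states(samples, threshold_ms):
    """Return the alarm state (True/False) at each collection.

    `samples` is a hypothetical list with one entry per collection: the
    wait time in ms of a log flush performed since the previous
    collection, or None if the database had no log flush.
    """
    last_flush_ms = None
    states = []
    for sample in samples:
        if sample is not None:
            last_flush_ms = sample  # a new flush replaces the stored value
        # The alarm is evaluated against the last flush, however old it is.
        states.append(last_flush_ms is not None and last_flush_ms > threshold_ms)
    return states

print(log_flush_alarm_states([5, 120, None, None, 8], threshold_ms=100))
# [False, True, True, True, False]
```

The alarm raised by the 120 ms flush stays active through the two idle collections and clears only when the next flush (8 ms) is recorded.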

On the Databases drilldown, select the Summary panel to review the Log Flush Wait Time counter in the Database History graph. The database with the high graph values is the one experiencing the problem. If a database has a consistently high value that never changes, run the SQL command CHECKPOINT on that database to force another log flush, and then check the value in Foglight for SQL Server again.
Select the Transaction Logs panel on the Databases drilldown to find the disks on which the log for this database resides.

Disk Queue Length Alarm

The Disk Queue Length alarm becomes active when the disk queue length of any disk exceeds a threshold. Sustained high disk queue length may indicate a disk subsystem bottleneck, and usually results in degraded I/O times.

Disk queue length is a Windows-based metric. Therefore, occurrence of the Disk Queue Length alarm does not necessarily indicate a problem with the SQL Server instance, and can be the result of I/O operations carried out by non-SQL Server processes. Nevertheless, SQL Server, as well as any other application running on the computer for which this alarm is raised, is affected by slower disk throughput.

On the SQL Activity drilldown, click the SQL I/O Activity panel and look at the SQL Server Physical I/O chart, to view whether SQL Server is generating high amounts of disk activity. This chart displays the rate (I/O per second) for each type of I/O that SQL Server is performing. If SQL Server is not generating a lot of I/O activity, the high disk queue length is most likely being caused by some other Windows process, or by Windows itself.
On the SQL Activity drilldown, click the Sessions panel to see which SQL Server processes were executing at the time the alarm was raised, and the SQL being executed.
On the SQL Activity drilldown, click the SQL I/O Activity panel and look at the SQL Server Physical I/O chart, to view the Checkpoint statistic. If the Checkpoint process is generating a lot of I/O, review the Recovery Interval setting in the Configuration drilldown.
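The three checks above can be combined into a simple triage sketch. The function name and the two boolean inputs are hypothetical. They stand for what an operator reads off the SQL Server Physical I/O chart, and what counts as "high" is left to the operator.

```python
def disk_queue_next_steps(sql_io_high, checkpoint_io_high):
    """Suggest follow-up steps for a Disk Queue Length alarm.

    Both arguments are hypothetical operator judgments based on the
    SQL Server Physical I/O chart.
    """
    if not sql_io_high:
        return ["Look outside SQL Server: another Windows process, or "
                "Windows itself, is the most likely cause."]
    steps = ["Check the Sessions panel for the sessions and SQL that were "
             "running when the alarm was raised."]
    if checkpoint_io_high:
        steps.append("Review the Recovery Interval setting in the "
                     "Configuration drilldown.")
    return steps

for step in disk_queue_next_steps(sql_io_high=True, checkpoint_io_high=True):
    print(step)
```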