Foglight 6.3.0 - Administration and Configuration Guide

Persistence Handler Overview

The Persistence Handler Overview dashboard contains charts that display information about the persistence handler such as the number of rows in the table, data length, index length, batch insert time, and query time.

1
On the navigation panel, under Dashboards, click Management Server > Servers > Persistence Handler Overview.

Data Management

Use the Object Cleanup dashboard when you need to inspect, purge, and delete data objects, and particularly for removing objects that are no longer needed.

As an example, deleting agent instances from the Services dashboard removes the agent definitions. However, you can still view the agent entries in other dashboards such as the dashboard for Oracle. To remove the agent entries from these dashboards, you need to delete the agents using the Object Cleanup dashboard to remove the services and OracleModel Instances that were created for the Oracle dashboard.

Foglight builds topology models at run-time using the data collected from your monitored systems, and by default keeps information related to monitored types indefinitely. You can use the Global Default setting so that objects are not kept indefinitely. Objects that are not expected to change or update are considered stale. These stale objects consume monitoring resources and can be cleaned up. A very large number of stale objects may indicate a misconfiguration of your monitoring environment.

Use the Object Cleanup dashboard to remove stale objects, and to reconfigure your URL transformation rules if necessary.

On the navigation panel, under Dashboards, click Management Server > Servers > Object Cleanup.

The following walkthrough demonstrates how to locate and manage stale types based on the default settings provided. This example uses CatalystServiceabilityObject as the type with too many objects.

2
Select CatalystServiceabilityObject in the list.
The CatalystServiceabilityObject pane appears.
IMPORTANT: After you type a new number in the since box, you must press Apply to ensure the value is updated.
The % usage (current/capacity) and Filtered Objects (those that match your selection) numbers update automatically based on your selection.
If you know that there are stale objects that were created but never updated, you can filter on Inactive since x time range, where the time range could be days, weeks, months, or years. This case is useful for cleaning up older types.
IMPORTANT: After you type a new number in the since box, you must press Apply to ensure the value is updated.
The % usage (current/capacity) and Filtered Objects (those that match your selection) numbers update automatically based on your selection.
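
The "Inactive since" filter described above can be pictured with a minimal Python sketch. It is not Foglight code: the object names, timestamps, and the `inactive_since` helper are hypothetical, and the sketch assumes each object carries a last-updated timestamp.

```python
from datetime import datetime, timedelta

# Hypothetical records: (object name, last time the object was updated).
objects = [
    ("catalyst-a", datetime(2024, 1, 10)),
    ("catalyst-b", datetime(2024, 5, 30)),
    ("catalyst-c", datetime(2023, 11, 2)),
]

def inactive_since(records, now, days):
    """Return the names of objects whose last update is older than `days` days."""
    cutoff = now - timedelta(days=days)
    return [name for name, last_updated in records if last_updated < cutoff]

now = datetime(2024, 6, 1)
stale = inactive_since(objects, now, days=90)
print(stale)
```

Widening the time range (for example, from 90 days to a year) shrinks the filtered set, which is the same effect the Filtered Objects number shows in the dashboard.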

You can configure the number of days each type object is allowed to remain inactive before being automatically deleted.

2
Click Auto-delete.
Use the Retention Policy dashboard to set system-wide policies for managing database growth through automatic data aggregation and data purges. You can use standard policies (for example, short-term, long-term, or managed roll-ups) or create a custom retention policy.

When a specific retention policy has not been defined, the system-wide retention policies are used by default.

If you find that the standard retention policies do not meet your needs (for example, the long-term policy retains granular data for a longer period than you require), you can create a custom retention policy.

For example, suppose you want to change the retention policy to keep more granular data for at least one day, such as for capturing reporting data. After a day, you want to roll that data up to an hourly average and keep it for a week. Beyond that, you want to keep a daily average for 15 months.
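
The tiered policy in this example can be sketched as a small lookup. This is an illustration only, not how Foglight stores or evaluates retention policies: the `TIERS` table and `granularity_for` function are hypothetical, the ages are treated as cumulative, "raw" stands in for the most granular data, and 15 months is approximated as 450 days.

```python
from datetime import timedelta

# Hypothetical tiers: (maximum data age, roll-up granularity kept up to that age).
TIERS = [
    (timedelta(days=1), "raw"),               # granular data for one day (assumed label)
    (timedelta(weeks=1), "hourly average"),   # hourly roll-up kept for a week
    (timedelta(days=450), "daily average"),   # daily roll-up, 15 months approximated
]

def granularity_for(age):
    """Return the roll-up applied to data of the given age, or None once it is purged."""
    for max_age, granularity in TIERS:
        if age <= max_age:
            return granularity
    return None

print(granularity_for(timedelta(hours=6)))
print(granularity_for(timedelta(days=3)))
print(granularity_for(timedelta(days=100)))
print(granularity_for(timedelta(days=500)))
```

Each tier trades resolution for retention time, which is how a policy of this shape limits database growth.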

You can also navigate to an advanced dashboard to set rollups per object, using the Manage Retention Policies dashboard in the Administration module. For more information, see Manage Data Retention.

To create a custom retention policy:

1
On the navigation panel, under Dashboards, click Management Server > Servers > Retention Policies.
2
In the Retention Policy area, click Edit.
The Retention Policy dialog box opens.
Use the Manage Monitored Objects table to look at some of the common objects (organized by domain) in Foglight. For example, you can see that some old host and agent objects should be removed as they were part of the early testing of deployment. For each object you can see the amount of data for that object and its children in the total size. This is a handy breakdown as it gives you an idea of what data is returned.

1
On the navigation panel, under Dashboards, click Management Server > Servers > Object Cleanup.
2
Click the Data Management Browser link in the bottom right-hand corner.

Object, Name: Name of the object or its children. Click the object to view a popup with additional details about the object.
Direct database size: Estimate of the metric instances for the object and its children.
Topology: Displays the size of data (accumulated and directly) and its dependent objects.
Observations: Gives an idea of the direct (topology) size and any non-containment property.
Purged: Shows whether the current node has been purged. If Yes appears in the column, you can click it to see the purge details.
Last Updated: Shows the last time at which data was collected for a particular object and its children. This column ignores model alarm count calculations.

Use the Delete, Purge for objects, or Purge for types options in the header of the table to manage data objects.

Delete — Use this option to immediately delete the object, the object metric, or delete the object tree. You can choose whether to delete an object individually or delete its tree and also choose whether or not to delete the metrics associated with an object. For example, if you choose to delete a host, all data associated with the host is deleted.

Purge for objects — Use this option to schedule the removal of the observations for a selected object during the database off-hours maintenance cycle. By default, this maintenance is scheduled at the end of the month. For more information, see Purge data objects.

Purge for types — Use this option to schedule the removal of objects by types during the database off-hours maintenance cycle. By default, this maintenance is scheduled at the end of the month. For more information, see Purge data objects.

Use the Direct Database Size area to evaluate the current storage allocation for each object in the system and to purge or delete unwanted objects. For example, you can see which objects consume space and how much each one uses.

By viewing the size of the direct database you can see immediately how much space is being consumed by the database, as well as the number of metrics. You can also see a breakdown of space by metric age. For example, from the table you can tell there is a lot of old data and you can decide if you want to keep some old data, or which data to delete or purge.

Use the purge feature if you want to keep the topology object for a nightly cycle. Within a 24-hour period, the data object should be gone from the topology tree.

1
On the navigation panel, under Dashboards, select Management Server > Servers > Object Cleanup.
2
Click the Data Management Browser link in the bottom right-hand corner.
5
Click Purge for objects or Purge for types, as necessary.
7
Click Set.

When you delete an object, you actually remove the topology object immediately from the tree, including all of its children. For example, you may want to delete a data object if you have old data on a decommissioned host and want to remove the topology objects. If you delete an object that is still active, it will be re-created.

1
On the navigation panel, under Dashboards, choose Management Server > Servers > Object Cleanup.
2
Click the Data Management Browser link in the bottom right-hand corner.
4
Click Delete beside the object to remove the object and all of its children.
5
Click Yes in response to the confirmation message "Are you sure you want to delete this object and all of its children?"

Management Server View and Management Server Metrics

The Management Server View dashboard is useful for examining the server performance. It can help you diagnose potential performance problems and identify bottlenecks in your environment.

1
On the navigation panel, under Dashboards, choose Management Server > Servers > Management Server View.
2
Click Choose a Foglight Server and select the server from one or more list entries that appear.
Rule Service: Shows the number of rule and derived metric instances bound to data objects.
Data Service: Shows the ability of the server to keep up with incoming data.
JVM: Shows the memory performance of the JVM in which the server is running.
JDBC Connection Pool: Shows the number of database connections that are in use or available at any time.
Derivation Service: Shows the number of derivation rulettes, errors, and evaluation counts.
Email Sender Service: Shows the number of emails that are sent from the selected Management Server.
FMS Database Size: Shows the size of the database.

Object Groups

Most of the object groups needed for service monitoring already exist. Administrators can use the Object Groups dashboard to classify a group of data, and then use the Tier Definition dashboard to choose which data groupings to use as tiers. Object groups do not bind to data.

An object group is a mapping to a certain set of data types of the objects you are interested in. Once an object group is defined, its mapping can be evaluated to return these objects. Because the result depends on what exists in Foglight at the time of the evaluation, the membership list may vary.

Foglight ships with a number of default object groups and subgroups. These groups are marked as Created by Foglight and cannot be changed or deleted.

If an object group is used as a tier, it cannot be deleted. If an object group or object subgroup is used in a rule to define a dynamic group of components in a service, it cannot be deleted.

An object group cannot be used to define a rule in the Service Builder unless it contains more than one subgroup. Conversely, you are not allowed to delete all subgroups from an object group if the object group is used to create a rule to dynamically include a group of components in a service.

1
From the navigation panel under Dashboards, click Services > Object Groups.
The Create Object Group dialog box opens.
4
Ensure the Is Disabled check box is cleared to enable the object group activation.
5
Click Create.

After creating an object group, you need to add one or more subgroups to that group. For more information, see Create an object subgroup.

An object subgroup contains one or more objects, selected using common criteria, that make up that subgroup. For example, you can create a subgroup that contains all agent instances whose names start with “tor”.

The Create Object SubGroup dialog box opens.
Name: Type a name for the object sub-group that is unique in Foglight.
Description: Optional — Type a description of the object subgroup.
Data Type: Type a regular expression that selects the data type of the subgroup. This is a mandatory value. For example, the following expression selects all Foglight agents with the exception of WindowsAgent instances:
Query Conditions: Optional — Type a filter expression that selects specific object instances only. For example, the following expression selects only those object instances that are already selected by the Data Type expression and whose name starts with ‘tor’:
Is Disabled: Ensure the check box is cleared to enable the subgroup activation.
4
Click Test.
5
Click Create.
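
The way a Data Type expression and a Query Conditions filter narrow a subgroup can be illustrated with a short Python sketch. The expressions from the guide are not reproduced here: the regular expression, the type names, and the instance names below are all hypothetical, and `str.startswith` is only a stand-in for Foglight's own filter syntax.

```python
import re

# Hypothetical (type name, instance name) pairs; none of these come from the guide.
instances = [
    ("WindowsAgent", "tor-win-01"),
    ("UnixAgent", "tor-unix-01"),
    ("UnixAgent", "van-unix-02"),
    ("Host", "tor-host-01"),
]

# Assumed Data Type pattern: any type ending in "Agent" except WindowsAgent.
data_type = re.compile(r"^(?!WindowsAgent$)\w*Agent$")

def select(records, type_pattern, name_prefix):
    """Keep instances whose type matches the pattern and whose name has the prefix."""
    return [
        name
        for type_name, name in records
        if type_pattern.match(type_name) and name.startswith(name_prefix)
    ]

selected = select(instances, data_type, "tor")
print(selected)
```

The negative lookahead `(?!WindowsAgent$)` is one common way to express "everything except one name" in a regular expression. Use Test in the dialog box to confirm what an expression actually selects in your environment.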

Use the Tier Definitions dashboard to associate an object group to a tier. A tier is another logical component that references one or more object groups, based on some common criteria. This type of logical structure helps you investigate performance problems associated with multiple objects. For example, to investigate the state of the monitored components within a tier, drill down on a tier and investigate the hosts that are related to it. For more information about services, see the Foglight User Guide.

The object groups that are made available in the Tier Definitions dashboard are based on object groups that were created using the Object Groups dashboard. For more information, see Create an object group.

The object groups that you select in the Tier Definitions dashboard appear in the Tier drop-down list when you select a new local or global service using the Service Builder dashboard. These tiers are also used in the Service Operations Console, where they show the service breakdown by tier.

NOTE: New tiers are hidden by default in the Service Operations Console. To display a new tier, in the Service Operations Console, click Customize Service Operations Console, open the Tier Selector tab, and ensure the newly-created tier is selected. For more information about the Service Operations Console, see the related help page.
1
On the navigation panel under Dashboards, click Services > Tier Definition.

The FMS Overview view is a starting point for performance analysis of the Foglight Management Server. It displays information about the database activity, data service performance, JVM performance, and server load in one unified view.

In federated environments, explore the diagnostics for each individual server: Federation Master, Federation Child, or a standalone Foglight Management Server. To do that, click Choose a Foglight Server and select a host from the list that appears.

This view provides more information on:

Data load summary (Data Service and Inserts per 5 minutes views). In the Inserts per 5 min embedded view, num_rows_inserted_per5min shows the number of rows inserted every five minutes, while avg_insert_per5min shows the one-hour average of num_rows_inserted_per5min. Spikes on the Data Service embedded view often indicate some kind of data processing lull, such as a high amount of incoming data (seen as an increase in batchesProcessed) or insufficient resources to process incoming data (seen as an increase in batchesDiscarded).
Whether there is enough memory (JVM Memory Usage view). In this view, memory_usage shows the heap memory usage, totalMemory shows the total amount of heap memory, and freeMemory shows the free heap memory. A sawtooth on the JVM memory graph is normal, but if the amount of memory freed by a garbage collection (that is, the height of the sawtooth) is small, you may need more memory on your system. A large topology results in a high amount of incoming data and a decrease in free memory. The amount of free memory should never be zero. If it does reach zero, this is an indication that the server configuration needs to be adjusted.
Whether the load on the server is too high (Server Load Status view). Server load is mostly determined by the memory usage in the JVM old generation. If the server runs out of memory, it goes into an “overloaded” state. A server can be in this state and still have some memory available in the JVM new generation. If that is the case, the server discards incoming agent data but can continue to service browser interface requests to some degree. If the server remains overloaded, it likely needs to be reconfigured to have more memory available, or to receive less workload from the browser interface, the agents, or both.

Database insert spikes, slow processing

Incoming data spikes (batchesProcessed increases), slow processing (batchesDiscarded increases)

Too large topology, too much incoming data (freeMemory decreases)

Load is higher than 0.7
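
As a rough illustration, the symptoms above can be checked against two consecutive metric samples. This sketch is not part of Foglight: the `diagnose` function and the sample dictionaries are hypothetical, the sample values are assumed to be per-interval readings, and only the 0.7 load threshold and the zero-free-memory condition come from the text above.

```python
LOAD_THRESHOLD = 0.7  # load value named in the symptom list above

def diagnose(previous, current):
    """Compare two samples (dicts of metric values) and list the symptoms that apply."""
    findings = []
    if current["batchesDiscarded"] > previous["batchesDiscarded"]:
        findings.append("slow processing")
    if current["freeMemory"] == 0:
        findings.append("no free memory")
    if current["load"] > LOAD_THRESHOLD:
        findings.append("high server load")
    return findings

# Hypothetical samples taken one collection interval apart.
before = {"batchesDiscarded": 0, "freeMemory": 512, "load": 0.4}
after = {"batchesDiscarded": 12, "freeMemory": 0, "load": 0.85}
print(diagnose(before, after))
```

A check like this only flags what to look at next. The embedded views remain the place to confirm whether the cause is incoming data volume, memory, or overall load.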

The Agents view is designed to see what agents you are running, listed by type. By selecting agent types in the tree, a graph showing the number of agents as a function of time is plotted. This graph is handy for viewing changes to the agents, for example if a number of agents that you did not know about were added, or checking if old agents are properly removed.

1
From the navigation panel, click Dashboards > Management Server > Diagnostic > Performance.
2
In the display area, click FMS Overview, and in the list that appears, select Agents.
3
To view agent connectivity, drill down to the agent type in the Agent Type column.

The Baseline Measurements view shows internal database I/O statistics related to baseline computation. A metric baseline is the expected metric range for the given time period. When configured, IntelliProfile computes baselines for metrics, and stores that data in the database.

1
From the navigation panel, click Dashboards > Management Server > Diagnostic > Performance.
2
In the display area, click FMS Overview, and in the list that appears, select Baseline Measurements.
The Baseline Measurements view appears in the display area.
Table 6. Look for the following types of information in the Baseline Measurements view.

High peaks in the graph usually indicate an impact on the database performance, depending on the database capabilities.

High peaks in the graph usually indicate an impact on the database performance, depending on the database capabilities.

High values in the graph may be a sign that baselining activities have a high impact on the database performance, depending on the database capabilities.

This graph shows zero values at all times, unless baseline profiles are explicitly removed for trouble-shooting purposes, usually requiring manual intervention.

This graph shows zero values at all times, unless baseline profiles are explicitly removed for trouble-shooting purposes, usually requiring manual intervention.

Normally these values should correspond to those appearing in the Store Call Count graph. If they are greater, some attempts failed due to database-related errors.

High peaks in the graph usually indicate an impact on the database performance, depending on the database capabilities.

High peaks in the graph usually indicate an impact on the database performance, depending on the database capabilities.

Outstanding baseline management operations can include enabling or disabling baselines for a particular metric. This graph can be expected to show zero values at all times.

The Connectivity view shows the Foglight connectivity for the JDBC connection pool and the user session count. It summarizes the database connection state from the perspective of the Foglight Management Server. By looking at this view, you can determine if there are enough JDBC connections to service application requests.

1
From the navigation panel, click Dashboards > Management Server > Diagnostic > Performance.
2
In the display area, click FMS Overview, and in the list that appears, select Connectivity.
The Connectivity view appears in the display area.

Overused pool indicates a slow database.

High number of sessions results in a high browser interface load, indicating a slow browser interface/server.

The Database Overview view displays the metrics that represent the reads, inserts, and updates of the database rows. These metrics can be used in performance load tests to track the unit load.

This view includes a drop-down list for selecting the host name. Here, you can select a Federation Master, Federation Child, or standalone server. To do that, click Choose a Foglight Server and select a host from the list that appears. It also contains the following embedded views:

The names of the metrics that contain database reads, inserts, and updates depend on the type of supported database that Foglight uses to store information, as indicated in the following table.

num_rows_deleted

num_rows_inserted

num_rows_updated

innodb_rows_inserted

innodb_rows_inserted_per5min: #innodb_rows_inserted# * 300

avg_inserts_per5min: avg(#innodb_rows_inserted_per5min for 60 minutes#)

qa_db_inserts : select sum(inserts) from sys.dba_tab_modifications where TABLE_OWNER='xxxx'

delta_db_inserts: delta(#qa_db_inserts from QA_DB_INSERTS_Agent_Table_DB_INSERT_COUNT#)

delta_avg_per5min: avg(#delta_db_inserts for 60 minutes#)
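
The arithmetic behind the per-5-minute and one-hour-average expressions above can be sketched in Python. This is a worked illustration under stated assumptions, not Foglight's derivation code: it assumes the base metric is a per-second rate (which is why it is multiplied by 300 seconds) and that one sample is taken every five minutes, so 60 minutes covers 12 samples. The function names and sample rates are hypothetical.

```python
SECONDS_PER_5MIN = 300
SAMPLES_PER_HOUR = 12  # one sample every five minutes (assumption)

def per_5min(rate_per_second):
    """Scale a per-second insert rate to rows inserted per five minutes."""
    return rate_per_second * SECONDS_PER_5MIN

def hourly_average(samples_per5min):
    """Average the most recent 60 minutes of five-minute samples."""
    window = samples_per5min[-SAMPLES_PER_HOUR:]
    return sum(window) / len(window)

# Hypothetical per-second insert rates, one per five-minute interval.
rates = [2.0] * 6 + [4.0] * 6
samples = [per_5min(rate) for rate in rates]
print(samples[-1])              # 1200.0
print(hourly_average(samples))  # 900.0
```

The averaged value reacts more slowly than the raw per-5-minute value, which makes sustained load easier to tell apart from a short burst.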

1
From the navigation panel, click Dashboards > Management Server > Diagnostic > Performance.
2
In the display area, click FMS Overview, and in the list that appears, select Database Overview.
The Database Overview view appears in the display area.

A high number of database operations may indicate a possible database bottleneck. The cause could be related to a problem on the server side, such as topology changes.

High load may indicate a possible database bottleneck. Check the database specifications.

May indicate a dirty database buffer, possibly because the InnoDB buffer pool is too small.

Check if there is anything more than row operations.

This view shows the invocation counts of and time spent on garbage collection. It contains two embedded views:

Copy: This view shows the activity of the ParNew garbage collector, used to reclaim memory in the JVM new generation part of the heap. In most cases, there is no need to be concerned with the data on this view unless the numbers become unreasonable (one hour or higher).
ConcurrentMarkSweep: This view shows the activity of the ConcurrentMarkSweep garbage collector, used to reclaim memory in the JVM old generation part of the heap. In general, the values appearing on this view should stay low. An occasional spike (such as once a day) does not indicate a problem. If spikes become frequent, the Foglight Management Server memory allocation may need to be adjusted (for example, by balancing the size of the new and old generations). In some cases, you may need to increase the overall Foglight Management Server heap.

1
From the navigation panel, click Dashboards > Management Server > Diagnostic > Performance.
2
In the display area, click FMS Overview, and in the list that appears, select Garbage Collector.
The Garbage Collector view appears in the display area.

The Java Virtual Machine (JVM) view provides full details on the JVM memory activity. This view is particularly useful if you are performing detailed JVM tuning for the different memory spaces.

To find out more about the JVM heap generation, observe the spikes on the embedded views. Big spikes with a slow climb are better than small spikes (too little available memory) or a quick climb (frequent garbage collection).

1
From the navigation panel, click Dashboards > Management Server > Diagnostic > Performance.
2
In the display area, click FMS Overview, and in the list that appears, select Java Virtual Machine Memory.
The Java Virtual Machine Memory view appears in the display area.

This view shows the number of rulettes (rule instances) that are bound to specific object instances. This view is useful for understanding the environment complexity.

The Rulette and Topology view shows how many rules and rule instances are running. Many rulettes, especially relative to the number of objects, can indicate that you are overloading your server or that some old rules are still running.

1
From the navigation panel, click Dashboards > Management Server > Diagnostic > Performance.
2
In the display area, click FMS Overview, and in the list that appears, select Objects - Rulette and Topology.
The Objects - Rulette and Topology view appears in the display area.

This view lists Foglight scripts with the script ID, script name, sum (ms), count, the difference between minimum and maximum invocation time, and the name of the cartridge in which the script is defined. A plot summary of the script invocation time also appears.

1
From the navigation panel, click Dashboards > Management Server > Diagnostic > Performance.
2
In the display area, click FMS Overview, and in the list that appears, select Script Invocation Time.
The Script Invocation Time view appears in the display area.

This view shows the Foglight Management Server load metrics and JVM performance.

1
From the navigation panel, click Dashboards > Management Server > Diagnostic > Performance.
2
In the display area, click FMS Overview, and in the list that appears, select Server Load.
The Server Load view appears in the display area.

This view highlights the data and metrics that are being processed in time or getting discarded.

The Service - Data & Message view shows details on data handling of the Foglight Management Server such as the number of skipped messages and the number of discarded metrics. The top-right embedded view shows the percentage of data that is dropped by the server over time, while the top-left embedded view shows the counts of total and skipped messages over time. A high percentage of dropped data can often indicate a server overload. The embedded view in the bottom right is useful for indicating how long data processing is expected to take. If that value grows over time, that is a good indication that the Foglight Management Server cannot keep up with the load from the agents.

1
From the navigation panel, click Dashboards > Management Server > Diagnostic > Performance.
2
In the display area, click FMS Overview, and in the list that appears, select Service - Data and Message.
The Service - Data and Message view appears in the display area.

The Service - Derivation and Query view shows activity in the derivation and query service such as the number of evaluations, rulettes, errors, and cache performance.

This view shows a mix of graphs that indicate the cost of metrics such as how many derived metrics are running in the Foglight Management Server, how many metric evaluations have occurred, and information on finding metrics in memory. Metrics are resource-demanding if there are a lot of derived metrics compared to the number of topology objects, or if the memory settings are such that no metric history is being kept in memory.

1
From the navigation panel, click Dashboards > Management Server > Diagnostic > Performance.
2
In the display area, click FMS Overview, and in the list that appears, select Service - Derivation and Query.
The Service - Derivation and Query view appears in the display area.

The Service - Persistence view shows metric details for the Foglight Management Server that are related to:

1
From the navigation panel, click Dashboards > Management Server > Diagnostic > Performance.
2
In the display area, click FMS Overview, and in the list that appears, select Service - Persistence.
The Service - Persistence view appears in the display area.

This view shows the number of topology changes, skipped messages, and total messages. This view is useful for understanding the topology structure relating to agent activity.

The Service - Topology & Agent Manager is a two-part view.

1
From the navigation panel, click Dashboards > Management Server > Diagnostic > Performance.
2
In the display area, click FMS Overview, and in the list that appears, select Service - Topology and Agent Manager.
The Service - Topology and Agent Manager view appears in the display area.

The UI Query Time view is similar to the Script Invocation Time view except that it shows tables for a query instead of a script.

This view shows a query table listing the query name, query ID, amount of time spent on invocation, count, and difference between minimal and maximal invocation time. A graph summarizing query invocation time also appears.

1
From the navigation panel, click Dashboards > Management Server > Diagnostic > Performance.
2
In the display area, click FMS Overview, and in the list that appears, select UI Query Time.
The UI Query Time view appears in the display area.

Some Foglight dashboards have reports associated with them. This allows you to run a report based on the current dashboard. You can generate the report using the Reports menu in the top-right corner.

The Performance Overview dashboard and Management Server View are associated with the Management Server Performance Summary Report. Run this report by choosing Management Server Performance Summary Report from the Reports menu, and specifying the input parameters in the report wizard.

The report wizard provides more information about the Management Server Performance Summary Report and instructions on how to set the input values. For more information about reports in Foglight, see the Foglight User Help.
