
Foglight 5.9.1 - Performance Tuning Field Guide


Derivation Related Issues

This service is most often the cause of diagnostic snapshots that are 500 to 600 MB in size. Large file sizes indicate you may have an issue with an excessive number of derivations.

Search the file for the regular expression: with .* derivation rulettes. The results should resemble the following:

DATA_DRIVEN with 115 derivation rulettes

DATA_DRIVEN with 115 derivation rulettes

DATA_DRIVEN with 187230 derivation rulettes

DATA_DRIVEN with 97 derivation rulettes

DATA_DRIVEN with 115 derivation rulettes

The third line (187230 derivation rulettes) indicates an issue because its count is far larger than any other count listed. Locate the complex derivation definition, which appears above the matching "with .* derivation rulettes" line in the file, and make a note of the topology type and metric name. For example:

Complex derivation definition: DBSS_Total_Elapsed_Time_Per_Exec (null) : DerivationCalculation for DBSS_Top_Sql

where DBSS_Top_Sql is the topology type and DBSS_Total_Elapsed_Time_Per_Exec is the metric name.
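The search described above can also be scripted. The following is a minimal sketch, assuming the diagnostic snapshot is available as plain text. The function name find_rulette_outliers and the "100 times the median" cutoff are illustrative assumptions, not Foglight settings.

```python
import re
from statistics import median

# Matches lines such as "DATA_DRIVEN with 187230 derivation rulettes".
RULETTE_RE = re.compile(r"(\w+) with (\d+) derivation rulettes")

def find_rulette_outliers(text, factor=100):
    """Return (line_number, kind, count) for counts far above the median.

    `factor` is an arbitrary illustrative cutoff, not a Foglight setting.
    """
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        m = RULETTE_RE.search(line)
        if m:
            hits.append((lineno, m.group(1), int(m.group(2))))
    if not hits:
        return []
    typical = median(count for _, _, count in hits)
    return [h for h in hits if h[2] > factor * typical]

sample = """\
DATA_DRIVEN with 115 derivation rulettes
DATA_DRIVEN with 115 derivation rulettes
DATA_DRIVEN with 187230 derivation rulettes
DATA_DRIVEN with 97 derivation rulettes
DATA_DRIVEN with 115 derivation rulettes
"""
print(find_rulette_outliers(sample))  # [(3, 'DATA_DRIVEN', 187230)]
```

The reported line number is relative to the text that was scanned, so it can be used to jump to the matching line and read the complex derivation definition above it.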

If there is evidence of derivation problems, contact Quest Support with this information for further assistance and to determine if the issue has been resolved in the latest version of the affected cartridge.

---- com.quest.nitro:service=Topology

This is the topology service.

The most useful part of this service is the extra information, which lists all topology types and, for each type, the number of instances, the number of instance versions, the maximum versions, and the effective instance versions. This information helps you determine whether the topology is too large or whether there is topology churn. Look for a high number of instance versions, as well as a high maximum versions value.

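The topology table has the following six columns (as used in the examples later in this section):

Column 1: topology type name

Column 2: total number of versions of all instances

Column 3: maximum versions

Column 4: number of instances

Column 5: number of new versions created in the last 7 days

Column 6: number of new instances created in the last 7 days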

Topology Related Issues

Topology churn is defined as the constant changing and creation of new versions of existing topology objects. Each time a property is updated on an instance, a new version of that instance is created. Topology churn can cause high CPU usage as the Management Server propagates the changes across the rest of the topology model.

Topology growth is defined as the continuous creation of new instances of a type of topology object. Topology growth can cause high CPU usage as models and rulettes are updated, as well as increased JVM heap usage. The entire topology model is stored in memory, so as the number of objects added increases, so does the heap usage.

If the values in columns five and six in the table above are greater than 5000, examine the highest numbers and work your way down the list. Resolving issues with the higher ones can sometimes resolve other churn issues, since topology changes to one object can cause changes in other objects.

For example, consider the sample rows of the topology table below:

| DBO_Alert_Log | 76 | 2 | 38 | 76 | 38 |

This is an example of a good model. There are 38 instances (column 4), with a maximum of 2 versions (column 3), for a total of 76 versions (column 2). The numbers are in balance.

| DBO_Datafile | 5816 | 2 | 2908 | 2 | 1 |

This is also an example of a good, stable model. Even though the numbers are higher, 2908 x 2 = 5816, so the numbers are in balance. Additionally, in the last 7 days, there was only 1 new object, with 2 changes. There is no large growth or churn in this example.

| DBO_Undo_Activity_Info | 393761 | 16810 | 39 | 0 | 0 |

This is an example of a model that was bad but has become good. There are 393761 total versions in history, but no new changes (0) in the last 7 days.

| HostNetwork | 238231 | 4472 | 846 | 234543 | 42 |

This is a bad topology model. A large number (234543) of new versions have been created in the past 7 days.

| VMWESXServerPhysicalDisk | 28652 | 3 | 3530 | 10590 | 5295 |

This is also a bad model. In the past 7 days, 5295 new instances have been created. Column 4 indicates that some stale object cleanup has been done, but unless the root cause is found, the instances will keep being created.
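The checks applied to the sample rows can be sketched in code. This is a minimal example, assuming rows in the pipe-delimited form shown above. The field names are inferred from how the examples describe each column and are not official Foglight names. The 5000 threshold comes from the guidance on columns five and six.

```python
from typing import NamedTuple

THRESHOLD = 5000  # guidance value for columns five and six

class TopologyRow(NamedTuple):
    # Field names are inferred from the examples, not official names.
    type_name: str
    total_versions: int
    max_versions: int
    instances: int
    new_versions_7d: int
    new_instances_7d: int

def parse_row(line):
    """Split a '| name | n | n | n | n | n |' row into typed fields."""
    cells = [c.strip() for c in line.strip().strip("|").split("|")]
    return TopologyRow(cells[0], *(int(c) for c in cells[1:6]))

def classify(row):
    """Return the list of suspected problems for one topology type."""
    issues = []
    if row.new_versions_7d > THRESHOLD:
        issues.append("churn")
    if row.new_instances_7d > THRESHOLD:
        issues.append("growth")
    return issues

sample_rows = [
    "| DBO_Alert_Log | 76 | 2 | 38 | 76 | 38 |",
    "| DBO_Datafile | 5816 | 2 | 2908 | 2 | 1 |",
    "| DBO_Undo_Activity_Info | 393761 | 16810 | 39 | 0 | 0 |",
    "| HostNetwork | 238231 | 4472 | 846 | 234543 | 42 |",
    "| VMWESXServerPhysicalDisk | 28652 | 3 | 3530 | 10590 | 5295 |",
]

rows = [parse_row(line) for line in sample_rows]
flagged = [r for r in rows if classify(r)]
# Examine the highest numbers first and work down the list.
flagged.sort(key=lambda r: max(r.new_versions_7d, r.new_instances_7d), reverse=True)
for r in flagged:
    print(r.type_name, classify(r))
```

With the sample rows, only HostNetwork and VMWESXServerPhysicalDisk are flagged. DBO_Undo_Activity_Info is not, because its large totals are historical and nothing changed in the last 7 days.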

If there is evidence of topology problems, contact Quest Support with this information for further assistance and to determine if the issue has been resolved in the latest version of the affected cartridge.

---- com.quest.nitro:service=DataCacheEviction

This service lists metrics that are being held in the JVM waiting to be written to the database permanently. This information is located in the Cache Policies section of the diagnostic snapshot.

If many metrics (thousands) are held in memory for long periods of time, the garbage collector (GC) cannot reclaim them because they are still live objects. As a result, a large portion of the heap is occupied by data that should already have been written to the database. This can lead to JVM heap exhaustion and performance problems.

The following is an example of the Cache Policies section of the diagnostic snapshot:

Cache Policies:

cbc82b6a-1f8c-4fa8-a88a-fcb07af2854e:file_physical_io_pct - age:259200000 granularity:300000 cached duration:259500000 num values:123 delay:192764

c25e1d2-c20b-400e-b87c-b9749c28899a:DBO_File_Avg_Read_Time_Ms - age:259200000 granularity:300000 cached duration:259500000 num values:122 delay:43089

1a6c1537-3050-499d-9e93-1f702b1ab77f:file_read_time - age:259200000 granularity:300000 cached duration:259500000 num values:140 delay:11026

1279f9c8-c72d-4deb-9080-73eeed73d70a:DBO_Datafile_File_Write_Requests_Rate - age:259200000 granularity:300000 cached duration:259500000 num values:118 delay:12308

71d83485-c441-45e9-9cc6-ccd3bb510f71:file_physical_writes - age:259200000 granularity:300000 cached duration:259500000 num values:136 delay:37018

Each line can be broken down as follows:

cbc82b6a-1f8c-4fa8-a88a-fcb07af2854e - topology object ID

file_physical_io_pct - name of the metric

age:259200000 - length of time the metric is kept in memory (in ms)

granularity:300000 - rawness of the metric value (in ms)

num values:123 - number of values of this metric on this object

delay:192764 - length of time the metric has been in memory
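The field breakdown above can be turned into a small parser. This is a minimal sketch, assuming every Cache Policies line follows the layout of the sample lines. The function name parse_cache_policy and the dictionary keys are illustrative choices, not part of Foglight.

```python
import re

# Layout taken from the sample Cache Policies lines in this section.
CACHE_LINE_RE = re.compile(
    r"^(?P<object_id>[0-9a-f-]+):(?P<metric>\S+) - "
    r"age:(?P<age>\d+) granularity:(?P<granularity>\d+) "
    r"cached duration:(?P<cached_duration>\d+) "
    r"num values:(?P<num_values>\d+) delay:(?P<delay>\d+)$"
)

def parse_cache_policy(line):
    """Return the fields of one Cache Policies line, or None if it does not match."""
    m = CACHE_LINE_RE.match(line.strip())
    if m is None:
        return None
    text_fields = ("object_id", "metric")
    return {k: (v if k in text_fields else int(v)) for k, v in m.groupdict().items()}

line = (
    "cbc82b6a-1f8c-4fa8-a88a-fcb07af2854e:file_physical_io_pct - "
    "age:259200000 granularity:300000 cached duration:259500000 "
    "num values:123 delay:192764"
)
entry = parse_cache_policy(line)
print(entry["metric"], entry["num_values"], entry["delay"])
# age is given in ms, so 259200000 ms corresponds to 72 hours.
print(entry["age"] / 3_600_000, "hours")
```

Parsing every line this way makes it straightforward to count how many metrics are cached and to sort them by num values or delay before contacting Support.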

You can search the diagnostic snapshot for the metric name, and locate its parent topology in the XML schema. For example, for the metric detailed above:

<property name='file_physical_io_pct' type='Metric' is-many='false' is-containment='true' unit-name='count'>
<annotation name='UnitEntityName' value='percents'/>
</property>

This metric is contained in the following XML tag:

<type name='DBO_Datafile_IO_Activity'
extends='DBO_Instance_Alarm_Object'>

This indicates that the file_physical_io_pct metric is part of the DBO topology.
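The lookup of a metric's parent type can be sketched with a standard XML parser. This is a minimal example, assuming the schema portion of the snapshot has been extracted into a well-formed XML document without namespaces, in which type elements directly contain property elements. The types root element and the function name find_parent_types are hypothetical and used only for illustration.

```python
import xml.etree.ElementTree as ET

def find_parent_types(schema_xml, metric_name):
    """Return (type name, extends) for each type that declares the metric."""
    root = ET.fromstring(schema_xml)
    matches = []
    for type_el in root.iter("type"):
        for prop in type_el.findall("property"):
            if prop.get("name") == metric_name and prop.get("type") == "Metric":
                matches.append((type_el.get("name"), type_el.get("extends")))
    return matches

# The <types> wrapper is a hypothetical root element added for this sketch.
sample = """
<types>
  <type name='DBO_Datafile_IO_Activity' extends='DBO_Instance_Alarm_Object'>
    <property name='file_physical_io_pct' type='Metric' is-many='false'
              is-containment='true' unit-name='count'>
      <annotation name='UnitEntityName' value='percents'/>
    </property>
  </type>
</types>
"""
print(find_parent_types(sample, "file_physical_io_pct"))
```

For the sample above, the result names DBO_Datafile_IO_Activity as the containing type, which matches the manual search.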

Contact Quest Support with this information for further assistance and to determine if the issue has been resolved in the latest version of the affected cartridge.
