Foglight has generated an alarm for the rule DBO - Redo Archive Average Redo Write. The alert message indicates that the average time spent writing redo log entries to disk was high, even though little or no redo data was actively being generated in the database.
How is Redo Archive Average Redo Write time calculated?
Why might the alert trigger even when redo activity appears low?
You receive a critical or warning alarm such as:
DBO - Redo Archive Average Redo Write
Severity: Critical
Message: The average time spent writing a redo log entry to the log files is 253.52 ms.
The alarm is based on average redo log write latency, not the volume of redo generated.
Foglight calculates this using the following formula:
DBO_Avg_Redo_WriteTime = log_write_waits_aux / redo_writes_aux
Where:
log_write_waits_aux is the total log-related wait time (in milliseconds) from v$system_event.
redo_writes_aux is the total number of redo write operations from v$sysstat.
This value represents the average time (in ms) spent writing each redo entry.
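As a minimal sketch, the formula reduces to a single division. The variable names and sample values below are illustrative (real values come from the cumulative counters Foglight collects):

```python
# Hypothetical counter values; in practice these come from v$system_event
# (cumulative log-related wait time, ms) and v$sysstat (cumulative redo writes).
log_write_waits_aux = 227_390   # total log-related wait time in ms
redo_writes_aux = 22_583        # total number of redo write operations

# Average time (ms) spent writing each redo entry
avg_redo_write_time = log_write_waits_aux / redo_writes_aux
print(f"{avg_redo_write_time:.2f} ms")  # prints "10.07 ms"
```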
The Foglight rule logic is:
condition = (
avg(#DBO_Avg_Redo_Write_Time for 15 minutes#) >= registry("DBO_Avg_redo_write_time_Medium")
);
This means:
Foglight calculates the 15-minute average redo write time.
It then compares it against the configured threshold (e.g., 60 ms).
If the average exceeds the threshold, the alarm fires.
There is no baseline model or percentage comparison used.
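The rule logic above can be sketched as a plain threshold check on the mean of the window's samples (the threshold value and sample lists below are illustrative, not taken from a live registry):

```python
def alarm_fires(samples_ms, threshold_ms=60.0):
    """Return True when the mean of the 15-minute window's samples
    meets or exceeds the threshold, mirroring the rule condition."""
    return sum(samples_ms) / len(samples_ms) >= threshold_ms

# Three 5-minute samples covering a 15-minute window
print(alarm_fires([628.28, 72.07, 7.06]))  # True  (mean ~235.8 ms)
print(alarm_fires([12.0, 9.5, 11.2]))      # False (mean ~10.9 ms)
```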
Even if only a few redo writes occur, the average write time can be inflated due to:
Occasional I/O spikes.
Archive or redo-related wait events.
Low workload (a small number of writes with one long stall can skew the average).
Example:
If only 2 redo writes happen during 15 minutes, one taking 400 ms and the other 10 ms, the average becomes:
(400 + 10) ms / 2 = 205 ms
This exceeds the default threshold and triggers the alarm.
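The skew effect is easy to verify directly (values taken from the example above):

```python
# Two redo writes in 15 minutes: one long stall (400 ms) and one normal (10 ms)
write_times_ms = [400, 10]

# One slow write dominates the average when the sample count is small
avg = sum(write_times_ms) / len(write_times_ms)
print(avg)  # prints 205.0, well above a 60 ms threshold
```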
Users can view the topology in the following location, where each collection result is displayed. Set the zonar to the appropriate time range.
Note
Tune the threshold:
Increase DBO_Avg_redo_write_time_Medium to a more suitable value (e.g., 95 ms).
Adjust depending on system I/O characteristics and tolerance.
Let's use this point-in-time query as an example:
select LOG_WRITE_WAITS_AUX / REDO_WRITES_AUX
At 17:40, that might give:
227,390 / 22,583 = 10.07 ms
That looks fine — so would Foglight send an alarm like this?
“Average redo write time = 235.8 ms” at 17:40
Foglight uses changes over time, not the total value at one moment.
Let’s say this is what Foglight saw every 5 minutes:
| Time  | LOG_WRITE_WAITS_AUX | REDO_WRITES_AUX |
|-------|---------------------|-----------------|
| 17:25 | 194,060             | 22,283          |
| 17:30 | 194,540             | 22,351          |
| 17:35 | 209,170             | 22,554          |
| 17:40 | 227,390             | 22,583          |
Foglight does this:
Between 17:35 and 17:40:
(227,390 - 209,170) / (22,583 - 22,554)
= 18,220 / 29 = 628.28 ms
Between 17:30 and 17:35:
14,630 / 203 = 72.07 ms
Between 17:25 and 17:30:
480 / 68 = 7.06 ms
Then it averages the 3 intervals:
(628.28 + 72.07 + 7.06) / 3 = ~235.8 ms
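The interval-delta calculation above can be sketched as follows, using the sample values from the table:

```python
# Cumulative counters sampled every 5 minutes: (time, waits_ms, writes)
samples = [
    ("17:25", 194_060, 22_283),
    ("17:30", 194_540, 22_351),
    ("17:35", 209_170, 22_554),
    ("17:40", 227_390, 22_583),
]

# Per-interval average = delta(wait time) / delta(write count)
interval_avgs = [
    (w1 - w0) / (n1 - n0)
    for (_, w0, n0), (_, w1, n1) in zip(samples, samples[1:])
]

print([round(a, 2) for a in interval_avgs])               # [7.06, 72.07, 628.28]
print(round(sum(interval_avgs) / len(interval_avgs), 2))  # 235.8
```

This is why deltas matter: the single slow interval (17:35 to 17:40) dominates the 15-minute average even though the cumulative ratio looks healthy.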
Even though the overall value looks normal at a glance (10 ms), Foglight sees that in one 5-minute window, redo writes were slow — and that spikes the 15-minute average.
That's why an alarm can be triggered.
This metric is not directly related to archive log size or frequency. A low number of archives (e.g., 2 logs of 100MB) does not imply redo writes are fast.
The alarm is driven by LGWR wait times, not redo volume.