"IO Error: Socket read interrupted" in Oracle agent logs causing DBO Usability alarms to fire every few minutes
Description
DBO Usability alarms fire every few minutes, and messages similar to the following appear in the Oracle database agent log files. This affects Oracle instances at version 12.2.0.0.0 and higher.
ERROR [AGENTNAME-lowPriorityPool-629-[DBO_Top_SQLs][]] com.quest.qsi.fason.core.common.utils.SQLTextExecutor - The processor that was supposed to get the SQL text of {hash value=670665138} hash value failed to run on AGENTNAME instance instance of DBO_Top_SQLs collection. java.lang.RuntimeException: Failed to execute collection [DBO_Top_SQLs], reason=IO Error: Socket read interrupted, Authentication lapse 69953 ms.- Profile:OracleProfile{host='HOSTNAME', service='SERVICENAME', username='USERNAME', asSysDBA=true, ports='1521', useSSL=false, properties={oracle.jdbc.ReadTimeout=900000}}
Cause
This is not a Foglight issue. It is described in Oracle's Metalink document "12.2.0.1 and Above JDBC Connections Sometimes Fail With: IO Error: Socket Read Interrupted (Doc ID 2612009.1)".
The JDBC driver at version 12.2.0.1 and above uses Java NIO calls in blocking mode, which can be affected by any interrupt() calls made by the application. Earlier versions of the JDBC driver used stream-based I/O API calls, which were not affected by calls to interrupt(). Note that this is an intentional change beginning with the 12.2.0.1 JDBC driver, not a bug.
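The interrupt sensitivity of blocking NIO reads can be reproduced with the plain JDK, without any Oracle components. The following is a minimal sketch, not the Oracle driver's code: the class name NioInterruptDemo and the method blockedReadOutcome are hypothetical, and the loopback listener merely stands in for a database that has not yet replied.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical demo class: illustrates JDK behaviour only, not Oracle JDBC internals.
public class NioInterruptDemo {

    /** Returns the simple class name of the exception that ended the blocked read. */
    public static String blockedReadOutcome() {
        try (ServerSocketChannel server = ServerSocketChannel.open()) {
            // Loopback listener on an ephemeral port. It accepts the connection
            // but never writes, so the client's read has nothing to return.
            server.bind(new InetSocketAddress(InetAddress.getLoopbackAddress(), 0));

            try (SocketChannel client = SocketChannel.open(server.getLocalAddress());
                 SocketChannel accepted = server.accept()) {

                AtomicReference<String> outcome =
                        new AtomicReference<>("read returned normally");

                Thread reader = new Thread(() -> {
                    try {
                        // Blocking-mode NIO read on an interruptible channel.
                        client.read(ByteBuffer.allocate(16));
                    } catch (IOException e) {
                        outcome.set(e.getClass().getSimpleName());
                    }
                });

                reader.start();
                Thread.sleep(500);   // give the reader time to block in read()
                reader.interrupt();  // stands in for an application-level interrupt()
                reader.join(5000);   // wait for the reader to finish
                return outcome.get();
            }
        } catch (IOException | InterruptedException e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(blockedReadOutcome());
    }
}
```

Because SocketChannel is an interruptible channel, the interrupt closes the channel and the read ends with a java.nio.channels.ClosedByInterruptException. A read on a classic stream-based socket is not ended this way by interrupt(), which matches the difference between driver versions described above.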
Resolution
Edit the Foglight Agent Manager baseline.jvmargs.config file and add a new line similar to the following, replacing the .0 with a higher number if other lines already exist in the file.