Monitor events with sp_eventmon
The sp_eventmon monitoring script monitors the SharePlex Event Log (event_log) at set intervals for entries relating to key replication events. You can define the scan interval and the error messages you want the script to detect. Each scan starts where the previous one stopped to keep the impact on the system minimal and prevent duplicate warnings.
The sp_eventmon script takes the following actions after each scan of the Event Log:
- When sp_eventmon detects an error that you defined, it writes a notification to the error.splex log file and, if that option is enabled, sends an e-mail message.
- It logs each error, the Event Log line number of the error, the sp_cop instance name (typically the port number), and the time and date of the error.
The script relies on the iwgrep program, the error_list file (described later), and a marker file named username.mrk (where username is derived from the string that you enter with the -s argument when you run sp_eventmon). These three components must be kept in the same directory as the script, or it will not function.
Note: The username.mrk file prevents duplicate warning messages from being sent to the log and to your e-mail or pager. Without this file, the script starts scanning the Event Log from the beginning every time it starts, and warnings that were previously generated are sent again.
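The role of the marker file can be illustrated with a minimal shell sketch. It is not the sp_eventmon source: the function name scan_new_lines, the idea of storing a line count in the marker file, and the use of plain grep in place of iwgrep are all assumptions made for illustration.

```shell
# Illustrative sketch only; this is not the sp_eventmon source.
# Assumption: the marker file holds the number of log lines already scanned.
# Usage: scan_new_lines LOG_FILE MARKER_FILE PATTERN_FILE
scan_new_lines() {
    log=$1
    marker=$2
    patterns=$3
    last=0
    # Resume after the last scanned line if a marker file exists.
    if [ -f "$marker" ]; then
        last=$(cat "$marker")
    fi
    # Count the lines currently in the log (tr strips padding some wc versions add).
    total=$(wc -l < "$log" | tr -d ' ')
    # Print only the lines added since the previous scan that contain one of
    # the fixed strings listed in the pattern file (one string per line).
    awk -v s="$last" -v e="$total" 'NR > s && NR <= e' "$log" |
        grep -F -f "$patterns" || true
    # Save the new position so that the next scan does not repeat warnings.
    echo "$total" > "$marker"
}
```

In this sketch, deleting the marker file makes the next scan start from line 1 again, which is the duplicate-warning behavior the note describes. A saved position that is larger than the current log would cause new entries to be skipped, so the sketch also assumes the marker is removed whenever the log is truncated.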
Prepare to run sp_eventmon
Before running the script, perform the following tasks.
Satisfy requirements
See Requirements for using the monitoring scripts before using this script.
Note: The script must be run in the ksh shell.
Define error messages
The sp_eventmon script scans for events listed in the error_list file, located in the util sub-directory of the SharePlex product directory. View that file for more information about the supported errors. You can add custom error strings to the error_list file by editing it in any ASCII text editor. Open the file and place each error string on a separate line.
Set IW_HOME
The IW_HOME variable in the script must be set to the correct value on each machine. This variable must point to the directory in which the monitoring scripts and iwgrep reside.
If the path is not correct:
- Open the script in the app-modules directory of the SharePlex product directory.
- Set the path as shown in the following example:
IW_HOME=/export/home/splex/monscripts
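Before starting the script, it can help to confirm that the directory named in IW_HOME contains the files that the script depends on. The following is a minimal sketch; the check_iw_home function is hypothetical and is not shipped with SharePlex, and the list of files it checks is taken from the description of sp_eventmon in this section.

```shell
# Hypothetical pre-flight check; not part of SharePlex.
# Usage: check_iw_home DIRECTORY
# Returns 0 when the script, iwgrep and error_list are all present.
check_iw_home() {
    dir=$1
    missing=0
    for f in sp_eventmon iwgrep error_list; do
        if [ ! -e "$dir/$f" ]; then
            echo "missing: $dir/$f"
            missing=1
        fi
    done
    return "$missing"
}

# Example use with the path from the example above:
# check_iw_home /export/home/splex/monscripts && echo "IW_HOME looks complete"
```

The marker file is not checked because it is created the first time the script runs.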
Define e-mail addresses
To use the e-mail notification feature, define the e-mail address(es) in the script before running it.
- Open the script in the app-modules directory of the SharePlex product directory.
- Add any number of address strings after the MailUserName= variable. Use the full e-mail and/or pager address. Separate multiple entries with a comma, as shown in the following example:
MailUserName=scott@company.com,12345678910@pageservice.com
Run sp_eventmon
Notes:
- If you are running multiple instances of sp_eventmon, each instance must be run under the name of a different operating system user. Each username.mrk file will have a different username.
- Use the truncate log command in sp_ctrl to truncate the Event Log frequently when you are running the sp_eventmon script. If the log grows too large, the iwgrep program cannot grep from it properly. When you issue the truncate log command, remove the username.mrk file. The next time you run sp_eventmon, it creates a new file. See the SharePlex Reference Guide for more information about the truncate log command.
- When there is an existing Event Log with errors in it and the script is running, issue the truncate log command and then delete the sp_cop_name.mrk file, where sp_cop_name is the value used in the -s argument when the script was run. This file is in the util sub-directory of the SharePlex product directory.
To run sp_eventmon
Run the script from the util sub-directory of the SharePlex product directory, not from app-modules. When you run it from the util directory, you actually invoke a soft link to a utility that first sets up the correct environment and then runs the script itself.
Syntax:
nohup sp_eventmon -s 'sp_copname' -t interval [-p path] [-n name] [-m] [> /dev/null] &
Table 8: Required arguments
nohup sp_eventmon | Directs the script to continue running in the background if the user logs out. This ensures continuous monitoring. The sp_eventmon component runs the script.
-s 'sp_copname' | Sets the name of sp_cop that was used when sp_cop was started with the -u option. The name of sp_cop must be enclosed within single quote marks. You can use this parameter more than once to monitor multiple sp_cop instances on a system. Without this parameter, sp_eventmon will not start.
& | Runs the script in the background.
-t interval | Sets the time interval between scans in seconds. The value can be any positive integer.
Table 9: Optional arguments
-p path | Sets the path to the SharePlex variable-data directory. Without this argument, sp_eventmon assumes the default path.
> /dev/null | Redirects the notification output to the /dev/null device on the local system so that the monitoring process continues to run in the background and generate output. To have the output appear on screen, omit this argument.
-n name | Sets the name of the Event Log if it is something other than the default name “event_log.”
-m | Enables the e-mail/paging option. Without this option, sp_eventmon only logs errors to the log file.
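As an illustration of the syntax above, the following sketch assembles an sp_eventmon command line and prints it for review instead of running it. The helper name eventmon_cmd and the sample values in the usage comment (an sp_cop named 2100, a 300-second interval, a variable-data path of /splex/vardata) are assumptions for illustration only.

```shell
# Hypothetical helper: prints an sp_eventmon command line for review.
# It does not start sp_eventmon.
# Usage: eventmon_cmd SP_COP_NAME INTERVAL [VARDIR]
eventmon_cmd() {
    name=${1:-}
    interval=${2:-}
    vardir=${3:-}
    # -t takes a positive integer number of seconds.
    case $interval in
        ''|*[!0-9]*) interval=0 ;;
    esac
    if [ "$interval" -le 0 ]; then
        echo "interval must be a positive integer" >&2
        return 1
    fi
    cmd="nohup sp_eventmon -s '$name' -t $interval"
    # -p is optional; add it only when a variable-data directory is given.
    if [ -n "$vardir" ]; then
        cmd="$cmd -p $vardir"
    fi
    # -m enables e-mail notification; remove it to log to the file only.
    echo "$cmd -m > /dev/null &"
}

# Example use:
# eventmon_cmd 2100 300 /splex/vardata
```

Printing the command first makes it easy to confirm the quoting of the sp_cop name and the interval before the monitor is left running in the background.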
Monitor processes with sp_ps
The sp_ps monitoring utility monitors all SharePlex processes, including child processes, associated with a specified sp_cop instance. It scans the processes at regular intervals and reports abnormal conditions to one or more log files. It can monitor multiple installations of SharePlex on one or more systems, and it supports uni-directional and bi-directional (peer-to-peer) configurations.
Prepare to run sp_ps
Before running the script, perform the following tasks.
Satisfy requirements
See Requirements for using the monitoring scripts before using this script.
Note: The script must be run in the ksh shell.
Set the scan interval
The scan interval specifies how long the sp_ps program waits between checks. The default is 2,000 seconds. To specify a different scan interval, follow these steps.
- Open the sp_ps file in the app-modules directory of the SharePlex product directory.
- Set the interval= parameter to the required scan interval. Use any positive integer, for example:
interval=1500
Define e-mail addresses
To use the e-mail notification feature, define the e-mail address(es) in the script before running it.
- Open the script in the util directory of the SharePlex product directory.
- Add any number of address strings after the MailUserName= variable. Use the full e-mail and/or pager address. Separate multiple entries with a comma, as shown in the following example:
MailUserName=scott@company.com,12345678910@pageservice.com
Note: The e-mail/paging option is enabled by default for sp_ps, but confirm that it was not changed. In the script, MAILOPTION=TRUE enables e-mail notifications and MAILOPTION=FALSE disables them.
Run sp_ps
Run the script from the util sub-directory of the SharePlex product directory.
Syntax:
nohup sp_ps ['sp_cop -u name'] CONFIGURATION [> /dev/null] [&]
Table 10: Required arguments
nohup sp_ps | Directs the script to continue running in the background if the user logs out. This ensures continuous monitoring. The sp_ps component runs the script.
'sp_cop -u name' | Use this parameter if you are running more than one sp_cop process. Use it to specify each one of those processes that you want to monitor. This argument must reflect exactly the same name that was used when sp_cop was started with the -u option. It must be enclosed within single quote marks. Without the -u name option, sp_ps assumes you want to monitor the sp_cop that uses the default SharePlex port of 2100.
CONFIGURATION | Specifies the type of configuration of the SharePlex instance being monitored. This value must be entered in CAPITAL letters. Valid values:
- SOURCE: Use for uni-directional replication to monitor the Capture, Read and Export processes on the source system.
- TARGET: Use for uni-directional replication to monitor the Import and Post processes on the target system.
- MULTI-SOURCE: Use for peer-to-peer replication. It directs the script to monitor the Capture, Read, Export, Import and Post processes on each system.
Note: If replicating between source and target tables on the same system, there are no Export or Import processes.
> /dev/null | Redirects the notification output to the /dev/null device on the local system so that the monitoring process can continue to run in the background and generate output. To have the output appear on screen, omit this argument.
& | (Ampersand) Runs the script in the background.
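The relationship between the CONFIGURATION value and the processes that are monitored can be summarized in a short lookup. This is only a sketch based on the descriptions in the table above; the function name ps_config_processes is hypothetical and the code is not taken from sp_ps.

```shell
# Illustrative lookup of the processes each CONFIGURATION value covers.
# Not part of sp_ps; the mapping follows the argument descriptions above.
# Usage: ps_config_processes CONFIGURATION
ps_config_processes() {
    case ${1:-} in
        SOURCE)       echo "Capture Read Export" ;;
        TARGET)       echo "Import Post" ;;
        MULTI-SOURCE) echo "Capture Read Export Import Post" ;;
        *)
            # The value is case-sensitive, so lowercase input is rejected.
            echo "invalid CONFIGURATION: ${1:-} (use SOURCE, TARGET or MULTI-SOURCE)" >&2
            return 1
            ;;
    esac
}

# Example use before starting the monitor for the default sp_cop:
# ps_config_processes SOURCE && nohup sp_ps SOURCE > /dev/null &
```

A check like this catches a lowercase or misspelled value before the monitor is started in the background, where an error would be easy to miss.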
Monitor queues with sp_qstatmon
The sp_qstatmon script monitors the status of the capture and post queues for message backlogs. You can configure the script to alert you if the number of messages in a queue exceeds a defined threshold (limit), indicating that there is a potential data, system or network problem. This gives you time to correct the problem before the queues exceed their allocated space on the filesystem.
After each analysis of the queues, the sp_qstatmon script writes a notice to the capstat.log file for the capture queue or the poststat.log file for the post queue, and sends an e-mail message if that option is enabled.
Prepare to run sp_qstatmon
Before running the script, perform the following tasks.
Satisfy requirements
See Requirements for using the monitoring scripts before using this script.
Note: The script must be run in the ksh shell.
Assign permission to create temporary files
The script creates some temporary files in the util sub-directory of the SharePlex product directory. Grant the sp_qstatmon module write permission to that directory.
Define e-mail addresses
To run sp_qstatmon with e-mail notification, you must first define the e-mail address(es) in the script. Notification messages are sent to all addresses coded in the script. Unless e-mail notification is enabled, sp_qstatmon only logs errors to the log file.
You can specify as many addresses as you want using the following procedure:
- Open the sp_qstatmon script in any ASCII text editor. The script is in the .app-modules directory in the SharePlex installation directory.
- Add the address strings after the MailUserName= variable. Use the full e-mail and/or pager address. Separate multiple entries with a comma, as shown in the following example:
MailUserName=scott@company.com,12345678910@pageservice.com
- Save and close the file.
Run sp_qstatmon
Run the script from the util sub-directory of the SharePlex product directory, not from app-modules. When you run it from the util directory, you actually invoke a soft link to a utility that first sets up the correct environment and then runs the script itself.
Syntax:
nohup sp_qstatmon -v path -t n -p port_number [-c integer] [-d integer] [-m] [> /dev/null] &
Table 11: Required arguments
nohup sp_qstatmon | Directs the script to continue running in the background if the user logs out. This ensures continuous monitoring. The sp_qstatmon component runs the script.
-v path | Sets the path to the SharePlex variable-data directory for the instance of sp_cop that you want to monitor. Without this argument, sp_qstatmon fails and prints an error message requesting a valid path.
-t n | Sets the time interval between scans in seconds. This value can be any positive integer.
-p port_number | Sets the port number for the instance of sp_cop that you are monitoring. You can monitor different SharePlex instances by running sp_qstatmon for each one, using different values for this argument.
& | Runs the script in the background.
Table 12: Optional arguments
> /dev/null | Redirects the notification output to the /dev/null device on the local system so that the monitoring process continues to run in the background and generate output. To have the output appear on screen, omit this argument.
-c integer | Sets the number of messages in the capture queue at which the script issues a warning message. This value can be any positive integer. Without this parameter, sp_qstatmon defaults to 100 messages.
-d integer | Sets the number of messages in the post queue at which the script issues a warning message. This value can be any positive integer. Without this parameter, sp_qstatmon defaults to 100 messages.
-m | Enables the e-mail/paging option. Without this parameter, sp_qstatmon only logs errors to the log file.
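The threshold behavior of the -c and -d arguments can be pictured with a small comparison function. This is a sketch, not the sp_qstatmon implementation: the function name queue_alert and the wording of the warning are hypothetical, and it assumes that a warning is issued only when the message count is strictly greater than the threshold.

```shell
# Illustrative threshold test; not the sp_qstatmon implementation.
# Assumption: a warning is issued when the count is greater than the threshold.
# Usage: queue_alert QUEUE_NAME MESSAGE_COUNT [THRESHOLD]
queue_alert() {
    queue=$1
    count=$2
    # 100 mirrors the documented default for both -c and -d.
    limit=${3:-100}
    if [ "$count" -gt "$limit" ]; then
        echo "WARNING: $queue queue backlog $count exceeds threshold $limit"
    fi
}

# Example use:
# queue_alert capture 150      # warns, using the default threshold of 100
# queue_alert post 60 50       # warns, using an explicit threshold of 50
```

Choosing a threshold well below the point at which the queues would fill their filesystem leaves time to investigate the backlog after the first warning.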
Monitor with SNMP
Monitor Replication with SNMP
SharePlex provides agent support for Simple Network Management Protocol (SNMP) on all Unix and Linux platforms supported by SharePlex replication.
Note: SharePlex provides only agent support for SNMP. It only sends SNMP traps. SharePlex does not provide an SNMP signal daemon (SNMP manager) to intercept the traps. Use the SharePlex SNMP feature only if you have a Network Management Station (NMS) to manage SNMP signals. The SharePlex SNMP agent is named snmptrap and is installed with SharePlex in the bin sub-directory of the SharePlex product directory. Do not run this program.
Enable SNMP
To enable SNMP monitoring of SharePlex replication, set the SP_SLG_SNMP_ACTIVE parameter to 1. By default, the parameter is set to 0 (disabled).
Configure the SNMP agent
The following parameters configure the SNMP agent to communicate with the NMS. Each parameter must have a value if the SP_SLG_SNMP_ACTIVE parameter is enabled.
SP_SLG_SNMP_HOST | The name of the system (host) to which the traps will be sent
SP_SLG_SNMP_COMMUNITY | The community security string
SP_SLG_SNMP_MJR_ERRNUM | The major error number to be used by the traps
SP_SLG_SNMP_MNR_ERRNUM | The minor error number to be used by the traps
Custom MIB parameters
The following parameters specify required information for a custom MIB.
SP_SLG_SNMP_ENTERPRISE_OID | The enterprise object identifier to send with the trap. The default is 1.3.6.1.4.1.3.1.1.
SP_SLG_SNMP_TRAP_OID | A custom object identifier to bind to the trap. The default is 1.3.6.1.2.1.1.1.0.
SP_SLG_SNMP_TRAP_PROGRAM | The name of the trap program. The default is iwsnmptrap.
Configure the SNMP traps
The following parameters configure the SNMP agent to send traps for specific replication events. The message or error text for the event is included in the trap and is the same error that appears in the Event Log.
To enable an SNMP trap for an event, set the corresponding parameter to a value of 1. By default all traps are disabled (parameter value of 0).
SP_SLG_SNMP_INT_ERROR | SharePlex logic errors and errors that cause processes to exit
SP_SLG_SNMP_SYS_ERROR | System-related errors encountered by SharePlex
SP_SLG_SNMP_ERROR | Other SharePlex errors
SP_SLG_SNMP_OUT_OF_SYNC | Replication is out of synchronization
SP_SLG_SNMP_STARTUP | SharePlex starts up
SP_SLG_SNMP_SHUTDOWN | SharePlex shuts down
SP_SLG_SNMP_LAUNCH | A SharePlex process starts
SP_SLG_SNMP_EXIT | A SharePlex process stops
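The parameters in this section can be collected into one list of commands for review before they are applied. The following sketch only prints text and does not run sp_ctrl. It assumes that parameters are changed with the sp_ctrl set param command (check the SharePlex Reference Guide for the exact syntax in your version); the function name snmp_param_cmds and the sample host, community string, and error numbers are hypothetical.

```shell
# Prints candidate sp_ctrl commands for review; it does not run sp_ctrl.
# Assumption: parameters are set with the sp_ctrl "set param" command.
# Usage: snmp_param_cmds HOST COMMUNITY MAJOR_ERRNUM MINOR_ERRNUM EVENT...
snmp_param_cmds() {
    host=$1
    community=$2
    major=$3
    minor=$4
    shift 4
    # Enable SNMP and supply the four values the agent requires.
    echo "set param SP_SLG_SNMP_ACTIVE 1"
    echo "set param SP_SLG_SNMP_HOST $host"
    echo "set param SP_SLG_SNMP_COMMUNITY $community"
    echo "set param SP_SLG_SNMP_MJR_ERRNUM $major"
    echo "set param SP_SLG_SNMP_MNR_ERRNUM $minor"
    # Each remaining argument is an event suffix such as OUT_OF_SYNC or EXIT.
    for event in "$@"; do
        echo "set param SP_SLG_SNMP_$event 1"
    done
}

# Example use (all values are placeholders):
# snmp_param_cmds nms01 public 1 2 OUT_OF_SYNC EXIT
```

Listing the commands in one place makes it easier to confirm that every required agent parameter has a value before SP_SLG_SNMP_ACTIVE is enabled.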