
InTrust 11.4.1 - Contingency Planning Guide

Contingency Planning Overview

In any InTrust organization, some components are critical to InTrust operation. If these components are damaged in a disaster, the whole system fails and valuable data is irrevocably lost. It is therefore strongly recommended that you back up the following:

  • InTrust Servers
  • Configuration Database
  • Repository

Generally, an Audit database failure is less critical than the failure of other components, because the typical workflow presumes that data is collected to the repository. A repository backup helps you restore your Audit database: after you recover the repository, you can import the necessary events into the database. However, it is recommended that you periodically back up your Audit database and other InTrust components, as described in this guide.

Backup Procedures for InTrust

To minimize the risk of irrevocable data loss, it is strongly recommended that you perform backup procedures for your InTrust components, as follows:

  • InTrust Servers: either weekly, or after new agents are added.
  • Configuration database: always after any configuration changes; periodically to take into account newly installed agents (daily backup recommended). Alternatively, set up configuration database replication, as described in Replication of the InTrust Configuration Database. This lets you ensure InTrust configuration consistency across the enterprise and increase your InTrust organization's fault tolerance.
  • Repository: after each gathering process, i.e. depending on gathering process schedule; at least daily backup recommended.
  • Audit database: depending on gathering process schedule (frequency), daily backup recommended.
  • Alert database: daily backup recommended.
  • InTrust agents: twice a week recommended.
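
The time-based part of this schedule can be checked mechanically. The following is a minimal Python sketch, not an InTrust feature: the function name and the data layout are hypothetical, "twice a week" is approximated as every 3.5 days, and the event-driven triggers (after configuration changes, after new agents are added, after each gathering process) are not modeled.

```python
from datetime import datetime, timedelta

# Maximum recommended backup age per component, taken from the list above.
# "Twice a week" is approximated as every 3.5 days (an assumption).
RECOMMENDED_INTERVALS = {
    "InTrust Servers": timedelta(weeks=1),
    "Configuration database": timedelta(days=1),
    "Repository": timedelta(days=1),
    "Audit database": timedelta(days=1),
    "Alert database": timedelta(days=1),
    "InTrust agents": timedelta(days=3.5),
}

def overdue_backups(last_backup, now):
    """Return the components whose latest backup is older than recommended.

    last_backup maps a component name to the datetime of its latest backup.
    A component missing from the mapping is treated as never backed up.
    """
    overdue = []
    for component, interval in RECOMMENDED_INTERVALS.items():
        taken = last_backup.get(component)
        if taken is None or now - taken > interval:
            overdue.append(component)
    return overdue
```

Calling `overdue_backups` with the timestamps of your latest backups and the current time returns the components that need attention, in the order listed above.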

InTrust provides InTrust Server failover capabilities, which allow for automatic operation switching. It is recommended to activate this feature, as follows:

  1. Configure two InTrust Servers in your InTrust organization:
    • A production InTrust Server that performs gathering and real-time monitoring
    • A standby InTrust Server that will take over the operation if a production Server goes down.
  2. Create an InTrust site containing the standby InTrust Server, and specify this server's name when prompted for the InTrust Server responsible for processing the site.
  3. To monitor the state of the production InTrust Server, enable the “InTrust server is down” monitoring rule (located in the InTrust Internal Events | InTrust server failover rule group) on the standby server, and activate the rule's response action (failover script execution).
  4. When configuring this rule, choose to perform matching on the server side. You can also specify:
    • Which InTrust Servers to monitor
    • How long to wait for a response from a monitored server before it is considered to be down
  5. Create and activate a monitoring policy involving this rule and the InTrust site created in step 2.
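
The two rule settings in step 4 (which servers to monitor and how long to wait for a response) amount to a timeout check. The Python sketch below only illustrates that idea; it is not how InTrust implements the rule, and the function name, server names, and data layout are hypothetical.

```python
from datetime import datetime, timedelta

def servers_considered_down(last_response, now, timeout):
    """Return the monitored servers that have not responded within `timeout`.

    last_response maps a monitored server name to the datetime of its most
    recent response; only servers present in the mapping are monitored.
    """
    return sorted(
        name for name, seen in last_response.items()
        if now - seen > timeout
    )
```

A longer timeout reduces false alarms from short network interruptions, at the cost of a later failover.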

If the production InTrust Server fails, the standby InTrust Server takes over the sites and tasks that the failed server was processing.

Caution: To ensure the availability and integrity of InTrust databases and repositories, it is recommended to locate them separately from the InTrust Servers. This will help minimize the risk of their failure if any of the InTrust servers go down.

If you plan to install agents manually (for example, because your organization's policies do not allow automatic agent installation), establish agent-server communication for both the production and standby InTrust servers when you install and configure the agents. This allows agents to connect to the standby server if a failover occurs. For details, refer to Installing Agents Manually.

IMPORTANT: InTrust server recovery may be incomplete if the failed server was configured to receive forwarded Syslog messages. In this case, a failover operation can cause your Syslog collections to reference the wrong Syslog-receiving InTrust servers. If this happens, open the properties of the affected Syslog collections in InTrust Deployment Manager and select the right Syslog-receiving servers for them.

How to Recover Your InTrust

The following topics describe the problems that a disaster may cause, how to solve them if you have properly backed up your data, and what to do if you have not.

InTrust Server Recovery

InTrust Server and Its Temporary Files Corrupted due to Disk Failure

Backup copy: Disk backup available
Solution: Restore InTrust Server and temporary files to the location where they resided.

Backup copy: No backup
Solution: Use the InTrust failover capability to switch to another InTrust server in your organization. To do so, enable the “InTrust server is down” real-time monitoring rule (from the “InTrust server failover” rule group) on the standby InTrust server so that it monitors the status of the current InTrust server:

  1. On the General tab of the rule’s Properties dialog, make sure the rule is enabled.
  2. On the Response Actions tab, make sure the Failover script execution is selected. Save the settings, and commit the changes.

If a failure occurs, you get a notification, and the standby server takes over the sites and tasks that the failed InTrust server was processing.

You can also perform a failover manually by launching the Server Switching Wizard:

  1. In InTrust Manager, select Configuration | InTrust Servers, and from your current InTrust server’s shortcut menu, select Failover | Switch to start the wizard.
  2. Select the InTrust sites and jobs to be switched.
  3. Specify the InTrust server that will take over the operations.
  4. Finish the wizard and commit the changes.

After restoring the InTrust server, you can roll back this switching session (switch sites and jobs back to the server initially responsible for their processing):

  1. Start the Rollback Wizard by selecting Failover | Roll Back from the restored server’s shortcut menu, and select the session to roll back.
  2. Commit the changes after finishing the wizard.

Notes and Caveats

  • If you are using role-based administration in your InTrust deployment, note that to run the Server Switching Wizard, a user must have the Modify permission for the switched sites and jobs (their nodes in InTrust Manager) and for the InTrust Server node (the one you are switching from).
  • By default, passwords for the agent-server connection expire three days after they are set. Thus, if you make a daily backup of the InTrust program folder and restore it on the new server within three days, the agents should still be able to connect to the server. The agent password expiration policy can be adjusted in the configuration database.
  • If the InTrust Server that went down was hosting any data stores used by jobs that were running at that moment, those jobs fail, and you have to re-create them. For example, if a gathering job was using an Audit database located on the failed server, the job has to be re-created.
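
The password-expiration note above comes down to simple date arithmetic. The following is a minimal Python sketch, assuming the three-day window is measured from the moment the backup was taken (the note does not state the exact starting point); the function name is hypothetical.

```python
from datetime import datetime, timedelta

# Default agent password lifetime stated in the note above.
DEFAULT_PASSWORD_LIFETIME = timedelta(days=3)

def restore_within_password_window(backup_taken_at, restored_at,
                                   lifetime=DEFAULT_PASSWORD_LIFETIME):
    """Return True if the restore happens before the agent passwords
    captured in the backup are assumed to have expired."""
    if restored_at < backup_taken_at:
        raise ValueError("restore time precedes backup time")
    return restored_at - backup_taken_at < lifetime
```

If you have changed the expiration policy in the configuration database, pass the adjusted value as `lifetime`.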

System Disk Failure on InTrust Server Computer

Backup copy: Disk backup available
Solution: Restore files from backup.

Backup copy: No backup
Solution: Use InTrust failover capabilities, as described above.

InTrust Server IP Address Changed

Details: After the server is restarted, connection with the agents is lost.

Agents: No agents installed on the computers over the firewall.
Solution:
  • If an agent was installed automatically, it is recovered, and the agent-server connection is re-established automatically after the heartbeat interval or when the gathering process starts.
  • If an agent was installed manually, and the agent-server connection was also established manually, the connection is re-established automatically after the heartbeat interval or after the gathering process starts (it is assumed that gathering is performed using agents). However, make sure that the account under which the InTrust server runs can access the target computers. Otherwise, you need to establish the agent-server connection manually. For details, see Installing Agents Manually.

Agents: Several agents installed on the computers over the firewall.
Solution: The agent-server connection for these agents must be established manually. For more details, see Installing Agents Manually.

Caution: After recovery, an agent tries to connect to the InTrust server by the name (NetBIOS name, FQDN, or IP address) that was provided to this agent during installation (that is, when the server was registered on the agent).

If you have specified the FQDN (recommended), then the agent will search for the InTrust server using this name, and connect to the server automatically.

However, if the server's IP address was specified (for example, because of a DMZ or DNS problems) and that address later changed, you should re-register the server on the agent, as described in the Establishing a Connection with the Server topic in Installing Agents Manually.
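
To find agents affected by a server IP address change, you can check whether the server name recorded for each agent is an IP address literal rather than a host name. The following is a minimal Python sketch using the standard `ipaddress` module. The function name is hypothetical, and how you obtain the registered name for each agent is outside its scope.

```python
import ipaddress

def registered_by_ip_address(registered_name):
    """Return True if the server was registered on the agent by IP address
    (IPv4 or IPv6 literal) rather than by NetBIOS name or FQDN."""
    try:
        ipaddress.ip_address(registered_name)
    except ValueError:
        return False
    return True
```

Agents for which this returns True are the ones that need the server re-registered after its address changes.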
