Adjusting the Task-Based Gathering Workflow for Repository Indexing
Getting the most out of InTrust repository indexing requires that you organize your gathering workflow in a particular way. To follow the best-practice recommendation of using a short-term repository and an archive repository, take the following steps:
- If you already have a production repository with data, make an archive repository based on it, as described in the "To convert an existing repository to a short-term repository" procedure below. If you only have an empty default repository (in a fresh InTrust deployment), simply create an additional repository for archival purposes.
- Prepare your production repository to be used as the short-term repository, as described in the "To convert an existing repository to a short-term repository" procedure below.
To convert an existing repository to a short-term repository
- In InTrust Manager, open the properties of the repository and note its path. You will need this path later.
- Change the path. The new path will be used for short-term storage.
- Create a new repository in InTrust. In the New Repository wizard, specify the old repository location that you noted in the first step.
The resulting repository is ready to be used for audit data archival, and the original repository is ready for regular data extraction and cleanup.
Note: In InTrust Manager, access to repositories is based on IDs, not names or paths. This means it does not lose track of a repository no matter how you rename or relocate it. However, if you delete a repository and recreate it with the old name and path, it will not be recognized as the old repository. Therefore:
- To repurpose a repository without disrupting the workflows associated with it, rename it.
- To relocate a repository without disrupting the workflows associated with it, change its path.
- If you don't mind losing associations with any current workflows, create a new repository. You can use the path from a previously deleted repository to populate it.
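The ID-based behavior described in the note can be pictured with a small toy model. This is an illustration only, not InTrust code: the Repository and Registry classes, their fields, and the paths are hypothetical, and a random UUID merely stands in for whatever ID InTrust uses internally.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class Repository:
    name: str
    path: str
    # Hypothetical stand-in for the internal repository ID.
    repo_id: str = field(default_factory=lambda: str(uuid.uuid4()))


class Registry:
    """Toy registry in which workflows refer to repositories by ID only."""

    def __init__(self):
        self.repos = {}      # repo_id -> Repository
        self.workflows = {}  # workflow name -> repo_id

    def create(self, name, path):
        repo = Repository(name, path)
        self.repos[repo.repo_id] = repo
        return repo

    def delete(self, repo):
        del self.repos[repo.repo_id]

    def attach(self, workflow, repo):
        self.workflows[workflow] = repo.repo_id

    def resolve(self, workflow):
        # Returns None when the referenced ID no longer exists.
        return self.repos.get(self.workflows[workflow])


registry = Registry()
prod = registry.create("Production", r"\\server\share\repo")
registry.attach("nightly gathering", prod)

# Renaming or relocating keeps the ID, so the workflow still resolves.
prod.name = "Short-term"
prod.path = r"\\server\share\short_term"
still_linked = registry.resolve("nightly gathering") is prod

# Deleting and recreating with the old name and path yields a new ID.
registry.delete(prod)
recreated = registry.create("Short-term", r"\\server\share\short_term")
broken = registry.resolve("nightly gathering") is None
```

In this model, still_linked and broken both end up True: the rename and relocation are invisible to the workflow, while the recreated repository is a different object as far as the workflow is concerned.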
To organize archival of repository data
- Decide on the retention period for the short-term repository (for example, 90 days). Base the decision on how far back the events you usually view go.
- In InTrust Manager, create a task with the following jobs:
  - A repository cleanup job that clears from the short-term repository all data older than your preferred retention period
  - A consolidation job that copies the entire Windows network-related contents of the short-term repository to the archive repository; make this job the successor of the repository cleanup job
  - If applicable, a consolidation job that copies the entire Unix network-related contents of the short-term repository to the archive repository; make this job the successor of the repository cleanup job
- Schedule the task to run at intervals equal to your retention period.
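To see why the schedule interval is tied to the retention period, the task can be walked through with a toy simulation. The sketch below is a simplified model under stated assumptions, not a description of InTrust internals: it assumes one event per day, a cleanup job that removes events strictly older than the retention period, and a consolidation job that runs right after cleanup and copies everything that remains. The simulate function and its parameters are hypothetical.

```python
def simulate(retention_days, total_days):
    """Toy model: one event per day, task runs every `retention_days` days."""
    short_term, archive, runs = set(), set(), []
    for day in range(total_days + 1):
        short_term.add(day)  # the event gathered on this day
        if day > 0 and day % retention_days == 0:
            # Cleanup job: drop events older than the retention period.
            expired = {d for d in short_term if day - d > retention_days}
            # In this model, everything that expires was copied by an earlier run.
            assert expired <= archive
            short_term -= expired
            # Consolidation job (successor of cleanup): copy what remains.
            archive |= short_term
            runs.append((day, min(short_term), max(short_term), len(archive)))
    return short_term, archive, runs


short_term, archive, runs = simulate(retention_days=90, total_days=270)
for day, oldest, newest, archived in runs:
    print(f"day {day}: short-term repository holds days {oldest}-{newest}; "
          f"archive holds {archived} events")
```

Under these assumptions, each run removes only data that a previous run has already copied, and the short-term repository never holds less than one retention period of data.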
Handling the Archive Repository
The archive repository has a lower priority than the short-term repository. It is up to you whether you want the archive repository to be indexed. If you need it indexed, you can speed up the process by setting up dedicated indexing, as described in the Dedicated Indexing topic.
Remember to provide ample hard disk space, because it is going to be consumed by the gradual repository growth and, if indexing is enabled, by the index.
Tests have shown that using a dedicated index-processing computer makes indexing take roughly half as long, compared to using an InTrust server that is also involved in other activity. It also helps avoid load spikes on the production InTrust server at certain stages of the indexing process.
To decide whether dedicated indexing is really needed, answer the following questions:
- Can your existing InTrust servers handle the indexing management, or do you need one or more dedicated computers just for the indexing?
- What other work will the InTrust servers do besides indexing?
The Estimating the Resources Required for Indexing topic can help you find the answers.
If you choose dedicated indexing, ensure the following:
- An InTrust agent is deployed on the computer or computers you want to use for indexing.
Caution: If the index-managing InTrust server and the indexing computer are separate machines and the indexing computer is also an InTrust server, make sure the indexing computer does not list the index-managing server as an agent. Otherwise, all activity where the two servers involve each other (gathering, indexing and so on) will fail due to circular server–agent dependencies.
To see the list of an InTrust server's agents, select the Quest InTrust Manager | Configuration | InTrust Servers | <server_name> | Agents node.
- The following ports are open for incoming traffic on the indexing computers:
  - RPC ports: 135, 445
  - Dynamic ports: 1024–65535
- The index-managing InTrust server and the indexing computers are accessible to each other by DNS name.
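Some of these prerequisites can be spot-checked from a script before you change any settings. The following is a minimal sketch using only the Python standard library; the function names are hypothetical, and the script is not part of InTrust. It checks DNS resolution and TCP reachability of the two RPC ports from the computer where it runs. It does not probe the dynamic port range and cannot confirm agent deployment.

```python
import socket

RPC_PORTS = (135, 445)  # RPC ports named in the list above


def resolves(dns_name):
    """Return True if the DNS name resolves to at least one address."""
    try:
        return bool(socket.getaddrinfo(dns_name, None))
    except socket.gaierror:
        return False


def port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def check_indexing_computer(dns_name, ports=RPC_PORTS):
    """Collect the results of the DNS and TCP port checks for one computer."""
    report = {"dns": resolves(dns_name)}
    for port in ports:
        # Skip the connection attempt when the name does not resolve.
        report[port] = report["dns"] and port_open(dns_name, port)
    return report
```

For example, calling check_indexing_computer with the DNS name of an indexing computer, from the index-managing server, returns a dictionary with one entry for DNS resolution and one per port. Because the computers must reach each other, the DNS check is worth repeating in the opposite direction as well.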
Enabling Dedicated Indexing
After you have planned dedicated indexing and prepared the computers, perform the configuration in InTrust Manager.
First, create a dedicated site with a meaningful name such as “Indexing helpers” and include your dedicated indexing workers in the site.
Next, configure indexing for the repository:
- Open the repository properties, and select the Indexing tab.
- Specify the index-managing InTrust server.
- Select the On agents from this site option.
- Specify the site you have created.
- Specify the account to use for indexing activity. You can use the InTrust server account, which has all the necessary privileges. If you would prefer a less powerful account for indexing, make sure it has the permissions listed for "Perform indexing of a production repository" in Minimal Rights and Permissions Required for InTrust Operations.
Indexing Idle Repositories Without InTrust
The workflow described above does not apply to non-indexed idle repositories that are not managed by any InTrust server. To create an index for such a repository, use a standalone instance of the IndexingTool.exe command-line utility.
This is the utility that InTrust servers and agents use automatically to perform indexing activity, but it can also be installed separately using the INDEXING_TOOL.*.*.*.msi package or as part of the Repository Viewer setup using IT_RV.*.*.*.msi.
The syntax for indexing of idle repositories is as follows:
- Create an index:
indexingtool.exe -create -local <index_path> <rep_path> [-threads [how_many]]
- Synchronize the index:
indexingtool.exe -sync -local <index_path> <rep_path> [-threads [how_many]]
- Delete the index:
indexingtool.exe -delete -local <index_path> <rep_path> [-threads [how_many]]
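If you script these operations, the three command lines can be assembled from one helper. The sketch below is a hypothetical Python wrapper, not something shipped with InTrust. It uses only the options shown in the syntax above and assumes that indexingtool.exe is on the PATH. The run function additionally assumes that the utility returns a non-zero exit code on failure, which the syntax description does not state.

```python
import subprocess

ACTIONS = ("create", "sync", "delete")


def build_command(action, index_path, rep_path, threads=None,
                  exe="indexingtool.exe"):
    """Assemble the argument list for one standalone indexing operation."""
    if action not in ACTIONS:
        raise ValueError(f"unknown action: {action!r}")
    args = [exe, f"-{action}", "-local", index_path, rep_path]
    if threads is not None:
        # Append -threads only when a value was requested.
        args += ["-threads", str(threads)]
    return args


def run(action, index_path, rep_path, threads=None):
    """Run the utility; assumes a non-zero exit code signals failure."""
    return subprocess.run(build_command(action, index_path, rep_path, threads),
                          check=True)
```

For instance, build_command("sync", index_path, rep_path, threads=4) yields the argument list for synchronizing an index with four threads. Passing the arguments as a list avoids quoting problems with paths that contain spaces.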
The -threads option sets how many threads indexing will use. If this option is omitted, the number of threads will be the same as the number of CPU cores. The following values can be used:
Use as many threads as there are CPU cores.
- -threads -1
Use one fewer thread than there are CPU cores, but no fewer than one thread.
- -threads N
Use N threads.
Values less than -1 will cause errors. The default value is -1 (one fewer thread than there are CPU cores).
- When the IndexingTool.exe utility is used in standalone mode, a repository can be processed by only one instance of the utility at a time.
- You cannot open an idle repository in Repository Viewer while it is being indexed.
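The thread-count rules for the -threads option can be summarized in a few lines of code. The sketch below is an interpretation of the description above, not the utility's actual logic. The effective_threads function is hypothetical. It takes the statement about omitting the option at face value, does not model the separately stated default value of -1, and rejects 0 because the description above does not spell out that value.

```python
import os


def effective_threads(value=None, cpu_cores=None):
    """Map a -threads setting to a thread count, following the text above.

    value=None models omitting the -threads option entirely.
    """
    cores = cpu_cores if cpu_cores is not None else (os.cpu_count() or 1)
    if value is None:
        return cores                 # option omitted: one thread per core
    if value < -1:
        raise ValueError("values less than -1 cause errors")
    if value == -1:
        return max(1, cores - 1)     # one fewer than cores, but at least 1
    if value == 0:
        raise ValueError("0 is not covered by this sketch")
    return value                     # -threads N: exactly N threads
```

On an eight-core computer, this model gives 8 threads when the option is omitted, 7 for -threads -1, and 4 for -threads 4. On a single-core computer, -threads -1 still yields one thread.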
Tracking Indexing Progress
Indexing progress is recorded to the InTrust Server event log. For a list of specific events, see Events from InTrust Repository Services.
To track the progress, gather the InTrust Server log and view the events in Repository Viewer.