Overview
Automatic Migration Restart is a process that identifies any abnormal termination of the Notes Migrator migration engine (commonly referred to as a “crash”) and automatically re-queues the migration job. When this occurs, the migration is restarted and resumes on the next document in the message store, until either the migration finishes processing the message store or the maximum number of retries has been exceeded.
A new migration status, “Migration terminated abnormally,” indicates that the migration engine has crashed while processing a job. This status arises when the process handle for the migration engine (CMTProxy.exe) is no longer available to the migration worker application.
Automatic Migration Restart works in conjunction with the existing migration recovery features in Notes Migrator to resume a migration on the next document immediately after the document which caused the crash.
Terminology
The following table lists and describes the terms commonly used when describing the automatic restart process.
Term Description
CMTe - Notes Migrator application environment, encompassing the server instance.
Job - A migration event. Within the context of CMTe, each migration job is a unique event.
Migration server - Refers to the system running the Notes Migrator web services. This is the primary instance where the web server and Monitor are installed.
Notes Migrator Database (or Notes Migrator.nsf) - Used for configuring settings, managing user records, and queuing users to the migration server.
Migration worker - An instance of the CMT_MigrationWorker.exe application running on a migration workstation. This application runs continuously on the migration workstation, polls the migration server for pending migration jobs, and initializes an instance of the migration engine (CMTProxy.exe) for each migration job.
Migration engine (or CMTProxy) - An instance of CMTProxy.exe, which is initialized and managed by the Migration worker. CMTProxy.exe performs the actual data migration.
Crash - Abnormal termination of the migration engine, resulting in a status of “Migration terminated abnormally.”
Configuring Automatic Restart
Automatic restart is controlled by a parameter in the Web.config file for the migration server. This file is located at [installDirectory]\CMT for Exchange\CMT_XMLServer\Web.config, for example “C:\Program Files (x86)\BinaryTree\CMT for Exchange\CMT_XMLServer\web.config”.
The Web.config file contains the startup parameters for the Migration Server. This file contains an \appSettings\MaximumRetryCount XML node, which sets the maximum number of times a migration will be automatically restarted.
The MaximumRetryCount is set to 10 by default. A value of 0 or a negative number will disable the Automatic Restart functionality.
If the MaximumRetryCount is changed or disabled, the Migration Server instance must be restarted for the change to take effect. This can be done by opening a command prompt on the migration server and typing “iisreset” at the prompt, or manually restarting the IIS service.
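As a rough illustration of how this setting is interpreted, the following Python sketch reads MaximumRetryCount from a sample configuration fragment and applies the rule that 0 or a negative value disables Automatic Restart. The <add key="..." value="..."/> layout is the common ASP.NET appSettings form and is an assumption here, as are the function names and the fallback to the default when the key is missing; check the actual Web.config on the migration server before relying on it.

```python
import xml.etree.ElementTree as ET

# Hypothetical Web.config fragment. The <add key=... value=.../> layout is an
# assumption (the usual ASP.NET appSettings form), not copied from the product.
SAMPLE_WEB_CONFIG = """
<configuration>
  <appSettings>
    <add key="MaximumRetryCount" value="10" />
  </appSettings>
</configuration>
"""

DEFAULT_MAX_RETRIES = 10  # default value stated in this section


def read_max_retry_count(config_xml: str) -> int:
    """Return MaximumRetryCount from the XML text, or the default if absent."""
    root = ET.fromstring(config_xml.strip())
    for node in root.findall("./appSettings/add"):
        if node.get("key") == "MaximumRetryCount":
            return int(node.get("value"))
    return DEFAULT_MAX_RETRIES  # assumption: a missing key means the default


def automatic_restart_enabled(max_retry_count: int) -> bool:
    """A value of 0 or a negative number disables Automatic Restart."""
    return max_retry_count > 0


max_retries = read_max_retry_count(SAMPLE_WEB_CONFIG)
restart_enabled = automatic_restart_enabled(max_retries)
```

This only inspects the value. Changing it still requires editing Web.config and restarting IIS as described above.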
Migration Worker basics
The Migration Worker is an application that runs in the Windows taskbar. This application is responsible for queuing, migrating, and updating the status of a migration job on the migration server.
The Migration Worker has three basic states:
Offline – the migration worker is not started.
Online – the migration worker is running and actively checking for migration jobs.
Migrating – a migration job is in progress.
When the Migration Worker application is online, the application periodically checks the Migration Server for pending migration jobs. If a job is available, the migration server passes back the job number from the migration queue, along with settings and user information required to start the migration. The worker then starts the migration by calling CMTProxy.exe as a separate process with the appropriate parameters, which begins the actual data migration. Once the migration starts, the migration worker application sends notification back to the migration server that the workstation is in Migrating status. During the course of the migration, the migration worker periodically communicates throughput and migration time statistics back to the migration server.
When the migration has completed, the Migration Worker updates the Notes Migrator server and then uploads the history of migrated messages and relevant log files.
The key concept is that the Migration Worker is responsible for checking the Migration Server for work, triggering a migration job when required, updating the Migration Server during the migration, and finally uploading log files once the job has been completed.
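The worker cycle described above can be modeled with a small Python sketch. This is a toy model, not the CMT_MigrationWorker.exe implementation: the class name, the fetch_job and run_engine callables, and the job dictionary layout are all hypothetical stand-ins for the server poll and the CMTProxy.exe launch.

```python
from enum import Enum
from typing import Callable, Optional


class WorkerState(Enum):
    OFFLINE = "Offline"
    ONLINE = "Online"
    MIGRATING = "Migrating"


class MigrationWorkerSketch:
    """Toy model of the worker cycle; all names here are hypothetical."""

    def __init__(self,
                 fetch_job: Callable[[], Optional[dict]],
                 run_engine: Callable[[dict], str]) -> None:
        self.fetch_job = fetch_job    # stands in for polling the migration server
        self.run_engine = run_engine  # stands in for launching CMTProxy.exe
        self.state = WorkerState.OFFLINE
        self.reports = []             # (job_id, status) pairs sent to the server

    def start(self) -> None:
        self.state = WorkerState.ONLINE

    def poll_once(self) -> Optional[str]:
        """One cycle: check for work, migrate, report, return to Online."""
        if self.state is not WorkerState.ONLINE:
            return None
        job = self.fetch_job()
        if job is None:
            return None               # nothing pending; stay Online
        self.state = WorkerState.MIGRATING
        self.reports.append((job["job_id"], "Migrating"))
        final_status = self.run_engine(job)
        self.reports.append((job["job_id"], final_status))
        self.state = WorkerState.ONLINE
        return final_status


pending = [{"job_id": 7, "user": "jdoe"}]
worker = MigrationWorkerSketch(
    fetch_job=lambda: pending.pop(0) if pending else None,
    run_engine=lambda job: "Migration completed successfully",
)
worker.start()
first_result = worker.poll_once()   # picks up job 7
second_result = worker.poll_once()  # queue is empty
```

The sketch leaves out the periodic throughput updates and the log upload; those would sit between the two report calls and after the final one.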
Recovery basics
During a migration, the migration engine reads and catalogs the unique identifier of each message it encounters before the message is processed. This identifier is stored in memory to prevent migrating a document multiple times, and is also immediately written to disk in a “ProcessedNoteID-username.txt” file, where username is the unique key of the user on the migration server.
In the event of a crash, the ProcessedNoteID file on disk contains the list of unique identifiers of each document processed during the failed migration. This file is read by the Migration Worker application, merged with any existing migrated message information residing on the server for the user, and stored back on the migration server prior to the next migration. This ensures that the migrated message table on the Migration Server is up to date when the migration is restarted.
When the migration engine is restarted, the migrated message table is read and all previously processed messages (including the ID of the message that was being processed when the migration crashed) are skipped in subsequent runs. This allows Notes Migrator to resume where it left off, and prevents re-migrating documents repeatedly on subsequent runs.
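A minimal Python sketch of this record, merge, and skip sequence follows. It assumes the ProcessedNoteID file holds one identifier per line, which the source does not specify. The function names and sample identifiers are hypothetical.

```python
import os
import tempfile


def record_processed(path: str, note_id: str) -> None:
    """Append an identifier and force it to disk before the document is processed."""
    with open(path, "a", encoding="utf-8") as handle:
        handle.write(note_id + "\n")
        handle.flush()
        os.fsync(handle.fileno())


def load_processed(path: str) -> set:
    """Read the identifiers recorded during the failed run (assumed one per line)."""
    if not os.path.exists(path):
        return set()
    with open(path, encoding="utf-8") as handle:
        return {line.strip() for line in handle if line.strip()}


def merge_after_crash(server_ids: set, path: str) -> set:
    """Combine the server's migrated-message IDs with those in the file."""
    return set(server_ids) | load_processed(path)


def documents_to_migrate(all_ids: list, migrated: set) -> list:
    """Skip everything already processed, including the document that crashed."""
    return [note_id for note_id in all_ids if note_id not in migrated]


with tempfile.TemporaryDirectory() as temp_dir:
    id_file = os.path.join(temp_dir, "ProcessedNoteID-jdoe.txt")
    for note_id in ["A1", "B2", "C3"]:  # suppose the engine crashes while on C3
        record_processed(id_file, note_id)
    merged = merge_after_crash({"Z9"}, id_file)
    remaining = documents_to_migrate(["A1", "B2", "C3", "D4", "E5"], merged)
```

Because the identifier is written before the document is processed, the document being handled at the time of the crash is already in the file and is skipped on the next run.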
Detecting abnormal termination of a migration
The Migration Worker application is a key component in detecting when a migration crashes. When the Migration Worker application starts a migration job, it launches the migration by calling CMTProxy.exe in a separate process. While the migration is running, the Migration Worker application periodically sends information back to the migration server indicating migration time and throughput for the current migration.
If the migration engine (CMTProxy.exe) crashes, the handle to the migration engine becomes invalid, and the Migration Worker application knows that the currently running migration has crashed. At this point, the Migration Worker reads the ProcessedNoteID file from the temporary directory, merges it with any existing information in the Migrated Message table on the Migration Server, and updates the status of the job to “Migration terminated abnormally.”
The Migration Worker does not re-queue the job itself; it only reports the “Migration terminated abnormally” status back to the Migration Server. The migration server is responsible for determining whether the job will be re-queued.
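The general pattern of a parent process watching a child process can be sketched in Python as follows. This is an analogy under stated assumptions, not the worker's actual mechanism: the real worker watches the CMTProxy.exe process handle, while this sketch treats any non-zero exit code as abnormal termination. The run_engine function and the short Python child commands are hypothetical.

```python
import subprocess
import sys

ABNORMAL = "Migration terminated abnormally"
COMPLETED = "Migration completed successfully"


def run_engine(command: list, poll_seconds: float = 0.1) -> str:
    """Launch the engine as a separate process and classify how it ended."""
    process = subprocess.Popen(command)
    while True:
        try:
            exit_code = process.wait(timeout=poll_seconds)
            break
        except subprocess.TimeoutExpired:
            # Engine still running. A real worker would send throughput and
            # migration-time statistics to the migration server here.
            continue
    # Assumption: a non-zero exit code stands in for a lost process handle.
    return COMPLETED if exit_code == 0 else ABNORMAL


# Stand-in child processes; a real worker would launch CMTProxy.exe instead.
clean_status = run_engine([sys.executable, "-c", "raise SystemExit(0)"])
crash_status = run_engine([sys.executable, "-c", "raise SystemExit(3)"])
```

Note that the function only reports a status. As described above, the decision to re-queue belongs to the migration server, not the worker.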
Automatic re-queue of a migration job
The sections above describe how the Migration Worker application identifies a crash in the migration engine and returns the “Migration terminated abnormally” status. This section describes how the migration server re-queues the user and the resulting migration status values.
If the migration worker application returns a value of “Migration terminated abnormally” to the migration server, the migration server determines whether the RetryCount exceeds the MaximumRetryCount configured for the server. If the maximum has not been exceeded, the job is re-queued; otherwise, the job is not restarted and its last status remains “Migration terminated abnormally.”
If a migration has been automatically restarted by this process, the Notes Migrator.nsf database will only show the status of the last migration that was restarted. Consider the situation where a migration crashed, was automatically restarted, and continued to completion without further errors. The migration status in the Notes Migrator Database will report “Migration completed successfully” even though there was a crash on the initial run. The migration history will indicate that the first migration ended with a status of “Migration terminated abnormally,” but this may not be evident in the Notes Migrator.nsf Lotus Notes application interface. Advanced or customized installations of the Lotus Notes application may retrieve and summarize the complete migration history for a user, but a complete migration history is not included in the default Notes Migrator.nsf application interface.
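The server-side decision can be sketched in Python as follows. The QueuedJob class, the handle_abnormal_termination function, and the "Pending" initial status are hypothetical, and the exact comparison the server uses (strictly less than versus less than or equal) is an assumption. The "Requeued: Retry N" wording follows the status text quoted in the Addendum.

```python
from dataclasses import dataclass

ABNORMAL = "Migration terminated abnormally"


@dataclass
class QueuedJob:
    job_id: int
    retry_count: int = 0
    status: str = "Pending"  # hypothetical initial status


def handle_abnormal_termination(job: QueuedJob, maximum_retry_count: int) -> bool:
    """Re-queue the job if retries remain; return True when it was re-queued."""
    # A MaximumRetryCount of 0 or less disables Automatic Restart entirely.
    if maximum_retry_count > 0 and job.retry_count < maximum_retry_count:
        job.retry_count += 1
        job.status = f"Requeued: Retry {job.retry_count}"
        return True
    job.status = ABNORMAL
    return False


job = QueuedJob(job_id=42)
outcomes = []
for _ in range(3):  # three consecutive crashes with MaximumRetryCount = 2
    requeued = handle_abnormal_termination(job, maximum_retry_count=2)
    outcomes.append((requeued, job.status))
```

With a maximum of 2, the first two crashes re-queue the job and the third leaves it in the “Migration terminated abnormally” status, which is then the last status shown for the user.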
Summary
In this section, we have reviewed how the Notes Migrator Migration Worker application detects a crash of the migration engine, how the Migration Server processes the “Migration terminated abnormally” status and re-queues the job if necessary, and ultimately how the migration automatically restarts on the same workstation.
In this manner, the Automatic Restart process in Notes Migrator allows migration administrators to queue jobs to the migration server, and know that if a crash were to occur the migration will automatically be restarted without intervention up to the maximum number of times indicated by the MaximumRetryCount parameter. This feature reduces the amount of resources required to monitor migrations, and eliminates the need for administrator intervention in the case where the migration engine has crashed.
Again, it should be noted that the Lotus Notes application interface (Notes Migrator.nsf) will report the last migration status for the user. Any automatic restart events will be evident in the complete migration history, but will not be reflected directly in the interface. In the case where the maximum restart count has been exceeded, the last migration event will be reported as “Migration terminated abnormally.”
Addendum
We have seen a few cases where no crash occurred on the worker, but the workers never received the migration job. The job behaved as if it were stuck in a pending state: it returned a status of "Requeued: Retry 1", then remained in that status and never retried, even though recovery was enabled.
This is caused by corruption in the SQL records for the users reporting the "Requeued: Retry 1" status.
Perform the following steps to get the migrations running again:
Connect to the Notes Migrator SQL database.
Open these three tables for editing: 'CMT_User', 'MigrationQueueTable', and 'Migration_Details'.
Delete all records related to the affected users.
Close the SQL query windows.
From the MCC, open a command prompt (CMD) and run iisreset.
Start migrations for these users.
© 2024 Quest Software Inc. ALL RIGHTS RESERVED.