ChangeBASE 6.2.2 - User Guide

Web Crawler

Important: The Web Crawler utility is designed to capture Web applications for assessment, rather than large commercial Websites, and so it is advisable to keep to the limits in Web Capture Settings. Otherwise, the performance of the utility may be adversely affected.

To open the Web Crawler:

  1. Select Dell > ChangeBASE > Web Capture. The Web Crawler utility is displayed by default.

To add a Web application for capture:

  1. Click in the Web Applications panel to create a new row.
  2. Check Auto Logon if you wish to use Forms-based Authentication.

Caution: When using Auto Logon, it is highly recommended that you use an account specifically created for the purposes of testing. This is to avoid a situation whereby a link is automatically followed that could cause information in a live environment to be deleted or modified. Ideally, this account should not have high levels of authority, thereby minimizing the risk.

Refer also to HTTP Authentication.

  1. Enter a Name for the captured application.
  2. Enter the URL of the application to be captured.
  3. If you checked Auto Logon, go to the Auto Logon Settings panel, then set up and link the Forms-based Authentication as described in the sub-steps immediately below. Otherwise, go straight to the step "Click Start to begin capturing data".
    1. Click in the Auto Logon Settings panel.
    1. In the Auto Logon Description field, enter a name that can be used to identify this group of settings. It must contain only alphanumeric characters, spaces, hyphens and underscores.
    2. In the Logon URL field, enter the URL of the sign-in page where Forms Authentication is required.
    3. Click to load the Logon URL into the browser window. The URL is disabled while the page is being loaded.

    1. Set the Heartbeat URL field to a page that can be reached by the import process once it has been authenticated against the application.
    2. To allow the Web Crawler to distinguish heartbeat pages from pages that are accessible to guests before they are authenticated, configure a unique string in the Heartbeat Unique Text field. This unique string could, for example, be the full name of the user shown following logon. The application should not show the string before the import process has been authenticated, as this may give a false positive.

    Important: The heartbeat string is matched against the HTML source of the heartbeat page. It is therefore important to ensure that the string which is visible on the page is also searchable in the page source. Otherwise, the Auto Logon crawler cannot match the heartbeat page. To check the page source, right-click in the browser window and click View Source, then search for the unique string in the source text.

    Note: Once a request to navigate to a page has been initiated by clicking , the Browser Status box shows the status of the browser. The main window shows the currently loaded logon page. It is used to allow you to verify that the logon page has been successfully located and loaded, and also to select the logon controls in the next step.

    1. A list of the available controls for the page is loaded into the Available Controls box on the bottom right of the screen. Select at least the following controls from the list, and click the adjacent arrow to move each one to Selected Logon Controls.
      1. A text (input) box used to enter the username or email address.

      2. A text (input) or password box to enter the user’s password.

      3. The button or link that needs to be clicked by the user in order to initiate the login request.

    Note: To make the process easier for you, it is possible to click on the control to be selected in the browser window above. If the control does not already exist in the list of Available Controls, then the newly selected control will move to the top of this list. You can then click the adjacent arrow as above.

    1. Click OK once your input is complete.
    2. In the Web Applications panel, select the application to which the Auto Logon details are to be linked, and then click Link to Application at the top of the Auto Logon Settings panel.
  4. Click Start to begin capturing data.

If you specified Auto Logon for Forms-based Authentication, you are first prompted with the controls that you set up in Auto Logon Settings.

Enter the logon details requested.

Refer also to HTTP Authentication.

  1. Click Stop to end the data capture.

    The data captured appears in the Previously Captured Sessions panel, where the first part of the File Name is the path specified in the Export Location for Data Files in Web Capture Settings. To show all captures in the specified export location, ensure that Show All is checked.

  2. Click Delete to remove highlighted sessions.

  3. The Capture Log window shows the log file for the current capture session. If there is no current session, then the log shown is for the highlighted previous session.
  4. If the Export Location for Data Files points to a network share that can be accessed by the ChangeBASE Web Capture Package Source, then the data automatically appears in the Import Window as files that can be imported and assessed against the browsers specified in Platform Setup.
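
The two text rules in the Auto Logon steps above (the characters allowed in the Auto Logon Description, and the heartbeat string being matched against the page source) can be illustrated with a minimal Python sketch. This is not ChangeBASE code: the function names are hypothetical, "alphanumeric" is assumed to mean ASCII letters and digits, and the heartbeat match is assumed to be a plain, case-sensitive substring search.

```python
import re

# Assumed reading of the stated rule: only alphanumeric characters,
# spaces, hyphens and underscores (ASCII letters and digits assumed).
_DESCRIPTION_RE = re.compile(r"[A-Za-z0-9 _-]+")


def is_valid_description(text: str) -> bool:
    """Return True if text uses only the characters the guide allows."""
    return _DESCRIPTION_RE.fullmatch(text) is not None


def heartbeat_text_in_source(page_source: str, unique_text: str) -> bool:
    """Return True if unique_text occurs verbatim in the raw HTML source.

    Hypothetical helper: a plain substring search, which is an assumption
    about how the crawler matches the heartbeat string.
    """
    return bool(unique_text) and unique_text in page_source


# Text that looks identical in the browser may differ in the source:
plain = "<p>Welcome, Jane Doe</p>"
marked_up = "<p>Welcome, <b>Jane</b>&nbsp;Doe</p>"
print(heartbeat_text_in_source(plain, "Jane Doe"))      # found in the source
print(heartbeat_text_in_source(marked_up, "Jane Doe"))  # not found in the source
```

The second page displays the same name, but markup and an entity split the string in the source, which is the situation the Important note about View Source warns against.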

HTTP Authentication

During an import, if a page or area of the Website being crawled is protected by HTTP Authentication, then an authentication dialog is shown.

Enter your username, password and/or any other details required for authentication, and click OK. If you click Cancel, the import process is halted.

Note: To avoid confusion between sites using HTTP Authentication and Forms Authentication, the Auto Logon configuration screen blocks any requests for HTTP authentication and displays a message to that effect.

In this situation, it is recommended that you do not configure Auto Logon settings for sites that only require this type of authentication, because the logon screen will automatically be displayed for HTTP authentication. However, if the site requires both types of authentication, then you will need to configure the Auto Logon settings.
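
To tell which type of authentication a site uses before deciding whether to configure Auto Logon, it can help to look at the raw HTTP response. In standard HTTP, a server that requires HTTP Authentication answers with status 401 and a WWW-Authenticate header, whereas a forms-based logon is an ordinary HTML page. The sketch below encodes that check; the function name is hypothetical and it does not describe how ChangeBASE itself detects the challenge.

```python
from typing import Mapping


def needs_http_authentication(status: int, headers: Mapping[str, str]) -> bool:
    """Return True if a response is an HTTP Authentication challenge.

    A challenge is a 401 status accompanied by a WWW-Authenticate header.
    Header names are compared case-insensitively, as HTTP requires.
    """
    header_names = {name.lower() for name in headers}
    return status == 401 and "www-authenticate" in header_names


# A 401 challenge indicates HTTP Authentication.
print(needs_http_authentication(401, {"WWW-Authenticate": 'Basic realm="intranet"'}))
# A redirect to a sign-in page is typical of forms-based logon instead.
print(needs_http_authentication(302, {"Location": "/login"}))
```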

Troubleshooting the Web Crawler

Examine the Web Crawler Capture Log to see whether any of the following limits or checks have resulted in errors.

| Statistic | Description | Limits / Retries Allowed |
| --- | --- | --- |
| Failed Forms Authentication Logins | Incorrect credentials were entered or the login page could not be reached. | 3 |
| Failed HTTP Authentication Login | Incorrect credentials were entered for HTTP authentication. | 3 |
| Failed Heartbeat URL Check | The URL of the heartbeat page could not be reached. | 3 |
| Failed Heartbeat Unique String Check | The unique string configured by the user could not be found on the heartbeat page. | 3 |
| Failed Download | A download failed to complete successfully, for example due to network failure. | 20 |
| User Skipped Page | User requested to cancel authentication when prompted. | 5 |
| Crawler Skipped Page | The crawler skipped a page due to it being outside of the relevant domain or having an excluded MIME type. | 0 |
| Downloading Skipped Page | When downloading, it may be necessary to skip a page if its MIME type is excluded. Downloads are also skipped if a page returns HTTP Status 204 (No Content). | 0 |

If none of these appear in the log, and the log gives no other indication of the problem you are encountering, relay the log contents to Professional Services.
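
The skip conditions listed for Crawler Skipped Page and Downloading Skipped Page can be sketched as a single decision function, which may help when interpreting those log entries. This is an illustration under stated assumptions, not the crawler's implementation: "relevant domain" is assumed to mean the same host as the start URL, the excluded MIME types shown are hypothetical placeholders, and the function name and reason strings are invented for the example.

```python
from typing import Optional
from urllib.parse import urlsplit

# Hypothetical exclusion list; the real list is configured in the product.
EXCLUDED_MIME_TYPES = {"application/pdf", "video/mp4"}


def skip_reason(url: str, start_url: str, status: int,
                content_type: str) -> Optional[str]:
    """Return why a page would be skipped, or None if it would be kept."""
    # Assumption: a page is "outside the relevant domain" when its host
    # differs from the host of the start URL.
    if urlsplit(url).hostname != urlsplit(start_url).hostname:
        return "outside domain"
    # HTTP Status 204 (No Content) means there is nothing to download.
    if status == 204:
        return "no content (204)"
    # Compare only the media type, ignoring parameters such as charset.
    mime = content_type.split(";", 1)[0].strip().lower()
    if mime in EXCLUDED_MIME_TYPES:
        return "excluded MIME type"
    return None


start = "https://intranet.example/"
print(skip_reason("https://other.example/page", start, 200, "text/html"))
print(skip_reason("https://intranet.example/home", start, 200,
                  "text/html; charset=utf-8"))
```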

Directed Web Capture

The Web Crawler option is suitable for static Web data; however, it is advisable when capturing more dynamic Web pages, for example your corporate Intranet site, to use Directed Web Capture.

Note: Data generated by scripts within the Web pages, such as data generated via Ajax, is not captured.

Important: Directed Web Capture requires additional setup and needs to be integrated with third party systems (for example, the corporate proxy server), so you should seek guidance from Professional Services. It is advisable to use a test environment, and, within this environment, to install, wherever possible, the version of the browser to which you wish to migrate. This will avoid the accumulation of minor browser compatibility issues, for example issues relating to W3C and CSS Standards, and give you a clear view of any major issues.

To use Directed Web Capture:

  1. Select Dell > ChangeBASE > Web Capture and click on the Directed Web Capture tab.

To capture dynamic Web data:

  1. Click Start.
  2. You are prompted for the name of the file in which to store the data.

This name will automatically be appended to the Export Location for Data Files that you specified in Web Capture Settings. Therefore, enter a string by which you will be able to recognize the automatically generated file.

Note: You may be prompted to allow Directed Web Capture through your firewall. If so, select the type of Websites (Private or Public) to which you want the utility to have access.

  1. Click Start to begin capturing data.
  2. Refresh the web page that you wish to capture, to ensure that Web traffic is being generated.
  3. Click Stop to end the data capture.

The data captured appears in the Previously Captured Sessions panel, where the first part of the File Name is the path specified in the Export Location for Data Files in Web Capture Settings. To show all captures in the specified export location, ensure that Show All is checked.

Click Delete to remove highlighted sessions.

The Capture Log window shows the log file for the current capture session. If there is no current session, then the log shown is for the highlighted previous session.

If the Export Location for Data Files points to a network share that can be accessed by the ChangeBASE Web Capture Package Source, then the data automatically appears in the Import Window as files that can be imported and assessed against the browsers specified in Platform Setup.
