...

Go to the System Status page by clicking 'Systemstate'. Click on the application HarvestControllerServer. The most recent log record gives status information from Heritrix. To see more information about the application, click 'Show all' in the Index column.

...

To try this, go back to the viewerproxy status page and click 'Start collecting URLs'. Now browse the collected material until you find a page or image that was not harvested. Then go back to the viewerproxy status page and click 'Show collected URLs'.

The list will contain several URLs, including the ones you requested while URL collection was running and found to be missing.

...

You will now see a page with the information used when harvesting that domain. In this case, we want to add the collected URLs to the seed list that our web harvests start from.

On the domain definition page, click 'Edit' next to the seed list.

...