...

  1. In the Heritrix GUI, change some parameters for the domain netarkivet.dk, e.g. set max-hops to 15 and delay-factor to 1.5.
  2. Click "Resume" in the Heritrix Console.
  3. Confirm that the job is running again in the NAS System overview.

Restart The System

  1. Stop and restart NAS. After some time, a job should appear in the state "Failed".
  2. Resubmit the job. A new job should be created.
  3. Wait for the job to finish.

Check the Overrides are Applied

For the failed job, go to the QA interface and check the order template for the job as listed in the reports (or log in to the bitarchive and look directly in the metadata arcfile). Check that the overrides are visible. The easiest way to do this is from test@kb-prod-udv-001:

Code Block
[test@kb-prod-udv-001 ~]$ ssh netarkiv@sb-test-bar-001 grep max-hops /netarkiv/0001/TEST2/filedir/<jobno>-metadata-1.arc
[test@kb-prod-udv-001 ~]$ ssh netarkiv@sb-test-bar-001 grep delay-factor /netarkiv/0001/TEST2/filedir/<jobno>-metadata-1.arc
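If the check is repeated often, the two greps can be wrapped in a small helper that also compares the value. The following is only a sketch: the function name check_override is hypothetical, and it assumes the order template stores each setting as an XML element of the form name="max-hops">15<, which should be confirmed against the real grep output before relying on it.

```shell
# Hypothetical helper, not part of NAS or Heritrix.
# Usage: check_override <metadata-file> <setting-name> <expected-value>
# Assumes the setting appears in the file as: name="<setting-name>"><expected-value><
check_override() {
    # -F treats the pattern as a fixed string, so the "." in 1.5 is literal;
    # -q suppresses output and only sets the exit status.
    if grep -qF "name=\"$2\">$3<" "$1"; then
        echo "OK: $2 = $3"
    else
        echo "MISSING: $2 = $3"
        return 1
    fi
}
```

It would be run on the bitarchive machine, or against a local copy of the metadata arcfile, once for max-hops with 15 and once for delay-factor with 1.5.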

 

Restart The System

...

...

Check that Alias Domains are not Harvested

...

Code Block
<map name="http-headers">
  <string name="user-agent">Mozilla/5.0 (compatible; heritrix/1.5.0-200506132127+http://netarkivet.dk/website/info.html)</string>
  <string name="from">netarkivet-svar@netarkivet.dk</string>
</map>

This can be done by grepping with a command like:

Code Block
[test@kb-prod-udv-001 ~]$ ssh netarkiv@sb-test-bar-001 grep netarkivet.dk /netarkiv/0001/TEST2/filedir/*-metadata-1.arc | grep -v 'metadata:'

 

Check Byte Limits for the Second Harvest

...