...

Code Block
[test@kb-prod-udv-001 ~]$ ssh netarkiv@sb-test-bar-001.statsbiblioteket.dk grep max-hops /netarkiv/0001/TEST2/filedir/<jobno>-metadata-1.warc
[test@kb-prod-udv-001 ~]$ ssh netarkiv@sb-test-bar-001.statsbiblioteket.dk grep delay-factor /netarkiv/0001/TEST2/filedir/<jobno>-metadata-1.warc

(Note that there should be two setup/order reports. The one with a timestamp in its name is the original order.xml; the one named simply
metadata://netarkivet.dk/crawl/setup/order.xml is the final modified version, so the max-hops and delay-factor values should be checked in that one.)
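
To see which setup/order reports a metadata file actually contains, something like the following sketch may help. It assumes the file has been copied locally and that the report URIs appear as plain text in the record headers; the function name list_order_records is hypothetical, and the exact form of the timestamped name is not assumed beyond it starting with "order".

```shell
# Hypothetical helper: list the distinct setup/order report URIs in a local
# copy of a metadata file (query strings, if any, are cut off at the '?').
# Assumption: the URIs appear as plain text in the record headers.
list_order_records() {
    grep -a -o 'metadata://netarkivet\.dk/crawl/setup/order[^[:space:]?>]*' "$1" | sort -u
}
```

If the note above holds, running list_order_records on the job's metadata file should print two lines: one with a timestamp in the name and one ending in plain order.xml.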

Check that Alias Domains are not Harvested

...

Code Block
[test@kb-prod-udv-001 ~]$ ssh netarkiv@sb-test-bar-001.statsbiblioteket.dk grep netarkivet.dk /netarkiv/0001/TEST2/filedir/*-metadata-1.warc | grep -v 'metadata:'

or by scp'ing the metadata file to kb-prod-udv-001 and inspecting it with "less".
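
For the scp variant, the same filter can be run on the local copy instead of paging through it by hand. The sketch below is only an illustration: the function name find_domain_hits is hypothetical, and it uses grep -F so that the dot in the domain is matched literally rather than as a regex wildcard.

```shell
# Hypothetical helper: print lines that mention a domain in one or more local
# metadata files, skipping lines that contain 'metadata:' (the record URIs),
# mirroring the remote grep pipeline on a local copy.
# Assumption: the file was first copied over, e.g.
#   scp netarkiv@sb-test-bar-001.statsbiblioteket.dk:/netarkiv/0001/TEST2/filedir/<jobno>-metadata-1.warc .
find_domain_hits() {
    domain="$1"
    shift
    # -a treats the file as text, -F matches the domain as a fixed string
    grep -a -F -- "$domain" "$@" | grep -v 'metadata:'
}
```

For example, find_domain_hits netarkivet.dk followed by the name of the copied file prints the same lines as the ssh command, apart from any that matched only because the unescaped dot in the pattern matches any character. When several files are given, grep prefixes each line with the file name, as it does with the wildcard in the ssh command.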

...

Check that there was no Deduplication

Using a browser set up for ViewerProxy access, check the processors-report for one of the snapshot-harvest jobs. Confirm that it contains no DeDuplicator report with a "Duplicate found" line.

Stop the Test and Clean-Up

...