...
Code Block |
---|
[test@kb-prod-udv-001 ~]$ ssh netarkiv@sb-test-bar-001.statsbiblioteket.dk grep max-hops /netarkiv/0001/TEST2/filedir/<jobno>-metadata-1.warc |
[test@kb-prod-udv-001 ~]$ ssh netarkiv@sb-test-bar-001.statsbiblioteket.dk grep delay-factor /netarkiv/0001/TEST2/filedir/<jobno>-metadata-1.warc |
CLARIFY - WHAT DOES THIS MEAN?: (Note that the metadata file should contain two setup/order reports. The one with a timestamp in its name is the original order.xml; the one called simply metadata://netarkivet.dk/crawl/setup/order.xml is the final, modified version.)
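A minimal sketch of one way to see which setup/order reports a metadata file contains. It assumes the file has already been copied to the local machine and that the report URIs appear in it as plain text; the function name `list_order_reports` is hypothetical and not part of NetarchiveSuite.

```shell
# Hypothetical helper: print each distinct setup/order URI found in a
# local copy of a metadata file, with any query string stripped.
list_order_reports() {
    # $1 = path to the local copy of the metadata file
    # -a treats the file as text even if it contains binary payloads;
    # -o prints only the matching part of each line.
    grep -a -o 'metadata://netarkivet\.dk/crawl/setup/order[^?[:space:]]*' "$1" | sort -u
}
```

Called with the path of the copied metadata file, it prints one line per distinct report name, so two lines are expected if both the original and the modified order.xml are present.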
Check that Alias Domains are not Harvested
...
Code Block |
---|
[test@kb-prod-udv-001 ~]$ ssh netarkiv@sb-test-bar-001.statsbiblioteket.dk grep netarkivet.dk /netarkiv/0001/TEST2/filedir/*-metadata-1.warc | grep -v 'metadata:' |
or by scp'ing the metadata file to kb-prod-udv-001 and inspecting it with "less".
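For the local-copy route, the same filter can be sketched as a small shell function. This is an illustration only: it assumes the metadata file has already been copied over, and the function name `count_domain_hits` is hypothetical.

```shell
# Hypothetical helper: count the lines in a local metadata file that mention
# a domain but are not metadata: record URIs (same filter as the grep
# pipeline above, run against a local copy).
count_domain_hits() {
    # $1 = local copy of the metadata file, $2 = domain to look for
    # -a treats the file as text; -F matches the domain as a fixed string;
    # -v -c counts the lines that do NOT contain 'metadata:'.
    grep -a -F -- "$2" "$1" | grep -v -c 'metadata:'
}
```

The function prints only a count; to see the matching lines themselves, drop the `-c` option or inspect the file with "less" as described above.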
...
Check that there was no Deduplication
CLARIFY: Using a browser set up for ViewerProxy access, check the processors-report for one of the snapshot-harvest jobs. Confirm that there was no DeDuplicator report with a "Duplicate found" line.
Stop the Test and Clean-Up
...