Netarkivet Daily harvest template
A tip for harvesting newspapers: the harvest can be stopped automatically after one day (more precisely after 23 hours, so that deduplication has time to finish). The following extract from order.xml does this:
A processor that halts further progress once a fixed amount of time has elapsed since the start of a crawl.
<newObject name="RuntimeLimitEnforcer" class="org.archive.crawler.prefetch.RuntimeLimitEnforcer">
  <boolean name="enabled">true</boolean>
  <newObject name="RuntimeLimitEnforcer#decide-rules" class="org.archive.crawler.deciderules.DecideRuleSequence">
    <map name="rules">
    </map>
  </newObject>
  <long name="runtime-sec">82800</long>
  <string name="end-operation">Terminate job</string>
</newObject>
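The runtime-sec value is the time limit in seconds, so 23 hours corresponds to 23 × 3600 = 82800. As a sanity check, here is a minimal Python sketch that reads the value from a shortened copy of the fragment and converts it back to hours. The helper names runtime_limit_hours and hours_to_runtime_sec are hypothetical and are not part of Heritrix or NetarchiveSuite; the decide-rules element is left out of the copy because it does not affect the time limit.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the order.xml extract (decide-rules omitted for brevity).
FRAGMENT = """
<newObject name="RuntimeLimitEnforcer" class="org.archive.crawler.prefetch.RuntimeLimitEnforcer">
  <boolean name="enabled">true</boolean>
  <long name="runtime-sec">82800</long>
  <string name="end-operation">Terminate job</string>
</newObject>
"""

SECONDS_PER_HOUR = 3600


def runtime_limit_hours(xml_text):
    """Return the configured runtime limit in hours (hypothetical helper)."""
    root = ET.fromstring(xml_text)
    # Look for the <long name="runtime-sec"> child of the processor element.
    node = root.find("long[@name='runtime-sec']")
    if node is None or node.text is None:
        raise ValueError("no runtime-sec setting found")
    return int(node.text) / SECONDS_PER_HOUR


def hours_to_runtime_sec(hours):
    """Convert a limit in hours to the value to put in runtime-sec (hypothetical helper)."""
    return int(hours * SECONDS_PER_HOUR)


if __name__ == "__main__":
    print(runtime_limit_hours(FRAGMENT))   # 23.0
    print(hours_to_runtime_sec(23))        # 82800
```

To use a different limit, for example 22 hours, hours_to_runtime_sec(22) gives the value to write into runtime-sec.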