...

  • Using the GUI, find the job number and the name of the harvest machine for the job in which kum.dk is being harvested.
  • Download the attached script and modify it to point at the correct harvester and job number.
  • Copy the script to kb-prod-udv-001.kb.dk and run it. It monitors the "warcs" directory; as soon as the first warcfile is uploaded, it detects that uploading has started and shuts down the test instance.
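The attached script is not reproduced on this page. Purely as an illustration of the polling idea, a minimal sketch might look like the following. The function name `wait_for_first_warc`, its arguments, the default interval, and the file-name pattern `*.warc*` are assumptions, not taken from the real script, and the command that shuts down the test instance is deliberately left out.

```shell
#!/bin/sh
# Hypothetical sketch: poll a directory until the first WARC file appears.
# Usage: wait_for_first_warc DIR [INTERVAL_SECONDS] [MAX_TRIES]
# Prints the first file found and returns 0, or returns 1 on timeout.
wait_for_first_warc() {
    dir=$1
    interval=${2:-10}     # assumed default: check every 10 seconds
    max_tries=${3:-360}   # assumed default: give up after 360 checks
    tries=0
    while [ "$tries" -lt "$max_tries" ]; do
        # Look only at regular files directly inside the directory.
        first=$(find "$dir" -maxdepth 1 -type f -name '*.warc*' 2>/dev/null | head -n 1)
        if [ -n "$first" ]; then
            echo "$first"
            return 0
        fi
        tries=$((tries + 1))
        sleep "$interval"
    done
    return 1
}
```

It could be called as `wait_for_first_warc /path/to/warcs 10 && echo "upload started"`, where `/path/to/warcs` is a placeholder and the `echo` stands in for whatever shutdown step the real script performs.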

Check that at least one file has been uploaded.

...

  1. Go to the harvest status page at http://kb-test-adm-001.kb.dk:8076/HarvestDefinition and find the job for kum.dk.
  2. In the system overview, find the harvester running the job. The information will appear in the log column once the job has been started.
  3. Run the attached script to stop the test system after the first arcfile has been uploaded. Note that the script needs to be updated with the relevant job number and harvester.

...

  1. Log on to the harvester, e.g. ssh kb-test-har-001.
  2. Verify that a metadata file exists at ~/TEST?/harvester_low/{crawldir}/metadata/
  3. Copy the file to /tmp

...

Jira issue: NAS-2162 (server: SBForge)

...

Save the Metadata Warcfile

  • Log into the harvester where kum.dk was being harvested
  • Find the crawldir in TEST6/harvester_low
  • Find the metadata warcfile in the metadata subdirectory and copy it to /tmp
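The copy step above can be sketched as a small helper that fails loudly when no metadata file is present. The function name `save_metadata` and its arguments are hypothetical; the crawl directory has to be supplied by the operator, since its name depends on the job.

```shell
#!/bin/sh
# Hypothetical sketch: copy every file in CRAWLDIR/metadata to DEST,
# failing if the metadata directory contains no regular files.
# Usage: save_metadata CRAWLDIR [DEST]
save_metadata() {
    crawldir=$1
    dest=${2:-/tmp}       # /tmp is the destination named in the steps above
    count=0
    for f in "$crawldir"/metadata/*; do
        # An unmatched glob stays literal, so check the file really exists.
        [ -f "$f" ] || continue
        cp -p "$f" "$dest"/ || return 1
        count=$((count + 1))
    done
    if [ "$count" -eq 0 ]; then
        echo "no metadata file found in $crawldir/metadata" >&2
        return 1
    fi
    echo "copied $count file(s) to $dest"
}
```

Called with the crawl directory found under TEST6/harvester_low, a non-zero exit status indicates that the metadata subdirectory was empty or missing.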

Create a Fake Crawl Dir

  • ssh netarkiv@sb-test-har-001.statsbiblioteket.dk
  • cd TEST6/harvester_high 
  • cp -r ~netarkiv/testdata/TEST6/23-fakejobdir .
  • mkdir 23-fakejobdir/logs
  • touch 23-fakejobdir/logs/crawl.log

Wait 3 Hours then Restart the System

  1. Verify the restarted system on kb-test-adm-001:
    1. Check the log for warnings and errors.

      Code Block
      cd /home/test/$TESTX/log/
      grep SEVERE *.log.0
      grep WARNING *.log.0

      The following entries are normal: 

      Code Block
      arcrepositoryapplication0.log.0:WARNING: AdminDataFile (./admin.data) was not found.
      guiapplication0.log.0:WARNING: Refusing to schedule harvest definition 'netarkivet' in the past. Skipped 18 events. Old nextDate was Mon Dec 18 14:29:30 CET 2006 new nextDate is Tue Dec 19 09:29:30 CET 2006
      GUIApplication0.log.0:WARNING: Job 2 failed: HarvestErrors = dk.netarkivet.common.exceptions.IOFailure: Crawl probably interrupted by shutdown of HarvestController

      The following warning may occur after a while: 

      Code Block
      WARNING: Error processing message '
      Class:                  com.sun.messaging.jmq.jmsclient.ObjectMessageImpl
      getJMSMessageID():      ID:40-130.225.27.140(d2:1:3:b1:10:de)-46478-1197902260630
      getJMSTimestamp():      1197902260630
      getJMSCorrelationID():  null
      JMSReplyTo:             null
      JMSDestination:         TEST6_COMMON_THE_SCHED
      getJMSDeliveryMode():   PERSISTENT
      getJMSRedelivered():    false
      getJMSType():           null
      getJMSExpiration():     0
      getJMSPriority():       4
      Properties:             null'
      dk.netarkivet.common.exceptions.UnknownID: Job id 23 is not known in persistent storage
              at dk.netarkivet.harvester.datamodel.JobDBDAO.read(JobDBDAO.java:294)
              at dk.netarkivet.harvester.scheduler.HarvestSchedulerMonitorServer.processCrawlStatusMessage(HarvestSchedulerMonitorServer.java:103)
              at dk.netarkivet.harvester.scheduler.HarvestSchedulerMonitorServer.visit(HarvestSchedulerMonitorServer.java:285)
              at dk.netarkivet.harvester.harvesting.distribute.CrawlStatusMessage.accept(CrawlStatusMessage.java:133)
              at dk.netarkivet.harvester.distribute.HarvesterMessageHandler.onMessage(HarvesterMessageHandler.java:67)
              at com.sun.messaging.jmq.jmsclient.MessageConsumerImpl.deliverAndAcknowledge(MessageConsumerImpl.java:330)
              at com.sun.messaging.jmq.jmsclient.MessageConsumerImpl.onMessage(MessageConsumerImpl.java:265)
              at com.sun.messaging.jmq.jmsclient.SessionReader.deliver(SessionReader.java:102)
              at com.sun.messaging.jmq.jmsclient.ConsumerReader.run(ConsumerReader.java:174)
              at java.lang.Thread.run(Thread.java:595)
  2. Go to the system overview page and check that all the expected applications are listed and are without warnings or errors.
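When checking the logs in step 1, it can help to hide the three entries listed there as normal so that only unexpected lines remain. The following is a minimal sketch of that idea; the function name `unexpected_log_entries` is hypothetical, and the filter strings are substrings copied from the normal entries shown in step 1. It does not filter the "Job id 23 is not known" warning, which spans several lines.

```shell
#!/bin/sh
# Hypothetical sketch: print SEVERE/WARNING lines from LOGDIR/*.log.0 that
# are NOT among the entries listed as normal. Filters are fixed substrings.
# Usage: unexpected_log_entries LOGDIR
unexpected_log_entries() {
    logdir=$1
    # /dev/null is added so grep always prefixes each match with its file name.
    grep -E 'SEVERE|WARNING' "$logdir"/*.log.0 /dev/null 2>/dev/null |
        grep -v -F \
            -e 'AdminDataFile (./admin.data) was not found.' \
            -e 'Refusing to schedule harvest definition' \
            -e 'Crawl probably interrupted by shutdown of HarvestController' || true
    # "|| true": grep exits non-zero when no lines remain, which is the good case.
}
```

For example, `unexpected_log_entries /home/test/$TESTX/log` prints nothing when the only warnings present are the expected ones.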

Check that a job can be resubmitted

  1. Check that you can reject a job for resubmission using the "Reject?" button so that it is no longer visible when you list failed jobs.
  2. Check that you can see the rejected job when you now list all jobs.
  3. Click on one or more "Genstart"/"Resubmit" buttons. Note that you can only resubmit jobs that failed due to harvesting errors, not those that failed due to upload errors.
  4. Check that the job status changes to "resubmitted" and that a new job is created from the same harvest definition with the same configurations.
  5. Check that resubmitted jobs contain information about which job they were resubmitted from (FR770).

...