...

  1. Using the GUI, go to "Harvest status"→"All Jobs", and by clicking each Job ID for the snapshot harvest in turn, find the job ID for the job in which kum.dk is being harvested.
  2. Go back to "Harvest status"→"All Jobs" and reload the page until the job you just identified has status "Started". Then immediately go to "Harvest status"→"H3 Remote Access" and keep reloading the page until the job ID found above appears (this may take several tries). Click the job ID as soon as it is ready, then immediately pause the job.
  3. Go to "Harvest status"→"H3 Remote Access" and click the job ID you identified, then click "View/Search in cached Crawllog", then "Update cache". Go to "Harvest status"→"All Running Jobs" and search for "kum.dk" to find the job, then note down the name of the harvest machine (Host) for that job.
  4. Download the attached script and modify it to point at the correct harvester and job number.
  5. Copy the script to kb-prod-udv-001.kb.dk:/home/devel/, make it executable with "chmod 755", then run it. The script monitors the "warcs" directory; as soon as the first WARC file is uploaded, it detects that uploading has started and shuts down the test instance.
  6. Log on to the Heritrix3 GUI and unpause the job (no explicit logout is necessary).
  7. Wait for the job to complete, after which the TEST6 instance is stopped, starting with the apps on the machine harvesting kum.dk.
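
The monitoring described in step 5 can be illustrated with a minimal Python sketch. This is not the attached script, whose contents are not shown here. It assumes that an uploaded WARC file disappears from the "warcs" directory, and that the directory is readable from where the code runs. The function names, the file suffixes and the polling interval are all hypothetical, and the shutdown command itself is site-specific and left out.

```python
import os
import time
from typing import Callable, Optional, Set


def list_warcs(directory: str) -> Set[str]:
    """Return the names of finished WARC files currently in `directory`."""
    try:
        names = os.listdir(directory)
    except FileNotFoundError:
        # The directory may not exist until the crawler starts writing.
        return set()
    # Assumption: finished files end in .warc or .warc.gz.
    return {n for n in names if n.endswith((".warc", ".warc.gz"))}


def wait_for_first_upload(
    list_files: Callable[[], Set[str]],
    poll_seconds: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
    max_polls: Optional[int] = None,
) -> Optional[str]:
    """Poll until a previously seen WARC file is gone and return its name.

    Assumption: a file that vanishes from the listing has been uploaded.
    Returns None if `max_polls` is reached without any file vanishing.
    """
    seen: Set[str] = set()
    polls = 0
    while max_polls is None or polls < max_polls:
        current = list_files()
        gone = seen - current
        if gone:
            # Uploading has started; the caller can now trigger the shutdown.
            return sorted(gone)[0]
        seen |= current
        polls += 1
        sleep(poll_seconds)
    return None
```

A caller would run something like `wait_for_first_upload(lambda: list_warcs(warcs_dir))`, where `warcs_dir` is the job's "warcs" directory on the harvester, and invoke the shutdown of the test instance once the function returns. Passing the listing and sleep functions as parameters keeps the polling logic testable without a real harvester.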

...