We're preparing our first broad crawl for 2023. To support this, we're writing a Python program that automates the creation of new harvest passes from a short YAML config file containing values for maxBytes, maxObjects, maxSeconds and the order template for each harvest pass. For example:

auto:
  P1:
    comment: this is an automatically created harvest pass
    objects: 3
    bytes: 1000
    seconds: 3600
    autostart: true
    previous: false
    template:
      name: broad_harvest_type_1
      placeholder_namespace: KB.
      placeholders:
        MAX_OBJECT_SIZE_BYTES: 400000000
        EXTRACT_JAVASCRIPT: false
  P2:
    previous: true
    objects: ...
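
As a rough illustration of how such a config could be turned into per-pass settings, here is a minimal Python sketch. It is not the actual program: the HarvestPass class and the parse_passes function are hypothetical names, the mapping of objects, bytes and seconds onto maxObjects, maxBytes and maxSeconds is an assumption, and so is the idea that placeholder_namespace is prefixed to each placeholder name. The config is written as a plain dict standing in for the parsed YAML (for instance the result of yaml.safe_load from PyYAML), and only P1 is included because the values for P2 are elided above.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class HarvestPass:
    """Settings for one harvest pass (hypothetical container type)."""
    name: str
    max_objects: int   # assumed to correspond to maxObjects
    max_bytes: int     # assumed to correspond to maxBytes
    max_seconds: int   # assumed to correspond to maxSeconds
    comment: str = ""
    autostart: bool = False
    previous: bool = False
    template_name: str = ""
    placeholders: Dict[str, Any] = field(default_factory=dict)


def parse_passes(config: Dict[str, Any]) -> List[HarvestPass]:
    """Build one HarvestPass per entry under the top-level 'auto' key."""
    passes = []
    for name, entry in config["auto"].items():
        template = entry.get("template", {})
        namespace = template.get("placeholder_namespace", "")
        # Assumption: the namespace is prefixed to every placeholder name.
        placeholders = {
            namespace + key: value
            for key, value in template.get("placeholders", {}).items()
        }
        passes.append(
            HarvestPass(
                name=name,
                max_objects=entry["objects"],
                max_bytes=entry["bytes"],
                max_seconds=entry["seconds"],
                comment=entry.get("comment", ""),
                autostart=entry.get("autostart", False),
                previous=entry.get("previous", False),
                template_name=template.get("name", ""),
                placeholders=placeholders,
            )
        )
    return passes


# Stand-in for the parsed YAML file; only P1 is shown.
config = {
    "auto": {
        "P1": {
            "comment": "this is an automatically created harvest pass",
            "objects": 3,
            "bytes": 1000,
            "seconds": 3600,
            "autostart": True,
            "previous": False,
            "template": {
                "name": "broad_harvest_type_1",
                "placeholder_namespace": "KB.",
                "placeholders": {
                    "MAX_OBJECT_SIZE_BYTES": 400000000,
                    "EXTRACT_JAVASCRIPT": False,
                },
            },
        }
    }
}

passes = parse_passes(config)
```

Keeping the parsing step separate from whatever creates the harvest passes makes it possible to check a config file for missing keys before any pass is created.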


Next meetings

  • April 11th
  • May 9th
  • June 6th
  • July 4th
  • September 5th
  • October 3rd
  • November 7th
  • December 5th
  • January 9th 2024
