  • Remove the JMS dependency from the controller.
    • Instead use a REST interface or some other means of exposing a simple extendable API.
  • Remove the notion of channels from the controller.
    • The management of organizing controllers into groups is left to the user of these APIs.
  • Make the code independent of the rest of NAS so that it can also be used outside NAS.
  • Controllers should be deployed independently of the rest of NAS.
  • Use a plugin architecture for core functionality. (Use classloaders)
    • configure harvester
    • build progress reports
    • build metadata files when the job is complete
    • upload data to persistent storage
  • A controller is built for a specific harvester: H1, H3, API.
  • Extendable using custom commands that the plugins add to the controller. (Thinking beyond H3...)
  • The API should include all required functions to control the harvest manager
    • Submit job.
    • Upload configuration files.
    • Upload additional files (indexes etc.).
    • Start job.
    • Get progress/report.
    • Stop job.
    • Initiate metadata generation.
    • Initiate upload.
    • Upload new versions of the plugins.
    • Upload new versions of the harvester package (H3 bundle).
  • Offer a base client implementation. (Used by a job manager/monitor)
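
The job-related calls listed above could be modelled roughly as follows. This is only a sketch in Python for brevity, not the actual controller: the class `HarvestController`, the `JobState` values and all method names are assumptions made for illustration, the two upload calls are folded into one `upload_file`, and plugin and harvester package uploads are left out. A REST layer would map each method to one endpoint.

```python
from dataclasses import dataclass, field
from enum import Enum


class JobState(Enum):
    SUBMITTED = "submitted"
    RUNNING = "running"
    STOPPED = "stopped"
    METADATA_BUILT = "metadata built"
    UPLOADED = "uploaded"


@dataclass
class Job:
    job_id: str
    state: JobState = JobState.SUBMITTED
    files: dict = field(default_factory=dict)  # file name -> content


class HarvestController:
    """Hypothetical in-memory stand-in for one controller."""

    def __init__(self):
        self.job = None  # the only state kept: the current harvest job

    def submit_job(self, job_id):
        # Accept a new job only when idle or when the previous job is fully uploaded.
        if self.job is not None and self.job.state is not JobState.UPLOADED:
            raise RuntimeError(f"busy with job {self.job.job_id}")
        self.job = Job(job_id)

    def upload_file(self, name, content):
        # Covers both configuration files and additional files such as indexes.
        self._expect(JobState.SUBMITTED)
        self.job.files[name] = content

    def start_job(self):
        self._move(JobState.SUBMITTED, JobState.RUNNING)

    def stop_job(self):
        self._move(JobState.RUNNING, JobState.STOPPED)

    def generate_metadata(self):
        self._move(JobState.STOPPED, JobState.METADATA_BUILT)

    def upload_data(self):
        self._move(JobState.METADATA_BUILT, JobState.UPLOADED)

    def get_progress(self):
        if self.job is None:
            return {"job_id": None, "state": "idle"}
        return {"job_id": self.job.job_id, "state": self.job.state.value}

    def _expect(self, state):
        if self.job is None or self.job.state is not state:
            raise RuntimeError(f"operation requires a job in state '{state.value}'")

    def _move(self, before, after):
        self._expect(before)
        self.job.state = after
```

A base client used by a job manager or monitor would call these operations in order: submit, upload files, start, poll progress, stop, generate metadata, upload.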

Only when the worst happens should it be necessary for a person to fiddle with the server.

Netarkivet would then migrate the existing code in the harvest control manager into one or more plugins, which other users could take as a reference for their own infrastructure. The code to migrate:

  • move files to h1/h3 folder.
  • build metadata reports.
  • build progress reports.
  • upload data to the bitarchive

The plugins can form the base for other people adapting the code to suit their own needs.
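
One possible shape for such plugins is sketched below, in Python for brevity. The names `ControllerPlugin`, `PluginRegistry`, `BuildMetadataPlugin` and `UploadPlugin` are invented for illustration, and a plain list stands in for the bitarchive. A real implementation would presumably load plugin classes through classloaders rather than register objects by hand.

```python
from abc import ABC, abstractmethod


class ControllerPlugin(ABC):
    """Hypothetical plugin contract: one named task the controller can run."""

    name = "unnamed"

    @abstractmethod
    def run(self, job_id):
        """Perform the task for the given job and return a short result."""


class BuildMetadataPlugin(ControllerPlugin):
    name = "build-metadata"

    def run(self, job_id):
        # Placeholder for the real metadata report generation.
        return f"metadata for {job_id}"


class UploadPlugin(ControllerPlugin):
    name = "upload"

    def __init__(self, store):
        self.store = store  # stand-in for the bitarchive or any other storage

    def run(self, job_id):
        self.store.append(job_id)
        return f"uploaded {job_id}"


class PluginRegistry:
    """Maps task names to plugin instances held by the controller."""

    def __init__(self):
        self._plugins = {}

    def register(self, plugin):
        # Registering under an existing name replaces the older version.
        self._plugins[plugin.name] = plugin

    def run(self, name, job_id):
        if name not in self._plugins:
            raise KeyError(f"no plugin registered for '{name}'")
        return self._plugins[name].run(job_id)
```

Someone adapting the code to another infrastructure would only swap the registered plugins, for example an `UploadPlugin` that writes to their own storage, and leave the controller untouched.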

The only state the manager should know about is its harvest.

Everything else is up to the caller. This way you can use channels or not and, more importantly, you can manage channels dynamically without having to reconfigure and redeploy harvest control managers.
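
As a rough illustration of channels living entirely on the caller side, the sketch below (Python, with the hypothetical class `ChannelDirectory` and made-up controller identifiers) keeps the grouping in the caller's own data structure. It assumes a controller belongs to at most one channel at a time. Moving a controller to another channel then changes nothing on the controller itself.

```python
from collections import defaultdict


class ChannelDirectory:
    """Caller-side grouping of controllers; the controllers never see it."""

    def __init__(self):
        self._channels = defaultdict(set)  # channel name -> controller identifiers

    def assign(self, controller, channel):
        # Assumption: a controller belongs to at most one channel at a time.
        self.unassign(controller)
        self._channels[channel].add(controller)

    def unassign(self, controller):
        for members in self._channels.values():
            members.discard(controller)

    def controllers(self, channel):
        # Unknown channels simply have no members.
        return sorted(self._channels.get(channel, ()))


directory = ChannelDirectory()
directory.assign("controller-a", "focused")
directory.assign("controller-b", "focused")
# Reassigning is a caller-side change only; nothing is redeployed.
directory.assign("controller-a", "snapshot")
```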