Activating a snapshot harvest on page HarvestDefinition/Definitions-snapshot-harvests.jsp calls the SnapshotHarvestDefinition#flipActive() method.

This method has the following logic if deduplication is enabled:

log.info("Snapshot harvest #{} activated. Requesting preparation of deduplicationIndex before jobgeneration can commence", harvestId);
Set<Long> jobSet = hdDaoProvider.get().getJobIdsForSnapshotDeduplicationIndex(harvestId);
jobIndexCache.requestIndex(jobSet, harvestId);

This sends an IndexRequestMessage to the IndexServer (more specifically, to the IndexRequestServer), requesting a deduplicationIndex for the given set of jobs.

After processing the request, the IndexRequestServer sends an IndexReadyMessage to the HarvestJobManager with either indexOK=true (the index is ready) or indexOK=false (the server failed to generate the index).

(See dk.netarkivet.harvester.indexserver.distribute.IndexRequestServer#doProcessIndexRequestMessage(), ll. 416-423)


The HarvestJobManager receives the IndexReadyMessage in the method HarvestSchedulerMonitorServer#processIndexReadyMessage().
There, the 'isindexready' field in the table 'fullharvests' is set to true if the 'indexOK' field of the IndexReadyMessage is true; otherwise it is set to false.
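
The request/reply round trip described above can be sketched roughly as follows. This is a simplified, in-memory illustration and not the NetarchiveSuite code: the classes IndexReadySketch, FullHarvestsTableSketch and IndexFlowSketch, and the generationSucceeds flag, are hypothetical stand-ins for the real messages, the 'fullharvests' table and the index generation step.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

/** Hypothetical stand-in for the IndexReadyMessage; only the two fields discussed above. */
final class IndexReadySketch {
    final long harvestId;
    final boolean indexOK;

    IndexReadySketch(long harvestId, boolean indexOK) {
        this.harvestId = harvestId;
        this.indexOK = indexOK;
    }
}

/** Hypothetical in-memory stand-in for the 'isindexready' column of 'fullharvests'. */
final class FullHarvestsTableSketch {
    private final Map<Long, Boolean> isIndexReady = new HashMap<>();

    void setIndexIsReady(long harvestId, boolean ready) {
        isIndexReady.put(harvestId, ready);
    }

    /** Harvests that never received a reply count as "not ready" in this sketch. */
    boolean isIndexReady(long harvestId) {
        return isIndexReady.getOrDefault(harvestId, false);
    }
}

final class IndexFlowSketch {
    /**
     * Index server side: handle a request for the given jobs and reply.
     * Whether generation succeeds is passed in as a flag (an assumption of
     * this sketch; the real server actually builds the index).
     */
    static IndexReadySketch handleIndexRequest(long harvestId, Set<Long> jobIds,
                                               boolean generationSucceeds) {
        return new IndexReadySketch(harvestId, generationSucceeds);
    }

    /** Job manager side: copy indexOK into the 'isindexready' flag. */
    static void processIndexReady(IndexReadySketch msg, FullHarvestsTableSketch table) {
        table.setIndexIsReady(msg.harvestId, msg.indexOK);
    }

    public static void main(String[] args) {
        FullHarvestsTableSketch table = new FullHarvestsTableSketch();
        IndexReadySketch reply = handleIndexRequest(42L, Set.of(1L, 2L), true);
        processIndexReady(reply, table);
        System.out.println("harvest 42 index ready: " + table.isIndexReady(42L));
    }
}
```

In this sketch a failed index generation leaves the harvest marked as not ready, mirroring the indexOK=false case above.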

Selecting the list of jobs included in the deduplicationIndex

The method HarvestDefinitionDBDAO#getJobIdsForSnapshotDeduplicationIndex is responsible for computing the list of jobs included in the deduplication index.
It uses the getPreviousFullHarvests() method.
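
One plausible way to picture this selection is sketched below. It rests on an assumption that this page does not confirm: that the job list is gathered by walking the chain of previous full harvests and collecting their job IDs. The class DedupJobSelectionSketch and the two maps (previousOf, jobsOf) are hypothetical in-memory stand-ins for the database queries in HarvestDefinitionDBDAO.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class DedupJobSelectionSketch {
    /**
     * Walks the chain of previous full harvests, nearest predecessor first.
     * previousOf maps a harvest ID to the ID of the harvest it continues
     * (hypothetical representation; the real data lives in the database).
     */
    static List<Long> previousFullHarvests(long harvestId, Map<Long, Long> previousOf) {
        List<Long> chain = new ArrayList<>();
        Set<Long> seen = new HashSet<>();
        seen.add(harvestId);
        Long current = previousOf.get(harvestId);
        // Stop at the end of the chain, or if the data contains a cycle.
        while (current != null && seen.add(current)) {
            chain.add(current);
            current = previousOf.get(current);
        }
        return chain;
    }

    /** Collects the job IDs of all harvests in the chain (assumed selection rule). */
    static Set<Long> jobIdsForDedupIndex(long harvestId, Map<Long, Long> previousOf,
                                         Map<Long, List<Long>> jobsOf) {
        Set<Long> jobIds = new LinkedHashSet<>();
        for (Long previous : previousFullHarvests(harvestId, previousOf)) {
            jobIds.addAll(jobsOf.getOrDefault(previous, List.of()));
        }
        return jobIds;
    }
}
```

Note that in this sketch the jobs of the harvest being activated are not included, only those of its predecessors; the actual rule is in the source lines listed below.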


Classes involved in this workflow:

  • harvester/harvester-core/src/main/java/dk/netarkivet/harvester/webinterface/SnapshotHarvestDefinition.java, ll. 251-299 (esp. 267-282)
  • harvester/harvest-scheduler/src/main/java/dk/netarkivet/harvester/scheduler/HarvestSchedulerMonitorServer.java, ll. 196-224
  • harvester/harvester-core/src/main/java/dk/netarkivet/harvester/indexserver/distribute/IndexRequestServer.java, ll. 416-423
  • harvester/harvester-core/src/main/java/dk/netarkivet/harvester/datamodel/HarvestDefinitionDBDAO.java, ll. 1167-1187, 1189-1233