...
Code Block |
---|
log.info("Snapshot harvest #{} activated. Requesting preparation of deduplicationIndex before jobgeneration can commence", harvestId);
Set<Long> jobSet = hdDaoProvider.get().getJobIdsForSnapshotDeduplicationIndex(harvestId);
jobIndexCache.requestIndex(jobSet, harvestId);
// fhd.setIndexReady(true); // will be set to true later, when Indexserver announces, that deduplication index is ready
// by sending a IndexReadyMessage back to the scheduler (i.e. the HarvestJobManager). |
This sends an IndexRequestMessage to the IndexServer, or more specifically to the IndexRequestServer, requesting a deduplicationIndex for the given list of jobs (see IndexRequestClient).
After processing the request, the IndexRequestServer sends an IndexReadyMessage to the HarvestJobManager with either indexOK=true (the index is ready) or indexOK=false (the server failed to generate the index)
(see dk.netarkivet.harvester.indexserver.distribute.IndexRequestServer#doProcessIndexRequestMessage(), ll. 416-423).
The HarvestJobManager receives the IndexReadyMessage in the method HarvestSchedulerMonitorServer#processIndexReadyMessage().
There, the 'isindexready' field in the table 'fullharvests' is set to true if the 'indexOK' field in the IndexReadyMessage is true; otherwise it is set to false.
Note that HarvestDefinitionDBDAO#getReadyHarvestDefinitions() uses the 'isindexready' field to return only those ready snapshot harvest definitions for which isindexready is true.
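The handshake described above can be illustrated with a small in-memory model. This is only a sketch under stated assumptions: the class `IndexReadyFlowSketch` and its methods are hypothetical stand-ins (not NetarchiveSuite classes), a map replaces the 'isindexready' column of the 'fullharvests' table, and the JMS messaging is reduced to a plain method call.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical in-memory model of the index-ready handshake; not NetarchiveSuite code. */
class IndexReadyFlowSketch {

    /** Stand-in for the 'isindexready' column of the 'fullharvests' table. */
    private final Map<Long, Boolean> isIndexReady = new LinkedHashMap<>();

    /** Activating a snapshot harvest requests the index but leaves the flag false. */
    void activate(long harvestId) {
        isIndexReady.put(harvestId, false);
    }

    /** Models the described effect of processIndexReadyMessage(): copy indexOK into the flag. */
    void processIndexReady(long harvestId, boolean indexOk) {
        isIndexReady.put(harvestId, indexOk);
    }

    /** Models the described filter in getReadyHarvestDefinitions(): only flagged harvests. */
    List<Long> readyHarvests() {
        List<Long> ready = new ArrayList<>();
        for (Map.Entry<Long, Boolean> entry : isIndexReady.entrySet()) {
            if (entry.getValue()) {
                ready.add(entry.getKey());
            }
        }
        return ready;
    }

    public static void main(String[] args) {
        IndexReadyFlowSketch scheduler = new IndexReadyFlowSketch();
        scheduler.activate(1L);
        scheduler.activate(2L);
        scheduler.processIndexReady(1L, true);   // index server reports success
        scheduler.processIndexReady(2L, false);  // index server reports failure
        System.out.println(scheduler.readyHarvests()); // prints [1]
    }
}
```

In this model a harvest whose index generation failed simply never becomes ready, which matches the behaviour described for indexOK=false.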
Selecting the list of jobs included in the deduplicationIndex
The method HarvestDefinitionDBDAO#getJobIdsForSnapshotDeduplicationIndex is responsible for computing the list of jobs included in the deduplication index.
It uses the rather complex (and possibly incorrect) getPreviousFullHarvests() method, which is also in HarvestDefinitionDBDAO:
Code Block |
---|
/**
 * Get list of harvests previous to this one.
 *
 * @param thisHarvest The id of this harvestdefinition
 * @return a list of IDs belonging to harvests previous to this one.
 */
private List<Long> getPreviousFullHarvests(Long thisHarvest) {
    List<Long> results = new ArrayList<Long>();
    try (Connection c = HarvestDBConnection.get();) {
        // Follow the chain of originating IDs back
        for (Long originatingHarvest = thisHarvest; originatingHarvest != null;
                // Compute next originatingHarvest
                originatingHarvest = DBUtils.selectFirstLongValueIfAny(c, "SELECT previoushd FROM fullharvests"
                        + " WHERE fullharvests.harvest_id=?", originatingHarvest)) {
            if (!originatingHarvest.equals(thisHarvest)) {
                results.add(originatingHarvest);
            }
        }
        // Find the first harvest in the chain (but last in the list).
        Long firstHarvest = thisHarvest;
        if (!results.isEmpty()) {
            firstHarvest = results.get(results.size() - 1);
        }
        // Find the last harvest in the chain before
        Long olderHarvest = DBUtils.selectFirstLongValueIfAny(c, "SELECT fullharvests.harvest_id"
                + " FROM fullharvests, harvestdefinitions," + " harvestdefinitions AS currenthd"
                + " WHERE currenthd.harvest_id=?" + " AND fullharvests.harvest_id "
                + "= harvestdefinitions.harvest_id"
                + " AND harvestdefinitions.submitted " + "< currenthd.submitted"
                + " ORDER BY harvestdefinitions.submitted " + HarvestStatusQuery.SORT_ORDER.DESC.name(),
                firstHarvest);
        // Follow the chain of originating IDs back
        for (Long originatingHarvest = olderHarvest; originatingHarvest != null; originatingHarvest = DBUtils
                .selectFirstLongValueIfAny(c, "SELECT previoushd FROM fullharvests"
                        + " WHERE fullharvests.harvest_id=?", originatingHarvest)) {
            results.add(originatingHarvest);
        }
    } catch (SQLException e) {
        log.warn("Exception thrown while updating fullharvests.isindexready field: {}",
                ExceptionUtils.getSQLExceptionCause(e), e);
    }
    return results;
} |
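To make the three steps of the method above easier to follow, here is a minimal in-memory sketch of the same traversal. It is an illustration under assumptions, not the DAO's implementation: `PreviousHarvestChainSketch`, `FullHarvest`, and `exampleTable()` are hypothetical names, a map replaces the joined 'fullharvests' and 'harvestdefinitions' tables, a null `previousHd` models "no predecessor", and ties between equal submission times are not modelled.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical in-memory version of the chain traversal; not NetarchiveSuite code. */
class PreviousHarvestChainSketch {

    /** One row: the previoushd link (null if none) and a sortable submission time. */
    static final class FullHarvest {
        final Long previousHd;
        final long submitted;

        FullHarvest(Long previousHd, long submitted) {
            this.previousHd = previousHd;
            this.submitted = submitted;
        }
    }

    private static Long previousOf(Map<Long, FullHarvest> table, Long id) {
        FullHarvest row = table.get(id);
        return row == null ? null : row.previousHd;
    }

    static List<Long> previousFullHarvests(Map<Long, FullHarvest> table, long thisHarvest) {
        List<Long> results = new ArrayList<>();
        // Step 1: follow the previoushd chain back from thisHarvest (excluding itself).
        for (Long id = previousOf(table, thisHarvest); id != null; id = previousOf(table, id)) {
            results.add(id);
        }
        // Step 2: the oldest harvest of that chain is the last element of the list.
        Long firstHarvest = results.isEmpty() ? Long.valueOf(thisHarvest) : results.get(results.size() - 1);
        // Step 3: find the most recently submitted full harvest that predates firstHarvest.
        FullHarvest first = table.get(firstHarvest);
        Long olderHarvest = null;
        if (first != null) {
            for (Map.Entry<Long, FullHarvest> e : table.entrySet()) {
                long s = e.getValue().submitted;
                if (s < first.submitted && (olderHarvest == null || s > table.get(olderHarvest).submitted)) {
                    olderHarvest = e.getKey();
                }
            }
        }
        // Step 4: follow that older harvest's chain back too, this time including its head.
        for (Long id = olderHarvest; id != null; id = previousOf(table, id)) {
            results.add(id);
        }
        return results;
    }

    /** Two chains: 2 -> 1 (submitted at 200 and 100) and 4 -> 3 (submitted at 400 and 300). */
    static Map<Long, FullHarvest> exampleTable() {
        Map<Long, FullHarvest> table = new HashMap<>();
        table.put(1L, new FullHarvest(null, 100));
        table.put(2L, new FullHarvest(1L, 200));
        table.put(3L, new FullHarvest(null, 300));
        table.put(4L, new FullHarvest(3L, 400));
        return table;
    }

    public static void main(String[] args) {
        System.out.println(previousFullHarvests(exampleTable(), 4L)); // prints [3, 2, 1]
    }
}
```

For harvest 4 the sketch first collects its own chain (3), then jumps to the newest full harvest submitted before harvest 3 (harvest 2) and collects that chain as well (2, 1). Whether jumping across unrelated chains in this way is the intended behaviour is the kind of question the "possibly incorrect" remark above raises.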
Classes involved in this workflow:
- harvester/harvester-core/src/main/java/dk/netarkivet/harvester/webinterface/SnapshotHarvestDefinition.java, ll. 251-299 (esp. 267-282)
- harvester/harvest-scheduler/src/main/java/dk/netarkivet/harvester/scheduler/HarvestSchedulerMonitorServer.java, ll. 196-224
- harvester/harvester-core/src/main/java/dk/netarkivet/harvester/indexserver/distribute/IndexRequestServer.java, ll. 416-423
- harvester/harvester-core/src/main/java/dk/netarkivet/harvester/datamodel/HarvestDefinitionDBDAO.java, ll. 1167-1187, 1189-1233
- harvester/harvester-core/src/main/java/dk/netarkivet/harvester/indexserver/distribute/IndexRequestClient.java, ll. 358-383