Make it possible to use either ARC or WARC as the harvesting format.

Description

There appears to be a wish for NetarchiveSuite to support harvesting in either format, ARC or WARC.

This should be a system-wide setting in the harvester settings.

Attachments

1

Activity

Nicholas Clarke, September 5, 2012 at 2:08 PM

Tested on two installations, one using ARC and the other using WARC.
Harvesting and batch jobs work with both ARC and WARC.

Nicholas Clarke, August 24, 2012 at 2:58 PM

Standalone NAS deployment file, set to use WARC. It can be used as in the QUICKSTART.

Nicholas Clarke, August 24, 2012 at 2:55 PM

The Heritrix crawler format is set with the following setting:
settings.harvester.harvesting.heritrix.archiveFormat=<arc/warc>

The metadata format is set with the following setting:
settings.harvester.harvesting.metadata.metadataFormat=<arc/warc>
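
For illustration, the two settings above might appear in a settings file roughly as follows. This is a minimal sketch, assuming that each dotted setting name maps onto nested XML elements of the same names; the surrounding file layout, any namespace attributes on the root element, and the choice of warc for both values are assumptions, not taken from this issue.

```xml
<!-- Hypothetical fragment: element nesting is assumed to mirror the dotted setting names. -->
<settings>
  <harvester>
    <harvesting>
      <heritrix>
        <!-- Format written by the Heritrix crawler: arc or warc -->
        <archiveFormat>warc</archiveFormat>
      </heritrix>
      <metadata>
        <!-- Format of the generated metadata file: arc or warc -->
        <metadataFormat>warc</metadataFormat>
      </metadata>
    </harvesting>
  </harvester>
</settings>
```

Since the description asks for a system-wide choice, it would seem sensible to give both settings the same value, although the comment above does not state that this is required.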

Fixed

Details

Assignee

Reporter

Accuracy of estimate

Rough

Original estimate

Time tracking

No time logged
3d 3h remaining

Components

Fix versions

Priority

Created September 30, 2011 at 11:12 AM
Updated February 16, 2016 at 5:28 PM
Resolved September 5, 2012 at 2:08 PM