From NetarchiveSuite 7.0, the software supports an alternative backend based largely on off-the-shelf components, which reduces the maintenance burden of the NetarchiveSuite installation itself. This architecture has been successfully implemented at the Danish Netarkivet, drawing on existing experience with bitrepository.org and Hadoop software. It must, however, be emphasised that establishing such an architecture is a complex operation and that the necessary services come with their own maintenance burden. In the following we give a brief description of the components involved. Anyone considering implementing such an architecture themselves is advised to contact Netarkivet at the Royal Danish Library for further advice.
Bitrepository Configuration
The new architecture is enabled by specifying dk.netarkivet.archive.arcrepository.distribute.BitmagArcRepositoryClient as the value of the configuration parameter settings.common.arcrepositoryClient.class.
With this set, the rest of the arcrepositoryClient settings look like:
<arcrepositoryClient>
  <class>dk.netarkivet.archive.arcrepository.distribute.BitmagArcRepositoryClient</class>
  <bitrepository>
    <storeMaxPillarFailures>0</storeMaxPillarFailures>
    <store_retries>3</store_retries>
    <retryWaitSeconds>1800</retryWaitSeconds>
    <tempdir>arcrepositoryTemp</tempdir>
    <collectionID>netarkiv</collectionID>
    <usepillar>netarkivonline1</usepillar>
    <getTimeout>300000</getTimeout>
    <getFileIDsMaxResults>10000</getFileIDsMaxResults>
    <keyfilename>client-certkey.pem</keyfilename>
    <!-- <settingsDir> element is set per location or machine below -->
  </bitrepository>
</arcrepositoryClient>
Note in particular that <settingsDir> points to a directory containing a complete set of bitrepository.org RepositorySettings and ReferenceSettings. The keyfile named in <keyfilename> must also be located inside this directory. The RepositorySettings must grant the relevant permissions to each NetarchiveSuite application, based on the provided key. For HarvestController applications, these are PutFile permissions, which enable upload. For the NetarchiveSuite webGUI, Viewerproxy, and IndexServer applications, these are GetFileIDs and GetFile permissions.
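As an illustration only, a per-machine override of <settingsDir> might look like the fragment below. This is a sketch under stated assumptions: the path /home/netarkiv/bitmag-settings is hypothetical, and the element is assumed to sit inside <bitrepository> alongside the other settings shown above. Where the override belongs in your deployment configuration depends on how your installation is laid out.

```xml
<!-- Hypothetical per-machine override; the directory path is an example only -->
<settings>
  <common>
    <arcrepositoryClient>
      <bitrepository>
        <!-- Directory holding the bitrepository.org RepositorySettings and
             ReferenceSettings, plus the keyfile named in <keyfilename> -->
        <settingsDir>/home/netarkiv/bitmag-settings</settingsDir>
      </bitrepository>
    </arcrepositoryClient>
  </common>
</settings>
```

With the values used in this section, the directory would be expected to hold the RepositorySettings and ReferenceSettings files together with client-certkey.pem. An application whose key lacks the permissions described above should be expected to fail when it attempts the corresponding operation.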