Pluggable parts
Some points in NetarchiveSuite can be swapped out for other implementations using a plugin architecture.
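As a rough illustration of what such a plugin architecture can look like, the sketch below reads a class name from a settings map and instantiates it by reflection. It is not NetarchiveSuite's actual loading code: the names PluginLoaderSketch, Greeter, PlainGreeter, load and the settings key are all hypothetical and chosen only for this example.

```java
import java.util.Map;

public class PluginLoaderSketch {
    /** Hypothetical plugin point; stands in for any swappable component. */
    public interface Greeter {
        String greet(String name);
    }

    /** One hypothetical implementation that a settings entry could name. */
    public static class PlainGreeter implements Greeter {
        @Override
        public String greet(String name) {
            return "Hello, " + name;
        }
    }

    /**
     * Looks up the class name stored under the given key and creates an
     * instance through its public no-argument constructor.
     */
    public static <T> T load(Map<String, String> settings, String key, Class<T> type) {
        String className = settings.get(key);
        if (className == null) {
            throw new IllegalArgumentException("No class configured for " + key);
        }
        try {
            Class<?> cls = Class.forName(className);
            // cast() fails fast if the configured class has the wrong type
            return type.cast(cls.getDeclaredConstructor().newInstance());
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Cannot load plugin " + className, e);
        }
    }

    public static void main(String[] args) {
        // The settings key is an assumption made for this sketch only.
        Map<String, String> settings =
                Map.of("settings.example.greeter.class", PlainGreeter.class.getName());
        Greeter greeter = load(settings, "settings.example.greeter.class", Greeter.class);
        System.out.println(greeter.greet("archive"));
    }
}
```

With this pattern, swapping an implementation only requires changing the configured class name; callers depend on the interface alone.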
RemoteFile
The RemoteFile interface defines how large chunks of data are transferred between machines in a NetarchiveSuite installation. This is necessary because JMS has a relatively low limit on message size, well below the several hundred megabytes to over a gigabyte that an ARC or WARC file can easily hold.
The RemoteFile interface is defined by the Java interface RemoteFile.
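The idea of keeping bulk data out of the message itself can be sketched as follows: the message carries only a small serializable handle, and the receiver uses the handle to fetch the bytes separately. This is a minimal sketch, not the real RemoteFile API. TransferHandle, SharedPathHandle and their methods are hypothetical names, and the example assumes that sender and receiver can both read the same path (for instance on a shared file system).

```java
import java.io.IOException;
import java.io.Serializable;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;

public class RemoteHandleSketch {
    /** Hypothetical stand-in for a pluggable transfer handle. */
    public interface TransferHandle extends Serializable {
        /** Copies the referenced data to a local destination file. */
        void copyTo(Path destination) throws IOException;

        /** Size in bytes of the referenced data. */
        long size();
    }

    /** Assumption: the source path is readable from the receiving machine. */
    public static class SharedPathHandle implements TransferHandle {
        private final String source; // stored as String so the handle serializes
        private final long size;

        public SharedPathHandle(Path source) throws IOException {
            this.source = source.toString();
            this.size = Files.size(source);
        }

        @Override
        public void copyTo(Path destination) throws IOException {
            Files.copy(Paths.get(source), destination,
                    StandardCopyOption.REPLACE_EXISTING);
        }

        @Override
        public long size() {
            return size;
        }
    }

    public static void main(String[] args) throws IOException {
        // Sender side: wrap a (small, fake) archive file in a handle.
        Path data = Files.createTempFile("sample", ".warc");
        Files.write(data, new byte[1024]);
        TransferHandle handle = new SharedPathHandle(data);

        // Receiver side: only the handle travelled in the message.
        Path received = Files.createTempFile("received", ".warc");
        handle.copyTo(received);
        System.out.println("Sizes match: " + (Files.size(received) == handle.size()));
    }
}
```

Other implementations could move the bytes over FTP or HTTP instead; the message size stays small in either case.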
JMSConnection
The JMSConnection provides access to a specific JMS connection. The default NetarchiveSuite distribution contains only one implementation, namely JMSConnectionSunMQ which uses Sun's OpenMQ. We recommend using this implementation, as other implementations have previously been found to violate some assumptions that NetarchiveSuite depends on.
The JMSConnection interface is defined by the abstract class JMSConnection.
Implementations of this interface need to implement its four abstract methods: getConnectionFactory(), getDestination(String destinationName), onException(JMSException e), and getQueueSession().
ArcRepositoryClient
The ArcRepositoryClient handles access to the Archive module, both upload and low-level access.
The ArcRepositoryClient interface is defined by the Java interface ArcRepositoryClient.
IndexClient
The IndexClient provides the Lucene indices that are used for deduplication and for viewerproxy access. It uses the ArcRepositoryClient to fetch data from the archive, and it implements several layers of caching, both of the fetched data and of the Lucene indices created from them. This is particularly important for deduplication during snapshot harvesting, as many harvest jobs need to reuse the same large deduplication index. It is advisable to clean up the cache directories regularly.
The IndexClient interface is defined by the Java interface JobIndexCache.
Archive Admin DBSpecifics
Defines functionality specific to the type of database used for the Archive Admin database; see the Javadoc for details.
Harvester DBSpecifics
Defines functionality specific to the type of database used for the Harvester module; see the Javadoc for details.
Notifications
The Notifications interface lets you choose how important error notifications are handled in your system. Two implementations exist: one sends emails, and one prints the messages to System.err. Adding more specialised plugins should be easy.
The Notifications interface is defined by the abstract class Notifications.
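To give a feel for what a specialised notification plugin might involve, here is a minimal sketch of an abstract notifier with one concrete subclass that writes to a stream. The class Notifier, its errorEvent methods, and StreamNotifier are hypothetical stand-ins written for this example; consult the Javadoc of the real Notifications class for the actual method signatures.

```java
import java.io.PrintStream;

public class NotificationSketch {
    /** Hypothetical stand-in for the abstract plugin class. */
    public abstract static class Notifier {
        /** Reports an error, optionally with the exception that caused it. */
        public abstract void errorEvent(String message, Throwable cause);

        /** Convenience overload for errors without an exception. */
        public void errorEvent(String message) {
            errorEvent(message, null);
        }
    }

    /** Writes notifications to a given stream, e.g. System.err. */
    public static class StreamNotifier extends Notifier {
        private final PrintStream out;

        public StreamNotifier(PrintStream out) {
            this.out = out;
        }

        @Override
        public void errorEvent(String message, Throwable cause) {
            out.println("[ERROR] " + message);
            if (cause != null) {
                cause.printStackTrace(out);
            }
        }
    }

    public static void main(String[] args) {
        Notifier notifier = new StreamNotifier(System.err);
        notifier.errorEvent("Upload failed", new IllegalStateException("disk full"));
    }
}
```

A new plugin would then only need to subclass the abstract class and route the message elsewhere, such as to a log aggregator or a chat channel.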
HeritrixController
The HeritrixController interface defines our interface for initializing a running Heritrix instance and communicating with it. In NetarchiveSuite 5.0 only one implementation is supported: the one for Heritrix 3.
ActiveBitPreservation
The ActiveBitPreservation interface defines our interface for initiating bit preservation actions from our GUI. We have a file-based implementation (now deprecated) and a database-based implementation. Both communicate with the archive through the ArcRepository interface.
The ActiveBitPreservation interface is defined by the Java interface ActiveBitPreservation.