Note that this documentation is for the old 5.0 release.
For the newest documentation, please see the current release documentation.

Configuration Basics - NetarchiveSuite Settings

Much of the behaviour of NetarchiveSuite tools and applications can be controlled using settings. Some settings must be updated for a distributed system to work; others work best with their default values.

The basics of settings and default settings are described below. For a description of how to tailor the configuration to the individual applications, please refer to the Installation Manual.

Setting basics

All NetarchiveSuite applications are based on the same type of configuration: keys are mapped to values, and the mappings can be set either in a settings file written in XML or on the command line. If no value is specified for a given configuration key, a default value is used.

The keys are defined in a hierarchy. When naming the keys, we separate the levels in a key with dots, for instance:

    settings.common.http.port=8076

When describing the same keys in XML, we use the XML hierarchy:

<settings>
  <common>
    <http>
      <port>8076</port>
    </http>
  </common>
</settings>
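
The correspondence between the dotted notation and the XML hierarchy can be illustrated with a minimal Java sketch. It is not NetarchiveSuite code: the class DottedKeyLookup and its method values are hypothetical names chosen for illustration, and only the standard JDK XML parser is used. Each level of the dotted key selects the child elements with that name, starting from the root element.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

/** Hypothetical helper for illustration only; not part of NetarchiveSuite. */
final class DottedKeyLookup {

    /** Returns the text of every element reached by following the dotted key. */
    static List<String> values(String xml, String dottedKey) {
        Element root;
        try {
            root = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)))
                    .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse settings XML", e);
        }
        String[] levels = dottedKey.split("\\.");
        List<Element> current = new ArrayList<>();
        // The first level must match the root element, e.g. "settings".
        if (root.getTagName().equals(levels[0])) {
            current.add(root);
        }
        // Every further level selects the direct children with that name.
        for (int i = 1; i < levels.length; i++) {
            List<Element> next = new ArrayList<>();
            for (Element parent : current) {
                NodeList children = parent.getChildNodes();
                for (int j = 0; j < children.getLength(); j++) {
                    Node child = children.item(j);
                    if (child instanceof Element
                            && ((Element) child).getTagName().equals(levels[i])) {
                        next.add((Element) child);
                    }
                }
            }
            current = next;
        }
        List<String> result = new ArrayList<>();
        for (Element element : current) {
            result.add(element.getTextContent().trim());
        }
        return result;
    }

    public static void main(String[] args) {
        String xml = "<settings><common><http><port>8076</port></http></common></settings>";
        System.out.println(values(xml, "settings.common.http.port")); // prints [8076]
    }
}
```

The sketch returns a list rather than a single string because an element name may occur more than once at the same level. A key that does not exist in the XML gives an empty list.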

Setting keys with multiple values

Some settings allow a list of values, rather than just one value. For instance:

<settings>
  <archive>
    <bitarchive>
      <baseFileDir>/mnt/storage1</baseFileDir>
      <baseFileDir>/mnt/storage2</baseFileDir>
    </bitarchive>
  </archive>
</settings>

It is only possible to specify multiple values using configuration files. This cannot be done on the command line.

If you specify more than one settings file, the first settings file to contain a value for the key specifies all values. Values from the settings files will not be merged.

As an example, consider the following two settings files:

settings1:

<settings>
  <archive>
    <bitarchive>
      <baseFileDir>/mnt/storage1</baseFileDir>
      <baseFileDir>/mnt/storage2</baseFileDir>
    </bitarchive>
  </archive>
</settings>

settings2:

<settings>
  <archive>
    <bitarchive>
      <baseFileDir>/mnt/storage3</baseFileDir>
      <baseFileDir>/mnt/storage4</baseFileDir>
    </bitarchive>
  </archive>
</settings>

The following command will give the value /mnt/storage5:

  java -Ddk.netarkivet.settings.file=settings1.xml:settings2.xml -Dsettings.archive.bitarchive.baseFileDir=/mnt/storage5 dk.netarkivet.common.webinterface.GUIApplication

The following command will give the values /mnt/storage1 and /mnt/storage2:

  java -Ddk.netarkivet.settings.file=settings1.xml:settings2.xml dk.netarkivet.common.webinterface.GUIApplication

The following command will give the values /mnt/storage3 and /mnt/storage4:

  java -Ddk.netarkivet.settings.file=settings2.xml:settings1.xml dk.netarkivet.common.webinterface.GUIApplication
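
The lookup order shown by the three commands above can be sketched in Java as follows. This is an illustration under stated assumptions, not the NetarchiveSuite implementation: the class SettingsResolutionSketch and its method resolve are hypothetical, and each settings file is modelled as a plain map from key to list of values instead of being parsed from XML. A value given on the command line wins. Otherwise the first settings file containing the key supplies all values. If no file contains the key, the default is used.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch of the lookup order; not the NetarchiveSuite implementation. */
final class SettingsResolutionSketch {

    static List<String> resolve(String key,
                                Map<String, String> commandLine,
                                List<Map<String, List<String>>> settingsFiles,
                                Map<String, List<String>> defaults) {
        // 1. The command line can only carry a single value per key.
        String override = commandLine.get(key);
        if (override != null) {
            return Collections.singletonList(override);
        }
        // 2. The first file that has the key specifies all values; no merging.
        for (Map<String, List<String>> file : settingsFiles) {
            List<String> values = file.get(key);
            if (values != null && !values.isEmpty()) {
                return values;
            }
        }
        // 3. Fall back to the default values, if any.
        return defaults.getOrDefault(key, Collections.emptyList());
    }

    public static void main(String[] args) {
        String key = "settings.archive.bitarchive.baseFileDir";
        Map<String, List<String>> settings1 =
                Collections.singletonMap(key, Arrays.asList("/mnt/storage1", "/mnt/storage2"));
        Map<String, List<String>> settings2 =
                Collections.singletonMap(key, Arrays.asList("/mnt/storage3", "/mnt/storage4"));
        Map<String, String> noCommandLine = Collections.emptyMap();
        Map<String, List<String>> noDefaults = Collections.emptyMap();

        // prints [/mnt/storage5]
        System.out.println(resolve(key, Collections.singletonMap(key, "/mnt/storage5"),
                Arrays.asList(settings1, settings2), noDefaults));
        // prints [/mnt/storage1, /mnt/storage2]
        System.out.println(resolve(key, noCommandLine,
                Arrays.asList(settings1, settings2), noDefaults));
        // prints [/mnt/storage3, /mnt/storage4]
        System.out.println(resolve(key, noCommandLine,
                Arrays.asList(settings2, settings1), noDefaults));
    }
}
```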

Default Settings

The NetarchiveSuite package includes default XML settings files. Their values are used to initialize classes unless they are overridden by separate settings files or on the command line (please refer to the Installation Manual).

The NetarchiveSuite settings have five main levels under the top settings level:

  • common
  • harvester
  • archive
  • monitor
  • wayback

All settings are defined within these five main levels. In addition there is a separate set of settings used only by the deploy application.

The NetarchiveSuite package includes default values for most defined settings. These are defined in XML settings files that are used to initialize classes, one for each main level and one for each plug-in. (TODO: Name the exceptions). The default settings files can be found in the NetarchiveSuite source tree. For each setting there is a corresponding Java variable or constant, and the settings are documented in Javadoc in the relevant classes. The settings files and the relevant classes are as follows:

Settings file: ./common/common-core/src/main/resources/dk/netarkivet/common/settings.xml
Java class(es): dk.netarkivet.common.CommonSettings, dk.netarkivet.common.utils.Settings

Settings file: ./harvester/heritrix3/heritrix3-controller/src/main/resources/dk/netarkivet/harvester/heritrix3/settings.xml
Java class(es): dk.netarkivet.harvester.heritrix3.Heritrix3Settings

Settings file: ./archive/archive-core/src/main/resources/dk/netarkivet/archive/settings.xml
Java class(es): dk.netarkivet.archive.ArchiveSettings

Settings file: ./monitor/monitor-core/src/main/resources/dk/netarkivet/monitor/settings.xml
Java class(es): dk.netarkivet.monitor.MonitorSettings

Settings file: ./wayback/wayback-indexer/src/main/resources/dk/netarkivet/wayback/settings.xml
Java class(es): dk.netarkivet.wayback.WaybackSettings

The meaning of the individual settings is documented in the Javadoc of the associated settings classes listed above. The main parts are summarized below.

Common part

In the common part of the settings, we have general-purpose settings (e.g. settings.common.tmpDir, settings.common.http.port) and settings that allow us to select plug-ins and their associated arguments (e.g. settings.common.RemoteFile.class, settings.common.jms.broker, settings.common.arcrepositoryClient, and settings.common.indexClient.class). Furthermore, there are dedicated default values for specific plug-in classes, defined in separate settings files. All of these are referred to as part of the common part, but are defined with the plug-in itself. Please see the section "Plug-in default settings".

Harvester part

In the harvester part of the settings, we have settings configuring the harvesting process: scheduling, job splitting, etc. Most of these settings are used by the scheduler in the DefinitionsSiteSection of the GUIApplication.

Archive part

In the archive part of the settings, we have settings related to archive access (e.g. certain timeouts, and the replicas and their credentials, are defined here). The behaviour of the BitarchiveApplications is also set here.

Monitor part

In the monitor part of the settings, we have settings for the monitoring shown in the System State, e.g. the JMX user name and password and the number of logged lines shown.

Plug-in default settings

At the moment, the following plug-ins have associated default settings, defined in the following classes; their documentation can be found in the Javadoc: