Release Date: 17th September 2015

Planned release: Beginning of January 2015. The currently planned milestones are listed on the child pages.


Highlights

Heritrix 3

This is the first production release of NetarchiveSuite to support harvesting with Heritrix 3. Harvesting with Heritrix 1 is not supported in this version but may be supported in later versions.

Java 8

NetarchiveSuite now requires Java 8 to run.

More detailed jar file structure

The jar files now fit the module and application structure better. This means the deploy script should be updated to the following application <-> jar file classpath definitions:

  • The global deployClassPath should only contain one element: 

    Code Block
    <deployGlobal>
      <deployClassPath>lib/netarchivesuite-monitor-core.jar</deployClassPath>
      ....
  • GUIApplication: 

    Code Block
    <applicationName name="dk.netarkivet.common.webinterface.GUIApplication">
      <deployClassPath>lib/netarchivesuite-harvest-scheduler.jar</deployClassPath>
      <deployClassPath>lib/netarchivesuite-archive-core.jar</deployClassPath>
      ....
  • ArcRepositoryApplication: 

    Code Block
    <applicationName name="dk.netarkivet.archive.arcrepository.ArcRepositoryApplication">
      <deployClassPath>lib/netarchivesuite-archive-core.jar</deployClassPath>
      ....
  • BitarchiveMonitorApplication: 

    Code Block
    <applicationName name="dk.netarkivet.archive.bitarchive.BitarchiveMonitorApplication">
      <deployClassPath>lib/netarchivesuite-archive-core.jar</deployClassPath>
      ...
  • HarvestJobManagerApplication: 

    Code Block
    <applicationName name="dk.netarkivet.harvester.scheduler.HarvestJobManagerApplication">
      <deployClassPath>lib/netarchivesuite-harvest-scheduler.jar</deployClassPath>
      ....
  • BitarchiveApplication: 

    Code Block
    <applicationName name="dk.netarkivet.archive.bitarchive.BitarchiveApplication">
      <deployClassPath>lib/netarchivesuite-archive-core.jar</deployClassPath>
      ...
  • HarvestControllerApplication: 

    Code Block
    <applicationName name="dk.netarkivet.harvester.harvesting.HarvestControllerApplication">
      <deployClassPath>lib/netarchivesuite-heritrix1-controller.jar</deployClassPath>
      ...
  • IndexServerApplication: 

    Code Block
    <applicationName name="dk.netarkivet.harvester.indexserver.IndexServerApplication">
      <deployClassPath>lib/netarchivesuite-harvest-scheduler.jar</deployClassPath>
      <deployClassPath>lib/netarchivesuite-archive-core.jar</deployClassPath>
      ...
  • ViewerProxyApplication: 

    Code Block
    <applicationName name="dk.netarkivet.viewerproxy.ViewerProxyApplication">
      <deployClassPath>lib/netarchivesuite-harvest-scheduler.jar</deployClassPath>
      <deployClassPath>lib/netarchivesuite-archive-core.jar</deployClassPath>
      ...
  • ChecksumFileApplication: 

    Code Block
    <applicationName name="dk.netarkivet.archive.checksum.ChecksumFileApplication">
      <deployClassPath>lib/netarchivesuite-archive-core.jar</deployClassPath>
      ....

See Updating deploy file jar definitions to 5.0.
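
The snippets in the list above are truncated after the classpath entries. As a rough illustration only, a single application entry could look like the following once it is closed properly. The element and jar names are taken from the list; the surrounding deploy configuration and any application-specific settings are deliberately left out, and the comments mark where they would go.

```xml
<!-- Sketch only: the enclosing deploy configuration elements are omitted. -->
<applicationName name="dk.netarkivet.harvester.indexserver.IndexServerApplication">
  <!-- Jars for this application, as given in the list above -->
  <deployClassPath>lib/netarchivesuite-harvest-scheduler.jar</deployClassPath>
  <deployClassPath>lib/netarchivesuite-archive-core.jar</deployClassPath>
  <!-- Application-specific settings (elided in the original) would follow here -->
</applicationName>
```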

Heritrix3 bundler zip now needs to be supplied in the deploy

The Heritrix3 crawler code has been moved into a separate zip file to allow more flexibility in choosing a concrete H3 version to use. This means that the HarvestController deployment now needs to be configured with an extra H3 bundler zip. This can be done either as an argument to the deploy call or by defining a bundler for each HarvestControllerApplication in the deploy configuration.

Switch to Maven as build tool

The project is now built with Maven. This means that the jars, source jars and javadoc jars can be found in the https://sbforge.org/nexus/content/groups/public/ repository, and that NetarchiveSuite modules can be added as Maven dependencies in other projects, for example:

Code Block
<dependency>
  <groupId>org.netarchivesuite</groupId>
  <artifactId>nas-module</artifactId>
  <version>${nas.version}</version>
</dependency>
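
For the dependency to resolve, the consuming project also has to know about the repository mentioned above. The following pom.xml is a minimal sketch under stated assumptions: the repository id `sbforge-nexus`, the consuming project's coordinates (`org.example:nas-consumer`) and the version string `5.0` are hypothetical values chosen for illustration, and `nas-module` remains the placeholder artifact id from the example.

```xml
<project xmlns="http://maven.apache.org/POM/4.0.0">
  <modelVersion>4.0.0</modelVersion>

  <!-- Hypothetical coordinates of the consuming project -->
  <groupId>org.example</groupId>
  <artifactId>nas-consumer</artifactId>
  <version>0.1.0</version>

  <properties>
    <!-- Assumption: the published version string for this release -->
    <nas.version>5.0</nas.version>
  </properties>

  <repositories>
    <repository>
      <!-- The id is an arbitrary local name; the URL is the repository named in the text -->
      <id>sbforge-nexus</id>
      <url>https://sbforge.org/nexus/content/groups/public/</url>
    </repository>
  </repositories>

  <dependencies>
    <dependency>
      <groupId>org.netarchivesuite</groupId>
      <!-- Placeholder: replace with the actual module artifact id -->
      <artifactId>nas-module</artifactId>
      <version>${nas.version}</version>
    </dependency>
  </dependencies>
</project>
```

Keeping the version in a property means several NetarchiveSuite modules can be upgraded together by changing a single value.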

Source moved to GitHub

The NetarchiveSuite source code is now located on GitHub: https://github.com/netarchivesuite/netarchivesuite.

Full list of issues resolved in this release

Issue list from JIRA (up to 20 issues; columns: type, key, priority, summary), selected by the query:

project = NAS AND issuetype in standardIssueTypes() AND fixVersion = 5.0 AND NOT component = Test ORDER BY priority DESC, created ASC

Known issues

Issue list from JIRA (up to 20 issues; columns: type, key, priority, summary, fix versions), selected by the query:

project = NAS AND issuetype = Bug AND affectedVersion = 5.0 ORDER BY priority DESC, cf[10010] ASC, fixVersion ASC