7.0 Release Date: 2021-03-19

7.1 Forthcoming

Highlights in 7.1

  1. Added 3 new link extractors (from the British Library) to Heritrix:
    • org.archive.modules.extractor.ExtractorRobotsTxt
    • org.archive.modules.extractor.ExtractorSitemap
    • org.archive.modules.extractor.ExtractorJson
  2. Added caching of crawl logs when Hadoop is used for processing
  3. Added caching of metadata-file indexes when Hadoop is used for processing
  4. Added retry functionality to improve the robustness of the WarcRecordClient
  5. Fixed a bug whereby files uploaded from a harvester were not being deleted when the Bitmagasin backend is in use

Highlights in 7.0

NetarchiveSuite 7.0 introduces an entirely new backend storage and mass-processing implementation based on software from bitrepository.org and Hadoop. The new functionality is enabled by defining the following key in the settings file for all applications:

<settings>
   <common>
      <arcrepositoryClient>
         <class>dk.netarkivet.archive.arcrepository.distribute.BitmagArcRepositoryClient</class>
      </arcrepositoryClient>
   </common>
</settings>

and additionally:

<settings>
   <common>
      <useBitmagHadoopBackend>true</useBitmagHadoopBackend>
   </common>
</settings>
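
Taken together, the two keys might appear in a single settings file roughly as follows. This is only a sketch: it assumes both keys sit under the same <common> element, as the fragments above suggest, and it omits every other setting an application needs.

```xml
<settings>
   <common>
      <!-- Select the bitrepository-based arcrepository client -->
      <arcrepositoryClient>
         <class>dk.netarkivet.archive.arcrepository.distribute.BitmagArcRepositoryClient</class>
      </arcrepositoryClient>
      <!-- Enable the new Bitmag/Hadoop backend -->
      <useBitmagHadoopBackend>true</useBitmagHadoopBackend>
   </common>
</settings>
```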

The older arcrepositoryClient implementation dk.netarkivet.archive.arcrepository.distribute.JMSArcRepositoryClient will be deprecated in future releases. (The developers are unaware of any other organisations currently using the older client, but please contact us if you still rely on it.)

The new architecture introduces many new keys and external configuration files. There is therefore a separate Guide To Configuring the NetarchiveSuite 7.0 Backend.

Upgrading From Previous NetarchiveSuite Releases

For those using either JMSArcRepositoryClient or LocalArcRepositoryClient there should be no special requirements to upgrade.

Issues Resolved in Release 7.0


Most-recent updates for 7.0: