Heritrix and NAS 7.1 version statements in WARC metadata and data files are inconsistent

Description

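In 7.1, the NAS and Heritrix version statements are inconsistent: the WARC-Target-URI of the metadata records still reports heritrixVersion=3.4.0-20200518-NAS-6.0, while the warcinfo of the data files reports a SNAPSHOT build string. The 6.0 and 7.1 statements are compared below: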

 

In the current 6.0 release:
For metadata:
in warcinfo, we have:
software: NetarchiveSuite/Version: 6.0 (https://github.com/netarchivesuite/netarchivesuite/commit/c8213774bc921069050ae87283294a587b0b3365)/https://sbforge.org/display/NAS

in warc records:
WARC-Target-URI: metadata://netarchivesuite.bnf.fr/crawl/setup/crawler-beans.cxml?heritrixVersion=3.4.0-20200518-NAS-6.0&harvestid=33&jobid=36951

For data:
in warcinfo:
software: Heritrix/3.4.0-20200518-NAS-6.0 http://crawler.archive.org

#added by NetarchiveSuite Version: 6.0 (https://github.com/netarchivesuite/netarchivesuite/commit/c8213774bc921069050ae87283294a587b0b3365)

 

In the 7.1 release:
For metadata:
in warcinfo, we have:
software: NetarchiveSuite/Version: 7.1 (https://github.com/netarchivesuite/netarchivesuite/commit/1d53f8bcdc078160b94774ca5bceb31263ca8355)/https://sbforge.org/display/NAS

in warc records:
WARC-Target-URI: metadata://netarchivesuite.bnf.fr/crawl/setup/crawler-beans.cxml?heritrixVersion=3.4.0-20200518-NAS-6.0&harvestid=64&jobid=36909

For data:
in warcinfo, we have:
software: Heritrix/3.4.0-NAS-7.1-SITEMAP-SNAPSHOT-2021-05-25T06:26:16Z http://crawler.archive.org

#added by NetarchiveSuite Version: 7.1 (https://github.com/netarchivesuite/netarchivesuite/commit/1d53f8bcdc078160b94774ca5bceb31263ca8355)
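
The mismatch can be detected mechanically by extracting the NAS version from each statement and comparing the results. The following is a minimal sketch, not part of NetarchiveSuite: the function names and regular expressions are illustrative assumptions based only on the strings quoted in this ticket.

```python
import re
from urllib.parse import urlsplit, parse_qs

# Hypothetical helpers: the patterns are derived from the strings quoted
# in this ticket, not from any NetarchiveSuite or Heritrix specification.
NAS_IN_SOFTWARE = re.compile(r"NetarchiveSuite/Version:\s*(\d+(?:\.\d+)*)")
NAS_IN_HERITRIX = re.compile(r"NAS-(\d+(?:\.\d+)*)")


def nas_version_from_software(line):
    """NAS version from a 'software: NetarchiveSuite/Version: X (...)' line."""
    match = NAS_IN_SOFTWARE.search(line)
    return match.group(1) if match else None


def nas_version_from_heritrix(version):
    """NAS version embedded in a Heritrix version string such as '...-NAS-7.1...'."""
    match = NAS_IN_HERITRIX.search(version)
    return match.group(1) if match else None


def heritrix_version_from_target_uri(uri):
    """Value of the heritrixVersion query parameter of a metadata:// URI."""
    values = parse_qs(urlsplit(uri).query).get("heritrixVersion")
    return values[0] if values else None


def is_consistent(software_line, target_uri):
    """True when the warcinfo and the WARC-Target-URI name the same NAS version."""
    from_uri = heritrix_version_from_target_uri(target_uri)
    if from_uri is None:
        return False
    return nas_version_from_software(software_line) == nas_version_from_heritrix(from_uri)
```

Applied to the 7.1 metadata statements above, the warcinfo yields 7.1 while the WARC-Target-URI yields 6.0, so `is_consistent` returns False.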

 

We should harmonize the Heritrix version statements. Proposal:
Heritrix/3.4.0-NAS-7.1-SITEMAP-SNAPSHOT-2021-05-25T06:26:16Z ==> Heritrix/3.4.0-NAS-7.1
heritrixVersion=3.4.0-20200518-NAS-6.0 ==> heritrixVersion=3.4.0-20210525-NAS-7.1
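
As an illustration of the proposal, both target forms could be derived from the single build string that Heritrix currently reports. This is only a sketch under assumptions: the helper names are hypothetical, and taking the date in the heritrixVersion form from the snapshot timestamp is a guess based on the example values above (2021-05-25 and 20210525).

```python
import re

# Assumed layout of the build string, inferred from the single example in
# this ticket: <base>-NAS-<nas>[-<qualifiers>-<YYYY-MM-DD>T<hh:mm:ss>Z]
BUILD_PATTERN = re.compile(
    r"^(?P<base>\d+\.\d+\.\d+)-NAS-(?P<nas>\d+(?:\.\d+)*)"
    r"(?:-.*?(?P<date>\d{4}-\d{2}-\d{2})T[\d:]+Z)?$"
)


def _parse(build):
    """Split a build string into its named parts, or raise ValueError."""
    match = BUILD_PATTERN.match(build)
    if match is None:
        raise ValueError(f"unrecognized Heritrix build string: {build!r}")
    return match


def software_form(build):
    """Short form proposed for the warcinfo 'software:' field."""
    match = _parse(build)
    return f"{match['base']}-NAS-{match['nas']}"


def uri_form(build):
    """Dated form proposed for the heritrixVersion query parameter."""
    match = _parse(build)
    if match["date"] is None:
        raise ValueError("build string carries no snapshot date")
    return f"{match['base']}-{match['date'].replace('-', '')}-NAS-{match['nas']}"
```

With the 7.1 build string, `software_form` gives `3.4.0-NAS-7.1` and `uri_form` gives `3.4.0-20210525-NAS-7.1`, matching the two right-hand sides of the proposal.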

Details

Organization

BNF

Created July 16, 2021 at 2:50 PM
Updated July 16, 2021 at 2:50 PM