...

A status update from the beginning of August was sent to the PWG and is accessible from this link: jhonas-project-status-aug.pdf

Testing of WARC implementation in 3.21.

...

JHoNas JWAT/JHove2 status

All JHove2 Modules seem to work. Thomas Ledoux is working on containerMD.xsl.

Thomas Ledoux has been testing the different modules, and a number of issues have been fixed in JWAT/JHove2.

Current issues: WARC-Target-URI validation is too strict, unit test modules, and JHove2 does not remove temp files with the -t option.

And, as usual, the JWAT library still needs to be finished.

JHoNas NAS status

NAS-1965
--------
Make it possible to use either ARC or WARC as the harvesting format.

Done, needs unit testing.

NAS-1960
--------
Extend our BatchJob framework to handle WARC-files on record level

Done, needs unit testing.

Besides a WARCBatchJob also ArchiveBatchJob has been implemented for batch jobs running on both ARC and WARC.
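As an illustration of what a job running on both formats has to decide first, here is a minimal Python sketch that tells ARC from WARC by the first bytes of the file. It is not NetarchiveSuite code (which is Java); the function names `detect_archive_format` and `sniff` are hypothetical. It assumes the well-known file signatures: an ARC file opens with a `filedesc://` version block, a WARC file opens with a `WARC/` version line, and gzip-compressed files start with the bytes `1f 8b`.

```python
import gzip


def detect_archive_format(head: bytes) -> str:
    """Return 'warc' or 'arc' based on the first bytes of uncompressed data."""
    if head.startswith(b"WARC/"):
        # Every WARC record begins with a version line such as "WARC/1.0".
        return "warc"
    if head.startswith(b"filedesc://"):
        # The first record of an ARC file is the filedesc version block.
        return "arc"
    raise ValueError("unrecognised archive format")


def sniff(raw: bytes) -> str:
    """Detect the format of raw file content, decompressing gzip if needed."""
    if raw[:2] == b"\x1f\x8b":
        # gzip magic number; decompress before looking at the signature.
        raw = gzip.decompress(raw)
    return detect_archive_format(raw)
```

A format-neutral job could call something like `sniff` once per file and then hand the records to an ARC or a WARC reader. Dispatching on the file extension is the obvious alternative; checking the content is more robust against misnamed files.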

NAS-1958
--------
Replace the "ARCWriterProcessor" with "WARCWriterProcessor" in our Heritrix templates.

Tested in local installation.

NAS-1959
--------
Implement CDX-generating code that also works for WARC-files

Done, needs unit testing.

NAS-1962
--------
Store the contents of the metadata-1.arc files as WARC-records

Done, needs unit testing. Problems with WARC and content-length=0.
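The note above does not say what the problems with content-length=0 were. As a sketch of the edge case only, the Python reader below parses uncompressed WARC records and treats an empty block as a valid record rather than as end of file. The function name `read_warc_record` is hypothetical, and the code assumes the record layout from the WARC specification: a version line, header lines, a blank line, the block, then two CRLF pairs.

```python
import io


def read_warc_record(stream):
    """Read one uncompressed WARC record.

    Returns (headers, payload), or None when the stream is exhausted.
    """
    version = stream.readline()
    if not version:
        return None  # clean end of file: no further record starts here
    if not version.startswith(b"WARC/"):
        raise ValueError("not a WARC record")

    headers = {}
    while True:
        line = stream.readline()
        if line in (b"\r\n", b"\n"):
            break  # blank line ends the header block
        if not line:
            raise ValueError("truncated header block")
        name, _, value = line.decode("utf-8").partition(":")
        headers[name.strip().lower()] = value.strip()

    length = int(headers["content-length"])
    # read(0) returns b"" for an empty block; this is a valid record,
    # so it must not be mistaken for end of file.
    payload = stream.read(length)
    if len(payload) != length:
        raise ValueError("truncated record block")

    # Each record is followed by two CRLF pairs, even when the block is empty.
    if stream.read(4) != b"\r\n\r\n":
        raise ValueError("missing record trailer")
    return headers, payload
```

The sample records used to exercise this sketch can omit mandatory headers such as WARC-Record-ID and WARC-Date; a real reader would also validate those.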

NAS-1964
--------
Upgrade of Indexserver system

Done, needs unit testing. Problems with WARC and content-length=0.

NAS-2091
--------
Add documentation for WARC usage in NetarchiveSuite

N/A

NAS-2090
--------
Add documentation for ARC usage in NetarchiveSuite

N/A

NAS-2061
--------
Define the layout of the metadata WARC file

Currently it is a mirror of the ARC file.

NAS-2055
--------
Extend the built-in WARCWriterProcessor to allow for functionality required by NetarchiveSuite

N/A

NAS-2070
--------
WARC-enable the dk.netarkivet.wayback.NetarchiveResourceStore

N/A

NAS-1961
--------
NAS-1720 Upgrade or remove dk.netarkivet.viewerproxy.LocalCDXCache (deprecated, and uses inline CDXCacheBatchJob)

N/A

Moved source code to GitHub?

...