Information on the BnF-Netarkivet.dk workshop at KB, held to define the WARC implementation work in NetarchiveSuite (NAS).

  • Place: KB
  • Time: April 2, 09:15 to 13:00 (end time unconfirmed).
  • Participants:
    • BnF: Clément, Sara and Sophie
    • KB: Nicholas & Søren
    • SB: Mikis 

Agenda

  • (1 hour) Recap on JHOVE2 module status.
    • Status for merge to HEAD of Nicholas's code.
      • Martha is aware of the problems with merging third-party code to HEAD. As JHOVE2 is a high priority for IIPC, we hope this will be addressed before or during the GA in Washington.
    • Status for JHOVE2 milestone, including demo.
      • A proposed criterion for validating the prototype release is that only the output of the JHOVE2 modules should be used (the code itself will be tested on the way to the final release).
      • Clément will mail Aaron regarding payment for the initial technical specification.
      • BnF will test the JHOVE2 release in May so that we can validate the first milestone.
      • As Nicholas has removed JHOVE2's ability to run in parallel, the performance of his code needs to be tested.
      • Nicholas will mail the basic code release to Aaron for testing (by Tomas (BnF), and also by Steve?), so that we are prepared to give input to the final release.
      • Nicholas and Clément should have a technical discussion about Nicholas's code during the GA. Topics would include the code merge to HEAD and parallelization. Perhaps Monday at 17:00.
      • KB will use the JHOVE2 WARC module for digital document characterization as part of preservation.
      • As WARC is now used more for non-web archiving, Clément is very interested in input on extensions to the WARC ISO standard.
      • We talked a bit about the possibility of a PDF module, which BnF will need. Perhaps a job for Nicholas?
      • BnF will send sample WARC files so that Nicholas can generate JHOVE2 output from them for inspection by BnF.
  • (10 minutes) Discuss the Jhonas presentation at the GA: project update (10-15 minute presentation on Tuesday) plus a half-day presentation at the PWG.
    1. General presentation of the project (WARC, JHOVE2, NAS): why and who.
    2. Summary of current status.
    3. Refer people to the detailed sessions.
    • Agenda for PWG (Clément / Nicholas):
      • More detailed breakdown of what is extracted by the JHOVE2 WARC module.
      • More detailed walkthrough of the metadata model that will be used in NetarchiveSuite (including ARC-WARC mapping) and of how metadata is handled in general in NAS.
      • Demo of the module.
      • A priority here is to show the value of this project to third parties and to listen to ideas for additional features.
  • (30 minutes) Discussion about NetarchiveSuite workshop at IIPC GA.
    • We should consider breaking the last part into a discussion track and a demo/hands-on track. Annick and Nicholas might handle the demo/hands-on part.
    • We should consider mailing the participants with an update on the agenda, asking about their expectations for the workshop and asking them to confirm their participation. Sara will request a list of participants from Abbie.
  • (1 hour) Review of the currently defined tasks: NAS-1720@jira.
  • (Afternoon) Leveraging the WARC format's possibility for adding metadata.
    • Define initial metadata model.
  • Mapping of NetarchiveSuite metadata to WARC warcinfo, metadata and named fields.
    • Clément (and Sophie and Sara) will write up the proposed WARC format specification and send it to the participants.
    • Nicholas will create a specification wiki page based on this. Participants will be advised to subscribe to changes to this page.
    • The additional NAS functionality needed to support the extended format (harvest info metadata, configurable file name format, etc.) will be defined by Nicholas (assisted by Søren and Mikis).

WARC in NAS format draft

(Attachment not available: WARC NAS specifications.doc)