Participants?: Mikis, Sara, Andreas, Søren, Nicholas 

How to switch production from ARC to WARC. What are the impacts on harvesting, access and preservation?

Location: 01.93, basement

Participants: All

When do you intend to switch your production workflow?

  • Is there a period when you will produce both ARCs and WARCs?

    Panel

    Netarkivet.dk: No.

  • Will you migrate your legacy ARCs to WARCs, and when?

    Panel

    Both Netarkivet.dk and BnF see this as a longer-term priority.

  • If yes, what kind of tools will you use for the migration?

    Panel

    This hasn't been decided. Both JWAT and Hanzo's WARC tools should be considered. Neither has production-ready migration functionality yet.

  • During the period when the two formats will exist in parallel
    • What is the estimated duration of this period?

      Panel

      Depends on when the full migration of the archive will be done. It will probably be a long period.

    • To what kind of files will you give access? Are the Wayback Machine and other tools able to manage ARCs and WARCs in parallel?

      Panel

      The 4.0 release will include functionality for accessing WARC files through Wayback using the NAS archive. Wayback supports local WARC access out-of-the-box.
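While both formats exist in parallel, any tool reading the archive has to tell ARC files from WARC files. The following is a minimal sketch of one way to do that by inspecting the first bytes of a file. It is not taken from NetarchiveSuite or Wayback. It assumes only that an ARC file opens with a `filedesc://` version block, that a WARC record opens with a `WARC/` version line, and that either may be gzip-compressed. The function names `sniff_format` and `sniff_file` are hypothetical.

```python
import zlib

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of any gzip member


def sniff_format(head: bytes) -> str:
    """Classify the leading bytes of a file as 'arc', 'warc' or 'unknown'."""
    if head[:2] == GZIP_MAGIC:
        try:
            # wbits=31 selects gzip framing; decompress only the first bytes.
            head = zlib.decompressobj(wbits=31).decompress(head, 64)
        except zlib.error:
            return "unknown"
    if head.startswith(b"WARC/"):
        return "warc"
    if head.startswith(b"filedesc://"):
        return "arc"
    return "unknown"


def sniff_file(path: str) -> str:
    """Read the start of the file at `path` and classify it."""
    with open(path, "rb") as handle:
        return sniff_format(handle.read(512))
```

Checking content rather than the file extension means a renamed or migrated file is still routed to the right reader.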

Specific question on harvesting

  • Do you need to change the harvesting profiles?

Specific question on preservation

  • Is the bit repository system able to manage WARCs?

    Panel

    Yes, as of 4.0.

  • If you migrate your legacy ARCs to WARCs, will you keep the original ARCs?

    Panel

    BnF will probably keep a tape copy. This is undecided for DK.

What kind of costs will you have for this transition?

  • Software, machine and human costs

    Panel

    This depends on the migration strategy.

  • Do you have other web harvesting processes (not internal to NetarchiveSuite) that write WARC files?

    Panel

    KB is already using WARC for a number of archives, including legacy content harvested through non-Heritrix tools.

  • Will you use common tools for web archives and other digital resources in WARC?

    Panel

    Yes, see above.