Agenda for the joint BNF, ONB, SB and KB NetarchiveSuite tele-conference on August 14th 2012, 13:00-14:00.

Practical information

  • TDC tele-conference:
    • Dial in number (+45) 70 26 50 45
    • Dial in code 9064479#
  • Bridgit: The Bridgit conference will be available about 5 minutes before the start of the meeting. The Bridgit URL is konf01.statsbiblioteket.dk. The Bridgit password is sbview.

Participants

  • BNF: Nicholas, Sara
  • ONB: Michaela and Andreas
  • KB: Tue, Søren and Nicholas
  • SB: Colin and Mikis, Sabine
  • Any other issues to be discussed on today's tele-conference?

Heritrix 3 in NetarchiveSuite

 

JhoNAS status (Nicholas)

A status update from the beginning of August was sent to the PWG and is accessible from this link: jhonas-project-status-aug.pdf

JhoNAS JWAT/JHove2 status

All JHove2 Modules seem to work. Thomas Ledoux is working on containerMD.xsl.

Thomas Ledoux has been testing the different modules, and a number of issues have been fixed in JWAT/JHove2.

Current issues: WARC-Target-URI validation is too strict, unit test modules, jhove2 does not remove temp files with -t option.

And of course the usual, finish JWAT library...

JhoNAS NAS status
  • [Jira issue link unavailable]: Done, needs unit testing.
  • [Jira issue link unavailable]: Done, needs unit testing. Besides a WARCBatchJob, an ArchiveBatchJob has also been implemented for batch jobs running on both ARC and WARC.
  • [Jira issue link unavailable]: Tested in local installation.
  • [Jira issue link unavailable]: Done, needs unit testing.
  • [Jira issue link unavailable]: Done, needs unit testing. Problems with WARC and content-length=0.
  • [Jira issue link unavailable]: Done, needs unit testing. Problems with WARC and content-length=0.
  • [Jira issue link unavailable]: N/A
  • [Jira issue link unavailable]: N/A
  • [Jira issue link unavailable]: Currently it is a mirror of the ARC file.
  • [Jira issue link unavailable]: N/A
  • [Jira issue link unavailable]: N/A
  • [Jira issue link unavailable]: N/A
  • [Jira issue link unavailable]

 

Move source code to GitHub?

I think we should consider moving the code to GitHub because:

Iteration 52 (3.21 development release) (Mikis)

 

Status of the production sites

  • Netarkivet:

As our broad crawls have been sped up to last less than 2 months, we took advantage of the break between two broad crawls:

  • To crawl “very big web sites” (such as the Danish National Broadcast dr.dk and our other main TV station tv2.dk) in depth.
  • To crawl websites of ministries, departments etc. in depth.
  • To capture URLs of YouTube videos on and by political parties.

We started our own event crawl on the Olympics in London: entering URLs into the system, QA and monitoring.

As to our selective crawls: “business as usual”, that is to say: analysis of “candidates” (new sites proposed for selective crawls), QA of selective crawls, monitoring of harvest jobs, and revision of harvest profiles.

  • BNF:

 

  • ONB:

 

Date for NAS workshop at SB

Mid-October?

Date for next joint tele-conference.

September 11th?

Any other business?
