2013-02-05 Statusmeeting

Agenda for the joint BNF, ONB, SB and KB NetarchiveSuite tele-conference January the 29th 2013, 13:00-14:00.

  • BNF: Nicolas, Sara.
  • ONB: Andreas
  • KB: Tue, Søren and Nicholas
  • SB: Colin and Mikis, Sabine.
  • Any other issues to be discussed on today's tele-conference?

Iteration 54 (4.1 development release)

See plan here. Planned release at the end of March.

Curator roadmap

JHoNAS status (Nicholas)

  • JHove2 release
  • Final report

Status of the production sites


We made some efforts to harvest the Danish Twitter profiles (about 45,000). In order to avoid redundancy, we set up some filters for https. That did not work, because Twitter now only accepts https, so we retried with https. We harvested about 34 GB (configuration: path 2 levels orderxml).

  • Status of BCWeb open-sourcing?

Since the end of August, the Digital Legal Deposit service of the BnF has had a Twitter account. At first, it was started mainly to make us more visible: we collect the Internet, so we had to be on it! We thought it would be a good way to answer questions from producers, and to inform the public of our activities. Plus, it would allow the members of the team who were present at professional events to use an official account to live-tweet rather than their personal accounts. A lot of professional institutions are already on Twitter and do a good job: information goes very fast and dialogue can be very instructive.

After almost 6 months of existence, we already have almost 600 followers, which is more than what we honestly expected!

We communicate about our activities: the beginning of the broad harvest, publications, participation in scientific or technical events, statistics. We can also see what is being said about us. What is more instructive for us is the feedback and the questions from the Twitter community about the harvest of particular websites, the legal status of the archives, etc.

But this account has entered into our production workflow too. Twice since we started (mid-September 2012 and mid-January 2013), we have tweeted that producers could send us URLs via our generic email address. These tweets were quite successful and we have indeed received suggestions, by email or directly on Twitter. It is difficult to estimate their number precisely, but there have been around 15-20. We treat these suggestions like others from producers in our generic mailbox: they go into the next "emergency harvest" (one per month) and stay in our database for later harvests. We plan to tweet that information on a regular basis.

How do you manage your Twitter account? If you do not have one, why? Do you plan to start one?

  • A new selective crawl was started, which contains many pages of political parties and blogs about politics.
  • Currently preparing stage 1 of our domain crawl 2013. Should we start with version 4.0?
  • In spring, our IT department will replace our crawler server with new boxes.


Any other business?