2020-11-03 Status meeting

Agenda for the joint NetarchiveSuite teleconference 2020-11-03, 13:00-14:00.

Participants

  • BNF: Sara
  • ONB: Andreas
  • KB/DK - Copenhagen: Tue, Stephen, Anders, Alexandre
  • KB/DK - Aarhus: Kristian, Colin
  • BNE: Alicia, José Carlos, María
  • KB/Sweden: Pär, Peter

Join from PC, Mac, Linux, iOS or Android:

    https://kbdk.zoom.us/j/104443571

Or an H.323/SIP room system:

    H.323: 109.105.112.236
    Meeting ID: 104 443 571

    SIP: 104443571@109.105.112.236

Or Skype for Business (Lync):

    https://kbdk.zoom.us/skype/104443571

Or Telephone:

Denmark: +45 89 88 37 88 or +45 32 71 31 57
United Kingdom: +44 203 051 2874 or +44 203 481 5237 or +44 203 966 3809 or +44 131 460 1196
Finland: +358 9 4245 1488 or +358 3 4109 2129
Sweden: +46 850 539 728 or +46 8 4468 2488
Norway: +47 7349 4877 or +47 2396 0588
US: +1 669 900 6833 or +1 646 558 8656
    Meeting ID: 104 443 571

    International numbers available: https://zoom.us/u/acRu0MV3xJ

You can join the meeting using the apps on a PC, a tablet or a smartphone, or you can use the browser-based version (it works with newer versions of Chrome or Firefox).


Update on NAS latest tests and developments

Feedback on revisits

Status of the production sites

Netarkivet

  1. Systems
    1. Overall, our systems work
    2. The firewall / network upgrade and the resulting crashes in Netarkivet have taken up a lot of time. ITD is working under high pressure to get a stable infrastructure in Copenhagen (KBH), but it will probably be 2021 before everything has been renewed and updated.
    3. The standard IIPC version of Heritrix is in production.
    4. SolrWayback will soon go into production, in a new updated version.

  2. Who uses our systems
    1. Browsing in the Online Archive:
      1. Statistics for the last 6 months: at least 1 external user (though we have issues seeing how many) and 13 internal users (for QA, development and much more).
    2. 40+ external users have access to our systems.
    3. Delivery of data from Netarkivet takes place on an ongoing basis, and I expect it only to become more extensive in the future.

  3. Collection
    1. Netarkivet has made a great effort on the Corona event harvesting.
    2. The standard Heritrix version may mean more efficient harvests, starting with the 2nd cross-sectional harvest of 2020, which is underway.
    3. Much of the interesting content at the moment requires manual workflows, e.g. Facebook (especially when it comes to comments).
    4. It is still an issue to get certain types of dynamically generated content. Until we have other solutions, e.g. browser-based harvesting, we use Archive-It and various workarounds (e.g. XML sitemaps that give us the URLs Heritrix does not immediately see).
    5. We are looking at how / if we can get WARC files from Webrecorder / conifer.org into Netarkivet. It looks promising.

  4. Preservation
    1. We are well underway with the major projects DKM-077 - one online copy (closed and now part of DKM-085) and DKM-085 - Bitmagasine. Schedules have been made and special work is being done to refine the cutover process.

  5. Access
    1. SolrWayback is on its way into production. Internal and external users are pleased. It looks promising.

  6. Organization
    1. It is going well. Our more agile approach, with daily stand-up meetings, reviews, retrospectives and planning in a small group of Netarkivet people, provides added value.

  7. Cooperation
    1. More and more interest from external parties. Several people from Netarkivet have participated in a seminar on research in Netarkivet for researchers at KU (KUB).
    2. I think there will be a higher demand for Netarkivet's content in the future.
    3. 40+ external users over the last years.

  8. The future
    1. NAS development and/vs. external solutions (decision proposals must be made)
    2. Webdanica analysis
    3. BcWeb
    4. SolrWayback further development / project setting.
    5. Workshop with ITU / DKU on our access solutions: PyWb for playback of Netarkivet in tandem with SolrWayback. In future, IIPC will support one solution, PyWb (instead of OpenWayback as before).
    6. More dialogue with researchers about their wishes in relation to e.g. access solutions
    7. There is a long way to go to get enough resources / competencies for Netarkivet's tasks (ITU + DKU). This is part of the 5-year plan.
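
The sitemap workaround mentioned under Collection could look roughly like the sketch below. It is only an illustration, not Netarkivet's actual tooling: the function name and the example URLs are hypothetical, and it assumes the site publishes a sitemap in the standard sitemaps.org format. It uses only the Python standard library to pull the <loc> entries out of a sitemap so they can be added as extra seeds.

```python
import xml.etree.ElementTree as ET

# Namespace used by the sitemaps.org 0.9 schema.
SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"


def urls_from_sitemap(xml_text):
    """Return every <loc> URL found in a sitemap (or sitemap index) document."""
    root = ET.fromstring(xml_text)
    return [loc.text.strip() for loc in root.iter(SITEMAP_NS + "loc") if loc.text]


# Hypothetical sitemap content, for illustration only.
EXAMPLE = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.org/</loc></url>
  <url><loc>https://example.org/news/2020-11-03</loc></url>
</urlset>"""

if __name__ == "__main__":
    # Print one URL per line, e.g. for pasting into a seed list.
    for url in urls_from_sitemap(EXAMPLE):
        print(url)
```

For a sitemap index the same function returns the URLs of the child sitemaps, which would then have to be fetched and parsed in turn.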

BnF


ONB


BNE

About our collections

We are still working on our Coronavirus collection. We have written a post about this collection for the Digital Preservation Coalition that will be published on World Digital Preservation Day.

About our technical situation

We met the new head of the IT department and we have told her about all our projects. We have requested more technical support for NAS.

About NAS

WARC files have been migrated to new storage servers.

About BCWeb

We already have version 6.1 in production and we are working with it.

KB-Sweden


Next meetings

  • December 8, 2020
  • January 5, 2021

Any other business?
