Agenda for the joint BNF, ONB, SB, KB and BNE NetarchiveSuite tele-conference 14-06-2016, 13:00-14:00.

Practical information

Participants

  • BNF: Sara, Annick
  • ONB: Michaela, Andreas
  • KB/DK: Søren, Tue, Jonas, Stephen, Nicholas
  • SB: Colin, Sabine, Niels
  • BNE: Mar
  • KB/SE: Bengt ??, Stewart ??

NAS 5.1 Update (Tue)

Now in use in the Netarkivet production environment

IIPC GA (all)

Feedback and important information from GA

NAS workshop (Sara)

Topics

1) Share experience with NAS 5 and Heritrix 3

2) Discuss challenges with specific types of sites (news, social media)

3) Discuss collection strategies

4) Discuss features/a GUI to handle the harvester

5) Look into the possibility of integrating another crawler into NAS (Colin proposed bringing a prototype that uses a headless browser)

Schedule

End of January 2017 - 2.5 days - in Vienna

Poll from Michaela http://doodle.com/poll/nk6dfc3kav4a4hs8

Status of the production sites

Netarkivet

 

Broad crawl
We started the second broad crawl of 2016 with a limit of 100 MB per domain.

Event crawls
We stopped the refugee crisis crawl. We did a smaller event crawl for the “Eurovision Song Contest”, where we focused on the Danish participants' presence on Twitter and on thematic news sections. We are preparing for a crawl of the Olympics in Rio.

Selective crawls
We started the implementation of our revised collection strategy. We have almost established the new selective crawls of national news sites.

One of the first social media platforms, arto.com, closed on 1 June. We had problems with our last complete crawl before the closing. With a specially developed module, in which the FetchDNS method is changed, we hope to be able to get all content directly from their server.

Potential collaboration project
The Parliamentary Library gives in-house access to historical (archived) versions of the political parties’ websites. They are not quite satisfied with their solution. Netarchive and the Parliamentary Library are looking at potential future cooperation on this subject.

Internal
Niels Bønding is project lead for curation now.

BnF

 

ONB

 

BNE

 

Next meeting

2016-06-14

Any other business?

 

 

 
