Agenda for the joint BNF, ONB, SB, KB and BNE NetarchiveSuite tele-conference 14-06-2016, 13:00-14:00.
Practical information
- Go to https://c.deic.dk/netarkivstyregruppe
- Login as guest
- Write your name
- Insert password: wayback
Participants
- BNF: Sara, Annick
- ONB: Michaela, Andreas
- KB/DK: Søren, Tue, Jonas, Stephen, Nicholas
- SB: Colin, Sabine, Niels
- BNE: Mar
- KB/SE: Bengt ??, Stewart ??
NAS 5.1 Update (Tue)
Now in use in Netarkivet production environment
IIPC GA (all)
Feedback and important information from GA
NAS workshop (Sara)
Topics
1) Share experience with NAS 5 and Heritrix 3
2) Discuss challenges with specific types of sites (news, social media)
3) Discuss collection strategies
4) Discuss features/a GUI to handle the harvester
5) Look into the possibility to integrate another crawler into NAS (Colin proposed to come with a prototype with a headless browser)
Schedule
End of January 2017 - 2.5 days - in Vienna
Poll from Michaela http://doodle.com/poll/nk6dfc3kav4a4hs8
Status of the production sites
Netarkivet
Broad crawl
We started the second broad crawl of 2016 with a limit of 100 MB per domain.
Event crawls
We stopped the refugee crisis crawl. We did a smaller event crawl for the “Eurovision Song Contest”, where we focused on the Danish participants’ presence on Twitter and on thematic news sections. We are preparing for a crawl of the Olympics in Rio.
Selective crawls
We started the implementation of our revised collection strategy. We have almost established the new selective crawls of national news sites.
One of the first social media platforms, arto.com, closed on 1 June. We had problems with our last complete crawl before the closure. With a specially developed module, in which the FetchDNS method is changed, we hope to be able to get all content directly from their server.
Potential collaboration project
The Parliamentary Library gives in-house access to historical (archived) versions of the political parties’ websites. They are not quite satisfied with their solution. Netarchive and the Parliamentary Library are looking at potential future cooperation on this subject.
Internal
Niels Bønding is now project lead for curation.
BnF
ONB
BNE
Next meeting
2016-06-14
Any other business?