...

Panel
  • Toke
  • Letting all the great knowledge and impression from IIPC WAC 2024 in Paris sink in.
  • 2nd Broadcrawl 2024, step 2
    • Running as planned
    • Investigating status code 555 responses, mainly from 2023 and 2024
      • "555 Security Incident Detected Your request was blocked. If you are the owner of the website: The Website Application Firewall that is protecting your website has blocked this request for being suspicious. You can see the detailed reason for this in your webserver logs. If you are the visitor: The public IP address assigned to you, by your internet provider, might be suffering from poor reputation: Look up IP reputation here. IP addresses from VPN providers or public networks often have poor reputation."
  • Still testing the on-site installation of Browsertrix. Upgraded to the latest version but forgot to update the crawler to a version later than 1.1.0 (necessary for the QA functions).
  • SolrWayback as a search & discovery tool for researchers to work with web archive collections: workshop at DHNB 2024, Iceland, with Jon from Nettarkivet, Norway https://www.conftool.org/dhnb2024/index.php?page=browseSessions&form_session=94&presentations=sho
  • https://github.com/netarchivesuite/solrwayback/releases/tag/5.1.0 
    • Substantial speed-up when exporting (CSV, WARC, etc.) from large multi-sharded collections. See #329. This feature still needs a little more testing. Feedback is welcome.
  • Progress on data delivery and legal matters.
  • Part of a project with KU and others: "På randen af litteraturhistoriens digitale afgrund" (translated as "On the brink of the digital abyss of literary history")
    • Crowdsourcing some parts (donations? crawls?) may be an option and needs to be investigated further.
  • New consortium accelerates Danish language models

...

Panel

This month, we are working on three election events at the same time, including the election for the European Parliament.

The broad crawl of the .eus domain (Basque Country) has finished, and we are going to start the harvest of open access reviews that we carry out annually. The idea is to add links from OpenWayback to the catalogue each year for the reviews that are missing them.

We are preparing an in-person workshop for collaborating web curators from the different regional conservation centres, to be held at the BNE in September. It will focus on legal deposit and especially on web archiving; it will be the first one we have held since before the pandemic.

...