Status of the production sites

Netarkivet

  • Broad crawl
    • Step 1 finished very well with the new Bitmagasin
    • Step 2 will start on March 7
  • War in Ukraine event harvest ongoing. Helped DCH institutions from Ukraine with information and best practices
  • SolrWayback
    • Using the ongoing harvests (Solr index/shard) as a QA platform for curators - Q2 2022
    • New shard built
  • Focus on paywalled content and IP validation to capture as much data as possible
  • Twitter API harvest - still a pilot project, but very relevant given the high level of activity regarding Ukraine



BnF

First, the Winter Olympic Games harvest ended at the end of February, and a new crawl dedicated to the Winter Paralympics was launched last week. About 14 million URLs were collected in February, including almost 1.4 million Twitter URLs, for a total of 0.57 TB.

We also decided to make another attempt to collect Instagram in 2022. After several tries, we succeeded: 73 Instagram accounts on the theme of the Olympics were collected, amounting to about 7,000 URLs.

Finally, last December we opened a participation form, available until the end of January, so that the public could suggest sites to be added to the Artificial Intelligence harvest.
We received 23 e-mails containing about 60 suggestions of websites to crawl.

...

  • April 12th
  • May 10th
  • June 7th
  • July 5th
  • September 6th
  • October 4th
  • November 8th
  • December 6th
  • January 10th, 2023

Any other business?
