...

  • Read access is handled the same way as write access, and the same model for logging and authorization can be used.
  • Both replicas can be used for retrieving data (can they?).

Disadvantage:

...

Alternate solution

Switch to letting Wayback access files directly through mounted read-only (RO) drives.
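
As a rough illustration of what direct file access could look like, the sketch below reads one record from an archive file under a mounted directory, with no message broker involved. It assumes the caller already knows the file name, byte offset, and record length (for example from an index lookup). The function name `read_record`, the parameter `mount_root`, and the example file name are hypothetical and are not taken from Wayback or NetarchiveSuite.

```python
from pathlib import Path


def read_record(mount_root: Path, file_name: str, offset: int, length: int) -> bytes:
    """Read `length` bytes at `offset` from an archive file under a mounted directory.

    Hypothetical helper: `mount_root` stands in for the read-only mount point.
    """
    root = mount_root.resolve()
    path = (root / file_name).resolve()
    # Refuse file names that resolve to a location outside the mount.
    if root not in path.parents:
        raise ValueError(f"{file_name!r} is outside {root}")
    # Open in binary read mode; nothing is ever written to the mount.
    with open(path, "rb") as archive_file:
        archive_file.seek(offset)
        return archive_file.read(length)
```

A call might look like `read_record(Path("/mnt/archive"), "example.warc.gz", 1024, 512)`, where both the mount point and the file name are placeholders.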

...

  • Wouldn't put stress on the message broker or bitarchive infrastructure.
  • Wouldn't depend on the scalability of the message broker or bitarchive infrastructure.

...

  1. All improvements have to be done with NetarchiveSuite resources.
  2. Somewhat unstable.

...

Alternate solution

The de facto standard platform for mass processing is Hadoop, which is used at an increasing number of web archiving institutions for analysing the stored web data. Both SB and KB have established Hadoop clusters, which are already used for processing the Netarkivet.dk archive.

...

  1. Hadoop is a mature standard processing platform for large datasets.
  2. Comes with a huge set of tools, including web archiving tools.
  3. Very robust and scalable.
  4. Enables processing resources and data to be separated (SB setup).

Disadvantage:

  1. Migration cost

...

Current solution

Pros: 

Cons:

...

Alternate solution

Switch to using a Bitrepository system for bit preservation.

...