Using this storage interface, we can integrate arbitrary storage solutions with Fcrepo. The interface is split into three Java classes: Blobstore, BlobstoreConnection and Blob. The Blobstore is created as a singleton by the Fcrepo server system. To work with blobs, the Blobstore is asked to open a BlobstoreConnection. Blobs are read and written through this connection.

Requesting a Blob from the BlobstoreConnection always returns a Blob, even if the blob does not exist. Like a File object, it has an exists() method. You can then open input and output streams on the Blob.
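The flow described above can be sketched with an in-memory stand-in. This is only an illustration: the class names MemoryBlobstore, MemoryConnection and MemoryBlob and their method names are hypothetical and are not the actual Fcrepo storage API.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical in-memory store; the server would create exactly one instance.
class MemoryBlobstore {
    final Map<String, byte[]> data = new ConcurrentHashMap<>();

    MemoryConnection openConnection() {
        return new MemoryConnection(this);
    }
}

// Hypothetical connection: the only way to get hold of blobs.
class MemoryConnection {
    private final MemoryBlobstore store;

    MemoryConnection(MemoryBlobstore store) {
        this.store = store;
    }

    // Always returns a handle, whether or not content is stored under the id.
    MemoryBlob getBlob(String id) {
        return new MemoryBlob(store, id);
    }
}

// Hypothetical blob handle, comparable to a java.io.File object.
class MemoryBlob {
    private final MemoryBlobstore store;
    private final String id;

    MemoryBlob(MemoryBlobstore store, String id) {
        this.store = store;
        this.id = id;
    }

    boolean exists() {
        return store.data.containsKey(id);
    }

    InputStream openInputStream() throws IOException {
        byte[] bytes = store.data.get(id);
        if (bytes == null) {
            throw new IOException("No such blob: " + id);
        }
        return new ByteArrayInputStream(bytes);
    }

    // The content becomes visible in the store only when the stream is closed.
    OutputStream openOutputStream() {
        return new ByteArrayOutputStream() {
            @Override
            public void close() throws IOException {
                super.close();
                store.data.put(id, toByteArray());
            }
        };
    }
}
```

A caller would open a connection, ask for a blob by identifier, check exists(), and then open a stream in the direction it needs.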

Tapes, the basic design

No data will ever be overwritten. Every write creates a new instance of an object. This is the fundamental invariant in the tape design. 

The tapes are tar files. They can be said to exist in a long chain. Each tape is named according to the time it was created. Only the newest tape can be written to. When the newest tape reaches a certain size, it is closed, and a new tape is started. This new tape is now the newest tape.

Only one thread can write to the tape system at a time.

A separate index is maintained. This index retains the mapping between the object identifier and the newest instance of the object (i.e. tape name and offset into the tape).
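The append-only chain and its index can be modelled without any real tar I/O. The sketch below is an assumption-laden illustration: the class TapeChain, the "tape-<millis>.tar" naming pattern, the injected clock and the size threshold are all hypothetical, and only byte counts are tracked.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.LongSupplier;

// Hypothetical model of the tape chain: tracks sizes and offsets only.
class TapeChain {
    // Location of one object instance: tape name plus offset into that tape.
    static final class Entry {
        final String tape;
        final long offset;

        Entry(String tape, long offset) {
            this.tape = tape;
            this.offset = offset;
        }
    }

    private final long maxTapeSize;
    private final LongSupplier clock;
    private final List<String> tapes = new ArrayList<>();
    private final Map<String, Entry> index = new HashMap<>();
    private long newestTapeSize;

    TapeChain(long maxTapeSize, LongSupplier clock) {
        this.maxTapeSize = maxTapeSize;
        this.clock = clock;
        startNewTape();
    }

    // Tapes are named after their creation time (naming pattern is assumed).
    private void startNewTape() {
        tapes.add("tape-" + clock.getAsLong() + ".tar");
        newestTapeSize = 0;
    }

    // synchronized models the rule that only one thread writes at a time.
    synchronized Entry write(String id, long length) {
        String newest = tapes.get(tapes.size() - 1);
        Entry entry = new Entry(newest, newestTapeSize);
        newestTapeSize += length;
        // Older instances stay on tape; only the index pointer moves.
        index.put(id, entry);
        if (newestTapeSize >= maxTapeSize) {
            startNewTape(); // the full tape is closed, a new one becomes newest
        }
        return entry;
    }

    synchronized Entry lookup(String id) {
        return index.get(id);
    }

    synchronized List<String> tapeNames() {
        return new ArrayList<>(tapes);
    }
}
```

Writing the same identifier twice leaves both instances in the chain, while lookup returns only the location of the newer one.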

Tape tar files, locking and the like

The tapes are tar files. To understand the following, see http://en.wikipedia.org/wiki/Tar_(computing)

The tapes are read with a library called JTar, see https://github.com/blekinge/jtar

When an output stream is opened to a blob, the global write lock is acquired by the opening thread. A tar entry header records the size of the entry before its data, and Fedora does not tell the blob how much data it is going to write, so the output stream buffers the written data until the stream is closed. When the stream is closed, the buffer is written to the newest tape as a new tar entry, and the object instance is registered in the index. Lastly, the write lock is released.
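The lock, buffer and flush sequence can be sketched as a small stream class. This is a minimal illustration under stated assumptions: the class BufferedTapeOutputStream and the appendToNewestTape callback are hypothetical, the callback stands in for writing the tar entry and updating the index, and the buffer is held entirely in memory.

```java
import java.io.ByteArrayOutputStream;
import java.io.OutputStream;
import java.util.concurrent.locks.ReentrantLock;
import java.util.function.Consumer;

// Hypothetical stream: holds the global write lock from open until close.
class BufferedTapeOutputStream extends OutputStream {
    private final ReentrantLock writeLock;
    private final Consumer<byte[]> appendToNewestTape;
    private final ByteArrayOutputStream buffer = new ByteArrayOutputStream();
    private boolean closed;

    BufferedTapeOutputStream(ReentrantLock writeLock, Consumer<byte[]> appendToNewestTape) {
        this.writeLock = writeLock;
        this.appendToNewestTape = appendToNewestTape;
        // Acquired by the opening thread; the same thread must call close().
        writeLock.lock();
    }

    @Override
    public void write(int b) {
        buffer.write(b);
    }

    @Override
    public void write(byte[] b, int off, int len) {
        buffer.write(b, off, len);
    }

    @Override
    public void close() {
        if (closed) {
            return; // closing twice must not unlock twice
        }
        closed = true;
        try {
            // The total size is known only now, so the entry is written here.
            appendToNewestTape.accept(buffer.toByteArray());
        } finally {
            writeLock.unlock();
        }
    }
}
```

Because the lock is a ReentrantLock, the thread that opened the stream must also be the one that closes it; a stream that is never closed keeps every other writer blocked.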