...
The last data structure is the set "tapes". It contains all the tapes that have been indexed so far. Looking up a key and adding to the set are both fast operations.
Error recovery
Upon startup, the server goes through the following process.
It lists and sorts by name all the tapes in the given tape folder.
For each tape it checks if the index marks this tape as indexed.
If the tape is not indexed, it is read through for indexing.
Any further tapes in the list are also indexed the same way.
As mentioned, a tape is only marked as indexed when it is closed, so the newest tape will always be read through for indexing upon server startup.
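The startup steps above can be sketched roughly as follows. This is an illustrative Python sketch, not the server's actual code: the function name `tapes_to_index` and the tape names in the usage example are hypothetical, and it reads the last step as meaning that every tape after the first unindexed one is read through again.

```python
def tapes_to_index(tape_names, indexed):
    """Return the tapes that must be read through at startup, oldest first.

    tape_names: names of all tapes found in the tape folder (hypothetical input).
    indexed:    set of tape names that the index has marked as indexed.
    """
    # Tape names encode creation time, so sorting by name gives oldest first.
    ordered = sorted(tape_names)
    for position, name in enumerate(ordered):
        if name not in indexed:
            # Assumption: the first unindexed tape and all later tapes are re-read.
            return ordered[position:]
    return []


# Hypothetical folder with three tapes, of which only the oldest is marked as indexed.
print(tapes_to_index(["tape3", "tape1", "tape2"], {"tape1"}))
```

Since the newest tape is never marked as indexed while it is still open, a scan like this would always return at least that tape.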
Indexing a tape
The tape is read through from the beginning. This is fast, as the actual record contents can be skipped. The relevant information is the objectId, the tape name and the offset of each record. For each record, the index is updated in the same way as it would have been when the record was written. This update overwrites any entry that already exists in the index. Because the tapes are read in order of creation time (which is encoded in the tape name) and the records within a tape are written in order, once all entries for a given objectId have been indexed, the index is guaranteed to hold the information about the newest instance of the record.
...
When a tape has been indexed, it is marked as such in the index.
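The overwrite behaviour described in this section can be illustrated with a small sketch. It is a simplification under stated assumptions: a plain dictionary and a plain set stand in for the real index, records are given as `(object_id, offset)` pairs, and the names `index_tape`, `index` and `indexed_tapes` are hypothetical.

```python
def index_tape(tape_name, records, index, indexed_tapes):
    """Index one tape.

    records:       (object_id, offset) pairs in the order they were written.
    index:         dict mapping object_id -> (tape_name, offset); stand-in for the real index.
    indexed_tapes: set of tape names already indexed; stand-in for the "tapes" set.
    """
    for object_id, offset in records:
        # A later record for the same object_id overwrites the earlier entry.
        index[object_id] = (tape_name, offset)
    # Only after the whole tape has been read is it marked as indexed.
    indexed_tapes.add(tape_name)


index = {}
indexed_tapes = set()
# Tapes are processed oldest first, so the entry from the newer tape wins.
index_tape("tape1", [("obj:1", 0), ("obj:2", 512)], index, indexed_tapes)
index_tape("tape2", [("obj:1", 0)], index, indexed_tapes)
print(index["obj:1"])
```

Because tapes are processed oldest first and records in write order, the last write for each objectId is the one left in the index, which is the property the text relies on.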
Broken tapes
Tapes can in principle become broken, although this has not been observed yet. Breakage is detected only during startup, when the tape is read for indexing. This means that a tape that has already been indexed is not verified upon startup. If an IOException occurs while reading the tape, the system regards the tape as broken and attempts to fix it. The fix is rather brutal. Every record, from the beginning of the tape, is read and written to a new temp tape, until the broken tape either runs out of bytes or the IOException reoccurs. This is done in a way that ensures the temp tape contains only complete records. When no more can be read from the broken tape, it is deleted and replaced with the temp tape. The indexing then proceeds to the next tape in the list.
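The salvage procedure can be sketched as below. The record layout here is an assumption made purely for illustration (a hypothetical 4-byte length prefix followed by the payload); the real tapes use their own format. The names `read_record` and `salvage` are likewise hypothetical, and Python's `OSError` stands in for the IOException mentioned above.

```python
import io
import struct

# Hypothetical record layout: 4-byte big-endian length, then the payload.
HEADER = struct.Struct(">I")


def read_record(stream):
    """Return the next payload, None at a clean end of tape, or raise OSError if truncated."""
    header = stream.read(HEADER.size)
    if not header:
        return None
    if len(header) < HEADER.size:
        raise OSError("truncated record header")
    (length,) = HEADER.unpack(header)
    payload = stream.read(length)
    if len(payload) < length:
        raise OSError("truncated record payload")
    return payload


def salvage(broken, temp):
    """Copy complete records from broken to temp and return how many were copied."""
    copied = 0
    while True:
        try:
            payload = read_record(broken)
        except OSError:
            break  # the read error reoccurred: stop, keeping what was copied so far
        if payload is None:
            break  # the broken tape ran out of bytes
        # A record is written only after it has been read in full.
        temp.write(HEADER.pack(len(payload)) + payload)
        copied += 1
    return copied


# Two complete records followed by a record whose payload is cut short.
good = HEADER.pack(3) + b"abc" + HEADER.pack(2) + b"de"
broken = io.BytesIO(good + HEADER.pack(10) + b"xy")
temp = io.BytesIO()
print(salvage(broken, temp))
```

With real files, the caller would then delete the broken tape and put the temp tape in its place, for example with `os.replace`, before moving on to the next tape.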
Spring Configuration
The system is configured from the file "akubra-llstore.xml", which is a Spring configuration file. It is reproduced below.
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE beans PUBLIC "-//SPRING//DTD BEAN//EN" "http://www.springframework.org/dtd/spring-beans.dtd">
<beans>
<!--Standard-->
<bean name="org.fcrepo.server.storage.lowlevel.ILowlevelStorage"
class="org.fcrepo.server.storage.lowlevel.akubra.AkubraLowlevelStorageModule">
<constructor-arg index="0">
<map/>
</constructor-arg>
<constructor-arg index="1" ref="org.fcrepo.server.Server"/>
<constructor-arg index="2" type="java.lang.String"
value="org.fcrepo.server.storage.lowlevel.ILowlevelStorage"/>
<property name="impl"
ref="org.fcrepo.server.storage.lowlevel.akubra.AkubraLowlevelStorage"/>
</bean>
<bean
name="org.fcrepo.server.storage.lowlevel.akubra.AkubraLowlevelStorage"
class="org.fcrepo.server.storage.lowlevel.akubra.AkubraLowlevelStorage"
singleton="true">
<constructor-arg>
<description>The store of serialized Fedora objects</description>
<ref bean="tapeObjectStore"/>
<!--Here we reference our tape system-->
</constructor-arg>
<constructor-arg>
<description>The store of datastream content</description>
<ref bean="datastreamStore"/>
</constructor-arg>
<constructor-arg value="false"><!--This is set to false, as we do not ever delete stuff-->
<description>if true, replaceObject calls will be done in a way that
ensures the old content is not deleted until the new content is safely
written. If the objectStore already does this, this should be given as
false
</description>
</constructor-arg>
<constructor-arg value="true">
<description>same as above, but for datastreamStore</description>
</constructor-arg>
</bean>
<!--This is the tape store Akubra Implementation-->
<bean name="tapeObjectStore" class="dk.statsbiblioteket.metadatarepository.xmltapes.XmlTapesBlobStore"
singleton="true">
<constructor-arg value="urn:example.org:tapeObjectStore"/>
<!--This parameter is the name of the storage. -->
<property name="archive" ref="tarTapeObjectStore"/>
<!--And this is the reference to the actual implementation-->
</bean>
<!--The guts of the tape system-->
<bean name="tarTapeObjectStore" class="dk.statsbiblioteket.metadatarepository.xmltapes.TapeArchive"
init-method="rebuild"
singleton="true">
<!--This constructor argument specifies the tape store location. -->
<constructor-arg value="file:/CHANGEME/tapeObjectStore" type="java.net.URI"/>
<!--This specifies the maximum length a tape can be before a new tape is started-->
<constructor-arg value="10485760" type="long"/>
<!--10 MB-->
<!--This is the reference to the index-->
<property name="index" ref="redisIndex"/>
</bean>
<!--This is our Redis index-->
<bean name="redisIndex" class="dk.statsbiblioteket.metadatarepository.xmltapes.redis.RedisIndex"
singleton="true">
<!--The redis server-->
<constructor-arg value="localhost"/>
<!--The port it is running on-->
<constructor-arg value="6379"/>
<!--The database name. Redis databases are always identified by integers-->
<constructor-arg value="0"/>
</bean>
<!--Standard storage for managed datastreams. We do not use managed datastreams-->
<bean name="datastreamStore" class="org.akubraproject.map.IdMappingBlobStore"
singleton="true">
<constructor-arg value="urn:fedora:datastreamStore"/>
<constructor-arg>
<ref bean="fsDatastreamStore"/>
</constructor-arg>
<constructor-arg>
<ref bean="fsDatastreamStoreMapper"/>
</constructor-arg>
</bean>
<!--Standard storage for managed datastreams. We do not use managed datastreams-->
<bean name="fsDatastreamStore" class="org.akubraproject.fs.FSBlobStore"
singleton="true">
<constructor-arg value="urn:example.org:fsDatastreamStore"/>
<constructor-arg value="/CHANGEME/datastreamStore"/>
</bean>
<!--Standard storage for managed datastreams. We do not use managed datastreams-->
<bean name="fsDatastreamStoreMapper"
class="org.fcrepo.server.storage.lowlevel.akubra.HashPathIdMapper"
singleton="true">
<constructor-arg value="##"/>
</bean>
<bean name="fedoraStorageHintProvider"
class="org.fcrepo.server.storage.NullStorageHintsProvider"
singleton="true">
</bean>
</beans>
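The maximum tape length in the configuration is given in bytes. As a small aid when choosing a different limit, the value can be computed as below; the helper name `tape_limit_bytes` is hypothetical and not part of the system.

```python
def tape_limit_bytes(mebibytes):
    """Convert a tape size in MiB to the byte count used in the configuration."""
    return mebibytes * 1024 * 1024


print(tape_limit_bytes(10))  # prints 10485760, the value used in the configuration
```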