Excerpt |
---|
Description of the general method of creating the initial object hierarchy in DOMS based on a tree structure in a file system |
...
This document describes how data files and their accompanying metadata files, stored in a tree structure on disk, are transformed into a Content-Model-less object tree in DOMS.
...
- Data files and metadata files that belong together have the same prefix.
- Data files can be recognized by their file suffix.
- Checksum files are not supposed to be represented as objects directly, but rather as a property of the data they are associated with. They will thus be skipped as objects in the tree, but used in the ingest of the data they belong to.
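The assumptions above can be illustrated with a small Python sketch. The suffix sets (`.wav` and `.ts` for data, `.md5` for checksums), the naming rule that a checksum file carries the full name of its target plus a checksum suffix, and all function names are hypothetical; the actual suffixes depend on the collection being ingested.

```python
from pathlib import PurePath

# Hypothetical suffix sets; the real ones depend on the collection.
DATA_SUFFIXES = {".wav", ".ts"}
CHECKSUM_SUFFIXES = {".md5"}


def is_checksum_file(name: str) -> bool:
    """A checksum file is recognized by its suffix and never becomes an object."""
    return PurePath(name).suffix.lower() in CHECKSUM_SUFFIXES


def is_data_file(name: str) -> bool:
    """A data file is recognized by its suffix."""
    return PurePath(name).suffix.lower() in DATA_SUFFIXES


def checksum_target(name: str) -> str:
    """Assumed naming: the checksum of 'x.wav' is stored in 'x.wav.md5'."""
    return name[: -len(PurePath(name).suffix)]


def classify(names):
    """Split names into object candidates and a map from file name to checksum file."""
    candidates, checksums = [], {}
    for name in names:
        if is_checksum_file(name):
            checksums[checksum_target(name)] = name
        else:
            candidates.append(name)
    return candidates, checksums
```

With this reading, `classify(["a.wav", "a.wav.md5", "a.xml"])` keeps `a.wav` and `a.xml` as candidates for the object tree and records `a.wav.md5` as the checksum to use when ingesting `a.wav`.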
General rules:
...
- Fedora Objects correspond to file system directories
- Sub-directories are represented by a "hasPart" relation to the subdirectory object.
- The name of the relation is the directory name. The new object's label is the path of the directory.
- Each object will, as an identifier, have the file system path to the directory
- A data file is a file containing data. The actual data is stored outside DOMS. The data file is represented as a file object in DOMS.
- A metadata file is a file containing metadata. The file is stored inside DOMS.
- A grouping of files (files having a common prefix) is represented as an object with:
  - Datastreams for each metadata file
  - "hasFile" relations to data files (hasFile is a specialization of hasPart)
Pseudo code expressing the above rules
The following pseudo code is meant to express the above rules on a more formal basis.
The pseudo code uses the following helper methods:
- groupByPrefix(): returns a list of lists of files, grouped by their common prefix.
- isDataFile(): returns a boolean telling whether the given file is a data file.
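One possible reading of `groupByPrefix()` is sketched below in Python. The prefix rule (everything before the first dot in the file name) is an assumption, as the document does not define how a prefix is determined; `prefix_of` and `group_by_prefix` are hypothetical names. `isDataFile()` is not shown, since under the stated assumptions it reduces to a suffix lookup.

```python
def prefix_of(name: str) -> str:
    """Assumed prefix rule: everything before the first dot in the file name."""
    return name.split(".", 1)[0]


def group_by_prefix(names):
    """Return a list of (prefix, files) pairs, in order of first appearance."""
    groups = {}
    for name in names:
        groups.setdefault(prefix_of(name), []).append(name)
    return list(groups.items())
```

Each group carries its prefix along with its files, which matches the pseudo code's use of `fileGroup.getPrefix()`.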
```
void handleDir(myDir, domsParentObject) {
    // Each directory becomes an object identified by its file system path
    thisDirObject = new Object(identifier = myDir.getPath());
    domsParentObject.addHasPart(object = thisDirObject);
    handleFiles(myDir.getFiles(), thisDirObject);
    for (dir in myDir.getDirectories()) {
        handleDir(dir, thisDirObject);
    }
}

void handleFiles(myFiles, dirParentObject) {
    // groupedByPrefix is a set of groups. A group is a prefix and a set of files
    groupedByPrefix = groupByPrefix(myFiles);
    if (groupedByPrefix.size == 1) {
        // There is only one group, so add its files directly to the current directory object
        group = groupedByPrefix.get(0);
        for (file in group) {
            handleFile(file, dirParentObject);
        }
    } else {
        // There is more than one group, so introduce a sub-object per group
        for (group in groupedByPrefix) {
            addPart(group, dirParentObject);
        }
    }
}

void handleFile(file, parentObject) {
    if (isDataFile(file)) {
        addFile(file, parentObject);
    } else {
        addDataStream(file, parentObject);
    }
}

void addPart(fileGroup, dirParentObject) {
    thisPartObject = new Object(identifier = fileGroup.getPrefix());
    dirParentObject.addHasPart(object = thisPartObject, relationName = fileGroup.getPrefix());
    for (file in fileGroup) {
        handleFile(file, thisPartObject);
    }
}

void addFile(file, parentObject) {
    thisFileObject = new Object(identifier = file.getName());
    parentObject.addHasFile(object = thisFileObject);
}
```
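To try the rules on a real directory, the pseudo code can be approximated by the following Python sketch. It is only an illustration: `DomsObject` is an in-memory stand-in and not the DOMS or Fedora API, the suffix sets and the first-dot prefix rule are assumptions, and the explicit skipping of checksum files is added here from the assumptions stated earlier, since the pseudo code does not show it.

```python
import os
import tempfile
from dataclasses import dataclass, field

DATA_SUFFIXES = {".wav"}      # hypothetical data file suffixes
CHECKSUM_SUFFIXES = {".md5"}  # hypothetical checksum file suffixes


@dataclass
class DomsObject:
    """In-memory stand-in for a DOMS object; not the real DOMS or Fedora API."""
    identifier: str
    parts: list = field(default_factory=list)        # "hasPart" relations
    files: list = field(default_factory=list)        # "hasFile" relations
    datastreams: list = field(default_factory=list)  # metadata file names


def suffix(name):
    return os.path.splitext(name)[1].lower()


def group_by_prefix(names):
    """Group names by the text before the first dot (assumed prefix rule)."""
    groups = {}
    for name in sorted(names):
        groups.setdefault(name.split(".", 1)[0], []).append(name)
    return list(groups.items())


def handle_file(name, parent):
    if suffix(name) in DATA_SUFFIXES:
        parent.files.append(DomsObject(identifier=name))
    else:
        parent.datastreams.append(name)


def handle_files(names, dir_object):
    # Checksum files are skipped as objects, per the stated assumptions.
    names = [n for n in names if suffix(n) not in CHECKSUM_SUFFIXES]
    groups = group_by_prefix(names)
    if len(groups) == 1:
        # A single group: attach its files directly to the directory object.
        for name in groups[0][1]:
            handle_file(name, dir_object)
    else:
        # Several groups: one sub-object per prefix.
        for prefix, members in groups:
            part = DomsObject(identifier=prefix)
            dir_object.parts.append(part)
            for name in members:
                handle_file(name, part)


def handle_dir(path, parent):
    this_dir = DomsObject(identifier=path)
    parent.parts.append(this_dir)
    with os.scandir(path) as it:
        entries = sorted(it, key=lambda e: e.name)
    handle_files([e.name for e in entries if e.is_file()], this_dir)
    for entry in entries:
        if entry.is_dir():
            handle_dir(entry.path, this_dir)
    return this_dir


def build_demo_tree():
    """Create a small directory tree on disk and map it to the stand-in model."""
    with tempfile.TemporaryDirectory() as root:
        os.mkdir(os.path.join(root, "tape1"))
        for name in ("a.wav", "a.wav.md5", "a.xml", "b.wav", "b.xml"):
            open(os.path.join(root, "tape1", name), "w").close()
        top = DomsObject(identifier="root")
        handle_dir(root, top)
        return top
```

In the demo tree, the `tape1` directory holds two prefix groups, so it receives two sub-objects (`a` and `b`), each with one file object and one metadata datastream; the checksum file `a.wav.md5` produces no object.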