Description of the general method for creating the initial object hierarchy in DOMS based on a tree structure in a file system.

...

The goal of this document is to describe the work of transforming data files and accompanying metadata files in a tree structure on disk to a Content-Model-less object tree in DOMS.

...

  • Data files and metadata files that belong together have the same prefix.
  • Data files can be recognized by their file suffix.
  • Checksum files are not supposed to be represented as objects directly, but rather as a property of the data they are associated with. They will thus be skipped as objects in the tree, but used in the ingest of the data they belong to.
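
The assumptions above can be illustrated with a small Python sketch. It is not part of DOMS: the suffix sets, the rule that the prefix is the file name up to the first dot, and all function names are hypothetical choices made only for this example.

```python
from collections import defaultdict
from pathlib import PurePath

# Hypothetical suffix sets; the real ones are not stated in this document.
DATA_SUFFIXES = {".wav"}
CHECKSUM_SUFFIXES = {".md5"}


def prefix_of(name):
    """Assumed convention: the prefix is the file name up to the first dot."""
    return name.split(".", 1)[0]


def is_data_file(name):
    """Data files are recognized by their suffix."""
    return PurePath(name).suffix in DATA_SUFFIXES


def is_checksum_file(name):
    return PurePath(name).suffix in CHECKSUM_SUFFIXES


def group_by_prefix(names):
    """Group file names by prefix; checksum files are kept out of the groups.

    Returns (groups, checksums), where checksums maps the name of the file
    a checksum belongs to (e.g. "a.wav") to the checksum file name.
    """
    groups = defaultdict(list)
    checksums = {}
    for name in sorted(names):
        if is_checksum_file(name):
            target = name[: -len(PurePath(name).suffix)]  # strip ".md5"
            checksums[target] = name
        else:
            groups[prefix_of(name)].append(name)
    return dict(groups), checksums
```

Here a checksum file never becomes a group member; it is only recorded against the file it describes, so that it can be used when that file is ingested.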


General rules:

  1. Fedora Objects correspond to file system directories
  2. Sub-directories are represented by a "hasPart" relation to the subdirectory object.
  3. The name of the relation is the directory name
  4. The new object's label is the path of the directory. 
  5. Each object will, as an identifier, have the file system path to the directory
  6. A data file is a file containing data. The actual data is stored outside DOMS. The data file is represented as a file object in DOMS.
  7. A metadata file is a file containing metadata. The file is stored inside DOMS.
  8. A grouping of files (files having a common prefix) is represented as an object with:
    1. Datastreams for each metadata file
    2. "hasFile" relations to data files (hasFile is a specialization of hasPart)
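
To make the resulting shape concrete, the following Python sketch builds by hand the tree these rules would give for one directory holding two file groups. The `DomsObject` class, its field names, the `describe` helper, and the example paths and suffixes are all hypothetical; they stand in for Fedora objects and relations and are not a DOMS API.

```python
from dataclasses import dataclass, field


@dataclass
class DomsObject:
    """Illustrative stand-in for a Fedora object; not a DOMS API."""
    identifier: str
    has_part: list = field(default_factory=list)     # sub-directories and file groups
    has_file: list = field(default_factory=list)     # file objects for data files
    datastreams: list = field(default_factory=list)  # names of metadata files


# Hand-built tree for a hypothetical directory:
# /archive/tape1/{prog1.wav, prog1.xml, prog2.wav, prog2.xml}
tape = DomsObject("/archive/tape1")
for prefix in ("prog1", "prog2"):
    group = DomsObject(prefix)
    group.has_file.append(DomsObject(prefix + ".wav"))  # data stays outside DOMS
    group.datastreams.append(prefix + ".xml")           # metadata stored in DOMS
    tape.has_part.append(group)


def describe(obj, depth=0):
    """Return an indented list of lines describing the object tree."""
    pad = "  " * depth
    lines = [pad + obj.identifier]
    lines += [pad + "  datastream: " + d for d in obj.datastreams]
    lines += [pad + "  hasFile: " + f.identifier for f in obj.has_file]
    for part in obj.has_part:
        lines += describe(part, depth + 1)
    return lines
```

Calling `print("\n".join(describe(tape)))` lists each group under the directory object, with its metadata datastream and its "hasFile" relation.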

 

Pseudo code expressing the above rules

...

In the code, the following helper methods are used:

  • groupByPrefix(): returns a list of lists of files, grouped by their common prefix.
  • isDataFile(): returns a boolean telling whether the given file is a data file.
Code Block
void handleDir(myDir, domsParentObject) {
  // Each directory becomes an object identified by its file system path
  thisDirObject = new Object(identifier = myDir.getPath());
  domsParentObject.addHasPart(object = thisDirObject);

  // Handle the files in this directory, then recurse into sub-directories
  handleFiles(myDir.getFiles(), thisDirObject);

  for(dir in myDir.getDirectories()) {
    handleDir(dir, thisDirObject);
  }
}

void handleFiles(myFiles, dirParentObject) {
  List<List> groupedByPrefix = groupByPrefix(myFiles);
  if(groupedByPrefix.size == 1) { // There is only one group, so add its files directly to the current object
    group = groupedByPrefix.get(0);
    for(file in group) {
      handleFile(file, dirParentObject);
    }
  } else { // There is more than one group, so introduce a part object per group
    for(List group in groupedByPrefix) {
      addHasPart(group, dirParentObject);
    }
  }
}

void handleFile(file, parentObject) {
  if(isDataFile(file)) {
    addFile(file, parentObject);
  } else {
    addDataStream(file, parentObject);
  }
}

void addHasPart(fileGroup, dirParentObject) {
  thisPartObject = new Object(identifier = fileGroup.getPrefix());
  dirParentObject.addHasPart(object = thisPartObject, relationName = fileGroup.getPrefix());
  
  for(file in fileGroup) {
    handleFile(file, thisPartObject);
  }
}
 
void addFile(file, parentObject) {
  thisFileObject = new Object(identifier = file.getName());
  parentObject.addHasFile(object = thisFileObject);
}
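
For readers who want to run the logic, the pseudo code above can be approximated in Python as follows. This is a sketch under stated assumptions, not the DOMS ingest code: `Obj` is a hypothetical in-memory stand-in for a Fedora object, the suffix sets are invented, the prefix is taken to be the file name up to the first dot, and checksum files are simply skipped rather than used during ingest.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from pathlib import Path

DATA_SUFFIXES = {".wav"}      # hypothetical
CHECKSUM_SUFFIXES = {".md5"}  # hypothetical


@dataclass
class Obj:
    """Hypothetical in-memory stand-in for a Fedora object."""
    identifier: str
    parts: list = field(default_factory=list)        # "hasPart" relations
    files: list = field(default_factory=list)        # "hasFile" relations
    datastreams: list = field(default_factory=list)  # metadata file names


def group_by_prefix(files):
    """Group paths by the part of the file name before the first dot."""
    groups = defaultdict(list)
    for f in files:
        groups[f.name.split(".", 1)[0]].append(f)
    return groups


def handle_file(f, parent):
    if f.suffix in DATA_SUFFIXES:
        parent.files.append(Obj(f.name))   # data file: separate file object
    else:
        parent.datastreams.append(f.name)  # metadata file: datastream


def handle_files(files, dir_obj):
    groups = group_by_prefix(files)
    if len(groups) == 1:
        # Only one group: attach its files directly to the directory object
        for f in files:
            handle_file(f, dir_obj)
    else:
        # Several groups: one part object per group
        for prefix, group in groups.items():
            part = Obj(prefix)
            dir_obj.parts.append(part)
            for f in group:
                handle_file(f, part)


def handle_dir(path, parent):
    this = Obj(str(path))
    parent.parts.append(this)
    entries = sorted(path.iterdir())
    # Checksum files are skipped as objects in the tree
    files = [e for e in entries
             if e.is_file() and e.suffix not in CHECKSUM_SUFFIXES]
    handle_files(files, this)
    for e in entries:
        if e.is_dir():
            handle_dir(e, this)
    return this
```

A call such as `handle_dir(Path("/some/root"), Obj("doms-root"))` (the path is a placeholder) returns the object for the root directory, with sub-directories and file groups reachable through `parts`.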