Specification of, and considerations for, the FileID keys used to identify files in the bit repository.

Format

The key is constructed from the file name, including its path in the batch structure, with the batch ID (including the round-trip number) as the root element. The path separator is the '_' (underscore) character. The maximum allowed length is 250 characters.

The fileID must conform to the regular expression: [a-zA-Z0-9\-_.]{5,250}
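As a minimal sketch of how this constraint could be checked, the Python snippet below validates a candidate fileID. The function name is_valid_file_id is hypothetical and not part of the bit repository client. The pattern is written with the range A-Z, since the range A-z would also admit characters such as '[' and '^', and without a space before the length quantifier.

```python
import re

# Allowed characters: ASCII letters, digits, hyphen, underscore and dot.
# Length must be between 5 and 250 characters.
FILE_ID_PATTERN = re.compile(r"[a-zA-Z0-9\-_.]{5,250}")


def is_valid_file_id(file_id: str) -> bool:
    """Return True if the whole string matches the fileID pattern."""
    # fullmatch ensures no disallowed character hides before or after a match.
    return FILE_ID_PATTERN.fullmatch(file_id) is not None
```

A string that still contains a '/' path separator, or that is shorter than 5 or longer than 250 characters, is rejected by this check.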

Example: The file

B400022028241-RT1/400022028241-14/1795-06-13-01/adresseavisen1759-1795-06-13-01-0006.jp2

in batch structure will be archived with fileID:

B400022028241-RT1_400022028241-14_1795-06-13-01_adresseavisen1759-1795-06-13-01-0006.jp2
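The mapping from a path in the batch structure to a fileID can be sketched as follows. This is an illustration only: the function name path_to_file_id is hypothetical, and the path is assumed to be relative to the parent of the batch directory and to use '/' as separator.

```python
MAX_FILE_ID_LENGTH = 250  # maximum length stated in the format description


def path_to_file_id(relative_path: str) -> str:
    """Join the path elements with '_' so the batch ID becomes the root element."""
    # Drop any leading or trailing separators, then split into path elements.
    elements = relative_path.strip("/").split("/")
    file_id = "_".join(elements)
    if len(file_id) > MAX_FILE_ID_LENGTH:
        raise ValueError(f"fileID exceeds {MAX_FILE_ID_LENGTH} characters: {file_id}")
    return file_id


path = "B400022028241-RT1/400022028241-14/1795-06-13-01/adresseavisen1759-1795-06-13-01-0006.jp2"
print(path_to_file_id(path))
# B400022028241-RT1_400022028241-14_1795-06-13-01_adresseavisen1759-1795-06-13-01-0006.jp2
```

Every path separator is replaced, so the resulting key contains no '/' characters.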

Considerations

The defined format is based on the following considerations:

  • File and directory name format: Most of the elements making up the directory and file names are either predefined names (UNMATCHED, ISO, Target) or number formats (0002, 1, 1890-10-18-01, 4000220289521). The only slightly more complex case is the newspaperID element, where the (currently unwritten) format is [a-zA-Z0-9].
  • The bit repository only accepts FileIDs of the format [a-zA-Z0-9\-_.]{5,250}.

Note that the constraint on newspaperIDs must be implemented in the file structure check and shared with the MFPak people.
