Structure checks done
Below are lists of the checks performed by each checker. Each check is prefixed with an error number (in boldface) that should appear in the error message returned by the checker.
The format of the error number is: "2F-" (for appendix 2F of the specification to Ninestars, which specifies the requirements for the file structure), followed by a letter identifying which checker checks for the given error, followed by a number unique to that checker. The letters for the different checkers are:
S - schematron file structure checks
Q - sequence number checks (java)
M - mfpak-related checks
O - other checks
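The error number format described above can be taken apart mechanically. The following is a minimal sketch, not part of the actual checkers; the names CHECKERS, ERROR_NUMBER and parse_error_number are hypothetical and only illustrate the "2F-" + letter + number layout.

```python
import re

# Checker letters as listed above.
CHECKERS = {
    "S": "schematron file structure checks",
    "Q": "sequence number checks (java)",
    "M": "mfpak-related checks",
    "O": "other checks",
}

# "2F-" followed by a checker letter and a number unique to that checker.
ERROR_NUMBER = re.compile(r"2F-([SQMO])([1-9][0-9]*)")


def parse_error_number(code):
    """Return (checker description, check number) for a code such as "2F-S12".

    Raises ValueError if the code does not follow the documented format.
    """
    match = ERROR_NUMBER.fullmatch(code)
    if match is None:
        raise ValueError(f"not a 2F error number: {code!r}")
    letter, number = match.groups()
    return CHECKERS[letter], int(number)
```

For example, parse_error_number("2F-Q3") identifies the sequence number checker and check number 3, while a code with an unknown letter is rejected.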
Schematron simple file structure checks
Each of these is prefixed with a name (like "batchNode") which shows which node of the batch structure is checked.
2F-S1 batchNode: Form of name: B<batchID>-RT<Roundtrip>
2F-S2 batchNode: Existence of WORKSHIFT-ISO-TARGET
2F-S3 batchNode: All folders except WORKSHIFT-ISO-TARGET have form <batchID>-[0-9]{2}. No other files/folders.
2F-S4 workshiftIsoTarget: Existence of nodes in WORKSHIFT-ISO-TARGET, i.e. Target-files
2F-S5 workshiftIsoTarget: Names (nodes) in WORKSHIFT-ISO-TARGET must be of the right format: Target-[0-9]{6}-[0-9]{4}
2F-S6 workshiftIsoTarget: No other files or folders
2F-S7 workshiftImage: Form of names: Target-<targetSerialisedNumber>-<billedID>.(jp2|mix)
2F-S8 workshiftImage: One mix-file per jp2-file
2F-S9 workshiftImage: 6-digit targetSerialisedNumber
2F-S10 workshiftImage: 4-digit billedID
2F-S11 workshiftImage: There must exist a file in each WORKSHIFT-ISO-TARGET/Target-[0-9]{6}-[0-9]{4} called Target-[0-9]{6}-[0-9]{4}.mix.xml
2F-S12 workshiftImage: There must exist a jp2-node in each WORKSHIFT-ISO-TARGET/Target-[0-9]{6}-[0-9]{4} called Target-[0-9]{6}-[0-9]{4}.jp2 containing a contents attribute
2F-S13 film: Any folder in BATCH not called WORKSHIFT-ISO-TARGET must have name of format <batchID>-[0-9]{2} (a FILM folder) with batchID as in BATCH folder
2F-S14 film: Existence of film.xml
2F-S15 film: Existence of edition-folder(s) with name of form [12][0-9]{3}-(0[1-9]|1[0-2])-(0[1-9]|[12][0-9]|3[01])-[0-9]{2}
2F-S16 film: Only FILM-ISO-target, UNMATCHED, or folders named [12][0-9]{3}-(0[1-9]|1[0-2])-(0[1-9]|[12][0-9]|3[01])-[0-9]{2} are allowed
2F-S17 film: Existence of file with name: [avisID]-[batchID]-[filmSuffix].film.xml (batchID as in parent dir FilmNodeChecker, filmSuffix as in parent dir FilmNodeChecker) No other files/folders.
2F-S18 unmatched: Nodes in UNMATCHED must have format [avisID]-[filmID]-[0-9]{4}[A-Z]? where [avisID]-[filmID] is as found in the film metadata file for this film.
2F-S19 filmIsoTarget: nodes have form: [avisID]-[filmID]-ISO-[1-9] where [avisID]-[filmID] is as in film-xml of parent directory
2F-S20 filmIsoTargetFile: If there is a FILM-ISO-target folder, it must contain at least one file (node)
2F-S21 edition: folder name has form: [dato]-[udgaveLbNummer] i.e. [12][0-9]{3}-(0[1-9]|1[0-2])-(0[1-9]|[12][0-9]|3[01])-[0-9]{2}
2F-S22 edition: at least one node (i.e. newspaper page scan) must exist in edition folder
2F-S23 edition: a file exists with name [avisID]-[editionID].edition.xml where avisID is as in the film-xml and editionID is as in our parent folder name
2F-S24 edition: If there is an attribute (file) in the edition directory, it must have name [avisID]-[editionID].edition.xml where avisID is as in the film-xml and editionID is as in our parent folder name
2F-S25 editionPage: Any node not ending in .brik must have name of the form [avisID]-[editionID]-[0-9]{4}[A-Z]? where avisID is as in film-xml and editionID is as parent directory name
2F-S26 <functionality moved to Mfpak-checks>
2F-S27 <functionality moved to Mfpak-checks>
2F-S28 editionPage: Any node not ending in .brik must contain a .mods.xml attribute with name prefix as that of parent node
2F-S29 editionPage: Any node not ending in .brik must contain a .mix.xml attribute with name prefix as that of parent node
2F-S30 editionPage: Any node not ending in .brik must contain a .jp2 node with name prefix as that of parent node
2F-S31 editionPage: Any node not ending in .brik can only contain attributes ending in .mix.xml, .mods.xml, or .alto.xml
2F-S32 editionPage: Any node not ending in .brik can only contain nodes ending in .jp2 (no other nodes)
2F-S33 editionPage: For any node not ending in .brik, any sub-node must contain an attribute called "contents"
2F-S34 unmatchedPage: Any node in UNMATCHED must contain an attribute with name ending in .mix.xml
2F-S35 unmatchedPage: Any node in UNMATCHED must contain a node with name ending in .jp2
2F-S36 unmatchedPage: Any node in UNMATCHED can only contain attributes with names ending in .mix.xml, .mods.xml, or .alto.xml
2F-S37 unmatchedPage: Any node in UNMATCHED can only contain nodes with names ending in .jp2
2F-S38 unmatchedPage: Any node under a node in UNMATCHED must contain an attribute called "contents"
2F-S39 brik: Any node in an edition, with a name X ending in -brik must contain an attribute with name X.mix.xml
2F-S40 brik: Any node in an edition, with a name X ending in -brik must contain a node with name X.jp2
2F-S41 brik: For any node in an edition, with a name X ending in -brik, any contained attribute must have name X.mix.xml
2F-S42 brik: For any node in an edition, with a name X ending in -brik, any contained node must have name X.jp2
2F-S43 brik: For any node in an edition, with a name X ending in -brik, any contained node must contain an attribute called "contents"
2F-S44 filmIsoTargetScan: Any node in FILM-ISO-target with a name X must contain an attribute with name X.mix.xml
2F-S45 filmIsoTargetScan: Any node in FILM-ISO-target with a name X must contain a node with name X.jp2
2F-S46 filmIsoTargetScan: For any node in FILM-ISO-target with a name X, any contained attribute must have name X.mix.xml
2F-S47 filmIsoTargetScan: For any node in FILM-ISO-target with a name X, any contained node must have name X.jp2
2F-S48 filmIsoTargetScan: For any node in FILM-ISO-target with a name X, any contained node must contain an attribute called "contents"
2F-S49 checksumExistence: Every attribute (file) must have a checksum
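Several of the checks above are plain name-format checks. As an illustration only (the real checks are written in Schematron), the patterns quoted in 2F-S5, 2F-S13 and 2F-S21 can be tried out like this. The function names are hypothetical, and the sample batchID in the usage note is made up.

```python
import re

# Pattern from 2F-S5: names in WORKSHIFT-ISO-TARGET.
TARGET_NAME = re.compile(r"Target-[0-9]{6}-[0-9]{4}")

# Pattern from 2F-S15 / 2F-S21: edition folder name, [dato]-[udgaveLbNummer].
EDITION_NAME = re.compile(
    r"[12][0-9]{3}-(0[1-9]|1[0-2])-(0[1-9]|[12][0-9]|3[01])-[0-9]{2}"
)


def is_target_name(name):
    """True if the whole name matches Target-[0-9]{6}-[0-9]{4}."""
    return TARGET_NAME.fullmatch(name) is not None


def is_edition_name(name):
    """True if the whole name matches the edition folder pattern."""
    return EDITION_NAME.fullmatch(name) is not None


def is_film_folder_name(name, batch_id):
    """True if name is <batchID>-[0-9]{2} for the given batchID (2F-S13)."""
    pattern = re.escape(batch_id) + r"-[0-9]{2}"
    return re.fullmatch(pattern, name) is not None
```

For example, is_edition_name("1795-06-13-01") holds, while "1795-13-01-01" is rejected because 13 is not a valid month. Note that the pattern only constrains the shape of the date: a name such as "2013-02-31-01" matches even though no such calendar date exists.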
Sequence number checks (java)
2F-Q1 Page numbering: The page numbers within a film, across editions and the UNMATCHED folder, must be sequential without holes, starting with 0001.
2F-Q2 Partial page subnumbering: For each node (in an edition) that corresponds to a scanned page, if the [billedID] is followed by a letter, these letters must be sequential "within" the [billedID].
2F-Q3 Workshift-iso-target numbering: For a given targetSerialisedNumber, the billedID numbering (last number NNNN) must be sequential without holes and start with 1.
2F-Q4 Film-suffix numbering: Inside of a Batch node, the film sequence numbers (last number in filmID) must be without holes starting with 1.
2F-Q5 Edition numbering: For the edition nodes for a given date, the numbering (the number after the date) must be in sequence without holes starting with 1.
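All of the sequence checks above share the same core rule: "sequential without holes, starting with 1". The following is a minimal sketch of that rule, not the Java implementation; the function names are hypothetical, and the assumption that partial page letters start at "A" is ours, since the text only requires the letters to be sequential.

```python
def is_sequential(numbers, start=1):
    """True if numbers are exactly start, start+1, ... with no holes or repeats."""
    ordered = sorted(numbers)
    return ordered == list(range(start, start + len(ordered)))


def letters_sequential(letters):
    """True if letters are exactly A, B, C, ... (assumes the series starts at A)."""
    ordered = sorted(letters)
    expected = [chr(ord("A") + i) for i in range(len(ordered))]
    return ordered == expected
```

Page numbers such as "0001" would be converted with int() first, for example is_sequential(int(p) for p in ["0001", "0002", "0003"]). A missing number or a duplicate both make the check fail, and the order in which the numbers are supplied does not matter.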
MF-PAK related checks (java)
2F-M1 film: Validate that all film filenames contain the avisID from the MFPak database
2F-M2 batchNode: The batch contains the correct number of films
2F-M3 film/edition: Each film only contains editions from dates that are expected from MFPak
2F-M4 editionPage: Any node not ending in .brik must contain a .alto.xml attribute with name prefix as that of parent node (if option B1, B2, or B9 is "on")
2F-M5 editionPage: Any node not ending in .brik must not contain a .alto.xml attribute with name prefix as that of parent node (if all of options B1, B2, and B9 are "off")
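2F-M4 and 2F-M5 are two halves of one rule: an .alto.xml attribute is required if any of the options B1, B2 or B9 is on, and forbidden if all of them are off. The following is a minimal sketch of that decision, not the actual MFPak checker; the function name and the representation of options as a dict of booleans are assumptions.

```python
def alto_violation(has_alto, options):
    """Return the error number violated by a page node, or None if it is fine.

    has_alto: True if the node contains a .alto.xml attribute.
    options:  dict mapping option names ("B1", "B2", "B9") to True (on) or False (off).
    """
    alto_expected = any(options.get(name, False) for name in ("B1", "B2", "B9"))
    if alto_expected and not has_alto:
        return "2F-M4"  # ALTO required but missing
    if not alto_expected and has_alto:
        return "2F-M5"  # ALTO present but not ordered
    return None
```

An option missing from the dict is treated as off in this sketch.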
Other checks
2F-O1 Checksums: Check that all checksums are correct
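As an illustration of what "correct" means here, a checksum check recomputes the digest of the file contents and compares it with the recorded value. The text above does not name the checksum algorithm, so the default of MD5 in this sketch is an assumption, as is the function name.

```python
import hashlib


def checksum_matches(data, expected_hex, algorithm="md5"):
    """True if the digest of data equals the recorded hex checksum.

    algorithm defaults to "md5" as an assumption; the actual algorithm
    used for the batches is not stated in this document.
    """
    digest = hashlib.new(algorithm, data).hexdigest()
    return digest == expected_hex.strip().lower()
```

The comparison ignores surrounding whitespace and letter case in the recorded value, since checksum files often differ in those details.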