Outermost folder name:

  1. Form: B<batchid>-RT<roundtrip number>  BatchNodeChecker
  2. <batchid> is the expected batch ID  BatchNodeChecker
  3. <roundtrip number> increases over time (if available)
  4. Existence of workshift-iso-target BatchNodeChecker
  5. Existence of one folder per film ID
  6. No other files or folders BatchNodeChecker
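The first two naming rules for the outermost folder could be checked roughly as follows. This is only a sketch: the function name `check_batch_name` is hypothetical, and it assumes that both the batch ID and the roundtrip number are plain decimal digits, which this page does not state.

```python
import re

# Assumption: batch IDs and roundtrip numbers consist of decimal digits only.
BATCH_NAME = re.compile(r"B(?P<batch_id>[0-9]+)-RT(?P<roundtrip>[0-9]+)")


def check_batch_name(name, expected_batch_id):
    """Return a list of problems with a batch folder name (empty if valid)."""
    match = BATCH_NAME.fullmatch(name)
    if match is None:
        return [f"{name!r} does not have the form B<batchid>-RT<roundtrip number>"]
    if match.group("batch_id") != expected_batch_id:
        return [f"batch ID {match.group('batch_id')} is not the expected {expected_batch_id}"]
    return []
```

Checking that the roundtrip number increases over time needs knowledge of earlier deliveries and is left out here.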

 

 

Workshift-iso-target:

  1. Must have exactly this name BatchNodeChecker
  2. Existence of Target files
  3. No other files or folders

Target files:

...

TODO identifiers

 

batchNodeChecker: Form of name: B<batchID>-RT<Roundtrip>

batchNodeChecker: Existence of WORKSHIFT-ISO-TARGET

batchNodeChecker: All folders except WORKSHIFT-ISO-TARGET have form <batchID>-[0-9]{2}. No other files/folders.

workshiftIsoTargetChecker: Existence of nodes in WORKSHIFT-ISO-TARGET, i.e. Target-files

workshiftIsoTargetChecker: Names (nodes) in WORKSHIFT-ISO-TARGET must be of the right format: Target-[0-9]{6}-[0-9]{4}

workshiftIsoTargetChecker: No other files or folders
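As an illustration, the three workshiftIsoTargetChecker rules can be expressed over a plain list of child names. The function name and the list-of-strings input are assumptions made for this sketch; the real checker works on nodes.

```python
import re

# Pattern taken from the rule text: Target-[0-9]{6}-[0-9]{4}
TARGET_NAME = re.compile(r"Target-[0-9]{6}-[0-9]{4}")


def check_workshift_iso_target(child_names):
    """Return a list of problems for the children of WORKSHIFT-ISO-TARGET."""
    problems = []
    if not child_names:
        problems.append("WORKSHIFT-ISO-TARGET contains no Target nodes")
    for name in child_names:
        # Anything that is not a Target node counts as an "other file or folder".
        if TARGET_NAME.fullmatch(name) is None:
            problems.append(f"unexpected node {name!r}")
    return problems
```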

workshiftImageChecker: Form of names: Target-<targetSerialisedNumber>-<billedID>.(jp2|mix)

...

workshiftImageChecker: One mix-file per jp2-file

workshiftImageChecker: 6-digit targetSerialisedNumber

workshiftImageChecker: 4-digit

...

 

Film directories:

  1. Form: [batchId]-[filmSuffix] BatchNodeChecker
  2. <batchid> is the expected batch ID BatchNodeChecker
  3. 2-digit filmSuffix
  4. Consecutive filmSuffix
  5. TODO filmSuffix can possibly be checked against MF-PAK
  6. Possible existence of FILM-ISO-target FilmNodeChecker
  7. Possible existence of UNMATCHED FilmNodeChecker
  8. Existence of edition folders FilmNodeChecker
  9. No other files or folders FilmNodeChecker
  10. film.xml file FilmNodeChecker

Film.xml file

  1. Form: [avisID]-[batchID]-[filmSuffix].film.xml FilmNodeChecker
  2. [avisId] is as expected in MF-PAK
  3. batchID is as in parent dir FilmNodeChecker
  4. filmSuffix is as in parent dir FilmNodeChecker
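Because the film directory is itself named [batchId]-[filmSuffix], rules 1, 3 and 4 for the film.xml file reduce to a comparison with the parent directory name. The sketch below uses a hypothetical helper `check_film_xml_name`. It does not compare avisID with MF-PAK, since that requires an external lookup.

```python
def check_film_xml_name(film_dir_name, file_name):
    """Return True if file_name is '<avisID>-<film_dir_name>.film.xml'."""
    suffix = ".film.xml"
    if not file_name.endswith(suffix):
        return False
    stem = file_name[: -len(suffix)]
    # The stem must end with "-<batchID>-<filmSuffix>" exactly as in the
    # parent directory name, preceded by a non-empty avisID.
    tail = "-" + film_dir_name
    return stem.endswith(tail) and len(stem) > len(tail)
```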

 

FILM-ISO-target

  1. Exactly this name
  2. Existence of iso files
  3. No other files or folders

FILM-ISO-target files:

  1. Form: [filmID]-[batchID]-[filmSuffix]-ISO-[1-9].(jp2|mix)
  2. One mix file per jp2 file
  3. filmID, [batchID], <filmSuffix> as in the parent directory (filmID, however, as in film.xml in the parent directory)
  4. [1-9] consecutive from 1 (>10???)
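The pairing and numbering rules for FILM-ISO-target files might be sketched as below. The function name is hypothetical and the input is assumed to be a flat list of file names. The sketch pairs jp2 and mix files by ISO number only, does not compare the name prefix with the parent directory, and keeps the single-digit [1-9] limit from rule 1, so the open question about numbers above 9 is not resolved here.

```python
import re

# Form from rule 1, with the part before "-ISO-" left unchecked.
ISO_FILE = re.compile(r"(?P<prefix>.+)-ISO-(?P<number>[1-9])\.(?P<ext>jp2|mix)")


def check_film_iso_target(file_names):
    """Return a list of problems for the files in a FILM-ISO-target folder."""
    problems = []
    numbers_by_ext = {"jp2": set(), "mix": set()}
    for name in file_names:
        match = ISO_FILE.fullmatch(name)
        if match is None:
            problems.append(f"unexpected file {name!r}")
            continue
        numbers_by_ext[match.group("ext")].add(int(match.group("number")))
    # Rule 2: every jp2 file has a mix file with the same number, and vice versa.
    if numbers_by_ext["jp2"] != numbers_by_ext["mix"]:
        problems.append("jp2 and mix files do not pair up")
    # Rule 4: the ISO numbers run 1, 2, 3, ... without holes.
    numbers = sorted(numbers_by_ext["jp2"] | numbers_by_ext["mix"])
    if numbers != list(range(1, len(numbers) + 1)):
        problems.append("ISO numbers are not consecutive from 1")
    return problems
```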

 

UNMATCHED:

TODO!!!

 

Edition folder:

  1. Form: [date]-[udgaveLbNummer] FilmNodeChecker
  2. [date] must be ISO 8601 FilmNodeChecker
  3. <udgaveLbNummer> is consecutive, starting with 1 [Is this even right? FilmNodeChecker]
  4. [date] corresponds to information from MF-PAK
  5. Existence of edition file EditionNodeChecker
  6. Existence of page folders EditionNodeChecker
  7. Possible existence of brik folders EditionNodeChecker
  8. No other files or folders EditionNodeChecker
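A possible way to check the edition folder name is to combine a shape check with real calendar validation, because a pure regular expression would accept impossible dates such as 31 February. Both helper names are hypothetical. The two-digit width of udgaveLbNummer follows the editionChecker pattern listed further down this page. Treating "consecutive from 1" as applying per date is an assumption, and rule 3 itself is marked as uncertain.

```python
import re
from datetime import date

EDITION_NAME = re.compile(r"(?P<date>[0-9]{4}-[0-9]{2}-[0-9]{2})-(?P<number>[0-9]{2})")


def parse_edition_name(name):
    """Return (date, edition number) or None if the name is not valid."""
    match = EDITION_NAME.fullmatch(name)
    if match is None:
        return None
    try:
        edition_date = date.fromisoformat(match.group("date"))
    except ValueError:
        return None  # right shape, but not a real calendar date
    return edition_date, int(match.group("number"))


def numbers_consecutive_per_date(names):
    """Assumption: edition numbers restart at 1 for each date."""
    numbers_by_date = {}
    for name in names:
        parsed = parse_edition_name(name)
        if parsed is None:
            return False
        numbers_by_date.setdefault(parsed[0], []).append(parsed[1])
    return all(
        sorted(numbers) == list(range(1, len(numbers) + 1))
        for numbers in numbers_by_date.values()
    )
```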

Edition files:

  1. Form: [avisID]-[date]-[udgaveLbNummer].edition.xml EditionNodeChecker
  2. [avisID], [date], [udgaveLbNummer] as in the parent directory (avisID, however, as in film.xml in the parent directory) EditionNodeChecker

 

Page folders:

  1. Form: [avisID]-[date]-[udgaveLbNummer]-[billedID] EditionNodeChecker
  2. [avisID], [date], [udgaveLbNummer] as in the parent directory EditionNodeChecker / EditionPageNodeChecker
  3. 4-digit [billedID], possibly followed by consecutive letters EditionNodeChecker
  4. Existence of mods EditionPageNodeChecker
  5. Existence of mix EditionPageNodeChecker
  6. Existence of jp2 subfolder EditionPageNodeChecker
  7. Existence of alto files (in principle dependent on the selected options) EditionPageNodeChecker
  8. No other files or folders EditionPageNodeChecker

Page files:

TODO

 

Jp2 folders:

TODO

 

Brik folders:

TODO

Brik files:

TODO

 

Cross-cutting checks

Consecutive numbering of scanned newspaper pages.

Implemented in PageImageIDSequenceChecker.

Checks that the scanned pages are named in sequence, without holes and starting with 1. The sequence covers a full film, i.e. the UNMATCHED dir and all the edition dirs for a single film. The rules are:

  1. Sequence numbers are in the format NNNN, or NNNNA/NNNNB, NNNNA/NNNNB/NNNNC, ..., the latter in case of two or more pages on a single film image.
  2. The film image NNNN numbers are in sequence without holes or duplicates.
  3. For a single NNNN film image number, the letter postfixes are in sequence without holes, e.g. NNNNA, NNNNB.... Further more the at least to

Nodes not adhering to the naming standard are simply ignored, i.e. not considered relevant for the sequence numbering. The format check is considered to be the responsibility of another checker.
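The sequence rules can be illustrated with a small function over the image IDs collected from one film. This is a sketch, not the PageImageIDSequenceChecker implementation: the function name and the list-of-strings input are assumptions, and because the third rule is cut off in the text, only the letter ordering itself is enforced.

```python
import re
from collections import defaultdict

# NNNN, optionally followed by a single capital letter.
PAGE_ID = re.compile(r"([0-9]{4})([A-Z]?)")


def check_page_sequence(page_ids):
    """Return a list of problems for the image IDs of a single film."""
    letters_by_number = defaultdict(list)
    for page_id in page_ids:
        match = PAGE_ID.fullmatch(page_id)
        if match is None:
            continue  # non-conforming names are ignored, as described above
        letters_by_number[int(match.group(1))].append(match.group(2))

    problems = []
    numbers = sorted(letters_by_number)
    if numbers != list(range(1, len(numbers) + 1)):
        problems.append("film image numbers are not consecutive from 1")
    for number in numbers:
        letters = sorted(letters_by_number[number])
        if letters == [""]:
            continue  # a single page without letter postfix
        # Otherwise the postfixes must be exactly A, B, C, ... with no
        # holes, no duplicates and no unlettered page mixed in.
        expected = [chr(ord("A") + i) for i in range(len(letters))]
        if letters != expected:
            problems.append(f"bad letter sequence for film image {number:04d}")
    return problems
```

Collecting the counts per image number first keeps the check independent of the order in which the UNMATCHED and edition folders are traversed.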

workshiftImageChecker: There must exist a file in each WORKSHIFT-ISO-TARGET/Target-[0-9]{6}-[0-9]{4} called Target-[0-9]{6}-[0-9]{4}.mix.xml

workshiftImageChecker: There must exist a jp2-node in each WORKSHIFT-ISO-TARGET/Target-[0-9]{6}-[0-9]{4} called Target-[0-9]{6}-[0-9]{4}.jp2 containing a contents attribute

filmChecker: Any folder in BATCH not called WORKSHIFT-ISO-TARGET must have name of format <batchID>-[0-9]{2} (a FILM folder) with batchID as in BATCH folder

filmChecker: Existence of film.xml

filmChecker: Existence of edition-folder(s) with name of form [12][0-9]{3}-(0[1-9]|1[0-2])-(0[1-9]|[12][0-9]|3[01])-[0-9]{2}

filmChecker: Only existence of FILM-ISO-target, UNMATCHED, or [12][0-9]{3}-(0[1-9]|1[0-2])-(0[1-9]|[12][0-9]|3[01])-[0-9]{2} are allowed
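The two folder rules for filmChecker reuse the same edition-name pattern. A minimal sketch, assuming the child folders of a FILM folder are available as a list of names (the function name is hypothetical):

```python
import re

# Edition folder pattern copied from the filmChecker rules.
EDITION = re.compile(
    r"[12][0-9]{3}-(0[1-9]|1[0-2])-(0[1-9]|[12][0-9]|3[01])-[0-9]{2}"
)
SPECIAL_FOLDERS = {"FILM-ISO-target", "UNMATCHED"}


def check_film_children(folder_names):
    """Return a list of problems for the child folders of a FILM folder."""
    problems = []
    if not any(EDITION.fullmatch(name) for name in folder_names):
        problems.append("no edition folder found")
    for name in folder_names:
        if name not in SPECIAL_FOLDERS and EDITION.fullmatch(name) is None:
            problems.append(f"folder {name!r} is not allowed")
    return problems
```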

filmChecker: Existence of file with name: [avisID]-[batchID]-[filmSuffix].film.xml (batchID as in parent dir FilmNodeChecker, filmSuffix as in parent dir FilmNodeChecker). No other files/folders.

unmatchedChecker: Nodes in UNMATCHED must have format [avisID]-[filmID]-[0-9]{4}[A-Z]? where [avisID]-[filmID] is as found in the film metadata file for this film.

filmIsoTargetChecker: nodes have form: [avisID]-[filmID]-ISO-[1-9] where [avisID]-[filmID] is as in film-xml of parent directory

filmIsoTargetFileChecker: If there is a FILM-ISO-target folder, it must contain at least one file (node)

editionChecker: folder name has form: [dato]-[udgaveLbNummer] i.e. [12][0-9]{3}-(0[1-9]|1[0-2])-(0[1-9]|[12][0-9]|3[01])-[0-9]{2}

editionChecker: at least one node (i.e. newspaper page scan) must exist in edition folder

editionChecker: a file exists with name [avisID]-[editionID].edition.xml where avisID is as in the film-xml and editionID is as in our parent folder name

editionChecker: If there is an attribute (file) in the edition directory, it must have name [avisID]-[editionID].edition.xml where avisID is as in the film-xml and editionID is as in our parent folder name

editionPageChecker: Any node not ending in .brik must have name of the form [avisID]-[editionID]-[0-9]{4}[A-Z]? where avisID is as in film-xml and editionID is as parent directory name

editionPageChecker: Any node not ending in .brik must contain a .alto.xml attribute with name prefix as that of parent node (if the altoFlag is set)

editionPageChecker: Any node not ending in .brik must not contain a .alto.xml attribute with name prefix as that of parent node (if the altoFlag is not set)

editionPageChecker: Any node not ending in .brik must contain a .mods.xml attribute with name prefix as that of parent node

editionPageChecker: Any node not ending in .brik must contain a .mix.xml attribute with name prefix as that of parent node

editionPageChecker: Any node not ending in .brik must contain a .jp2 node with name prefix as that of parent node

editionPageChecker: Any node not ending in .brik can only contain attributes ending in .mix.xml, .mods.xml, or .alto.xml

editionPageChecker: Any node not ending in .brik can only contain nodes ending in .jp2 (no other nodes)

editionPageChecker: For any node not ending in .brik, any sub-node must contain an attribute called "contents"
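To show how the editionPageChecker rules on attributes and sub-nodes fit together, here is a sketch over a hypothetical in-memory `Node` type. The type, the function name and the `alto_flag` parameter are illustrative assumptions and not the project's API. The sketch is slightly stricter than the rule text, since it only accepts attributes that carry the page node's own name as prefix. The name-format rule is left out.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical tree node: attributes are files, children are sub-nodes."""
    name: str
    attributes: list = field(default_factory=list)
    children: list = field(default_factory=list)


def check_page_node(page, alto_flag):
    """Return a list of problems for one page node (a node that is not a brik)."""
    problems = []
    required = {page.name + ".mods.xml", page.name + ".mix.xml"}
    if alto_flag:
        required.add(page.name + ".alto.xml")
    for attribute in sorted(required - set(page.attributes)):
        problems.append(f"missing attribute {attribute}")
    for attribute in page.attributes:
        # Also rejects an .alto.xml attribute when the alto flag is not set.
        if attribute not in required:
            problems.append(f"unexpected attribute {attribute}")
    if page.name + ".jp2" not in [child.name for child in page.children]:
        problems.append(f"missing node {page.name}.jp2")
    for child in page.children:
        if not child.name.endswith(".jp2"):
            problems.append(f"unexpected node {child.name}")
        if "contents" not in child.attributes:
            problems.append(f"node {child.name} has no contents attribute")
    return problems
```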

unmatchedPageChecker: Any node in UNMATCHED must contain an attribute with name ending in .mix.xml

unmatchedPageChecker: Any node in UNMATCHED must contain an attribute with name ending in .jp2

unmatchedPageChecker: Any node in UNMATCHED can only contain attributes with names ending in .mix.xml, .mods.xml, or .alto.xml

unmatchedPageChecker: Any node in UNMATCHED can only contain nodes with names ending in .jp2

unmatchedPageChecker: Any node under a node in UNMATCHED must contain an attribute called "contents"

brikChecker: Any node in an edition, with a name X ending in -brik must contain an attribute with name X.mix.xml

brikChecker: Any node in an edition, with a name X ending in -brik must contain a node with name X.jp2

brikChecker: For any node in an edition, with a name X ending in -brik, any contained attribute must have name X.mix.xml

brikChecker: For any node in an edition, with a name X ending in -brik, any contained node must have name X.jp2

brikChecker: For any node in an edition, with a name X ending in -brik, any contained node must contain an attribute called "contents"

filmIsoTargetScanChecker: Any node in FILM-ISO-target with a name X must contain an attribute with name X.mix.xml

filmIsoTargetScanChecker: Any node in FILM-ISO-target with a name X must contain a node with name X.jp2

filmIsoTargetScanChecker: For any node in FILM-ISO-target with a name X, any contained attribute must have name X.mix.xml

filmIsoTargetScanChecker: For any node in FILM-ISO-target with a name X, any contained node must have name X.jp2

filmIsoTargetScanChecker: For any node in FILM-ISO-target with a name X, any contained node must contain an attribute called "contents"

checksumExistenceChecker: Every attribute (file) must have a checksum
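The checksum rule is a plain traversal of the whole tree. The sketch below assumes a hypothetical dictionary representation in which each node has a `name`, an optional `attributes` mapping from attribute name to checksum (or `None`), and an optional `children` list. None of these names come from the project.

```python
def attributes_missing_checksum(node, parent_path=""):
    """Return the paths of all attributes under `node` that lack a checksum."""
    path = f"{parent_path}/{node['name']}"
    missing = [
        f"{path}/{attribute_name}"
        for attribute_name, checksum in node.get("attributes", {}).items()
        if not checksum  # None or empty string both count as missing
    ]
    for child in node.get("children", []):
        missing.extend(attributes_missing_checksum(child, path))
    return missing
```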