...
Tool | Purpose | Denmark | France | Austria | Spain | Sweden | ||||
---|---|---|---|---|---|---|---|---|---|---|
DeployApplication | Creates deploy scripts from a deploy-config | y | ||||||||
HarvestdatabaseUpdateApplication | Updates HarvestDB schema | y | ||||||||
BuildCompleteSettings | Merges the module settings files in NAS into one large global default settings file. Run as part of the release process. | y | ||||||||
GetFile | Retrieves a file via the ArcRepository interface | |||||||||
GetRecord | Retrieves a (w)arc-record via the ArcRepository interface | |||||||||
LoadDatabaseChecksumArchive | Migration tool from file-based checksums to database-based checksums | |||||||||
ReestablishAdminDatabase | Reestablishes the admin database from an 'admin.data' file | |||||||||
RunBatch | Runs a batch job from the command line | |||||||||
Upload | Uploads a file to the ArcRepository from the command line. (Handy for test data.) | y | ||||||||
ReestablishAdminDatabase | Should be deprecated. Reads an old admin.data file. | |||||||||
ClassDependencies | Non NAS Utility (license is not ours) | |||||||||
CreateIndex | CLI to talk to IndexServer via IndexClient | |||||||||
RunChecksum | CLI to get all checksums from a Bitarchive (deprecated) | |||||||||
SendDedupIndexRequestToIndexserver | Asynchronously starts a dedup indexing on an IndexServer and then exits. Tue Hejlskov Larsen, is this what you use to generate deduplication indexes? | |||||||||
MakeIndex | Runs a CDX extraction on a single file in a remote ArcRepository | |||||||||
FindRelevantCrawllogLines | Finds crawl-log lines matching a given domain name in a local metadata file | |||||||||
JMXProxy | "This tool will simply reregister all MBeans that matches the given query from the JMX hosts read in settings, using its own platformmbeanserver. It will then wait forever." | |||||||||
DeduplicateToCDXApplication | Extracts CDX records for deduplicate annotations from a local crawl log file | |||||||||
ResetFailedFiles | Utility for WaybackIndexer to reset files that have failed more than 3 times so they can be retried | |||||||||
ARCReaderUtils | Splits an arcfile (not warc) and dumps results to a directory | |||||||||
ArcWrap | Creates an arcfile by wrapping a file | |||||||||
ExtractCDX | Extracts CDX records, unsorted, from a list of local input arcfiles (not warcs) | |||||||||
JMSBroker | Checks that a JMS broker (as specified in NAS settings) is up and running. | |||||||||
WriteBytesToFile | Just creates large files full of null bytes | |||||||||
FTPValidator SimpleCmdlineTool | Tests whether an FTP server configuration in a NAS settings file points to a NAS-compliant FTP server. | |||||||||
ArcMerge | Merges several arcfiles into one arcfile | |||||||||
ArchiveExtractCDX | Extracts CDX records, unsorted, from a list of local input (w)arcfiles | |||||||||
WARCExtractCDX | Extracts CDX records, unsorted, from a list of local input warcfiles | |||||||||
ReformatTranslationFile | i) Reorders a translation file so keys are in the same order as a reference file, and ii) allows the encoding of the output file to be changed | |||||||||
DigestIndexer | ||||||||||
SchedulerDatabaseBuilder | ||||||||||
Heritrix3ControllerTest | ||||||||||
H3LaunchTest | ||||||||||
MailValidator | Checks the validity of a mail-server configured in NAS settings by sending a test-mail | |||||||||
MakeNewMetadataFile | Creates a metadata file. For use when postprocessing fails. Is this used? | |||||||||
FindDomainsForCrawllogExtraction | ? | |||||||||
CheckDuplicateReduction | Validates deduplication by comparing a crawl log with a collection of arcfiles (not warc) | |||||||||
StandaloneApplicationReduced | Creates a standalone NetarchiveSuite in a single JVM | |||||||||
MigrateDefaultHarvestDatabase | This just initialises a SiteSection object which is supposed to upgrade the harvest database as a side-effect | |||||||||
CreateCDXMetadataFile | Complex tool that takes a set of filenames, runs a batch job to extract the CDX records from each file, and packs them into a metadata arc or warc file, one record per input file | |||||||||
HarvesterQueueControl | ||||||||||
HarvestDatabaseValidator | ||||||||||
HarvestTemplateApplication | ||||||||||
CheckDomainCrawltraps | ||||||||||
CheckTrapsInFile |
...