harvestInfo.performer in warcinfo records is not included in harvestInfo.xml

Description

In 5.1, harvestInfo.performer is written to the warcinfo records with an empty value, and it is omitted from harvestInfo.xml entirely.

#added by NetarchiveSuite Version: 5.1 (<a href="https://github.com/netarchivesuite/netarchivesuite/commit/cde61d78299cabccae6195908b81ef77c84a76b9">cde61d7829</a>)
harvestInfo.version: 0.5
harvestInfo.jobId: 20
harvestInfo.channel: CIBLEE
harvestInfo.harvestNum: 3
harvestInfo.origHarvestDefinitionID: 2
harvestInfo.maxBytesPerDomain: -1
harvestInfo.maxObjectsPerDomain: 50
harvestInfo.orderXMLName: default_NAS5_1_KLM
harvestInfo.origHarvestDefinitionName: test saa
harvestInfo.scheduleName: annuelle
harvestInfo.harvestFilenamePrefix: BnF-20-2
harvestInfo.jobSubmitDate: Mon Jul 25 13:08:46 CEST 2016
harvestInfo.performer:
harvestInfo.audience: champ public

<?xml version="1.0" encoding="UTF-8"?>
<harvestInfo>
<version>0.5</version>
<jobId>20</jobId>
<channel>CIBLEE</channel>
<harvestNum>3</harvestNum>
<origHarvestDefinitionID>2</origHarvestDefinitionID>
<maxBytesPerDomain>-1</maxBytesPerDomain>
<maxObjectsPerDomain>50</maxObjectsPerDomain>
<orderXMLName>default_NAS5_1_KLM</orderXMLName>
<origHarvestDefinitionName>test saa</origHarvestDefinitionName>
<origHarvestDefinitionComments>Collecte réalisée avec NAS 5.1 pour contrôler les WARC de données et de métadonnées.</origHarvestDefinitionComments>
<scheduleName>annuelle</scheduleName>
<harvestFilenamePrefix>BnF-20-2</harvestFilenamePrefix>
<jobSubmitDate>2016-07-25T11:08:46Z</jobSubmitDate>
<audience>champ public</audience>
</harvestInfo>
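
The mismatch between the two outputs above can be checked mechanically. The following is only a minimal sketch, assuming a shortened excerpt of the warcinfo payload and of harvestInfo.xml pasted in as strings; the helper names `warcinfo_fields` and `xml_fields` are hypothetical and not part of NetarchiveSuite.

```python
import xml.etree.ElementTree as ET

# Shortened excerpts of the two outputs quoted in this issue (assumption:
# only a few representative fields are kept to keep the sketch small).
WARCINFO_PAYLOAD = """\
harvestInfo.version: 0.5
harvestInfo.jobId: 20
harvestInfo.performer:
harvestInfo.audience: champ public
"""

HARVEST_INFO_XML = """\
<harvestInfo>
<version>0.5</version>
<jobId>20</jobId>
<audience>champ public</audience>
</harvestInfo>
"""

PREFIX = "harvestInfo."


def warcinfo_fields(payload):
    """Return {name: value} for every 'harvestInfo.<name>: <value>' line."""
    fields = {}
    for line in payload.splitlines():
        if not line.startswith(PREFIX):
            continue
        # Split on the first colon only, so values containing ':' survive.
        key, _, value = line.partition(":")
        fields[key[len(PREFIX):]] = value.strip()
    return fields


def xml_fields(xml_text):
    """Return {tag: text} for every direct child of the root element."""
    root = ET.fromstring(xml_text)
    return {child.tag: (child.text or "").strip() for child in root}


warc_side = warcinfo_fields(WARCINFO_PAYLOAD)
xml_side = xml_fields(HARVEST_INFO_XML)

# Fields present in the warcinfo payload but absent from harvestInfo.xml.
only_in_warcinfo = sorted(set(warc_side) - set(xml_side))
# Fields written to the warcinfo payload with an empty value.
empty_in_warcinfo = sorted(k for k, v in warc_side.items() if not v)

print("only in warcinfo:", only_in_warcinfo)
print("empty in warcinfo:", empty_in_warcinfo)
```

On this excerpt, both lists contain only `performer`, which is the inconsistency reported here.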

Activity

Sara Aubry, October 10, 2016 at 1:58 PM

Tested: if settings.harvester.performer is not declared, it appears neither in harvestInfo.xml nor in the warcinfo metadata of the data files.

Sara Aubry, September 28, 2016 at 6:55 AM

Great, we'll test it.

Sr, September 27, 2016 at 4:02 PM

NAS is now consistent. It no longer adds empty performer values to the warcInfo metadata.

Sr, September 21, 2016 at 1:50 PM

So you probably just need to override the empty settings value.

What we probably should do is avoid inserting an empty performer in the warcInfo metadata.

Sara Aubry, September 21, 2016 at 1:48 PM

Yes, we saw that by declaring a performer in the settings.
But to be consistent, if the performer is not declared in the settings, maybe the harvestInfo.performer should not be inserted with an empty value in the warcinfo records of the WARC data files. It is currently not inserted in the harvestInfo.xml.
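
The behavior requested here (omit the field when the setting is not declared, rather than writing it with an empty value) can be sketched as follows. This is an illustration only, not NetarchiveSuite's actual code: the function `harvest_info_lines` and the performer value `BnF` are hypothetical.

```python
def harvest_info_lines(fields):
    """Format fields as warcinfo payload lines, skipping undeclared values.

    `fields` maps a field name (without the 'harvestInfo.' prefix) to its
    value. A value of None or a blank string is treated as "not declared".
    """
    lines = []
    for name, value in fields.items():
        if value is None or not str(value).strip():
            continue  # undeclared setting: omit the field entirely
        lines.append(f"harvestInfo.{name}: {value}")
    return lines


# Performer declared in the settings (hypothetical value): it is written.
declared = harvest_info_lines(
    {"jobId": 20, "performer": "BnF", "audience": "champ public"}
)

# Performer left empty: no 'harvestInfo.performer:' line is produced,
# which matches what harvestInfo.xml already does.
undeclared = harvest_info_lines(
    {"jobId": 20, "performer": "", "audience": "champ public"}
)

print("\n".join(declared))
print("\n".join(undeclared))
```

Applying the same rule to both outputs keeps the warcinfo records and harvestInfo.xml in agreement whether or not a performer is configured.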

Fixed

Details

Organization

BNF


Created July 27, 2016 at 1:35 PM
Updated November 3, 2016 at 12:00 PM
Resolved October 19, 2016 at 7:43 AM