Description
Context
When an author has not been assigned an INSPIRE ID yet, the collaborations put all kinds of placeholders in the field corresponding to the ID, like None or ???, or leave it empty.
Current Behavior
Because of this, after extracting author information from the authors XML file, the record might be invalid, or some authors might be lacking an ID without us noticing.
Expected Behavior
The invalid authors are ignored, and an RT ticket is created with information about the record and the authors having invalid or missing IDs.
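A minimal sketch of the check, not the actual pipeline code. It assumes authors have already been extracted into dicts with hypothetical keys full_name and inspire_id, and that a valid ID is "INSPIRE-" followed by eight digits. The helper names and the ticket text are illustrative only, and actually creating the RT ticket is left out.

```python
import re

# Assumption: a valid INSPIRE ID is "INSPIRE-" followed by eight digits.
INSPIRE_ID_RE = re.compile(r"^INSPIRE-\d{8}$")


def split_authors_by_id(authors):
    """Split extracted authors into (valid, invalid) lists.

    Placeholders such as "None" or "???", empty strings and missing
    values all fail the pattern and land in the invalid list.
    The "inspire_id" key is a hypothetical name for this sketch.
    """
    valid, invalid = [], []
    for author in authors:
        raw_id = (author.get("inspire_id") or "").strip()
        if INSPIRE_ID_RE.match(raw_id):
            valid.append(author)
        else:
            invalid.append(author)
    return valid, invalid


def build_ticket_body(record_id, invalid_authors):
    """Build the text of a ticket listing the authors that were set aside.

    Only the message is built here; sending it to RT is out of scope.
    """
    lines = [
        "Record {}: authors with invalid or missing INSPIRE IDs".format(record_id)
    ]
    for author in invalid_authors:
        lines.append(
            "- {} (ID field: {!r})".format(
                author.get("full_name", "<unknown>"),
                author.get("inspire_id"),
            )
        )
    return "\n".join(lines)
```

The valid list would go on to the record, and a ticket would be opened only when the invalid list is not empty.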
Note
It might make sense to rewrite the authors XML extraction using parsel (the library powering scrapy XML parsing) and the SignatureBuilder instead of bolting this behavior on top of the current XSLT+dojson pipeline.