workflows: report malformed and missing IDs in authors XML files #3033

Open
@michamos

Description

Context

When an author has not been assigned an INSPIRE ID yet, collaborations put all kinds of placeholders in the ID field, such as None or ???, or leave it empty.

Current Behavior

Because of this, after author information is extracted from the authors XML file, the record might be invalid, or some authors might lack an ID without us noticing.

Expected Behavior

The invalid authors are ignored, and an RT ticket is created with information about the record and the authors having invalid or missing IDs.
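
To make the expected behavior concrete, here is a minimal sketch, not the actual pipeline code. It assumes a valid INSPIRE ID looks like INSPIRE- followed by 8 digits, and it uses a simplified, namespace-free XML layout with hypothetical element names (author, name, inspire_id) rather than the real authors XML schema. The helper names (classify_id, split_authors, ticket_body) and the record ID are made up for illustration, and any non-empty value that does not match the pattern is treated as malformed, placeholders like None included.

```python
import re
import xml.etree.ElementTree as ET

# Assumption: a valid INSPIRE ID is "INSPIRE-" followed by exactly 8 digits.
INSPIRE_ID_RE = re.compile(r"INSPIRE-\d{8}")

# Simplified, hypothetical stand-in for an authors XML file (no namespaces).
SAMPLE_XML = """
<authors>
  <author><name>Doe, Jane</name><inspire_id>INSPIRE-00012345</inspire_id></author>
  <author><name>Roe, Richard</name><inspire_id>None</inspire_id></author>
  <author><name>Poe, Pat</name><inspire_id>???</inspire_id></author>
  <author><name>Moe, Max</name><inspire_id></inspire_id></author>
  <author><name>Loe, Lee</name></author>
</authors>
"""


def classify_id(raw):
    """Return 'valid', 'missing' or 'malformed' for a raw ID string."""
    value = (raw or "").strip()
    if not value:
        return "missing"
    if INSPIRE_ID_RE.fullmatch(value):
        return "valid"
    return "malformed"


def split_authors(xml_text):
    """Separate authors with a valid ID from those that should be reported."""
    kept, rejected = [], []
    root = ET.fromstring(xml_text.strip())
    for author in root.iter("author"):
        name = author.findtext("name", default="").strip()
        raw_id = author.findtext("inspire_id")  # None if the element is absent
        problem = classify_id(raw_id)
        if problem == "valid":
            kept.append({"name": name, "inspire_id": raw_id.strip()})
        else:
            rejected.append({"name": name, "raw_id": raw_id, "problem": problem})
    return kept, rejected


def ticket_body(record_id, rejected):
    """Build the text of a ticket listing the ignored authors."""
    lines = ["Record {}: {} author(s) ignored".format(record_id, len(rejected))]
    for author in rejected:
        lines.append(
            "- {name}: {problem} ID (raw value: {raw_id!r})".format(**author)
        )
    return "\n".join(lines)


kept, rejected = split_authors(SAMPLE_XML)
print(ticket_body(123, rejected))
```

Only the authors in kept would go into the record. Creating the actual RT ticket is left out here, since it depends on how the workflow talks to RT.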

Note

It might make sense to rewrite the authors XML extraction using parsel (the library powering Scrapy's XML parsing) and the SignatureBuilder, instead of bolting this behavior on top of the current XSLT+dojson pipeline.

cc @hoc3426 @annetteholtkamp
