Decide which files to index based on dokumentobjekt.format too

At the moment the indexer decides which files to extract content from based on their file names.  This assumes something about the content of dokumentobjekt.referanseDokumentfil that is not specified in Noark 5, and I have run into extractions where the file names did not include file extensions.

It would be better if the values in dokumentobjekt.format were consulted in addition to the file suffixes.  According to Arkivverket, the values in this field are now standardized as PRONOM codes, so those values should at least be recognized.
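
To make the suggestion concrete, here is a rough sketch in Python of how the check could look. It is not based on the indexer's actual code: the function name should_index, the two lookup sets and the sample paths are all made up for illustration, and the PRONOM identifiers listed are only a small sample that should be verified against the PRONOM registry before use.

```python
from pathlib import PurePosixPath
from typing import Optional

# Hypothetical sample of PRONOM identifiers (PUIDs) treated as indexable.
# Assumed meanings, to be verified against the PRONOM registry:
# fmt/18 = PDF 1.4, fmt/95 = PDF/A-1a, fmt/354 = PDF/A-1b, x-fmt/111 = plain text.
INDEXABLE_PUIDS = {"fmt/18", "fmt/95", "fmt/354", "x-fmt/111"}

# Hypothetical suffix list, standing in for whatever the indexer checks today.
INDEXABLE_SUFFIXES = {".pdf", ".txt"}


def should_index(format_value: Optional[str], file_ref: Optional[str]) -> bool:
    """Return True if the file looks like one we can extract content from.

    format_value is the text of dokumentobjekt.format and file_ref is the
    text of dokumentobjekt.referanseDokumentfil; either may be missing.
    """
    # First consult dokumentobjekt.format, so that files without a file
    # extension can still be recognized.
    if format_value and format_value.strip().lower() in INDEXABLE_PUIDS:
        return True

    # Otherwise fall back to looking at the file suffix.
    if file_ref:
        # Normalize Windows-style separators before extracting the suffix.
        suffix = PurePosixPath(file_ref.replace("\\", "/")).suffix.lower()
        return suffix in INDEXABLE_SUFFIXES

    return False
```

With this approach, a dokumentobjekt with format "fmt/95" and a file reference like "dokumenter/000123" would be indexed even though the file name has no extension, while extractions that only carry older free-text format values would still be handled by the suffix check.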

Decide which files to index based on dokumentobjekt.format too #12

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Decide which files to index based on dokumentobjekt.format too #12

Description

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions