Description
Expected behaviour
When uploading Dataverse files, Archivematica should be able to correctly parse the Dataverse METS XML and generate METS.xml documentation, when "Approve automatically" is checked.
Current behaviour
For datasets containing tabular data files, processing in Archivematica fails at the "Parse Dataverse METS XML" step.
Error Message:
type: 'Item' using path: originalFormatStata/originalFormatStatacitation-endnote.xml
FSEntry(type='Item', path='originalFormatRdata/originalFormatRdata.RData', use='original', label='originalFormatRdata.RData', file_uuid='8db9f23d-1c63-4787-9767-8297052524c4', checksum='9b44d38dcaffacbdef5b358806af222f', checksumtype='MD5', fileid='file-8db9f23d-1c63-4787-9767-8297052524c4')
Traceback (most recent call last):
  File "/usr/lib/archivematica/MCPClient/job.py", line 103, in JobContext
    yield
  File "/usr/lib/archivematica/MCPClient/clientScripts/parse_dataverse_mets.py", line 321, in call
    job.set_status(init_parse_dataverse_mets(job))
  File "/usr/lib/archivematica/MCPClient/clientScripts/parse_dataverse_mets.py", line 307, in init_parse_dataverse_mets
    return parse_dataverse_mets(job, transfer_dir, transfer_uuid)
  File "/usr/lib/archivematica/MCPClient/clientScripts/parse_dataverse_mets.py", line 291, in parse_dataverse_mets
    create_db_entries(job, mapping, agent)
  File "/usr/lib/archivematica/MCPClient/clientScripts/parse_dataverse_mets.py", line 182, in create_db_entries
    original_uuid = mapping[entry.derived_from].uuid
KeyError: FSEntry(type='Item', path='originalFormatRdata/originalFormatRdata.RData', use='original', label='originalFormatRdata.RData', file_uuid='8db9f23d-1c63-4787-9767-8297052524c4', checksum='9b44d38dcaffacbdef5b358806af222f', checksumtype='MD5', fileid='file-8db9f23d-1c63-4787-9767-8297052524c4')
Steps to reproduce
- Select Transfer Type - Dataverse
- Browse and select "Archivematica Test on Demo Dataverse"
- Choose sample test and upload
- Make sure Approve automatically is checked
- Error happens during Microservice: Parse external files - Job Parse Dataverse METS XML
Cause
This issue is related to what was mentioned in issue 269. During the extract_and_remove_bundle step, RData files are treated as files that need to be extracted, and the original RData file is deleted after extraction. As a result, parse_dataverse_mets.py fails with the KeyError shown above whenever a dataset contains an RData file, because the code can't find the deleted file in the database.
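The failure mode can be illustrated with a minimal, self-contained sketch. It is not the code in parse_dataverse_mets.py and not a proposed patch: Entry, FileRecord, find_original_uuid, and the derived file path are hypothetical stand-ins, assuming only that the mapping is keyed by METS entries and that a derived entry points back to its original through derived_from, as the traceback suggests.

```python
from collections import namedtuple

# Hypothetical stand-ins: in Archivematica the keys are FSEntry objects and
# the values are database file records. Only the lookup pattern matters here.
Entry = namedtuple("Entry", ["path", "derived_from"])
FileRecord = namedtuple("FileRecord", ["uuid"])


def find_original_uuid(mapping, entry):
    """Return the UUID of the file that `entry` derives from.

    Returns None when the entry has no original, or when the original has no
    record in `mapping` (for example, because it was deleted after extraction).
    """
    if entry.derived_from is None:
        return None
    original = mapping.get(entry.derived_from)
    if original is None:
        return None
    return original.uuid


# The original RData file is listed in the METS but has no database record.
original = Entry("originalFormatRdata/originalFormatRdata.RData", None)
# Hypothetical derived file name, used only for illustration.
derived = Entry("originalFormatRdata/derived-file.tab", original)
mapping = {derived: FileRecord("uuid-of-derived-file")}

# The unguarded lookup from the traceback raises KeyError in this situation.
try:
    mapping[derived.derived_from].uuid
    unguarded_failed = False
except KeyError:
    unguarded_failed = True

# The guarded lookup reports the missing original instead of raising.
guarded_result = find_original_uuid(mapping, derived)
```

Whether a missing original should be skipped, logged, or treated as a hard failure is a design decision for the maintainers. The sketch only shows that the lookup fails because the original file's key is absent from the mapping, not because the METS entry itself is malformed.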
Your environment (version of Archivematica, operating system, other relevant details)
Archivematica v1.16.0
Storage Service v0.22.0
For Artefactual use:
Before you close this issue, you must check off the following:
- All pull requests related to this issue are properly linked
- All pull requests related to this issue have been merged
- A testing plan for this issue has been implemented and passed (testing plan information should be included in the issue body or comments)
- Documentation regarding this issue has been written and merged (if applicable)
- Details about this issue have been added to the release notes (if applicable)