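What this PR does / why we need it: This PR improves scaling of requests to DataCite in two ways: waiting and retrying when DataCite fails or the rate limit is hit, and optionally checking DataCite's current metadata before sending an update.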
The former is fairly straightforward: rather than failing immediately, Dataverse will wait and temporarily slow its requests to see whether DataCite recovers or Dataverse drops below the rate limit. If things recover, Dataverse's operations will succeed. If not, there could be a delay of about 1 minute before a final error occurs and the operation fails.
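As a rough illustration of the wait-and-retry behavior described above, here is a minimal Python sketch. It is not the code in this PR: `TransientError`, `call_with_retry`, the attempt count, and the doubling delay are all hypothetical choices made for the example, and the actual limits Dataverse uses may differ.

```python
import time


class TransientError(Exception):
    """Hypothetical stand-in for a rate-limit response or temporary server failure."""


def call_with_retry(operation, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Run operation(), waiting and retrying on TransientError.

    The delay doubles after each failed attempt (base_delay, 2x, 4x, ...).
    If the last attempt also fails, the error is re-raised to the caller.
    All parameter values here are assumptions for illustration only.
    """
    for attempt in range(max_attempts):
        try:
            return operation()
        except TransientError:
            if attempt == max_attempts - 1:
                raise  # out of attempts: the operation finally fails
            sleep(base_delay * 2 ** attempt)
```

Passing `sleep` as a parameter keeps the sketch testable without real waiting; a caller would normally leave it at the default.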
The latter is perhaps more controversial (there was discussion several years ago about whether this is useful): instead of always sending an update, which causes DataCite to write the info, this optional change causes Dataverse to first query DataCite (a read) and only send an update if the local info differs from what DataCite has. In cases such as file DOIs, where changes are infrequent, this results in many reads and few writes instead of many writes at DataCite (and growing records, as they track all writes of new metadata), which appears to be faster. It may be generally useful, but installations not using file DOIs may not want to try it.
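The read-before-write idea can be sketched as follows. This is a simplified illustration, not the PR's implementation: `update_if_changed`, `fetch_remote`, and `send_update` are hypothetical names, and comparing metadata as plain dictionaries is an assumption made to keep the example short.

```python
def update_if_changed(doi, local_metadata, fetch_remote, send_update):
    """Send an update only when the remote copy differs from the local one.

    fetch_remote(doi) is a hypothetical read that returns the metadata the
    registry currently holds (or None); send_update(doi, metadata) is a
    hypothetical write. Returns True if a write was sent, False otherwise.
    """
    remote_metadata = fetch_remote(doi)  # cheap read
    if remote_metadata == local_metadata:
        return False  # nothing changed, so skip the write
    send_update(doi, local_metadata)  # write only when needed
    return True
```

When most records are unchanged, as with file DOIs, nearly every call ends at the read and no write is issued.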
Which issue(s) this PR closes:
Special notes for your reviewer: QDR had trouble publishing a dataset with >10K files before this change and succeeded after.
Suggestions on how to test this: Minimally regression test (w/ and w/o flag).
Could also attempt to create/publish a dataset with file DOIs and many files using the DataCite test server and see whether the changes increase the success rate or the largest size that succeeds, and/or improve performance (i.e., with the flag on). I'm not sure this is worth it given the testing/deployment at QDR.
Does this PR introduce a user interface change? If mockups are available, please link/include them here:
Is there a release notes update needed for this change?:
Additional documentation: