-
Notifications
You must be signed in to change notification settings - Fork 69
Author interface for ROR, CRediT, and ORCiD in submission workflow #4697
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
@joemull can you check you've pushed everything? I think there are some missing imports on |
I don't get any import errors @ajrbyers--was it on server start or on loading a page? |
Check
|
Ah, right you are. I put the author search functionality in that file at the last minute. To guard against simple things like this, I should probably write quick view tests for each branch of the view's POST logic, eh? |
We did a demo of this in the design meeting last week. Here are notes from that.
|
My other todos when this comes back from review:
|
@ajrbyers the change to the credit role display you requested is done now. Also forgot to mention yesterday that I could not reproduce the error you found with updating your affiliations from ORCID. It works by querying the public ORCID API using a cleaned version of the ORCID of the request user. Anything you were doing that would trip that up? Obviously in the editor UI I'll have to make it possible for an editor to bulk update someone else's affils, and to do so for a frozen author. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for this - looks good. Just one final thing from me: when you edit your own details although the next
querystring is set there is no way to get back from the edit profile page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added a few comments around the changes to the orcid integration and potential impact over authentication. Happy to follow your leads in regards to interface and flow
int((orcid_end.get('year', {}) or {}).get('value', 1)), | ||
int((orcid_end.get('month', {}) or {}).get('value', 1)), | ||
int((orcid_end.get('day', {}) or {}).get('value', 1)), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do we need to handle the case of explicit None
like for the other cases? (e.g {'year': {'value': None}}
)
# that matches an email address in a public ORCiD record. | ||
if not new_author and emails: | ||
for email in emails: | ||
candidates = core_models.Account.objects.filter(email=email) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
emails are unique, why do we avoid a .get() call
for email in emails: | ||
candidates = core_models.Account.objects.filter(email=email) | ||
if candidates.exists(): | ||
candidates.update(orcid=cleaned_orcid) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we want to do this? A user could have multiple accounts in Janeway. A broad update like this could lead to multiple of the user accounts to be associated to the same ORCID, which can cause problems during authentication. We should either
- Avoid tampering accounts of co-authors
- Make orcid unique
- Tweak our authentication so that it can gracefully handle multiple accounts with the same ORCID
I suspect 3. is the ideal solution whilst 1. might be what we want now given time constraints.
There is also the fact that we are trusting ORCID to confirm and validate public email addresses. An attacker that can list a third party email address on their public orcid profile would be able to steal the same account in Janeway using this method.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, it's a good question.
I think the best option is to check for an account that already has that ORCID in JW. If it exists, use it. This function should be doing that anyway. It would avoid a situation where multiple accounts are affiliated with the same ORCID.
If there is no match, then I still do think we should check the ORCID record's email addresses against email addresses in Janeway, and update the first account found with the ORCID.
The reason is: this is what we do during ORCID login--I was replicating the behavior we have already used for a few years there:
Lines 180 to 185 in e73827d
for email in orcid_details.get("emails"): | |
candidates = models.Account.objects.filter(email=email) | |
if candidates.exists(): | |
# Store ORCID for future authentication requests | |
candidates.update(orcid=orcid_id) | |
login(request, candidates.first()) |
If we don't link existing accounts, it increases the likelihood of duplicate accounts, which is a big headache for usability. The alternative I see is to use frozen authors only for co-authors during this step, as you suggest below, but that opens a few cans of worms. See my response below on that.
As for the security risk you mention, yes, we are trusting ORCID. They say: "Unverified emails do not display in the public view of the record and only email addresses that have been verified will appear in API results."
If we don't want to trust them, we need to change the login with ORCID flow as well, because the same risk applies there. Do you want to do that?
def add_author_from_search(search_term, request, article): | ||
query = Q(email=search_term) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
for my own sanity, can we split this function? There are just too many branches. At a first step we want to check if the search term is either an Orcid, an email or something else. Then we delegate to add_author_via_email
and add_author_via_orcid
Also try to keep all updates to a record in the same area of the codebase.
form.cleaned_data.get('orcid'), | ||
'0000-0003-2126-266X', | ||
) | ||
clean_orcid_id('Mauro-sfak-orci-dtst') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🕺
if not affils: | ||
affils.extend( | ||
[affil for affil in summary["educations"]["education-summary"]] | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It makes sense for get_affiliations to return these, but I think we are adding them automatically to the user when they are added via orcid, do we want that?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I was replicating the logic we used to use for a single affiliation. Do you think it would be unexpected for all your education affils to be pulled through if you are a student? Too much data? Get just the last one?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ajrbyers thoughts?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think grabbing the latest one is okay. From the if not affils
I presume it only runs when there are no other affiliations to get.
if emails: | ||
email = emails[0] | ||
else: | ||
email = generate_dummy_email(orcid_details) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We are not flagging the account as active, so they won't be able to login with orcid.
And the email address is fake, so they won't be able to activate it either. I reckon for these authors (no email address) we should NOT create an account, only a FrozenAuthor
record.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
.
Thanks @mauromsl, see my comments above. GitHub didn't give me a "Reply" text box on your last comment for some reason, so I'll reply here:
Yeah, that is a good point. I had missed that it's the ORCID registration view in core, not the login one, that makes them active: Lines 336 to 340 in e73827d
As for whether to create frozen authors here, I originally was headed in that direction, because I fundamentally don't like the feature of Janeway where a non-trusted party can create an account for someone else. But if we were to use frozen authors here, that raises all kinds of questions about how you'd re-link records to authors, and the view during author submission would have to handle the user's expectations about their profile data matching their frozen author data. Let's discuss offline. |
Closes #4519.
Closes #3111.
Fixes #1485.
This follows the proposed design of #4519 as closely as possible. The ORCiD-based flow works really well, CRediT roles are easy to add, it's easy to change the correspondence author, and authors are easy to re-order.
However, there's a bit more work to do around editing co-author affiliations. We may need to move straight on to #145, which was out of scope for this PR but is a natural way to solve the problem of allowing maximum control over affiliation data, while not introducing security risks where bad actors could edit any user's affiliations.