Skip to content

feat(vcs): support for new VCS integration (backend)#2284

Merged
zzacharo merged 1 commit intoinveniosoftware:masterfrom
palkerecsenyi:vcs-backend
Apr 2, 2026
Merged

feat(vcs): support for new VCS integration (backend)#2284
zzacharo merged 1 commit intoinveniosoftware:masterfrom
palkerecsenyi:vcs-backend

Conversation

@palkerecsenyi
Copy link
Copy Markdown
Member

@palkerecsenyi palkerecsenyi commented Mar 23, 2026

Closes inveniosoftware/invenio-vcs#2

This commit only contains the backend changes. The frontend changes are in #2283. Please merge these 2 PRs at the same time.

  • Refactored for generic VCS access, removing GitHub-specific code.
  • The filtering and computation of the license SPDX ID of a repository that was previously fully performed within invenio-rdm-records was very specific to GitHub, so this processing has also been moved to invenio-vcs. It can be implemented for each VCS in accordance with their respective peculiarities.
  • Added a Component to update the VCS release in the database when the record gets published in case the publish didn't already succeed.
  • Split the release publish process into 2 parts, so the draft gets saved first and then published. This way, even if the publish fails, the draft will be saved and visible to the user.
  • Added notifications on successful and unsuccessful publish, as well as when community approval is needed.
  • The GITHUB_... config variables have been renamed to VCS_...
  • The directory has been renamed from github to vcs (but the old github directory has been retained for now)
  • We've ensured that none of the VCS-related components will be imported by default, so everything works as expected even if invenio-vcs is not installed.
  • Some other random code cleanups

"""
self.release_processing()
# Commit state change, in case the publishing is stuck
db.session.commit()
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

in which cases can this happen, when publishing is stuck? I have a feeling this shouldn't be in a service layer?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure why we had this originally, it is from the existing code. I will investigate to see if we need it.

)
owner = last_record._record.parent.access.owner.resolve()

identity = _get_user_identity(owner)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

in this case, I understand it means that we assume the owner is publishing the next release... why not system? Technically it can be anyone that creates a release in a specific repository, I think system is more "accurate" than saying that the owner created it, no?
I am fine to leave it as is too, since I guess it is replicating the previous behaviour in invenio-github, I think it was good to mention as a discussion point

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a good point but I agree that we should not change this unless we absolutely have to. I guess for Zenodo it would be quite a big breaking change @slint? And also I'm not sure how the permissions to edit the record would work if we didn't set the repo's owner as the creator. In order to set DOIs and make other metadata changes, the owner needs to be able to edit the record.

Also to be clear, when we say the "owner" of a repository, we are referring to the Invenio user who last clicked the "enable" button on their repository settings page. Multiple Invenio users might be able to see the repo because they have access to it, but only the one who clicks the "enable" button will be the creator of records for that repo's releases. They will therefore also be the only one able to edit/delete the records unless they manually grant permissions.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Having an owner for a record in general is important, I can imagine many user expectation "breaking" for their published records if they can't edit/manage them. Also it's very difficult to understand who to contact in case of issues (take-downs, plagiarism, etc.).

And yes, this behavior currently is questionable as Pal mentioned, especially given that it could have been any person with admin access to the repo that could have enabled the integration... This alos becomes problematic when e.g. the person who enabled the repository leaves and thus should have to transfer over ownership somehow.

What would have been a better experience for VCS-sourced records is that we delegate permissions/access to the VCS claims of the repository, i.e. "if you're an admin of the repo, you can edit/manage the record as well".

With the new M2M tables from invenio-vcs I think it might actually be much easier now to implement the permission check (it's just PK/FK joins through record -> release -> repository -> owners). When eavaluating single record permissions (e.g. on the record landing page), that would be fast, but I worry what the behavior on search results would be...

@palkerecsenyi palkerecsenyi force-pushed the vcs-backend branch 4 times, most recently from 02a1042 to 06d0884 Compare April 1, 2026 14:00
This commit only contains the backend changes.

- Refactored for generic VCS access, removing GitHub-specific code.
- The filtering and computation of the license SPDX ID of a repository
that was previously fully performed within `invenio-rdm-records` was
very specific to GitHub, so this processing has also been moved to
`invenio-vcs`. It can be implemented for each VCS in accordance with
their respective peculiarities.
- Added a Component to update the VCS release in the database when the
record gets published in case the publish didn't already succeed.
- Split the release publish process into 2 parts, so the draft gets
saved first and then published. This way, even if the publish fails, the
draft will be saved and visible to the user.
- Added notifications on successful and unsuccessful publish, as well as
when community approval is needed.
- The `GITHUB_...` config variables have been renamed to `VCS_...`
- The directory has been renamed from `github` to `vcs` (but the old
`github` directory has been retained for now)
- We've ensured that none of the VCS-related components will be imported
by default, so everything works as expected even if `invenio-vcs` is not
installed.
- Some other random code cleanups
Invenio-GitHub have been retained. Please migrate to Invenio-VCS.
"""

# TODO: add link to migration docs.
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

shelve: document this to an issue so we dont forget :)

{
"person_or_org": {
"type": "personal",
"family_name": _("Unknown"),
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In which case this unknown? Creators are a required field...is it the same in GH API?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Basically if a repository doesn't have any commits or for another reason the GH/GL API returns an empty list of contributors. The contributors list is normally based on Git metadata (which is different from "collaborators" on GH) but if GL/GH for example don't recognise the email address of a Git contributor they might not include them in the list returned from the API. So this is a backup to make sure we don't throw an error if there is no contributor. But this case is extremely rare.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is it easy to raise and let the user edit the draft?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IRL to shelve it for now

# Create a review request for the repo's configured community ID if any
# If RDM_COMMUNITY_REQUIRED_TO_PUBLISH is true and no ID is provided, the publish will fail
# and the user will be sent a notification to manually assign a community.
current_rdm_records_service.review.create(
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are we using the same endpoint as for the case of "Managers can publich directly to community"?

@zzacharo zzacharo merged commit 14a0afc into inveniosoftware:master Apr 2, 2026
4 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Make invenio-github support other VCS providers

4 participants