Skip to content

Conversation

@bernt-matthias
Copy link
Contributor

@bernt-matthias bernt-matthias commented Mar 20, 2025

Tested this manually here: https://github.com/Helmholtz-UFZ/ufz-galaxy-tools

I also have code for parsing the tool versions from all the tool revisons (get_all_versions) tested this. In all cases it was unbumped tool versions.

Copy link
Contributor Author

@bernt-matthias bernt-matthias left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would suggest to make a (manual?) test run. But since after the fix step there is another lint step before the actual upate not much could got wrong... I guess :)

if nxt:
if nxt in tool["revisions"]:
print(f"remove {cur} in favor of {nxt} {name} {owner}")
remove.append(cur)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The script will remove an uninstallable revision from the list if the next installable one is already in it.

@bgruening
Copy link
Member

Hi guys, I lost a but track here. I'm happy to merge it and keep the script here, but please feel also free to create a run-PR of this script so we see the impact and to clean the repo.

Thanks

@martenson
Copy link

martenson commented Apr 7, 2025

@bernt-matthias I think this is close to perfect, I played with it and opened bernt-matthias#1 to fix the process of adding the next installable revision (and there is also some polish if you want), but apart from that this seems to work great.

@bernt-matthias
Copy link
Contributor Author

@bernt-matthias I think this is close to perfect, I played with it and opened bernt-matthias#1 to fix the process of adding the next installable revision (and there is also some polish if you want), but apart from that this seems to work great.

Hi @martenson .. Also have some improvements nearly ready. Will merge with your efforts and push soon.

@martenson
Copy link

@bernt-matthias looks awesome, I'll test some more tomorrow. I think renaming of the script could improve its appeal: uninstallable is not the same as non-installable. Locally I have it as fix_outdated but tbh any other name would be better. :)

@bernt-matthias
Copy link
Contributor Author

Two more comments:

  • It seems that there can be uninstallable revisions that are not due to having the same version as a later revision (e.g. flye from bgruening de24438c9988 where the next installable revision would be 8d4f03b5fe9d, graphclust_align_cluster has no installable revisions). No idea what is the reason for this. But I changed the code such that there is only a warning in such cases.
  • I found at least one case on eu where the current revision was still installed. The repo info from the API contains some error, but the repo is still reported as installed, I guess that the update to the newer revision (with the same version) failed. Also only logging in these cases now.

TODO

  • Should we change CI such that only for trusted repos revisions are added?
  • For production we should cleanup the temporary clones of the TS repos

Currently running this on this repos. Will need a bit.

@martenson
Copy link

martenson commented Apr 7, 2025

It seems that there can be uninstallable revisions that are not due to having the same version as a later revision (e.g. flye from bgruening de24438c9988 where the next installable revision would be 8d4f03b5fe9d

Well that seems like a toolshed bug :/

graphclust_align_cluster has no installable revisions

This tool seems pretty broken, the xml is likely invalid but maybe it was not before?

I think in all these cases we shouldn't modify the lockfile. Maybe we could create an issue for each and try to track down the problems, if there are any takers.

@martenson
Copy link

for the latest version I've also encountered this:

Traceback (most recent call last):
  File "/Users/marten/devel/git/galaxy_tools/scripts/fix_outdated.py", line 196, in <module>
    fix_outdated(args.lockfile.name, args.toolshed, args.galaxy_url)
  File "/Users/marten/devel/git/galaxy_tools/scripts/fix_outdated.py", line 124, in fix_outdated
    all_revisions = get_all_revisions(toolshed_url, name, owner)
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/marten/devel/git/galaxy_tools/scripts/fix_outdated.py", line 67, in get_all_revisions
    assert proc.returncode == 0, f"failed {' '.join(cmd)} in {repo_path}"
           ^^^^^^^^^^^^^^^^^^^^
AssertionError: failed hg update tip in /tmp/repos/toolshed.g2.bx.psu.edu-galaxyp-cardinal_combine

the entry looks like this:

- name: cardinal_combine
  owner: galaxyp
  revisions:
  - 2b1868ab3724
  - a9a28e46d54a
  - bf06d196e23d
  tool_panel_section_id: proteomics
  tool_panel_section_label: Proteomics

so in my understanding this should just remove the 2b1868ab3724, but hg fails with the above

@bernt-matthias
Copy link
Contributor Author

So far, for me, deleting the temporary directory for this repo solved the problem (so maybe temporary problem with the remote?). Wondering if we should include stderr, stdout here.

@martenson
Copy link

the specific repo seems broken :/

$ hg clone https://toolshed.g2.bx.psu.edu/repos/galaxyp/cardinal_combine
destination directory: cardinal_combine
requesting all changes
adding changesets
adding manifests
adding file changes
added 21 changesets with 769 changes to 260 files
new changesets 93be1d20b5c3:a9a28e46d54a
updating to branch default
abort: case-folding collision between test-data/cluster_skm.RData and test-data/cluster_skm.rdata

@bernt-matthias
Copy link
Contributor Author

Works for me

hg clone https://toolshed.g2.bx.psu.edu/repos/galaxyp/cardinal_combine
destination directory: cardinal_combine
requesting all changes
adding changesets
adding manifests
adding file changes
added 21 changesets with 769 changes to 260 files                                                                                                                     
new changesets 93be1d20b5c3:a9a28e46d54a
updating to branch default
218 files updated, 0 files merged, 0 files removed, 0 files unresolved

@martenson
Copy link

martenson commented Apr 8, 2025

yeah, case-sensitive vs case-insensitive filesystem :/

xref galaxyproteomics/tools-galaxyp#791

@bernt-matthias
Copy link
Contributor Author

To get an idea I added the changes to the .lock files and the log output (.fix).

Do not merge yet. Have another idea...

@bernt-matthias bernt-matthias marked this pull request as draft April 8, 2025 22:04
@martenson
Copy link

I propose showing the changes after run in a separate PR, it makes this PR hard to approach to everyone else

@bernt-matthias
Copy link
Contributor Author

My idea/suggestion is the following: we could have an additional much simpler step that just determines the installed revisions of a given Galaxy instance. Then we just remove everything from the lock files that is not installed. Should be super simple to do.

Should achieve similar results, but with much less resources: This script downloads GBs from the TS and just takes a lot of time -- at least for the first run - I guess subsequent runs will be much faster.

@martenson
Copy link

martenson commented Apr 9, 2025

@bernt-matthias I understand the urge to optimize but, as you said, to me this is basically a one-off cloning spree triggered only during the first run. Subsequent runs will have one repo to clone at most.

@bernt-matthias
Copy link
Contributor Author

@bernt-matthias I understand the urge to optimize but, as you said, to me this is basically a one-off cloning spree triggered only during the first run. Subsequent runs will have one repo to clone at most.

Sure, but my main motivation would be that its much simpler, i.e. makes less assumptions on things that are "not well documented" and understood by only a few people.

@martenson
Copy link

I think that is a fair take.

It likely does not cover the situation where you actually want to update the revision of the repo that did not bump tool version and maybe other corner cases, but likely does majority of the cleanup we want.

@arash77
Copy link
Contributor

arash77 commented Oct 17, 2025

Isn't it possible to get the data from ts.repositories.get_repository_revision_install_info(name, owner, revision) instead of cloning it?

@bernt-matthias
Copy link
Contributor Author

Isn't it possible to get the data

Which data?

@arash77
Copy link
Contributor

arash77 commented Oct 17, 2025

Which data?

repository metadata (all tool info, including versions)

For example, this has a lot of information, including the valid_tools that might be useful for our case:
https://testtoolshed.g2.bx.psu.edu/api/repositories/get_repository_revision_install_info?name=multiqc&owner=iuc&changeset_revision=08188e2202aa

@bernt-matthias
Copy link
Contributor Author

I see: the tool version could be extracted from the guid. This could then replace get_all_versions. How about get_all_revisions?

@arash77
Copy link
Contributor

arash77 commented Oct 17, 2025

I see: the tool version could be extracted from the guid. This could then replace get_all_versions. How about get_all_revisions?

Why do we need all of them?
We already have the revisions in the lock files and the ordered_installable_revisions as well. Shouldn’t that be enough?

@bernt-matthias
Copy link
Contributor Author

Could be impossible to get_next if the list of revisions is incomplete?

@arash77
Copy link
Contributor

arash77 commented Oct 17, 2025

Could be impossible to get_next if the list of revisions is incomplete?

Do we need this?
From what I understand, if we have only one installable revision for each version, Galaxy installs the latest revision automatically anyway, right?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants