103 changes: 103 additions & 0 deletions .github/workflows/fix-outdated-tools.yml
@@ -0,0 +1,103 @@
name: Fix Outdated Tools

on:
  workflow_dispatch:
  schedule:
    - cron: '0 9 1 * *'

jobs:
  get-lockfiles:
    runs-on: ubuntu-latest
    outputs:
      lockfiles: ${{ steps.set-matrix.outputs.lockfiles }}
    steps:
      - name: Checkout repository
        uses: actions/checkout@v5

      - name: Get all lock files
        id: set-matrix
        run: |
          lockfiles=$(ls *.yaml.lock | jq -R -s -c 'split("\n")[:-1]')
          echo "lockfiles=$lockfiles" >> $GITHUB_OUTPUT

  fix-outdated:
    needs: get-lockfiles
    runs-on: ubuntu-latest
    strategy:
      matrix:
        lockfile: ${{ fromJson(needs.get-lockfiles.outputs.lockfiles) }}
      fail-fast: false
    permissions:
      contents: write
      pull-requests: write
    steps:
      - name: Checkout repository
        uses: actions/checkout@v5

      - name: Set up Python
        uses: actions/setup-python@v6
        with:
          python-version: '3.13'

      - name: Install uv
        uses: astral-sh/setup-uv@v7
        with:
          github-token: ${{ secrets.GITHUB_TOKEN }}

      - name: Install dependencies
        run: uv pip install --system -r requirements.txt

      - name: Fix ${{ matrix.lockfile }}
        run: python scripts/fix_outdated.py "${{ matrix.lockfile }}"

      - name: Upload changes
        uses: actions/upload-artifact@v4
        if: always()
        with:
          name: ${{ matrix.lockfile }}
Copilot AI Dec 3, 2025

The artifact name uses ${{ matrix.lockfile }} which may contain special characters or path separators (e.g., "tools.yaml.lock"). Artifact names have restrictions and should not contain certain characters. Consider using a sanitized name or a unique identifier that replaces problematic characters.

Suggested change
-          name: ${{ matrix.lockfile }}
+          name: ${{ matrix.lockfile.replace('/','_').replace('.','_').replace('-','_') }}

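A note on the suggestion above: to my knowledge, GitHub Actions expressions do not support method calls such as `.replace()`, so any sanitizing would probably have to happen in a step rather than inline. The following is a minimal Python sketch of such a step's logic. The forbidden-character set is an assumption based on what `actions/upload-artifact` reports as invalid, and `sanitize_artifact_name` is a hypothetical helper, not part of this PR.

```python
import re

# Assumption: characters rejected in artifact names by actions/upload-artifact
# (double quote, colon, <, >, |, *, ?, CR, LF, backslash, forward slash).
_FORBIDDEN = re.compile(r'["<>|*?:\r\n\\/]')


def sanitize_artifact_name(name: str) -> str:
    """Replace characters assumed to be invalid in artifact names with '_'."""
    return _FORBIDDEN.sub("_", name)


# Hypothetical lockfile names; dots and hyphens are left untouched.
print(sanitize_artifact_name("tools.yaml.lock"))
print(sanitize_artifact_name("sub/dir/tools.yaml.lock"))
```

Since the matrix values come from `ls *.yaml.lock` in the repository root, they are unlikely to contain path separators in practice, so this may be unnecessary for the workflow as written.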
          path: |
            ${{ matrix.lockfile }}
            *.not-installable-revisions.yaml
Copilot AI Dec 3, 2025

The wildcard pattern *.not-installable-revisions.yaml will match all not-installable files in the directory, not just the one corresponding to the current matrix.lockfile. This could cause artifacts from one matrix job to include files from other lockfiles if they already exist. Consider using a more specific pattern like ${{ matrix.lockfile }}.not-installable-revisions.yaml or referencing the specific filename generated by the script.

Suggested change
-            *.not-installable-revisions.yaml
+            ${{ matrix.lockfile }}.not-installable-revisions.yaml

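One caveat with the suggested pattern: the script derives the file name by replacing the `.yaml.lock` suffix rather than appending to it, so `${{ matrix.lockfile }}.not-installable-revisions.yaml` would likely not match the file that is actually written. The sketch below mirrors the naming logic from `scripts/fix_outdated.py`; `tools.yaml.lock` is a hypothetical matrix value.

```python
from pathlib import Path


def not_installable_name(lockfile_name: str) -> str:
    """Mirror the naming logic used in scripts/fix_outdated.py."""
    path = Path(lockfile_name)
    return path.name.replace(".yaml.lock", ".not-installable-revisions.yaml")


lockfile = "tools.yaml.lock"  # hypothetical matrix value
written_by_script = not_installable_name(lockfile)
suggested_pattern = f"{lockfile}.not-installable-revisions.yaml"

print(written_by_script)
print(suggested_pattern)
```

Because the step sets `if-no-files-found: ignore`, a pattern that never matches would be skipped silently rather than fail the job.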
          if-no-files-found: ignore

  create-pr:
    needs: fix-outdated
    if: always()
    runs-on: ubuntu-latest
    permissions:
      contents: write
      pull-requests: write
    steps:
      - name: Checkout repository
        uses: actions/checkout@v5
        with:
            fetch-depth: 0
Copilot AI Dec 3, 2025

[nitpick] Inconsistent indentation: this line uses 4 spaces while other lines in the same block use 2 spaces. Consider using 2 spaces for consistency with the rest of the workflow file.

Suggested change
-            fetch-depth: 0
+          fetch-depth: 0


      - name: Download all artifacts
        uses: actions/download-artifact@v5
        with:
          merge-multiple: true

      - name: Check for changes
        id: check_changes
        run: |
          if [[ -n $(git status --porcelain) ]]; then
            echo "changes=true" >> $GITHUB_OUTPUT
            echo "Changes detected in lock files"
          else
            echo "changes=false" >> $GITHUB_OUTPUT
            echo "No changes detected"
          fi

      - name: Create or update Pull Request
        id: cpr
        if: steps.check_changes.outputs.changes == 'true'
        uses: peter-evans/create-pull-request@v7
        with:
          branch: fix-outdated-tools
          commit-message: Remove not-installable tool revisions
          title: 'Remove not-installable tool revisions'
          body: |
            This PR was automatically generated by the `fix-outdated-tools` workflow.
            Workflow run: [${{ github.run_id }}](${{ github.server_url }}/${{ github.repository }}/actions/runs/${{ github.run_id }})
          delete-branch: true
186 changes: 186 additions & 0 deletions scripts/fix_outdated.py
@@ -0,0 +1,186 @@
import argparse
import logging
import sys
import time
from pathlib import Path
from concurrent.futures import ThreadPoolExecutor, as_completed
from collections import defaultdict

import yaml
from bioblend import toolshed

logging.basicConfig(
    level=logging.INFO, format="%(asctime)s - %(levelname)s - %(message)s"
)
logger = logging.getLogger(__name__)


def retry_with_backoff(func, *args, **kwargs):
    MAX_RETRIES = 5
    backoff = 2
Copilot AI Dec 3, 2025

The initial backoff value and max backoff (60s at line 36) should be defined as constants at module level for better maintainability and configurability (e.g., INITIAL_BACKOFF = 2, MAX_BACKOFF = 60).


    for attempt in range(MAX_RETRIES):
        try:
            return func(*args, **kwargs)
        except Exception as e:
Copilot AI Dec 3, 2025

Catching a bare Exception is too broad and may mask unexpected errors. Consider catching more specific exception types (e.g., requests.exceptions.RequestException, ConnectionError, TimeoutError) or at minimum re-raising unexpected exceptions after logging them.

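The two points raised on this function (named constants and narrower exception handling) could be combined roughly as follows. This is a sketch under assumptions, not the PR's implementation: `INITIAL_BACKOFF` and `MAX_BACKOFF` are the names proposed in the review, `RETRYABLE` is hypothetical, and whether the ToolShed client raises these built-in types is not confirmed here. If it wraps errors in its own exception class, that class would need to be added to `RETRYABLE`.

```python
import time

# Constants named as proposed in the review; values taken from the PR.
MAX_RETRIES = 5
INITIAL_BACKOFF = 2
MAX_BACKOFF = 60

# Assumption: transient failures surface as these built-in types.
RETRYABLE = (ConnectionError, TimeoutError)


def retry_with_backoff(func, *args, sleep=time.sleep, **kwargs):
    """Call func, retrying RETRYABLE errors with capped exponential backoff.

    The sleep parameter is injectable only so the sketch can be tested
    without real delays.
    """
    backoff = INITIAL_BACKOFF
    for attempt in range(1, MAX_RETRIES + 1):
        try:
            return func(*args, **kwargs)
        except RETRYABLE:
            if attempt == MAX_RETRIES:
                raise  # out of attempts: surface the last error unchanged
            sleep(backoff)
            backoff = min(backoff * 2, MAX_BACKOFF)
```

Any exception outside `RETRYABLE` propagates immediately, which replaces the string matching on the error message in the current version.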
            error_msg = str(e)
            if any(
                code in error_msg
                for code in ["502", "503", "504", "timed out", "timeout", "Connection"]
            ):
                if attempt < MAX_RETRIES - 1:
                    logger.warning(
                        f"Attempt {attempt + 1}/{MAX_RETRIES} failed: {error_msg}. Retrying in {backoff}s..."
                    )
                    time.sleep(backoff)
                    backoff = min(backoff * 2, 60)
Copilot AI Dec 3, 2025

[nitpick] The magic number 60 (max backoff seconds) should be defined as a constant for clarity and maintainability.

                    continue
            raise e
    raise Exception("Retry failed after max attempts")


def get_tool_versions(ts, name, owner, revision):
    versions = set()

    try:
        repo_metadata = retry_with_backoff(
            ts.repositories.get_repository_revision_install_info, name, owner, revision
        )
        if isinstance(repo_metadata, list) and len(repo_metadata) > 1:
            for tool in repo_metadata[1].get("valid_tools", []):
                if "id" in tool and "version" in tool:
                    versions.add((tool["id"], tool["version"]))
    except Exception as e:
        logger.warning(f"{name},{owner}: failed to fetch {revision} ({e})")
Copilot AI Dec 3, 2025

Catching a bare Exception is too broad. Consider catching more specific exception types related to API calls or at minimum logging the full exception type to aid debugging.

Suggested change
-        logger.warning(f"{name},{owner}: failed to fetch {revision} ({e})")
+        logger.warning(f"{name},{owner}: failed to fetch {revision} ({type(e).__name__}: {e})")

        sys.exit(1)
Comment on lines +54 to +55
Copilot AI Dec 3, 2025

Calling sys.exit(1) on a tool fetch failure will terminate the entire script, preventing other tools from being processed. Consider logging the error and continuing with the next tool, or collecting errors and exiting at the end if any critical failures occurred.

    return versions


def fetch_versions_parallel(ts, name, owner, revisions, max_workers=10):
    version_cache = {}
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        futures = {
            executor.submit(get_tool_versions, ts, name, owner, rev): rev
            for rev in revisions
        }
        for future in as_completed(futures):
            rev = futures[future]
            try:
                version_cache[rev] = future.result()
            except Exception as e:
                logger.warning(f"{name},{owner}: error fetching {rev} ({e})")
Copilot AI Dec 3, 2025

Catching a bare Exception is too broad. Consider catching more specific exception types or at minimum logging the exception type to aid debugging.

Suggested change
-                logger.warning(f"{name},{owner}: error fetching {rev} ({e})")
+                logger.warning(f"{name},{owner}: error fetching {rev} ({type(e).__name__}: {e})")

                sys.exit(1)
Comment on lines +71 to +72
Copilot AI Dec 3, 2025

Calling sys.exit(1) inside the thread pool executor will terminate the script immediately when any revision fetch fails. This is inconsistent with the exception handling at line 54 and prevents other revisions from being processed. Consider collecting errors and handling them after all futures complete.

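The "collect errors, handle them after all futures complete" approach mentioned above might look roughly like this. It is a sketch only: `fetch_all` and `fake_fetch` are hypothetical names, and `fake_fetch` stands in for the real ToolShed call.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def fetch_all(fetch, revisions, max_workers=10):
    """Run fetch(rev) for every revision; return (results, errors) dicts."""
    results, errors = {}, {}
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        futures = {executor.submit(fetch, rev): rev for rev in revisions}
        for future in as_completed(futures):
            rev = futures[future]
            try:
                results[rev] = future.result()
            except Exception as exc:  # recorded, not fatal; the caller decides
                errors[rev] = exc
    return results, errors


def fake_fetch(rev):
    # Stand-in for a ToolShed request; "bad" simulates a failed fetch.
    if rev == "bad":
        raise RuntimeError("simulated failure")
    return {("tool_id", "1.0")}


results, errors = fetch_all(fake_fetch, ["a", "bad", "b"])
print(sorted(results), sorted(errors))
```

The caller can then decide whether a non-empty `errors` dict should fail the run. Note that this only helps if the worker function raises instead of calling `sys.exit(1)`: a `SystemExit` raised in a worker is stored on the future and re-raised by `future.result()`, and `except Exception` does not catch it.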
    return version_cache


def fix_uninstallable(lockfile_name, toolshed_url):
    ts = toolshed.ToolShedInstance(url=toolshed_url)
    lockfile_path = Path(lockfile_name)
    with open(lockfile_path) as f:
        lockfile = yaml.safe_load(f) or {}
    locked_tools = lockfile.get("tools", [])
    total = len(locked_tools)

    not_installable_file = lockfile_path.with_name(
        lockfile_path.name.replace(".yaml.lock", ".not-installable-revisions.yaml")
    )
Comment on lines +84 to +86
Copilot AI Dec 3, 2025

The string replacement .replace(".yaml.lock", ".not-installable-revisions.yaml") assumes the lockfile name ends with ".yaml.lock". If the lockfile has a different naming pattern or doesn't contain this exact string, the replacement will fail silently and create an incorrectly named file. Consider using a more robust path manipulation approach, such as lockfile_path.with_suffix('') or checking that the expected suffix exists before replacing.

Suggested change
-    not_installable_file = lockfile_path.with_name(
-        lockfile_path.name.replace(".yaml.lock", ".not-installable-revisions.yaml")
-    )
+    # Robustly generate the not-installable file name
+    if lockfile_path.name.endswith(".yaml.lock"):
+        not_installable_file = lockfile_path.with_name(
+            lockfile_path.name.replace(".yaml.lock", ".not-installable-revisions.yaml")
+        )
+    else:
+        logger.warning(
+            f"Lockfile name '{lockfile_path.name}' does not end with '.yaml.lock'. Using fallback naming."
+        )
+        not_installable_file = lockfile_path.with_name(
+            lockfile_path.name + ".not-installable-revisions.yaml"
+        )

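An alternative to the fallback naming above would be to reject unexpected names outright and strip the suffix only at the end of the name. A minimal sketch, assuming Python 3.9+ for `str.removesuffix` (the workflow pins 3.13); `not_installable_path` and the two constants are hypothetical names.

```python
from pathlib import Path

LOCK_SUFFIX = ".yaml.lock"
NOT_INSTALLABLE_SUFFIX = ".not-installable-revisions.yaml"


def not_installable_path(lockfile_path: Path) -> Path:
    """Swap a trailing .yaml.lock for the not-installable suffix, or fail loudly."""
    name = lockfile_path.name
    if not name.endswith(LOCK_SUFFIX):
        raise ValueError(f"expected a name ending in {LOCK_SUFFIX!r}, got {name!r}")
    # removesuffix only strips at the end of the string, whereas str.replace
    # would also rewrite an occurrence in the middle of the name.
    stem = name.removesuffix(LOCK_SUFFIX)
    return lockfile_path.with_name(stem + NOT_INSTALLABLE_SUFFIX)


print(not_installable_path(Path("tools.yaml.lock")))
```

Raising here is a design choice: since the workflow only ever passes files matched by `*.yaml.lock`, an unexpected name probably indicates a mistake worth surfacing.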

    removed_map = defaultdict(set)
    try:
        with open(not_installable_file) as f:
            not_installable_data = yaml.safe_load(f) or {}
        for t in not_installable_data.get("tools", []):
            removed_map[(t["name"], t["owner"])] = set(t.get("revisions", []))
    except FileNotFoundError:
Copilot AI Dec 3, 2025

'except' clause does nothing but pass and there is no explanatory comment.

Suggested change
-    except FileNotFoundError:
+    except FileNotFoundError:
+        # If the file does not exist, proceed with an empty removed_map.

        pass

    logger.info(f"Processing {total} tools from {lockfile_path.name}...")
    changed, skipped = 0, 0

    for i, tool in enumerate(locked_tools):
        if i % 10 == 0:
Copilot AI Dec 3, 2025

[nitpick] The magic number 10 (progress logging interval) should be defined as a constant for clarity and maintainability (e.g., PROGRESS_LOG_INTERVAL = 10).

            logger.info(
                f"Progress: {i}/{total} tools ({skipped} skipped, {changed} changed)"
            )

        name, owner = tool.get("name"), tool.get("owner")
        current_revisions = set(tool.get("revisions", []))
        try:
            installable_list = retry_with_backoff(
                ts.repositories.get_ordered_installable_revisions, name, owner
            )
        except Exception as e:
Copilot AI Dec 3, 2025

Catching a bare Exception is too broad. Consider catching more specific exception types related to API calls or at minimum logging the full exception type to aid debugging.

            logger.warning(f"{name},{owner}: could not get installable revisions ({e})")
            continue

        uninstallable = current_revisions - set(installable_list)
        if not uninstallable:
            skipped += 1
            continue

        all_revs = list(uninstallable) + installable_list
        version_cache = fetch_versions_parallel(ts, name, owner, all_revs)

        installable_signatures = {}
        for rev in installable_list:
            sig = frozenset(version_cache.get(rev, []))
            if sig:
                installable_signatures[sig] = rev
        to_remove = set()

        for cur in uninstallable:
            cur_sig = frozenset(version_cache.get(cur, []))
            if not cur_sig:
                if installable_list:
                    nxt = installable_list[-1]
                    logger.info(f"{name},{owner}: unverifiable {cur}, keeping {nxt}")
                    to_remove.add(cur)
                continue

            nxt = installable_signatures.get(cur_sig)

            if not nxt:
                logger.warning(
                    f"{name},{owner}: no matching installable revision for {cur}"
Copilot AI Dec 3, 2025

The error message could be more helpful by providing actionable context. Consider adding information about what tool versions were found in the uninstallable revision and what the available installable signatures are, to help diagnose why no match was found.

Suggested change
-                    f"{name},{owner}: no matching installable revision for {cur}"
+                    f"{name},{owner}: no matching installable revision for {cur}\n"
+                    f" Signature of uninstallable revision: {sorted(cur_sig)}\n"
+                    f" Available installable signatures: {[sorted(sig) for sig in installable_signatures.keys()]}\n"
+                    f" Installable revisions: {installable_list}"

                )
                sys.exit(1)
Copilot AI Dec 3, 2025

Calling sys.exit(1) when no matching installable revision is found will terminate the script and prevent other tools from being processed. Consider logging a warning and continuing with the next tool, or collecting critical errors to handle at the end.

Suggested change
-                sys.exit(1)
+                continue

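For context on the matching step in this loop: the script treats the set of `(id, version)` pairs of a revision's valid tools as that revision's signature, and looks for an installable revision with an identical set. A toy sketch of that lookup follows. The revision hashes and tool versions are made up, and `build_signature_index` is a hypothetical helper that mirrors the loop in `fix_uninstallable`.

```python
def build_signature_index(installable_revisions, version_cache):
    """Map each non-empty frozenset of (tool_id, version) pairs to a revision."""
    index = {}
    for rev in installable_revisions:
        sig = frozenset(version_cache.get(rev, ()))
        if sig:
            index[sig] = rev  # a later revision with the same signature wins
    return index


# Toy data: revision hashes and tool versions are invented for illustration.
version_cache = {
    "old111": {("toolA", "1.0")},
    "new222": {("toolA", "1.0")},
    "new333": {("toolA", "1.1")},
}

index = build_signature_index(["new222", "new333"], version_cache)
match = index.get(frozenset(version_cache["old111"]))
no_match = index.get(frozenset({("toolA", "2.0")}))
print(match, no_match)
```

The `no_match` case is the one that currently ends the run with `sys.exit(1)`.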

            logger.info(f"{name},{owner}: removing {cur} in favor of {nxt}")
            if nxt not in current_revisions:
                tool["revisions"].append(nxt)
Comment on lines +135 to +150
Copilot AI Dec 3, 2025

The variable name nxt is unclear. Consider using a more descriptive name like matching_revision or installable_match to improve code readability.

Suggested change
-                    nxt = installable_list[-1]
-                    logger.info(f"{name},{owner}: unverifiable {cur}, keeping {nxt}")
-                    to_remove.add(cur)
-                continue
-            nxt = installable_signatures.get(cur_sig)
-            if not nxt:
-                logger.warning(
-                    f"{name},{owner}: no matching installable revision for {cur}"
-                )
-                sys.exit(1)
-            logger.info(f"{name},{owner}: removing {cur} in favor of {nxt}")
-            if nxt not in current_revisions:
-                tool["revisions"].append(nxt)
+                    matching_revision = installable_list[-1]
+                    logger.info(f"{name},{owner}: unverifiable {cur}, keeping {matching_revision}")
+                    to_remove.add(cur)
+                continue
+            matching_revision = installable_signatures.get(cur_sig)
+            if not matching_revision:
+                logger.warning(
+                    f"{name},{owner}: no matching installable revision for {cur}"
+                )
+                sys.exit(1)
+            logger.info(f"{name},{owner}: removing {cur} in favor of {matching_revision}")
+            if matching_revision not in current_revisions:
+                tool["revisions"].append(matching_revision)

Comment on lines +135 to +150
Copilot AI Dec 3, 2025

The variable name nxt is unclear. Consider using a more descriptive name like replacement_revision or installable_revision to improve code readability.

Suggested change
-                    nxt = installable_list[-1]
-                    logger.info(f"{name},{owner}: unverifiable {cur}, keeping {nxt}")
-                    to_remove.add(cur)
-                continue
-            nxt = installable_signatures.get(cur_sig)
-            if not nxt:
-                logger.warning(
-                    f"{name},{owner}: no matching installable revision for {cur}"
-                )
-                sys.exit(1)
-            logger.info(f"{name},{owner}: removing {cur} in favor of {nxt}")
-            if nxt not in current_revisions:
-                tool["revisions"].append(nxt)
+                    replacement_revision = installable_list[-1]
+                    logger.info(f"{name},{owner}: unverifiable {cur}, keeping {replacement_revision}")
+                    to_remove.add(cur)
+                continue
+            replacement_revision = installable_signatures.get(cur_sig)
+            if not replacement_revision:
+                logger.warning(
+                    f"{name},{owner}: no matching installable revision for {cur}"
+                )
+                sys.exit(1)
+            logger.info(f"{name},{owner}: removing {cur} in favor of {replacement_revision}")
+            if replacement_revision not in current_revisions:
+                tool["revisions"].append(replacement_revision)

            to_remove.add(cur)

        if to_remove:
            changed += 1
            tool["revisions"] = sorted(set(tool["revisions"]) - to_remove)
            removed_map[(name, owner)].update(to_remove)

    logger.info(
        f"Completed: {total} tools processed, {skipped} skipped, {changed} changed"
    )

    with open(lockfile_path, "w") as f:
        yaml.dump(lockfile, f, sort_keys=False, default_flow_style=False)

    if removed_map:
        not_installable_output = {
            "tools": [
                {"name": n, "owner": o, "revisions": sorted(revs)}
                for (n, o), revs in removed_map.items()
            ]
        }
        with open(not_installable_file, "w") as f:
            yaml.dump(
                not_installable_output, f, sort_keys=False, default_flow_style=False
            )


if __name__ == "__main__":
    parser = argparse.ArgumentParser()
    parser.add_argument("lockfile", help="Tool.yaml.lock file path")
    parser.add_argument(
        "--toolshed", default="https://toolshed.g2.bx.psu.edu", help="Toolshed base URL"
    )
    args = parser.parse_args()

    fix_uninstallable(args.lockfile, args.toolshed)