[feat] Add function to download prediction files #34

@vpchung

Describe The Problem To Be Solved

Currently, the web UI offers no way to download submissions. This feature would let a user download all submitted entities, regardless of status, to their own computer.

This came about when @gaiaandreoletti needed to download all files submitted to the CAGI challenges.

(optional) Suggest a Solution

I put together this quick hack, which downloads the submissions listed in a submission view. We should also consider downloading directly from an evaluation ID.

from pathlib import Path

import synapseclient
from synapseclient.core.exceptions import SynapseHTTPError


def download_submissions(syn: synapseclient.Synapse, view_id: str, parent_folder_name: str) -> None:
    """
    Downloads submissions from a Synapse submission view, organizing them by submitter.
    """

    def get_name(submitter_id: int) -> str:
        """Fetches the name for a given submitter ID (user or team)."""
        try:
            # Try the ID as a team first; fall back to a user profile if that lookup fails.
            return syn.getTeam(submitter_id).get("name").replace(" ", "_")
        except SynapseHTTPError:
            return syn.getUserProfile(submitter_id).get("userName").replace(" ", "_")

    # Query a submission view, grabbing all submitted entities.
    submissions = syn.tableQuery(f"SELECT id, submitterid FROM {view_id}").asDataFrame()

    # Map submitter IDs to human-readable names, looking up each unique ID only once.
    unique_ids = submissions["submitterid"].unique().tolist()
    names = {submitter_id: get_name(submitter_id) for submitter_id in unique_ids}
    submissions["team"] = submissions["submitterid"].map(names)

    # Download each submission file into a parent folder, organized by submitter name.
    base_folder = Path(parent_folder_name)
    for _, row in submissions.iterrows():
        download_folder = base_folder / row["team"]
        download_folder.mkdir(parents=True, exist_ok=True)

        # Rename the file so that organizers can tell which submission ID the file belongs to.
        pred = syn.getSubmission(row["id"], downloadLocation=str(download_folder))
        downloaded_file = Path(pred.filePath)
        new_filename = f"{row['id']}{downloaded_file.suffix}"
        downloaded_file.rename(download_folder / new_filename)


if __name__ == "__main__":
    # Log in once and pass the client in, so the function does not rely on a global.
    syn = synapseclient.login()

    download_submissions(
        syn,
        view_id="syn68871597",
        parent_folder_name="CAGI_FGFR",
    )
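
For the evaluation-ID route mentioned above, a rough sketch might look like the following. It has not been run against a live queue. It assumes `syn` is a logged-in `synapseclient.Synapse` client, that `syn.getSubmissions` yields every submission when no status filter is passed, and that each submission exposes `teamId` (team submissions) or `userId`. The names `download_from_evaluation` and `submitter_label` are made up for illustration, and folders are labeled by raw ID rather than by looked-up name to keep the sketch short.

```python
from pathlib import Path


def submitter_label(submission) -> str:
    """Return a folder-safe label: the team ID if present, else the user ID."""
    # Assumption: submissions carry `teamId` for team entries and `userId` otherwise.
    team_id = getattr(submission, "teamId", None)
    if team_id:
        return f"team_{team_id}"
    return f"user_{getattr(submission, 'userId', 'unknown')}"


def download_from_evaluation(syn, evaluation_id: str, parent_folder_name: str) -> list:
    """Download every submission in an evaluation queue, regardless of status."""
    base_folder = Path(parent_folder_name)
    saved = []
    # No status filter is passed, so all submissions should be returned.
    for sub in syn.getSubmissions(evaluation_id):
        folder = base_folder / submitter_label(sub)
        folder.mkdir(parents=True, exist_ok=True)

        # Fetch the submitted file, then rename it after its submission ID.
        pred = syn.getSubmission(sub.id, downloadLocation=str(folder))
        downloaded = Path(pred.filePath)
        target = folder / f"{sub.id}{downloaded.suffix}"
        downloaded.replace(target)
        saved.append(target)
    return saved
```

Usage would be along the lines of `download_from_evaluation(synapseclient.login(), evaluation_id, "CAGI_FGFR")`, where `evaluation_id` is the ID of the challenge queue. This skips the submission view entirely, at the cost of one `getSubmission` call per entry.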

Labels: feature (New feature or request)
