Contributor

@jaymedina jaymedina commented Sep 30, 2025

Problem:

This work is part of the ongoing effort to refactor the Synapse Python Client using object-oriented principles, following the pattern established in https://sagebionetworks.jira.com/browse/SYNPY-1418.

We currently do not have an OOP model for communicating with the Evaluation API services provided by the Synapse platform.

Solution:

The work is split across 3 feature branches: the *main branch is intended to merge into develop, and the other 2 branches merge into *main when ready.

The acceptance criteria have been split across these 3 branches as follows.


Branch: synpy-1590-submission-model-main

  • async integration tests are introduced

Branch: synpy-1590-submission-model-functionality

Evaluation API Design Summary

The data model of the Evaluation API is built around two primary objects:

  • Evaluation: The primary object representing a Synapse Evaluation. Access to Evaluations is governed by an Access Control List (ACL).
  • Submission: A user in a Synapse Evaluation can submit a Synapse Entity as a Submission to that Evaluation. Submission data is owned by the parent Evaluation and is immutable.

The data model includes additional objects to support scoring of Submissions and convenient data access:

  • SubmissionStatus: An object used to track scoring information for a single Submission. This object is intended to be modified by the users (or test harnesses) managing the Evaluation.
  • SubmissionBundle: A convenience object to transport a Submission and its accompanying SubmissionStatus in a single web service call.
  • SubmissionView (already implemented): A submission view is created through the Entity Services with a list of evaluation IDs as its scope, so that the set of submissions can be queried through the Table Query Services. Annotations set in the submissionAnnotations property of a SubmissionStatus can be exposed in the view.

Source: https://rest-docs.synapse.org/rest/#org.sagebionetworks.repo.web.controller.EvaluationController
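
As a rough illustration of the relationships described above, the sketch below models an immutable Submission, a mutable SubmissionStatus, and a bundle that carries both. The class names, field names, and the "RECEIVED" default are hypothetical placeholders chosen for this example, not the client's actual models, and the bundle uses composition as one possible modeling choice.

```python
from dataclasses import FrozenInstanceError, dataclass, field
from typing import Any, Dict, Optional


@dataclass(frozen=True)
class SubmissionSketch:
    """Hypothetical stand-in for a Submission: immutable once created."""

    id: str
    evaluation_id: str
    entity_id: str


@dataclass
class SubmissionStatusSketch:
    """Hypothetical stand-in for a SubmissionStatus: editable by organizers."""

    id: str
    status: str = "RECEIVED"  # placeholder default, assumed for the example
    submission_annotations: Dict[str, Any] = field(default_factory=dict)


@dataclass
class SubmissionBundleSketch:
    """Hypothetical bundle pairing a submission with its status."""

    submission: Optional[SubmissionSketch] = None
    submission_status: Optional[SubmissionStatusSketch] = None


submission = SubmissionSketch(id="1", evaluation_id="2", entity_id="syn3")
status = SubmissionStatusSketch(id="1")
status.status = "SCORED"  # allowed: the status object is mutable
bundle = SubmissionBundleSketch(submission=submission, submission_status=status)

try:
    submission.entity_id = "syn4"  # rejected: the submission is frozen
except FrozenInstanceError:
    print("submission data cannot be modified")
```

Freezing only the submission dataclass keeps the "owned by the Evaluation and immutable" rule visible in the type itself, while the status stays editable.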

The design of the Python client tools for communicating with this API follows the Evaluation API design summary above. The implementation will look as follows:

  • The required API services that interact with the Submission(/Status/Bundle) object will be exposed under evaluation_services.py

Submission operations:
Create submission: https://rest-docs.synapse.org/rest/POST/evaluation/submission.html
Get submission: https://rest-docs.synapse.org/rest/GET/evaluation/submission/subId.html
Get submissions for evaluation: https://rest-docs.synapse.org/rest/GET/evaluation/evalId/submission/all.html
Get user submissions: https://rest-docs.synapse.org/rest/GET/evaluation/evalId/submission.html
Get submission count: https://rest-docs.synapse.org/rest/GET/evaluation/evalId/submission/count.html
Delete submission: https://rest-docs.synapse.org/rest/DELETE/evaluation/submission/subId.html
Cancel submission: https://rest-docs.synapse.org/rest/PUT/evaluation/submission/subId/cancellation.html

SubmissionStatus operations:
Get submission status: https://rest-docs.synapse.org/rest/GET/evaluation/submission/subId/status.html
Update submission status: https://rest-docs.synapse.org/rest/PUT/evaluation/submission/subId/status.html
Get all submission statuses: https://rest-docs.synapse.org/rest/GET/evaluation/evalId/submission/status/all.html
Batch update statuses: https://rest-docs.synapse.org/rest/PUT/evaluation/evalId/statusBatch.html

Bundle operations:
Get submission bundles: https://rest-docs.synapse.org/rest/GET/evaluation/evalId/submission/bundle/all.html
Get user submission bundles: https://rest-docs.synapse.org/rest/GET/evaluation/evalId/submission/bundle.html

  • The following dataclasses are created in their own *.py files for discoverability:
Submission
SubmissionStatus
SubmissionBundle
  • SubmissionBundle will inherit from Submission and SubmissionStatus
  • async/sync methods that wrap the new services are added to the dataclasses
  • async/sync methods' docstrings are updated with example use-cases
  • async/sync Integration tests are patched/introduced
  • async/sync Unit tests are patched/introduced
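
To make the endpoint list above easier to scan, here is a minimal sketch that transcribes the HTTP method and path from each REST doc URL into a lookup table. The operation names, the `SUBMISSION_ENDPOINTS` table, and the `build_request` helper are hypothetical and are not the functions in evaluation_services.py; only the methods and paths come from the URLs listed above.

```python
from typing import Dict, Tuple

# (HTTP method, path template) pairs transcribed from the REST doc URLs.
SUBMISSION_ENDPOINTS: Dict[str, Tuple[str, str]] = {
    # Submission operations
    "create_submission": ("POST", "/evaluation/submission"),
    "get_submission": ("GET", "/evaluation/submission/{subId}"),
    "get_evaluation_submissions": ("GET", "/evaluation/{evalId}/submission/all"),
    "get_user_submissions": ("GET", "/evaluation/{evalId}/submission"),
    "get_submission_count": ("GET", "/evaluation/{evalId}/submission/count"),
    "delete_submission": ("DELETE", "/evaluation/submission/{subId}"),
    "cancel_submission": ("PUT", "/evaluation/submission/{subId}/cancellation"),
    # SubmissionStatus operations
    "get_submission_status": ("GET", "/evaluation/submission/{subId}/status"),
    "update_submission_status": ("PUT", "/evaluation/submission/{subId}/status"),
    "get_all_submission_statuses": ("GET", "/evaluation/{evalId}/submission/status/all"),
    "batch_update_statuses": ("PUT", "/evaluation/{evalId}/statusBatch"),
    # Bundle operations
    "get_submission_bundles": ("GET", "/evaluation/{evalId}/submission/bundle/all"),
    "get_user_submission_bundles": ("GET", "/evaluation/{evalId}/submission/bundle"),
}


def build_request(operation: str, **path_params: str) -> Tuple[str, str]:
    """Return (method, path) for a named operation, filling in path parameters."""
    method, template = SUBMISSION_ENDPOINTS[operation]
    return method, template.format(**path_params)


print(build_request("cancel_submission", subId="9742736"))
```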

Branch: synpy-1590-submission-model-docs

  • The old models are deprecated
  • Our Synapse Python Client documentation (using mkdocs) is updated for these models
  • A tutorial is made for popular use-cases of these models

Testing:

  • Passing unit tests
  • Passing integration tests
  • Passing documentation build
  • Running through the tutorial presents no issues

Member

@thomasyu888 thomasyu888 left a comment

Great work starting this, let's get the evaluation work across the finish line first so we don't spread ourselves too thin.

@jaymedina jaymedina force-pushed the synpy-1590-submission-model-main branch from 8b50d06 to a061fb4 on November 5, 2025 14:46
@jaymedina jaymedina force-pushed the synpy-1590-submission-model-main branch from a061fb4 to c748132 on December 1, 2025 16:33
@jaymedina jaymedina force-pushed the synpy-1590-submission-model-main branch from c748132 to 31d9583 on December 2, 2025 19:47
jaymedina and others added 2 commits December 22, 2025 13:28
…dle) OOP model (#1251)

* tdd initial tests

* initial intro of dataclasses

* expose api services for submission object

* style and update docstring

* add submission and submissionstatus models

* add submission status retrieval and update methods; remove empty submissionstatus file

* pipe query params directly into restAPI httpx requests

* new dataclass object submission_bundle

* move submission services functions to evaluation_services.py

* renaming imports, to_synapse_request, request body refactor

* patching up store method signature

* update docs

* new suite of tests

* submissionstatus rework as a mutable object

* bug fix for Statuses: updated to_synapse_request to follow same pattern as evaluations design

* replace != with is not for full object comparison (not just keys)

* expose the is_private arg for to_submission_status_annotations ONLY FOR submission annotations

* fixed submission status/submission annotations

* add support for legacy annotations

* remove debug prints

* get_all_submission_statuses now returns a list of substat objects

* docstring updates

* update submissionbundle docstrings, add more examples

* initial sync test for status. moved evaluation_id retrieval to fill_from_dict for submissionstatus calls

* update submissionBundle submissionstatus with evaluation_id

* patch sync substatus integ tests. style.

* fix submissionStatus integ tests and has_changed attribute

* new substatus async integ tests. can_cancel can now be modified by an organizer on the client. cancel request returns no response body.

* new test class for submission cancel functionality

* substatus async unit tests

* remove compare=false for some attributes. update sync unit tests

* add submissionBundle integration tests

* add submissionBundle unit tests

* remove unnecessary imports and add style

* get_evaluation_submissions returns generator object

* get_user_submissions returns generator object

* submissionBundle methods return generators

* address final todo: implement docker_tag, and note in docs that version_number can be ignored for docker submissions

* import -> imports

* change back to import. patch uses synapse logger instance.

* lock mkdocstrings-python to >=2.0.0

* add Return description to create_submission

* [SYNPY-1714] Fix Issues with Failing Tests (#1287)

* always log warning

* check assert called with

* patch object on class instance

* Revert "always log warning"

This reverts commit ae63c81.

* update patch target

* re-add erroneously removed line

* no need to build out Annotations object from scratch (remove evaluation_id attribute)

* remove Dict, List imports

* fix broken tests due to removed evaluation_id attr

* style

* import classes directly from typing. remove Dict and List.

* style

* add API reference links to _fetch_latest_entity

* type hint should be logging.Logger instance (generic or Synapse client)

* explicit import to follow the other imports

* import order

* minimize indenting by using if not:

* raise ValueError(no etag for entity) sooner

* remove docker submission async integration test

* link Jira ticket to TODO item

* remove old cancel submission test

* remove old cancel submission test (async)

* assert the SubmissionStatus object has not changed

* valueError(msg) from e

* ValueError -> LookupError

* add evaluation_round_id and explicit optional_fields dict for organization

* loop over optionalfields for SubmissionStatus

* new global var: POSSIBLE_STATUSES

* same for sync

* remove validation using response

* style

* correctly pass synapse_client instance

* use prior synapse_client instance. LookupError message

---------

Co-authored-by: SageGJ <[email protected]>
synapse_client: Optional[Synapse] = None,
) -> dict:
"""
Update multiple SubmissionStatuses. The maximum batch size is 500.

Member

Since 500 is the limit, should we enforce that in the python method?

Contributor Author

@jaymedina jaymedina Dec 24, 2025

This is the limit before the API starts a new batch. The request can still have >500 individual submission statuses, but once it exceeds 500, the API splits them into batches and the response we get back will include batch tokens to identify the batches.

https://rest-docs.synapse.org/rest/PUT/evaluation/evalId/statusBatch.html
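
If client-side handling of the 500-item limit were ever wanted, one option is to split the statuses into chunks before sending them. The sketch below shows only the chunking step; `chunk_statuses` and `MAX_STATUS_BATCH_SIZE` are hypothetical names, and how each chunk would be sent (including any batch token handling) is left out.

```python
from typing import Iterator, List, Sequence, TypeVar

T = TypeVar("T")

MAX_STATUS_BATCH_SIZE = 500  # limit quoted in the docstring under review


def chunk_statuses(
    statuses: Sequence[T], size: int = MAX_STATUS_BATCH_SIZE
) -> Iterator[List[T]]:
    """Yield consecutive chunks of at most `size` items, preserving order."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(statuses), size):
        yield list(statuses[start : start + size])


# Plain integers stand in for SubmissionStatus objects here.
batches = list(chunk_statuses(list(range(1201))))
print([len(batch) for batch in batches])
```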

Member

@BryanFauble BryanFauble left a comment

I left some comments on areas to improve, but overall this is really excellent work, @jaymedina. Thank you so much for your attention to detail in getting these changes across the finish line.

Contributor

@linglp linglp left a comment

Hi @jaymedina ! I spent some time today testing different functions in this script. For some reason, a few of them hang for me, and I’m curious whether you’ve encountered similar issues.
In addition, there are two things you might want to consider:

  • I’m not currently seeing any code that validates whether status is one of the allowed values listed here. It may be worth adding validation to ensure status is valid.
  • In the unit tests under models/synchronous/unit_test_submission_bundle.py and models/synchronous/unit_test_submission.py, many tests still appear to be calling async methods instead of their sync counterparts (for example, cancel_async instead of cancel).

Arguments:
evaluation_id: The ID of the evaluation queue.
status: Optionally filter submissions by a submission status.
Submission status can be one of <https://rest-docs.synapse.org/rest/org/sagebionetworks/evaluation/model/SubmissionStatusEnum.html>

Contributor

Since status is an optional parameter, I would expect Submission.get_evaluation_submissions(evaluation_id="9999999") to still work. However, when I tested it, omitting status seems to cause the code to hang. Could you try verifying this on your end?

Contributor

What if a user passes in a status value that is not one of the enum values listed here? Can we add validation in the code to catch that?

Contributor Author

Hmm, yeah. This method wraps the new async generator function. For evaluation IDs that exist, it works fine, and the generator yields the submission objects as expected. For invalid evaluation IDs, though, it silently returns nothing instead of raising an error such as "Your evaluation_id does not exist".

@BryanFauble is this the intended behavior for the wrap-async-to-sync generator function, or should we revisit? Do you see this as a blocker for this PR?

In [32]: subs = Submission.get_evaluation_submissions(eval_id)

In [33]: for i in subs:
    ...:     print("submission: ", i.id)
    ...: 
submission:  9742736
submission:  9742737
submission:  9743051
submission:  9743052
submission:  9746035
submission:  9746036

In [34]: Submission.get_evaluation_submissions("9999999")
Out[34]: <generator object SubmissionSynchronousProtocol.get_evaluation_submissions at 0x1055030b0>

In [35]: subs2 = Submission.get_evaluation_submissions("9999999")

In [36]: for i in subs:
    ...:     print("submission: ", i.id)
    ...: 

In [37]: 
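
For context on the behavior shown in this session, here is a minimal sketch of one way to expose an async generator as a synchronous one. It is not the client's actual wrapper: `iterate_sync`, `fake_evaluation_submissions`, and `KNOWN_SUBMISSIONS` are hypothetical names, and the fake generator raises LookupError for an unknown ID purely to show when such an error would surface. Because generators are lazy, creating one never fails; any error appears only once iteration starts.

```python
import asyncio
from typing import AsyncIterator, Dict, Iterator, List, TypeVar

T = TypeVar("T")
_DONE = object()  # sentinel marking the end of the async generator


def iterate_sync(async_items: AsyncIterator[T]) -> Iterator[T]:
    """Drive an async iterator from synchronous code on a private event loop."""

    async def _next_or_done():
        try:
            return await async_items.__anext__()
        except StopAsyncIteration:
            return _DONE

    loop = asyncio.new_event_loop()
    try:
        while True:
            item = loop.run_until_complete(_next_or_done())
            if item is _DONE:
                break
            yield item
    finally:
        # Finalize any async generators started on this loop, then close it.
        loop.run_until_complete(loop.shutdown_asyncgens())
        loop.close()


# Hypothetical data standing in for the Synapse backend.
KNOWN_SUBMISSIONS: Dict[str, List[str]] = {"111": ["9742736", "9742737"]}


async def fake_evaluation_submissions(evaluation_id: str) -> AsyncIterator[str]:
    """Hypothetical async generator that rejects unknown evaluation IDs."""
    if evaluation_id not in KNOWN_SUBMISSIONS:
        raise LookupError(f"Evaluation {evaluation_id} does not exist")
    for submission_id in KNOWN_SUBMISSIONS[evaluation_id]:
        await asyncio.sleep(0)  # simulate an awaited page fetch
        yield submission_id


subs = iterate_sync(fake_evaluation_submissions("9999999"))  # no error yet
try:
    next(subs)  # the LookupError surfaces only here
except LookupError as err:
    print(err)
```

This sketch is meant for plain synchronous scripts; calling `run_until_complete` while another event loop is already running in the same thread raises a RuntimeError.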

Contributor Author

@jaymedina jaymedina Dec 24, 2025

What if a user passes in a status value that is not one of the enum values listed here? Can we add validation in the code to catch that?

When a user passes an invalid status, we leave it to the API to handle that. For example, this is the message a user would receive when trying to collect submissions with a status of 'jenny':

SynapseHTTPError: 400 Client Error: No enum constant org.sagebionetworks.evaluation.model.SubmissionStatusEnum.JENNY

If there is a way to grab the valid constants for SubmissionStatusEnum, then I think we can add some error-handling here to make a prettier ValueError message. Otherwise, I would say we continue letting the API handle this error, since the other option would mean having to maintain a hard-coded list of available statuses.

^ If there's no known way to do this I can investigate further when I come back from holidays.

Contributor

I happened to have a discussion with @BryanFauble about Enum here. In short, we could support either a string or the enum, though using the enum is preferred. See this as an example.

I think something like this will work:

from enum import Enum


class SubmissionStatusEnum(str, Enum):
    """Allowed submission statuses, mirroring the API's SubmissionStatusEnum."""

    OPEN = "OPEN"
    CLOSED = "CLOSED"
    SCORED = "SCORED"
    INVALID = "INVALID"
    VALIDATED = "VALIDATED"
    EVALUATION_IN_PROGRESS = "EVALUATION_IN_PROGRESS"
    RECEIVED = "RECEIVED"
    REJECTED = "REJECTED"
    ACCEPTED = "ACCEPTED"


def validate_status(input_status: str) -> SubmissionStatusEnum:
    """Return the matching enum member, or raise a readable ValueError."""
    try:
        return SubmissionStatusEnum(input_status)
    except ValueError as err:
        valid = ", ".join(e.value for e in SubmissionStatusEnum)
        raise ValueError(
            f"Invalid submission status: {input_status!r}. Must be one of: {valid}"
        ) from err


validate_status(SubmissionStatusEnum.SCORED)  # OK: already an enum member
validate_status("INVALID")  # OK: "INVALID" is itself an allowed status
validate_status("UNKNOWN")  # Not an allowed status: raises ValueError

)

@classmethod
def get_user_submissions(

Contributor

I am not exactly sure what's going on, but when I tested the code on my end, I observed different outputs from the sync vs. async methods:

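import asyncio

# `Submission` and `evaluation` are assumed to be defined by the surrounding test script.


def get_user_submission():
    # Synchronous variant: this is the call that gets stuck.
    for submission in Submission.get_user_submissions(
        evaluation_id=evaluation.id,
        user_id="3443707",
    ):
        print("user submission", submission)


get_user_submission()


async def get_user_submission_async():
    # Asynchronous variant: this one prints the submission records.
    async for submission in Submission.get_user_submissions_async(
        evaluation_id=evaluation.id,
        user_id="3443707",
    ):
        print("user submission async", submission)


asyncio.run(get_user_submission_async())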

The async version is able to print out many submission records, while the sync function gets stuck.
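
When checking reports like this, it can help to tell a genuine hang apart from a slow paginated call. The sketch below is one way to do that by draining an async generator under a timeout; `collect_with_timeout` and `fake_pages` are hypothetical helpers written for this example and are not part of the client.

```python
import asyncio
from typing import AsyncIterator, List, Optional, TypeVar

T = TypeVar("T")


async def collect_with_timeout(
    items: AsyncIterator[T], timeout_s: float, limit: Optional[int] = None
) -> List[T]:
    """Drain an async iterator, failing fast if it stalls past `timeout_s`."""
    collected: List[T] = []

    async def _drain() -> None:
        async for item in items:
            collected.append(item)
            if limit is not None and len(collected) >= limit:
                break

    # wait_for cancels _drain() and raises asyncio.TimeoutError on expiry.
    await asyncio.wait_for(_drain(), timeout=timeout_s)
    return collected


async def fake_pages(delay_s: float) -> AsyncIterator[int]:
    """Hypothetical stand-in for a paginated async generator."""
    for page_item in range(3):
        await asyncio.sleep(delay_s)
        yield page_item


print(asyncio.run(collect_with_timeout(fake_pages(0), timeout_s=1.0)))

try:
    asyncio.run(collect_with_timeout(fake_pages(5), timeout_s=0.05))
except asyncio.TimeoutError:
    print("generator did not finish within the timeout")
```

In a real check, the fake generator would be replaced by the async method under test, for example the `get_user_submissions_async` call shown above.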

@staticmethod
async def get_submission_count_async(
evaluation_id: str,
status: Optional[str] = None,

Contributor

Same comment as above: I think we should add validation to ensure that status is one of the allowed enum values.

"""Protocol defining the synchronous interface for SubmissionBundle operations."""

@classmethod
def get_evaluation_submission_bundles(

Contributor

The following code works for me:

async def list_submission_bundle():
    print("Starting to fetch bundles...")
    bundles = []
    try:
        async for bundle in SubmissionBundle.get_evaluation_submission_bundles_async(
            evaluation_id=evaluation.id,
            status="SCORED",  # Add status filter
            synapse_client=syn
        ):
            print(f"Got a bundle: {bundle.submission.id if bundle.submission else 'None'}")
            bundles.append(bundle)
    except Exception as e:
        print(f"Error occurred: {e}")
        import traceback
        traceback.print_exc()
    print(f"Found {len(bundles)} submission bundles")
    return bundles

With this filter the call completes, but no submission bundles are printed. If I remove status="SCORED", the code hangs again.

jaymedina and others added 6 commits December 23, 2025 19:00
…OOP model (#1252)

* tdd initial tests

* initial intro of dataclasses

* expose api services for submission object

* style and update docstring

* add submission and submissionstatus models

* add submission status retrieval and update methods; remove empty submissionstatus file

* pipe query params directly into restAPI httpx requests

* new dataclass object submission_bundle

* move submission services functions to evaluation_services.py

* renaming imports, to_synapse_request, request body refactor

* patching up store method signature

* update docs

* new suite of tests

* submissionstatus rework as a mutable object

* bug fix for Statuses: updated to_synapse_request to follow same pattern as evaluations design

* replace != with is not for full object comparison (not just keys)

* expose the is_private arg for to_submission_status_annotations ONLY FOR submission annotations

* fixed submission status/submission annotations

* add support for legacy annotations

* remove debug prints

* get_all_submission_statuses now returns a list of substat objects

* docstring updates

* update submissionbundle docstrings, add more examples

* initial sync test for status. moved evaluation_id retrieval to fill_from_dict for submissionstatus calls

* update submissionBundle submissionstatus with evaluation_id

* patch sync substatus integ tests. style.

* fix submissionStatus integ tests and has_changed attribute

* new substatus async integ tests. can_cancel can now be modified by an organizer on the client. cancel request returns no response body.

* new test class for submission cancel functionality

* substatus async unit tests

* remove compare=false for some attributes. update sync unit tests

* add submissionBundle integration tests

* add submissionBundle unit tests

* remove unnecessary imports and add style

* get_evaluation_submissions returns generator object

* get_user_submissions returns generator object

* submissionBundle methods return generators

* address final todo: implement docker_tag, and note in docs that version_number can be ignored for docker submissions

* add async page in api references

* add initial submission, status, bundle docs

* updated tutorial purpose. add api reference to navbar

* new tutorial scripts

* remove try/excepts. add line references. add resources and source code sections.

* add reference to File model

* deprecate old submission model

* style

* add submission status and bundle to references

* no need to import Submission for submission_organizer tutorial

* status should be CLOSED not CANCELLED. add Submission import back (its for the delete step).

* set tutorial global vars to None

* fix line references

* remove limits

* add 2 more examples for Submission docstring

* use the same evaluation_id in all examples

* elaborate on submissions, statuses, bundles, and their relationship to evaluations

* remove AcessControllable from being inherited
@jaymedina jaymedina marked this pull request as ready for review December 24, 2025 01:31
@jaymedina jaymedina requested a review from a team as a code owner December 24, 2025 01:31