@enrubio enrubio commented Nov 19, 2025

This PR:

  • Updates compute_dual_submission_metadata so that it uses the sparse scores and no longer uses heaps to compute the top 5 scores per row.
    • Removes sparse_value as a param, since it is now only used when requesting scores from the expertise API. I think this function should by default compute scores between active papers using sparse=5; if PCs want anything else, they should call the expertise API separately.
  • Updates request_paper_similarity so that you can pass a list of submissions for entity A or B. This is useful when you need to compute scores after the venue is already over.
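As a rough illustration of the first point, here is a minimal sketch of picking the top 5 scores per row without a heap. It assumes the sparse scores arrive as `(row_id, col_id, score)` triples; that format, the function name `top_k_per_row`, and the sample IDs are assumptions for illustration, not the code in this PR. Because the scores are already sparse, each row holds only a few entries, so a plain sort per row is probably enough.

```python
from collections import defaultdict


def top_k_per_row(sparse_scores, k=5):
    """Group (row_id, col_id, score) triples by row and keep the k highest.

    Hypothetical helper: assumes each row has only a handful of entries,
    so sorting each row is cheap and no heap is required.
    """
    by_row = defaultdict(list)
    for row_id, col_id, score in sparse_scores:
        by_row[row_id].append((col_id, score))
    return {
        row_id: sorted(entries, key=lambda e: e[1], reverse=True)[:k]
        for row_id, entries in by_row.items()
    }


# Made-up scores: (submission, other submission, similarity)
scores = [
    ('paper1', 'paper2', 0.91),
    ('paper1', 'paper3', 0.40),
    ('paper2', 'paper1', 0.91),
]
top = top_k_per_row(scores, k=1)
```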

Ran test jobs and checked that output is as expected.

# Check entity A params
if bool(venue_id) == bool(invitation):
    raise OpenReviewException('Provide exactly one of the following: venue_id, invitation')
if sum(map(bool, [venue_id, invitation, submissions])) != 1:
Member

Adding several optional parameters is very confusing. Let's create three different functions:

  • request_paper_similarity_by_venue(...)
  • request_paper_similarity_by_invitation(...)
  • request_paper_similarity_from_submissions(...)

They can all call the same inner function, which understands the expertise API request schema:

def _request_paper_similarity_core(
    self,
    name,
    entity_a,
    entity_b,
    model='specter2+scincl',
    sparse_value=400,
    baseurl=None,
):
    # Build and send the expertise API request for entity_a (and entity_b, if given).
    ...
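One thing the inner function could still do is enforce the "exactly one source per entity" rule in a single place. The sketch below is only an illustration under assumptions: `validate_entity` and `ENTITY_KEYS` are hypothetical names, each entity is assumed to be a plain dict, and `ValueError` stands in for `OpenReviewException` so the snippet runs on its own.

```python
ENTITY_KEYS = ('venue_id', 'invitation', 'submissions')


def validate_entity(entity, label):
    """Return the single key that is set, or raise if zero or several are set.

    Hypothetical helper; ValueError is used here in place of
    OpenReviewException to keep the example self-contained.
    """
    provided = [key for key in ENTITY_KEYS if entity.get(key)]
    if len(provided) != 1:
        raise ValueError(
            f'{label}: provide exactly one of the following: {", ".join(ENTITY_KEYS)}'
        )
    return provided[0]


# Made-up venue ID, for illustration only.
kind = validate_entity({'venue_id': 'Venue/A'}, 'entity A')
```

With per-source wrappers, each wrapper builds a dict with one key, so this check mostly guards direct callers of the inner function.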

Member Author

I see, so you mean request_paper_similarity_by_venue will just call request_paper_similarity with the correct params?

Member

This is what I propose:

def request_paper_similarity_by_venue(
    self,
    name: str,
    venue_id: str,
    alternate_venue_id: str | None = None,
    model: str = 'specter2+scincl',
    sparse_value: int = 400,
    baseurl: str | None = None,
):
    """Request similarity scores for the papers of one or two venues."""
    # Keyword names match the entity_a/entity_b parameters of the core function.
    return self._request_paper_similarity_core(
        name,
        entity_a={'venue_id': venue_id},
        entity_b={'venue_id': alternate_venue_id} if alternate_venue_id else None,
        model=model,
        sparse_value=sparse_value,
        baseurl=baseurl,
    )


def request_paper_similarity_by_invitation(
    self,
    name: str,
    invitation: str,
    alternate_invitation: str | None = None,
    model: str = 'specter2+scincl',
    sparse_value: int = 400,
    baseurl: str | None = None,
):
    """Request similarity scores for the papers of one or two invitations."""
    return self._request_paper_similarity_core(
        name,
        entity_a={'invitation': invitation},
        entity_b={'invitation': alternate_invitation} if alternate_invitation else None,
        model=model,
        sparse_value=sparse_value,
        baseurl=baseurl,
    )


def request_paper_similarity_from_submissions(
    self,
    name: str,
    submissions,
    alternate_submissions=None,
    model: str = 'specter2+scincl',
    sparse_value: int = 400,
    baseurl: str | None = None,
):
    """Request similarity scores for explicit lists of submissions."""
    return self._request_paper_similarity_core(
        name,
        entity_a={'submissions': submissions},
        entity_b={'submissions': alternate_submissions} if alternate_submissions else None,
        model=model,
        sparse_value=sparse_value,
        baseurl=baseurl,
    )
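A quick way to check that a wrapper forwards the right arguments is to replace the core with a mock. This is a sketch only: `SimilarityClient` is a hypothetical stand-in for the real client class, the wrapper is trimmed to one method, the keyword names follow the `entity_a`/`entity_b` parameters of the core signature quoted earlier in this thread, and the venue ID and return value are made up.

```python
from unittest.mock import MagicMock


class SimilarityClient:
    """Hypothetical stand-in for the class that owns these methods."""

    def _request_paper_similarity_core(self, name, entity_a, entity_b=None, **options):
        # The real method would call the expertise API; not needed for this check.
        raise NotImplementedError

    def request_paper_similarity_by_venue(self, name, venue_id,
                                          alternate_venue_id=None, **options):
        return self._request_paper_similarity_core(
            name,
            entity_a={'venue_id': venue_id},
            entity_b={'venue_id': alternate_venue_id} if alternate_venue_id else None,
            **options,
        )


client = SimilarityClient()
# Replace the core on this instance so that no request is sent.
client._request_paper_similarity_core = MagicMock(return_value='job-id')

result = client.request_paper_similarity_by_venue('dual-check', 'Venue/A', sparse_value=5)

# Raises AssertionError if the wrapper forwarded anything else.
client._request_paper_similarity_core.assert_called_once_with(
    'dual-check',
    entity_a={'venue_id': 'Venue/A'},
    entity_b=None,
    sparse_value=5,
)
```

The same pattern would apply to the invitation and submissions wrappers, since they differ only in the key they place in each entity dict.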
