Skip to content

(RUF012) Fixed mutable class Defaults- Task 1 #10283

Merged
cdrini merged 24 commits intointernetarchive:masterfrom
ananyakaligal:task
Jun 4, 2025
Merged

(RUF012) Fixed mutable class Defaults- Task 1 #10283
cdrini merged 24 commits intointernetarchive:masterfrom
ananyakaligal:task

Conversation

@ananyakaligal
Copy link

@ananyakaligal ananyakaligal commented Jan 6, 2025

Part of #10196
NOTE: These are NOT automated changes, they are done by hand

Prevents accidental modifications of mutable class attributes at the instance level.
Aligns with project style guidelines enforced by Ruff.

Technical
Updated mutable class attributes in 3 files to use typing.ClassVar for clarity and correctness.
Added ClassVar annotations to mutable class attributes to ensure they are treated as class-level attributes.
Ensured proper type hints for clarity and type safety.

Testing
Ran ruff check --select RUF012 to confirm no further warnings.

Stakeholders
@RayBB

@ananyakaligal
Copy link
Author

I closed the previous draft pull request because I ran into some errors on that branch and couldn't continue working on it. I created a new branch and am still working on it.

@ananyakaligal
Copy link
Author

Hello @RayBB,
I was working on fixing RUF 012 of TASK-1 and was annotating the variables with ClassVar. However, after adding ClassVar, I encountered the following error:
Cannot override instance variable (previously declared on base class "SearchScheme") with class variable

What happened:
In the base class SearchScheme, variables like field_name_map, sorts, and default_fetched_fields are instance variables. When I added ClassVar in the derived class, it treated them as class variables, which caused a conflict. This is because instance variables cannot be overridden by class variables, leading to the error.

Clarification:
All three files assigned to me as part of TASK-1 are using the same base class SearchScheme. Since the class variables were already correctly defined in the base class, no changes are needed to these files. Everything should be fine as is. Please correct me if I am wrong, and kindly assign TASK-2 if this is correct.

@RayBB RayBB mentioned this pull request Jan 7, 2025
24 tasks
@RayBB RayBB marked this pull request as draft January 7, 2025 04:28
@RayBB
Copy link
Collaborator

RayBB commented Jan 7, 2025

@ananyakaligal I need to catchup on many other of the issues before digging into this one. For now lets leave it as a draft and I'll follow up with you when I can evaluate 👍
But yes go ahead and work on the other one :)

@RayBB RayBB self-assigned this Jan 7, 2025
@ananyakaligal
Copy link
Author

@RayBB Thankyou, please do tag me after you evaluate it because the rule hasn't been enabled yet, and I can do it later.

@RayBB RayBB added Needs: Staff / Internal Reviewed a PR but don't have merge powers? Use this. and removed Needs: Staff / Internal Reviewed a PR but don't have merge powers? Use this. labels Jan 15, 2025
@RayBB
Copy link
Collaborator

RayBB commented Jan 29, 2025

I think that openlibrary/plugins/worksearch/schemes/editions.py will need to be changed as part of this PR.
So lets do that here instead of in #10286

@RayBB
Copy link
Collaborator

RayBB commented Jan 29, 2025

Copying this from the other PR:

In general the approach we'll want is:

  • By default, assume it should be moved to an immutable type.
  • Check if there is anywhere that is is being changed
  • If it is being changed somewhere we'll need to discuss but it should then probably become and instance variable.

@ananyakaligal ananyakaligal marked this pull request as ready for review January 30, 2025 15:11
@ananyakaligal
Copy link
Author

@RayBB Could you review this and let me know if any changes are required.

Copy link
Collaborator

@RayBB RayBB left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

General idea looks good to me. Left a few notes to on comments to restore (restore all please).

Copy link
Collaborator

@cdrini cdrini left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome, nice work @ananyakaligal ! The approach here seems good ; a few pieces of feedback for keeping the type information and some small bug fixes 👍

facet_rewrites: dict[tuple[str, str], str | Callable[[], str]]
def __init__(self):
# Set of queries that define the universe of this scheme
self.universe = set()
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's bring back the type information here, otherwise it doesn't know what the set contains.

Suggested change
self.universe = set()
self.universe: set[str] = set()

And so-on.

Also, let's change universe back to a list ; we shouldn't need it to be a set.

facet_rewrites: dict[tuple[str, str], str | Callable[[], str]] = {}
def __init__(self):
super().__init__()
self._universe = {'type:author'}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These shouldn't have underscore prefixes, or they won't override the baseclass values!

Suggested change
self._universe = {'type:author'}
self.universe = {'type:author'}

Comment on lines 201 to 202
('has_fulltext', 'true'): f'ebook_access:[{get_fulltext_min()} TO *]',
('has_fulltext', 'false'): f'ebook_access:[* TO {get_fulltext_min()}]',
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These will need their lambdas back, because the value needs to be computed on a per-user basis.

@cdrini cdrini added Needs: Submitter Input Waiting on input from the creator of the issue/pr [managed] and removed Needs: Response Issues which require feedback from lead labels Mar 12, 2025
@RayBB RayBB moved this from Waiting Review/Merge from Staff to Someone else is working on it in Ray's Project Mar 12, 2025
@github-actions github-actions bot removed the Needs: Submitter Input Waiting on input from the creator of the issue/pr [managed] label Apr 18, 2025
@ananyakaligal
Copy link
Author

@RayBB @cdrini does this look good?

@github-actions github-actions bot added the Needs: Response Issues which require feedback from lead label Apr 22, 2025
@RayBB RayBB moved this from Someone else is working on it to Waiting Review/Merge from Staff in Ray's Project May 14, 2025
@RayBB RayBB requested a review from cdrini May 17, 2025 21:19
Copy link
Collaborator

@RayBB RayBB left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've just gone through and made a couple tweaks but overall I think this is ready and it's a big improvement.

First, I made facet_fields and default_fetched_fields class variables because they must be based on how they're used. And I made them frozen sets so they can't be modified.

Second, I enabled the ruff rule since this is the last PR. From that I found small things to fix with component_mapping, types, and an array that should be a set.

Although this series of PRs for this rule have been a lot of work I certainly have learned a lot and I do think it's an improvement for the code base.

Hoping that @cdrini and I can review and merge this one Wednesday!

@cdrini
Copy link
Collaborator

cdrini commented Jun 4, 2025

@RayBB and I have been flipping back and forth between the two options here:

  • Put everything in constructor

    • Pros: Can be mutable, avoid awkward Mapping syntax
    • Cons: Small with accidental immutability on same instance? Unlikely
    • Cons: Have to constructor everywhere we access default_fetched_fields, even though we as humans know these things are immutable
    • Cons: Performance having to call the constructor
  • Put everything on the class

    • Pros: Can access single copy from class
    • Cons: Need to use weird Mapping syntax
    • Cons: Might require some extra magic type annotations in certain use cases because of the immutable nature
    • Cons: There's something we've forgotten about why we switched away from this
    • Pro: This is the way it's been running on production -- but with mutability (which could introduce subtle bugs)

I think long term we're leaning towards the second option, which better matches the intended meaning of these fields, which is that they are immutable, and biting the cost of the awkward python syntax required. BUT since this solution comes with a number of risks that we fear might make this take a long time to resolve/come to a conclusion, for this PR, we have decided to disable the ruff rule for these specific files, and revert them to their original state, so that we can unblock this PR and merge it in.

Thank you so much for your patience @ananyakaligal ! This one turned out to be significantly more subtle/intricate than any one of us anticipated :P And thank you so much @RayBB for diligently keeping things moving and making updates and progress as we discovered more about this problem space! @RayBB is also kindly taking on the work of reverting the SearchScheme files to make this ready-to-merge!

Copy link
Collaborator

@cdrini cdrini left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks perfect!!! Thank you so much @RayBB and @ananyakaligal !

@cdrini cdrini merged commit 975c2a4 into internetarchive:master Jun 4, 2025
4 checks passed
@github-project-automation github-project-automation bot moved this from Waiting Review/Merge from Staff to Done in Ray's Project Jun 4, 2025
@RayBB RayBB mentioned this pull request Jun 4, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Affects: Developers Needs: Response Issues which require feedback from lead

Projects

Status: Done

Development

Successfully merging this pull request may close these issues.

4 participants