Skip to content

Conversation

steppi
Copy link
Collaborator

@steppi steppi commented Oct 15, 2025

The primary purpose of this PR is to refactor AdeftClassifier to make it easier to use other forms of models besides logistic regression with tfidf features. Rather than hardcoding the model type into AdeftClassifier, AdeftClassifier now takes a parameter estimator. I've written this in such a way to maintain backwards compatibility. I've written the estimator class BaselineModel which implements the current adeft models, and made this the default in AdeftClassifier. I have not defined an abstract base class yet for estimators used in AdeftClassifier, but plan to do so when I start working on adding other model types. The intention of this PR is simply to lay the groundwork for future improvements. There was some shuffling around required to make this all work out.

Also included in this PR are:

  • a fix to the adeft grounding gui, which I found no longer worked when started from within a jupyter notebook due to changes in jupyter.
  • a function str2filename for mapping model names to safe filenames, which is needed since we use model names as filenames, and used to be contained in adeft_indra under a different name before I accidentally removed it from there without replacing it somewhere else 4 years ago.
  • A fix to the underlying issue behind Error when disambiguating using scikit-learn 1.5.0  #80 that does not require retraining models.
  • bumping max_iter for logistic regression to get some warnings in the tests to go away

I will follow this PR up with a PR improving the validation of adeft models, allowing for fair comparisons between models of different types.

@steppi
Copy link
Collaborator Author

steppi commented Oct 15, 2025

I also had to update the tested python versions to no longer include python 3.7, which is no longer supported and no longer available in github actions. I added python 3.13 for good measure.

@steppi
Copy link
Collaborator Author

steppi commented Oct 15, 2025

Oops. I was doing some debugging, playing around with changing adeft.__version__ because that controls what models get downloaded, and accidentally left it set to an older version. Fixed. Ideally we'd have versions like 0.12.4-dev, but I'd need to think about how that would interact with us having separate folders of models on s3 for each version.

@steppi
Copy link
Collaborator Author

steppi commented Oct 16, 2025

I've gone ahead and created an abstract base class for adeft models because while working on validation I realized it would be clean to do so.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant