Skip to content

Suggestions#289

Merged
Fuhan-Yang merged 6 commits into
fy_refactor_lplfrom
fy_refactor_lpl_swo
Mar 31, 2026
Merged

Suggestions#289
Fuhan-Yang merged 6 commits into
fy_refactor_lplfrom
fy_refactor_lpl_swo

Conversation

@swo
Copy link
Copy Markdown
Collaborator

@swo swo commented Mar 30, 2026

  • Write out the groups everywhere -- This removes any confusion about whether "season_geo" is a group of not, and for what purpose
  • Models take in a list of quantiles to do predictions at
    • this separates the logic of which quantiles we want from the functionality of developing them
    • move alpha into the top-level config, rather than being mixed across models and plots
  • Move all the preprocessing involving seasons into scripts/preprocess.py -- this means that the season starts are no longer model parameters
  • Move seed into the model params
  • Give LPLModel a single _preprocess() method that does both the column renaming and the indexing (via _index())
  • Repair tests and viz

@swo swo force-pushed the fy_refactor_lpl_swo branch from 316db9a to eb84b11 Compare March 30, 2026 20:17
@swo swo marked this pull request as ready for review March 30, 2026 20:17
@swo swo requested a review from Fuhan-Yang March 30, 2026 20:17
Copy link
Copy Markdown
Contributor

@Fuhan-Yang Fuhan-Yang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My lesson learned here is how to code in a more OOP way. It seems to me is that to put signatures as class attributes as much as it makes sense (which enables handling by instance method or classmethod) is more OOP than having external signatures handled by staticmethod. There are questions about defining groups as a global variable, see below!

Comment thread iup/models.py Outdated

class CoverageModel(abc.ABC):
groups = ["season", "geography"]

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder why groups is coded this way rather than having a global variable defined in iup.utils? This way hard-code groups for multiple classes (including RFModel), which makes it harder to debug than having a global variable governing groups for all classes? In addition, this way will lead to importing CoverageModel to retrieve groups in scripts/eval.py, which I feel is not that efficient?

Copy link
Copy Markdown
Collaborator Author

@swo swo Mar 31, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The more I think about this, the more I think we should just write out ["season", "geography"] everywhere.

First, if we ever do want to add different groups to this codebase, then we're going to need to figure out what to do with the LPL model and its crossterms. Any design of code that allows for different groups is wholly speculative, since we haven't figured out the science first.

Second, the word "groups" gets used in two ways. It refers to the things that the models use (which includes season_geo for the LPL model, but not for the RF) and to the things used in scoring and visualization (which does not include season_geo, even for the LPL model). So it's not even the case that, as I wrote it, we can say the "groups" for the LPL models are season, geo, season_geo and the groups for the RF model are season, geo, since for the LPL model the "groups" sometimes include season_geo and sometimes don't.

Copy link
Copy Markdown
Contributor

@Fuhan-Yang Fuhan-Yang Mar 31, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems to me this is the discrepancy between what we plan to do versus what the final results to be: what we plan to do is to differ the groups for fitting versus the groups for postprocessing, in this case it makes sense to define groups for fitting and for postprocessing separately. While the final results are that only the groups used for LPLModel fitting needs cross-terms, that's why I tend to add the cross-terms in the LPLModel.preprocess() and it will only use for fitting. I guess it is ok to use either method as it achieve the same functionality? I'm still a bit bothered that, if pursuing the track of what we plan to do, that the groups are separately hard-coded for abstract class and its subclass, and importing the groups from abstract class in scripts...seems to me it's bug-prone, especially if in the future different groups will be added...But if we decide this is just for current version of only considering season and geo, I guess this is ok too? All of this is pretty subtle...

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I redid this, writing out the groups everywhere, and it's not so bad. I think it actually makes things a little clearer.

Comment thread iup/models.py Outdated
"""

groups = ["season", "geography", "season_geo"]

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similarly, hard-coding groups in LPLModel independent from CoverageModel feels little finicky...

Comment thread iup/models.py
iup.CumulativeCoverageData(data.rename({"sample_size": "N_tot"}))
@classmethod
def _preprocess(cls, data: pl.DataFrame) -> pl.DataFrame:
out = (
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Making _preprocess() to be a class method is more OOP than static method, which is nice! Does this lead to hard-coding groups (since groups is not processed here, it needs to be defined somewhere)?

Comment thread scripts/config_full.yaml
Comment thread scripts/eval.py Outdated
import iup.models
from iup.utils import date_to_season

GROUPS = iup.models.CoverageModel.groups
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is where I feel finicky about importing a global variable from an abstract class instead of utils...

Comment thread iup/models.py
Comment thread scripts/preprocess.py
@swo swo requested a review from Fuhan-Yang March 31, 2026 18:15
Copy link
Copy Markdown
Contributor

@Fuhan-Yang Fuhan-Yang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approved!

@Fuhan-Yang Fuhan-Yang merged commit 9ede9bc into fy_refactor_lpl Mar 31, 2026
4 checks passed
@Fuhan-Yang Fuhan-Yang deleted the fy_refactor_lpl_swo branch March 31, 2026 18:27
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants