-
Notifications
You must be signed in to change notification settings - Fork 13
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
SUT Options defined at test level #897
base: main
Are you sure you want to change the base?
Conversation
MLCommons CLA bot All contributors have signed the MLCommons CLA ✍️ ✅ |
Conceptually, I like the idea of removing sut options from TestItems and passing them in where needed. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems good other than the disabled test.
As to how to make it so people are less likely to accidentally break things by using the wrong options, would it make sense for the TestItem to have a pointer to the Test? Then anything that needs them, like a runner, could either do test_item.test.sut_options()
or just test_item.sut_options()
which has a default implementation that passes the request along to the Test.
src/modelgauge/base_test.py
Outdated
@@ -50,6 +51,12 @@ def get_annotators(cls) -> List[str]: | |||
""" | |||
pass | |||
|
|||
@classmethod |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is this a class method? It seems to me that it could happen at the instance level, although given our small number of tests, I suppose it's hard to say.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I might also not make it an abstract method, come to think of it. Having a default that uses a default SUTOptions seems fine by me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought it might be useful for it to be a class method so that we could get information without having to instantiate a test object, which might be expensive. But it definitely doesn't have to be a class method.
And yeah you're right about getting rid of the abstract portion. I'll implement add a default.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, yeah, if that's something we're doing, that would be a good reason. But if we don't need it now, I'd say stick with an instance method and we can always rejigger later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay I think I did it.
@classmethod | ||
def sut_options(cls) -> SUTOptions: | ||
"""Returns the SUT options that are supplied in each test item.""" | ||
return SUTOptions( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This makes a new one each time, which is fine if it's not called often, but "once per test item" is for us a pretty frequent call.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah that's true. I could make it a class property instead.
@wpietri @rogthefrog I ended up going with Roger's suggestion of removing sut_options from TestItem. I tried doing William's suggestion but you can't really point to non-pydantic models in a pydantic model. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for all the cleanup work!
This is part of the effort to chisel away the need for the test wrapper class. More specifically, this PR makes it easy to access the SUT options used by a test object.
However, there is no guarantee that the SUT options returned by
test.sut_options()
are actually used in all of the test items. This is because the tests are responsible to assembling their test own test items (i.e. bundling the prompts + contexts + sut options).I see 3 options here:
sut_options()
is accurate (i.e. what this PR does). I don't like this weak link between the method and what it actually represents, but it is the easiest approach.sut.translate_text_prompt(prompt)
becomessut.translate_text_prompt(prompt, sut_options)
.