Analytics cli #894
Conversation
MLCommons CLA bot: All contributors have signed the MLCommons CLA ✍️ ✅
@@ -102,6 +102,8 @@ def validate_uid(ctx, param, value):
    Raises a BadParameter exception if the user-supplied arg(s) are not valid UIDs.
    Applicable for parameters '--sut', '--test', and '--annotator'.
    """
    if not value:
That seems inconsistent with the comment right above: isn't a null or blank value a bad UID?
Yeah, I can see how it looks a little counterintuitive. But this callback is called even when no argument is supplied, so it shouldn't fail when we aren't actually validating a UID.

Before, every click option that used this `validate_uid` callback either required exactly one argument or optionally allowed a list of args, so `validate_uid` would never receive `None`.

However, my new CLI has an optional option `--sut`. Since it's not a list, its default value is `None` if no SUT is specified, instead of an empty list. That's why I need to add this line of code.

Basically, this function only validates UID strings that are actually provided. Does that make sense?
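For context, a minimal sketch of the pattern: a click callback that skips validation when the option is unset. The option name and exception type match the diff above; the validation check itself is a placeholder, not the real logic.

```python
import click

def validate_uid(ctx, param, value):
    """Raises a BadParameter exception if the user-supplied arg(s) are not valid UIDs."""
    if not value:
        # Option was omitted: click still invokes the callback, with None
        # (or an empty tuple for multiple=True options), so skip validation.
        return value
    uids = value if isinstance(value, tuple) else (value,)
    for uid in uids:
        if not uid.strip():  # placeholder check; the real validation differs
            raise click.BadParameter(f"{uid!r} is not a valid UID.")
    return value

@click.command()
@click.option("--sut", callback=validate_uid, required=False)  # optional: defaults to None
def run(sut):
    click.echo(f"sut={sut}")

if __name__ == "__main__":
    run()
```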
def run_csv(sut_uid, annotator_uids, workers, output_dir, tag, debug, input_path, max_tokens, temp, top_p, top_k):
    """Run rows in a CSV through some SUTs and/or annotators.

    If running a SUT, the file must have 'UID' and 'Text' columns. The output will be saved to a CSV file.
What is `Text`? Is it the prompt?
Yes. This is identical to the already-existing `run-csv-items` command.
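For illustration, this is the input shape, assuming only what the docstring states (the 'UID' and 'Text' column names); the file name and row contents are invented:

```python
import csv

# Hypothetical input file with the columns the run-csv docstring requires.
with open("prompts.csv", newline="") as f:
    for row in csv.DictReader(f):
        print(row["UID"], row["Text"])  # 'Text' holds the prompt sent to the SUT
```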
As long as we're changing things, do we want to change this so that it will read our prompt files? Because I think that's going to be the most common use case.
I can do that, but those changes will also affect the `run-csv-items` command that the eval folks use.
I think the CSV-ness of this is incidental to the purpose, so I might be more inclined to go with
Looks good! I think not having everything in the metadata for now is fine. Let's get it into people's hands and see what they want once they're using it.
    catch_exceptions=False,
)

assert result.exit_code == 0
I see what you're saying about the tests. Have you tried creating an object that holds all of those variables? You could at least move the boilerplate inside it, and I suspect it would be a good place for some convenience methods that would clear up your code some.
Yeah, I thought about that, but I'm having trouble setting up the boilerplate because each test differs slightly in input file, CLI args, and expected output file.
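For what it's worth, a hypothetical sketch of the suggested helper, built on click's CliRunner. The class name, fields, subcommand name, and import path are all assumptions; the idea is that the pieces that vary per test become fields, while the shared boilerplate lives in one method:

```python
from dataclasses import dataclass, field
from click.testing import CliRunner

from modelbench.cli import cli  # assumed import path; adjust to the real CLI entry point


@dataclass
class CliRunCase:
    """Bundles the bits that vary per test: input file, CLI args, expected output."""
    input_path: str
    cli_args: list[str] = field(default_factory=list)
    expected_output_file: str = "output.csv"

    def invoke(self, tmp_path):
        # Shared boilerplate lives here; each test only supplies the fields above.
        runner = CliRunner()
        args = ["run-csv", self.input_path, "--output-dir", str(tmp_path), *self.cli_args]
        result = runner.invoke(cli, args, catch_exceptions=False)
        assert result.exit_code == 0
        return tmp_path / self.expected_output_file
```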
@wpietri I'm not very poetic so I renamed to
First attempt at an analytics CLI tool: `run-csv`. To summarize, each run results in a new directory containing the output results file and a metadata JSON file.

The output structure is largely based off of the sample dvc directory in the modelbench-private `dvc-experiments` branch. However, I was not able to follow it 100%. Here are the key differences (a speculative sketch of the resulting layout appears below):

- `suts` instead of one `sut`. This is because the pipeline runner is shared with the already-existing `run-csv-ITEMS`, which allows for multi-SUT runs.
- Renamed the `prompts` metadata key to `input` in order to generalize to runs where the input is prompts + responses.
- No `provider` and `provider_id` in the metadata. The best I could do is include the SUT initialization record. Annotators do not have initialization records, however, which maybe we should fix.
- No `unique_count` for responses in the metadata, because it would be kind of difficult to keep track of. If it's super important, we can include it in a later PR.

Some questions
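As referenced above, a purely speculative sketch of a run's output directory and metadata, inferred only from this description; every name and value below is an assumption except the `suts` and `input` keys, which the description mentions:

```python
import json
from pathlib import Path

# Hypothetical per-run directory holding the output results file and a
# metadata JSON file, as the description says.
run_dir = Path("runs/20240101-120000-my-tag")  # invented naming scheme

metadata = {
    # 'suts' (a list) rather than a single 'sut', since the shared pipeline
    # runner supports multi-SUT runs.
    "suts": [{"uid": "demo_sut", "initialization_record": {"...": "..."}}],
    # 'input' rather than 'prompts', to cover runs whose input is
    # prompts + responses as well as prompts alone.
    "input": {"path": "prompts.csv"},
    # Deliberately absent per the description: 'provider', 'provider_id',
    # and a 'unique_count' for responses.
}

run_dir.mkdir(parents=True, exist_ok=True)
(run_dir / "metadata.json").write_text(json.dumps(metadata, indent=2))
```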