-
Notifications
You must be signed in to change notification settings - Fork 77
Updated parsing of problem.yaml by using checks as described in the specification #295
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: develop
Are you sure you want to change the base?
Conversation
…m.yaml. Co-authored-by: square-cylinder <[email protected]> Cleaned up and added more checks for problem.yaml Co-authored-by: square-cylinder <[email protected]> detect unknown fields f-strings compatible with older python versions fix bug
made it more obvious if old folder name is used, or folder is misspelled made check error messages clearer for people newer to problemtools closest word functionality bugfix index error
readded comment stating weird it is Name change and pytest fix
…roduced in python 3.12
Merged conflict in hello test
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I started looking at this PR, and I think it needs relatively large amount of work. I don't understand several of the approach changes from #294 at all (not trying to merge the data into a common structure, putting all code in verifyproblem instead of a separate module, not having any test cases).
I ran out of time while reviewing this, but setting this as request changes to not delay giving you the feedback I had for the part I had time to review.
@@ -262,7 +269,7 @@ def check(self, context: Context) -> bool: | |||
self.check_size_limits(self.ansfile) | |||
self._problem.getProblemPart(InputValidators).validate(self) | |||
anssize = os.path.getsize(self.ansfile) / 1024.0 / 1024.0 | |||
outputlim = self._problem.get(ProblemConfig)['limits']['output'] | |||
outputlim = self._problem.get(ProblemConfigLegacy)['limits']['output'] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changing ProblemConfig
to ProblemConfigLegacy
like this feels dangerous, and makes a reader worry about how this works when reading in a new problem config. I'd advice at the very least signaling some intent here by accessing via ProblemConfigBase
for fields that are the same for both formats.
A better solution would be to ensure the config as read by the rest of the program looks the same (has the same keys, and the same types) regardless of what version was used. I believe you did this in #294.
# Some deprecated properties are inherited from problem config during a transition period | ||
problem_grading = problem.get(ProblemConfig)['grading'] | ||
problem_grading = problem.get(ProblemConfigLegacy)['grading'] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How does this work with a new config file, which lacks the grading
key? (This question applies to many places where you access keys which are named differently, or have different value types in the two versions.)
self.error(f'License needs to be one of: {self._VALID_LICENSES}') | ||
if self._data['license'] == 'unknown': | ||
self.warning("License is 'unknown'") | ||
def fix_config_structure(self): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks to me like ProblemConfigBase
is an abstract base class, and that this is an abstract method? If so, consider inheriting from ABC
and use the decorator @abstractmethod
here.
|
||
|
||
def check(self, context: Context) -> bool: | ||
if self._check_res is True: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why did you rewrite the memoization of _check_res
this way? It looks like the memoization is now broken.
return self._check_res | ||
elif self._check_res is not False: | ||
self._check_res = True | ||
to_check = [prop for prop in dir(self) if prop.startswith('_check_') and callable(getattr(self, prop))] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not a fan of trying to find all methods with a special name and running them like this. Why was this approach taken? Do you end up with so many check methods that you can't cleanly list them?
self._check_res = True | ||
to_check = [prop for prop in dir(self) if prop.startswith('_check_') and callable(getattr(self, prop))] | ||
for prop in to_check: | ||
self.info(f'Checking "{prop}"') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Internal methods called should not be logged at info level.
return best | ||
|
||
class ProblemConfigLegacy(ProblemConfigBase): | ||
DEFAULT_LIMITS = { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These defaults used to be specified in a problem.yaml
config file (and user-configurable, as is documented in README.md
). Does the fact that you list the defaults here imply that you dropped support for having the defaults configurable via config file (and user-configurable via a config file in the home directory)? If so, it feels like a big step backwards to hardcode these values rather than have them in a config file.
|
||
def fix_config_structure(self): | ||
self._data.setdefault('problem_format_version', formatversion.VERSION_LEGACY) | ||
if self._data.get('problem_format_version') != formatversion.VERSION_LEGACY: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These checks (checking that something is one of a few allowed values) get very repetitive. It looks like you ened a helper function, something like
check_allowed_values(self, key: str, values: list, error_fn = self.error)
|
||
if 'name' in self._data: | ||
val = self._data['name'] | ||
if type(val) is not str: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Similarly, here, you should have a check_type()
function to make the code less repetitive.
Alternatively, both such check functions can probably be implemented using json-schemas.
Added checks for legacy and 2023-07 for parsing problem.yaml, making sure its correct and defaulting values.
Made sure to keep the code itself simple so that eventual changes and upkeep to it will be easy to go through with.
It has been tested on most PO problems from the last 5 years as well as islandic problems.
For the new format it has not been tested as much (due to lack of available configs), only a few edgecases and few self made configs.
If there is anything that looks weird or any questions I will be available.