-
Notifications
You must be signed in to change notification settings - Fork 77
New config system #294
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: develop
Are you sure you want to change the base?
New config system #294
Conversation
Co-authored-by: Zazmuz <[email protected]>
standardise casing on bools
Co-authored-by: Zazmuz <[email protected]>
Co-authored-by: Zazmuz <[email protected]>
Co-authored-by: Zazmuz <[email protected]>
Co-authored-by: Zazmuz <[email protected]>
Co-authored-by: Zazmuz <[email protected]>
Co-authored-by: Zazmuz <[email protected]>
Co-authored-by: Zazmuz <[email protected]>
Co-authored-by: Zazmuz <[email protected]>
Co-authored-by: square-cylinder <[email protected]>
Co-authored-by: square-cylinder <[email protected]>
This might be a lot to bite into, @gkreitz, if you wonder anything along the way, just ask us, we are happy to help! |
We've had a slack discussion on the general approach taken here. I think the conclusion is that a different approach using json-schemas will be attempted, to see if that leads to a simpler code base. So we'll "pause" this PR until we see how that plays out. Thanks @sjoqvist for having written an example schema in #285. Its existence was helpful in informing our discussion! |
New system for problem config
A new system for handling loading of configs has been implemented. For a specification document of how it works, see problemtools/config_parser/spec.md. The overview is that you can now create a yaml file that will specify all types, defaults, and some requirements for a problem-format. This system was inspired by json-schemas for verifying the structure of json documents, although pretty specialized for configs for problem-formats. The thought is that this system will also later be used to handle the testdata.yaml and so on later on.
Many aspects of the system have been verified with tests, and for legacy we have extensively verified problems from many competitions to see that it works. Some more testing will be required when it comes to the new format however, as the only real config that has been loaded is a very minimal one.
I think all the checks that happened previously still happen for the legacy format, and every type and structure is now being tested which I don't think was the case previously.
Code overview
I will present a brief overview of the different parts that make up the new system.
Metadata class
This class represents a config that can be loaded. The constructor takes one specification dictionary, and the functions that are supposed to be used are
load_config
which basically take the data and some data to "inject". This means that the data will be thrown into the loaded config which is meant to be use for copying standard values. Also acheck_config
function is supposed to be used to detect some additional errors like incompatible alternatives.Path class
This class is meant to be used to index multiple layers of dictionaries and lists. Maybe some dependency could be added to handle this functionality. These functions were just implemented as pretty basic stuff is required.
AlternativeMatcher class
It was required that we could match strings, ints, floats and bools for various purposes by providing strings. How these match-strings look is described in the spec.md. AlternativeMatch is a baseclass for all the matcher classes, and provides a factory to get the corresponding one based on a type string.
Parser class
To handle many peculiarities of the problem format, custom parsing was added, so a lot of parsers were implemented for different things to get them into a easier-to-work-with format. For example legacy validation is a string with space-separated keywords, which would be much easier to work with if it was an object with the corresponding properties as bools, so a parser "legacy-validation" converts this string to this object representation. Also for weird resolutions of copying values like rights_holder a parsing-rule looked at if the license was public domain and if so did not copy this value from author/source (if not specified) otherwise it does. Parsing rules are pretty powerful as they can list what paths should be resolved before they are enacted. The parser class provides a factory to get the corresponding parsing rule based on its name and type.
Quirks of specification
Because the specification has some quirks like discussed in section about parsers, we took some liberties to convert everything to a more standardized way after loading in the config. This does not mean it will misinterpret the config as it is specified, but rather that after the config is loaded it might not look exactly the same as in the specification. For example for "grading" in the legacy format (which was deprecated from legacy but should stay I was told), the format specified some floats as type "string", which we just interpret as floats instead. Also, after having parsed the config the layout of the config will be entirely standardized, and a given path into the dictionary will always give the same type no matter the input data. All values should be defaulted to something after config is loaded, so as long as the path exists, it should not give a indexing-error.