Add a brute force option to re-use identical structs #298
Description
Currently, it's impossible for kopium to detect similar structs re-used at different hierarchies because the schemas all have to be inlined and we cannot use $ref
in schemas. kubernetes/kubernetes#62872
Suggestion
We should try to brute-force compare all structs (field names + types) to all previously seen structs in the analyzer to deduplicate so that we don't generate 3 different structs for things such as livenessProbe, resources, MatchLabels (under different names).
We already have the skeletons of such code, but it only checks for a few known ones like Condition
here and it's enabled toggled under a flag (default enabled) --no-condition
.
If we pass existing state into the extractor functions (such as extract_container
) we can then inspect current state before adding duplicates (ensuring we only take the first instance of a struct) via something like is_duplicate_struct
. It would also avoid having an awkward O(n^2) algorithm at the end.
My thinking is that it might not be a particularly heavy operation on (at most) a few hundred kB of schema data.
Limitations
This would not solve the problem where we are generating a type for something know; e.g. something like Condition
might be deduplicated with this algorithm alone, but it would not point to the correct upstream type in e.g. k8s-openapi
without futher logic.
This is probably fine however; it allows overrides to be done more easily (change / elide one struct rather than find and replace multiple), and it's a good stepping stone towards identifying them with upstream variants (such as is done for Condition
).