I have an idea to use pip-resolver-benchmarks and Pip code to create tests that exactly model real-world scenarios for resolvelib.
In particular, I would want to add the test cases from #134, each represented as JSON; then, as part of test prep, the pip-resolver-benchmarks code would build the wheels, and enough Pip code would be included for the scenario to fail against the current resolvelib.
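Roughly what I have in mind, as a minimal sketch: the JSON shape, names, and the in-memory provider below are placeholders I made up for illustration, not the actual pip-resolver-benchmarks format or Pip's provider. In the real version the provider would be replaced by built wheels plus Pip's resolution machinery; this only shows how a JSON scenario could drive a resolvelib test end to end.

```python
# Hypothetical scenario format and test harness -- the JSON shape and all
# names here are placeholders, not the real pip-resolver-benchmarks format.
import json

from packaging.requirements import Requirement
from packaging.version import Version

import resolvelib

SCENARIO = json.loads("""
{
  "index": {
    "A": {"1.0": ["B>=2.0"], "2.0": ["B>=3.0"]},
    "B": {"2.0": [], "3.0": ["C<2.0"]},
    "C": {"1.0": [], "2.0": []}
  },
  "root": ["A"],
  "expected": {"A": "2.0", "B": "3.0", "C": "1.0"}
}
""")


class Candidate:
    def __init__(self, name, version, dependencies):
        self.name = name
        self.version = Version(version)
        self.dependencies = [Requirement(d) for d in dependencies]


class InMemoryProvider(resolvelib.AbstractProvider):
    """Serves candidates from the scenario's in-memory index; in the real
    test this would be Pip's provider backed by wheels built from the JSON."""

    def __init__(self, index):
        self.index = index

    def identify(self, requirement_or_candidate):
        return requirement_or_candidate.name

    def get_preference(self, identifier, resolutions, candidates, information, backtrack_causes):
        # Prefer projects with fewer remaining candidates.
        return sum(1 for _ in candidates[identifier])

    def find_matches(self, identifier, requirements, incompatibilities):
        reqs = list(requirements[identifier])
        bad_versions = {c.version for c in incompatibilities[identifier]}
        candidates = [
            Candidate(identifier, version, deps)
            for version, deps in self.index.get(identifier, {}).items()
        ]
        return sorted(
            (
                c
                for c in candidates
                if c.version not in bad_versions
                and all(c.version in r.specifier for r in reqs)
            ),
            key=lambda c: c.version,
            reverse=True,
        )

    def is_satisfied_by(self, requirement, candidate):
        return candidate.version in requirement.specifier

    def get_dependencies(self, candidate):
        return candidate.dependencies


def test_scenario_resolves_to_expected_pins():
    provider = InMemoryProvider(SCENARIO["index"])
    resolver = resolvelib.Resolver(provider, resolvelib.BaseReporter())
    result = resolver.resolve([Requirement(r) for r in SCENARIO["root"]])
    pins = {name: str(c.version) for name, c in result.mapping.items()}
    assert pins == SCENARIO["expected"]
```

For a case from #134 the `"expected"` key could instead record the failure mode (e.g. excessive backtracking or a wrong pin), so the test documents the current bad behaviour until it is fixed.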
While I would try to keep it as minimal as possible, it would probably take quite a lot of code from both pip-resolver-benchmarks and Pip to get this working, so I understand if it's not acceptable.
@pradyunsg, do you have any thoughts or objections?