Improve example testing #651

hbierlee · 2025-04-21T11:20:58Z

I think the examples are great, and it is important that they work in case someone tries them out. This PR improves the example testing script:

- Most importantly, it runs each example with __main__ set; otherwise a lot of examples are skipped.
  - This shows many additional failures, some of which may be due to the example being outdated (perhaps worth fixing), or to some other type of bug.
  - We can make separate issues for each case, or fix them here if it's easy enough.
- It parametrizes the solver used to run the example, so other solvers can be tested.
- It adds a timeout, which should probably be counted as "skipped", not "failed".
  - This does require an additional dependency in .[test], namely pytest-timeout.
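
To illustrate the first point, here is a minimal sketch (not the actual test code in this PR) of why the run name matters. It uses the standard-library `runpy.run_path`, whose `run_name` argument controls what `__name__` is inside the executed script. The file name `toy_example.py` and the variable `solved` are hypothetical stand-ins for a real example.

```python
import runpy
import tempfile
from pathlib import Path

# A stand-in for an example script: the real work sits behind the main guard.
EXAMPLE_SOURCE = '''
solved = False
if __name__ == "__main__":
    solved = True
'''

def run_example(path, as_main):
    """Execute a script and return a copy of its resulting module globals."""
    run_name = "__main__" if as_main else "<run_path>"
    return runpy.run_path(str(path), run_name=run_name)

with tempfile.TemporaryDirectory() as tmp:
    script = Path(tmp) / "toy_example.py"
    script.write_text(EXAMPLE_SOURCE)

    skipped = run_example(script, as_main=False)   # guarded block is not run
    executed = run_example(script, as_main=True)   # guarded block is run

print(skipped["solved"], executed["solved"])
```

Without the `__main__` run name, the guarded block never executes, so a test would pass without exercising the model at all.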

Seems to take very long (but not on master?)

Unlike pytest-xdist, the user *needs* timeouts to run the test-suite completely

hbierlee · 2025-04-21T13:40:52Z

The following tests are now newly failing with a timeout (TO) of 1 minute, running with or-tools and minizinc (gurobi is set in the test but not installed in CI, so it is skipped):

2025-04-21T12:46:38.5297654Z =========================== short test summary info ============================
2025-04-21T12:46:38.5298041Z FAILED tests/test_examples.py::test_examples[ortools-./examples/csplib/prob007_all_interval.py] - ValueError: Problem is unsatisfiable
2025-04-21T12:46:38.5298585Z FAILED tests/test_examples.py::test_examples[ortools-./examples/advanced/counterfactual_explain.py] - TypeError: LinearExpr::weighted_sum() only accept constants as coefficients: 'bool_'
2025-04-21T12:46:38.5299044Z FAILED tests/test_examples.py::test_examples[ortools-./examples/advanced/ocus_explanations.py] - TypeError: unhashable type: 'set'
2025-04-21T12:46:38.5299422Z FAILED tests/test_examples.py::test_examples[ortools-./examples/csplib/prob054_n_queens.py] - Failed: Timeout >60.0s
2025-04-21T12:46:38.5300139Z FAILED tests/test_examples.py::test_examples[ortools-./examples/csplib/prob011_basketball_schedule.py] - cpmpy.exceptions.TypeError: and-operator only accepts boolean arguments, not [config[2,0] == 8 config[2,1] == 8 config[2,2] == 8 config[2,3] == 8
2025-04-21T12:46:38.5300271Z  config[2,4] == 8 config[2,5] == 8 config[2,6] == 8 config[2,7] == 8
2025-04-21T12:46:38.5300341Z  config[2,8] == 8]
2025-04-21T12:46:38.5300809Z FAILED tests/test_examples.py::test_examples[minizinc-./examples/csplib/prob001_convert_data.py] - FileNotFoundError: [Errno 2] No such file or directory: 'data.txt'
2025-04-21T12:46:38.5301178Z FAILED tests/test_examples.py::test_examples[minizinc-./examples/csplib/prob007_all_interval.py] - ValueError: Problem is unsatisfiable
2025-04-21T12:46:38.5301608Z FAILED tests/test_examples.py::test_examples[ortools-./examples/csplib/prob044_steiner.py] - Failed: Timeout >60.0s
2025-04-21T12:46:38.5302067Z FAILED tests/test_examples.py::test_examples[ortools-./examples/csplib/prob001_convert_data.py] - FileNotFoundError: [Errno 2] No such file or directory: 'data.txt'
2025-04-21T12:46:38.5302369Z FAILED tests/test_examples.py::test_examples[minizinc-./examples/sudoku_chockablock.py] - Failed: Timeout >60.0s
2025-04-21T12:46:38.5302733Z FAILED tests/test_examples.py::test_examples[minizinc-./examples/csplib/prob026_sport_scheduling.py] - Failed: Timeout >60.0s
2025-04-21T12:46:38.5303105Z FAILED tests/test_examples.py::test_examples[minizinc-./examples/blocks_world.py] - Failed: Timeout >60.0s
2025-04-21T12:46:38.5303422Z FAILED tests/test_examples.py::test_examples[minizinc-./examples/csplib/prob054_n_queens.py] - Failed: Timeout >60.0s
2025-04-21T12:46:38.5304128Z FAILED tests/test_examples.py::test_examples[minizinc-./examples/csplib/prob011_basketball_schedule.py] - cpmpy.exceptions.TypeError: and-operator only accepts boolean arguments, not [config[2,0] == 8 config[2,1] == 8 config[2,2] == 8 config[2,3] == 8
2025-04-21T12:46:38.5304257Z  config[2,4] == 8 config[2,5] == 8 config[2,6] == 8 config[2,7] == 8
2025-04-21T12:46:38.5304326Z  config[2,8] == 8]
2025-04-21T12:46:38.5304617Z FAILED tests/test_examples.py::test_examples[minizinc-./examples/flexible_jobshop.py] - Failed: Timeout >60.0s
2025-04-21T12:46:38.5305077Z FAILED tests/test_examples.py::test_examples[minizinc-./examples/advanced/counterfactual_explain.py] - AttributeError: 'NoneType' object has no attribute 'name'
2025-04-21T12:46:38.5305512Z FAILED tests/test_examples.py::test_examples[minizinc-./examples/advanced/ocus_explanations.py] - TypeError: unhashable type: 'set'
2025-04-21T12:46:38.5305834Z FAILED tests/test_examples.py::test_examples[minizinc-./examples/csplib/prob033_word_design.py] - Failed: Timeout >60.0s
2025-04-21T12:46:38.5306141Z FAILED tests/test_examples.py::test_examples[minizinc-./examples/csplib/prob044_steiner.py] - Failed: Timeout >60.0s
2025-04-21T12:46:38.5306425Z ==== 19 failed, 46311 passed, 87 skipped, 56 warnings in 1642.30s (0:27:22) ====

Some examples enumerate all solutions by default; I will lower that, which might fix some of the timeouts. Note that the CI now takes 27 minutes rather than the 22 minutes on main.

hbierlee · 2025-04-21T14:13:08Z

=========================== short test summary info ============================
FAILED tests/test_examples.py::test_examples[ortools-./examples/csplib/prob007_all_interval.py] - ValueError: Problem is unsatisfiable
FAILED tests/test_examples.py::test_examples[ortools-./examples/csplib/prob011_basketball_schedule.py] - cpmpy.exceptions.TypeError: and-operator only accepts boolean arguments, not [config[2,0] == 8 config[2,1] == 8 config[2,2] == 8 config[2,3] == 8
 config[2,4] == 8 config[2,5] == 8 config[2,6] == 8 config[2,7] == 8
 config[2,8] == 8]
FAILED tests/test_examples.py::test_examples[ortools-./examples/advanced/counterfactual_explain.py] - TypeError: LinearExpr::weighted_sum() only accept constants as coefficients: 'bool_'
FAILED tests/test_examples.py::test_examples[ortools-./examples/advanced/ocus_explanations.py] - TypeError: unhashable type: 'set'
FAILED tests/test_examples.py::test_examples[ortools-./examples/csplib/prob001_convert_data.py] - FileNotFoundError: [Errno 2] No such file or directory: 'data.txt'
FAILED tests/test_examples.py::test_examples[minizinc-./examples/csplib/prob001_convert_data.py] - FileNotFoundError: [Errno 2] No such file or directory: 'data.txt'
FAILED tests/test_examples.py::test_examples[minizinc-./examples/csplib/prob007_all_interval.py] - ValueError: Problem is unsatisfiable
FAILED tests/test_examples.py::test_examples[minizinc-./examples/sudoku_chockablock.py] - Failed: Timeout >60.0s
FAILED tests/test_examples.py::test_examples[minizinc-./examples/csplib/prob026_sport_scheduling.py] - Failed: Timeout >60.0s
FAILED tests/test_examples.py::test_examples[minizinc-./examples/csplib/prob011_basketball_schedule.py] - cpmpy.exceptions.TypeError: and-operator only accepts boolean arguments, not [config[2,0] == 8 config[2,1] == 8 config[2,2] == 8 config[2,3] == 8
 config[2,4] == 8 config[2,5] == 8 config[2,6] == 8 config[2,7] == 8
 config[2,8] == 8]
FAILED tests/test_examples.py::test_examples[minizinc-./examples/blocks_world.py] - Failed: Timeout >60.0s
FAILED tests/test_examples.py::test_examples[minizinc-./examples/flexible_jobshop.py] - Failed: Timeout >60.0s
FAILED tests/test_examples.py::test_examples[minizinc-./examples/advanced/counterfactual_explain.py] - AttributeError: 'NoneType' object has no attribute 'name'
FAILED tests/test_examples.py::test_examples[minizinc-./examples/advanced/ocus_explanations.py] - TypeError: unhashable type: 'set'
FAILED tests/test_examples.py::test_examples[minizinc-./examples/csplib/prob033_word_design.py] - Failed: Timeout >60.0s
==== 15 failed, 46315 passed, 87 skipped, 56 warnings in 1518.15s (0:25:18) ====

Most of these are probably just problems in the examples themselves, which I can probably fix in this PR. For the >60.0s ones, we can check whether it is intentional that they take so long and, if so, add them to the example ignore list in the test. If I find anything substantial, I'll make an issue.
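
For triaging a summary like the one above, a small helper can separate timeouts from genuine errors. This is only a sketch for illustration: the function name `split_failures` is hypothetical, and the three input lines are copied from the summary above.

```python
summary = """\
FAILED tests/test_examples.py::test_examples[ortools-./examples/csplib/prob007_all_interval.py] - ValueError: Problem is unsatisfiable
FAILED tests/test_examples.py::test_examples[minizinc-./examples/blocks_world.py] - Failed: Timeout >60.0s
FAILED tests/test_examples.py::test_examples[minizinc-./examples/flexible_jobshop.py] - Failed: Timeout >60.0s
"""

def split_failures(text):
    """Return (timeouts, errors) as two lists of test ids from FAILED lines."""
    timeouts, errors = [], []
    for line in text.splitlines():
        if not line.startswith("FAILED "):
            continue  # ignore continuation lines and the totals line
        test_id, _, reason = line[len("FAILED "):].partition(" - ")
        if reason.startswith("Failed: Timeout"):
            timeouts.append(test_id)
        else:
            errors.append(test_id)
    return timeouts, errors

timeouts, errors = split_failures(summary)
print(len(timeouts), "timeouts,", len(errors), "other failures")
```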

hbierlee · 2025-04-25T15:42:08Z

Nice job! At a glance, the examples also seem much improved, including in style and in using simpler constraints, but I don't know them that well. I'll add a few small comments.

Quick comment for tias on what is run with this change: it is similar to the old one, which also ran three solvers (or-tools, minizinc, gurobi) if they are installed and skipped them otherwise. Note that gurobi is not installed on the CI; we might want to look into adding it and seeing whether it is still so slow now that the examples are improved (and solutions are limited in many cases). I also see that the CI only takes 20 minutes now, even less than before adding more examples (which might be slightly strange), but anyway I suggest keeping both solvers on.

Here is one more to clean up if you don't have exact installed:

FAILED tests/test_examples.py::test_advanced_example[./examples/advanced/exact_maximal_propagate.py] - Exception: CPM_exact: Install the python package 'exact' to use this solver interface.
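
One way to guard against this kind of failure is to check whether the package is importable before running the example. The sketch below uses only the standard library (`importlib.util.find_spec`); the helper names `is_installed` and `should_skip` are hypothetical, and this is not necessarily how the PR implements the skip. In a pytest test, the returned reason could be passed to `pytest.skip`, or `pytest.importorskip` could be used instead.

```python
import importlib.util

def is_installed(module_name):
    """Return True if a top-level module can be found, without importing it."""
    return importlib.util.find_spec(module_name) is not None

def should_skip(required_module):
    """Return a skip reason if the module is missing, otherwise None."""
    if is_installed(required_module):
        return None
    return f"'{required_module}' is not installed"

print(should_skip("json"))                      # stdlib module, always present
print(should_skip("surely_not_installed_pkg"))  # hypothetical missing package
```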

hbierlee

not an in-depth review, just two small things I saw

tests/test_examples.py

tias

some questions, for clarity and esp to make sure these are not coincidental changes

tias · 2025-04-28T20:09:56Z

examples/advanced/counterfactual_explain.py

+            x_d_int = x_d.astype(int)
+            x_0_val_int = x_0.value().astype(int)
+            # Add constraint using integer coefficients
+            master_model += [sum(d * x_d_int) >= sum(d * x_0_val_int)]


I'm surprised these casts are needed?

Without the casts, this error is thrown in the master_model solving (OR-tools):
TypeError: Not a number: False of type <class 'numpy.bool_'>

tias · 2025-04-28T20:10:23Z

examples/advanced/ocus_explanations.py

@@ -209,7 +208,7 @@ def explain_one_step_ocus(hard, soft_lit, cost, remaining_sol_to_explain, solver
            print("\n\t hs =", hs, S)

        # SAT check and computation of model
-        if not SAT.solve(assumptions=S):
+        if not SAT.solve(assumptions=list(S)):


why is this needed, to wrap S?

In the or-tools interface, the solver_var method contains this line: if cpm_var not in self._varmap:, which means that if cpm_var is not hashable then there is a TypeError raised. So, this wrapping is needed to convert the set to a list before solving. This is the error raised if not wrapped:

File ".../cpmpy/cpmpy/solvers/ortools.py", line 293, in solver_var
    if cpm_var not in self._varmap:
TypeError: unhashable type: 'set'
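
The same error can be reproduced with plain Python, independent of any solver. In this sketch the dict `varmap` and its string keys are hypothetical stand-ins for the solver's variable map, and it assumes the whole set reaches the membership test as a single key.

```python
# A dict standing in for the solver's variable map (hypothetical contents).
varmap = {"x": 1, "y": 2}

def lookup(key):
    """Mimic a membership test against a dict, as in the traceback."""
    return key in varmap

assumptions = {"x", "y"}

try:
    lookup(assumptions)      # the whole set is used as one dict key
    error = None
except TypeError as exc:
    error = str(exc)         # dict keys must be hashable; a set is not

# Looking up the elements one by one works, since each element is hashable.
found = [lookup(a) for a in list(assumptions)]

print(error)
print(found)
```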

tias · 2025-04-28T20:11:09Z

examples/advanced/ocus_explanations.py

-        # return soft weight if constraint is a soft constraint
-        if len(set({cons}) & set(soft)) > 0:
+        # return soft weight if the constraint is a soft constraint
+        if len({cons} & set(soft)) > 0:


this is odd code... if cons in set(soft)?

indeed, corrected.
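
For reference, a quick check that the two forms agree. Strings stand in for constraint objects here, and the names are hypothetical.

```python
soft = ["c1", "c2", "c3"]  # hypothetical soft constraints

def is_soft_intersection(cons):
    """The form from the diff: intersect a singleton set with the soft set."""
    return len({cons} & set(soft)) > 0

def is_soft_membership(cons):
    """The suggested form: a direct membership test."""
    return cons in set(soft)

candidates = ["c1", "c3", "hard1"]
results = [(is_soft_intersection(c), is_soft_membership(c)) for c in candidates]
print(results)
```

Both return the same answer for every candidate; the membership test simply states the intent directly.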

tias · 2025-04-28T20:13:53Z

examples/csplib/prob011_basketball_schedule.py

-                           (config[d+1, t] == DUKE) & (where[d+1,t] == AWAY))
-                model += ~((config[d, t] == DUKE) & (where[d,t] == AWAY) &
-                           (config[d+1, t] == UNC) & (where[d+1,t] == AWAY))
+                model += ((config[d, t] == UNC) & (where[d, t] == AWAY) &


your code change removes the ~, the not...
and it replaces it by 'implies(False)', which I find MUCH more unintuitive than just negating the statement!? why would this be better?

This change fixes an error with the previous version of the constraints. But it is indeed unintuitive, so I changed it back to the earlier version, while keeping the fix of the error.
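
The two formulations are logically equivalent, which is why the more readable negation can be kept. A small truth-table check in plain Python, with two booleans standing in for the longer conjunction in the example (the function names are hypothetical):

```python
from itertools import product

def implies(a, b):
    """Material implication: a -> b."""
    return (not a) or b

def negated(first, second):
    """The original style: negate the conjunction."""
    return not (first and second)

def via_implies_false(first, second):
    """The alternative style: the conjunction implies False."""
    return implies(first and second, False)

# Compare both forms on every combination of truth values.
agree = all(
    negated(a, b) == via_implies_false(a, b)
    for a, b in product([False, True], repeat=2)
)
print(agree)
```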

tias · 2025-04-28T20:15:27Z

examples/csplib/prob013_progressive_party.py

    for boat in range(n_boats):
-        model += (is_host[boat]).implies(all(visits[:,boat] == boat))
+        model += (is_host[boat]).implies((visits[:, boat] == boat).all())


is there a reason to prefer .all() over cp.all(...)? I prefer a cp.all() upfront...

Agreed and reverted.

tias · 2025-04-28T20:16:48Z

examples/csplib/prob013_progressive_party.py

    for slot in range(n_periods):
        for boat in range(n_boats):
-            model += sum((visits[slot] == boat) * crew_size) <= capacity[boat]
+            model += sum((visits[slot] == boat) * crew_size) + crew_size[boat] * is_host[boat] <= capacity[boat]


was the previous code wrong? the comment says 'number of visitors', so without the crew...? and why does being the host matter for the crew size?

Yes, the constraint was missing the host boat's crew, which is exactly what is added here. The comment was misleading (I corrected it now) as the original problem description specifies that "The total number of people aboard a boat, including the host crew and guest crews, must not exceed the capacity".

tias · 2025-04-28T20:18:25Z

tests/test_examples.py

+from cpmpy.exceptions import NotSupportedError, TransformationNotImplementedError
+import itertools
+
+prefix = '.' if 'y' in getcwd()[-2:] else '..'


what kind of magic is this?

I am not sure about this one (it was already there). As far as I can tell, it sets the relative path to the examples depending on whether the last two characters of the current working directory contain a 'y'.
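
To make the behaviour concrete, the sketch below evaluates the same expression for a few working directories. The directory names are assumptions (a checkout named cpmpy with a tests subdirectory). It also shows one possible alternative that derives the examples directory from the test file's location instead of from the working directory; the helper `examples_dir` is hypothetical and not part of this PR.

```python
from pathlib import Path

def example_prefix(cwd):
    """Reproduce the expression from the diff for a given working directory."""
    return '.' if 'y' in cwd[-2:] else '..'

def examples_dir(test_file):
    """Alternative: locate examples/ relative to the test file itself."""
    return Path(test_file).parent.parent / "examples"

for cwd in ["/home/user/cpmpy", "/home/user/cpmpy/tests", "/home/user/my"]:
    # The last entry shows an unrelated directory that also ends in 'y'.
    print(cwd, "->", example_prefix(cwd))

print(examples_dir("/repo/tests/test_examples.py"))
```

The original expression works when the tests are started from a directory ending in "py" (such as the repository root) or from the tests directory, but it gives a misleading answer for any other directory whose name happens to end in 'y'.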

hbierlee added 6 commits April 21, 2025 11:01

Parametrize test_examples for solvers

772b59e

Update test_examples

a7491cf

Remove gurobi from examples

f5bfff4

Seems to take very long (but not on master?)

Improve skip behaviour in test examples

6180dc9

Run examples as __main__

f48180b

Clean up cherry-picked changes

32a469d

hbierlee self-assigned this Apr 21, 2025

hbierlee added 2 commits April 21, 2025 12:16

Add pytest-timeout

9404043

Unlike pytest-xdist, the user *needs* timeouts to run the test-suite completely

Set some reasonable solution limit CLI defaults

13a2613

hbierlee assigned kostis-init and unassigned hbierlee Apr 23, 2025

kostis-init added 6 commits April 25, 2025 10:41

Fix printing for found solutions in all_interval problem

28a7dd2

Merge remote-tracking branch 'origin/master' into feature/improve-exa…

26d1d45

…mple-testing

Fix examples for tests

e4c4599

Fix advanced examples for tests

70b00b9

Separate tests for examples & advanced_examples

b5b15ac

Change parameter value to facilitate minizinc test time

0f39b8a

kostis-init marked this pull request as ready for review April 25, 2025 15:25

kostis-init requested a review from tias April 25, 2025 15:29

hbierlee commented Apr 25, 2025

kostis-init added 2 commits April 25, 2025 17:55

skip if Exact is not installed

cd9fd80

removed TO_SKIP list

3fa04a2

tias reviewed Apr 28, 2025

kostis-init added 2 commits April 29, 2025 11:44

Refactor: Use 'in' for set membership check

1c1411e

Refactor: Simplify basektball_scheduling constraints

da8f291

hbierlee mentioned this pull request Apr 29, 2025

Add pindakaas solver #600

Draft

Refactor: correct constraint comments on progressive party model

07becb4
