The reproducer is already available in issue #2572 .
I profiled the transformed AST and RAM. This appears to be caused by the
current left-to-right SIPS used by Soufflé's magic-set transformation. It
generates a demand relation containing the Cartesian product
candidate(a) × candidate(b).
For the rule:
prod(x) :- candidate(a), candidate(b), mul(a,b,x).
the transformed program contains:
prime(x) :-
candidate.{f}(x),
!@neglabel.prod(x).
@neglabel.prod(x) :-
@poscopy_1.candidate.{f}(a),
@poscopy_1.candidate.{f}(b),
@poscopy_1.mul.{bbf}(a,b,x).
@magic.@poscopy_1.mul.{bbf}(a,b) :-
@poscopy_1.candidate.{f}(a),
@poscopy_1.candidate.{f}(b).
After the first two body literals have been evaluated, both a and b are
considered bound. Soufflé therefore generates a mul.{bbf} demand relation.
However, this relation materializes the Cartesian product:
candidate(a) × candidate(b)
This is expensive even though most of those demands do not contribute useful
results.
With the range reduced from 500 to 200, I observed:
Without MST:
runtime: 0.57 s
mul: 1,298 tuples
recursive SCC iterations: 201
With MST:
runtime: 14.79 s
@magic.@poscopy_1.mul.bbf: 39,999 tuples
@magic.@poscopy_2.add.bbf: 33,043 tuples
recursive SCC iterations: 520
The later RAM optimizer may reorder joins inside the generated rules, but it
cannot remove a magic relation that has already been introduced by the AST
transformation.
This does not appear to be a semantic correctness bug: the transformed program
still produces the expected result. It appears to be an optimization-planning
issue. The current MST implementation uses source body order when computing
adornments and does not estimate the cost of generated demand relations or
compare the transformed plan against a no-MST alternative.
I am interested in working on a refactor of the MST implementation. On top of that refactor, I
would like to add a cost-aware SIPS heuristic that considers the estimated size
of generated demand relations and the cost of downstream joins, instead of
relying solely on source body order.
The initial goal would be to preserve the current transformation as a legacy
strategy, then add the cost-aware strategy as an alternative. A later extension
could compare the estimated cost against the original no-MST plan and skip MST
when demand propagation is expected to be more expensive.
Would the maintainers be open to this direction?
The reproducer is already available in issue #2572 .
I profiled the transformed AST and RAM. This appears to be caused by the
current left-to-right SIPS used by Soufflé's magic-set transformation. It
generates a demand relation containing the Cartesian product
candidate(a) × candidate(b).For the rule:
the transformed program contains:
After the first two body literals have been evaluated, both
aandbareconsidered bound. Soufflé therefore generates a
mul.{bbf}demand relation.However, this relation materializes the Cartesian product:
This is expensive even though most of those demands do not contribute useful
results.
With the range reduced from
500to200, I observed:The later RAM optimizer may reorder joins inside the generated rules, but it
cannot remove a magic relation that has already been introduced by the AST
transformation.
This does not appear to be a semantic correctness bug: the transformed program
still produces the expected result. It appears to be an optimization-planning
issue. The current MST implementation uses source body order when computing
adornments and does not estimate the cost of generated demand relations or
compare the transformed plan against a no-MST alternative.
I am interested in working on a refactor of the MST implementation. On top of that refactor, I
would like to add a cost-aware SIPS heuristic that considers the estimated size
of generated demand relations and the cost of downstream joins, instead of
relying solely on source body order.
The initial goal would be to preserve the current transformation as a legacy
strategy, then add the cost-aware strategy as an alternative. A later extension
could compare the estimated cost against the original no-MST plan and skip MST
when demand propagation is expected to be more expensive.
Would the maintainers be open to this direction?