Replies: 1 comment
I like this idea! I believe it would be better if we could be more precise.
Maybe we can come up with an "ontology" of how benchmarks relate. By "ontology" I mean something like the SZS ontology for solver return values. We don't need to go into that much detail, but we could define some common relations.
Some problems can be encoded in various ways, resulting in, e.g., many possible logics. For example, an LRA problem combined with traversing a graph can result in combinations such as LRA, LIRA, BVLRA, UFLRA, etc. Such encodings can result in quite different performance across solvers. Thanks to @hansjoergschurr, we have a database that may allow us to easily filter related benchmarks and compare their performance.
Generally speaking, different encodings are also possible within the same logic, while still potentially affecting performance. Not only for filtering, but also for the sake of keeping track of where the benchmarks come from, it would be reasonable to include explicit metadata in such similar benchmarks.
For example, say that an existing benchmark is encoded in LRA, and one wants to submit an alternative encoding of the same benchmark in LIRA. That would require adding a field such as a `derived-from` reference to the original benchmark.
This would require a unique identifier for the benchmark. A straightforward possibility is to use the relative path of the benchmark used in the repository, including the family name etc.
For example, given a benchmark `non-incremental/QF_LRA/AddamsFamily/bench1.smt2`, its derived benchmark `non-incremental/QF_LIRA/AddamsFamily/bench1.smt2` would include such a field, assuming that the derived benchmarks always stay in the same track.
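In SMT-LIB syntax, one way to express this would be a `set-info` attribute in the derived benchmark; the keyword name below is only a sketch, not an agreed-upon convention:

```smt2
; Hypothetical metadata field; the attribute name is a sketch, not a standard.
(set-info :derived-from "non-incremental/QF_LRA/AddamsFamily/bench1.smt2")
```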
The above is a bit verbose (only the logic differs); I am not sure about suitable abbreviations.
A "derived-from" relationship probably does not cover all possibilities of "similar" benchmarks, but I do not have a better suggestion at the moment.
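Given such a field, the database mentioned above could group related benchmarks and compare their performance across encodings. A minimal sketch in Python, where the record layout and timing values are purely illustrative:

```python
from collections import defaultdict

# Hypothetical records: (path, derived_from, solve_time_seconds).
# The schema and numbers are illustrative, not the actual database contents.
benchmarks = [
    ("non-incremental/QF_LRA/AddamsFamily/bench1.smt2", None, 1.2),
    ("non-incremental/QF_LIRA/AddamsFamily/bench1.smt2",
     "non-incremental/QF_LRA/AddamsFamily/bench1.smt2", 3.4),
]

def group_by_origin(records):
    """Group benchmarks by their origin: either the benchmark they
    derive from, or themselves if they are not derived."""
    groups = defaultdict(list)
    for path, derived_from, solve_time in records:
        groups[derived_from or path].append((path, solve_time))
    return dict(groups)

groups = group_by_origin(benchmarks)
for origin, encodings in groups.items():
    print(origin, "->", len(encodings), "encoding(s)")
```

Note that this only follows one level of derivation; chains of derived benchmarks would need a transitive closure, which is another argument for keeping the relation simple.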
What do you think?
@hansjoergschurr @mpreiner @aehyvari @bobot