Commit 02cbce7

merge: Merge branch UI_new_schema
2 parents 8d89fe4 + 49a1310 commit 02cbce7

12 files changed

Lines changed: 809 additions & 205 deletions

SCHEMA.md

Lines changed: 312 additions & 0 deletions
@@ -0,0 +1,312 @@
# OPL Schema

The OPL schema catalogs optimization **problems**, **suites**, **generators**, and their **implementations** in a single, machine-readable format.

Three design choices shape everything below:

1. **One flat library, keyed by ID.**
   Every entity lives in a `Library` dict.
   Suites reference problems, and problems reference implementations, by their respective IDs.
   Problems and implementations are never embedded within suites, so they can be reused.
   For example, an implementation might be referenced by multiple problems or suites.
2. **Numeric fields accept a scalar, a set, or a range.**
   A variable group may have exactly `2` dimensions, one of `{2, 3, 5}`, or any value in `{min: 2, max: 50}`.
   The same union type is used for constraint counts; `objectives` is the exception and is always a set of integers.
3. **Three-valued logic for yes/no fields.**
   Many boolean fields (e.g. `hard`, `allows_partial_evaluation`, ...) take a [`YesNoSome`](#yesnosome) as their value.
   We lose some expressive power but simplify data entry.
   If we forced authors to decide on yes or no, we would need more complex structures for variables, constraints, etc., which would make the usual case unnecessarily complex.

## Contents

- [Library](#library)
- [Thing types](#thing-types)
  - [Implementation](#implementation)
  - [ProblemLike](#problemlike) (shared fields)
  - [Problem](#problem)
  - [Suite](#suite)
  - [Generator](#generator)
- [Shared building blocks](#shared-building-blocks)
  - [Variable](#variable) / [VariableType](#variabletype)
  - [Constraint](#constraint) / [ConstraintType](#constrainttype)
  - [Reference](#reference) / [Link](#link)
  - [ValueRange](#valuerange)
  - [YesNoSome](#yesnosome)

---

## Notation / Conventions

- When an attribute is followed by a `?`, it is optional and can be left out.
- When we refer to a list of unique items, we call it a set.
  Technically these are sets in Python, but in the YAML representation they are lists.
  However, their items _must_ be unique (i.e. obey the set property).

## Library

A `Library` is a dict from ID to a [Thing](#thing-types).
IDs are free-form but must be unique. By convention, each ID carries a prefix marking its type to avoid collisions:

| Prefix   | Type           |
|----------|----------------|
| `impl_`  | Implementation |
| `fn_`    | Problem        |
| `suite_` | Suite          |
| `gen_`   | Generator      |

On load, the library validates that every ID referenced by a suite (`problems`) or problem (`implementations`) exists and has the correct type. Suites also have their `fidelity_levels` auto-populated from their problems.

```yaml
impl_coco:
  type: implementation
  name: COCO
  description: Comparing Continuous Optimisers
fn_sphere:
  type: problem
  name: Sphere
  objectives: [1]
  implementations: [impl_coco]
suite_bbob:
  type: suite
  name: BBOB
  problems: [fn_sphere]
```

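The load-time reference check can be sketched in a few lines of Python. This is a minimal sketch and not the actual OPL loader: it assumes the library has already been parsed into a plain dict of dicts, and `EXPECTED_TYPE` and `find_broken_references` are hypothetical names introduced only for this illustration. The `fn_missing` ID is deliberately absent so that the check reports one error.

```python
# Hypothetical mapping: the `type` that IDs in each reference field must resolve to.
EXPECTED_TYPE = {"problems": "problem", "implementations": "implementation"}


def find_broken_references(library: dict) -> list[str]:
    """Collect a message for every referenced ID that is missing or mistyped."""
    errors = []
    for thing_id, thing in library.items():
        for field, expected in EXPECTED_TYPE.items():
            # A missing or null field simply contributes no references.
            for ref in thing.get(field) or []:
                target = library.get(ref)
                if target is None:
                    errors.append(f"{thing_id}.{field}: unknown ID {ref!r}")
                elif target.get("type") != expected:
                    errors.append(
                        f"{thing_id}.{field}: {ref!r} has type "
                        f"{target.get('type')!r}, expected {expected!r}"
                    )
    return errors


library = {
    "impl_coco": {"type": "implementation", "name": "COCO",
                  "description": "Comparing Continuous Optimisers"},
    "fn_sphere": {"type": "problem", "name": "Sphere", "objectives": [1],
                  "implementations": ["impl_coco"]},
    # `fn_missing` is not defined anywhere, so it should be reported.
    "suite_bbob": {"type": "suite", "name": "BBOB",
                   "problems": ["fn_sphere", "fn_missing"]},
}

for message in find_broken_references(library):
    print(message)
```
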
---

## Thing types

All entities inherit from `Thing`, which only carries a discriminator:

```yaml
type: problem  # or: suite | generator | implementation
```

We want the structure to be as flat as possible so that exploring and searching OPL is easy.
That's one of the reasons the top-level object is a dictionary of dissimilar things.
Since we still need to tell them apart, each entry has a `type` field to discriminate between them.

### Implementation

A pointer to code that implements one or more problems.
It is intentionally minimal so that the schema describes *what* a problem is, not how to run it.
Curated usage examples for problems and suites live in separate files, keyed by their respective IDs.

| Field             | Type                      | Notes                                        |
|-------------------|---------------------------|----------------------------------------------|
| `name`            | str                       | required                                     |
| `description`     | str                       | required                                     |
| `language`        | str? (e.g. `Python`, `C`) |                                              |
| `links`           | list of [Link](#link)?    | repo, release, docs…                         |
| `evaluation_time` | set of str?               | free-form list ("8 minutes", "fast")         |
| `requirements`    | str or list of str?       | URL to requirements file or list of packages |

```yaml
impl_coco:
  type: implementation
  name: COCO
  description: Comparing Continuous Optimisers benchmarking platform
  language: c
  links:
    - {type: repository, url: https://github.com/numbbo/coco-experiment}
impl_py_cocoex:
  type: implementation
  name: Python bindings for COCO
  description: The Python bindings for the experimental part of the COCO framework
  language: Python
  links:
    - {type: source, url: https://github.com/numbbo/coco-experiment/tree/main/build/python}
    - {type: package, url: https://pypi.org/project/coco-experiment/}
```

### ProblemLike

Fields shared by [Problem](#problem), [Suite](#suite), and [Generator](#generator).
The schema deliberately puts most descriptive fields here so that a suite can be characterised without having to add all of its problems explicitly.

| Field                                   | Type                              | Notes                                              |
|-----------------------------------------|-----------------------------------|----------------------------------------------------|
| `name`                                  | str                               | required                                           |
| `long_name`                             | str?                              |                                                    |
| `description`                           | str? (markdown)                   | longer prose                                       |
| `tags`                                  | set of str?                       | free-form keywords                                 |
| `references`                            | set of [Reference](#reference)?   |                                                    |
| `implementations`                       | set of IDs?                       | must resolve to [Implementation](#implementation)s |
| `objectives`                            | set of int?                       | e.g. `{1}`, `{2, 3}`; **not** a ValueRange         |
| `variables`                             | set of [Variable](#variable)?     |                                                    |
| `constraints`                           | set of [Constraint](#constraint)? | empty set for unconstrained; omit if unknown       |
| `dynamic_type`                          | set of str?                       | `{"no"}`, `{"time-varying"}`…                      |
| `noise_type`                            | set of str?                       | `{"none"}`, `{"gaussian"}`…                        |
| `allows_partial_evaluation`             | [YesNoSome](#yesnosome)?          |                                                    |
| `can_evaluate_objectives_independently` | [YesNoSome](#yesnosome)?          |                                                    |
| `modality`                              | set of str?                       | `{"unimodal"}`, `{"multimodal"}`                   |
| `fidelity_levels`                       | set of int?                       | `{1}` = single-fidelity, `{1,2}` = multi-fidelity  |
| `code_examples`                         | set of str?                       | paths to example scripts                           |
| `evaluation_time`                       | set of str?                       | free-form list ("8 minutes", "fast")               |
| `source`                                | set of str?                       | `{"artificial"}`, `{"real-world"}`                 |

> `objectives` is a set of integers because we don't expect this property to scale to extreme values, so explicit enumeration is fine.
> Variable dimensions, on the other hand, can be ranges because problems are often scalable over wide ranges of dimensions.

When no `evaluation_time` is set, it percolates up from any referenced implementations.
The same is true for the `variables` and `constraints` properties of a suite that references problems: they percolate up from those problems.

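One way to picture the percolation of `evaluation_time` is the following sketch. It is an illustration under assumptions, not the schema's actual code: the library is taken to be a plain dict of dicts, the function name `effective_evaluation_time` is hypothetical, and combining the values of several implementations by set union is an assumption, since the text does not say how multiple implementations are merged.

```python
def effective_evaluation_time(library: dict, thing_id: str) -> set[str]:
    """Return the entry's own evaluation_time, else the union over its implementations."""
    thing = library[thing_id]
    own = thing.get("evaluation_time")
    if own:
        # An explicitly set value always wins over inherited ones.
        return set(own)
    inherited: set[str] = set()
    for impl_id in thing.get("implementations") or []:
        # Assumption: values from several implementations are merged by union.
        inherited |= set(library[impl_id].get("evaluation_time") or [])
    return inherited


library = {
    "impl_coco": {"type": "implementation", "evaluation_time": ["fast"]},
    "fn_sphere": {"type": "problem", "implementations": ["impl_coco"]},
}
print(effective_evaluation_time(library, "fn_sphere"))
```

Here `fn_sphere` sets no `evaluation_time` of its own, so the value of `impl_coco` is used.
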
### Problem

One optimization problem (possibly parameterised by instances).

Adds:

| Field       | Type                                      | Notes                                      |
|-------------|-------------------------------------------|--------------------------------------------|
| `instances` | [ValueRange](#valuerange) or list of str? | e.g. `{min: 1, max: 15}` or named variants |

```yaml
fn_sphere:
  type: problem
  name: Sphere
  objectives: [1]
  variables: [{type: continuous, dim: {min: 2, max: 40}}]
  modality: [unimodal]
  source: [artificial]
  instances: {min: 1, max: 15}
  implementations: [impl_coco]
```

### Suite

A curated, fixed collection of problems.

Adds:

| Field      | Type        | Notes                                |
|------------|-------------|--------------------------------------|
| `problems` | set of IDs? | must resolve to [Problem](#problem)s |

`fidelity_levels` is auto-unioned from member problems at validation time.

```yaml
suite_bbob:
  type: suite
  name: BBOB
  problems: [fn_sphere, fn_rosenbrock, fn_rastrigin]
  objectives: [1]
  source: [artificial]
  implementations: [impl_coco]
```

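The auto-union of `fidelity_levels` can be sketched as below. This is a hedged illustration rather than the real validator: the library is assumed to be a plain dict of dicts, and the function name `union_fidelity_levels` as well as the IDs `fn_a`, `fn_b`, and `suite_ab` are hypothetical.

```python
def union_fidelity_levels(library: dict, suite_id: str) -> set[int]:
    """Union the fidelity_levels of every problem the suite references."""
    levels: set[int] = set()
    for problem_id in library[suite_id].get("problems") or []:
        # Problems without fidelity_levels contribute nothing.
        levels |= set(library[problem_id].get("fidelity_levels") or [])
    return levels


library = {
    "fn_a": {"type": "problem", "fidelity_levels": [1]},
    "fn_b": {"type": "problem", "fidelity_levels": [1, 2]},
    "suite_ab": {"type": "suite", "problems": ["fn_a", "fn_b"]},
}
print(sorted(union_fidelity_levels(library, "suite_ab")))  # [1, 2]
```
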
### Generator

A parametric family of problems. Unlike a [Suite](#suite), the member problems are not enumerated. It uses the same fields as [ProblemLike](#problemlike) with no additions; the distinction from [Problem](#problem) is that a generator produces instances on demand.

```yaml
gen_mpm2:
  type: generator
  name: MPM2
  description: Multiple peaks model, second instantiation
  objectives: [1]
  variables: [{type: continuous, dim: {min: 1}}]
  modality: [multimodal]
```

---

## Shared building blocks

### Variable

A group of decision variables of the same type.
Multi-type problems list multiple entries.
You can have multiple entries of the same type, but this should be justified, for example when the problem can be evaluated on only a subset of the variables.

| Field  | Type                                                | Default   |
|--------|-----------------------------------------------------|-----------|
| `type` | [VariableType](#variabletype)                       | `unknown` |
| `dim`  | int, set of int, [ValueRange](#valuerange), or null | `0`       |

```yaml
variables:
  - {type: continuous, dim: 10}
  - {type: integer, dim: {min: 1, max: 5}}
```

### VariableType

`continuous | integer | binary | categorical | unknown`.
Use `unknown` for permutation/combinatorial problems that the schema doesn't yet distinguish **and** add an appropriate tag.
We actively watch for unknown variable types and are open to extending the list above if a critical mass of problems justifies it.

### Constraint

A group of constraints.
To indicate that a problem is unconstrained, provide an _empty_ `constraints` field.
A missing `constraints` field, or one set to `null`, means it is not known whether the problem is constrained.

| Field      | Type                                                | Notes                   |
|------------|-----------------------------------------------------|-------------------------|
| `type`     | [ConstraintType](#constrainttype)                   | default `unknown`       |
| `hard`     | [YesNoSome](#yesnosome)?                            | hard vs. soft           |
| `equality` | [YesNoSome](#yesnosome)?                            | equality vs. inequality |
| `number`   | int, set of int, [ValueRange](#valuerange), or null |                         |

```yaml
constraints:
  - {type: box, hard: yes, number: 10}
  - {type: linear, hard: some, equality: no, number: {min: 1}}
```

### ConstraintType

`box | linear | function | unknown`. `function` covers non-linear/black-box constraints.

### Reference

A bibliographic pointer.
Requires either a `title` or a `link`, and optionally a list of `authors`.

```yaml
references:
  - title: "Honey Badger Algorithm: New metaheuristic algorithm for solving optimization problems."
    authors:
      - Fatma A. Hashim
      - Essam H. Houssein
      - Kashif Hussain
      - Mai S. Mabrouk
      - Walid Al-Atabany
    link: {type: doi, url: "https://doi.org/10.1016/j.matcom.2021.08.013"}
```

### Link

`{type?: str, url: str}`.
`type` is free-form (`repository`, `arxiv`, `paper`, `doi`, ...).
`url` is a URL to some resource.

If `type` is `doi`, please use the full URL (starting with `https://doi.org/...`) instead of the raw DOI.

### ValueRange

An inclusive numeric range type.
At least one of `min`/`max` must be given.
If `min` is given and `max` is missing, this does not imply that there is no upper bound.
There might be one; it is just not known.
The same applies when `max` is given and `min` is missing.

```yaml
dim: {min: 2}           # 2 or more
dim: {min: 2, max: 40}  # between 2 and 40 (inclusive)
dim: {max: 100}         # up to 100
```

Used by `Variable.dim`, `Constraint.number`, `Problem.instances`.

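To make the scalar / set / range union concrete, here is a small sketch of how a consumer might test a value against such a field. It assumes the YAML has been loaded into plain Python objects (int, list or set, dict, or `None`); the function name `may_match` is hypothetical. Because a missing bound means "not known" rather than "unbounded", the function only returns `False` when a known bound or an explicit enumeration rules the value out.

```python
from typing import Optional


def may_match(spec, value: int) -> Optional[bool]:
    """Check `value` against a scalar, a set, or a ValueRange-style dict.

    Returns None when the spec itself is null (nothing is known).
    """
    if spec is None:
        return None
    if isinstance(spec, int):
        return value == spec
    if isinstance(spec, dict):
        lo, hi = spec.get("min"), spec.get("max")
        if lo is None and hi is None:
            raise ValueError("a range needs at least one of min/max")
        # Bounds are inclusive; a missing bound cannot rule anything out.
        if lo is not None and value < lo:
            return False
        if hi is not None and value > hi:
            return False
        return True
    # Anything else is treated as an explicit enumeration (YAML list).
    return value in set(spec)


print(may_match(10, 10))                     # True
print(may_match([2, 3, 5], 4))               # False
print(may_match({"min": 2, "max": 40}, 40))  # True
print(may_match({"max": 100}, 101))          # False
```
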
### YesNoSome

Three-valued flag with an explicit unknown: `yes | no | some | ?` (the last serialises as the literal `'?'` string, meaning unknown).
`some` captures the common case where only *part* of something has a given property.
For example, only some constraints might be hard, but we know only the total number of constraints, not how many are hard and how many are soft.

```yaml
constraints: [{type: box, hard: some}]
allows_partial_evaluation: "?"
```
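A practical detail when reading these flags: YAML 1.1 loaders such as PyYAML resolve unquoted `yes` and `no` to booleans, so a parser may see `True`/`False` instead of strings. The sketch below shows one way to normalise both forms. It is an assumption about how a consumer could handle this, not the schema's actual implementation; the enum member names and `parse_yes_no_some` are hypothetical.

```python
from enum import Enum


class YesNoSome(Enum):
    YES = "yes"
    NO = "no"
    SOME = "some"
    UNKNOWN = "?"


def parse_yes_no_some(raw) -> YesNoSome:
    """Normalise a loaded YAML value into a YesNoSome member."""
    if isinstance(raw, bool):
        # Unquoted yes/no may already have been turned into a boolean.
        return YesNoSome.YES if raw else YesNoSome.NO
    # Raises ValueError for anything outside yes | no | some | ?.
    return YesNoSome(str(raw).strip().lower())


print(parse_yes_no_some(True))    # YesNoSome.YES
print(parse_yes_no_some("some"))  # YesNoSome.SOME
print(parse_yes_no_some("?"))     # YesNoSome.UNKNOWN
```
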
