generated from automl/automl_template
---
# For reference on dataset card metadata, see the spec: https://github.com/huggingface/hub-docs/blob/main/datasetcard.md?plain=1
# Doc / guide: https://huggingface.co/docs/hub/datasets-cards
{}
---
# Dataset Card for carps
This dataset contains several optimizer runs on a subset of blackbox tasks.
## Dataset Details
### Dataset Description
<!-- Provide a longer summary of what this dataset is. -->
- **Curated by:** [More Information Needed]
- **Funded by [optional]:** [More Information Needed]
- **Shared by [optional]:** [More Information Needed]
- **Language(s) (NLP):** [More Information Needed]
- **License:** [More Information Needed]
### Dataset Sources [optional]
<!-- Provide the basic links for the dataset. -->
- **Repository:** [More Information Needed]
- **Paper [optional]:** [More Information Needed]
- **Demo [optional]:** [More Information Needed]
## Uses
<!-- Address questions around how the dataset is intended to be used. -->
### Direct Use
<!-- This section describes suitable use cases for the dataset. -->
[More Information Needed]
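As a sketch of one direct use, the per-trial records can be post-processed with pandas to recompute incumbent-cost trajectories per run (one run = a unique `optimizer_id`/`task_id`/`seed` combination, as documented under Dataset Structure). The DataFrame below is synthetic stand-in data, not taken from the dataset; on the real data, the recomputed column should match the precomputed `trial_value__cost_inc`.

```python
import pandas as pd

# Synthetic stand-in for the per-trial records (real column names from the
# card; the cost values are made up for illustration).
df = pd.DataFrame({
    "optimizer_id": ["opt_a"] * 8,
    "task_id": ["task_1"] * 8,
    "seed": [0, 0, 0, 0, 1, 1, 1, 1],
    "n_trials": [1, 2, 3, 4, 1, 2, 3, 4],
    "trial_value__cost": [0.9, 0.5, 0.7, 0.4, 0.8, 0.6, 0.3, 0.35],
})

# One optimization run = unique (optimizer_id, task_id, seed).
run_keys = ["optimizer_id", "task_id", "seed"]

# Incumbent cost = best (lowest) cost seen so far within each run.
df = df.sort_values(run_keys + ["n_trials"])
df["cost_inc_recomputed"] = df.groupby(run_keys)["trial_value__cost"].cummin()

print(df[["seed", "n_trials", "trial_value__cost", "cost_inc_recomputed"]])
```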
### Out-of-Scope Use
<!-- This section addresses misuse, malicious use, and uses that the dataset will not work well for. -->
[More Information Needed]
## Dataset Structure
```yaml
# Identifying information of the run
optimizer_id: The identifier of the optimizer.
task_id: The identifier of the task.
seed: The seed that has been used. A unique combination of `optimizer_id`, `task_id`, and `seed` identifies one optimization run.
experiment_id: Experiment id. Enumeration of all runs.
# Information about the progress of the optimization
n_trials: The number of trials that have been evaluated so far.
n_function_calls: The number of times the objective function has been called. This can differ from `n_trials` in multi-fidelity settings, where the objective function can be called with a lower fidelity: such a call still increases `n_function_calls` by one, but increases `n_trials` only fractionally.
n_trials_norm: The number of trials, normalized per run.
time: The elapsed time.
time_norm: The elapsed time, normalized per run.
# Information about the trial (ask)
trial_info__config: The configuration that has been evaluated.
trial_info__instance: The instance that the configuration has been evaluated on (None for anything but Algorithm Configuration).
trial_info__seed: The seed with which the objective function has been evaluated. In case of stochastic objective functions, we might wish to evaluate the objective function several times with different seeds for the same configuration.
trial_info__budget: The multi-fidelity resource, e.g. the number of epochs. None for anything but multi-fidelity.
trial_info__normalized_budget: The normalized budget (normalized by the maximum budget as indicated by the task).
trial_info__name: An optional name for the trial.
trial_info__checkpoint: An optional checkpoint for the trial.
# Information about the evaluated trial (tell)
## Cost related
trial_value__cost: The objective function value (lower is better).
trial_value__cost_raw: Same as `trial_value__cost`.
trial_value__cost_norm: Normalized cost, min and max are taken over all runs for one task.
## The incumbent cost
trial_value__cost_inc: The incumbent cost (best/lowest cost seen so far).
trial_value__cost_inc_norm: Incumbent cost, normalized over all runs for one task.
trial_value__cost_inc_norm_log: Logarithm of the normalized incumbent cost.
### Multi-objective
hypervolume: The hypervolume as calculated over all runs for one task.
reference_point: Reference point as determined over all runs for one task (worst seen combo).
## Time related
trial_value__time: The time the objective function took to evaluate.
trial_value__virtual_time: If the objective function is a surrogate, it can still return an evaluation time. This is marked then as virtual time.
trial_value__status: The status of the trial; ideally `SUCCESS`.
trial_value__starttime: The start time of the objective function evaluation.
trial_value__endtime: The end time of the objective function evaluation.
# Information about the task
benchmark_id: The identifier of the benchmark collection the task belongs to.
task_type: The task type, either blackbox, multi-fidelity, multi-objective, or multi-fidelity-objective / momf.
subset_id: The subset id, mostly `None`, `dev` or `test`.
set: Another name for `subset_id`.
task.optimization_resources.n_trials: The optimization resources for the tasks in terms of the number of trials.
```
## Dataset Creation
### Curation Rationale
[More Information Needed]
### Source Data
#### Data Collection and Processing
[More Information Needed]
#### Who are the source data producers?
[More Information Needed]
### Annotations [optional]
#### Annotation process
[More Information Needed]
#### Who are the annotators?
[More Information Needed]
#### Personal and Sensitive Information
[More Information Needed]
## Bias, Risks, and Limitations
[More Information Needed]
### Recommendations
Users should be made aware of the risks, biases, and limitations of the dataset. More information needed for further recommendations.
## Citation [optional]
**BibTeX:**
[More Information Needed]
**APA:**
[More Information Needed]
## Glossary [optional]
[More Information Needed]
## More Information [optional]
[More Information Needed]
## Dataset Card Authors [optional]
[More Information Needed]
## Dataset Card Contact
[More Information Needed]