Description
From `runbenchmark.20201223T011330.log`, saved to S3 (the same error occurred on all datasets):
```
[DEBUG] [amlb.benchmark:01:13:30.196] Using constraint definition: { 'cores': 8,
  'folds': 10,
  'max_runtime_seconds': 3600,
  'min_vol_size_mb': 1000000,
  'name': '1h8c'}.
[INFO] [amlb.benchmarks.openml:01:13:30.196] Loading openml suite 269.
[ERROR] [amlb:01:13:30.498] https://www.openml.org/api/v1/xml/study/269 returned code 107: Database connection error. Usually due to high server load. Please wait for N seconds and try again. - None
Traceback (most recent call last):
  File "runbenchmark.py", line 118, in <module>
    bench = amlb.Benchmark(args.framework, args.benchmark, args.constraint)
  File "/repo/amlb/benchmark.py", line 75, in __init__
    self.benchmark_def, self.benchmark_name, self.benchmark_path = rget().benchmark_definition(benchmark_name, self.constraint_def)
  File "/repo/amlb/resources.py", line 181, in benchmark_definition
    hard_defaults, tasks, benchmark_path, benchmark_name = benchmark_load(name, self.config.benchmarks.definition_dir)
  File "/repo/amlb/benchmarks/parser.py", line 19, in benchmark_load
    benchmark_name, benchmark_path, tasks = load_oml_benchmark(name)
  File "/repo/amlb/benchmarks/openml.py", line 38, in load_oml_benchmark
    suite = openml.study.get_suite(oml_id)
  File "/repo/venv/lib/python3.7/site-packages/openml/study/functions.py", line 29, in get_suite
    suite = cast(OpenMLBenchmarkSuite, _get_study(suite_id, entity_type="task"))
  File "/repo/venv/lib/python3.7/site-packages/openml/study/functions.py", line 71, in _get_study
    xml_string = openml._api_calls._perform_api_call(call_suffix, "get")
  File "/repo/venv/lib/python3.7/site-packages/openml/_api_calls.py", line 62, in _perform_api_call
    __check_response(response, url, file_elements)
  File "/repo/venv/lib/python3.7/site-packages/openml/_api_calls.py", line 143, in __check_response
    raise __parse_server_exception(response, url, file_elements=file_elements)
openml.exceptions.OpenMLServerException: https://www.openml.org/api/v1/xml/study/269 returned code 107: Database connection error. Usually due to high server load. Please wait for N seconds and try again. - None
```
This was caused by the following command:

```
python runbenchmark.py AutoGluon openml/s/269 1h8c -m aws -f 0 -p 100
```
Note: This is on a fork of the automlbenchmark repo where I have raised the maximum `-p` value to 100.

Running the test datasets works correctly.
When I run the classification datasets, ~45 of the 66 datasets fail with similar errors, while the others succeed:

```
python runbenchmark.py AutoGluon openml/s/271 1h8c -m aws -f 0 -p 100
```
This used to work fine in previous versions of automlbenchmark (from May 28th), where I would sometimes even use the equivalent of `-p 400` with the original 39 datasets.
It seems odd that it is trying to fetch the study instead of a particular dataset. Why does an EC2 instance need to fetch the whole study when it is only supposed to be training on a single dataset? Also, perhaps it makes sense to wrap the data access in a retry loop with exponential backoff; the current behavior seems quite brittle.
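As a rough illustration of the retry idea, here is a minimal sketch of a generic exponential-backoff helper (the `retry_with_backoff` name and parameters are my own, not part of amlb or openml-python; the commented usage around `openml.study.get_suite` is hypothetical):

```python
import time


def retry_with_backoff(fn, retryable=(Exception,), max_retries=5, base_delay=1.0):
    """Call fn(), retrying on the given exception types.

    Waits base_delay, 2*base_delay, 4*base_delay, ... seconds between
    attempts, and re-raises the last exception once max_retries attempts
    have been exhausted.
    """
    for attempt in range(max_retries):
        try:
            return fn()
        except retryable:
            if attempt == max_retries - 1:
                raise  # out of attempts; surface the original error
            time.sleep(base_delay * (2 ** attempt))


# Hypothetical usage around the failing call in load_oml_benchmark:
# suite = retry_with_backoff(
#     lambda: openml.study.get_suite(oml_id),
#     retryable=(openml.exceptions.OpenMLServerException,),
# )
```

Since code 107 is explicitly described by the server as transient ("wait for N seconds and try again"), something along these lines would let most of the ~45 failing runs recover instead of aborting.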