Commit 7dcef7d

Merge pull request #37 from MITLibraries/TIMX-356-limit-parallel-containers
Limit parallel containers of Transmogrifier
2 parents adfe9c7 + bfcf42a commit 7dcef7d

6 files changed: +277 -147 lines changed

README.md

Lines changed: 12 additions & 26 deletions
@@ -6,6 +6,15 @@ Compare transformed TIMDEX records from two versions (A,B) of Transmogrifier.
 
 `abdiff` is the name of the CLI application in this repository that performs an A/B test of Transmogrifier.
 
+## Development
+
+- To preview a list of available Makefile commands: `make help`
+- To install with dev dependencies: `make install`
+- To update dependencies: `make update`
+- To run unit tests: `make test`
+- To lint the repo: `make lint`
+- To run the app: `pipenv run abdiff --help`
+
 ## Concepts
 
 A **Job** in `abdiff` represents the A/B test for comparing the results from two versions of Transmogrifier. When a job is first created, a working directory and a JSON file `job.json` with an initial set of configurations is created.
@@ -82,7 +91,9 @@ AWS_SESSION_TOKEN=# passed to Transmogrifier containers for use
 
 ```text
 WEBAPP_HOST=# host for flask webapp
-WEBAPP_PORT# port for flask webapp
+WEBAPP_PORT=# port for flask webapp
+TRANSMOGRIFIER_MAX_WORKERS=# max number of Transmogrifier containers to run in parallel; default is 6
+TRANSMOGRIFIER_TIMEOUT=# timeout for a single Transmogrifier container; default is 5 hours
 ```
 
 ## CLI commands
@@ -147,28 +158,3 @@ Options:
 -h, --help Show this message and exit.
 ```
 
-## Development
-
-- To preview a list of available Makefile commands: `make help`
-- To install with dev dependencies: `make install`
-- To update dependencies: `make update`
-- To run unit tests: `make test`
-- To lint the repo: `make lint`
-- To run the app: `pipenv run abdiff --help`
-
-## Environment Variables
-
-### Required
-
-```shell
-WORKSPACE=### Set to `dev` for local development, this will be set to `stage` and `prod` in those environments by Terraform.
-```
-
-### Optional
-
-```shell
-```
-
-
-
-

abdiff/config.py

Lines changed: 14 additions & 0 deletions
@@ -13,6 +13,8 @@ class Config:
     OPTIONAL_ENV_VARS = (
         "WEBAPP_HOST",
         "WEBAPP_PORT",
+        "TRANSMOGRIFIER_MAX_WORKERS",
+        "TRANSMOGRIFIER_TIMEOUT",
     )
 
     def __getattr__(self, name: str) -> Any:  # noqa: ANN401
@@ -31,6 +33,18 @@ def webapp_port(self) -> int:
         port = self.WEBAPP_PORT or "5000"
         return int(port)
 
+    @property
+    def transmogrifier_max_workers(self) -> int:
+        """Maximum number of Transmogrifier containers to run in parallel."""
+        max_workers = self.TRANSMOGRIFIER_MAX_WORKERS or 6
+        return int(max_workers)
+
+    @property
+    def transmogrifier_timeout(self) -> int:
+        """Timeout for a single Transmogrifier container."""
+        timeout = self.TRANSMOGRIFIER_TIMEOUT or 60 * 60 * 5  # 5 hours default
+        return int(timeout)
+
 
 
 def configure_logger(logger: logging.Logger, *, verbose: bool) -> str:
     if verbose:
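
The two new properties follow the same pattern as `webapp_port`: take the environment value if it is set and non-empty, otherwise fall back to a default, then coerce to `int`. The sketch below isolates that fallback as a standalone function. It assumes the values come straight from `os.environ` (this diff does not show how `Config.__getattr__` resolves them), and `int_from_env` and the `DEFAULT_*` constants are hypothetical names used only for illustration.

```python
import os

# Defaults taken from the diff above: 6 workers, 5 hours expressed in seconds.
DEFAULT_MAX_WORKERS = 6
DEFAULT_TIMEOUT_SECONDS = 60 * 60 * 5


def int_from_env(name: str, default: int) -> int:
    """Return the env var as an int, or the default when unset or empty."""
    raw = os.environ.get(name)
    # `or` treats both None (unset) and "" (set but empty) as missing.
    return int(raw or default)


def transmogrifier_max_workers() -> int:
    """Hypothetical equivalent of Config.transmogrifier_max_workers."""
    return int_from_env("TRANSMOGRIFIER_MAX_WORKERS", DEFAULT_MAX_WORKERS)


def transmogrifier_timeout() -> int:
    """Hypothetical equivalent of Config.transmogrifier_timeout (seconds)."""
    return int_from_env("TRANSMOGRIFIER_TIMEOUT", DEFAULT_TIMEOUT_SECONDS)
```

One consequence of the `or` fallback is that an empty value such as `TRANSMOGRIFIER_MAX_WORKERS=` in a `.env` file yields the default rather than raising, while a non-numeric value would still raise `ValueError` from `int()`.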

abdiff/core/exceptions.py

Lines changed: 8 additions & 13 deletions
@@ -3,30 +3,25 @@ def __init__(self, run_id: str) -> None:
         super().__init__(f"No Docker containers were found with label run='{run_id}'.")
 
 
-class DockerContainerRuntimeExceededTimeoutError(Exception):
-    def __init__(self, containers: list, timeout: int) -> None:
-        self.containers = containers
+class DockerContainerTimeoutError(Exception):
+    def __init__(self, container_id: str | None, timeout: int) -> None:
+        self.container_id = container_id
         self.timeout = timeout
         super().__init__(self.get_formatted_message())
 
     def get_formatted_message(self) -> str:
-        container_ids = [container.id for container in self.containers]
-        return (
-            f"Timeout of {self.timeout} seconds exceeded."
-            f"{len(container_ids)} container(s) is/are still running:"
-            f"{container_ids}."
-        )
+        return f"Container {self.container_id} exceeded timeout of {self.timeout} seconds."
 
 
 # core function errors
-class DockerContainerRunFailedError(Exception):
-    def __init__(self, containers: list) -> None:
-        self.containers = containers
+class DockerContainerRuntimeError(Exception):
+    def __init__(self, container_id: str) -> None:
+        self.container_id = container_id
         super().__init__(self.get_formatted_message())
 
     def get_formatted_message(self) -> str:
         return (
-            f"The following Docker containers exited with an error: {self.containers}. "
+            f"Container {self.container_id} did not exit cleanly. "
             "Check the logs in transformed/logs.txt to identify the error."
         )
 
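
The exceptions now carry a single `container_id` instead of a list of containers, which fits a design where each container is run and checked by its own worker. The code that actually runs the containers is not part of this excerpt, so the following is only a sketch of how a bounded pool and a per-container timeout could fit together. It assumes a `ThreadPoolExecutor` capped at `max_workers`; `run_fake_container` and `run_all` are hypothetical stand-ins, and no Docker calls are made.

```python
import time
from concurrent.futures import ThreadPoolExecutor, as_completed
from typing import Optional


class DockerContainerTimeoutError(Exception):
    """Stand-in for the per-container timeout exception in the diff."""

    def __init__(self, container_id: Optional[str], timeout: int) -> None:
        self.container_id = container_id
        self.timeout = timeout
        super().__init__(
            f"Container {container_id} exceeded timeout of {timeout} seconds."
        )


def run_fake_container(container_id: str, duration: float, timeout: float) -> str:
    """Hypothetical worker: simulate one container run with a timeout check."""
    if duration > timeout:
        # A real worker would wait on the container and give up at the timeout;
        # here the failure is raised immediately to keep the example fast.
        raise DockerContainerTimeoutError(container_id, int(timeout))
    time.sleep(duration)
    return container_id


def run_all(
    jobs: dict[str, float], max_workers: int, timeout: float
) -> tuple[list[str], list[Exception]]:
    """Run every job with at most `max_workers` running at the same time."""
    completed: list[str] = []
    errors: list[Exception] = []
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        futures = [
            executor.submit(run_fake_container, container_id, duration, timeout)
            for container_id, duration in jobs.items()
        ]
        for future in as_completed(futures):
            try:
                completed.append(future.result())
            except DockerContainerTimeoutError as exc:
                # One failure is recorded per container; other jobs continue.
                errors.append(exc)
    return completed, errors
```

Calling `run_all({"a": 0.01, "b": 0.01, "c": 5.0}, max_workers=2, timeout=1.0)` would complete `a` and `b` and collect one timeout error whose `container_id` is `"c"`. Whether the real implementation collects errors or stops at the first one is not visible in this diff.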