Project roadmap

This tracking issue documents the features we currently plan to integrate into gantry.

This thread should also be used to discuss new directions for the project.

- [ ] #73
- [ ] #75
- [ ] #74
- [ ] #76
- [ ] #77

-----

## Plan

1. In the pilot phase, we will implement predictions for requests only, and ensure that a predicted request is never lower than the current allocation.
2. If the pilot succeeds, we'll implement functionality that retries jobs with higher memory allocations when they have failed due to OOM kills.
3. Then, we will "drop the floor" and allow the predictor to allocate less memory than the package is used to. At this step, requests will be fully implemented.
4. Next, we will implement limits for CPU and memory.
5. After that, we want to introduce some experimentation into the system and perform a [scaling study](https://github.com/spack/spack-gantry/issues/76).
6. Finally, we will design a scheduler that decides which instance type a job should be placed on, based on cost and on expected usage and runtime.
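
To make steps 1 to 3 concrete, here is a minimal sketch of how the floor and the OOM retry could behave. The function names, the `keep_floor` flag, and the `growth_factor` default of 1.5 are illustrative assumptions, not part of gantry's actual code.

```python
def memory_request_mb(predicted_mb: float, current_mb: float,
                      keep_floor: bool = True) -> float:
    """Choose the memory request for a job (hypothetical helper).

    keep_floor=True models the pilot phase (step 1): the request may only
    stay the same or increase relative to the current allocation.
    keep_floor=False models step 3, where the floor is dropped and the
    prediction is used as-is.
    """
    if keep_floor:
        return max(predicted_mb, current_mb)
    return predicted_mb


def retry_request_mb(failed_request_mb: float,
                     growth_factor: float = 1.5) -> float:
    """Step 2: pick a larger request after an OOM kill.

    The growth factor is an assumed value chosen for illustration only.
    """
    return failed_request_mb * growth_factor
```

For example, during the pilot a prediction of 800 MB against a current allocation of 1000 MB would still request 1000 MB. Once the floor is dropped, the same prediction would request 800 MB.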

## Evaluation

We can evaluate the success of this framework by asking the following questions:

- Has the cost per job changed?
- Are jobs being killed due to resource contention?
- What is the error distribution of our predictions?
- How much waste is there per build type?
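
As a rough sketch of how some of these questions could be answered from job records, the snippet below computes the OOM kill rate, the prediction errors, and the mean wasted fraction per build type. The record fields (`build_type`, `requested_mb`, `peak_mb`, `oom_killed`) are hypothetical and do not reflect gantry's actual schema. Cost per job is left out because it depends on pricing data not described here.

```python
from collections import defaultdict
from statistics import mean


def evaluate(records):
    """Summarize prediction quality from a list of job records.

    Each record is a dict with hypothetical keys: build_type,
    requested_mb, peak_mb, and oom_killed.
    """
    # Signed error: positive means over-allocation, negative means
    # the job used more than was requested.
    errors = [r["requested_mb"] - r["peak_mb"] for r in records]

    # Fraction of each request that went unused, grouped by build type.
    waste_by_type = defaultdict(list)
    for r in records:
        unused = max(r["requested_mb"] - r["peak_mb"], 0)
        waste_by_type[r["build_type"]].append(unused / r["requested_mb"])

    return {
        "oom_rate": mean(1 if r["oom_killed"] else 0 for r in records),
        "errors_mb": errors,
        "mean_waste_by_type": {t: mean(w) for t, w in waste_by_type.items()},
    }
```

The `errors_mb` list can then be plotted as a histogram to inspect the error distribution.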



