2.0 - Scheduler Plugins #64

@pflarr

Scheduler Plugins

The process of running and scheduling jobs is as follows. The steps that actually involve the scheduler are the ones in which it is named as the actor (steps 4 and 7 through 10):

  1. Test configs are resolved into almost-final individual test configurations.
  2. Tests are grouped by scheduler.
  3. The scheduler-relevant section of each test config has its variables resolved.
  4. The scheduler-relevant test sections are given to the scheduler, which returns a set of minimum resource requirements across all tests and the 'sched' variable values.
  5. The 'sched' variables are used to completely resolve the variables in each test config.
  6. A 'PavTest' object is created for each test, and the test directories (and IDs) are created.
  7. The scheduler is given the requirements set and the list of tests to schedule.
  8. The scheduler writes a kickoff script that will individually run each test via Pavilion.
  9. The scheduler schedules the kickoff script.
  10. The scheduler verifies that the kickoff script should actually run given the resources still available.
  11. Pavilion gives the user the list of test IDs and confirms that the tests have been kicked off.
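
Steps 2 and 4 above can be sketched roughly as follows. This is only an illustration, not Pavilion code: the config layout (a 'scheduler' key plus a section named after the scheduler), the resource keys, and the 'raw' scheduler name are all assumptions, and "minimum resource requirements across all tests" is read here as the per-resource maximum, so that the result fits the largest test.

```python
from collections import defaultdict


def group_by_scheduler(test_configs):
    """Step 2: bucket test configs by the scheduler each one names."""
    groups = defaultdict(list)
    for cfg in test_configs:
        groups[cfg["scheduler"]].append(cfg)
    return dict(groups)


def min_requirements(sched_sections):
    """Step 4 (one possible reading): the smallest allocation that fits
    every test, i.e. the per-resource maximum over all sections."""
    reqs = {}
    for section in sched_sections:
        for key, value in section.items():
            reqs[key] = max(reqs.get(key, 0), value)
    return reqs


# Hypothetical, already-resolved test configs.
tests = [
    {"name": "io_small", "scheduler": "slurm",
     "slurm": {"nodes": 2, "tasks_per_node": 4}},
    {"name": "io_large", "scheduler": "slurm",
     "slurm": {"nodes": 8, "tasks_per_node": 2}},
    {"name": "smoke", "scheduler": "raw", "raw": {"nodes": 1}},
]

groups = group_by_scheduler(tests)
slurm_reqs = min_requirements([t["slurm"] for t in groups["slurm"]])
print(slurm_reqs)
```

A real scheduler plugin would also return the 'sched' variable values at this point; that part is left out because its shape is not specified here.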

For schedulers that actually schedule jobs on a cluster, the kickoff script is expected to run in an allocation sized for the largest test it will run. The tests themselves should run on pieces of that allocation, scheduled from within it. This may not be possible on all schedulers, but it is on Slurm (and probably Moab). The kickoff script does the following:

  1. Sets up the environment necessary to run Pavilion.
  2. Loops through each test serially:
    a. Issues the pav do_build command for the test.
    b. Issues the pav do_run command to run the test.
  3. Updates the test status (using the pav status command) before and after each step.
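
As a rough sketch of what a scheduler plugin might generate for the kickoff script: the command names (pav do_build, pav do_run, pav status) come from the list above, but the arguments passed to them, the status labels, and the kickoff_script helper itself are hypothetical, since the argument format is not settled here.

```python
def kickoff_script(test_ids, pav_bin="pav"):
    """Return the text of a kickoff script that runs each test serially.

    Assumptions (not specified in the design): each command takes the
    test id as its only positional argument, and 'pav status' accepts a
    status label after the id.
    """
    lines = [
        "#!/bin/bash",
        "# Step 1: site-specific environment setup would go here.",
    ]
    # Step 2: loop through the tests serially, building then running each.
    for test_id in test_ids:
        for command, label in (("do_build", "BUILD"), ("do_run", "RUN")):
            # Step 3: update the status before and after each step.
            lines.append(f"{pav_bin} status {test_id} {label}_STARTING")
            lines.append(f"{pav_bin} {command} {test_id}")
            lines.append(f"{pav_bin} status {test_id} {label}_DONE")
    return "\n".join(lines) + "\n"


script = kickoff_script([12, 13])
print(script)
```

The sketch always runs the test after the build step. Whether a failed build should skip the run is a policy decision it does not try to make.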

The pav do_run command does the following:

  1. Finds the test based on test ID.
  2. Writes the test run script.
  3. Schedules and runs the test run script.
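
The three do_run steps might look something like the sketch below. Everything in it is an assumption made for illustration: the one-directory-per-test-ID layout, the run.sh file name, and the function names are invented, and the optional launcher prefix stands in for whatever the scheduler uses to place the script on part of the allocation.

```python
import stat
import subprocess
import tempfile
from pathlib import Path


def find_test_dir(tests_root, test_id):
    """Step 1: locate the test directory (assumed to be named by id)."""
    path = Path(tests_root) / str(test_id)
    if not path.is_dir():
        raise FileNotFoundError(f"no test directory for id {test_id}")
    return path


def write_run_script(test_dir, commands):
    """Step 2: write an executable run script into the test directory."""
    script = test_dir / "run.sh"
    script.write_text("#!/bin/bash\n" + "\n".join(commands) + "\n")
    script.chmod(script.stat().st_mode | stat.S_IXUSR)
    return script


def do_run(tests_root, test_id, commands, launcher=()):
    """Step 3: run the script, optionally under a scheduler launcher."""
    test_dir = find_test_dir(tests_root, test_id)
    script = write_run_script(test_dir, commands)
    result = subprocess.run([*launcher, str(script)], cwd=test_dir)
    return result.returncode


# Demonstrate steps 1 and 2 against a throwaway directory tree.
with tempfile.TemporaryDirectory() as root:
    (Path(root) / "7").mkdir()
    script = write_run_script(find_test_dir(root, 7), ["echo hello"])
    script_text = script.read_text()
    is_executable = bool(script.stat().st_mode & stat.S_IXUSR)
```

On Slurm, the launcher could be something like ("srun", "-N", "2"), so that the run script executes on a slice of the kickoff script's allocation. With an empty launcher the script simply runs locally.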
