Scheduler Plugins
The process of running and scheduling jobs is as follows. Steps that actually involve the scheduler are in bold:
- Test configs are resolved into almost final individual test configurations.
- Tests are grouped by scheduler.
- The scheduler-relevant section of each test config has its variables resolved.
- The scheduler-relevant test sections are given to the scheduler, which returns a set of minimum resource requirements across all tests and the 'sched' variable values.
- The 'sched' variables are used to completely resolve the variables in each test config.
- A 'PavTest' object is created for each test, and the test directories (and IDs) are created.
- The scheduler is given the requirements set and the list of tests to schedule.
- The scheduler writes a kickoff script that will run each test individually via Pavilion.
- The scheduler schedules the kickoff script.
- The scheduler verifies that the kickoff script should actually run given the resources still available.
- Pavilion gives the user the list of test IDs and confirms that the tests have been kicked off.
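The grouping and requirements steps above can be sketched roughly as follows. This is only an illustration, not Pavilion's actual plugin interface: the config layout (`scheduler`, `sched_section`) and both function names are hypothetical. It also assumes that the "minimum resource requirements across all tests" means, for each resource, the largest amount any single test asks for, which matches the later note that the allocation is sized to the largest test.

```python
from collections import defaultdict


def group_by_scheduler(test_configs):
    """Group test configs by the scheduler each one names (hypothetical key)."""
    groups = defaultdict(list)
    for config in test_configs:
        groups[config["scheduler"]].append(config)
    return dict(groups)


def minimum_requirements(configs):
    """Return the smallest resource set that can hold every test.

    Assumption: for each resource, this is the largest value any single
    test requests in its (hypothetical) 'sched_section'.
    """
    requirements = {}
    for config in configs:
        for resource, amount in config["sched_section"].items():
            requirements[resource] = max(requirements.get(resource, 0), amount)
    return requirements


# Hypothetical, already-resolved scheduler sections for three tests.
test_configs = [
    {"name": "small", "scheduler": "slurm",
     "sched_section": {"num_nodes": 2, "tasks_per_node": 8}},
    {"name": "large", "scheduler": "slurm",
     "sched_section": {"num_nodes": 16, "tasks_per_node": 4}},
    {"name": "local", "scheduler": "raw",
     "sched_section": {"num_nodes": 1}},
]

groups = group_by_scheduler(test_configs)
requirements = {name: minimum_requirements(cfgs) for name, cfgs in groups.items()}
```

Here each scheduler would receive one requirements set plus its own list of tests, which is what the later scheduling steps consume.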
For schedulers that actually schedule jobs on a cluster, the kickoff script is expected to run on an allocation sized to the largest test it expects to run. The tests themselves should run on pieces of that allocation, scheduled from within it. This may not be possible on all schedulers, but it is for Slurm (and probably Moab). The kickoff script does the following:
- Sets up the environment necessary to run Pavilion.
- Loops through each test serially:
  a. Issues the `pav do_build` command for the test.
  b. Issues the `pav do_run` command to run the test.
- Updates the test status (using the `pav status` command) before and after each step.
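As a rough sketch of what generating such a kickoff script could look like, assuming a bash script and a hypothetical `build_kickoff_script` helper: the `pav do_build`, `pav do_run`, and `pav status` command names come from the description above, but passing the test ID as the only argument is an assumption, as is the example environment line.

```python
import shlex


def build_kickoff_script(test_ids, env_lines):
    """Return the text of a kickoff script (hypothetical helper).

    Assumption: each pav command takes the test ID as its only argument.
    """
    lines = ["#!/bin/bash", "# Set up the environment needed to run Pavilion."]
    lines.extend(env_lines)
    # Tests are handled serially: build, then run, one test at a time.
    for test_id in test_ids:
        quoted = shlex.quote(str(test_id))
        for step in ("do_build", "do_run"):
            lines.append(f"pav status {quoted}")   # status before the step
            lines.append(f"pav {step} {quoted}")
            lines.append(f"pav status {quoted}")   # status after the step
    return "\n".join(lines) + "\n"


# The export line is a placeholder, not a real Pavilion setting.
script = build_kickoff_script(
    [41, 42],
    ["export PATH=/hypothetical/pavilion/bin:$PATH"],
)
```

Writing the commands out one per line keeps the script easy to inspect when a test fails partway through.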
The `pav do_run` command does the following:
- Finds the test based on its test ID.
- Writes the test run script.
- Schedules and runs the test run script.
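These three steps might look something like the sketch below. Everything here is hypothetical: the one-directory-per-test-ID layout, the `config.json` file with a `run_cmds` list, the `run.sh` name, and the `launch` callback, which stands in for however the scheduler would place the script on a piece of the allocation.

```python
import json
import tempfile
from pathlib import Path


def do_run(test_id, working_dir, launch):
    """Find a test by ID, write its run script, and hand it to `launch`."""
    # 1. Find the test based on its ID (assumed: one directory per ID).
    test_dir = Path(working_dir) / str(test_id)
    if not test_dir.is_dir():
        raise FileNotFoundError(f"No test directory for test id {test_id}")
    config = json.loads((test_dir / "config.json").read_text())

    # 2. Write the test run script.
    script_path = test_dir / "run.sh"
    script_path.write_text("#!/bin/bash\n" + "\n".join(config["run_cmds"]) + "\n")
    script_path.chmod(0o755)

    # 3. Schedule and run it; the launcher is supplied by the caller.
    return launch(script_path)


launched = []


def record_launch(path):
    """Stand-in launcher that records the script instead of executing it."""
    launched.append(path.name)
    return path.read_text()


with tempfile.TemporaryDirectory() as working_dir:
    test_dir = Path(working_dir) / "7"
    test_dir.mkdir()
    (test_dir / "config.json").write_text(json.dumps({"run_cmds": ["echo hello"]}))
    result = do_run(7, working_dir, record_launch)
```

Passing the launcher in as a parameter keeps the scheduler-specific part separate from the bookkeeping, so the same flow could be exercised without a real cluster.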