Open
Description
This issue is part of the ongoing models-team project to revamp our CI testing:
- An effort to make tests easier to understand/contribute/fix
- Fill in any testing gaps in our tests
- Optimize the CI load (no redundant tests)
- And above all, make tests green.
Description
With new features coming to our models, such as new sampling parameters (logprobs, seed, n, penalties) and prefetcher modules, we need to expand our test coverage. Since these sweeps also include a full sequence-length sweep, they should remain on their own separate pipeline, so that they do not mix with the more important demos and unit tests.
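To give a sense of how quickly such a sweep grows, here is a minimal sketch that enumerates the combinations of sampling parameters and sequence lengths. All value ranges and the `repetition_penalty` name are hypothetical placeholders, not the actual test configuration.

```python
from itertools import product

# Hypothetical sweep values; the real ranges would be decided per model.
SWEEP = {
    "logprobs": [False, True],
    "seed": [None, 1234],
    "n": [1, 4],
    "repetition_penalty": [1.0, 1.2],  # stand-in for the "penalties" group
    "seq_len": [128, 1024, 8192],
}


def sweep_cases(grid):
    """Return one dict per combination of parameter values."""
    names = list(grid)
    return [dict(zip(names, values)) for values in product(*grid.values())]


cases = sweep_cases(SWEEP)
print(len(cases))  # 2 * 2 * 2 * 2 * 3 = 48 cases
```

Even this small grid yields 48 cases per model, which is one reason to keep the sweeps off the demo and unit-test pipelines.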
Proposal
- Create new `[Arch] Model Param Sweep Tests` pipelines
- Create tier 1 (and some tier 2) sweep tests and add them to the new pipeline
- Jobs must be split by model
- Different flavours of the same model (same codebase, different weights) should be split into separate jobs, e.g. Llama3-1B, 3B, 8B, etc.
- This ensures that a failing model does not block the models scheduled to run after it
- Models must be organised by a 3-tier system
- Tier 1, the most important models, will always contain all the mandatory tests
- Tier 3, the least important models, will contain only a demo without any performance validation, to ensure that the model still runs on the latest SW version
- Filter system for every tier and model
- A user should be able to quickly choose to run the whole pipeline, a specific tier of models, or just a single model, depending on their needs.
- This keeps CI from being clogged with unnecessary runs
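The per-model split and the tier/model filter could be sketched roughly as follows. This is only an illustration under assumptions: `ModelJob`, `JOBS`, and `select_jobs` are hypothetical names, and the tier assignments are placeholders rather than the agreed tiering.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelJob:
    name: str  # one job per model flavour, so failures stay isolated
    tier: int  # 1 = most important, 3 = least important


# Hypothetical registry; the tier assignments below are placeholders.
JOBS = [
    ModelJob("Llama3-1B", 1),
    ModelJob("Llama3-3B", 1),
    ModelJob("Llama3-8B", 1),
    ModelJob("example-tier2-model", 2),
    ModelJob("example-tier3-model", 3),
]


def select_jobs(jobs, tier=None, model=None):
    """Return the jobs matching the optional tier and model filters."""
    return [
        job
        for job in jobs
        if (tier is None or job.tier == tier)
        and (model is None or job.name == model)
    ]


# Whole pipeline, a single tier, or a single model:
everything = select_jobs(JOBS)
tier_one = select_jobs(JOBS, tier=1)
single = select_jobs(JOBS, model="Llama3-8B")
```

In an actual CI workflow the same selection would likely come from dispatch inputs feeding a job matrix; the function above only illustrates the filtering logic.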
Progress
- Present the proposal to the Infra team to get the green light
- Create the new pipelines (start with the WH pipelines, then move to BH)
- Create the relevant model tests
- Add tests to the pipeline in the correct tier