| status | in_progress |
|---|---|
| domain | nemesis |
| created | 2026-01-30 |
| last_updated | 2026-03-15 |
| owner | pehala |
Currently all nemeses live in a single file of over 7000 lines, which makes the code hard for both humans and AIs to understand. All logic lives in the Nemesis class to make it easy to share functionality between nemeses, which results in an extremely bloated class where it is hard to tell what code already exists. This often leads to code duplication: the same functionality is implemented for multiple nemeses in slightly different ways in different util functions.
Another problem is that Nemesis classes, which are used for filtering and carry all the flags, are never actually executed when used in SisyphusMonkey (i.e. the biggest use case). Methods are the carriers of functionality, and classes are matched to methods only through regex, i.e. the classes themselves are never executed. This method of discovery prevents further improvements.
- Make classes executable even with Sisyphus
- Split the nemesis.py file into individual modules (either files or directories) by group of nemeses
- Like "topology" or "modify_table"
- Those modules would keep common functionality close but encapsulated within the same context
- Revisit documentation of existing nemesis
- Given we will be touching almost every nemesis, it makes sense to better document them all
- Migrate utility functions from the Nemesis class into their respective Nemesis classes/modules
Objective: Make Nemesis classes be executed when run with Sisyphus with minimal changes
- This will decouple the logic from the massive Nemesis class and serve as foundation for other phases
- Value added here is low, but it is a stepping stone for the other phases
Implementation:
- Split Nemesis class into NemesisBaseClass and NemesisRunner
- The abstract NemesisBaseClass is the new base class for all nemeses; it contains a disrupt method, which is callable and executes the disruption
- Nemesis is renamed to NemesisRunner and still contains the same methods
- Change existing nemeses to use NemesisBaseClass as their parent class and execute all disrupt methods on NemesisRunner
- Change the discovery and execution mechanism to work with the new classes
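The split described above could look roughly like the following. This is a minimal sketch, not the actual implementation: only the names NemesisBaseClass, NemesisRunner and disrupt come from the plan, while `RestartNodeNemesis`, `disrupt_restart_node`, the `disruptive` flag, the `executed` list and the `concrete_nemeses` helper are hypothetical and exist only for illustration.

```python
import inspect
from abc import ABC, abstractmethod


class NemesisRunner:
    """Stand-in for the renamed Nemesis class; it keeps the existing methods."""

    def __init__(self):
        self.executed = []  # hypothetical: records calls for this example only

    def disrupt_restart_node(self):  # hypothetical pre-existing disrupt method
        self.executed.append("disrupt_restart_node")


class NemesisBaseClass(ABC):
    """Abstract parent of every nemesis: carries the flags and a callable disrupt()."""

    disruptive = False  # hypothetical filtering flag

    def __init__(self, runner):
        self.runner = runner

    @abstractmethod
    def disrupt(self):
        """Execute the disruption."""


class RestartNodeNemesis(NemesisBaseClass):
    disruptive = True

    def disrupt(self):
        # Minimal-change step: delegate to the method that still lives on the runner.
        self.runner.disrupt_restart_node()


def concrete_nemeses(base=NemesisBaseClass):
    """Collect all non-abstract subclasses of base, at any inheritance depth."""
    found = []
    pending = list(base.__subclasses__())
    while pending:
        cls = pending.pop()
        pending.extend(cls.__subclasses__())
        if not inspect.isabstract(cls):
            found.append(cls)
    return found


# The class itself is selected by its flags and then executed, with no regex matching.
runner = NemesisRunner()
for nemesis_cls in concrete_nemeses():
    if nemesis_cls.disruptive:
        nemesis_cls(runner).disrupt()
```

In this sketch the class is both the carrier of the flags and the thing that gets executed, so filtering and execution operate on the same object.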
Testing:
- Unit test: Update unit tests
- Longevity: Parallel nemesis, Performance test, Short Longevity, Single nemesis run
Expected Impact: Almost none
- Goal is to make the smallest amount of changes possible
- The user-facing name of each nemesis will change from the disrupt method name to the class name
Objective: Allow viewing history before and after the naming change from previous phase
Implementation:
- Use an internal map in Argus to map old nemesis names to the new ones
Expected Impact:
- History of nemesis executions will take into account both old and new names
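The idea of an internal name map can be illustrated with a small sketch. It does not use any Argus API; the mapping entries, the `(name, status)` record shape and the helper names `canonical_name` and `group_history` are all assumptions made for this example.

```python
from collections import defaultdict

# Hypothetical mapping from old disrupt-method names to new class names.
OLD_TO_NEW_NAME = {
    "disrupt_restart_node": "RestartNodeNemesis",
    "disrupt_decommission_node": "DecommissionNodeNemesis",
}


def canonical_name(name):
    """Translate an old name to its new one; unknown names pass through unchanged."""
    return OLD_TO_NEW_NAME.get(name, name)


def group_history(records):
    """Group (name, status) records so old and new names share one history."""
    history = defaultdict(list)
    for name, status in records:
        history[canonical_name(name)].append(status)
    return dict(history)


records = [
    ("disrupt_restart_node", "passed"),  # recorded before the rename
    ("RestartNodeNemesis", "failed"),    # recorded after the rename
    ("disrupt_unmapped", "passed"),      # no mapping: kept under its own name
]
merged = group_history(records)
```

Applying the translation at read time keeps the stored records untouched, which is one possible design; rewriting stored names would be another.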
Objective: Change nemesis.py into nemesis directory
- Group all code related to nemesis into one module
- Easier to find relevant code
- NemesisRegistry, NodeAllocator etc.
- Separate Monkeys into separate submodule
- First phase of splitting, prepares the ground for Phase 3
- Provide auto-discovery mechanism
- NemesisRegistry currently uses inheritance-based discovery, but for that to work the subclasses need to be already imported. This is not a problem with a single file, but it becomes one when nemeses are scattered across the entire module. To address this we need an auto-import mechanism that ensures nemeses are always discovered.
- This will be a widespread change that is better to do in separate phase
Implementation (To be verified with PoC):
- Convert nemesis.py into a module
- Transfer all relevant files into nemesis module
- Create auto-discovery mechanism
- Extract all Monkeys into separate submodule
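One possible shape for the auto-import step is sketched below, pending the PoC. It relies only on the standard library (`importlib.import_module` and `pkgutil.walk_packages`); the function name `import_submodules` and the package name in the usage note are hypothetical.

```python
import importlib
import pkgutil


def import_submodules(package_name):
    """Import every module and subpackage below package_name.

    Importing a module executes its class definitions, so subclasses become
    visible to inheritance-based discovery such as __subclasses__().
    Returns the dotted names of everything that was imported.
    """
    package = importlib.import_module(package_name)
    imported = []
    # walk_packages recurses into subpackages; the prefix yields full dotted names.
    for module_info in pkgutil.walk_packages(package.__path__, prefix=package.__name__ + "."):
        importlib.import_module(module_info.name)
        imported.append(module_info.name)
    return imported
```

Calling something like `import_submodules("nemesis")` (package name assumed) once before the registry is queried would ensure that nemeses defined in any submodule are found, at the cost of importing the whole package eagerly.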
Testing:
- Unit test: Update unit tests
- Longevity: Performance test, Short Longevity, Single nemesis run
Expected Impact:
- Future PRs can now split nemesis effortlessly
- Better structure of all nemesis related code
- No changes to actual tests and coverage
Objective: Bring existing documentation up to date
Implementation:
- Update existing claims about nemesis
- Add/update sections about how to run nemesis
- Add/update sections about how to develop nemesis
- Add/update section about the machinery behind nemesis
- NemesisRegistry, NodeAllocator, NemesisJobGenerator etc.
- Created comprehensive Nemesis Developer Guide at docs/nemesis.md
- Fixed outdated references in AGENTS.md, contrib.md, writing-plans skill
Expected Impact:
- Simpler entrypoint for new developers to understand the concept and implementation
Objective: Extract all nemeses into logically grouped modules outside of the main file
Implementation (Done for each group):
- Identify a group of nemesis to move
- Create module for the nemesis
- Can be either a single file or a directory, depending on the need
- Move the classes and all of the related code
- Improve code/util methods if needed
- Update documentation for the moved nemesis
- Include team ownership details
Expected Impact:
- Codebase with clear structure with reasonably sized files
- Better nemesis code by revisiting old nemesis