The tests mostly cover the golden paths, i.e. users getting steps right or triggering a message step. There's lots of untested code, which is easy to find by running the tests with coverage. Some of it is exercised by test_frontend, but that's slow and harder to measure coverage for, so we should have pure Python tests for it too. I'm particularly interested in the edge cases of bad code submissions from users, e.g.:
- Messing up an `ExerciseStep` in various ways, e.g. writing a solution that satisfies your own sample inputs but not the other tests (see the first sketch below).
- Messages in steps that are returned manually, i.e. `return dict(message='...')` instead of a `MessageStep`. These tests would be specific to such steps.
- Submissions that trigger `Disallowed` (see the second sketch below).
- Submissions that trigger linting errors.
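
As a rough illustration of the first point, here's a minimal, self-contained sketch of a pure Python test for a submission that only hard-codes the sample inputs. The real `ExerciseStep` machinery works differently; `check_submission`, `reference_double`, and `hardcoded_double` are stand-ins invented for this sketch, not names from the codebase.

```python
# Sketch only: `check_submission` stands in for whatever actually runs a
# user's solution against the official solution on sample + generated inputs.

import random


def check_submission(user_func, reference_func, sample_inputs, n_extra_tests=20):
    """Pass only if user_func matches reference_func on the sample inputs
    *and* on extra generated inputs."""
    inputs = list(sample_inputs)
    rng = random.Random(0)  # deterministic extra inputs for the sketch
    inputs += [rng.randint(-100, 100) for _ in range(n_extra_tests)]
    return all(user_func(x) == reference_func(x) for x in inputs)


def reference_double(x):
    return x * 2


def hardcoded_double(x):
    # A "bad" submission: satisfies the sample inputs but nothing else.
    return {1: 2, 2: 4}.get(x, 0)


def test_sample_only_solution_is_rejected():
    samples = [1, 2]
    assert check_submission(reference_double, reference_double, samples)
    assert not check_submission(hardcoded_double, reference_double, samples)
```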
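
Similarly, a sketch of what a `Disallowed`-style test could check: walk the submission's AST for a forbidden construct and return a message dict in the manual `return dict(message='...')` style mentioned above. `find_disallowed` and the message text are made up for illustration and are not the project's actual API.

```python
# Sketch only: not the real `Disallowed` mechanism, just the shape of a test
# asserting that a forbidden construct produces a message and clean code doesn't.

import ast


def find_disallowed(source, forbidden_node=ast.While):
    """Return a message dict if the submission uses a forbidden construct,
    otherwise an empty dict (meaning the check passes)."""
    try:
        tree = ast.parse(source)
    except SyntaxError:
        return dict(message="Your code has a syntax error.")
    for node in ast.walk(tree):
        if isinstance(node, forbidden_node):
            return dict(message="Don't use a while loop here, use a for loop.")
    return {}


def test_disallowed_construct_triggers_message():
    bad = "i = 0\nwhile i < 10:\n    print(i)\n    i += 1\n"
    good = "for i in range(10):\n    print(i)\n"
    assert "while loop" in find_disallowed(bad)["message"]
    assert find_disallowed(good) == {}
```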