|
| 1 | +# Agent Instructions |
| 2 | + |
| 3 | +Guidelines for AI agents working in the Swiss Data Science Center (SDSC) codebase. |
| 4 | + |
| 5 | +## General values |
| 6 | + |
| 7 | +- Explicit over Implicit: Code should be readable and obvious. Avoid "magic" logic. |
| 8 | +- Robustness: Prioritize error handling and edge cases over happy-path-only code. |
| 9 | +- Statelessness: Avoid mutable global state. Prefer pure functions. |
| 10 | +- Testability: Code must be designed to be easily tested (dependency injection, small units). |
| 11 | +- Conciseness: Prefer self-documenting code and terse documentation without filler. |
| 12 | +- Modularity: Aim for components with low coupling and high cohesion. |
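As an illustration of the Statelessness and Testability values, here is a minimal Python sketch that injects its clock instead of reading global state. The names `utc_now` and `build_report_name` are hypothetical and not part of any SDSC codebase.

```python
from datetime import datetime, timezone
from typing import Callable


def utc_now() -> datetime:
    """Return the current time in UTC (the default, production clock)."""
    return datetime.now(timezone.utc)


def build_report_name(prefix: str, now: Callable[[], datetime] = utc_now) -> str:
    """Build a dated report name; the clock is injected so tests can pin it."""
    stamp = now().strftime("%Y%m%d")
    return "{prefix}-{stamp}".format(prefix=prefix, stamp=stamp)
```

A test can pass a fixed clock, e.g. `build_report_name("run", now=lambda: datetime(2024, 1, 2, tzinfo=timezone.utc))`, without patching any globals.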
| 13 | + |
| 14 | +## Project structure |
| 15 | + |
| 16 | +When creating directories, default to our standard project structure, unless the project has an existing alternative. |
| 17 | + |
| 18 | +- `docs`: All documentation-related files. |
| 19 | +- `examples`: Examples showing how to use the software. |
| 20 | +- `external`: Imported third-party resources. |
| 21 | +- `src`: Where your source code lives. |
| 22 | +- `tools`: All configurations and scripts which are not part of the source. |
| 23 | + - `configs`: Configuration for tools (e.g. formatters, linters). |
| 24 | + - `images`: `Containerfile` definitions of OCI images. |
| 25 | + - `just`: `just` modules (e.g. `image.just`). |
| 26 | + - `nix`: Nix flake and code. |
| 27 | + - `ci`: CI-related tooling/scripts. |
| 28 | + - `scripts`: Additional scripts. |
| 29 | + |
| 30 | +## Toolchain |
| 31 | + |
| 32 | +Use the project's existing toolchain when available. Otherwise, default to the following. |
| 33 | + |
| 34 | +- `just` as command runner. |
| 35 | +- `nix` flakes for devShells. |
| 36 | +- `podman` for OCI images. |
| 37 | +- `prek` as git-hook manager. |
| 38 | +- `vendir` to manage external dependencies. |
| 39 | +- For Python components: `uv`, `ruff`. |
| 40 | +- For JavaScript components: `pnpm`. |
| 41 | + |
| 42 | +## Rules by Topic |
| 43 | + |
| 44 | +These general rules define our high-level development practices. |
| 45 | +For framework-specific recommendations, refer to specific skills. |
| 46 | + |
| 47 | +### Code Style |
| 48 | + |
| 49 | +- All imports/dependencies should be at the top of the file rather than inside functions. |
| 50 | +- Use named placeholders in format strings (`"{index}"`), not positional (`"{}"`). |
| 51 | +- Rule of three: at the third repetition, extract an abstraction. |
| 52 | +- Only the public, user-facing contract (API + config formats) needs stability. |
| 53 | + Internal interfaces may change freely; mark them as internal so the boundary is clear. |
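The named-placeholder rule can be sketched in Python as follows. The template and the function name `describe_row` are illustrative assumptions only.

```python
# Named placeholders stay readable and survive reordering of the template text.
ROW_TEMPLATE = "row {index}: {name}"


def describe_row(index: int, name: str) -> str:
    """Format a row label using named, not positional, placeholders."""
    return ROW_TEMPLATE.format(index=index, name=name)
```

With positional placeholders (`"row {}: {}"`), swapping the two fields in the template would silently swap the values. With names, the call site stays correct.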
| 54 | + |
| 55 | +### Error Handling |
| 56 | + |
| 57 | +Failures are part of a function's contract, expressed in its return/error type. |
| 58 | + |
| 59 | +- Never abort the process on recoverable conditions. Model failures as values or |
| 60 | + typed errors the caller can handle, not as abrupt termination. |
| 61 | +- Ban "assume success" shortcuts that convert a recoverable error into a crash |
| 62 | + or silently discard it (e.g. unwrapping a result unchecked, ignoring a returned error). |
| 63 | +- Propagate, don't swallow: attach context at each |
| 64 | + layer, and preserve the underlying cause so the root is recoverable from the trace. |
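A minimal Python sketch of "propagate, don't swallow", assuming a hypothetical `ConfigError` type and `load_config` helper. Each layer adds context and `raise ... from` preserves the underlying cause.

```python
import json
from pathlib import Path


class ConfigError(Exception):
    """Typed error callers can catch when a config file cannot be loaded."""


def load_config(path: Path) -> dict:
    """Load a JSON config file, raising ConfigError with the cause attached."""
    try:
        text = path.read_text(encoding="utf-8")
    except OSError as err:
        # Add context at this layer; the original OSError stays in __cause__.
        raise ConfigError(f"cannot read config file {path}") from err
    try:
        data = json.loads(text)
    except json.JSONDecodeError as err:
        raise ConfigError(f"invalid JSON in config file {path}") from err
    if not isinstance(data, dict):
        raise ConfigError(f"config file {path} must contain a JSON object")
    return data
```

The caller decides whether a `ConfigError` is fatal. The function itself never terminates the process.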
| 65 | + |
| 66 | +### Testing |
| 67 | + |
| 68 | +- Every component that generates or handles data carries correctness requirements. |
| 69 | +- Apply the cheapest verification (static analysis / type checking) to all code; |
| 70 | + treat its failures as build failures. |
| 71 | +- Use property-based tests when relevant: generate many inputs and assert |
| 72 | + invariants rather than hand-picked cases. |
| 73 | +- Where call order matters (e.g. stateful APIs), generate sequences of operations, |
| 74 | + not just independent inputs. |
| 75 | +- End-to-end tests verify composition, not logic: run the full pipeline and assert |
| 76 | + system-level invariants. Keep them few, focused on the seams between components. |
| 77 | +- Reserve example/unit tests for simple pure functions and for pinning regressions |
| 78 | + a stronger test uncovers. |
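The property-based style can be sketched with the standard library alone. This is a hand-rolled stand-in, not a replacement for a dedicated library such as Hypothesis, which adds input shrinking and richer generators. The run-length codec and all function names are hypothetical examples.

```python
import random


def rle_encode(text: str) -> list[tuple[str, int]]:
    """Encode text as a list of (character, run length) pairs."""
    runs: list[tuple[str, int]] = []
    for char in text:
        if runs and runs[-1][0] == char:
            runs[-1] = (char, runs[-1][1] + 1)
        else:
            runs.append((char, 1))
    return runs


def rle_decode(runs: list[tuple[str, int]]) -> str:
    """Invert rle_encode."""
    return "".join(char * count for char, count in runs)


def check_round_trip(seed: int, cases: int = 200) -> None:
    """Generate many inputs from a seed and assert invariants on each."""
    rng = random.Random(seed)  # explicit seed keeps failures reproducible
    for _ in range(cases):
        length = rng.randint(0, 20)
        text = "".join(rng.choice("ab") for _ in range(length))
        runs = rle_encode(text)
        # Invariant 1: decoding restores the input exactly.
        assert rle_decode(runs) == text
        # Invariant 2: adjacent runs never share a character.
        assert all(left[0] != right[0] for left, right in zip(runs, runs[1:]))
```

A small alphabet (`"ab"`) is a deliberate choice here: it makes long runs likely, which exercises the merging branch far more often than uniformly random text would.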
| 79 | + |
| 80 | +### Performance |
| 81 | + |
| 82 | +- Pre-allocate when the size is known. |
| 83 | +- Avoid allocations in hot paths. |
| 84 | +- Prefer algorithms with good worst-case bounds over ones that are only fast on average. |
| 85 | +- Do not optimize without profiling data. |
| 86 | + |
| 87 | +### Determinism |
| 88 | + |
| 89 | +When writing generators, output should be a pure function of (seed, config). |
| 90 | + |
| 91 | +- Use explicit seed from config. |
| 92 | +- Only iterate over ordered collections. |
| 93 | +- No ambient inputs (wall-clock time, unsorted filesystem listings, ...). |
| 94 | +- Don't let thread/task scheduling affect your outputs. |
| 95 | +- Pin all serialization options (key order, separators, encoding, float precision) |
| 96 | + so the same data always serializes to the same bytes. |
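A minimal Python sketch of a generator whose output depends only on (seed, config), with serialization options pinned. The config keys `count` and `labels` and both function names are assumptions made for illustration.

```python
import json
import random


def generate_records(seed: int, config: dict) -> list[dict]:
    """Generate records as a pure function of (seed, config)."""
    rng = random.Random(seed)  # local RNG: no shared global state
    labels = sorted(config["labels"])  # impose an order before iterating
    return [
        {
            "id": index,
            "label": rng.choice(labels),
            "value": round(rng.random(), 6),  # pinned float precision
        }
        for index in range(config["count"])
    ]


def serialize(records: list[dict]) -> bytes:
    """Serialize with key order, separators, and encoding pinned."""
    text = json.dumps(
        records, sort_keys=True, separators=(",", ":"), ensure_ascii=True
    )
    return text.encode("utf-8")
```

Because the labels are sorted before use, passing them as a list in any order or as a set yields the same bytes for the same seed.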
| 97 | + |
| 98 | +### Dependencies |
| 99 | + |
| 100 | +- Evaluate before adding: determinism, control, criticality, performance, vs |
| 101 | + maintenance/security risk and the long-term cost of owning a reimplementation. |
| 102 | +- Default to a proven, well-maintained dependency for solved, non-core problems. |
| 103 | +- Implement in-house only when a criterion genuinely fails and the surface is small |
| 104 | + enough to own, test, and verify. |
| 105 | +- Upgrade policy: by default, constrain each dependency to its current major version (the breaking-change boundary). |
| 106 | +- Reproducibility comes from a committed lockfile for every deployable. |
| 107 | +- Published libraries: set a low minimum version (floor) for each dependency to |
| 108 | + avoid over-constraining consuming libraries. |
| 109 | +- Move dependencies into relevant groups when the language allows (e.g. dev, docs, test). |
| 110 | + |
| 111 | +### Documentation |
| 112 | + |
| 113 | +- Document why, not what. The code shows what; comments explain why. |
| 114 | +- Use the language's documentation convention (doc comments / docstrings) at module |
| 115 | + level and on every public type and function. |
| 116 | +- Describe functions in the imperative mood ("Send payload", not "Sends a payload"). |
| 117 | + Document parameters, return value, and failure modes. |
| 118 | +- Prefer ASCII-only text in code and documentation. |
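As a sketch of these conventions in Python, the hypothetical function below uses an imperative-mood docstring that documents parameters, return value, and failure modes, plus a comment that explains why rather than what.

```python
def parse_port(text: str) -> int:
    """Parse a TCP port number from text.

    Args:
        text: String holding an integer; surrounding whitespace is allowed.

    Returns:
        The port as an integer in the range 1-65535.

    Raises:
        ValueError: If text cannot be parsed as an integer or is out of range.
    """
    port = int(text.strip())
    # Port 0 is rejected because it means "pick any free port", which would
    # hide configuration mistakes instead of reporting them.
    if not 1 <= port <= 65535:
        raise ValueError("port out of range: {port}".format(port=port))
    return port
```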
| 119 | + |
| 120 | +### Validation Workflow |
| 121 | + |
| 122 | +- Validation commands should be defined by the task runner (e.g. `make` or `just`). |
| 123 | +- If no command is provided, always run tools using the project configuration. |
| 124 | +- Run validation after every code change. |
| 125 | + |
| 126 | +| Task | Example Command | |
| 127 | +|------|---------| |
| 128 | +| Format | `just format` | |
| 129 | +| Lint | `make check` | |
| 130 | +| Test | `just test` | |
| 131 | + |
| 132 | +## Key Reminders |
| 133 | + |
| 134 | +1. Run validation after every change; never skip it. |
| 135 | +2. Use project recipes when available, not raw commands. |
| 136 | +3. Whenever making major code changes, make sure to update outdated documentation and tests. |
| 137 | + |
| 138 | +## Review Checklist |
| 139 | + |
| 140 | +When reviewing code: |
| 141 | + |
| 142 | +- [ ] Tests added/updated for new functionality |
| 143 | +- [ ] Error messages are actionable |
| 144 | +- [ ] No hardcoded paths or credentials |
| 145 | +- [ ] Documentation updated (README, docstrings, comment-based help) |
| 146 | +- [ ] Backward compatibility maintained |
| 147 | + |
| 148 | +--- |
| 149 | + |
| 150 | +*This file is optimized for both AI agents and human contributors. When in doubt, prioritize clarity and maintainability over cleverness.* |