# Geti Instant Learn - AI Assistant Rules

> **Note**: This file is synchronized with `.github/copilot-instructions.md`. Both Cursor and VS Code (GitHub Copilot) use these same standards.

## Table of Contents
- [Project Overview](#project-overview)
- [Coding Standards](#coding-standards)
  - [Python Environment](#python-environment-management)
  - [Python Code](#python-code)
  - [TypeScript/React](#typescriptreact-code)
  - [General Principles](#general-principles)
- [Writing Style](#writing-style)
- [Documentation Standards](#documentation-standards)
- [Testing](#testing-guidelines)
- [File Organization](#file-organization)
- [Git & PRs](#git-commit-messages)
- [Performance & Security](#performance-considerations)
- [AI/ML Guidelines](#aiml-specific-guidelines)

## Project Overview

Full-stack application with:
- **Backend**: Python FastAPI (`application/backend/`)
- **Frontend**: React/TypeScript (`application/ui/`)
- **Library**: Zero/Few-shot Vision and Text Prompt (`library/`)

## Coding Standards

### Python Environment Management
- **Always use `uv`** for package management and virtual environments
- Use `uv`-generated virtual environments (`.venv`)
- Install with `uv pip install` or `uv sync`
- Create environments with `uv venv`
- Never call bare `pip`; go through `uv pip`
- Ensure `.venv` is in `.gitignore`

### Python Code
- Follow PEP 8
- Use type hints for all functions
- Prefer `pathlib.Path` over string paths
- Use `ruff` for linting and formatting
- Address all ruff warnings

**Docstrings** - Google style format:
```python
def function_name(param1: str, param2: int) -> bool:
    """Brief description of function.

    Args:
        param1: Description of param1
        param2: Description of param2

    Returns:
        Description of return value

    Raises:
        ValueError: Description of when this is raised

    Examples:
        >>> result = function_name("test", 42)
        >>> print(result)
        True

        Multi-line example without prompt:

        from module import function_name

        result = function_name("test", 42)
        if result:
            print("Success")
    """
```

- Use `logging` instead of `print()`
- Prefer dataclasses or Pydantic models
- Use context managers for resource management
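
A minimal sketch of how the rules above can fit together: type hints, `pathlib.Path`, `logging`, a dataclass, and a context manager. `RunConfig` and `load_config` are hypothetical names used for illustration, not part of the project.

```python
import json
import logging
from dataclasses import dataclass
from pathlib import Path

logger = logging.getLogger(__name__)


@dataclass(frozen=True)
class RunConfig:
    """Hypothetical settings object, used only for illustration."""

    name: str
    batch_size: int = 8


def load_config(path: Path) -> RunConfig:
    """Read a JSON config file and return it as a RunConfig.

    Args:
        path: Location of the JSON file.

    Returns:
        The parsed configuration.

    Raises:
        FileNotFoundError: If the file does not exist.
    """
    # The context manager closes the file even if parsing fails.
    with path.open(encoding="utf-8") as handle:
        raw = json.load(handle)
    config = RunConfig(**raw)
    # Logging instead of print() keeps output configurable by the caller.
    logger.info("Loaded config %s from %s", config.name, path)
    return config
```

A Pydantic model could replace the dataclass where input validation matters.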

### TypeScript/React Code
- Use functional components with hooks
- Prefer named exports over default exports
- Use TypeScript strict mode with explicit types
- Follow component structure in `application/ui/src/`
- Use proper prop types and interfaces
- Implement error boundaries
- Use React Query for data fetching

### General Principles
- **DRY**: Extract common logic
- **Single Responsibility**: One clear purpose per function/class
- **Error Handling**: Handle errors with informative messages
- **Testing**: Write tests for new functionality
- **Security**: Use environment variables for secrets
- **Performance**: Consider implications, especially for ML operations
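
One way to combine the Security and Error Handling principles is to read secrets from the environment and fail with a clear message. This is a sketch; `read_secret` and the variable name in the usage line are hypothetical.

```python
import os


def read_secret(name: str) -> str:
    """Return a required secret from the environment.

    Args:
        name: Name of the environment variable.

    Returns:
        The non-empty value of the variable.

    Raises:
        RuntimeError: If the variable is missing or empty.
    """
    value = os.environ.get(name, "")
    if not value:
        # The message names the variable but never echoes a secret value.
        raise RuntimeError(f"Environment variable {name} is not set")
    return value
```

Usage: `token = read_secret("DEMO_API_TOKEN")`.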

## Writing Style

Apply to code comments, documentation, commit messages, and PR descriptions:

**Be Concise**
- Remove unnecessary words
- Avoid repeating ideas
- Use 10 words instead of 20
- Prefer short sentences

**Be Direct**
- State the point immediately
- Use active voice
- Remove hedging language ("may", "might", "could potentially")

**Use Simple Language**
- Choose simple words over complex ones
- Avoid jargon unless necessary
- Break complex sentences into shorter ones

**Sound Natural**
- Write as if explaining to a colleague
- Avoid formulaic transitions ("Furthermore", "Moreover", "Additionally")
- Don't use numbered lists when a paragraph works
- Avoid: "It is important to note that", "It should be mentioned", "It is worth noting"
- Vary sentence length naturally

**Academic But Accessible**
- Use technical terms when needed, explain domain-specific ones
- Prefer clarity over impressive vocabulary

**Examples:**

❌ "It is important to note that the workshop aims to establish a comprehensive platform that serves to bring together researchers from diverse backgrounds in order to facilitate meaningful collaboration."

✅ "The workshop brings together researchers from diverse backgrounds to facilitate collaboration."

❌ "The methodology demonstrates significant improvements in terms of performance metrics."

✅ "The method improves performance."

## Documentation Standards

**Code Comments**
- Write self-documenting code
- Add comments only for the "why", not the "what"
- Update comments when code changes

**README Files**
- Each major component needs a README.md
- Include: purpose, installation, usage examples, configuration, troubleshooting

**API Documentation**
- Document endpoints with OpenAPI/Swagger
- Include request/response examples
- Document error responses and status codes

**Inline Documentation**
- Use JSDoc for TypeScript/JavaScript
- Use docstrings for Python
- Document parameters, return values, exceptions
- Include usage examples for complex functions

## Testing Guidelines

**Python Tests**
- Use `pytest` with `uv` (e.g., `uv run pytest`)
- Place in `tests/unit/` and `tests/integration/`
- Name files: `test_<module_name>.py`
- Name functions: `test_<feature>_<scenario>`
- Use fixtures from `conftest.py`
- Aim for >80% coverage
- Mock external dependencies
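
A sketch of the naming and mocking rules above. `LabelClient` and `fetch_label_count` are hypothetical stand-ins for project code; in a real suite the function under test lives in the package and shared setup goes in `conftest.py`.

```python
from typing import Protocol
from unittest.mock import Mock


class LabelClient(Protocol):
    """Hypothetical interface of an external API client."""

    def get_labels(self, project_id: str) -> dict[str, list[str]]: ...


def fetch_label_count(client: LabelClient, project_id: str) -> int:
    """Hypothetical function under test: count labels for a project."""
    return len(client.get_labels(project_id)["labels"])


def test_fetch_label_count_returns_number_of_labels() -> None:
    # The external client is mocked, so no network call happens.
    client = Mock()
    client.get_labels.return_value = {"labels": ["cat", "dog"]}

    assert fetch_label_count(client, "project-1") == 2
    client.get_labels.assert_called_once_with("project-1")


def test_fetch_label_count_empty_project_returns_zero() -> None:
    client = Mock()
    client.get_labels.return_value = {"labels": []}

    assert fetch_label_count(client, "project-1") == 0
```

Saved as `tests/unit/test_labels.py` (an assumed file name), this runs with `uv run pytest`.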

**TypeScript Tests**
- Use Vitest for unit tests
- Use Playwright for E2E tests
- Name files: `<component>.test.ts(x)`
- Use descriptive `describe` and `it` blocks
- Mock API calls and dependencies

## File Organization

**Backend**
```
application/backend/src/
├── api/          # API routes and endpoints
├── core/         # Core business logic
├── db/           # Database models and migrations
├── repositories/ # Data access layer
├── schemas/      # Pydantic schemas
├── services/     # Business logic services
└── utils/        # Utility functions
```

**Frontend**
```
application/ui/src/
├── api/          # API client and hooks
├── components/   # Reusable UI components
├── features/     # Feature-specific code
├── routes/       # Page components
└── assets/       # Static assets
```

**Library**
```
library/src/instantlearn/
├── configs/      # Configuration management
├── data/         # Data loading and processing
├── inference/    # Inference engine
├── policy/       # Policy implementations
└── trainer/      # Training logic
```

## Git Commit Messages
Use conventional commits:
- `feat:` - new features
- `fix:` - bug fixes
- `docs:` - documentation changes
- `refactor:` - code refactoring
- `test:` - adding tests
- `chore:` - maintenance tasks

Write clear, concise messages. Reference issue numbers.
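
A small sketch of a check for the prefixes listed above. The pattern and `is_conventional` are illustrative, not an existing project hook. The optional `(scope)` part comes from the Conventional Commits format.

```python
import re

# Accepts the prefixes listed above, an optional "(scope)", then ": " and text.
COMMIT_PATTERN = re.compile(r"^(feat|fix|docs|refactor|test|chore)(\([\w-]+\))?: \S")


def is_conventional(subject: str) -> bool:
    """Return True if a commit subject line starts with a known prefix."""
    return COMMIT_PATTERN.match(subject) is not None
```

For example, `is_conventional("fix(ui): handle empty state")` passes and `is_conventional("added stuff")` does not.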

## Pull Request Guidelines
- Use the PR template (`.github/pull_request_template.md`)
- Fill out: Description, Type of Change, Related Issues, Changes Made, Examples, Breaking Changes
- Provide usage examples and before/after comparisons
- Follow conventional commit format for PR title
- **Tip**: Draft PRs in `tmp_PR_TEMPLATE_<branch-name>.md` for preview

## Performance Considerations
- Lazy load heavy dependencies
- Use async/await for I/O
- Implement caching
- Optimize database queries (indexes, avoid N+1)
- Profile before optimizing
- Consider memory usage for ML models
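
A sketch of lazy loading plus caching, assuming results depend only on the input. `embed_prompt` is a hypothetical function, and `hashlib` stands in for a heavy dependency such as a model runtime.

```python
from functools import lru_cache


@lru_cache(maxsize=128)
def embed_prompt(prompt: str) -> tuple[float, ...]:
    """Return a toy embedding for a prompt, cached per distinct prompt."""
    # Deferred import: the dependency loads on first call, not at startup.
    import hashlib

    digest = hashlib.sha256(prompt.encode("utf-8")).digest()
    # Four bytes scaled to [0, 1]; a real embedding would come from a model.
    return tuple(byte / 255 for byte in digest[:4])
```

`embed_prompt.cache_info()` reports hits and misses, which helps when profiling before optimizing. The `maxsize` bound keeps memory use predictable.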

## Security Best Practices
- Validate and sanitize inputs
- Use parameterized queries
- Implement authentication and authorization
- Store secrets in environment variables
- Keep dependencies updated
- Follow OWASP guidelines
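
A sketch of a parameterized query using the standard-library `sqlite3` module. The `projects` table and `find_project_names` are assumptions for illustration; the project's actual database layer may differ.

```python
import sqlite3


def find_project_names(connection: sqlite3.Connection, owner: str) -> list[str]:
    """Return project names for an owner using a parameterized query."""
    # The "?" placeholder makes the driver bind the value, so input such as
    # "x' OR '1'='1" is treated as data and never as SQL.
    rows = connection.execute(
        "SELECT name FROM projects WHERE owner = ? ORDER BY name",
        (owner,),
    ).fetchall()
    return [row[0] for row in rows]
```

Never build the query with string formatting or concatenation.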

## AI/ML Specific Guidelines
- Document model architectures and hyperparameters
- Version control training configurations
- Log training metrics and artifacts
- Handle model inference errors
- Consider inference latency and throughput
- Document model limitations and assumptions
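
A sketch of inference error handling with latency logging. `InferenceError` and `run_inference` are hypothetical names, and `predict` is any callable that maps features to a score.

```python
import logging
import time
from collections.abc import Callable, Sequence

logger = logging.getLogger(__name__)


class InferenceError(RuntimeError):
    """Hypothetical error type raised when a prediction fails."""


def run_inference(
    predict: Callable[[Sequence[float]], float],
    features: Sequence[float],
) -> float:
    """Call a model, log latency, and wrap failures in InferenceError."""
    start = time.perf_counter()
    try:
        result = predict(features)
    except Exception as error:
        # Log the traceback once, then raise a single well-known error type.
        logger.exception("Inference failed for input of length %d", len(features))
        raise InferenceError("Model prediction failed") from error
    latency_ms = (time.perf_counter() - start) * 1000
    logger.info("Inference took %.2f ms", latency_ms)
    return result
```

Callers then catch one exception type, and the original cause stays available through `__cause__`.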

## Questions to Consider Before Coding
1. Does this align with project architecture?
2. Can I reuse existing utilities/components?
3. How will this be tested?
4. What error cases need handling?
5. Are there performance implications?
6. Does this need documentation?
7. Are there security considerations?

## When Suggesting Code Changes
- Explain reasoning
- Consider backward compatibility
- Highlight breaking changes
- Suggest related test updates
- Note configuration changes
- Consider impact on existing functionality