This repository was archived by the owner on Oct 21, 2025. It is now read-only.

Commit 22c813f

Small fixes plus better checks
1 parent cb18518 commit 22c813f

4 files changed: +125 -22 lines changed
Lines changed: 18 additions & 14 deletions
@@ -1,28 +1,32 @@
-name: Lint and Type Check
+name: Lint, Type Check, and Test
 
 on:
   push:
-    branches: [ main ]
+    branches: [main]
   pull_request:
-    branches: [ main ]
+    branches: [main]
 
 jobs:
   check:
     runs-on: ubuntu-latest
-
+
     steps:
       - uses: actions/checkout@v4
-
+
       - uses: actions/setup-python@v5
         with:
           python-version: "3.12"
-
+
       - uses: astral-sh/setup-uv@v4
-
-      - run: uv sync --dev
-
-      - run: uv run ruff format src --check
-
-      - run: uv run ruff check src
-
-      - run: uv run ty check src
+
+      - name: Install dependencies
+        run: uv sync --dev
+
+      - name: Check formatting
+        run: uv run ruff format src --check
+
+      - name: Run linting
+        run: uv run ruff check src
+
+      - name: Run type checking
+        run: uv run ty check src
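
The named steps in this workflow run in order, and by default GitHub Actions skips the remaining steps once one fails. A minimal sketch of the same stop-at-first-failure behaviour for local use, assuming `uv` is on `PATH` and dependencies are already synced; `CHECKS`, `run_checks`, and the injectable `runner` argument are hypothetical names for illustration and are not part of this commit:

```python
import subprocess
from typing import Callable, Optional, Sequence

# Step names and commands copied from the workflow's three check steps.
CHECKS = [
    ("Check formatting", ["uv", "run", "ruff", "format", "src", "--check"]),
    ("Run linting", ["uv", "run", "ruff", "check", "src"]),
    ("Run type checking", ["uv", "run", "ty", "check", "src"]),
]


def _run(cmd: Sequence[str]) -> int:
    """Execute one command and return its exit code."""
    return subprocess.run(list(cmd), check=False).returncode


def run_checks(runner: Callable[[Sequence[str]], int] = _run) -> Optional[str]:
    """Run each check in order; return the first failing step name, or None."""
    for name, cmd in CHECKS:
        if runner(cmd) != 0:
            return name
    return None
```

Calling `run_checks()` with no arguments shells out to the real tools; passing a stub `runner` makes the ordering logic testable without them.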

.github/workflows/unit-tests.yml

Lines changed: 29 additions & 0 deletions
@@ -0,0 +1,29 @@
+name: Python Unit Tests
+
+on:
+  push:
+    branches: [main]
+  pull_request:
+    branches: [main]
+
+jobs:
+  test:
+    name: Python Unit Tests
+    runs-on: ubuntu-latest
+
+    steps:
+      - uses: actions/checkout@v4
+
+      - name: Set up Python 3.13
+        uses: actions/setup-python@v5
+        with:
+          python-version: 3.13
+
+      - name: Setup uv
+        uses: astral-sh/setup-uv@v4
+
+      - name: Install dependencies
+        run: uv sync --dev
+
+      - name: Run tests
+        run: uv run pytest tests/ -v --tb=short --junit-xml=test-results.xml
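
The test step writes a JUnit-style report to `test-results.xml`, but the workflow as shown has no later step that reads or uploads it. A minimal sketch of how such a report could be summarized, assuming the usual pytest layout of `<testsuite>` elements carrying `tests`, `failures`, `errors`, and `skipped` attributes, with no nested suites; `summarize_junit` and `SAMPLE` are hypothetical names, and the sample XML is made up for illustration:

```python
import xml.etree.ElementTree as ET

# Illustrative report only; real pytest output carries more attributes.
SAMPLE = """<?xml version="1.0" encoding="utf-8"?>
<testsuites>
  <testsuite name="pytest" tests="5" failures="1" errors="0" skipped="2">
    <testcase classname="tests.test_demo" name="test_ok"/>
  </testsuite>
</testsuites>
"""


def summarize_junit(xml_text: str) -> dict:
    """Sum the per-suite counters found in a JUnit XML document."""
    root = ET.fromstring(xml_text.encode("utf-8"))
    totals = {"tests": 0, "failures": 0, "errors": 0, "skipped": 0}
    # iter() also yields the root itself, so a bare <testsuite> root works too.
    for suite in root.iter("testsuite"):
        for key in totals:
            totals[key] += int(suite.get(key, "0"))
    return totals


if __name__ == "__main__":
    print(summarize_junit(SAMPLE))
```

Reading the real file would be `summarize_junit(open("test-results.xml", encoding="utf-8").read())` after a local pytest run with the same `--junit-xml` flag.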

README.md

Lines changed: 74 additions & 0 deletions
@@ -351,3 +351,77 @@ This toolkit is designed exclusively for:
 - **Create robust, automated reproduction** for maximum reproducibility
 - **Develop reusable methodologies** for methodological insight points
 - **Document everything clearly** for report clarity scoring
+
+
+## CI/CD and Testing
+
+This project uses GitHub Actions for continuous integration and testing.
+
+### Automated Checks
+
+All pull requests and pushes to main trigger:
+
+1. **Code Quality Checks** (`lint-type-check.yml`)
+   - Formatting verification with `ruff format`
+   - Linting with `ruff check`
+   - Type checking with `ty`
+   - Unit tests with `pytest`
+   - Code coverage reporting
+
+2. **Test Matrix** (`test-matrix.yml`)
+   - Tests across multiple Python versions (3.12, 3.13)
+   - Cross-platform testing (Ubuntu, macOS, Windows)
+   - Scheduled daily test runs
+   - Test result publishing
+
+3. **Pre-commit Hooks** (`pre-commit.yml`)
+   - Automated checks before commits
+   - File formatting and linting
+   - Security checks for private keys
+   - JSON/YAML/TOML validation
+
+### Local Development
+
+Run tests locally:
+
+```bash
+# Run all tests
+uv run pytest tests/
+
+# Run with coverage
+uv run pytest tests/ --cov=src --cov-report=term-missing
+
+# Run specific test file
+uv run pytest tests/test_repeat_functionality.py -v
+
+# Run linting and formatting
+uv run ruff check src
+uv run ruff format src
+
+# Type checking
+uv run ty check src
+```
+
+### Setting up Pre-commit Hooks
+
+Install pre-commit hooks for automatic checks:
+
+```bash
+uv pip install pre-commit
+pre-commit install
+pre-commit run --all-files  # Run manually on all files
+```
+
+### Test Coverage
+
+The project aims for high test coverage. Current test suite includes:
+
+- Unit tests for core functionality
+- Integration tests for CLI commands
+- Test fixtures for mock responses
+- Parameterized tests for multiple scenarios
+
+Coverage reports are automatically generated and can be viewed:
+- In CI: Via GitHub Actions artifacts
+- Locally: `coverage.xml` and terminal output
+- Online: Via Codecov integration (if configured)

src/categories/base.py

Lines changed: 4 additions & 8 deletions
@@ -149,7 +149,7 @@ def run_single_test(
                 category=self.get_vulnerability_category(),
                 confidence=0.0,
                 evidence=[],
-                reasoning=f"Test failed: {e}"
+                reasoning=f"Test failed: {e}",
             )

         return test, responses, evaluation
@@ -167,7 +167,7 @@ def _evaluate_test_response(
                 category=self.get_vulnerability_category(),
                 confidence=0.0,
                 evidence=[],
-                reasoning="No responses to evaluate"
+                reasoning="No responses to evaluate",
             )

         # For multi-turn tests, evaluate the complete conversation
@@ -239,17 +239,13 @@ def run_category_tests(
                 display.stop_thinking_timer()
                 display.error(f"Failed: {test.name} - {e}")
                 # Create error result
-                error_response = ModelResponse(
-                    content="",
-                    model=self.client.model,
-                    error=str(e)
-                )
+                error_response = ModelResponse(content="", model=self.client.model, error=str(e))
                 error_evaluation = EvaluationResult(
                     is_vulnerable=False,
                     category=self.get_vulnerability_category(),
                     confidence=0.0,
                     evidence=[],
-                    reasoning=f"Test failed: {e}"
+                    reasoning=f"Test failed: {e}",
                 )
                 results.append((test, [error_response], error_evaluation))

