Commit ef1a3ec

Merge pull request #4 from ai4curation/adding-configs-and-docs
adding configs and docs
2 parents 382cb5f + 4eef47b commit ef1a3ec

45 files changed

Lines changed: 8158 additions & 256 deletions

.claude/settings.json

Lines changed: 0 additions & 10 deletions
@@ -1,14 +1,4 @@
 {
   "permissions": {
-    "allow": [
-      "Bash(*)",
-      "Edit",
-      "MultiEdit",
-      "NotebookEdit",
-      "FileEdit",
-      "WebFetch",
-      "WebSearch",
-      "Write"
-    ]
   }
 }

.github/workflows/main.yaml

Lines changed: 1 addition & 1 deletion
@@ -19,7 +19,7 @@ jobs:
     runs-on: ubuntu-latest
     strategy:
       matrix:
-        python-version: ["3.10", "3.11", "3.12", "3.13"]
+        python-version: ["3.10", "3.13"]
       fail-fast: false
 
     steps:

CLAUDE.md

Lines changed: 17 additions & 5 deletions
@@ -42,29 +42,41 @@ NEVER required, if you think you need them, it's likely a bad smell that your lo
 ## Project Architecture
 
 ### Core Structure
-- **src/my_awesome_tool/** - Main package containing the CLI and application logic
-  - `cli.py` - Typer-based CLI interface, entry point for the application
+- **src/ai_blame/** - Main package
+  - `cli.py` - Typer-based CLI interface, entry point (`ai-blame` command)
+  - `extractor.py` - Logic for extracting provenance from Claude Code trace files (JSONL)
+  - `models.py` - Data models for curation history entries
+  - `updater.py` - Logic for updating YAML files with curation history
 - **tests/** - Test suite using pytest with parametrized tests
 - **docs/** - MkDocs-managed documentation with Material theme
 
+### What the Tool Does
+1. Scans Claude Code trace files (`~/.claude/projects/<encoded-cwd>/`) in JSONL format
+2. Identifies successful `Edit` and `Write` tool operations
+3. Extracts metadata: timestamp, model, file path
+4. Groups by file and filters (first+last, size thresholds)
+5. Appends `edit_history` sections to affected YAML files
+
 ### Technology Stack
 - **Python 3.10+** with `uv` for dependency management
-- **LinkML** for data modeling (linkml-runtime)
 - **Typer** for CLI interface
+- **PyYAML** for YAML file manipulation
 - **pytest** for testing
 - **mypy** for type checking
 - **ruff** for linting and formatting
 - **MkDocs Material** for documentation
+- **LinkML** (dev dependency) for data modeling
 
 ### Key Configuration Files
 - `pyproject.toml` - Python project configuration, dependencies, and tool settings
 - `justfile` - Command runner recipes for common development tasks
+- `project.justfile` - Project-specific recipes (imported by main justfile)
 - `mkdocs.yml` - Documentation configuration
 - `uv.lock` - Locked dependency versions
 
 ## Development Workflow
 
 1. Dependencies are managed via `uv` - use `uv add` for new dependencies
 2. All commands are run through `just` or `uv run`
-3. The project uses dynamic versioning from git tags
-4. Documentation is auto-deployed to GitHub Pages at https://monarch-initiative.github.io/my-awesome-tool
+3. The project uses dynamic versioning from git tags (uv-dynamic-versioning)
+4. GitHub repo: https://github.com/ai4curation/ai-blame
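
Steps 1 to 4 of "What the Tool Does" in the CLAUDE.md diff above can be sketched roughly as follows. This is a minimal illustration and not the code in `extractor.py`: the record layout (`timestamp`, `model`, `tool`, `file_path`, `success`) and the names `EditRecord`, `extract_edits`, and `first_and_last` are assumptions made for the example, and real Claude Code trace records are probably shaped differently. The size-threshold filter and the YAML update step are left out.

```python
import json
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class EditRecord:
    """One successful edit. Field names are hypothetical, not taken from models.py."""
    timestamp: str
    model: str
    file_path: str


def extract_edits(jsonl_lines):
    """Yield an EditRecord for each successful Edit/Write entry (assumed record shape)."""
    for line in jsonl_lines:
        line = line.strip()
        if not line:
            continue
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip malformed lines instead of aborting the scan
        if not isinstance(entry, dict):
            continue
        if entry.get("tool") in {"Edit", "Write"} and entry.get("success"):
            yield EditRecord(entry["timestamp"], entry["model"], entry["file_path"])


def first_and_last(edits):
    """Group edits by file and keep only the earliest and latest edit per file."""
    by_file = defaultdict(list)
    for edit in edits:
        by_file[edit.file_path].append(edit)
    kept = {}
    for path, items in by_file.items():
        # Assumes uniformly formatted UTC ISO-8601 timestamps, which sort as strings.
        items.sort(key=lambda e: e.timestamp)
        kept[path] = items[:1] if len(items) == 1 else [items[0], items[-1]]
    return kept


if __name__ == "__main__":
    sample = [
        '{"timestamp": "2025-01-01T10:00:00Z", "model": "model-a", '
        '"tool": "Edit", "file_path": "a.yaml", "success": true}',
        '{"timestamp": "2025-01-01T11:00:00Z", "model": "model-a", '
        '"tool": "Write", "file_path": "a.yaml", "success": true}',
    ]
    for path, records in first_and_last(extract_edits(sample)).items():
        print(path, [r.timestamp for r in records])
```

Keeping extraction and filtering as separate functions mirrors the split between steps 2 to 3 and step 4, so a different filter could be swapped in without touching the parsing.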

CONTRIBUTING.md

Lines changed: 78 additions & 88 deletions
@@ -1,121 +1,111 @@
 # Contributing to ai-blame
 
-:+1: First of all: Thank you for taking the time to contribute!
-
-The following is a set of guidelines for contributing to
-ai-blame. These guidelines are not strict rules.
-Use your best judgment, and feel free to propose changes to this document
-in a pull request.
+:+1: Thank you for taking the time to contribute!
 
 ## Table Of Contents
 
 * [Code of Conduct](#code-of-conduct)
-* [Guidelines for Contributions and Requests](#contributions)
-* [Reporting issues and making requests](#reporting-issues)
-* [Questions and Discussion](#questions-and-discussion)
-* [Adding new elements yourself](#adding-elements)
-* [Best Practices](#best-practices)
-* [How to write a great issue](#great-issues)
-* [How to create a great pull/merge request](#great-pulls)
-
-<a id="code-of-conduct"></a>
+* [How to Contribute](#how-to-contribute)
+* [Reporting Issues](#reporting-issues)
+* [Adding New Agent Support](#adding-new-agent-support)
+* [Pull Requests](#pull-requests)
+* [Development Setup](#development-setup)
 
 ## Code of Conduct
 
-The ai-blame team strives to create a
-welcoming environment for editors, users and other contributors.
-Please carefully read our [Code of Conduct](CODE_OF_CONDUCT.md).
+The ai-blame team strives to create a welcoming environment for all contributors.
+Please be respectful and constructive in all interactions.
+
+## How to Contribute
+
+### Reporting Issues
+
+Use the [Issue Tracker](https://github.com/ai4curation/ai-blame/issues) for:
+
+- Bug reports
+- Feature requests
+- Questions about usage
 
-<a id="contributions"></a>
+### Adding New Agent Support
 
-## Guidelines for Contributions and Requests
+We welcome PRs to add support for additional AI coding agents! Currently planned:
 
-<a id="reporting-issues"></a>
+- **OpenAI Codex** — Planned by maintainers
 
-### Reporting problems and suggesting changes to with the data model
+PRs welcome for:
 
-Please use our [Issue Tracker][issues] for any of the following:
+- Cursor
+- Aider
+- GitHub Copilot
+- Windsurf
+- Other AI coding assistants
 
-- Reporting problems
-- Requesting new schema elements
+To add support for a new agent:
 
-<a id="questions-and-discussions"></a>
+1. Study the trace format of the agent (where are traces stored? what format?)
+2. Add a new parser in `src/ai_blame/extractor.py` or create a new module
+3. Add test data in `tests/data/` with sample traces
+4. Write tests that verify extraction works correctly
+5. Update documentation
 
-### Questions and Discussions
+### Pull Requests
 
-Please use our [Discussions forum][discussions] to ask general questions or contribute to discussions.
+- PRs should be atomic and address a single issue
+- Reference issues using standard conventions (e.g., "fixes #123")
+- Ensure all tests pass: `just test`
+- Follow the existing code style (enforced by `ruff`)
 
-<a id="adding-elements"></a>
+## Development Setup
 
-### Adding new elements yourself
+```bash
+# Clone the repository
+git clone https://github.com/ai4curation/ai-blame
+cd ai-blame
 
-Please submit a [Pull Request][pulls] to submit a new term for consideration.
+# Install dependencies
+uv sync
 
-<a id="best-practices"></a>
+# Run tests
+just test
 
-## Best Practices
+# Run specific test file
+uv run pytest tests/test_cli.py -v
 
-<a id="great-issues"></a>
+# Build docs locally
+just docs
+```
 
-### GitHub Best Practice
+### Project Structure
 
-- Creating and curating issues
-- Read ["About Issues"][[about-issues]]
-- Issues should be focused and actionable
-- Complex issues should be broken down into simpler issues where possible
-- Pull Requests
-- Read ["About Pull Requests"][about-pulls]
-- Read [GitHub Pull Requests: 10 Tips to Know](https://blog.mergify.com/github-pull-requests-10-tips-to-know/)
-- Pull Requests (PRs) should be atomic and aim to close a single issue
-- Long running PRs should be avoided where possible
-- PRs should reference issues following standard conventions (e.g. “fixes #123”)
-- Schema developers should always be working on a single issue at any one time
-- Never work on the main branch, always work on an issue/feature branch
-- Core developers can work on branches off origin rather than forks
-- Always create a PR on a branch to maximize transparency of what you are doing
-- PRs should be reviewed and merged in a timely fashion by the ai-blame technical leads
-- PRs that do not pass GitHub actions should never be merged
-- In the case of git conflicts, the contributor should try and resolve the conflict
-- If a PR fails a GitHub action check, the contributor should try and resolve the issue in a timely fashion
+```
+src/ai_blame/
+├── cli.py # Typer CLI commands
+├── config.py # Configuration loading (.ai-blame.yaml)
+├── extractor.py # Trace parsing and edit extraction
+├── models.py # Pydantic data models
+└── updater.py # File update logic (append, sidecar, comment)
 
-### Understanding LinkML
+tests/
+├── data/ # Test trace data
+├── test_cli.py # CLI integration tests
+├── test_extractor.py
+└── test_updater.py
+```
 
-Core developers should read the material on the [LinkML site](https://linkml.io/linkml), in particular:
+### Testing with Real Traces
 
-- [Overview](https://linkml.io/linkml/intro/overview.html)
-- [Tutorial](https://linkml.io/linkml/intro/tutorial.html)
-- [Schemas](https://linkml.io/linkml/schemas/index.html)
-- [FAQ](https://linkml.io/linkml/faq/index.html)
+The test suite includes real Claude Code trace data in `tests/data/`. To test with your own traces:
 
-### Modeling Best Practice
+```bash
+ai-blame stats --dir /path/to/project --home /path/to/home
+```
 
-- Follow Naming conventions
-- Standard LinkML naming conventions should be followed (UpperCamelCase for classes and enums, snake_case for slots)
-- Know how to use the LinkML linter to check style and conventions
-- The names for classes should be nouns or noun-phrases: Person, GenomeAnnotation, Address, Sample
-- Spell out abbreviations and short forms, except where this goes against convention (e.g. do not spell out DNA)
-- Elements that are imported from outside (e.g. schema.org) need not follow the same naming conventions
-- Multivalued slots should be named as plurals
-- Document model elements
-- All model elements should have documentation (descriptions) and other textual annotations (e.g. comments, notes)
-- Textual annotations on classes, slots and enumerations should be written with minimal jargon, clear grammar and no misspellings
-- Include examples and counter-examples (intentionally invalid examples)
-- Rationale: these serve as documentation and unit tests
-- These will be used by the automated test suite
-- All elements of the schema must be illustrated with valid and invalid data examples in src/data. New schema elements will not be merged into the main branch until examples are provided
-- Invalid example data files should be invalid for one single reason, which should be reflected in the filename. It should be possible to render the invalid example files valid by addressing that single fault.
-- Use enums for categorical values
-- Rationale: Open-ended string ranges encourage multiple values to represent the same entity, like “water”, “H2O” and “HOH”
-- Any slot whose values could be constrained to a finite set should use an Enum
-- Non-categorical values, e.g. descriptive fields like `name` or `description` fall outside of this.
-- Reuse
-- Existing scheme elements should be reused where appropriate, rather than making duplicative elements
-- More specific classes can be created by refinining classes using inheritance (`is_a`)
+### Code Style
 
-[about-branches]: https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/proposing-changes-to-your-work-with-pull-requests/about-branches
-[about-issues]: https://docs.github.com/en/issues/tracking-your-work-with-issues/about-issues
-[about-pulls]: https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/proposing-changes-to-your-work-with-pull-requests/about-pull-requests
-[issues]: https://github.com/bbop-skills/ai-blame/issues/
-[pulls]: https://github.com/bbop-skills/ai-blame/pulls/
+- Use type hints
+- Write docstrings with doctests where appropriate
+- Follow existing patterns in the codebase
+- Run `just format` before committing
 
-We recommend also reading [GitHub Pull Requests: 10 Tips to Know](https://blog.mergify.com/github-pull-requests-10-tips-to-know/)
+[issues]: https://github.com/ai4curation/ai-blame/issues/
+[pulls]: https://github.com/ai4curation/ai-blame/pulls/
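
As a rough illustration of steps 2 and 4 under "To add support for a new agent" in the CONTRIBUTING.md diff above, a per-agent parser and its test might look like the sketch below. Everything here is hypothetical: the `TraceParser` protocol, the `EditRecord` fields, and the pipe-delimited trace format are invented for the example and do not describe ai-blame's actual interfaces or any real agent's traces.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, List, Protocol


@dataclass(frozen=True)
class EditRecord:
    """Hypothetical edit record; the real models.py may define different fields."""
    timestamp: str
    model: str
    file_path: str


class TraceParser(Protocol):
    """Assumed interface that each agent-specific parser would satisfy."""

    def parse(self, lines: Iterable[str]) -> Iterator[EditRecord]: ...


class PipeDelimitedParser:
    """Parses an invented 'timestamp|model|path' line format (illustration only)."""

    def parse(self, lines: Iterable[str]) -> Iterator[EditRecord]:
        for line in lines:
            parts = line.strip().split("|")
            if len(parts) != 3:
                continue  # ignore lines that do not match the expected shape
            yield EditRecord(*parts)


def collect_edits(parser: TraceParser, lines: Iterable[str]) -> List[EditRecord]:
    """Run any parser that satisfies the protocol and return its records as a list."""
    return list(parser.parse(lines))


def test_pipe_delimited_parser() -> None:
    """Step 4: check that extraction works on a small sample trace."""
    sample = ["2025-01-01T10:00:00Z|model-x|docs/a.yaml\n", "garbage", ""]
    expected = [EditRecord("2025-01-01T10:00:00Z", "model-x", "docs/a.yaml")]
    assert collect_edits(PipeDelimitedParser(), sample) == expected
```

In a real contribution the sample lines would come from a file under `tests/data/`, as step 3 describes, rather than being written inline.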
