
Commit 565df7b

Modernize build system and update installation docs
Replaces setup.py with pyproject.toml for PEP 517/518 compliance, updates all documentation to recommend modern pip/conda workflows, and clarifies Python version support (now 3.10+). Updates CI to use the build tool, adds CLAUDE.md for Claude-specific guidance, and revises installation and development instructions across README and docs for clarity and modern best practices.
1 parent 0718d95 commit 565df7b

File tree

7 files changed: +635 additions, −82 deletions

.github/copilot-instructions.md

Lines changed: 164 additions & 0 deletions
@@ -0,0 +1,164 @@
# Copilot Instructions for replay_trajectory_classification

## Repository Summary

`replay_trajectory_classification` is a Python package for decoding spatial position from neural activity and categorizing trajectory types, specifically designed for analyzing hippocampal replay events in neuroscience research. The package provides state-space models that can decode position from both spike-sorted cells and clusterless spikes, with support for GPU acceleration and complex 1D/2D environments.

## High-Level Repository Information

- **Size**: ~63 MB with 28 Python files
- **Type**: Scientific Python package for computational neuroscience
- **Primary Language**: Python 3.10+ (configured for Python 3.13 in the current environment)
- **Key Dependencies**: NumPy, SciPy, scikit-learn, numba, xarray, dask, matplotlib, pandas
- **Documentation**: Sphinx-based documentation hosted on ReadTheDocs
- **License**: MIT

## Environment Setup and Build Instructions

### Prerequisites

Always use conda for environment management due to complex scientific dependencies:

```bash
# Update conda first (required)
conda update -n base conda

# Create environment from environment.yml (required)
conda env create -f environment.yml

# Activate environment
conda activate replay_trajectory_classification
```

### Installation Commands

**ALWAYS install in development mode for code changes:**

```bash
# Modern development installation (recommended)
pip install -e .

# With optional development tools
pip install -e '.[dev]'   # Includes ruff, jupyter, testing tools
pip install -e '.[test]'  # Testing dependencies only
pip install -e '.[docs]'  # Documentation building tools
```

**Note**: The repository has been fully modernized to use `pyproject.toml`. The old `setup.py develop` command is no longer available.
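
The optional-dependency groups used above are declared in `pyproject.toml` under PEP 621 metadata. A minimal sketch of that layout (illustrative only — the repository's real file pins the full dependency lists):

```toml
[project]
name = "replay_trajectory_classification"
requires-python = ">=3.10"

[project.optional-dependencies]
dev = ["ruff", "jupyter"]        # development tools (illustrative subset)
test = ["jupyter", "nbconvert"]  # notebook-based test runner
docs = ["sphinx"]                # documentation build
```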
### Validation Commands

#### Package Import Test

```bash
python -c "import replay_trajectory_classification; print('Package imported successfully')"
```

**Expected output**: "Cupy is not installed or GPU is not detected. Ignore this message if not using GPU" followed by "Package imported successfully"
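
A quick programmatic variant of the import check — a sketch using only the standard library; `find_spec` confirms the package is installed without importing the heavy scientific stack:

```python
import importlib.util

def is_installed(package: str) -> bool:
    # True if the top-level package can be found on the current sys.path
    return importlib.util.find_spec(package) is not None

if is_installed("replay_trajectory_classification"):
    print("Package is importable")
else:
    print("Package not found -- run `pip install -e .` inside the conda env")
```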
#### Linting

```bash
# Modern linting (preferred)
ruff check replay_trajectory_classification/

# Legacy flake8 still works
flake8 replay_trajectory_classification/ --max-line-length=88 --select=E9,F63,F7,F82 --show-source --statistics
```

**Expected**: Minimal output (style issues are non-breaking)
#### Notebook Testing (CI Validation)

The main test suite runs Jupyter notebooks. Test individual notebooks with:

```bash
jupyter nbconvert --to notebook --ExecutePreprocessor.kernel_name=python3 --execute notebooks/tutorial/01-Introduction_and_Data_Format.ipynb --output-dir=/tmp
```

**Time required**: ~2-3 minutes per notebook
**Expected**: Notebook executes without errors
#### Documentation Build

```bash
# First install docs dependencies
pip install -r docs/requirements-docs.txt

# Note: the documentation build has a jupytext dependency issue in the Makefile;
# the docs can be built but may require manual intervention
```
## Continuous Integration

The repository uses GitHub Actions (`.github/workflows/PR-test.yml`):

- **Trigger**: All pushes
- **OS**: Ubuntu latest only
- **Python**: 3.11 (but environment.yml uses current conda defaults)
- **Test Process**: Executes all 5 tutorial notebooks sequentially
- **Environment**: Uses conda with channels: conda-forge, franklab, edeno
- **Installation**: `pip install -e .` after conda environment setup
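
A condensed sketch of what such a workflow looks like. This is illustrative — the step names and action versions are assumptions, not the verbatim file; only the `shell` and `pip install` lines are taken from this commit's diff:

```yaml
# Illustrative sketch of .github/workflows/PR-test.yml -- not the verbatim file
name: PR Test
on: [push]

jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Set up conda environment          # assumed step name
        uses: conda-incubator/setup-miniconda@v3
        with:
          environment-file: environment.yml
          python-version: "3.11"
      - name: Install package
        shell: bash -l {0}
        run: |
          python -V
          pip install --upgrade pip build
          pip install -e .
      # Execute tutorial notebooks to ensure they run end-to-end
```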
## Project Architecture and Layout

### Core Package Structure (`replay_trajectory_classification/`)

- **`__init__.py`**: Main API exports (ClassifierBase, Decoders, Environment, etc.)
- **`classifier.py`**: Base classes for trajectory classification with both sorted/clusterless approaches
- **`decoder.py`**: Core decoding functionality
- **`environments.py`**: Spatial environment representation with discrete grids
- **`core.py`**: Low-level computational functions
- **`likelihoods/`**: Subpackage with various likelihood models (KDE, GLM, multiunit, GPU variants)

### Key Configuration Files

- **`environment.yml`**: Conda environment specification with the scientific computing stack
- **`pyproject.toml`**: Package configuration and dependencies
- **`.readthedocs.yaml`**: Documentation build configuration
- **`docs/conf.py`**: Sphinx documentation configuration
- **`docs/requirements-docs.txt`**: Documentation build dependencies

### Documentation (`docs/`)

- **Sphinx-based** with ReadTheDocs hosting
- **API docs**: Auto-generated from docstrings
- **Installation guide**: `installation.md`
- **Build system**: Makefile (but has jupytext dependency issues)
### Tutorials (`notebooks/tutorial/`)

Five comprehensive Jupyter notebooks demonstrate package usage:

1. **01-Introduction_and_Data_Format.ipynb**: Data format requirements
2. **02-Decoding_with_Sorted_Spikes.ipynb**: Single movement model with sorted spikes
3. **03-Decoding_with_Clusterless_Spikes.ipynb**: Single movement model with the clusterless approach
4. **04-Classifying_with_Sorted_Spikes.ipynb**: Multiple movement models with sorted spikes
5. **05-Classifying_with_Clusterless_Spikes.ipynb**: Multiple movement models with clusterless spikes

### Dependencies Not Obvious from Structure

- **track_linearization**: External package for spatial track handling (imported in `__init__.py`)
- **regularized_glm**: Custom GLM implementation
- **GPU dependencies**: CuPy for GPU acceleration (optional)
- **franklab & edeno conda channels**: Required for specialized neuroscience packages
## Important Development Notes

### Environment Requirements

- **ALWAYS** use the conda environment - pip-only installations will fail due to complex scientific dependencies
- **GPU support** requires CuPy installation (optional; warnings are normal without a GPU)
- **Documentation builds** may require manual intervention due to jupytext path issues

### Testing Approach

- **Integration testing**: All 5 tutorial notebooks must execute successfully
- **CI dependency**: Notebooks test real scientific workflows, not isolated functions

### Common Issues and Workarounds

- **Documentation build**: The Makefile expects jupytext on the PATH but may not find the conda environment's version
- **Legacy packaging warnings**: Deprecation warnings from older tooling are expected, but installation succeeds
- **GPU warnings**: "Cupy not installed" messages are normal for CPU-only environments
- **Long notebook execution**: Tutorial notebooks can take 2-3 minutes each to execute
### File Exclusions (from .gitignore)

Key files to exclude from commits:

- Jupyter checkpoint files (`.ipynb_checkpoints`)
- Build artifacts (`_build`, `_autosummary`, `dist/`)
- Data files (`*.mat`, `*.csv`, `*.nc`)
- Cache files (`__pycache__`, `*.prof`)
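
As a `.gitignore` fragment, the exclusions above look roughly like this (illustrative — the repository's actual file may contain additional patterns):

```gitignore
# Jupyter checkpoints
.ipynb_checkpoints/

# Build artifacts
_build/
_autosummary/
dist/

# Data files
*.mat
*.csv
*.nc

# Cache and profiling files
__pycache__/
*.prof
```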
## Validation Checklist for Changes

1. **Environment setup**: The conda environment creates successfully
2. **Installation**: `pip install -e .` succeeds
3. **Import test**: Package imports without errors (GPU warnings OK)
4. **Lint check**: flake8 passes with the specified parameters
5. **Notebook execution**: All tutorial notebooks run successfully
6. **CI compatibility**: Changes don't break the GitHub Actions workflow

## Final Note

This package serves active neuroscience research. Changes should maintain scientific accuracy and computational efficiency. The codebase prioritizes correctness over traditional software-engineering practices (hence the notebook-based testing). Trust these instructions and only search for additional information if specific technical details are missing or incorrect.

.github/workflows/PR-test.yml

Lines changed: 1 addition & 1 deletion
@@ -71,7 +71,7 @@ jobs:
         shell: bash -l {0}
         run: |
           python -V
-          pip install --upgrade pip
+          pip install --upgrade pip build
           pip install -e .

 # Execute tutorial notebooks to ensure they run end-to-end

CLAUDE.md

Lines changed: 180 additions & 0 deletions
@@ -0,0 +1,180 @@
# CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

## Project Overview

`replay_trajectory_classification` is a Python package for decoding spatial position from neural activity and categorizing trajectory types in hippocampal replay events. This is a computational neuroscience package that prioritizes scientific accuracy and computational efficiency.

## Essential Commands

### Environment Setup (REQUIRED)

Always use conda - pip-only installations will fail due to complex scientific dependencies:

```bash
# Update conda first (required)
conda update -n base conda

# Create environment from environment.yml
conda env create -f environment.yml

# Activate environment
conda activate replay_trajectory_classification

# Development installation (required for code changes)
pip install -e .

# Install with optional dependencies
pip install -e '.[dev]'   # Development tools (ruff, jupyter, etc.)
pip install -e '.[test]'  # Testing tools
pip install -e '.[docs]'  # Documentation tools
```
### Validation Commands

Test the package installation:

```bash
python -c "import replay_trajectory_classification; print('Package imported successfully')"
```

*Note: "Cupy not installed" warnings are normal without a GPU*
Lint check (modern):

```bash
ruff check replay_trajectory_classification/
```

Legacy flake8 also works:

```bash
flake8 replay_trajectory_classification/ --max-line-length=88 --select=E9,F63,F7,F82 --show-source --statistics
```
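
Since this commit configures ruff in `pyproject.toml`, the equivalent of the flake8 flags above would be declared roughly like this — a sketch; the rule selection actually committed to the repository may differ:

```toml
[tool.ruff]
line-length = 88                     # matches the flake8 --max-line-length flag

[tool.ruff.lint]
select = ["E9", "F63", "F7", "F82"]  # error-level checks, as in the flake8 call
```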
### Testing

The main test suite runs the tutorial notebooks (takes ~10-15 minutes total):

```bash
# Test an individual notebook
jupyter nbconvert --to notebook --ExecutePreprocessor.kernel_name=python3 --execute notebooks/tutorial/01-Introduction_and_Data_Format.ipynb --output-dir=/tmp

# Test all notebooks (CI equivalent)
for nb in notebooks/tutorial/*.ipynb; do
    jupyter nbconvert --to notebook --inplace --ExecutePreprocessor.kernel_name=python3 --ExecutePreprocessor.timeout=1800 --execute "$nb"
done
```
## Architecture Overview

### Core Package Structure (`replay_trajectory_classification/`)

**Main API Classes** (exported in `__init__.py`):

- `SortedSpikesClassifier`/`Decoder`: For spike-sorted neural data
- `ClusterlessClassifier`/`Decoder`: For clusterless (unsorted) neural data
- `Environment`: Spatial environment representation with discrete grids
- Movement models: `RandomWalk`, `EmpiricalMovement`, `Identity`, etc.
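
A hypothetical usage sketch of these classes (pseudocode — consult the tutorial notebooks for the actual signatures and data shapes):

```
# pseudocode -- illustrative only, not the verified API
decoder = SortedSpikesDecoder(environment=Environment(...),
                              transition_type=RandomWalk())
decoder.fit(position, spikes)      # learn the encoding model / place fields
results = decoder.predict(spikes)  # posterior probability of position over time
```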
**Key Modules**:

- `classifier.py`: Base classes for trajectory classification (~49 KB)
- `decoder.py`: Core decoding functionality
- `environments.py`: Spatial environment handling (~33 KB)
- `core.py`: Low-level computational functions
- `likelihoods/`: Subpackage with various likelihood models

**Likelihood Algorithms** (`likelihoods/`):

- **Sorted spikes**: GLM-based (`spiking_likelihood_glm.py`), KDE-based (`spiking_likelihood_kde.py`)
- **Clusterless**: Multiunit likelihood variants (`multiunit_likelihood*.py`)
- **GPU support**: `*_gpu.py` variants require CuPy installation
- **Calcium imaging**: `calcium_likelihood.py`
### Dependencies Architecture

**Essential External Packages**:

- `track_linearization`: Spatial track handling (imported in `__init__.py`)
- `regularized_glm`: Custom GLM implementation
- Standard scientific stack: NumPy, SciPy, scikit-learn, pandas, xarray

**Conda Channels Required**:

- `conda-forge`: Standard scientific packages
- `franklab`: Lab-specific neuroscience tools
- `edeno`: Author's specialized packages
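
These channels are declared in `environment.yml`. A skeleton consistent with the list above (illustrative — the real file pins many more packages and versions):

```yaml
# Illustrative environment.yml skeleton -- not the verbatim repository file
name: replay_trajectory_classification
channels:
  - conda-forge
  - franklab
  - edeno
dependencies:
  - python>=3.10
  - numpy
  - scipy
  - scikit-learn
  - track_linearization   # from the franklab/edeno channels
  - regularized_glm
```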
## Development Guidelines

### Code Patterns

- **State-space models**: The core pattern throughout the codebase
- **Fit/Estimate pattern**: Likelihood functions use `fit_*` (training) and `estimate_*` (inference)
- **GPU/CPU variants**: Many functions have `*_gpu` equivalents requiring CuPy
- **Clusterless vs. Sorted**: Dual pathways throughout for the two data types
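
The fit/estimate split can be illustrated with a toy example (hypothetical names — this is not the package's actual code, just the naming convention it follows):

```python
import math

# Toy illustration of the fit_*/estimate_* naming pattern:
# fit_* learns parameters from training data,
# estimate_* evaluates new observations under the fitted model.

def fit_mean_rate(spike_counts):
    """'Training' step: mean firing rate from spike counts per time bin."""
    return sum(spike_counts) / len(spike_counts)

def estimate_poisson_log_likelihood(count, rate):
    """'Inference' step: Poisson log-likelihood of a held-out count."""
    return count * math.log(rate) - rate - math.lgamma(count + 1)

rate = fit_mean_rate([2, 3, 1, 4, 2])               # fitted parameter: 2.4 spikes/bin
log_lik = estimate_poisson_log_likelihood(3, rate)  # evaluate new data
```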
### Critical Requirements

1. **Always use the conda environment** - complex scientific dependencies
2. **Development-mode installation** - `pip install -e .` for code changes
3. **Notebook-based testing** - integration tests via the tutorial notebooks
4. **Scientific accuracy first** - computational correctness over traditional software practices

### Common Issues

- **GPU warnings**: "Cupy not installed" messages are normal for CPU-only setups
- **Long notebook execution**: Tutorial notebooks take 2-3 minutes each
- **Legacy packaging warnings**: Deprecation warnings from older tooling are expected, but installation succeeds
- **Documentation builds**: May require manual intervention due to jupytext dependency issues
### Files to Never Modify

- Tutorial notebooks in `notebooks/tutorial/` - these are the integration tests
- `environment.yml` - carefully balanced scientific dependencies
- The version string in `replay_trajectory_classification/__init__.py` - managed by maintainers

## Testing Strategy

**Primary Testing**: Execute all 5 tutorial notebooks sequentially

1. `01-Introduction_and_Data_Format.ipynb`: Data format requirements
2. `02-Decoding_with_Sorted_Spikes.ipynb`: Basic sorted-spike decoding
3. `03-Decoding_with_Clusterless_Spikes.ipynb`: Basic clusterless decoding
4. `04-Classifying_with_Sorted_Spikes.ipynb`: Multi-model classification
5. `05-Classifying_with_Clusterless_Spikes.ipynb`: Multi-model clusterless classification

**CI Pipeline**: GitHub Actions runs all notebooks on Ubuntu with Python 3.11
## Modernization Status

**✅ COMPLETE - All 3 Phases Implemented**:

**Phase 1 & 2 (Foundation + Metadata)**:

- ✅ Added `pyproject.toml` with modern PEP 621 project metadata
- ✅ Configured `ruff` as the modern linter/formatter
- ✅ Migrated all dependencies and optional dependency groups
- ✅ Updated Python version support (3.10+, was incorrectly 3.11+)

**Phase 3 (Build System Modernization)**:

- ✅ **Removed `setup.py` completely** - now pure `pyproject.toml`
- ✅ Explicit package configuration for clean builds
- ✅ Updated CI/CD to include the modern `build` tool
- ✅ Wheel building works without the legacy setup.py

**Current State**:

- **Pure modern installation**: `pip install -e .` (no more setup.py)
- **Optional dependencies**: `pip install -e '.[dev]'`, `'.[test]'`, `'.[docs]'`
- **Modern linting**: `ruff check` (67 style issues detected, but non-breaking)
- **Modern building**: `python -m build --wheel` creates proper wheels
- **Full compatibility**: All notebooks and functionality preserved
## Key Architectural Decisions

1. **Notebook-based testing**: Integration tests using real scientific workflows rather than isolated unit tests
2. **GPU-optional design**: All functionality works on CPU, with GPU acceleration available
3. **Dual data pathways**: Separate but parallel implementations for sorted vs. clusterless data
4. **Conda-first distribution**: The complex dependency tree requires conda package management
5. **Scientific reproducibility**: Version pinning and environment specification are prioritized over flexibility
6. **Incremental modernization**: Modern tooling added alongside legacy support for a seamless transition
