Thank you for your interest in contributing! This guide will help you get set up and understand the project structure.
We welcome contributions from students, researchers, and industry professionals working on financial fraud detection and graph neural networks.
- Quick Start
- Development Environment
- Project Structure
- Graph Schema
- Running Tests
- Code Style
- Submitting Changes
# Clone the repository
git clone https://github.com/Brijeshrath67/AegisGraph-Sentinel-2.0.git
cd AegisGraph-Sentinel-2.0
# Create virtual environment (Python 3.9+ required)
python -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
# Install dependencies
pip install -r requirements.txt
# Configure settings
cp config/config.yaml.example config/config.yaml
# Generate synthetic training data
python -m src.data.data_generator
# Train the model
python -m src.training.trainer
# Start the API server
python -m src.api.main
- Minimum: Python 3.9
- Recommended: Python 3.11+
Create a .env file in the project root:
# Optional: Override default settings
LOG_LEVEL=INFO
DEVICE=cpu # cuda, mps, or cpu
MODEL_PATH=models/htgnn_best.pt
- PyTorch (2.0+) - Deep learning framework
- PyTorch Geometric - Graph neural networks
- FastAPI - REST API
- NetworkX - Graph manipulation
- Redis - Caching layer
- Neo4j - Graph database (for production)
- librosa - Voice stress analysis
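The optional `.env` overrides shown earlier (`LOG_LEVEL`, `DEVICE`, `MODEL_PATH`) could be read at startup roughly as follows. This is a minimal standard-library sketch, not the project's actual loader: `Settings` and `load_settings` are hypothetical names, the defaults simply mirror the example values, and it assumes the variables have already been exported into the process environment (this guide does not say how the project loads `.env`).

```python
import os
from dataclasses import dataclass

VALID_DEVICES = {"cpu", "cuda", "mps"}  # values listed in the .env example


@dataclass(frozen=True)
class Settings:
    """Hypothetical container for the three documented overrides."""
    log_level: str
    device: str
    model_path: str


def load_settings(env=None):
    """Read overrides from a mapping (defaults to os.environ)."""
    env = os.environ if env is None else env
    device = env.get("DEVICE", "cpu").strip().lower()
    if device not in VALID_DEVICES:
        raise ValueError(
            f"DEVICE must be one of {sorted(VALID_DEVICES)}, got {device!r}"
        )
    return Settings(
        log_level=env.get("LOG_LEVEL", "INFO").strip().upper(),
        device=device,
        model_path=env.get("MODEL_PATH", "models/htgnn_best.pt").strip(),
    )


# Passing a plain dict keeps the example independent of the real environment.
settings = load_settings({"DEVICE": "CUDA", "LOG_LEVEL": "debug"})
print(settings)
```

Validating `DEVICE` up front gives a clear error at startup instead of a failure later when the model is moved to an unknown device.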
AegisGraph-Sentinel-2.0/
├── config/ # Configuration files
│ ├── config.yaml # Main configuration
│ └── thresholds.yaml # Detection thresholds
├── src/
│ ├── models/ # HTGAT and neural network models
│ │ ├── htgat.py # Heterogeneous Temporal Graph Attention
│ │ ├── temporal_encoding.py
│ │ └── risk_model.py
│ ├── features/ # Feature extraction modules
│ │ ├── behavioral_biometrics.py # Keystroke dynamics
│ │ ├── velocity_calculator.py # Transaction velocity
│ │ ├── entropy_calculator.py # Graph entropy
│ │ ├── honeypot_escrow.py # Innovation 2
│ │ ├── predictive_mule_identification.py # Innovation 4
│ │ ├── voice_stress_analysis.py # Innovation 5
│ │ ├── blockchain_evidence.py # Innovation 6
│ │ └── aegis_oracle_explainer.py # Innovation 3
│ ├── inference/ # Risk scoring and explanation
│ │ ├── risk_scorer.py
│ │ └── explainer.py
│ ├── training/ # Training pipeline
│ │ ├── trainer.py
│ │ └── losses.py
│ ├── api/ # FastAPI service
│ │ ├── main.py
│ │ └── schemas.py
│ └── utils/ # Helper utilities
│ └── helpers.py
├── tests/ # Unit tests
├── data/ # Generated datasets (created at runtime)
├── models/ # Saved model checkpoints (created at runtime)
└── app.py # Streamlit web interface
The fraud detection graph contains 5 node types:
| Node Type | Description | Example |
|---|---|---|
| Account | Bank account | ACC123456789 |
| Device | Mobile/computer | DEV_abc123 |
| ATM | ATM machine | ATM_MUM_001 |
| Merchant | Payment merchant | MERCHANT_XYZ |
| IP | IP address | 192.168.1.1 |
Connections between nodes:
| Edge Type | Description | Attributes |
|---|---|---|
| Transfer | Money transfer between accounts | amount, timestamp, mode |
| Login | Device login to account | timestamp, location, success |
| Withdrawal | Cash withdrawal | amount, atm_id, timestamp |
| Association | Social linking | relationship_type |
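To make the two tables concrete, here is a minimal standard-library sketch of a typed graph that rejects edges between the wrong node types. `TypedGraph` and its methods are hypothetical names and are not part of the codebase. The endpoint rules for `Withdrawal` (account to ATM) and `Association` (account to account) are assumptions for illustration; the tables only state endpoints for `Transfer` and `Login`. The second account ID is a made-up sample value.

```python
from collections import defaultdict

NODE_TYPES = {"Account", "Device", "ATM", "Merchant", "IP"}

# Edge type -> (source node type, target node type).
# Transfer and Login follow the edge table; the other two are assumed.
EDGE_ENDPOINTS = {
    "Transfer": ("Account", "Account"),
    "Login": ("Device", "Account"),
    "Withdrawal": ("Account", "ATM"),       # assumption
    "Association": ("Account", "Account"),  # assumption
}


class TypedGraph:
    """Tiny heterogeneous multigraph: typed nodes, typed attributed edges."""

    def __init__(self):
        self.node_type = {}                 # node id -> node type
        self.out_edges = defaultdict(list)  # node id -> [(target, edge type, attrs)]

    def add_node(self, node_id, node_type):
        if node_type not in NODE_TYPES:
            raise ValueError(f"unknown node type: {node_type}")
        self.node_type[node_id] = node_type

    def add_edge(self, src, dst, edge_type, **attrs):
        expected = EDGE_ENDPOINTS[edge_type]
        actual = (self.node_type[src], self.node_type[dst])
        if actual != expected:
            raise ValueError(f"{edge_type} expects {expected}, got {actual}")
        self.out_edges[src].append((dst, edge_type, attrs))


g = TypedGraph()
g.add_node("ACC123456789", "Account")
g.add_node("ACC987654321", "Account")
g.add_node("DEV_abc123", "Device")
g.add_edge("ACC123456789", "ACC987654321", "Transfer", amount=2500.0, mode="UPI")
g.add_edge("DEV_abc123", "ACC123456789", "Login", location="Mumbai", success=True)
print(len(g.out_edges["ACC123456789"]))
```

Keeping the endpoint rules in one dictionary means a new edge type only needs a single new entry.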
# Account node
{
'account_id': str,
'account_type': str, # savings, current, wallet
'balance': float,
'is_mule': bool,
'risk_score': float # 0-1
}
# Device node
{
'device_id': str,
'device_type': str,
'fingerprint': str
}
# Transfer edge
{
'amount': float,
'timestamp': datetime,
'mode': str, # UPI, IMPS, NEFT, RTGS
'transaction_id': str
}
# Login edge
{
'timestamp': datetime,
'ip_address': str,
'location': str,
'success': bool
}
The system builds a dynamic subgraph for each transaction:
- Extract k-hop neighbors (k=3 by default)
- Limit to 1000 nodes / 5000 edges for performance
- Apply temporal filtering (configurable window)
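The three steps above can be sketched with a bounded breadth-first search. This is an illustrative standard-library version, not the project's implementation: `extract_subgraph` is a hypothetical name, edges are simplified to `(src, dst, timestamp)` tuples, and the truncation policy (stop adding nodes in BFS order, keep the first `max_edges` edges) is an assumption, since the guide does not say how the limits are enforced.

```python
from collections import deque


def extract_subgraph(edges, seed, k=3, max_nodes=1000, max_edges=5000,
                     window=None, now=None):
    """Return (nodes, kept_edges) within k hops of `seed`.

    edges: iterable of (src, dst, timestamp), treated as undirected for
    neighbor discovery. If both `window` and `now` are given, only edges
    with now - window <= timestamp <= now are considered.
    """
    # Temporal filtering first, so stale edges never expand the frontier.
    if window is not None and now is not None:
        edges = [e for e in edges if now - window <= e[2] <= now]
    else:
        edges = list(edges)

    adjacency = {}
    for src, dst, _ in edges:
        adjacency.setdefault(src, set()).add(dst)
        adjacency.setdefault(dst, set()).add(src)

    # Breadth-first search bounded by hop count and node budget.
    hops = {seed: 0}
    queue = deque([seed])
    while queue:
        node = queue.popleft()
        if hops[node] == k:
            continue  # do not expand beyond k hops
        for neighbor in sorted(adjacency.get(node, ())):
            if neighbor not in hops and len(hops) < max_nodes:
                hops[neighbor] = hops[node] + 1
                queue.append(neighbor)

    # Keep edges with both endpoints in the node set, up to the edge budget.
    kept = [e for e in edges if e[0] in hops and e[1] in hops][:max_edges]
    return set(hops), kept


edges = [("A", "B", 100), ("B", "C", 110), ("C", "D", 120),
         ("D", "E", 130), ("A", "X", 10)]
nodes, kept = extract_subgraph(edges, "A", k=3, window=60, now=150)
print(sorted(nodes), len(kept))  # ['A', 'B', 'C', 'D'] 3
```

In the example, `X` is dropped by the time window and `E` is four hops from the seed, so neither appears in the result. Timestamps are plain numbers here; `datetime` values with a `timedelta` window would work with the same comparisons.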
pytest tests/
pytest --cov=src tests/
pytest tests/test_models.py
pytest tests/test_api.py
pytest tests/test_features.py
pytest -k "test_risk"
pytest -v # Verbose output
# Terminal 1: Start API
python -m src.api.main
# Terminal 2: Run tests
pytest tests/test_api.py -v
Once the API is running:
# Health check
curl http://localhost:8000/health
# Run comprehensive test
python test_all_innovations_comprehensive.py
We use Black for code formatting:
black src/ tests/
We use isort for import sorting:
isort src/ tests/
We use mypy for type checking:
mypy src/
Install pre-commit to run checks automatically:
pip install pre-commit
pre-commit install
We use flake8 for linting:
flake8 src/ tests/
Branch names follow one of these patterns:
- issue-#<number> - For fixing specific issues (e.g., issue-#12)
- feature/<name> - For new features (e.g., feature/add-lateral-movement)
- hotfix/<name> - For urgent fixes
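If you want to check a branch name before pushing, the three patterns above can be expressed as regular expressions. This is a small illustrative helper, not a tool shipped with the repository; `is_valid_branch` is a hypothetical name, and the set of characters allowed in `<name>` (lowercase letters, digits, hyphens) is an assumption based on the examples.

```python
import re

# Patterns taken from the naming list; the allowed characters in <name>
# are an assumption, not a documented rule.
BRANCH_PATTERNS = [
    re.compile(r"issue-#\d+"),
    re.compile(r"feature/[a-z0-9][a-z0-9-]*"),
    re.compile(r"hotfix/[a-z0-9][a-z0-9-]*"),
]


def is_valid_branch(name):
    """Return True if `name` fully matches one of the branch patterns."""
    return any(p.fullmatch(name) for p in BRANCH_PATTERNS)


for branch in ("issue-#12", "feature/add-lateral-movement", "my-branch"):
    print(branch, is_valid_branch(branch))
```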
- Create a branch:
  git checkout -b issue-#12
- Make your changes:
  - Follow code style guidelines
  - Add tests for new functionality
  - Update documentation if needed
- Run tests locally:
  pytest tests/ # All tests should pass
- Commit with clear messages:
  git add .
  git commit -m "feat: Add lateral movement detection

  - Track betweenness centrality history per account
  - Detect spikes from baseline using std multiplier
  - Add +0.25 risk for lateral movement
  - References MITRE ATT&CK TA0008"
- Push and create PR:
  git push origin issue-#12
  Then create a PR on GitHub with:
  - Clear title describing the change
  - Summary of what was changed and why
  - Links to related issues
We follow Conventional Commits:
- feat: New feature
- fix: Bug fix
- docs: Documentation
- refactor: Code refactoring
- test: Adding tests
- chore: Maintenance
Examples:
fix: Resolve hardcoded threshold in risk_scorer.py
feat: Add honeypot escrow activation logic
docs: Update CONTRIBUTING.md with graph schema
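A commit message can be checked against the prefixes listed above with a short script, for example from a local `commit-msg` hook. The sketch below is illustrative only: `check_commit_message` is a hypothetical helper, and it validates just the `<type>: <summary>` form used in this guide. Optional scopes and the `!` breaking-change marker from the full Conventional Commits specification are deliberately not handled.

```python
import re

COMMIT_TYPES = ("feat", "fix", "docs", "refactor", "test", "chore")

# First line must be "<type>: <summary>" with a non-empty summary.
HEADER_RE = re.compile(r"^(" + "|".join(COMMIT_TYPES) + r"): \S")


def check_commit_message(message):
    """Return True if the first line starts with one of the listed prefixes."""
    lines = message.splitlines()
    header = lines[0] if lines else ""
    return HEADER_RE.match(header) is not None


for msg in (
    "fix: Resolve hardcoded threshold in risk_scorer.py",
    "feat: Add honeypot escrow activation logic",
    "Update stuff",
):
    print(check_commit_message(msg), msg)
```

Only the first line is inspected, so multi-line messages with a bulleted body pass as long as the header is well formed.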
- Issues: Open a GitHub issue for bugs or feature requests
- Discussions: Use GitHub Discussions for questions
- Documentation: Check the docs/ folder for detailed docs
We welcome contributions from students, researchers, and industry professionals!
For GSoC applicants: See our project ideas and don't hesitate to ask questions about the graph-based fraud detection approach.