An advisory-only quantitative research platform for the Nepal Stock Exchange (NEPSE) — combining data-quality assurance, technical analysis, realistic backtesting, and explainable ML signal fusion.
Important
This project is research-only. It does not provide financial advice, guarantee profit, or execute live trades. All outputs are advisory and require human review. There is no live broker execution and no autonomous trading.
- Overview
- Key Features
- Architecture
- Tech Stack
- Getting Started
- Usage
- API Reference
- Project Structure
- Implementation Roadmap
- Testing & Quality
- Contributing
- Research Boundary
- License
- Maintainer
The NEPSE AI Trading Research Platform is a comprehensive quantitative research system built specifically for the Nepal Stock Exchange. It delivers an end-to-end workflow — from raw market-data ingestion through strategy development, realistic backtesting, and advisory signal generation — designed for individual traders, finance students, and small quant teams.
NEPSE is a small, illiquid market where data quality and execution assumptions are critical. This platform addresses that by enforcing trust-score data-quality gates, modelling realistic transaction costs, and making every signal explainable.
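To make the transaction-cost point concrete, here is a minimal sketch of how fees and slippage eat into a round trip. It uses the 0.5% fee and 5 bps slippage quoted in the feature list; applying both on each side of the trade is an assumption, and `CostModel` and `net_return` are hypothetical names, not the platform's backtesting API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CostModel:
    """Illustrative cost model; per-side application is an assumption."""
    fee_rate: float = 0.005      # 0.5% fee, assumed charged on buy and on sell
    slippage_bps: float = 5.0    # 5 bps adverse price move, assumed per side

    def buy_cost(self, price: float, qty: int) -> float:
        """Cash paid to buy qty shares after slippage and fee."""
        fill = price * (1 + self.slippage_bps / 10_000)
        return fill * qty * (1 + self.fee_rate)

    def sell_proceeds(self, price: float, qty: int) -> float:
        """Cash received for selling qty shares after slippage and fee."""
        fill = price * (1 - self.slippage_bps / 10_000)
        return fill * qty * (1 - self.fee_rate)


def net_return(entry_price: float, exit_price: float, qty: int,
               costs: CostModel = CostModel()) -> float:
    """Round-trip return net of costs, as a fraction of cash paid."""
    paid = costs.buy_cost(entry_price, qty)
    received = costs.sell_proceeds(exit_price, qty)
    return received / paid - 1.0
```

Under these assumptions a flat round trip (same entry and exit price) loses roughly 1.1%, so a strategy needs a gross move larger than that just to break even.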
These are research/evaluation gates — not guarantees of trading profit.
| Metric | Target |
|---|---|
| Sharpe Ratio | > 1.2 |
| Maximum Drawdown | < 20% |
| Win Rate | > 55% |
| Data Trust (90% of symbols) | ≥ 0.7 |
| Feature Freshness | < 24h |
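As a rough illustration of how the first three gates could be evaluated from backtest output, the sketch below computes an annualised Sharpe ratio, maximum drawdown, and win rate, then compares them with the targets in the table. The function names, the 252-day annualisation factor, and the zero risk-free rate are assumptions; the data-trust and freshness gates are not covered here.

```python
import math
from statistics import mean, stdev

TRADING_DAYS = 252  # assumed annualisation factor; the NEPSE calendar may differ


def sharpe_ratio(daily_returns, risk_free_daily=0.0):
    """Annualised Sharpe ratio; needs at least two return observations."""
    excess = [r - risk_free_daily for r in daily_returns]
    sd = stdev(excess)
    return 0.0 if sd == 0 else mean(excess) / sd * math.sqrt(TRADING_DAYS)


def max_drawdown(daily_returns):
    """Largest peak-to-trough loss of the compounded equity curve (0..1)."""
    equity, peak, worst = 1.0, 1.0, 0.0
    for r in daily_returns:
        equity *= 1 + r
        peak = max(peak, equity)
        worst = max(worst, 1 - equity / peak)
    return worst


def win_rate(trade_pnls):
    """Fraction of trades with strictly positive profit."""
    return sum(p > 0 for p in trade_pnls) / len(trade_pnls)


def check_gates(daily_returns, trade_pnls):
    """Compare computed metrics with the targets from the table above."""
    return {
        "sharpe": sharpe_ratio(daily_returns) > 1.2,
        "max_drawdown": max_drawdown(daily_returns) < 0.20,
        "win_rate": win_rate(trade_pnls) > 0.55,
    }
```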
- Multi-source data ingestion — a scraper for ShareSansar and Merolagani, with manual CSV fallbacks.
- Data-quality gating — automated trust scoring (completeness, consistency, freshness, volume, cross-source) with NORMAL/DEGRADED/SAFE_MODE system states.
- Realistic backtesting — fees (0.5%), slippage (5 bps), liquidity filters, partial fills, execution delay, stop-loss / take-profit / trailing-stop exits, and benchmark comparison.
- Explainable signal fusion — combines technical, ML, and sentiment signals with calibrated confidence and feature attribution (SHAP).
- Machine learning suite — baseline models (logistic / random forest / XGBoost), LSTM forecasting, XLM-R sentiment, and an experimental meta-learning research module (RL/GNN planned, not yet implemented).
- MLOps & governance — model registry, drift monitoring, automated retraining, and human-approval promotion gates.
- 15-page Streamlit dashboard, a notebook-driven research workflow, and a full REST API with auto-generated Swagger docs.
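The data-quality gating idea can be sketched as a weighted score over the five listed components, followed by a mapping from market-wide trust coverage to a system state. Only the component names, the 0.7 threshold, and the three state names come from this README; the equal weights, the 90% and 50% coverage cut-offs, and the function names are assumptions for illustration.

```python
# Equal weights are an assumption; the real weighting is not documented here.
WEIGHTS = {
    "completeness": 0.2,
    "consistency": 0.2,
    "freshness": 0.2,
    "volume": 0.2,
    "cross_source": 0.2,
}
SAFE_THRESHOLD = 0.7  # trust level treated as safe to use


def trust_score(components: dict[str, float]) -> float:
    """Weighted average of component scores, each clamped to [0, 1]."""
    missing = WEIGHTS.keys() - components.keys()
    if missing:
        raise ValueError(f"missing components: {sorted(missing)}")
    return sum(w * min(1.0, max(0.0, components[name]))
               for name, w in WEIGHTS.items())


def system_mode(scores_by_symbol: dict[str, float]) -> str:
    """Map the share of trusted symbols to a system state (cut-offs assumed)."""
    if not scores_by_symbol:
        return "SAFE_MODE"
    trusted = sum(s >= SAFE_THRESHOLD for s in scores_by_symbol.values())
    coverage = trusted / len(scores_by_symbol)
    if coverage >= 0.9:
        return "NORMAL"
    if coverage >= 0.5:
        return "DEGRADED"
    return "SAFE_MODE"
```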
Data Sources → Ingestion → Validation → Database → Feature Engineering
→ Data Quality → Trust Scoring → Backtesting → Dashboard / API / Alerts
The system is organized into three layers:
- User interfaces — Streamlit dashboard, Jupyter notebooks, and the FastAPI REST API.
- Backend services — ingestion, data quality, feature engineering, backtesting, signal fusion, risk management, ML training/inference, and MLflow tracking.
- Data & storage — PostgreSQL (optionally TimescaleDB), Redis cache/queue, and disk artifacts for models, datasets, and backtest results.
See docs/ARCHITECTURE.md for the full system design and data model.
| Layer | Technologies |
|---|---|
| Backend & API | Python 3.12, FastAPI, Uvicorn, Pydantic v2 |
| Database & ORM | PostgreSQL 13+ (TimescaleDB optional), SQLAlchemy 2.0, Alembic |
| Cache & Queue | Redis |
| ML & Tracking | scikit-learn, XGBoost, PyTorch (LSTM), MLflow, NumPy/Pandas |
| Dashboard & Research | Streamlit, Jupyter, Plotly |
| DevOps | Docker & Docker Compose, GitHub Actions, Kubernetes (infra/k8s/) |
| Quality | pytest, ruff, mypy |
- Python 3.12
- PostgreSQL 13+ and Redis (or use Docker Compose, which provisions both)
- Git
**Windows (PowerShell)**
# 1. Clone the repository
git clone https://github.com/Aashish-po/Nepse-AI-Trading-System.git
cd "Nepse-AI-Trading-System"
# 2. Create and activate a virtual environment
python -m venv .venv
.\.venv\Scripts\Activate.ps1
# 3. Install dependencies
python -m pip install -U pip
pip install -r requirements-dev.txt
# 4. Create your environment file
copy .env.example .env
**macOS / Linux (bash)**
# 1. Clone the repository
git clone https://github.com/Aashish-po/Nepse-AI-Trading-System.git
cd Nepse-AI-Trading-System
# 2. Create and activate a virtual environment
python3 -m venv .venv
source .venv/bin/activate
# 3. Install dependencies
python -m pip install -U pip
pip install -r requirements-dev.txt
# 4. Create your environment file
cp .env.example .env
Copy .env.example to .env and fill in the required values. Key variables:
| Variable | Description |
|---|---|
| DATABASE_URL | PostgreSQL connection string |
| REDIS_URL | Redis connection string |
| JWT_SECRET_KEY | Secret for JWT auth (min. 32 chars) |
| MLFLOW_TRACKING_URI | MLflow tracking server URI |
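A quick pre-flight check of these variables can catch a missing value or a too-short JWT secret before the API starts. The sketch below is illustrative only: `validate_env` is a hypothetical helper, and treating all four variables as mandatory is an assumption.

```python
import os

# Treating all four as mandatory is an assumption based on the table above.
REQUIRED = ("DATABASE_URL", "REDIS_URL", "JWT_SECRET_KEY", "MLFLOW_TRACKING_URI")
MIN_JWT_SECRET_LENGTH = 32


def validate_env(env=None) -> list[str]:
    """Return a list of human-readable problems; empty means the env looks OK."""
    env = os.environ if env is None else env
    problems = [f"{name} is not set" for name in REQUIRED if not env.get(name)]
    secret = env.get("JWT_SECRET_KEY", "")
    if secret and len(secret) < MIN_JWT_SECRET_LENGTH:
        problems.append(
            f"JWT_SECRET_KEY must be at least {MIN_JWT_SECRET_LENGTH} characters"
        )
    return problems


if __name__ == "__main__":
    for problem in validate_env():
        print("config error:", problem)
```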
Warning
Never commit your .env file or any secrets. See Documents/7_Security.md for the full secrets-management policy.
The fastest way to bring up the full stack (PostgreSQL, Redis, API, dashboard, MLflow):
docker compose -f infra/docker-compose.yml up --build
To run the API server directly with Uvicorn:
uvicorn backend.app.main:app --host 127.0.0.1 --port 8000 --app-dir .
- Health check: http://127.0.0.1:8000/health
- Interactive Swagger UI: http://127.0.0.1:8000/docs
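Once the server is up, the health endpoint can also be checked from Python. This is a minimal standard-library sketch; `fetch_health` is a hypothetical helper, and the `opener` parameter exists only so the call can be stubbed in tests. The exact JSON fields returned are not assumed here.

```python
import json
from urllib.request import urlopen


def fetch_health(base_url="http://127.0.0.1:8000", opener=urlopen, timeout=5.0) -> dict:
    """GET {base_url}/health and return the decoded JSON body."""
    url = f"{base_url.rstrip('/')}/health"
    with opener(url, timeout=timeout) as response:
        return json.load(response)


if __name__ == "__main__":
    # Requires the API to be running locally.
    print(fetch_health())
```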
pip install -r dashboard/requirements-dashboard.txt
streamlit run dashboard/app.py
The dashboard runs at http://localhost:8501 with 15 pages: Market Overview, Strategies, Backtesting, Signals, Live Signals, Features, Data Sources, Alerts, System Status, ML Models, Analytics, MLOps, Explainability, Paper Trading, and Factor Analysis.
python scripts/seed_symbols.py
This populates the database with NEPSE stock symbols for initial testing.
Price data is scraped independently of the backend and ingested from CSV:
python nepse_data/scraper.py --today # scrape ShareSansar + Merolagani -> nepse_data/data/
# then ingest the scraped CSVs into the prices table:
curl -X POST "http://localhost:8000/market/ingest/batch?symbol=NABIL" \
  -H "Content-Type: application/json" -d '{"source": "csv_ingestion"}'
The scraper writes one CSV per source per day to nepse_data/data/{sharesansar,merolagani}/YYYY-MM-DD.csv. CsvIngestionService (backend/app/services/csv_ingestion.py) reads those files, normalizes both sources to OHLCV (Merolagani floor-sheet ticks are aggregated per symbol), merges them on (symbol, date), and upserts into prices. Symbols must already exist in stocks (seed first); unresolved symbols are skipped and reported. Omit symbol from the request to ingest all symbols.
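The normalization step described above can be pictured with a small standard-library sketch: floor-sheet ticks are rolled up into one OHLCV bar per symbol, then two sources are merged on the (symbol, date) key. This is not the code in CsvIngestionService; the function names, the tick tuple layout, and the "primary source wins on conflict" merge rule are assumptions.

```python
def aggregate_ticks(ticks, date):
    """Roll up (symbol, price, quantity) ticks, given in trade-time order,
    into one OHLCV bar per (symbol, date)."""
    bars = {}
    for symbol, price, qty in ticks:
        key = (symbol, date)
        bar = bars.get(key)
        if bar is None:
            bars[key] = {"open": price, "high": price, "low": price,
                         "close": price, "volume": qty}
        else:
            bar["high"] = max(bar["high"], price)
            bar["low"] = min(bar["low"], price)
            bar["close"] = price          # last tick seen becomes the close
            bar["volume"] += qty
    return bars


def merge_sources(primary, secondary):
    """Merge two {(symbol, date): bar} maps; primary wins on conflicts
    (an assumed policy, chosen only for illustration)."""
    merged = dict(secondary)
    merged.update(primary)
    return merged


if __name__ == "__main__":
    ticks = [("NABIL", 500.0, 10), ("NABIL", 510.0, 5), ("NABIL", 495.0, 20)]
    print(aggregate_ticks(ticks, "2024-01-15"))
```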
alembic upgrade head # apply latest schema
alembic revision --autogenerate -m "description" # create a new migration
alembic downgrade -1 # roll back one revision
The platform supports a notebook-driven research cycle:
1. Idea → 2. Notebook Experiment → 3. Backtest → 4. Strategy Integration → 5. Dashboard/Report
Start with research/notebooks/ (01_idea_to_backtest.ipynb, 02_backtest_and_export.ipynb, 03_integrate_bundle.ipynb). All experimental results must be validated through the tested backtesting pipeline before integration.
The REST API exposes auth, market data, features, data quality, strategies/backtests, signals, ML, portfolio, explainability, governance, MLOps, and analytics routes. Full interactive docs are available at /docs when the API is running.
**View full endpoint table**
| Method | Endpoint | Description |
|---|---|---|
| GET | /health | Health check (status, environment, version, scope) |
| GET | /health/live | Liveness probe (process up) |
| GET | /health/ready | Readiness probe (database reachable) |
| GET | /metrics | Prometheus metrics (request counts, latency histogram) |
| POST | /auth/register | Register a new user |
| POST | /auth/login | Login and receive access token |
| GET | /market/prices | List price data with optional symbol filter |
| POST | /market/ingest | Ingest a single price record for a stock |
| POST | /market/ingest/batch | Batch ingest OHLCV data for a date range |
| POST | /features/generate | Compute features for a single symbol/date |
| POST | /features/generate-batch | Compute features for a date range (single symbol) |
| POST | /features/generate-multi | Compute features across multiple symbols |
| GET | /data-quality/trust/{symbol}/{date} | Get trust score and quality details |
| GET | /data-quality/safe/{symbol}/{date} | Check if data is safe to use (trust >= 0.7) |
| GET | /data-quality/summary/{symbol} | Symbol quality summary (avg trust, unsafe days, issues) |
| POST | /data-quality/reports/daily | Generate daily data quality report |
| GET | /data-quality/alerts | List data quality alerts |
| POST | /data-quality/alerts/{alert_id}/acknowledge | Acknowledge an alert |
| GET | /data-quality/trends/{symbol} | Trust score trend over 30 days |
| GET | /data-quality/freshness/{symbol}/{date} | Check data freshness (last update vs expected) |
| GET | /data-quality/system-mode | Get system mode (NORMAL / DEGRADED / SAFE_MODE) |
| GET | /data-quality/cross-validate/{symbol}/{date} | Cross-validate price across active data sources |
| GET | /data-quality/source-accuracy/{source_id} | Accuracy score for a data source |
| GET | /data-quality/weighted-price/{symbol}/{date} | Source-weighted average price |
| GET | /data-quality/source-drift/{source_id} | Detect drift in a data source's record volume |
| GET | /data-quality/mode-history | System mode history |
| POST | /data-quality/sources/recover-blacklisted | Attempt to recover blacklisted sources |
| POST | /data-quality/trust/apply-decay | Apply time-based decay to old trust scores |
| POST | /strategies/ | Create a new strategy |
| GET | /strategies/ | List all strategies |
| GET | /strategies/{strategy_id} | Get strategy details |
| POST | /strategies/backtests | Run backtest for a strategy |
| GET | /strategies/backtests/{backtest_id} | Get backtest results |
| POST | /strategies/benchmarks/compare | Compare strategy vs buy-and-hold / NEPSE |
| POST | /ml/train | Train an ML model |
| GET | /ml/models | List trained models |
| GET | /ml/predict/{symbol} | Get model prediction for a symbol |
| POST | /portfolio/account | Create/reset a portfolio account |
| GET | /portfolio/account/snapshot | Portfolio snapshot (equity, cash, positions) |
| POST | /portfolio/optimize | Optimize allocation (equal / risk-parity / mean-variance) |
| GET | /explain/models/{model_id}/importance | Global feature importance (SHAP + fallbacks) |
| POST | /explain/models/{model_id}/predict | Local attribution + trade explanation |
| GET | /governance/models | List models by governance state |
| POST | /governance/models/{model_id}/submit | Submit a model for approval |
| POST | /governance/models/{model_id}/approve | Approve a model |
| POST | /governance/models/{model_id}/production | Mark an approved model production-ready |
| GET | /mlops/champion | Select best model by metric |
| GET | /mlops/rank | Rank registered models by metric |
| GET | /mlops/models/{model_id}/retrain-assessment | Assess whether a model needs retraining |
| POST | /mlops/retrain | Trigger a retraining run |
| POST | /mlops/evolve | Evolutionary hyperparameter search |
| GET | /analytics/market-overview | Market overview with top gainers/losers |
| GET | /analytics/signals | Signal explorer with filters and summary |
| POST | /analytics/portfolio | Portfolio analytics from an equity curve |
| POST | /alerts/evaluate | Evaluate rule-based alerts |
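Several routes in the table take path parameters such as {symbol} and {date}. A small helper like the one below can turn those templates into full URLs when scripting against the API. `endpoint` is a hypothetical function; the YYYY-MM-DD date format and the `symbol` query-parameter name are assumptions, so check /docs for the actual request contracts.

```python
from urllib.parse import quote, urlencode


def endpoint(base_url, template, query=None, **path_params):
    """Fill a route template from the table and append an optional query string."""
    encoded = {name: quote(str(value), safe="") for name, value in path_params.items()}
    url = base_url.rstrip("/") + template.format(**encoded)
    return f"{url}?{urlencode(query)}" if query else url


if __name__ == "__main__":
    base = "http://localhost:8000"
    # Date format and query-parameter name are assumptions for illustration.
    print(endpoint(base, "/data-quality/safe/{symbol}/{date}",
                   symbol="NABIL", date="2024-01-15"))
    print(endpoint(base, "/market/prices", query={"symbol": "NABIL"}))
```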
**View directory tree**
.
├── .github/workflows/ # CI (ci.yml), image publish (docker-publish.yml), Pages
├── AGENTS.md # Common dev commands
├── pyproject.toml # Project + ruff/mypy config
├── requirements.txt # Runtime dependencies
├── requirements-dev.txt # Dev/test dependencies
├── alembic.ini # Alembic config
├── pytest.ini # pytest config
│
├── backend/
│ ├── app/
│ │ ├── main.py # FastAPI application entry point
│ │ ├── core/ # config, logging, security, dependencies
│ │ ├── api/routes/ # auth, health, market, features, data_quality, strategies,
│ │ │ # signals, ml, lstm, portfolio, explainability, mlops, analytics
│ │ ├── models/ # SQLAlchemy ORM models
│ │ ├── schemas/ # Pydantic request/response contracts
│ │ ├── services/ # Business logic (ingestion, backtest, signal_fusion, ...)
│ │ └── db/migrations/ # Alembic revisions 0001 … 0011
│ └── tests/ # pytest suite (incl. phase gates)
│
├── ml/ # ML / research modules (training, inference, lstm, sentiment, meta_learning, ...)
├── strategies/ # Strategy definitions and experiments
├── backtesting/ # Backtesting helpers
├── features/ # Technical indicators (features/indicators.py)
├── scripts/ # seed_symbols.py, smoke_test.py, backup_db.sh
│
├── dashboard/ # Streamlit dashboard (single app, 15 pages)
├── research/notebooks/ # Idea → backtest → integrate notebook workflow
├── docs/ # Reference specs (ARCHITECTURE, PHASES, SUCCESS_METRICS, ...)
├── infra/ # docker-compose.yml, docker/, k8s/
└── models/ # Trained model artifacts *.joblib (gitignored)
Phases 0–13 are implemented in code; Phase 14 (experimental AI) is partially present — ml/meta_learning.py and the sentiment module exist, while RL/GNN (PPO, DQN, GNN, ensembles) are not yet implemented. Phase numbering matches the phase-gated tests (test_phase6_validation.py, test_phase8_gate.py, test_phase10_integration.py).
| Phase | Focus Area | Key Deliverables |
|---|---|---|
| 0 | Foundation | Project structure, env config, CI, /health endpoint |
| 1 | Backend & Database | FastAPI skeleton, SQLAlchemy ORM, Alembic migrations, JWT auth/RBAC, symbol seeding |
| 2 | Data Ingestion & Quality | Ingestion service, validation rules, trust scoring, data-quality gate, alerting |
| 3 | Feature Engineering | Technical indicators (RSI/SMA/EMA/MACD/ATR), returns/volatility, point-in-time feature store |
| 4 | Data Quality & Reliability | Trust model, daily reports, quality alerts, system modes (NORMAL/DEGRADED/SAFE_MODE) |
| 5 | Strategy & Backtesting | Strategy registry, realistic backtesting, benchmark comparison |
| 6 | Research Workflow & Dashboard | Streamlit dashboard, notebook workflow, export validation |
| 7 | Baseline ML | Logistic regression, random forest, XGBoost, walk-forward training, promotion gates |
| 8 | LSTM & Sentiment | LSTM next-day forecasting, XLM-R/lexicon sentiment, NEPSE market calendar |
| 9 | Signal Fusion & Risk | Signal fusion engine, risk manager, position sizing (advisory; enforced: false) |
| 10 | Portfolio Optimization | In-memory account simulation, allocation methods (equal/risk-parity/mean-variance) |
| 11 | Explainability (SHAP) | Feature importance, local attribution, trade explanations, model governance |
| 12 | MLOps / Monitoring / Retraining | Model selection, auto-retraining, hyperparameter evolution, drift monitoring |
| 13 | Production Hardening & Deployment | Docker images, Kubernetes manifests, CI/CD, Prometheus monitoring, DB backups |
| 14 | Experimental AI | Meta-learning research module (ml/meta_learning.py); PPO/DQN/GNN/ensembles not yet implemented |
Phase 14 RL/GNN modules are planned experimental research code and not yet implemented. Live broker execution and autonomous trading remain out of scope.
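Phase 7 lists walk-forward training among its deliverables. For readers unfamiliar with the idea, the sketch below generates rolling train/test index windows in which every test window lies strictly after its training window, which avoids look-ahead leakage. The fixed-size rolling window (rather than an expanding one), the parameter names, and the function itself are assumptions, not the platform's training code.

```python
def walk_forward_splits(n_samples, train_size, test_size, step=None):
    """Yield (train_indices, test_indices) ranges over time-ordered samples.

    Each test window starts immediately after its training window; the
    window then slides forward by `step` (defaults to test_size).
    """
    if train_size <= 0 or test_size <= 0:
        raise ValueError("train_size and test_size must be positive")
    step = test_size if step is None else step
    if step <= 0:
        raise ValueError("step must be positive")
    start = 0
    while start + train_size + test_size <= n_samples:
        train = range(start, start + train_size)
        test = range(start + train_size, start + train_size + test_size)
        yield train, test
        start += step


if __name__ == "__main__":
    for train, test in walk_forward_splits(10, train_size=4, test_size=2):
        print(list(train), list(test))
```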
# Run the full test suite
python -m pytest backend/tests/ -v
# Lint
python -m ruff check backend/
# Type check
mypy backend/ --ignore-missing-imports --explicit-package-bases
The CI pipeline (.github/workflows/ci.yml) runs ruff, the full pytest suite, import smoke tests, and mypy on every push and pull request to main and develop.
Contributions are welcome! To get started:
- Fork the repository and create a feature branch from main (e.g. git checkout -b feature/your-feature).
- Set up your environment (see Installation).
- Make your changes, following the existing code style.
- Validate locally before pushing:
      python -m ruff check backend/
      python -m pytest backend/tests/ -v
- Commit with a clear message and open a pull request against main, describing the change and linking any related issues.
Please ensure all tests pass and lint is clean — CI must be green before a PR can be merged. Common development commands are documented in AGENTS.md.
This platform is intentionally limited to ingestion, data quality, features, realistic backtesting, strategy research, dashboards, and advisory outputs. The following are explicitly out of scope:
- ❌ Live broker execution or order placement
- ❌ Autonomous / automated trading
- ❌ Financial advice or guaranteed returns
- ❌ Options, derivatives, or forex (equities only)
All advisory outputs require human review. See docs/RISK_DISCLAIMER.md for the full disclaimer.
This project is licensed under the MIT License — see the LICENSE file for details.
- GitHub: @Aashish-po
- Repository: Nepse-AI-Trading-System
Built for the NEPSE quantitative-research community. ⭐ Star the repo if you find it useful!