This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
python -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt

Copy .env.example to .env and fill in RPC settings before running collector or bot flows.
python tools/collect_continuous.py
python scripts/build_dataset_new.py --lifecycle-dir data/training --output-dir data/datasets
python scripts/run_hybrid_training.py --output-dir data/models --lifecycle-dir data/training
python -m src.trader.bot

python -m unittest discover -s tests -p "test_*.py"
python -m unittest tests.core.test_rpc_config
python -m unittest tests.model.test_run_hybrid_training_cli
python -m unittest tests.smoke.test_surviving_workflow_imports

tools/memectl is the repo-supported wrapper for long-running services, but it depends on the shell helpers in tools/lib and enforces Linux/Darwin.
./tools/memectl collector start
./tools/memectl collector status
./tools/memectl collector logs -f
./tools/memectl bot start
./tools/memectl bot status
./tools/memectl bot logs -f

Root CLAUDE.md covers repo-wide workflow only. Before editing inside config/, src/, src/core/, src/data/, src/trader/, or tools/, read the nearest AGENTS.md in that subtree for more specific guidance.
This repository is a plain Python application repo, not a packaged library. src is the import root, and several entry scripts prepend the repo root to sys.path.
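A minimal sketch of the sys.path pattern described above, assuming an entry script that sits one directory below the repo root (as files in tools/ and scripts/ do). The helper name `ensure_repo_root_on_path` and its `levels_up` parameter are hypothetical and not taken from the repository; the actual entry scripts may do this inline.

```python
import sys
from pathlib import Path


def ensure_repo_root_on_path(script_file: str, levels_up: int = 1) -> Path:
    """Prepend the repo root to sys.path so that `import src...` resolves.

    Hypothetical helper: `levels_up` is the number of directories between
    the script's own directory and the repo root (1 for tools/foo.py).
    """
    # parents[0] is the script's directory, parents[1] is one level above it.
    repo_root = Path(script_file).resolve().parents[levels_up]
    root_str = str(repo_root)
    if root_str not in sys.path:
        # Insert at the front so repo modules win over same-named installs.
        sys.path.insert(0, root_str)
    return repo_root
```

An entry script would typically call `ensure_repo_root_on_path(__file__)` before its first `from src...` import.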
The current repo is organized around a single main workflow:
- collect token lifecycle data from FourMeme/BSC
- build datasets and features from lifecycle JSONL files
- train a hybrid model
  - buy side: CatBoost classifier
  - sell side: BC warmstart + PPO policy
- run the bot on live events for paper or real trading
The most important entrypoints are:
- tools/collect_continuous.py: realtime lifecycle collection
- scripts/build_dataset_new.py: dataset build CLI
- scripts/run_hybrid_training.py: hybrid training CLI
- src/trader/bot.py: bot runtime entrypoint
- config/config.py handles RPC-role separation, listener mode, contract config, and provider validation.
- config/trading_config.py holds trading toggles and risk parameters.
- src/core/ws_manager.py manages websocket connectivity.
- src/core/listener.py is the main FourMeme event listener. It uses websocket head tracking plus HTTP get_logs polling/fallback for robustness.
- tools/collect_continuous.py orchestrates listener + queue workers + periodic save/flush.
- src/data/collector.py maintains in-memory lifecycle state, incremental flushes, and final snapshots.
- Collector resume behavior depends on both persisted lifecycle files and data/training/collector_runtime_state.json; moving collector state between environments without that checkpoint changes where listener resume starts.
- src/data/dataset_builder.py loads lifecycle files and produces training samples.
- src/data/feature_extractor.py contains feature extraction logic used by dataset building and bot inference.
- scripts/build_dataset_new.py auto-discovers lifecycle input from --lifecycle-dir, DATASET_LIFECYCLE_DIR, or default data directories if not provided explicitly.
- src/pipeline/train_hybrid.py orchestrates buy-model training, BC warmstart, PPO finetuning, and manifest output.
- src/model/buy_catboost.py contains the buy classifier.
- src/model/hybrid_inference.py loads buy_model.cbm, buy_threshold.json, and optional sell_policy.zip.
- src/rl/* contains the sell-side RL environment, reward, PPO training, and BC warmstart.
- src/trader/bot.py wires listener, collector, model loading, inference, and position management together.
- src/core/trader.py is the transaction executor. It uses dedicated HTTP RPC for trade submission rather than the websocket listener connection.
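The lifecycle-input discovery mentioned for scripts/build_dataset_new.py can be pictured roughly as follows. This is a sketch under assumptions, not the script's actual code: the precedence order (flag, then environment variable, then defaults), the function name `resolve_lifecycle_dir`, and the default directory tuple are all illustrative.

```python
import os
from pathlib import Path
from typing import Mapping, Optional, Sequence

# Assumed default; the guidance above only says "default data directories".
DEFAULT_LIFECYCLE_DIRS = ("data/training",)


def resolve_lifecycle_dir(
    cli_value: Optional[str],
    env: Mapping[str, str] = os.environ,
    defaults: Sequence[str] = DEFAULT_LIFECYCLE_DIRS,
) -> Optional[Path]:
    """Pick a lifecycle directory: CLI flag, then env var, then defaults."""
    # 1. An explicit --lifecycle-dir value wins.
    if cli_value:
        return Path(cli_value)
    # 2. Otherwise use DATASET_LIFECYCLE_DIR when it is set and non-empty.
    env_value = env.get("DATASET_LIFECYCLE_DIR", "").strip()
    if env_value:
        return Path(env_value)
    # 3. Otherwise take the first default directory that exists on disk.
    for candidate in defaults:
        path = Path(candidate)
        if path.is_dir():
            return path
    return None
```

Passing `--lifecycle-dir` explicitly, as the commands at the top of this file do, avoids depending on whichever discovery order the script really implements.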
Expected repo-local artifacts:
- lifecycle data: data/training/lifecycle_*.jsonl and data/training/lifecycle_incremental_*.jsonl
- datasets: data/datasets/*.jsonl
- trained models: data/models/buy_model.cbm, buy_threshold.json, bc.pt, sell_policy.zip, hybrid_manifest.json
The bot can run without model artifacts; if no trained hybrid model is found, it falls back to data-collection behavior.
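Before starting the bot, it can help to check which artifacts are actually present. The sketch below is hypothetical: the function name and the "hybrid" / "collect_only" labels are illustrative, and treating buy_model.cbm and buy_threshold.json as required is an assumption drawn from sell_policy.zip being described as optional. The bot's own detection logic may differ.

```python
from pathlib import Path
from typing import Dict, Union

# Assumed split between required and optional files (see lead-in).
REQUIRED_ARTIFACTS = ("buy_model.cbm", "buy_threshold.json")
OPTIONAL_ARTIFACTS = ("sell_policy.zip",)


def inspect_model_dir(model_dir: Union[str, Path] = "data/models") -> Dict[str, object]:
    """Report which model artifacts exist and which mode that implies."""
    root = Path(model_dir)
    missing = [name for name in REQUIRED_ARTIFACTS if not (root / name).is_file()]
    optional = [name for name in OPTIONAL_ARTIFACTS if (root / name).is_file()]
    return {
        # Illustrative labels only; they are not constants from the repo.
        "mode": "collect_only" if missing else "hybrid",
        "missing_required": missing,
        "optional_present": optional,
    }
```

For example, `inspect_model_dir("data/models")["missing_required"]` lists the files a training run would still need to produce.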
Prefer the role-specific env vars from .env.example:
- BSC_WSS_URL for websocket listener connectivity
- BSC_LOG_HTTP_ENDPOINTS for the listener's get_logs polling pool
- BSC_TRADE_HTTP_RPC for sending transactions
BSC_HTTP_RPC is treated as legacy fallback compatibility, not the preferred path.
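The preference for role-specific variables over the legacy one can be sketched as below. This illustrates the idea only and is not the logic in config/config.py: the function names are hypothetical, the comma-separated format for BSC_LOG_HTTP_ENDPOINTS is an assumption, and so is applying the legacy fallback to the polling pool.

```python
import os
from typing import List, Mapping, Optional


def resolve_trade_rpc(env: Mapping[str, str] = os.environ) -> Optional[str]:
    """Prefer BSC_TRADE_HTTP_RPC, then fall back to legacy BSC_HTTP_RPC."""
    for key in ("BSC_TRADE_HTTP_RPC", "BSC_HTTP_RPC"):
        value = env.get(key, "").strip()
        if value:
            return value
    return None


def resolve_log_endpoints(env: Mapping[str, str] = os.environ) -> List[str]:
    """Return the get_logs polling pool as a list of URLs.

    Assumes a comma-separated value; falls back to BSC_HTTP_RPC if empty.
    """
    raw = env.get("BSC_LOG_HTTP_ENDPOINTS", "")
    endpoints = [part.strip() for part in raw.split(",") if part.strip()]
    if not endpoints:
        legacy = env.get("BSC_HTTP_RPC", "").strip()
        if legacy:
            endpoints = [legacy]
    return endpoints
```

Keeping the trade endpoint and the polling pool separate means a rate-limited log endpoint does not have to delay transaction submission.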
This repository can send real on-chain transactions when trading is enabled. Keep ENABLE_TRADING=false unless the user explicitly asks for real trading changes or validation. Treat PRIVATE_KEY and RPC credentials as sensitive runtime configuration.
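A defensive guard around any code path that submits transactions might look like the sketch below. It is an assumption-laden illustration: the accepted truthy spellings and both function names are hypothetical, and config/trading_config.py may parse ENABLE_TRADING differently.

```python
import os
from typing import Mapping

# Assumed truthy spellings; anything else (including unset) means disabled.
_TRUE_VALUES = {"1", "true", "yes", "on"}


def trading_enabled(env: Mapping[str, str] = os.environ) -> bool:
    """Return True only when ENABLE_TRADING is explicitly set to a true value."""
    return env.get("ENABLE_TRADING", "false").strip().lower() in _TRUE_VALUES


def require_trading_enabled(env: Mapping[str, str] = os.environ) -> None:
    """Raise before any on-chain submission if trading is not enabled."""
    if not trading_enabled(env):
        raise RuntimeError(
            "ENABLE_TRADING is not set to a true value; refusing to send transactions"
        )
```

Defaulting to disabled means a missing or misspelled setting leaves the bot in paper mode rather than sending real transactions.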
Tests are primarily unittest-driven even though filenames follow pytest-style naming. Prefer python -m unittest ... commands when validating targeted changes.
- tools/memectl is intended for Linux/macOS shells.
- systemd/README.md documents service installation for the collector.
- When working on Windows, prefer direct Python entrypoints over memectl unless the user specifically wants service-wrapper changes.
The current core workflow is the collector/dataset/hybrid-bot path described above. Files such as src/core/processor.py and four_meme_buyer.py appear less central to that path; review them carefully before modifying or relying on them.