Welcome to the BTC Dip Detector project! This machine learning repository is built to detect significant dips in Bitcoin prices using historical data, statistical indicators, and classification models. This project is being developed as part of The Knowledge House 2025 Data Science Fellowship.
This diagram shows how DipDetectorML fetches live Bitcoin data, checks for price dips, runs a machine learning forecast, and then delivers alerts and dashboard updates to the user.
Team Name: Wave Riders
Team Members:
- Jessenia (Lead)
- Kachi
- Mohammed
- Rosania
- Belkis
DipDetectorML is a hybrid system that combines:
- Rule-based live alerts: monitors Bitcoin (BTC) in real time and emails users when dips of 2%, 5%, 10%, 15%, or 20% occur.
- Machine learning forecasting: trains a Random Forest model on historical BTC data to predict the probability of a ≥20% monthly dip and sends a monthly warning email if risk is high.
It's designed for long-term Bitcoin accumulators who never sell and only buy "as much and as often as possible." By layering real-time alerts with predictive foresight, DipDetectorML helps users supercharge their Dollar Cost Averaging (DCA) strategy, stacking more BTC at better prices.
Bitcoin is one of the most volatile assets in modern finance:
- Daily or weekly dips of ~10–20% can occur during volatile periods.
- Even in bull markets, ~20–27% corrections are common and are often described as "healthy pullbacks."
- In Feb 2025, BTC saw a ~17% monthly drop.
- Full bear markets can bring 50–80% drawdowns.
Static buy orders in exchange apps are tied to fixed prices, quickly go stale as BTC trends, and provide no foresight.
DipDetectorML is different: alerts are percentage-based (adaptive at any price) and we also provide a predictive monthly risk signal. It's built for accumulators who DCA steadily but size up on deep dips and plan liquidity ahead of downturns.
This project is also a real-world ML case study:
- Data collection/cleaning from crypto APIs
- Feature engineering (returns, volatility, RSI-14, log(volume))
- Model training with class imbalance
- Deployment: live polling, CSV logging, multi-recipient email
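The feature-engineering step listed above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the project's notebook code: the function names (`pct_return`, `rolling_volatility`, `rsi`, `log_volume`) are hypothetical, and the RSI here uses simple averages of gains and losses over the window rather than Wilder smoothing, which the project may or may not use.

```python
import math
import statistics


def pct_return(closes, lag):
    """Percentage return over `lag` days; None until enough history exists."""
    return [
        None if i < lag else closes[i] / closes[i - lag] - 1.0
        for i in range(len(closes))
    ]


def rolling_volatility(closes, window):
    """Sample standard deviation of 1-day returns over the trailing `window` days."""
    daily = pct_return(closes, 1)
    out = []
    for i in range(len(closes)):
        # The first daily return is None, so a full window needs i >= window.
        chunk = daily[i - window + 1 : i + 1] if i >= window else []
        out.append(statistics.stdev(chunk) if len(chunk) == window else None)
    return out


def rsi(closes, period=14):
    """RSI from simple averages of gains and losses (assumption: no Wilder smoothing)."""
    out = [None] * len(closes)
    for i in range(period, len(closes)):
        changes = [closes[j] - closes[j - 1] for j in range(i - period + 1, i + 1)]
        avg_gain = sum(c for c in changes if c > 0) / period
        avg_loss = sum(-c for c in changes if c < 0) / period
        if avg_loss == 0:
            out[i] = 100.0  # no losing days in the window
        else:
            out[i] = 100.0 - 100.0 / (1.0 + avg_gain / avg_loss)
    return out


def log_volume(volumes):
    """Natural log of volume, which compresses the heavy right tail."""
    return [math.log(v) for v in volumes]
```

In a pandas workflow the same quantities would typically come from `pct_change`, `rolling(...).std()`, and `numpy.log`; the plain-Python version is only meant to make each definition explicit.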
End-to-end system for monitoring and predicting BTC dips:
- Live dip alerts
  - Polls CoinGecko /coins/markets every 60s
  - Checks 24h & 7d % changes vs thresholds (2/5/10/15/20)
  - Appends each sample to data/live_btc_log.csv (proof + charts)
  - Emails all subscribers whose preferences match the dip
- Historical training + ML forecast
  - Pulls daily OHLC + volume via /market_chart
  - Features: 1/3/7-day returns, rolling vol (7/14d), RSI-14, log(volume)
  - Trains RandomForestClassifier to predict a ≥20% next-month dip risk
  - Saves model to models/rf_monthly.pkl
  - Runs monthly to send one "high-risk" forecast email
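The threshold check in the live-alert path can be sketched as two pure functions. This is a hedged illustration only: `triggered_thresholds`, `subscribers_to_notify`, and the shape of the `preferences` mapping are hypothetical names chosen for the example, and the assumption is that a subscriber is alerted when a dip on either the 24h or 7d window meets their chosen threshold. The actual polling and email code is not shown here.

```python
THRESHOLDS = (2, 5, 10, 15, 20)  # dip sizes in percent, taken from the list above


def triggered_thresholds(pct_change, thresholds=THRESHOLDS):
    """Return every threshold met by a percent change (-6.3 means a 6.3% drop)."""
    dip = -pct_change  # positive when the price fell
    return [t for t in thresholds if dip >= t]


def subscribers_to_notify(pct_change_24h, pct_change_7d, preferences):
    """Pick subscribers whose preferred threshold was met on either window.

    `preferences` maps an email address to that subscriber's threshold
    (a hypothetical structure used only for this sketch).
    """
    hit = set(triggered_thresholds(pct_change_24h))
    hit |= set(triggered_thresholds(pct_change_7d))
    return sorted(email for email, threshold in preferences.items() if threshold in hit)
```

Keeping the rule separate from the network call makes it easy to unit-test the alert logic without hitting the API.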
Primary users are:
- Retail crypto investors and long-term accumulators.
- Students and analysts learning applied machine learning on financial time-series.
- Educators and instructors reviewing our project for technical rigor.
They want:
- Clear, reliable email alerts.
- Simple visuals and dashboards.
- Actionable insights (probabilities, not just thresholds).
Clone the repo and set up dependencies:
git clone https://github.com/your-username/DipDetectorML.git
cd DipDetectorML
python3 -m venv .venv
source .venv/bin/activate # Mac/Linux
.\.venv\Scripts\activate # Windows
pip install -r requirements.txt
Project structure:
BTC-DipDetector-ML/
│
├── Data/
│   └── bitcoin_cleaned_with_features.csv
│
├── files/
│   ├── dip_detector_model.pkl
│   ├── feature_list.pkl
│   └── monthly_forecast.py
│
├── Images/
│   └── DipDetectorML_Architectural_Flowchart.png
│
├── Notebooks/
│   ├── Belkis.ipynb
│   ├── Jessenia.ipynb
│   ├── ML_Random_Forest_Mohammed.ipynb
│   ├── Onyekachi.ipynb
│   └── Rosania.ipynb
│
├── DipDetectorML_FlowChart.md
├── DipDetectorMLapp.py
├── core.py
├── requirements.txt
├── README.md
└── .env
Live BTC prices: CoinGecko API
Historical CSV: cached from CoinGecko's /market_chart endpoint
Model outputs: /data/ml_monthly_prob.json
We'll build a 20-example test set covering:
Easy: clear 20% dips (label = 1).
Medium: borderline dips (10–19% changes).
Hard: false positives (volatile days that recover).
Out-of-scope: sideways markets with no significant dips.
Correctness is defined by how well the model's predicted probability aligns with the true label (≥0.5 = dip, else no dip), plus evaluation on AUC and precision/recall.
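The correctness criteria above can be made concrete with a small, dependency-free sketch. The function names and the toy labels and probabilities are hypothetical; in practice scikit-learn's `precision_score`, `recall_score`, and `roc_auc_score` would be the usual choice, and this version only spells out what those numbers mean.

```python
def confusion_counts(y_true, y_prob, cutoff=0.5):
    """Count (tp, fp, fn, tn) when probabilities >= cutoff are predicted as dips."""
    tp = fp = fn = tn = 0
    for label, prob in zip(y_true, y_prob):
        pred = 1 if prob >= cutoff else 0
        if pred == 1 and label == 1:
            tp += 1
        elif pred == 1 and label == 0:
            fp += 1
        elif pred == 0 and label == 1:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def precision_recall(y_true, y_prob, cutoff=0.5):
    """Precision and recall at the cutoff; 0.0 when the denominator is empty."""
    tp, fp, fn, _ = confusion_counts(y_true, y_prob, cutoff)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


def roc_auc(y_true, y_prob):
    """AUC as the chance a random dip scores above a random non-dip (ties count half)."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy example (made-up values, not project results).
y_true = [1, 1, 0, 0, 0, 1]
y_prob = [0.9, 0.4, 0.6, 0.2, 0.1, 0.7]
precision, recall = precision_recall(y_true, y_prob)
auc = roc_auc(y_true, y_prob)
```

With only 20 test rows, each metric moves in large steps, so the cutoff and the easy/medium/hard mix are worth reporting alongside the scores.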
Key libraries used in this project:
pandas, numpy: data wrangling
requests: API calls (CoinGecko)
scikit-learn, joblib: ML (Random Forest)
matplotlib, seaborn, plotly: visualization
streamlit: dashboard demo
boto3, python-dotenv: future SES integration
See requirements.txt for full list.
Scikit-learn: Logistic Regression, Random Forest (main ML).
XGBoost / LightGBM: optional boosted-tree experiments.
(Stretch) TensorFlow/Keras for LSTM.
Requests / Pandas: API ingestion and data wrangling.
Matplotlib / Seaborn: visualization.
SMTP integration: alerts/notifications.
The Random Forest classifier was trained on BTC daily OHLC + features like lagged returns, RSI, and volatility.
AUC: 0.74
Precision: 0.68
Recall: 0.55
Confusion Matrix + Classification Report: included in /reports/metrics.txt.
Example test row outputs include:
Date
Engineered features
Predicted probability
Actual next-day return
Label (dip / no dip)
- Live alerts: System polls CoinGecko and triggers emails on threshold breach. (For demo we temporarily set a tiny threshold, e.g., -0.1%, to force an alert and prove delivery.)
- Random Forest model: Trains on cached CSV of historical BTC features; outputs a probability of a ≥20% monthly dip.
- Notebook includes 10–20 test rows with: date, engineered features, predicted probability, actual next-day return, and label (dip/no dip).
- Report AUC and precision/recall at the chosen probability cutoff.
- Include confusion matrix and classification report (Mohammed's code).
- SMTP wiring is in progress (Kachi). Alerts trigger correctly; configuring credentials/routes for reliable inbox delivery is the remaining step.
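This README does not spell out how the "≥20% monthly dip" target is constructed, so the following is only one plausible labeling scheme, stated as an assumption: a day is labeled 1 if the lowest close over the next `horizon` days falls at least 20% below that day's close. The function name `monthly_dip_labels` and the 30-day horizon are hypothetical.

```python
def monthly_dip_labels(closes, horizon=30, dip=0.20):
    """Label each day 1 if any close in the next `horizon` days is >= `dip` below it.

    Days without a full forward window get None so they can be dropped
    before training instead of being silently labeled 0.
    """
    labels = []
    for i, price in enumerate(closes):
        future = closes[i + 1 : i + 1 + horizon]
        if len(future) < horizon:
            labels.append(None)  # not enough future data to decide
        else:
            labels.append(1 if min(future) <= price * (1 - dip) else 0)
    return labels
```

Because the label looks forward in time, any train/test split should be chronological so that the test period's prices never leak into training labels.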
DipDetectorML is a smarter accumulation assistant for long-term holders:
Provides immediate situational awareness (rule-based alerts).
Adds predictive foresight for portfolio risk management (ML monthly forecast).
Educationally, demonstrates both classical ML (Random Forest, Logistic Regression) and time-series approaches (possible LSTM as a stretch goal).
- Better than static buy orders: static limit prices go stale; percent-based alerts adapt at any price level.
- Control + discipline: we notify; you decide sizing. Example rule-set:
  - 5% dip → advance weekly DCA
  - 10% dip → 2× DCA
  - 20% dip → 3× DCA
- Predictive foresight: the monthly crash forecast helps you stage cash ahead of major downturns rather than reacting late.
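The example rule-set above maps naturally onto a small sizing function. This is a sketch, not part of the project: `dca_multiplier` and `buy_amount` are hypothetical names, and "advance weekly DCA" is interpreted here as making one regular-sized buy early (a 1× multiplier), which is an assumption.

```python
def dca_multiplier(dip_pct):
    """Map a dip size (positive percent) to a buy multiplier per the example rules."""
    if dip_pct >= 20:
        return 3.0
    if dip_pct >= 10:
        return 2.0
    if dip_pct >= 5:
        return 1.0  # assumption: "advance weekly DCA" = make the regular buy early
    return 0.0  # below 5%: no extra action, the normal schedule continues


def buy_amount(dip_pct, weekly_budget):
    """Dollar amount to deploy on an alert, given the usual weekly DCA budget."""
    return dca_multiplier(dip_pct) * weekly_budget
```

For instance, with a $100 weekly budget a 12% dip would call for a $200 buy under these rules; the thresholds and multipliers are the user's to tune.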
- Class imbalance: true ≥20% dips are rare; requires careful thresholding and metrics.
- Non-stationarity: crypto regimes change; models can drift.
- API rate limits: CoinGecko free tier is 100k requests/month; poll responsibly.
- Scope: MVP focuses on BTC; multi-asset remains future work.
- Email delivery: SMTP configuration still being finalized.
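A quick back-of-the-envelope check shows how the 60-second polling interval relates to the rate limit mentioned above. The sketch takes the 100k requests/month figure from this README at face value (CoinGecko's current plan limits should be checked before relying on it) and assumes one API call per poll; both function names are hypothetical.

```python
import math

SECONDS_PER_DAY = 24 * 60 * 60


def monthly_requests(poll_interval_s, days=30, calls_per_poll=1):
    """Requests made in a month when polling once every `poll_interval_s` seconds."""
    polls_per_day = SECONDS_PER_DAY // poll_interval_s
    return polls_per_day * days * calls_per_poll


def min_poll_interval(monthly_budget, days=31):
    """Shortest whole-second interval that stays within the budget in a `days`-day month."""
    return math.ceil(SECONDS_PER_DAY * days / monthly_budget)
```

Polling every 60 seconds works out to 1,440 calls per day, or 43,200 over 30 days, which leaves headroom under a 100k budget for the historical `/market_chart` pulls and for retries.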
Address imbalance with class_weight="balanced" in Random Forest/LogReg.
Consider expanding dataset to include multiple coins (ETH, XRP) for training, even if evaluation is BTC-only.
Keep LSTM/GRU as a stretch goal; Random Forest = main deliverable.
Allow flexible thresholds for rule-based alerts to balance alert frequency.
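To make the first recommendation concrete, the sketch below computes the weights that `class_weight="balanced"` implies, using the formula scikit-learn documents for that mode: `n_samples / (n_classes * count_of_class)`. The helper name `balanced_class_weights` and the 95/5 toy split are illustrative assumptions, not project data.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Per-class weights as n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}


# Toy example: 95 "no dip" days and 5 "dip" days (made-up proportions).
weights = balanced_class_weights([0] * 95 + [1] * 5)
```

With a split this skewed, each dip example counts roughly 19 times as much as a no-dip example during training, which is why precision should be watched closely once the weighting is switched on.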
- Backtests: compare DCA-only vs DCA + DipDetectorML over 1–3 years (extra BTC stacked).
- Add ETH, XRP, and others; cross-asset signals.
- Add news/sentiment features (Fed/CPI, ETF flows, exchange events).
- Model upgrades: XGBoost/LightGBM; sequence models (LSTM/GRU).
- Slack bot or mobile push as alternatives to email.
- User preferences: per-user tiers, unsubscribe links, dashboards.
- AWS SES + DynamoDB for production alerts.
- Jessenia (Lead): Slack + GitHub setup, README, standups, documentation, demo coordination, final submission.
- Onyekachi: Built live API polling + dip logic, CSV logging; actively integrating SMTP for email delivery.
- Mohammed: Implemented RandomForestClassifier, engineered features, ran Grid/RandomizedSearchCV, produced AUC/precision/recall, confusion matrix, and saved the model with joblib.
- Rosania: Drafted documentation and background research.
- Belkis: Email integration collaboration.
- Maurice (Mentor): for constructive guidance, review, and unwavering support throughout the project.
- Farukh (Instructor): for technical instruction, ML insight, and feedback.
- Gaurav (Instructor): for teaching, feedback, and continued assistance.
Special thanks also to CoinGecko for providing free API access that made the live alert system possible.
MIT License.
CoinGecko API data subject to their terms of service: https://www.coingecko.com/en/terms


