git clone https://github.com/<your-username>/review-sentiment-classifier.git
cd review-sentiment-classifier
python3 -m venv .venv && source .venv/bin/activate
pip install -r requirements.txt
playwright install chromium

python src/train.py

Monitor live training at http://localhost:6006 by running in a second terminal:
tensorboard --logdir outputs/runs

python src/evaluate.py outputs/model_epoch3.pt

python src/predict.py "The food was incredible, best restaurant I've been to in years!"

python src/scraper.py "https://www.google.com/maps/place/..."

Results are saved to data/google_reviews.csv.

Why DistilBERT? 40% smaller and 60% faster than BERT with only a ~3% accuracy drop — the right trade-off for local training on consumer hardware.
Why MAE alongside accuracy? Star ratings are ordinal: being off by 1 star is fundamentally different from being off by 4. MAE captures the magnitude of each error; accuracy only records whether a prediction was exactly right.
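
To make the difference concrete, here is a minimal sketch in plain Python. The ratings and the two sets of predictions are made-up illustration data, and the helper functions are hypothetical stand-ins, not the metrics code in src/evaluate.py. Both prediction sets get the same two ratings exactly right, so accuracy cannot tell them apart, while MAE can.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true rating exactly."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def mean_absolute_error(y_true, y_pred):
    """Average distance, in stars, between predicted and true rating."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


# Made-up star ratings for five reviews (illustration only).
y_true = [5, 4, 1, 3, 2]
near_miss = [4, 4, 2, 3, 3]  # every wrong prediction is off by 1 star
far_miss = [1, 4, 5, 3, 5]   # wrong predictions are off by 3 or 4 stars

for name, y_pred in [("near_miss", near_miss), ("far_miss", far_miss)]:
    acc = accuracy(y_true, y_pred)
    mae = mean_absolute_error(y_true, y_pred)
    print(f"{name}: accuracy={acc:.1f}, MAE={mae:.1f}")
# near_miss: accuracy=0.4, MAE=0.6
# far_miss: accuracy=0.4, MAE=2.2
```

Reporting both numbers shows how often the model is exactly right and how badly it misses when it is wrong.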
Why 5% of the dataset? Transformer fine-tuning exhibits strong diminishing returns on data volume. Training on 5% of the data produces a model within ~5% accuracy of full-data training at a fraction of the compute cost.
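
One way to draw such a subset is a stratified sample, which keeps the star-rating distribution of the full dataset. The sketch below is an assumption about how this could be done, not a description of what src/train.py does: the function name `stratified_subsample`, the `stars` field, the fixed seed, and the synthetic rows are all hypothetical.

```python
import random
from collections import defaultdict


def stratified_subsample(rows, label_of, fraction=0.05, seed=42):
    """Sample `fraction` of the rows from each label group, reproducibly."""
    rng = random.Random(seed)

    # Group rows by label so each rating keeps its share of the subset.
    by_label = defaultdict(list)
    for row in rows:
        by_label[label_of(row)].append(row)

    sample = []
    for label in sorted(by_label):
        group = by_label[label]
        k = max(1, round(len(group) * fraction))  # keep at least one row per label
        sample.extend(rng.sample(group, k))

    rng.shuffle(sample)  # avoid feeding the model one label at a time
    return sample


# Synthetic stand-in data: 1000 reviews, 200 per star rating.
rows = [{"text": f"review {i}", "stars": i % 5 + 1} for i in range(1000)]
subset = stratified_subsample(rows, label_of=lambda r: r["stars"])
print(len(subset))  # 50
```

Fixing the seed makes the subset reproducible across runs, which keeps comparisons between training runs fair.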
Why MPS? Apple's Metal Performance Shaders provide 3–5x speedup over CPU for transformer workloads on M1/M2 MacBooks — no code changes needed beyond device selection.
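
Device selection in PyTorch can be sketched as follows. This is a hedged illustration rather than the exact code in src/train.py: the helper name `select_device` and the CUDA fallback are assumptions, while `torch.backends.mps.is_available()` and `torch.cuda.is_available()` are the standard PyTorch availability checks.

```python
import torch


def select_device() -> torch.device:
    """Prefer Apple's MPS backend, then CUDA, then fall back to CPU."""
    if torch.backends.mps.is_available():
        return torch.device("mps")
    if torch.cuda.is_available():  # assumed fallback for non-Apple GPUs
        return torch.device("cuda")
    return torch.device("cpu")


device = select_device()

# Tensors (and models, via model.to(device)) are placed on the chosen device.
x = torch.ones(2, 3, device=device)
print(device.type, x.sum().item())
```

The rest of the training loop stays the same as long as the model and every batch are moved to the same `device`.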
- Python 3.9+
- PyTorch 2.0+ (with MPS support for Apple Silicon)
- See requirements.txt for full dependencies