Install the dependencies:

`pip install -r requirements.txt`

Generate the sample data:

`python generate_sample_data.py`

This creates a sample dataset with 1000 patients and a sepsis rate of about 15%.
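To make those numbers concrete, here is a minimal sketch of how a label column with roughly that prevalence could be drawn. It assumes one independent binary label per patient; the helper name `sample_sepsis_labels` and the seed are hypothetical, and `generate_sample_data.py` may build its data differently.

```python
import random

def sample_sepsis_labels(n_patients=1000, sepsis_rate=0.15, seed=42):
    """Draw one binary sepsis label per patient (hypothetical helper)."""
    rng = random.Random(seed)  # fixed seed so repeated runs give the same labels
    return [1 if rng.random() < sepsis_rate else 0 for _ in range(n_patients)]

labels = sample_sepsis_labels()
observed_rate = sum(labels) / len(labels)
print(f"{len(labels)} patients, observed sepsis rate {observed_rate:.1%}")
```

Because the labels are random, the observed rate lands near 15% rather than exactly on it, which is why the dataset is described with an approximate rate.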
Train the model:

`python train.py`

This will:
- Load and preprocess the data
- Create time-series features
- Handle imbalanced data
- Train the model
- Save results
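Two of the steps above, time-series features and imbalance handling, can be sketched in plain Python. This is an illustration under stated assumptions, not the code in `train.py`: the rolling-mean window, the made-up heart-rate readings, and the choice of class weighting (rather than resampling) are all hypothetical.

```python
from statistics import mean

def rolling_mean(values, window=3):
    """Mean of the current reading and up to `window - 1` previous readings."""
    return [mean(values[max(0, i - window + 1): i + 1]) for i in range(len(values))]

def positive_class_weight(labels):
    """Ratio of negatives to positives, a common weight for the rare class."""
    positives = sum(labels)
    negatives = len(labels) - positives
    if positives == 0:
        raise ValueError("no positive examples to weight")
    return negatives / positives

# Made-up hourly heart-rate readings for a single patient.
heart_rate = [80, 84, 90, 110, 120]
features = rolling_mean(heart_rate, window=3)

# One sepsis case among six patients gives the positive class a weight of 5.
weight = positive_class_weight([0, 0, 0, 0, 0, 1])
print(features, weight)
```

The negative-to-positive ratio is the value typically passed to gradient-boosting libraries as a positive-class weight (for example XGBoost's `scale_pos_weight`); whether this project uses weighting or a resampling method depends on the setting in `configs/config.yaml`.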
Evaluate the model:

`python evaluate.py`

Run inference:

`python inference.py --data data/raw/sepsis_data.csv --output results/predictions.csv`

After training, you should see:
- Model saved to `models/sepsis_model.pkl`
- Feature importance plot in `results/feature_importance.png`
- Test metrics in `results/test_metrics.csv`
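A quick way to confirm the three outputs listed above is a small check script. This is a sketch, assuming it is run from the project root; the helper name `missing_outputs` is hypothetical and not part of the project.

```python
from pathlib import Path

# Paths taken from the list of expected outputs above.
EXPECTED_OUTPUTS = [
    "models/sepsis_model.pkl",
    "results/feature_importance.png",
    "results/test_metrics.csv",
]

def missing_outputs(root="."):
    """Return the expected output paths that do not exist under `root`."""
    return [p for p in EXPECTED_OUTPUTS if not (Path(root) / p).is_file()]

if __name__ == "__main__":
    missing = missing_outputs()
    if missing:
        print("Missing outputs:", ", ".join(missing))
    else:
        print("All expected outputs are present.")
```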
Edit configs/config.yaml to:
- Change model algorithm (xgboost, lightgbm, catboost)
- Adjust hyperparameters
- Modify feature engineering settings
- Change imbalanced learning method
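When changing the algorithm, a typo in its name is an easy mistake to make. The sketch below validates a loaded config against the three names listed above. The nested `model.algorithm` and `params` keys are assumptions for illustration; check `configs/config.yaml` for the real key names.

```python
SUPPORTED_ALGORITHMS = {"xgboost", "lightgbm", "catboost"}

def validate_config(config):
    """Check the hypothetical `model.algorithm` key against the supported names."""
    algorithm = config.get("model", {}).get("algorithm")
    if algorithm not in SUPPORTED_ALGORITHMS:
        raise ValueError(
            f"unsupported algorithm {algorithm!r}; "
            f"expected one of {sorted(SUPPORTED_ALGORITHMS)}"
        )
    return algorithm

# Stand-in for the dictionary a YAML loader would return (keys are assumed).
config = {"model": {"algorithm": "lightgbm", "params": {"n_estimators": 200}}}
chosen = validate_config(config)
print(f"Using algorithm: {chosen}")
```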
See README.md for detailed documentation.
Issue: ModuleNotFoundError
- Solution: Make sure you're in the project root directory and dependencies are installed
Issue: FileNotFoundError for data
- Solution: Run `generate_sample_data.py` first
Issue: Memory errors
- Solution: Reduce `n_patients` in `generate_sample_data.py` or reduce `max_features` in the config
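The first two issues can be caught before running anything with a short preflight check. This is a sketch only: the module names passed in are assumptions (use whatever `requirements.txt` actually lists), and the data path is taken from the inference command, which may not be where the generator writes its file.

```python
import importlib.util
from pathlib import Path

def missing_modules(names):
    """Return the top-level module names that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]

def preflight(modules, data_path):
    """Collect human-readable problems for missing modules and a missing data file."""
    problems = [f"module not installed: {m}" for m in missing_modules(modules)]
    if not Path(data_path).is_file():
        problems.append(f"data file not found: {data_path}")
    return problems

# Module list and data path are assumptions; adjust them to the project.
for problem in preflight(["xgboost", "yaml"], "data/raw/sepsis_data.csv"):
    print(problem)
```

An empty result means neither problem was detected. Run it from the project root so the relative data path resolves correctly.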