This project focuses on optimizing cell cryopreservation protocols by leveraging machine learning models to predict cell viability and recovery rates. It includes scripts for data preprocessing, model training (Random Forest and XGBoost), and optimization using both Differential Evolution and Bayesian Optimization techniques.
Other files, such as Beginning.py, convert.py, first_process.py, and second_process.py, are helper scripts for the data processing pipeline.
- Clone the repository:
    git clone https://github.com/Fu-fu-f/Cell-Cryopreservation-Optimization.git
    cd Cell-Cryopreservation-Optimization
- Create a virtual environment (recommended):
    python3 -m venv venv
    source venv/bin/activate
- Install the required dependencies:
    pip install -r requirements.txt
The project involves a sequence of steps from data processing to model training and finally optimization.
The initial data is in Data_raw.xlsx. The following scripts process it sequentially:
- Beginning.py: Performs initial cleaning.
- convert.py: Converts the Excel file to CSV (Data_raw.csv).
- first_process.py: Expands ingredient columns (processed_data.csv).
- second_process.py: Combines similar columns to produce final_data.csv.
You can run them in order, although the final processed data (final_data.csv) is already included in the repository.
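Running the four processing scripts in order can be sketched with a small driver. This is only an illustration, not part of the repository: it assumes each script sits in the repository root and runs without command-line arguments, and the names `PIPELINE`, `build_commands`, and `run_pipeline` are hypothetical.

```python
import subprocess
import sys

# Script order as described in this README. Assumption: every script is in
# the repository root and needs no command-line arguments.
PIPELINE = ["Beginning.py", "convert.py", "first_process.py", "second_process.py"]


def build_commands(python=sys.executable):
    """Return one command (as an argument list) per pipeline step."""
    return [[python, script] for script in PIPELINE]


def run_pipeline(repo_root="."):
    """Run each step from repo_root, stopping at the first failing script."""
    for cmd in build_commands():
        print("Running:", " ".join(cmd))
        # check=True raises CalledProcessError if a script exits non-zero.
        subprocess.run(cmd, check=True, cwd=repo_root)
```

Calling `run_pipeline()` from the repository root would execute the steps one after another. Stopping on the first failure avoids feeding a half-written intermediate file into the next step.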
You can train the models yourself or use the pre-trained models provided.
- To train XGBoost models:
    python XGboost/train_models.py
  This will generate Recovery_model.joblib and Viability_model.joblib inside XGboost/trained_models/.
- To train Random Forest models:
    python "Algorithm random forest/train_rf_models.py"
  This will generate Recovery_model_rf.joblib and Viability_model_rf.joblib inside Algorithm random forest/trained_rf_models/.
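Once the .joblib files exist, they can be located and loaded from Python. The sketch below is a hedged illustration: the file and directory names come from this README, but the helper names (`MODEL_FILES`, `model_paths`, `load_models`) are hypothetical, and it assumes the files were saved with joblib and that joblib is installed. The feature layout the models expect is not described here, so the sketch stops at loading.

```python
from pathlib import Path

# Directories and file names as listed in this README.
MODEL_FILES = {
    "xgboost": (
        "XGboost/trained_models",
        ["Recovery_model.joblib", "Viability_model.joblib"],
    ),
    "random_forest": (
        "Algorithm random forest/trained_rf_models",
        ["Recovery_model_rf.joblib", "Viability_model_rf.joblib"],
    ),
}


def model_paths(kind, repo_root="."):
    """Return the expected model file paths for one algorithm."""
    folder, names = MODEL_FILES[kind]
    return [Path(repo_root) / folder / name for name in names]


def load_models(kind, repo_root="."):
    """Load both models for one algorithm, keyed by file stem."""
    import joblib  # imported lazily so model_paths works without joblib

    return {path.stem: joblib.load(path) for path in model_paths(kind, repo_root)}
```

For example, `load_models("xgboost")` run from the repository root would return a dictionary with the keys `Recovery_model` and `Viability_model`.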
Once the models are trained (or using the provided ones), you can run the optimization scripts. These scripts are interactive and will prompt you for input.
- Differential Evolution with XGBoost models:
    python XGboost/differential_evolution_optimization.py
- Differential Evolution with Random Forest models:
    python "Algorithm random forest/random_forest.py"
- Bayesian Optimization with XGBoost models:
    python xgboost_bayesian/xgboost_bayesian_optimization.py
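To give a feel for what the Differential Evolution scripts do, here is a minimal, self-contained sketch of the basic DE/rand/1/bin scheme. It is not the repository's code, which may well rely on a library implementation. The objective `toy_viability` is a hypothetical stand-in for a trained model's prediction, and the two inputs, their bounds, and all parameter values are assumptions chosen for illustration.

```python
import random


def toy_viability(x):
    """Hypothetical stand-in for a model prediction; peaks at (0.3, 0.7)."""
    return 100.0 - 200.0 * ((x[0] - 0.3) ** 2 + (x[1] - 0.7) ** 2)


def differential_evolution(score, bounds, pop_size=20, generations=100,
                           f=0.8, cr=0.9, seed=0):
    """Maximize score(x) within bounds using a basic DE/rand/1/bin scheme."""
    rng = random.Random(seed)
    dim = len(bounds)
    # Start from a population sampled uniformly inside the bounds.
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    fitness = [score(ind) for ind in pop]
    for _ in range(generations):
        for i in range(pop_size):
            # Pick three distinct individuals other than the current one.
            a, b, c = rng.sample([j for j in range(pop_size) if j != i], 3)
            forced = rng.randrange(dim)  # at least one gene comes from the mutant
            trial = []
            for d, (lo, hi) in enumerate(bounds):
                if d == forced or rng.random() < cr:
                    value = pop[a][d] + f * (pop[b][d] - pop[c][d])
                    value = min(max(value, lo), hi)  # clip to the bounds
                else:
                    value = pop[i][d]
                trial.append(value)
            trial_fit = score(trial)
            # Greedy selection: keep the trial only if it is at least as good.
            if trial_fit >= fitness[i]:
                pop[i], fitness[i] = trial, trial_fit
    best = max(range(pop_size), key=lambda i: fitness[i])
    return pop[best], fitness[best]


best_x, best_score = differential_evolution(toy_viability, [(0.0, 1.0), (0.0, 1.0)])
print("best inputs:", best_x, "predicted score:", best_score)
```

In the real workflow, the toy objective would be replaced by a call to a trained model's prediction, and the bounds would reflect the allowed range of each protocol variable.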