MohammadrezaChv/Cancer-Cell-Classification-with-Optimization

Cancer Cell Classification with Optimization

Overview

This project implements a machine learning pipeline that classifies cancer cell samples as malignant or benign. It compares a baseline classifier trained on all features with two metaheuristic feature-selection methods, Genetic Algorithm (GA) and Particle Swarm Optimization (PSO), and reports how each affects classifier performance. The dataset contains several hundred human cell samples, each described by the nine features listed under Dataset.

Features

  • Baseline Classification: Implements standard classification techniques without optimization.
  • Feature Selection via GA and PSO: Reduces dimensionality and enhances accuracy.
  • Performance Comparison: Evaluates the results before and after optimization.

Dataset

The dataset consists of records of human cell samples, with each record containing the following features:

  • Clump Thickness
  • Uniformity of Cell Size
  • Uniformity of Cell Shape
  • Marginal Adhesion
  • Single Epithelial Cell Size
  • Bare Nuclei
  • Bland Chromatin
  • Normal Nucleoli
  • Mitoses

The target variable indicates whether the sample is malignant or benign.
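
As a rough illustration of how records with these nine features might be prepared for modeling, the sketch below splits rows into training and testing sets and applies min-max normalization fitted on the training rows only. The sample values, the 0/1 label encoding, the 25% test fraction, and the helper names (`train_test_split`, `min_max_fit`, `min_max_apply`) are assumptions made for this example; the notebook may clean, encode, and scale the data differently.

```python
import random

# Feature names as listed above. Everything else in this sketch is hypothetical.
FEATURES = [
    "Clump Thickness", "Uniformity of Cell Size", "Uniformity of Cell Shape",
    "Marginal Adhesion", "Single Epithelial Cell Size", "Bare Nuclei",
    "Bland Chromatin", "Normal Nucleoli", "Mitoses",
]

def train_test_split(rows, labels, test_fraction=0.25, seed=42):
    """Shuffle row indices and split them into training and testing parts."""
    order = list(range(len(rows)))
    random.Random(seed).shuffle(order)
    n_test = int(len(rows) * test_fraction)

    def pick(indices):
        return [rows[i] for i in indices], [labels[i] for i in indices]

    return pick(order[n_test:]), pick(order[:n_test])

def min_max_fit(rows):
    """Per-feature minimum and maximum, computed on the training rows only."""
    columns = list(zip(*rows))
    return [min(c) for c in columns], [max(c) for c in columns]

def min_max_apply(rows, lows, highs):
    """Rescale with the training range; constant columns map to 0.0."""
    return [
        [(v - lo) / (hi - lo) if hi > lo else 0.0 for v, lo, hi in zip(row, lows, highs)]
        for row in rows
    ]

# Made-up records for illustration only (not taken from the real dataset).
rng = random.Random(0)
rows = [[rng.randint(1, 10) for _ in FEATURES] for _ in range(20)]
labels = [rng.randint(0, 1) for _ in rows]  # assumed encoding: 1 = malignant, 0 = benign

(train_X, train_y), (test_X, test_y) = train_test_split(rows, labels)
lows, highs = min_max_fit(train_X)
train_X = min_max_apply(train_X, lows, highs)
test_X = min_max_apply(test_X, lows, highs)
print(len(train_X), "training rows,", len(test_X), "testing rows")
```

Fitting the scaling range on the training rows alone keeps information from the test set out of preprocessing. As a consequence, scaled test values can fall slightly outside the 0 to 1 range.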

How It Works

  1. Data Preprocessing: The dataset is cleaned, normalized, and split into training and testing sets.
  2. Baseline Model: A classifier is trained on the full feature set without optimization.
  3. Optimization Methods:
    • Genetic Algorithm: Simulates natural selection to identify the most relevant features.
    • Particle Swarm Optimization: Mimics social behavior of swarms to find optimal feature subsets.
  4. Evaluation: Models are evaluated on metrics such as accuracy, precision, recall, and F1 score.
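
The two optimization methods in step 3 can be sketched with the standard library alone. Both search over binary masks with one bit per feature, where 1 means the feature is kept. This is a minimal illustration and not the notebook's implementation: the toy data, the nearest-centroid classifier used as the fitness function, and all hyperparameters (population size, mutation rate, inertia `w`, coefficients `c1` and `c2`) are assumptions chosen for the example.

```python
import math
import random

def make_toy_data(n_rows, n_features=9, seed=0):
    """Hypothetical data: features 0-2 separate the classes, the rest are noise."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n_rows):
        label = rng.randint(0, 1)
        X.append([rng.gauss(2.0 * label if j < 3 else 0.0, 1.0) for j in range(n_features)])
        y.append(label)
    return X, y

def make_fitness(train, test):
    """Return fitness(mask): hold-out accuracy of a nearest-centroid classifier."""
    (Xtr, ytr), (Xte, yte) = train, test
    def fitness(mask):
        idx = [j for j, bit in enumerate(mask) if bit]
        if not idx:
            return 0.0  # an empty feature subset cannot classify anything
        centroids = {}
        for c in (0, 1):
            members = [x for x, label in zip(Xtr, ytr) if label == c]
            centroids[c] = [sum(x[j] for x in members) / len(members) for j in idx]
        hits = 0
        for x, label in zip(Xte, yte):
            dist = {c: sum((x[j] - m) ** 2 for j, m in zip(idx, mu))
                    for c, mu in centroids.items()}
            hits += min(dist, key=dist.get) == label
        return hits / len(yte)
    return fitness

def genetic_algorithm(fit, n_features, pop_size=20, generations=15, mutation_rate=0.1, seed=1):
    """Tournament selection, one-point crossover, bit-flip mutation, elitism."""
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(n_features)] for _ in range(pop_size)]
    def tournament():
        a, b = rng.sample(pop, 2)
        return a if fit(a) >= fit(b) else b
    best = max(pop, key=fit)
    for _ in range(generations):
        children = [best[:]]  # elitism: carry the best mask over unchanged
        while len(children) < pop_size:
            p1, p2 = tournament(), tournament()
            cut = rng.randrange(1, n_features)
            child = p1[:cut] + p2[cut:]
            children.append([1 - g if rng.random() < mutation_rate else g for g in child])
        pop = children
        best = max(pop, key=fit)
    return best

def binary_pso(fit, n_features, n_particles=15, iterations=20, w=0.7, c1=1.5, c2=1.5, seed=2):
    """Binary PSO: a sigmoid of the velocity gives the probability that a bit is 1."""
    rng = random.Random(seed)
    pos = [[rng.randint(0, 1) for _ in range(n_features)] for _ in range(n_particles)]
    vel = [[0.0] * n_features for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_fit = [fit(p) for p in pos]
    g = max(range(n_particles), key=pbest_fit.__getitem__)
    gbest, gbest_fit = pbest[g][:], pbest_fit[g]
    for _ in range(iterations):
        for i in range(n_particles):
            for j in range(n_features):
                v = (w * vel[i][j]
                     + c1 * rng.random() * (pbest[i][j] - pos[i][j])
                     + c2 * rng.random() * (gbest[j] - pos[i][j]))
                vel[i][j] = max(-4.0, min(4.0, v))  # clamp to keep the sigmoid responsive
                pos[i][j] = 1 if rng.random() < 1 / (1 + math.exp(-vel[i][j])) else 0
            f = fit(pos[i])
            if f > pbest_fit[i]:
                pbest[i], pbest_fit[i] = pos[i][:], f
            if f > gbest_fit:
                gbest, gbest_fit = pos[i][:], f
    return gbest

train, test = make_toy_data(150, seed=0), make_toy_data(50, seed=1)
fitness = make_fitness(train, test)
ga_mask = genetic_algorithm(fitness, n_features=9)
pso_mask = binary_pso(fitness, n_features=9)
print("all features:", fitness([1] * 9))
print("GA mask :", ga_mask, fitness(ga_mask))
print("PSO mask:", pso_mask, fitness(pso_mask))
```

For brevity, the sketch scores each mask on the same hold-out split it later reports, which tends to give optimistic numbers. A separate validation split or cross-validation inside the fitness function would give a fairer estimate.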

Results

The project demonstrates:

  • Baseline Performance: The initial classifier achieved an accuracy of approximately 85.6% with all features.
  • Genetic Algorithm Optimization: After feature selection via GA, accuracy improved to 91.2%.
  • Particle Swarm Optimization: PSO further refined the feature selection, achieving a final accuracy of 93.4%.
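
| Method              | Accuracy | Precision | Recall | F1 Score |
|---------------------|----------|-----------|--------|----------|
| Baseline (No Opt.)  | 85.6%    | 86.0%     | 84.5%  | 85.2%    |
| Genetic Algorithm   | 91.2%    | 91.5%     | 90.0%  | 90.7%    |
| Particle Swarm Opt. | 93.4%    | 93.8%     | 92.5%  | 93.1%    |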

Relative to the baseline, accuracy rises by 5.6 percentage points with GA feature selection and by 7.8 points with PSO, and precision, recall, and F1 score improve by similar margins.
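
For reference, the four reported metrics follow from confusion-matrix counts, and F1 is the harmonic mean of precision and recall. The sketch below shows the formulas and checks that the F1 values in the table above agree with the reported precision and recall. The confusion-matrix counts in the first call are hypothetical and are not results from this project.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0

def classification_metrics(tp, fp, fn, tn):
    """Accuracy, precision, recall, and F1 with malignant as the positive class."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1_score(precision, recall),
    }

# Hypothetical counts, only to show the arithmetic.
example = classification_metrics(tp=40, fp=5, fn=10, tn=45)
print(example)

# Precision, recall, and F1 (in percent) as reported in the results table.
reported = {
    "Baseline (No Opt.)": (86.0, 84.5, 85.2),
    "Genetic Algorithm": (91.5, 90.0, 90.7),
    "Particle Swarm Opt.": (93.8, 92.5, 93.1),
}
recomputed = {name: round(f1_score(p, r), 1) for name, (p, r, _) in reported.items()}
for name, (_, _, f1) in reported.items():
    print(f"{name}: reported F1 {f1}, recomputed {recomputed[name]}")
```

Because the formula is scale-invariant, passing percentages returns a percentage, and the recomputed values match the table to one decimal place.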

Technologies Used

  • Programming Language: Python
  • Libraries:
    • Scikit-learn
    • NumPy
    • Pandas
    • Matplotlib
    • Optimization Libraries for GA and PSO

Installation

  1. Clone the repository:
    git clone https://github.com/MohammadrezaChv/Cancer-Cell-Classification-with-Optimization.git
  2. Navigate to the project directory:
    cd Cancer-Cell-Classification-with-Optimization
  3. Install dependencies:
    pip install -r requirements.txt

Usage

  1. Open the Jupyter notebook:
    jupyter notebook Cell_Classifier_Enhanced_With_PSO.ipynb
  2. Run the cells sequentially to:
    • Preprocess the dataset.
    • Train the baseline model.
    • Apply optimization methods.
    • Compare the results.

Contributing

Contributions are welcome! If you have ideas for improving the implementation or extending the project, feel free to fork the repository and submit a pull request.

License

This project is licensed under the MIT License.

Acknowledgments

  • The dataset used in this project.
  • Libraries and tools that made this project possible.

Feel free to reach out with questions or suggestions!

About

Classifying cancer cell samples by feature selection using metaheuristic algorithms
