This project demonstrates a feature selection technique using a binary simulated annealing algorithm. The goal is to identify the most relevant features in the "Heart Failure Prediction" dataset to predict heart disease effectively. The project uses a Support Vector Machine (SVM) as the classifier to evaluate the selected features.
- `Notebook/`: This directory contains the core Python source code and the Jupyter notebook for experimentation.
  - `FS.py`: Contains functions for feature selection and evaluation.
  - `SA.py`: Implements the binary simulated annealing algorithm.
  - `testing.ipynb`: A Jupyter notebook that demonstrates the workflow of the project, from data loading and preprocessing to feature selection and evaluation.
- `requirements.txt`: A list of Python dependencies required to run the project.
- `README.md`: This file, providing an overview and instructions.
- Python 3.x
- pip (Python package installer)
To run this project, install the required Python libraries using pip and the `requirements.txt` file:

```bash
pip install -r requirements.txt
```

This project uses the "Heart Failure Prediction" dataset, which should be named `heart.csv` and placed in the root directory of the project. The dataset is not included in this repository, but it can be obtained from sources like Kaggle.
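A quick, optional way to confirm the file is in place is to load it with pandas. This is only a minimal sketch; the exact columns depend on the dataset version you download:

```python
import pandas as pd

# Load the dataset from the project root and take a quick look at it.
df = pd.read_csv("heart.csv")
print(df.shape)
print(df.columns.tolist())
print(df.head())
```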
The primary workflow is demonstrated in the `Notebook/testing.ipynb` Jupyter notebook. To run the project, follow these steps:
- Launch Jupyter Notebook:

  ```bash
  jupyter notebook
  ```

- Open and run the notebook:
  - Navigate to `Notebook/testing.ipynb`.
  - Run the cells in the notebook to see the feature selection process in action.
The notebook will:
- Load and preprocess the `heart.csv` dataset.
- Use the `binary_simulated_annealing` function from `SA.py` to find the best subset of features.
- Evaluate the performance of the selected features using an SVM classifier (a standalone sketch of this evaluation step follows this list).
- Print the best feature combination and the corresponding accuracy score.
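For orientation, here is a minimal, self-contained sketch of that evaluation step: scoring one candidate feature subset with an SVM via cross-validation. It is not the project's code from `FS.py`; the target column name (`HeartDisease`) and the one-hot encoding of categorical columns are assumptions about the dataset version used.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

df = pd.read_csv("heart.csv")
# Target column name is assumed; one-hot encode any categorical features.
X = pd.get_dummies(df.drop(columns=["HeartDisease"]), dtype=float)
y = df["HeartDisease"]

# A candidate solution: one bit per feature column, 1 = selected, 0 = dropped.
mask = np.random.default_rng(0).integers(0, 2, size=X.shape[1]).astype(bool)

# Score the selected subset with a scaled SVM and 5-fold cross-validation.
model = make_pipeline(StandardScaler(), SVC())
accuracy = cross_val_score(model, X.to_numpy()[:, mask], y, cv=5).mean()
print(f"{mask.sum()} features selected, accuracy = {accuracy:.3f}, error = {1 - accuracy:.3f}")
```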
The project employs a binary simulated annealing algorithm to explore the feature space. Each feature is represented by a bit in a binary array (solution), where 1 means the feature is selected, and 0 means it is not. The algorithm iteratively generates new candidate solutions by flipping bits and evaluates them based on the performance of an SVM classifier. The goal is to find the combination of features that minimizes the classifier's error rate (1 - accuracy).
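The core loop can be pictured with the sketch below. It is a generic illustration under stated assumptions, not the exact code in `SA.py`: the function name `binary_simulated_annealing` comes from the notebook, but the signature, the parameters (`n_iter`, `t0`, `cooling`), the geometric cooling schedule, and the cross-validated SVM objective are illustrative choices.

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

def error_rate(solution, X, y):
    """Objective to minimise: 1 - cross-validated SVM accuracy on the
    features whose bits are set in `solution` (a 0/1 NumPy array)."""
    if solution.sum() == 0:
        return 1.0  # an empty feature set is the worst possible solution
    model = make_pipeline(StandardScaler(), SVC())
    acc = cross_val_score(model, X[:, solution.astype(bool)], y, cv=5).mean()
    return 1.0 - acc

def binary_simulated_annealing(X, y, n_iter=200, t0=1.0, cooling=0.95, seed=0):
    rng = np.random.default_rng(seed)
    n_features = X.shape[1]

    # Start from a random binary mask over the features.
    current = rng.integers(0, 2, size=n_features)
    current_cost = error_rate(current, X, y)
    best, best_cost = current.copy(), current_cost
    temperature = t0

    for _ in range(n_iter):
        # Propose a neighbour by flipping one randomly chosen bit.
        candidate = current.copy()
        flip = rng.integers(n_features)
        candidate[flip] = 1 - candidate[flip]
        candidate_cost = error_rate(candidate, X, y)

        # Always accept improvements; accept worse candidates with a
        # probability that shrinks as the temperature cools (Metropolis rule).
        delta = candidate_cost - current_cost
        if delta < 0 or rng.random() < np.exp(-delta / temperature):
            current, current_cost = candidate, candidate_cost

        if current_cost < best_cost:
            best, best_cost = current.copy(), current_cost

        temperature *= cooling  # geometric cooling schedule

    return best, 1.0 - best_cost  # best bit mask and its accuracy
```

With `X` as a NumPy feature matrix and `y` the labels, a call like `best_mask, best_acc = binary_simulated_annealing(X, y)` returns the selected bit mask and its cross-validated accuracy; the notebook then prints the corresponding feature combination and accuracy score.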