Heart Sounds Segmentation

This project implements an automated heart sound segmentation system using deep learning. Originally developed in MATLAB, it has been reimplemented in PyTorch and PyTorch Lightning with the same neural network architecture, so that the two versions can be compared directly. Signal processing relies on the Fourier Synchrosqueezed Transform (FSST), implemented with MATLAB-generated C++ code.

Dataset

The project uses the DavidSpringerHSS dataset, adapted for use with PyTorch data loaders. The dataset consists of CSV files containing heart sound recordings and their corresponding segmentation labels.

The labeling scheme is as follows:

1 -> Sound 1 (S1)
2 -> Systolic interval
3 -> Sound 2 (S2)
4 -> Diastolic interval

These labels represent the four key components of the cardiac cycle that the model aims to identify.
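
As an illustration of the labeling scheme above, here is a minimal sketch in plain Python. The names `LABEL_NAMES`, `to_class_index` and `describe_segments` are hypothetical and not taken from the project; the shift to 0-based indices is an assumption about what a typical cross-entropy loss expects.

```python
# Label codes as listed above. The helper names are hypothetical.
LABEL_NAMES = {
    1: "S1",
    2: "Systolic interval",
    3: "S2",
    4: "Diastolic interval",
}


def to_class_index(label: int) -> int:
    """Map a dataset label (1..4) to a 0-based class index (0..3)."""
    if label not in LABEL_NAMES:
        raise ValueError(f"unknown label: {label}")
    return label - 1


def describe_segments(labels):
    """Collapse a per-sample label sequence into (name, run_length) segments."""
    segments = []
    for label in labels:
        name = LABEL_NAMES[label]
        if segments and segments[-1][0] == name:
            # Same state as the previous sample: extend the current run.
            segments[-1] = (name, segments[-1][1] + 1)
        else:
            segments.append((name, 1))
    return segments
```

Calling `describe_segments` on a label sequence gives a compact view of one cardiac cycle, which can help when checking a CSV file by eye.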

Training

The training pipeline splits the data into three subsets: training, validation, and testing. The model is optimized using the Adam optimizer with a learning rate that decays by 10% after each epoch. Training is performed with a batch size of 50.
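
The learning rate decay described above amounts to multiplying the rate by 0.9 once per epoch. The sketch below shows the resulting schedule in plain Python; the function name `lr_schedule` and the initial rate of 0.01 are assumptions chosen for illustration, since the initial value is not stated here. In PyTorch the same decay could be expressed with `torch.optim.lr_scheduler.ExponentialLR(optimizer, gamma=0.9)`, though whether the project does so is not stated.

```python
def lr_schedule(initial_lr: float, num_epochs: int, decay: float = 0.9):
    """Return the learning rate used in each epoch.

    A 10% decrease per epoch corresponds to decay = 0.9.
    """
    return [initial_lr * decay ** epoch for epoch in range(num_epochs)]


# Hypothetical starting rate, for illustration only.
for epoch, lr in enumerate(lr_schedule(0.01, 5)):
    print(f"epoch {epoch}: lr = {lr:.6f}")
```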

To limit overfitting and keep training stable, the implementation uses the following techniques:

Early stopping to halt training when validation performance plateaus
Gradient clipping to stabilize training
Learning rate scheduling to aid convergence
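
The early stopping and gradient clipping items above can be sketched without any framework. This is only an illustration of the two ideas: the class `EarlyStopping`, the function `clip_by_norm`, and the `patience`, `min_delta` and `max_norm` values are hypothetical and do not reflect the project's actual settings.

```python
import math


class EarlyStopping:
    """Signal a stop once validation loss has not improved for `patience` epochs."""

    def __init__(self, patience: int = 3, min_delta: float = 0.0):
        self.patience = patience
        self.min_delta = min_delta
        self.best = math.inf
        self.bad_epochs = 0

    def step(self, val_loss: float) -> bool:
        """Record one epoch's validation loss; return True when training should stop."""
        if val_loss < self.best - self.min_delta:
            self.best = val_loss
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience


def clip_by_norm(grads, max_norm: float):
    """Rescale a gradient vector so that its L2 norm does not exceed max_norm."""
    total = math.sqrt(sum(g * g for g in grads))
    if total <= max_norm or total == 0.0:
        return list(grads)
    scale = max_norm / total
    return [g * scale for g in grads]
```

In a PyTorch Lightning setup, both behaviours are usually configured on the trainer rather than written by hand; the sketch only shows what they compute.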

Evaluation

Model performance is evaluated using 10-fold cross-validation. The following metrics are tracked:

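Accuracy: Fraction of samples classified correctly
Precision: Fraction of samples predicted as a class that actually belong to it
Recall: Fraction of samples belonging to a class that are predicted as that class
F1 Score: Harmonic mean of precision and recall
Area under the ROC curve (AUROC): How well the predicted class probabilities separate a class from the others across all decision thresholds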
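
To make the precision, recall and F1 definitions above concrete, the following plain-Python sketch computes them per class from two label sequences. The function name `per_class_metrics` is hypothetical, and the project presumably relies on a metrics library instead; this is a one-vs-rest illustration only.

```python
def per_class_metrics(y_true, y_pred, num_classes: int = 4):
    """Compute one-vs-rest precision, recall and F1 for labels 1..num_classes."""
    metrics = {}
    pairs = list(zip(y_true, y_pred))
    for c in range(1, num_classes + 1):
        tp = sum(1 for t, p in pairs if t == c and p == c)  # true positives
        fp = sum(1 for t, p in pairs if t != c and p == c)  # false positives
        fn = sum(1 for t, p in pairs if t == c and p != c)  # false negatives
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        metrics[c] = {"precision": precision, "recall": recall, "f1": f1}
    return metrics
```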

All models were trained using torch.float32 precision.
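
The results are reported as mean ± std over the 10 cross-validation folds. A minimal sketch of how such folds and summaries can be produced is shown here. The helper names are hypothetical, the split is unshuffled for simplicity, and the use of the sample standard deviation is an assumption, since the README does not say which variant is reported.

```python
import statistics


def k_fold_indices(n_samples: int, k: int = 10):
    """Yield (train_idx, test_idx) pairs; fold sizes differ by at most one."""
    indices = list(range(n_samples))
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, test
        start += size


def summarize(scores):
    """Return (mean, sample standard deviation) of per-fold scores."""
    return statistics.mean(scores), statistics.stdev(scores)
```

Each fold serves once as the held-out set, so every recording contributes to exactly one test score.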

LSTM + CrossEntropy

The baseline model uses a bidirectional LSTM with CrossEntropy loss:

Class	Accuracy (mean ± std)	Precision (mean ± std)	Recall (mean ± std)	F1 (mean ± std)	AUROC (mean ± std)
S1	0.8966 ± 0.0148	0.8812 ± 0.0171	0.8966 ± 0.0148	0.8887 ± 0.0117	0.9908 ± 0.0019
Sys. int	0.9226 ± 0.0089	0.9252 ± 0.0136	0.9226 ± 0.0089	0.9238 ± 0.0103	0.9937 ± 0.0020
S2	0.8891 ± 0.0141	0.8920 ± 0.0107	0.8891 ± 0.0141	0.8905 ± 0.0119	0.9934 ± 0.0017
Dias. int	0.9585 ± 0.0078	0.9623 ± 0.0059	0.9585 ± 0.0078	0.9604 ± 0.0055	0.9939 ± 0.0018
Average	0.9167 ± 0.0114	0.9152 ± 0.0118	0.9167 ± 0.0114	0.9159 ± 0.0099	0.9930 ± 0.0019

LSTM + CRF

Adding a Conditional Random Field (CRF) layer on top of the LSTM improves sequence modeling by learning transition probabilities between cardiac states. The CRF enforces valid state transitions (S1 → Systole → S2 → Diastole → S1) during both training and inference:

Class	Accuracy (mean ± std)	Precision (mean ± std)	Recall (mean ± std)	F1 (mean ± std)	AUROC (mean ± std)
S1	0.9239 ± 0.0115	0.9191 ± 0.0131	0.9239 ± 0.0115	0.9214 ± 0.0088	0.9949 ± 0.0010
Sys. int	0.9399 ± 0.0126	0.9469 ± 0.0101	0.9399 ± 0.0126	0.9433 ± 0.0076	0.9958 ± 0.0011
S2	0.9106 ± 0.0166	0.9154 ± 0.0073	0.9106 ± 0.0166	0.9128 ± 0.0073	0.9955 ± 0.0007
Dias. int	0.9715 ± 0.0053	0.9691 ± 0.0083	0.9715 ± 0.0053	0.9702 ± 0.0040	0.9959 ± 0.0007
Average	0.9365 ± 0.0115	0.9376 ± 0.0097	0.9365 ± 0.0115	0.9369 ± 0.0069	0.9955 ± 0.0008
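
The effect of the transition constraint (S1 → Systole → S2 → Diastole → S1) on inference can be illustrated with a small Viterbi decoder. This is a simplified sketch, not the project's CRF: it uses a hard mask in which a state may either persist or advance to the next state, with no learned transition scores, and all names are hypothetical.

```python
import math

STATES = ["S1", "Systole", "S2", "Diastole"]


def allowed(prev: int, curr: int) -> bool:
    """A state may persist or advance to the next state in the cardiac cycle."""
    return curr == prev or curr == (prev + 1) % len(STATES)


def viterbi(emissions):
    """Return the highest-scoring valid state path.

    emissions[t][s] is a log-score for state s at frame t.
    """
    n = len(STATES)
    score = list(emissions[0])
    backpointers = []
    for frame in emissions[1:]:
        new_score, pointers = [], []
        for curr in range(n):
            # Forbidden transitions get a score of -inf and can never win.
            candidates = [
                (score[prev] if allowed(prev, curr) else -math.inf, prev)
                for prev in range(n)
            ]
            best, arg = max(candidates)
            new_score.append(best + frame[curr])
            pointers.append(arg)
        score = new_score
        backpointers.append(pointers)
    # Trace back from the best final state.
    state = max(range(n), key=lambda s: score[s])
    path = [state]
    for pointers in reversed(backpointers):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return [STATES[s] for s in path]
```

With frame-wise decoding, a single noisy frame can produce an impossible jump such as S1 directly to S2. Under the mask, such a frame is absorbed into a neighbouring valid state instead.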

Note

AUROC is computed using marginal probabilities from the forward-backward algorithm, which properly incorporates the learned transition constraints. The CRF model outperforms the baseline on every metric and for every class; for example, the average F1 score rises from 0.9159 to 0.9369.
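
For readers unfamiliar with the forward-backward algorithm mentioned in this note, here is a minimal log-space sketch for a linear-chain CRF. It is an illustration under assumptions rather than the project's code: `emissions` and `transitions` are hypothetical log-score inputs, and no start or end transition scores are modelled.

```python
import math


def logsumexp(values):
    """Numerically stable log(sum(exp(v))) that tolerates -inf entries."""
    m = max(values)
    if m == -math.inf:
        return -math.inf
    return m + math.log(sum(math.exp(v - m) for v in values))


def marginals(emissions, transitions):
    """Return per-frame posterior state probabilities.

    emissions[t][s] and transitions[i][j] are log-scores; a forbidden
    transition can be encoded as float("-inf").
    """
    T, n = len(emissions), len(emissions[0])
    # Forward pass: alpha[t][j] scores all paths ending in state j at frame t.
    alpha = [list(emissions[0])]
    for t in range(1, T):
        alpha.append([
            emissions[t][j]
            + logsumexp([alpha[t - 1][i] + transitions[i][j] for i in range(n)])
            for j in range(n)
        ])
    # Backward pass: beta[t][i] scores all continuations after state i at frame t.
    beta = [[0.0] * n for _ in range(T)]
    for t in range(T - 2, -1, -1):
        beta[t] = [
            logsumexp([
                transitions[i][j] + emissions[t + 1][j] + beta[t + 1][j]
                for j in range(n)
            ])
            for i in range(n)
        ]
    log_z = logsumexp(alpha[-1])  # log partition function
    return [
        [math.exp(alpha[t][s] + beta[t][s] - log_z) for s in range(n)]
        for t in range(T)
    ]
```

Each row of the result sums to 1, so the rows can be used as per-frame class probabilities when computing a threshold-based metric such as AUROC.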

Usage

To run the example yourself, first install pixi (see pixi.sh). Then run:

pixi install

Once the dependencies have been downloaded, run the training and evaluation:

pixi run python main.py
