This repository contains a Python pipeline for extracting ECG-derived features from clinical electrophysiological recordings. The code was developed for processing ECG signals stored in EDF or FIF files and extracting both heart rate variability (HRV) and ECG morphology features.
The pipeline uses MNE-Python for reading electrophysiological recordings and NeuroKit2 for ECG cleaning, R-peak detection, HRV analysis, and ECG waveform delineation.
The script performs the following steps:
- Loads EDF/FIF files using MNE.
- Detects the ECG/EKG channel automatically.
- Resamples the ECG signal to a target sampling frequency.
- Cleans the ECG signal using NeuroKit2.
- Detects R-peaks.
- Removes suspicious R-peaks based on RR interval outliers.
- Extracts HRV features.
- Extracts ECG morphology features.
- Saves extracted features as JSON files.
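The steps above can be sketched roughly as follows. This is a minimal illustration, not the repository's actual code: the function names `find_ecg_channel`, `drop_suspicious_peaks`, and `process_recording` are hypothetical, and the median-based RR rule with its 30% tolerance is an assumed criterion, since the exact outlier rule is not stated here. The MNE and NeuroKit2 calls are imported inside `process_recording` so the two helper functions can be tried without those packages installed.

```python
import statistics


def find_ecg_channel(ch_names):
    """Return the first channel whose name contains 'ECG' or 'EKG', else None."""
    for name in ch_names:
        if any(tag in name.upper() for tag in ("ECG", "EKG")):
            return name
    return None


def drop_suspicious_peaks(peaks, fs, tolerance=0.3):
    """Drop R-peaks whose preceding RR interval is far from the median RR.

    Assumed rule (not necessarily the script's): an interval is suspicious
    when it differs from the median RR by more than `tolerance` (a fraction).
    """
    if len(peaks) < 3:
        return list(peaks)
    rr = [(b - a) / fs for a, b in zip(peaks, peaks[1:])]  # seconds
    median_rr = statistics.median(rr)
    low, high = (1 - tolerance) * median_rr, (1 + tolerance) * median_rr
    kept = [peaks[0]]
    for peak, interval in zip(peaks[1:], rr):
        if low <= interval <= high:
            kept.append(peak)
    return kept


def process_recording(path, fs=256):
    """Run the listed steps on one EDF/FIF file (needs mne and neurokit2)."""
    # Imported lazily so the helpers above work without these packages.
    import mne
    import neurokit2 as nk
    import numpy as np

    raw = mne.io.read_raw(path, preload=True, verbose="error")
    channel = find_ecg_channel(raw.ch_names)
    if channel is None:
        raise ValueError(f"No ECG/EKG channel found in {path}")

    ecg = raw.get_data(picks=[channel])[0]  # MNE returns data in volts
    ecg = nk.signal_resample(
        ecg, sampling_rate=raw.info["sfreq"], desired_sampling_rate=fs
    )
    cleaned = nk.ecg_clean(ecg, sampling_rate=fs)
    _, info = nk.ecg_peaks(cleaned, sampling_rate=fs)

    peaks = [int(p) for p in info["ECG_R_Peaks"]]
    peaks = np.array(drop_suspicious_peaks(peaks, fs))

    hrv = nk.hrv(peaks, sampling_rate=fs)
    _, waves = nk.ecg_delineate(cleaned, peaks, sampling_rate=fs, method="dwt")
    return {"hrv": hrv.iloc[0].to_dict(), "waves": waves}
```

The helpers can be checked on their own, for example `drop_suspicious_peaks([0, 256, 512, 540, 768, 1024], fs=256)` discards the peak at sample 540, which follows its predecessor by only about 0.11 s.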
The code is designed to work with EDF/FIF files named using a patient identifier followed by the recording condition.
Example file names:
SubID_Pre_Ictal_1.edf
SubID_Inter_Ictal_1.edf
The patient ID is extracted from the first part of the file name before the first underscore.
For example:
100_Pre_Ictal_1.edf -> patient ID: 100
The condition is detected from the file name:
| Filename pattern | Detected condition |
|---|---|
| Pre_Ictal, Pre-Ictal, Preictal | preictal |
| Inter_Ictal, Inter-Ictal, Interictal | interictal |
| Post_Ictal, Post-Ictal, Postictal | postictal |
| Ictal | ictal |
If your file naming convention is different, update the parse_condition() function in the script.
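As a starting point for adapting the naming rules, the sketch below shows one way the patient ID and condition could be parsed. It is an illustration, not the script's own `parse_condition()`: the `CONDITION_PATTERNS` table and the `parse_patient_id` helper are hypothetical, and case-insensitive matching is an assumption. Note that the plain `Ictal` pattern has to be tested last, because it is a substring of the other three labels.

```python
import re
from pathlib import Path

# Order matters: plain "ictal" is checked last because it also occurs
# inside "pre_ictal", "inter_ictal", and "post_ictal".
CONDITION_PATTERNS = [
    ("preictal", re.compile(r"pre[_-]?ictal", re.IGNORECASE)),
    ("interictal", re.compile(r"inter[_-]?ictal", re.IGNORECASE)),
    ("postictal", re.compile(r"post[_-]?ictal", re.IGNORECASE)),
    ("ictal", re.compile(r"ictal", re.IGNORECASE)),
]


def parse_patient_id(file_name):
    """Return the part of the file name before the first underscore."""
    return Path(file_name).stem.split("_", 1)[0]


def parse_condition(file_name):
    """Return the detected condition label, or None if nothing matches."""
    stem = Path(file_name).stem
    for label, pattern in CONDITION_PATTERNS:
        if pattern.search(stem):
            return label
    return None


print(parse_patient_id("100_Pre_Ictal_1.edf"))  # 100
print(parse_condition("100_Pre_Ictal_1.edf"))   # preictal
```

To support a different convention, add or edit entries in the pattern table while keeping the most specific patterns first.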
The pipeline extracts time-domain, frequency-domain, and nonlinear HRV features, including:
- MeanNN
- SDNN
- RMSSD
- SDSD
- MinNN
- MaxNN
- LF
- HF
- LF/HF
- SD1
- SD2
- SD1/SD2
- CSI
- CVI
- Sample entropy
- Fuzzy entropy
- Rényi entropy
- Permutation entropy
- Dispersion entropy
- Spectral entropy
- Higuchi fractal dimension
- Lempel-Ziv complexity
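For readers unfamiliar with these measures, the sketch below computes a few of the time-domain and Poincaré features directly from a list of NN (RR) intervals in milliseconds, using their standard textbook definitions. The pipeline itself obtains these values from NeuroKit2, so this is only an illustration; the function name `basic_hrv_features` and the sample intervals are made up for the example.

```python
import math
import statistics


def basic_hrv_features(nn_ms):
    """Compute a few HRV features from NN intervals given in milliseconds."""
    diffs = [b - a for a, b in zip(nn_ms, nn_ms[1:])]  # successive differences

    # Poincaré plot: rotate (NN_n, NN_n+1) pairs by 45 degrees.
    across = [(a - b) / math.sqrt(2) for a, b in zip(nn_ms, nn_ms[1:])]
    along = [(a + b) / math.sqrt(2) for a, b in zip(nn_ms, nn_ms[1:])]
    sd1 = statistics.stdev(across)  # short-term variability
    sd2 = statistics.stdev(along)   # long-term variability

    return {
        "MeanNN": statistics.mean(nn_ms),
        "SDNN": statistics.stdev(nn_ms),  # sample standard deviation
        "RMSSD": math.sqrt(statistics.mean(d * d for d in diffs)),
        "SDSD": statistics.stdev(diffs),
        "MinNN": min(nn_ms),
        "MaxNN": max(nn_ms),
        "SD1": sd1,
        "SD2": sd2,
        "SD1/SD2": sd1 / sd2,
    }


# Illustrative intervals only, not taken from a real recording.
features = basic_hrv_features([800, 810, 790, 805, 795])
for name, value in features.items():
    print(f"{name}: {value:.3f}")
```

With sample standard deviations, SD1 equals SDSD divided by the square root of 2, which is a handy sanity check when comparing against library output.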
The pipeline also extracts morphology-related ECG features, including:
- QRS duration
- ST segment
- PR interval
- PR segment
- QT interval
- P-wave amplitude
- Q-wave amplitude
- R-wave amplitude
- S-wave amplitude
- T-wave amplitude
- Q angle
- R angle
- S angle
Interval features are saved in milliseconds, and amplitude features are saved in microvolts.
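The unit conversions involved are simple, and a small sketch may help when checking values by hand. It assumes that wave boundaries are available as sample indices at the target sampling frequency and that the signal is in volts (the unit MNE uses for loaded data); the helper names and numbers are hypothetical.

```python
def samples_to_ms(onset_sample, offset_sample, fs):
    """Convert the span between two sample indices to milliseconds."""
    return (offset_sample - onset_sample) / fs * 1000.0


def volts_to_microvolts(amplitude_v):
    """Convert an amplitude from volts to microvolts."""
    return amplitude_v * 1e6


# Illustrative values: a 25-sample QRS complex at 256 Hz, a 1.2 mV R wave.
qrs_duration_ms = samples_to_ms(100, 125, fs=256)
r_amplitude_uv = volts_to_microvolts(0.0012)
print(qrs_duration_ms, r_amplitude_uv)
```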
The code was run using the following package versions:
mne==1.8.0
neurokit2==0.2.10
Other required packages include:
numpy
scipy
matplotlib
tqdm
You can install the required packages using:
pip install -r requirements.txt
Save the main script as:
ecg_feature_extraction.py
Then run it from the command line:
python ecg_feature_extraction.py --input_dir "path/to/edf_or_fif_files" --output_dir "path/to/output_folder" --fs 256
To save raw-vs-cleaned ECG plots as well:
python ecg_feature_extraction.py --input_dir "path/to/edf_or_fif_files" --output_dir "path/to/output_folder" --fs 256 --save_plots
For each patient and condition, the pipeline saves a JSON file containing the extracted ECG features.
Example output files:
SubID_preictal_ECG_features.json
SubID_interictal_ECG_features.json
Each JSON file contains:
- file name
- patient ID
- condition
- recording duration
- sampling rate
- number of detected R-peaks
- ECG cleaning information
- R-peak filtering information
- HRV features
- ECG morphology features
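To make the output layout concrete, here is a sketch of how such a record might be assembled and written. The file name pattern follows the examples above, but the JSON key names, the `save_features` helper, and all values are assumptions for illustration; check an actual output file for the real keys.

```python
import json
import tempfile
from pathlib import Path


def save_features(record, output_dir):
    """Write one record to <patient>_<condition>_ECG_features.json."""
    output_dir = Path(output_dir)
    output_dir.mkdir(parents=True, exist_ok=True)
    file_name = f"{record['patient_id']}_{record['condition']}_ECG_features.json"
    out_path = output_dir / file_name
    with out_path.open("w", encoding="utf-8") as fh:
        json.dump(record, fh, indent=2)
    return out_path


# Placeholder record: key names and values are hypothetical.
example_record = {
    "file_name": "100_Pre_Ictal_1.edf",
    "patient_id": "100",
    "condition": "preictal",
    "duration_s": None,          # recording duration
    "sampling_rate": 256,
    "n_r_peaks": None,           # number of detected R-peaks
    "cleaning_info": {},
    "r_peak_filtering": {},
    "hrv_features": {},
    "morphology_features": {},
}

# Write to a temporary folder and read the file back.
with tempfile.TemporaryDirectory() as tmp_dir:
    saved_path = save_features(example_record, tmp_dir)
    saved_name = saved_path.name
    reloaded = json.loads(saved_path.read_text(encoding="utf-8"))

print(saved_name)
```

Downstream analysis can then load each file with `json.load` and select the feature groups it needs.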
Alaei, H. (2026). ECG Feature Extraction from EDF/FIF Recordings. GitHub repository:
https://github.com/Hesam-lab/ecg-feature-extraction