Safeguards to avoid crashing when using external R peaks #1181
Viri1990 wants to merge 3 commits into
Thanks for opening this pull request! We'll make sure it's perfect before merging 🤗 |
Summary of Changes
Hello, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!
This pull request enhances the stability of the ECG delineation process by introducing several safeguards. These changes prevent potential crashes that could occur when external R-peak inputs lead to out-of-bounds array accesses or attempts to perform operations on empty data slices, thereby making the ecg_delineate function more robust.
|
Code Review
This pull request aims to add safeguards to prevent crashes when using external R peaks for ECG delineation. However, some of these safeguards are incomplete or misplaced, potentially leading to denial-of-service (DoS) vulnerabilities if the input data is user-controlled. For instance, the safeguard in _prominence_find_s_wave has a check for an empty slice placed after the np.argmax operation it's meant to protect, and another check fails to account for negative indices.
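The two issues raised in this review (a guard placed after the call it protects, and bounds that ignore negative indices) can be illustrated with a minimal sketch. The helper name `find_first_positive` and its arguments are hypothetical and are not part of neurokit2; the sketch only shows the ordering of the checks.

```python
import numpy as np


def find_first_positive(weights, start, stop):
    """Hypothetical helper: index of the first positive weight in [start, stop).

    Returns None when the window is empty or holds no positive value.
    """
    n = len(weights)
    # Clamp both bounds so that negative or oversized indices cannot
    # wrap around (negative slicing) or silently shrink the window.
    start = min(max(start, 0), n)
    stop = min(max(stop, start), n)

    window = weights[start:stop] > 0

    # The guard must run BEFORE np.argmax: on an empty array np.argmax
    # raises ValueError, and on an all-False array it returns 0.
    if window.size == 0 or not window.any():
        return None

    return int(np.argmax(window)) + start


weights = np.array([0, 0, 3, 0, 5])
print(find_first_positive(weights, 1, 4))    # window [0, 3, 0]
print(find_first_positive(weights, 10, 12))  # window entirely out of range
print(find_first_positive(weights, -3, 2))   # negative start is clamped to 0
```

Whether clamping or raising an error is the right reaction to a bad bound is a design decision for the maintainers; the sketch only demonstrates that the check has to come first.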
Codecov Report
❌ Patch coverage is
Additional details and impacted files
@@            Coverage Diff             @@
##              dev    #1181      +/-   ##
==========================================
+ Coverage   57.78%   57.80%   +0.02%
==========================================
  Files         310      310
  Lines       15680    15684       +4
==========================================
+ Hits         9060     9066       +6
+ Misses       6620     6618       -2
|
|
Hi Viri1990, thank you for the request. I see you are trying to skip any R peaks that fall outside the range and to account for those. I am wondering whether silently skipping them is the best choice here, or whether we should raise an error at the start of the delineate function if the R-peak indices extend past the data array. I worry that people might think everything has run normally, while their R-peak index list actually had an offset of some sort that put all the R peaks in the wrong place. Did this happen to you, where an external R-peak array extended past the data range? How did that happen? What do you think? Thank you! |
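The fail-fast alternative suggested in this comment could look roughly like the sketch below. The function name `validate_rpeaks` and its signature are assumptions made for illustration; nothing here is taken from the neurokit2 code base.

```python
import numpy as np


def validate_rpeaks(rpeaks, signal_length):
    """Hypothetical up-front check: raise instead of silently skipping peaks."""
    rpeaks = np.asarray(rpeaks)
    if rpeaks.size == 0:
        return rpeaks

    # Valid sample indices are 0 .. signal_length - 1.
    out_of_range = (rpeaks < 0) | (rpeaks >= signal_length)
    if out_of_range.any():
        bad = rpeaks[out_of_range]
        raise ValueError(
            f"{bad.size} R-peak index/indices fall outside the signal "
            f"(length {signal_length}): {bad.tolist()}"
        )
    return rpeaks


validate_rpeaks([10, 50, 99], signal_length=100)  # passes

try:
    validate_rpeaks([10, 100], signal_length=100)  # 100 is one past the end
except ValueError as err:
    print(err)
```

Raising early makes an offset in the R-peak list visible to the caller, at the cost of rejecting inputs that the skip-based approach would partially process.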
Hi Johannes, the problem I faced is that the code that uses the R peaks to split the ECG signal into beats ("ecg_sig" in line 800) does not take into account the corner case in which an R peak is located at the boundary: ecg_sig goes from 0 to n and the R peak is located at n, so the typical IndexError is raised and crashes the execution. Thanks!! |
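The boundary case described here can be reproduced in isolation. The sketch below uses a toy array and a hypothetical helper `sample_at`; it is not the code around line 800 of ecg_delineate.py, only an illustration of why an index equal to the array length fails while a slice starting there does not.

```python
import numpy as np

# Toy stand-in for one segment of the signal; the values are arbitrary.
ecg_sig = np.array([0.1, 0.2, 0.9, 0.3, 0.1])
r_peak = len(ecg_sig)  # index n, one past the last valid index n - 1

# Direct indexing at n raises IndexError ...
try:
    ecg_sig[r_peak]
    indexing_failed = False
except IndexError:
    indexing_failed = True

# ... whereas slicing from n quietly returns an empty array.
tail = ecg_sig[r_peak : r_peak + 3]


def sample_at(sig, idx):
    """Hypothetical guard: return sig[idx], or None when idx is out of range."""
    if 0 <= idx < len(sig):
        return float(sig[idx])
    return None


print(indexing_failed, tail.size, sample_at(ecg_sig, r_peak))
```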
|
Hi Viri, sorry for the late response. I just looked at this section of code for 30 minutes and wondered how reliable it is overall. Firstly, I think the length checks should be replaced by any() checks, to see whether there are any valid values in the result at all. If not, we could simply return. However, then I looked at the fallback code that comes afterwards for the S wave and saw that if s == r, it does a minima check on the raw data itself. This feels pretty hacky and I wonder how reliable it even is. My opinion would be to actually remove this fallback, but I wonder what you think. I have posted a diff of my exploration below (which also keeps the minima fallback in, although it is essentially dead code) to get your opinion on this. I have also added more comments to the code, since it was very unclear without them. It is also more robust than the length check: for the S wave, the length will always be more than 0, since the slice includes the R peak. I also changed the Q wave to use this, for cleanliness. Looking forward to your answer.
diff --git a/neurokit2/ecg/ecg_delineate.py b/neurokit2/ecg/ecg_delineate.py
index 49d919675..45e97c772 100644
--- a/neurokit2/ecg/ecg_delineate.py
+++ b/neurokit2/ecg/ecg_delineate.py
@@ -772,18 +772,31 @@ def _prominence_find_q_wave(weight_minima, current_wave, max_r_rise_time):
     if "ECG_R_Peaks" not in current_wave:
         return
     q_bound = max(current_wave["ECG_R_Peaks"] - max_r_rise_time, 0)
-    if len(weight_minima[q_bound : current_wave["ECG_R_Peaks"]]) == 0:
+
+    # Keep to minima values as we want the deepest trough before the R
+    q_potential = weight_minima[q_bound : current_wave["ECG_R_Peaks"]]
+
+    if not q_potential.any():
         return
-    current_wave["ECG_Q_Peaks"] = np.argmax(weight_minima[q_bound : current_wave["ECG_R_Peaks"]]) + q_bound
+
+    current_wave["ECG_Q_Peaks"] = np.argmax(q_potential) + q_bound
 def _prominence_find_s_wave(sig, weight_minima, current_wave, max_qrs_interval):
     if "ECG_Q_Peaks" not in current_wave:
         return
     s_bound = current_wave["ECG_Q_Peaks"] + max_qrs_interval
-    if len(weight_minima[current_wave["ECG_R_Peaks"] : s_bound] > 0) == 0:
+
+    # Change to a true list since we want the first downward peak
+    s_potential = weight_minima[current_wave["ECG_R_Peaks"] : s_bound] > 0
+
+    # r peak is 0, so we can keep it in and check if there is an s wave
+    if not (s_potential).any():
         return
-    s_wave = np.argmax(weight_minima[current_wave["ECG_R_Peaks"] : s_bound] > 0) + current_wave["ECG_R_Peaks"]
+
+    s_wave = np.argmax(s_potential) + current_wave["ECG_R_Peaks"]
+
+    # If s is the same as r, check for the lowest value in the qrs bound
     current_wave["ECG_S_Peaks"] = (
         np.argmin(sig[current_wave["ECG_R_Peaks"] : s_bound]) + current_wave["ECG_R_Peaks"]
         if s_wave == current_wave["ECG_R_Peaks"]
|
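The difference between the length check and the any() check in the diff above can be seen on a toy array. The values below are invented for illustration and assume, as the diff comment states, that the weight at the R-peak position is 0.

```python
import numpy as np

# Toy weights: no minima at all between the R peak and the S bound.
weight_minima = np.array([0.0, 0.0, 0.0, 0.0, 0.0])
r_peak, s_bound = 1, 4

s_potential = weight_minima[r_peak:s_bound] > 0  # boolean array of length 3

# The old guard only tests the slice length, so it passes even though
# there is no candidate S wave in the window.
old_guard_returns = len(s_potential) == 0

# On an all-False array np.argmax returns 0, so s_wave collapses onto
# the R peak; this is the s == r case that triggers the raw-signal fallback.
s_wave_without_guard = int(np.argmax(s_potential)) + r_peak

# The new guard detects that no candidate exists and returns early.
new_guard_returns = not s_potential.any()

print(old_guard_returns, s_wave_without_guard, new_guard_returns)
```

Under that assumption, once the any() guard passes, the first True value cannot sit at offset 0, so s_wave can no longer equal the R peak. That is consistent with the remark that the minima fallback becomes dead code.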
Hello again Johannes, Thanks for your effort! |
Three checks to avoid crashing when sending external R peaks to delineate the ECG