Apply hard timeout to compare.py subprocess in pam.py#1

Open
ds17f wants to merge 1 commit into principis:master from ds17f:fix/pam-subprocess-timeout

ds17f commented May 5, 2026

Replace subprocess.call with subprocess.run(timeout=...) so a wedged compare.py cannot hang the PAM stack indefinitely.

The bug

pam.py invokes compare.py with subprocess.call(...), which has no timeout. If compare.py hangs — for example, the IR camera opens but never returns a frame, or a v4l2 ioctl blocks forever — the entire PAM auth path is stuck waiting on the subprocess. Real-world symptom: sudo hangs forever with no way to fall back to password auth.

compare.py does enforce its own per-attempt timeout (video.timeout) internally, but only when its read loop is running. If the camera never returns the first frame, that timer never starts.
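To illustrate why a timeout checked inside a read loop cannot rescue a blocked read, here is a minimal sketch. It is not compare.py's actual code: `read_loop` and `slow_camera` are hypothetical names, and the stalled camera is simulated with a short sleep so that the example terminates.

```python
import time


def read_loop(read_frame, timeout):
    """Hypothetical read loop whose timeout is only checked between reads."""
    start = time.monotonic()
    frames = 0
    while True:
        read_frame()  # if this call blocks, the check below is not reached
        frames += 1
        if time.monotonic() - start > timeout:
            return frames, time.monotonic() - start


def slow_camera():
    # Stand-in for a stalled camera read (assumption: 0.5s instead of forever)
    time.sleep(0.5)


# The loop is asked to stop after 0.1s but cannot return until the read does.
frames, elapsed = read_loop(slow_camera, timeout=0.1)
print(f"frames={frames}, elapsed={elapsed:.2f}s")
```

With a read that never returns, the loop would never reach its own check, which is why the bound has to be enforced from outside the process.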

The fix

compare_timeout = config.getint("video", "timeout") + 5
try:
    status = subprocess.run([...], timeout=compare_timeout).returncode
except subprocess.TimeoutExpired:
    pamh.conversation(pamh.Message(pamh.PAM_ERROR_MSG, "Face detection timeout reached"))
    syslog.syslog(syslog.LOG_INFO, "Failure, compare.py exceeded " + str(compare_timeout) + "s and was killed")
    syslog.closelog()
    return pamh.PAM_AUTH_ERR

Bound is video.timeout + 5s. The +5s is grace for compare.py to exit cleanly under normal conditions; the SIGKILL path only triggers if compare.py itself is unresponsive.
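The bound can be exercised outside PAM with a small standalone sketch. Everything here is illustrative: the inline config, the `run_bounded` helper, and the two child commands are assumptions, and the grace period is shortened from 5s to 1s so the demo finishes quickly. It relies only on `subprocess.run(timeout=...)`, which kills the child and raises `TimeoutExpired` when the bound is exceeded.

```python
import configparser
import subprocess
import sys

# Inline stand-in for the real config file (assumption: timeout of 1s)
config = configparser.ConfigParser()
config.read_string("[video]\ntimeout = 1\n")

GRACE_SECONDS = 1  # the PR uses 5; shortened here for the demo
compare_timeout = config.getint("video", "timeout") + GRACE_SECONDS


def run_bounded(cmd, timeout):
    """Return the child's exit code, or None if it had to be killed."""
    try:
        return subprocess.run(cmd, timeout=timeout).returncode
    except subprocess.TimeoutExpired:
        return None


# A child that never exits on its own, standing in for a wedged compare.py
hung_status = run_bounded(
    [sys.executable, "-c", "import time; time.sleep(60)"], compare_timeout
)
# A child that exits promptly with its own status code
quick_status = run_bounded(
    [sys.executable, "-c", "raise SystemExit(11)"], compare_timeout
)
print(hung_status, quick_status)
```

A child that exits within the bound keeps its own return code; only the unresponsive one is killed.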

On TimeoutExpired we emit the same conversation message and return code as the existing status == 11 path, plus a more diagnostic syslog line that records the configured bound.
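The equivalence of the two paths can be sketched with a stub in place of the real PAM handle. `FakePamHandle`, `report_timeout`, and `authenticate` are hypothetical names for illustration, not pam.py's actual structure; the stub's constants are assumed to mirror Linux-PAM's values, and syslog calls are omitted.

```python
import subprocess
import sys

TIMEOUT_MESSAGE = "Face detection timeout reached"


class FakePamHandle:
    """Stub for the PAM handle; records messages instead of showing them."""
    PAM_ERROR_MSG = 3  # assumed to match Linux-PAM's message style value
    PAM_AUTH_ERR = 7   # assumed to match Linux-PAM's return code value

    def __init__(self):
        self.sent = []

    def Message(self, style, text):
        return (style, text)

    def conversation(self, message):
        self.sent.append(message)


def report_timeout(pamh):
    # Shared by both paths: same user-facing message, same return code
    pamh.conversation(pamh.Message(pamh.PAM_ERROR_MSG, TIMEOUT_MESSAGE))
    return pamh.PAM_AUTH_ERR


def authenticate(pamh, cmd, bound):
    try:
        status = subprocess.run(cmd, timeout=bound).returncode
    except subprocess.TimeoutExpired:
        return report_timeout(pamh)  # child was killed at the hard bound
    if status == 11:
        return report_timeout(pamh)  # child reported its own timeout
    return status  # simplification: other codes are passed through


killed = FakePamHandle()
killed_result = authenticate(
    killed, [sys.executable, "-c", "import time; time.sleep(60)"], bound=1
)
reported = FakePamHandle()
reported_result = authenticate(
    reported, [sys.executable, "-c", "raise SystemExit(11)"], bound=10
)
```

From the caller's point of view the two outcomes are indistinguishable; in the PR, only the syslog line tells them apart.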

Observed before/after

  • Before: stuck IR camera → sudo hangs forever, can't fall back to password.
  • After: sudo waits ~timeout + 5 seconds → Face detection timeout reached → falls through to next auth method.

Scope

  • One file, +15/-2.
  • Pure subprocess-management change; recognition logic untouched.
  • Theoretical regression: compare.py legitimately taking longer than its own configured timeout to exit cleanly. In practice that means something is already wrong, and the bound (timeout + 5) gives a generous grace window.

Note: filed against principis/howdy because pam.py was removed from boltgolt/howdy master in the C++ PAM rewrite (3.0.0). The bug is identical in scope to boltgolt/howdy v2.6.1, which principis tracks.

See also boltgolt#1107 (ffmpeg_reader bug fixes) and boltgolt#1108 (dark_threshold doc clarification) — adjacent fixes against the upstream Python codebase that principis pulls from.

Replace subprocess.call with subprocess.run(timeout=...) so a wedged
compare.py (e.g. a camera that opens but never returns a frame, or a
v4l2 ioctl that blocks forever) cannot hang the PAM stack indefinitely.

The timeout bound is video.timeout + 5s. video.timeout is already the
per-attempt limit compare.py applies internally; the +5s grace lets
compare.py exit cleanly under normal conditions and only triggers the
SIGKILL path if compare.py itself is unresponsive.

On TimeoutExpired we emit the same conversation message as the
existing status==11 path and return PAM_AUTH_ERR. The syslog entry
additionally records the configured bound for diagnostics.

Observed before this change: a stuck IR camera causing 'sudo' to hang
forever with no way to fall back to password auth. Observed after:
sudo waits ~timeout+5 seconds, prints 'Face detection timeout reached',
and falls through to the next auth method.