
[BOUNTY #819] Fix indefinite hanging on certain hosts #964

Open
zhaog100 wants to merge 1 commit into projectdiscovery:main from zhaog100:feat/fix-hanging-connections

zhaog100 commented Mar 18, 2026

Summary

Fixes #819 — tlsx hangs indefinitely during long-running scans (e.g., 30k targets over ~18 hours).

Root Cause

  • The processInputElementWorker in internal/runner/runner.go called ConnectWithOptions without any overall timeout safety net. If a connection stalled (e.g., TCP accepted but TLS handshake never completes), the goroutine would block forever.
  • The openssl client in pkg/tlsx/openssl/openssl.go dialed with context.TODO(), meaning no timeout was applied to the TCP connection phase.
  • This manifested as hangs after thousands of successful connections, with output truncated mid-JSON line.
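
To illustrate the second point, here is a minimal sketch of bounding the TCP dial with a deadline instead of passing context.TODO(), which never expires. It uses net.Dialer from the standard library rather than whatever dialer the openssl client actually wraps, and dialWithTimeout is a hypothetical helper name, not a function in tlsx.

```go
package main

import (
	"context"
	"fmt"
	"net"
	"time"
)

// dialWithTimeout is a hypothetical helper (not part of tlsx). It bounds the
// TCP connection phase with a deadline instead of using context.TODO(),
// which carries no deadline at all.
func dialWithTimeout(address string, timeout time.Duration) (net.Conn, error) {
	ctx, cancel := context.WithTimeout(context.Background(), timeout)
	defer cancel() // release the timer as soon as the dial returns

	var d net.Dialer
	// Once the connection is established, a later expiry of ctx does not
	// affect it; the deadline only limits the dial itself.
	return d.DialContext(ctx, "tcp", address)
}

func main() {
	// Listen on a loopback port so the example needs no network access.
	ln, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		panic(err)
	}
	defer ln.Close()

	conn, err := dialWithTimeout(ln.Addr().String(), 2*time.Second)
	if err != nil {
		panic(err)
	}
	defer conn.Close()
	fmt.Println("connected to", conn.RemoteAddr())
}
```

Note that this only limits the dial. A host that accepts the TCP connection and then stalls during the TLS handshake needs a separate bound.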

Changes

  1. internal/runner/runner.go: Added connectWithTimeout() wrapper that enforces a per-connection timeout via context.WithTimeout. For cipher/version enumeration modes, the timeout is tripled since multiple connections are made.
  2. pkg/tlsx/openssl/openssl.go: Replaced context.TODO() with a timeout context derived from options.Timeout for the dial call.

Testing

Closes #819

Summary by CodeRabbit

  • Bug Fixes
    • Improved TLS connection timeout handling to prevent indefinite hangs during connection attempts
    • Enhanced error reporting to distinguish connection timeout failures from other error types with clearer timeout-specific messaging
    • Optimized timeout behavior for TLS enumeration operations with dynamic timeout adjustment

Add per-connection timeout to prevent tlsx from hanging indefinitely
during long-running scans.

- Add connectWithTimeout() wrapper in runner that enforces timeout via
  context, preventing goroutines from blocking forever on stalled
  connections (especially with cipher/version enumeration)
- Fix openssl client dial using context.TODO() (no timeout) to use
  context.Background() with explicit timeout derived from options.Timeout

Closes projectdiscovery#819

neo-by-projectdiscovery-dev bot commented Mar 18, 2026

Neo - PR Security Review

No security issues found

Highlights

  • Adds timeout safety mechanisms to prevent indefinite hanging during TLS scans
  • Introduces connectWithTimeout() wrapper with context-based timeout enforcement
  • Replaces context.TODO() with proper timeout context for openssl dial operations

Hardening Notes
  • Consider adding validation in UseOpenSSLBinary() at pkg/tlsx/openssl/common.go:99 to verify the binary path doesn't contain shell metacharacters, though current exec.CommandContext usage is already safe
  • The goroutine spawned in connectWithTimeout() at internal/runner/runner.go:445 continues execution even after timeout; consider passing the context into ConnectWithOptions to enable early cancellation

coderabbitai bot commented Mar 18, 2026

Walkthrough

The changes add timeout mechanisms to prevent indefinite hangs during TLS connection operations. A dynamic per-connection timeout (defaulting to 10 seconds, tripled for enumeration modes) is applied to TLS connections in the runner, while dial operations receive per-call timeout contexts. Both changes use Go's context timeout pattern to bound operations.
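
A minimal sketch of the timeout selection described above, under stated assumptions: scanOptions and its field names are hypothetical stand-ins for the runner options, and the fallback to 10 seconds when no timeout is set is one plausible reading of "defaulting to 10 seconds".

```go
package main

import (
	"fmt"
	"time"
)

// scanOptions is a hypothetical stand-in for the runner options; the field
// names are assumptions made for this sketch.
type scanOptions struct {
	TimeoutSeconds    int  // per-connection timeout in seconds; 0 means unset
	EnumerateCiphers  bool // cipher enumeration makes several connections
	EnumerateVersions bool // version enumeration makes several connections
}

// defaultTimeout is the fallback used when no timeout is configured.
const defaultTimeout = 10 * time.Second

// connectionTimeout picks the per-connection budget and triples it for
// enumeration modes, which open multiple connections per target.
func connectionTimeout(o scanOptions) time.Duration {
	timeout := defaultTimeout
	if o.TimeoutSeconds > 0 {
		timeout = time.Duration(o.TimeoutSeconds) * time.Second
	}
	if o.EnumerateCiphers || o.EnumerateVersions {
		timeout *= 3
	}
	return timeout
}

func main() {
	fmt.Println(connectionTimeout(scanOptions{}))                                          // 10s
	fmt.Println(connectionTimeout(scanOptions{TimeoutSeconds: 5, EnumerateCiphers: true})) // 15s
}
```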

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Timeout handling for TLS connections: internal/runner/runner.go | Introduces connectWithTimeout helper function that wraps ConnectWithOptions in a goroutine with context-based timeout. Dynamically computes connection timeout from runner options (10s default, tripled for TLS enumeration modes). Adds context deadline error handling with timeout-specific logging. |
| Timeout context for dial operations: pkg/tlsx/openssl/openssl.go | Replaces context.TODO() with per-call timeout context (dialCtx) created via context.WithTimeout, ensuring dial operations are bounded by explicit timeouts. |

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~12 minutes

Poem

🐰 No more hanging on the line,
With timeout guards, things work just fine!
Context whispers, "time's up, friend,"
Before the threads reach endless end. 🕐

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title clearly identifies the main change: fixing indefinite hanging on certain hosts by applying connection timeouts. |
| Linked Issues check | ✅ Passed | Code changes implement required fixes from issue #819: connectWithTimeout enforces per-connection timeouts in runner.go, and timeout context replaces TODO() in openssl.go to prevent indefinite hangs. |
| Out of Scope Changes check | ✅ Passed | All changes are scoped to addressing the hanging issue: timeout enforcement in runner.go and openssl.go with no extraneous modifications. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |

coderabbitai bot left a comment

🧹 Nitpick comments (1)
internal/runner/runner.go (1)

438-455: Potential goroutine accumulation when timeouts occur.

The timeout context is not propagated to ConnectWithOptions (the Service and underlying TLS clients create their own internal contexts from context.Background() per pkg/tlsx/tlsx.go:63-85, pkg/tlsx/tls/tls.go:109-114, pkg/tlsx/ztls/ztls.go:117-122). When the outer context times out, the spawned goroutine continues running until ConnectWithOptions completes naturally.

This is a reasonable workaround given the interface doesn't accept a context, and it does achieve the primary goal of preventing the main scan from hanging. The leak is bounded by concurrency since workers process tasks sequentially. However, under high timeout rates during long scans, abandoned goroutines may accumulate.

Consider adding a comment documenting this limitation:

📝 Suggested documentation
 // connectWithTimeout wraps ConnectWithOptions with a context timeout to prevent indefinite hangs
+// Note: The context is not propagated to the underlying TLS client, so timed-out connections
+// continue in the background until they complete naturally. This prevents worker stalls but
+// may result in goroutine accumulation under high timeout conditions.
 func (r *Runner) connectWithTimeout(ctx context.Context, tlsxService *tlsx.Service, host, ip, port, sni string) (*clients.Response, error) {
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/runner/runner.go` around lines 438 - 455, The connectWithTimeout
function spawns a goroutine that calls tlsx.Service.ConnectWithOptions but
cannot propagate the outer ctx into ConnectWithOptions (which lacks a context
parameter), so when ctx times out the goroutine keeps running until
ConnectWithOptions returns and abandoned goroutines can accumulate under high
timeout rates; update connectWithTimeout to include a clear comment above the
function (or at least inside it) documenting this limitation: mention that
ConnectWithOptions does not accept a context, explain that the goroutine will
continue after ctx.Done(), state that this leak is bounded by worker concurrency
and the design choice to avoid changing the tlsx.Service interface, and note
recommended mitigations (reduce concurrency/timeouts or upstream API change to
accept context) so future maintainers understand the tradeoff.

ℹ️ Review info
📥 Commits

Reviewing files that changed from the base of the PR and between d13b67f and bef527d.

📒 Files selected for processing (2)
  • internal/runner/runner.go
  • pkg/tlsx/openssl/openssl.go

Development

Successfully merging this pull request may close these issues.

tlsx hangs indefinitely for some hosts
