[BOUNTY #819] Fix indefinite hanging on certain hosts#964
[BOUNTY #819] Fix indefinite hanging on certain hosts#964zhaog100 wants to merge 1 commit intoprojectdiscovery:mainfrom
Conversation
Add per-connection timeout to prevent tlsx from hanging indefinitely during long-running scans. - Add connectWithTimeout() wrapper in runner that enforces timeout via context, preventing goroutines from blocking forever on stalled connections (especially with cipher/version enumeration) - Fix openssl client dial using context.TODO() (no timeout) to use context.Background() with explicit timeout derived from options.Timeout Closes projectdiscovery#819
Neo - PR Security ReviewNo security issues found Highlights
Hardening Notes
Comment |
WalkthroughThe changes add timeout mechanisms to prevent indefinite hangs during TLS connection operations. A dynamic per-connection timeout (defaulting to 10 seconds, tripled for enumeration modes) is applied to TLS connections in the runner, while dial operations receive per-call timeout contexts. Both changes use Go's context timeout pattern to bound operations. Changes
Estimated code review effort🎯 2 (Simple) | ⏱️ ~12 minutes Poem
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
📝 Coding Plan
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
🧹 Nitpick comments (1)
internal/runner/runner.go (1)
438-455: Potential goroutine accumulation when timeouts occur.The timeout context is not propagated to
ConnectWithOptions(theServiceand underlying TLS clients create their own internal contexts fromcontext.Background()perpkg/tlsx/tlsx.go:63-85,pkg/tlsx/tls/tls.go:109-114,pkg/tlsx/ztls/ztls.go:117-122). When the outer context times out, the spawned goroutine continues running untilConnectWithOptionscompletes naturally.This is a reasonable workaround given the interface doesn't accept a context, and it does achieve the primary goal of preventing the main scan from hanging. The leak is bounded by concurrency since workers process tasks sequentially. However, under high timeout rates during long scans, abandoned goroutines may accumulate.
Consider adding a comment documenting this limitation:
📝 Suggested documentation
// connectWithTimeout wraps ConnectWithOptions with a context timeout to prevent indefinite hangs +// Note: The context is not propagated to the underlying TLS client, so timed-out connections +// continue in the background until they complete naturally. This prevents worker stalls but +// may result in goroutine accumulation under high timeout conditions. func (r *Runner) connectWithTimeout(ctx context.Context, tlsxService *tlsx.Service, host, ip, port, sni string) (*clients.Response, error) {🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/runner/runner.go` around lines 438 - 455, The connectWithTimeout function spawns a goroutine that calls tlsx.Service.ConnectWithOptions but cannot propagate the outer ctx into ConnectWithOptions (which lacks a context parameter), so when ctx times out the goroutine keeps running until ConnectWithOptions returns and abandoned goroutines can accumulate under high timeout rates; update connectWithTimeout to include a clear comment above the function (or at least inside it) documenting this limitation: mention that ConnectWithOptions does not accept a context, explain that the goroutine will continue after ctx.Done(), state that this leak is bounded by worker concurrency and the design choice to avoid changing the tlsx.Service interface, and note recommended mitigations (reduce concurrency/timeouts or upstream API change to accept context) so future maintainers understand the tradeoff.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@internal/runner/runner.go`:
- Around line 438-455: The connectWithTimeout function spawns a goroutine that
calls tlsx.Service.ConnectWithOptions but cannot propagate the outer ctx into
ConnectWithOptions (which lacks a context parameter), so when ctx times out the
goroutine keeps running until ConnectWithOptions returns and abandoned
goroutines can accumulate under high timeout rates; update connectWithTimeout to
include a clear comment above the function (or at least inside it) documenting
this limitation: mention that ConnectWithOptions does not accept a context,
explain that the goroutine will continue after ctx.Done(), state that this leak
is bounded by worker concurrency and the design choice to avoid changing the
tlsx.Service interface, and note recommended mitigations (reduce
concurrency/timeouts or upstream API change to accept context) so future
maintainers understand the tradeoff.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 5b35d084-b814-43b1-92c7-abeeba861dc3
📒 Files selected for processing (2)
internal/runner/runner.gopkg/tlsx/openssl/openssl.go
Summary
Fixes #819 — tlsx hangs indefinitely during long-running scans (e.g., 30k targets over ~18 hours).
Root Cause
processInputElementWorkerininternal/runner/runner.gocalledConnectWithOptionswithout any overall timeout safety net. If a connection stalled (e.g., TCP accepted but TLS handshake never completes), the goroutine would block forever.pkg/tlsx/openssl/openssl.godialed withcontext.TODO(), meaning no timeout was applied to the TCP connection phase.Changes
internal/runner/runner.go: AddedconnectWithTimeout()wrapper that enforces a per-connection timeout viacontext.WithTimeout. For cipher/version enumeration modes, the timeout is tripled since multiple connections are made.pkg/tlsx/openssl/openssl.go: Replacedcontext.TODO()with a timeout context derived fromoptions.Timeoutfor the dial call.Testing
go build ./...passesCloses #819
Summary by CodeRabbit