fix: support comma-separated items in list file (issue #859)#965
fix: support comma-separated items in list file (issue #859)#965FraktalDeFiDAO wants to merge 5 commits intoprojectdiscovery:mainfrom
Conversation
…tdiscovery#819) Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>
WalkthroughAdds synchronization to PDCP and file writer components, improves TLS cipher handshake handling with per-connection timeouts and context-aware handshakes, extends input parsing to accept comma-separated items, and introduces a race-test exercising concurrent UploadWriter usage. Changes
Sequence Diagram(s)sequenceDiagram
autonumber
participant Caller
participant CipherLoop as CipherEnumerator
participant Pool
participant BaseConn as BaseConnection
participant TLS as TLSHandshake
participant Server
Caller->>CipherLoop: request EnumerateCiphers()
loop per-cipher
CipherLoop->>Pool: acquire base connection
alt acquire success
Pool-->>CipherLoop: base conn
CipherLoop->>BaseConn: clone TLS config, create ctx with timeout
CipherLoop->>TLS: start HandshakeContext(ctx) (goroutine)
TLS->>Server: TLS handshake
alt handshake success
Server-->>TLS: ServerHello / cipher
TLS-->>CipherLoop: success (cipher)
else timeout / error
TLS-->>CipherLoop: error / timeout
end
CipherLoop->>BaseConn: defer close
else acquire fail
Pool-->>CipherLoop: error (skip cipher)
end
end
CipherLoop-->>Caller: collected ciphers
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Poem
🚥 Pre-merge checks | ✅ 3 | ❌ 2❌ Failed checks (2 warnings)
✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
📝 Coding Plan
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 3
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
internal/pdcp/writer.go (1)
53-58:⚠️ Potential issue | 🟠 MajorFinish protecting the remaining shared-state reads.
These new locks cover the setters,
upload(), andgetRequest(), butuploadChunk()still readsu.assetGroupID/u.TeamIDat Line 196 and the deferred log inautoCommit()reads them again at Lines 131-135 withoutRLock. Becauseinternal/pdcp/race_test.gonow mutatesSetAssetID()concurrently, the writer can still race on those paths.🔒 Suggested follow-up
+func (u *UploadWriter) snapshotMeta() (assetID, assetName, teamID string) { + u.mu.RLock() + defer u.mu.RUnlock() + return u.assetGroupID, u.assetGroupName, u.TeamID +} + func (u *UploadWriter) getRequest(bin []byte) (*retryablehttp.Request, error) { var method, url string - u.mu.RLock() - assetID := u.assetGroupID - assetName := u.assetGroupName - teamID := u.TeamID - u.mu.RUnlock() + assetID, assetName, teamID := u.snapshotMeta()Use the same snapshot helper in
uploadChunk()and in the deferred logging block insideautoCommit().Also applies to: 105-122, 223-228, 239-243
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/pdcp/writer.go` around lines 53 - 58, The code still reads shared fields without locking in uploadChunk() and the deferred logging in autoCommit(); use the existing snapshot helper (the same one used in upload() / getRequest()) to capture u.assetGroupID and u.TeamID under the RLock and then use that snapshot inside uploadChunk() and the deferred log in autoCommit(); likewise replace direct reads in the other affected blocks (the setter paths and the regions mentioned around getRequest()/upload() and lines ~223-243) with the snapshot helper to eliminate races on assetGroupID/TeamID.
🧹 Nitpick comments (2)
internal/runner/runner.go (1)
443-449: Consider extracting the comma-splitting logic into a helper function.The identical comma-splitting logic is duplicated for both file input and stdin. Extracting it would reduce code duplication and make future maintenance easier.
♻️ Suggested refactor
Add a helper function:
// processCommaSeparatedLine splits a line by comma and processes each item func (r *Runner) processCommaSeparatedLine(text string, inputs chan taskInput) { for _, item := range strings.Split(text, ",") { item = strings.TrimSpace(item) if item != "" { r.processInputItem(item, inputs) } } }Then simplify both call sites:
scanner := bufio.NewScanner(file) for scanner.Scan() { text := scanner.Text() if text != "" { - // Split by comma like -u flag does - for _, item := range strings.Split(text, ",") { - item = strings.TrimSpace(item) - if item != "" { - r.processInputItem(item, inputs) - } - } + r.processCommaSeparatedLine(text, inputs) } }Apply the same change for the stdin block.
Also applies to: 458-464
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/runner/runner.go` around lines 443 - 449, Duplicate comma-splitting logic should be extracted into a helper on Runner to avoid duplication: add a method like processCommaSeparatedLine(text string, inputs chan taskInput) on Runner that iterates strings.Split(text, ","), trims each item and calls r.processInputItem(item, inputs) for non-empty items; then replace the duplicated blocks (the loop that currently calls strings.Split in both the file-input and stdin branches) with calls to r.processCommaSeparatedLine(text, inputs) to keep behavior identical and centralize the logic.internal/pdcp/race_test.go (1)
15-66: This is still a stress harness, not a regression test.Right now it only passes if nothing panics or hangs. Without a deterministic assertion—or a CI job that actually runs it under
go test -race—a future synchronization regression can still come back green.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/pdcp/race_test.go` around lines 15 - 66, The test TestUploadWriterExploit is a nondeterministic stress harness; convert it into a deterministic regression test that fails on hang/panic instead of silently passing. Replace the free-running harness with a bounded scenario: use pdcp.NewUploadWriterCallback to create writer and a context with a firm timeout, run the concurrent SetAssetID and callback goroutines but signal completion via channels, and require writer.Close() to return before the timeout (use select on done or ctx.Done()); if Close blocks or any goroutine panics report t.Fatal. Keep references to writer.GetWriterCallback, writer.SetAssetID, writer.Close and the pdcp.NewUploadWriterCallback creation so the test reliably fails when the synchronization bug reappears and can be run under go test -race in CI.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@internal/pdcp/race_test.go`:
- Around line 16-23: The test's fixed 20s context deadline can deadlock because
autoCommit() stops reading u.data on ctx.Done(), while the test waits for sender
goroutines to finish before calling writer.Close(), allowing senders to block
forever on channel sends; fix by avoiding a deadline that races with sender
shutdown—either remove the context timeout and use cancel/close-based shutdown,
or ensure the test calls writer.Close() (or cancel the context) before wg.Wait()
so autoCommit() unblocks u.data sends; locate and update the test setup where
ctx is created and where wg.Wait()/writer.Close() are used, and adjust to
cancel/close first (references: autoCommit, u.data, writer.Close, ctx, wg.Wait
in the failing test).
In `@pkg/tlsx/tls/tls.go`:
- Line 239: baseCfg is being mutated each loop by assigning to
baseCfg.CipherSuites, which can cause races or persistent state; inside the loop
in getConfig (or whichever function iterates), create a per-iteration copy of
the TLS config (e.g., clone baseCfg into a local variable) and set
localCfg.CipherSuites = []uint16{tlsCiphers[v]} instead of mutating baseCfg,
ensuring you reference baseCfg, localCfg, CipherSuites, tlsCiphers and the
getConfig/loop code path so the shared config is never modified across
iterations.
In `@pkg/tlsx/ztls/ztls.go`:
- Around line 261-262: The loop builds TLS clients with the wrong cipher because
baseCfg.CipherSuites is set after calling tls.Client(baseConn, baseCfg) and
baseCfg is mutated across iterations; fix by cloning the TLS config before each
iteration (do not reuse/mutate baseCfg) and set cfg.CipherSuites =
[]uint16{ztlsCiphers[v]} on the cloned config prior to calling
tls.Client(baseConn, cfg) so each client/handshake uses the intended cipher
(refer to baseCfg, tls.Client and ztlsCiphers[v] in ztls.go).
---
Outside diff comments:
In `@internal/pdcp/writer.go`:
- Around line 53-58: The code still reads shared fields without locking in
uploadChunk() and the deferred logging in autoCommit(); use the existing
snapshot helper (the same one used in upload() / getRequest()) to capture
u.assetGroupID and u.TeamID under the RLock and then use that snapshot inside
uploadChunk() and the deferred log in autoCommit(); likewise replace direct
reads in the other affected blocks (the setter paths and the regions mentioned
around getRequest()/upload() and lines ~223-243) with the snapshot helper to
eliminate races on assetGroupID/TeamID.
---
Nitpick comments:
In `@internal/pdcp/race_test.go`:
- Around line 15-66: The test TestUploadWriterExploit is a nondeterministic
stress harness; convert it into a deterministic regression test that fails on
hang/panic instead of silently passing. Replace the free-running harness with a
bounded scenario: use pdcp.NewUploadWriterCallback to create writer and a
context with a firm timeout, run the concurrent SetAssetID and callback
goroutines but signal completion via channels, and require writer.Close() to
return before the timeout (use select on done or ctx.Done()); if Close blocks or
any goroutine panics report t.Fatal. Keep references to
writer.GetWriterCallback, writer.SetAssetID, writer.Close and the
pdcp.NewUploadWriterCallback creation so the test reliably fails when the
synchronization bug reappears and can be run under go test -race in CI.
In `@internal/runner/runner.go`:
- Around line 443-449: Duplicate comma-splitting logic should be extracted into
a helper on Runner to avoid duplication: add a method like
processCommaSeparatedLine(text string, inputs chan taskInput) on Runner that
iterates strings.Split(text, ","), trims each item and calls
r.processInputItem(item, inputs) for non-empty items; then replace the
duplicated blocks (the loop that currently calls strings.Split in both the
file-input and stdin branches) with calls to r.processCommaSeparatedLine(text,
inputs) to keep behavior identical and centralize the logic.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 9b1cb8c8-07aa-4903-9754-f3b4d58c5bf4
⛔ Files ignored due to path filters (2)
.github/workflows/go-test.ymlis excluded by!**/*.ymlassets/root-certs.pemis excluded by!**/*.pem
📒 Files selected for processing (8)
internal/pdcp/race_test.gointernal/pdcp/writer.gointernal/runner/runner.gopkg/output/file_writer.gopkg/tlsx/tls/tls.gopkg/tlsx/tls/tls_test.gopkg/tlsx/ztls/ztls.gopkg/tlsx/ztls/ztls_test.go
| creds := &pdcpauth.PDCPCredentials{ | ||
| Server: "http://localhost:8080", | ||
| APIKey: "test-key", | ||
| } | ||
|
|
||
| // Use a longer timeout to allow the race to manifest | ||
| ctx, cancel := context.WithTimeout(context.Background(), 20*time.Second) | ||
| defer cancel() |
There was a problem hiding this comment.
The deadline can deadlock this test under backpressure.
In internal/pdcp/writer.go, autoCommit() stops consuming u.data when ctx.Done() fires, but this test waits for all sender goroutines before calling writer.Close(). If the 20s deadline expires first—or something on localhost:8080 is slow instead of refusing fast—the callback goroutines can block forever on channel sends and wg.Wait() never returns.
🧪 Suggested follow-up
import (
"context"
"fmt"
+ "io"
+ "net/http"
+ "net/http/httptest"
"sync"
"testing"
- "time"
"github.com/projectdiscovery/tlsx/pkg/tlsx/clients"
"github.com/projectdiscovery/tlsx/internal/pdcp"
pdcpauth "github.com/projectdiscovery/utils/auth/pdcp"
)
func TestUploadWriterExploit(t *testing.T) {
+ server := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
+ w.Header().Set("Content-Type", "application/json")
+ _, _ = io.WriteString(w, `{"id":"0123456789abcdefghij"}`)
+ }))
+ defer server.Close()
+
creds := &pdcpauth.PDCPCredentials{
- Server: "http://localhost:8080",
+ Server: server.URL,
APIKey: "test-key",
}
- ctx, cancel := context.WithTimeout(context.Background(), 20*time.Second)
+ ctx, cancel := context.WithCancel(context.Background())
defer cancel()Also applies to: 47-65
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@internal/pdcp/race_test.go` around lines 16 - 23, The test's fixed 20s
context deadline can deadlock because autoCommit() stops reading u.data on
ctx.Done(), while the test waits for sender goroutines to finish before calling
writer.Close(), allowing senders to block forever on channel sends; fix by
avoiding a deadline that races with sender shutdown—either remove the context
timeout and use cancel/close-based shutdown, or ensure the test calls
writer.Close() (or cancel the context) before wg.Wait() so autoCommit() unblocks
u.data sends; locate and update the test setup where ctx is created and where
wg.Wait()/writer.Close() are used, and adjust to cancel/close first (references:
autoCommit, u.data, writer.Close, ctx, wg.Wait in the failing test).
| _ = baseConn.Close() | ||
| }() | ||
| stats.IncrementCryptoTLSConnections() | ||
| baseCfg.CipherSuites = []uint16{tlsCiphers[v]} |
There was a problem hiding this comment.
Shared config mutation across iterations.
baseCfg is obtained once before the loop (line 194) and is mutated on each iteration by assigning to CipherSuites. While the current code runs sequentially, this pattern is error-prone:
- If concurrency is ever introduced, this becomes a data race.
- The mutation persists across iterations, which could cause subtle bugs if
getConfigreturns a shared or cached config.
Consider cloning the config inside the loop or using a local config variable per iteration.
🛠️ Suggested fix
for _, v := range toEnumerate {
func() {
// create new baseConn and pass it to tlsclient
baseConn, err := pool.Acquire(context.Background())
if err != nil {
return
}
defer func() {
_ = baseConn.Close()
}()
stats.IncrementCryptoTLSConnections()
- baseCfg.CipherSuites = []uint16{tlsCiphers[v]}
+ cfg := baseCfg.Clone()
+ cfg.CipherSuites = []uint16{tlsCiphers[v]}
- conn := tls.Client(baseConn, baseCfg)
+ conn := tls.Client(baseConn, cfg)📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| baseCfg.CipherSuites = []uint16{tlsCiphers[v]} | |
| cfg := baseCfg.Clone() | |
| cfg.CipherSuites = []uint16{tlsCiphers[v]} | |
| conn := tls.Client(baseConn, cfg) |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@pkg/tlsx/tls/tls.go` at line 239, baseCfg is being mutated each loop by
assigning to baseCfg.CipherSuites, which can cause races or persistent state;
inside the loop in getConfig (or whichever function iterates), create a
per-iteration copy of the TLS config (e.g., clone baseCfg into a local variable)
and set localCfg.CipherSuites = []uint16{tlsCiphers[v]} instead of mutating
baseCfg, ensuring you reference baseCfg, localCfg, CipherSuites, tlsCiphers and
the getConfig/loop code path so the shared config is never modified across
iterations.
…nd data races 1. pkg/tlsx/ztls/ztls.go: - Clone TLS config per cipher iteration instead of mutating shared baseCfg - Set CipherSuites BEFORE creating tls.Client to avoid wrong cipher selection 2. internal/pdcp/writer.go: - autoCommit: Add RLock-protected reads for assetGroupID and TeamID - uploadChunk: Add RLock-protected reads for assetGroupID and TeamID - Prevents data races with concurrent SetAssetID/SetTeamID calls
Neo - PR Security ReviewNo security issues found Highlights
Comment |
There was a problem hiding this comment.
🧹 Nitpick comments (2)
pkg/tlsx/ztls/ztls.go (1)
337-356: Goroutine-based handshake timeout implementation looks correct.The pattern properly prevents goroutine leaks: the buffered channel allows the handshake goroutine to exit even if the context times out, and closing the connection forces the blocked
Handshake()to return.One minor style point: Go convention is for
context.Contextto be the first parameter. Consider reordering for consistency with the standard library and other ProjectDiscovery code.-func (c *Client) tlsHandshakeWithTimeout(tlsConn *tls.Conn, ctx context.Context) error { +func (c *Client) tlsHandshakeWithTimeout(ctx context.Context, tlsConn *tls.Conn) error {This would also require updating the call sites (lines 143 and 272).
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/tlsx/ztls/ztls.go` around lines 337 - 356, Reorder the tlsHandshakeWithTimeout function signature to follow Go convention by making context.Context the first parameter (change to tlsHandshakeWithTimeout(ctx context.Context, tlsConn *tls.Conn) error) and update all call sites that invoke tlsHandshakeWithTimeout to pass the context as the first argument; ensure the internal use of ctx remains unchanged and rebuild to catch any missed call site updates (refer to function name tlsHandshakeWithTimeout and its callers to locate changes).internal/pdcp/writer.go (1)
247-261: Potential race onu.uploadURL.Pathmodification.While the protected fields are correctly read under the lock,
u.uploadURL.Pathis modified (lines 254, 258) after releasing the lock. Currently this is safe becausegetRequestis only called from the singleautoCommitgoroutine, but if concurrent access is ever introduced, this would race.Consider constructing the URL locally instead of mutating shared state:
♻️ Suggested refactor to avoid shared URL mutation
func (u *UploadWriter) getRequest(bin []byte) (*retryablehttp.Request, error) { var method, url string u.mu.RLock() assetID := u.assetGroupID assetName := u.assetGroupName teamID := u.TeamID + baseURL := u.uploadURL.String() u.mu.RUnlock() + parsedURL, err := urlutil.Parse(baseURL) + if err != nil { + return nil, errkit.Wrap(err, "could not parse upload url") + } + if assetID == "" { - u.uploadURL.Path = uploadEndpoint + parsedURL.Path = uploadEndpoint method = http.MethodPost - url = u.uploadURL.String() + parsedURL.Update() + url = parsedURL.String() } else { - u.uploadURL.Path = fmt.Sprintf(appendEndpoint, assetID) + parsedURL.Path = fmt.Sprintf(appendEndpoint, assetID) method = http.MethodPatch - url = u.uploadURL.String() + parsedURL.Update() + url = parsedURL.String() }Alternatively, extend the mutex protection to cover the URL path modification if keeping the current approach.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/pdcp/writer.go` around lines 247 - 261, The code mutates shared state u.uploadURL.Path after releasing the lock which risks a race; update getRequest to avoid mutating u.uploadURL by either (1) making a local copy of the URL (e.g., clone u.uploadURL into a local url variable and set url.Path locally) before calling String(), or (2) hold the mutex while modifying u.uploadURL.Path; reference u.uploadURL.Path, getRequest, autoCommit and the existing u.mu.RLock()/RUnlock() when locating where to apply the fix.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@internal/pdcp/writer.go`:
- Around line 247-261: The code mutates shared state u.uploadURL.Path after
releasing the lock which risks a race; update getRequest to avoid mutating
u.uploadURL by either (1) making a local copy of the URL (e.g., clone
u.uploadURL into a local url variable and set url.Path locally) before calling
String(), or (2) hold the mutex while modifying u.uploadURL.Path; reference
u.uploadURL.Path, getRequest, autoCommit and the existing u.mu.RLock()/RUnlock()
when locating where to apply the fix.
In `@pkg/tlsx/ztls/ztls.go`:
- Around line 337-356: Reorder the tlsHandshakeWithTimeout function signature to
follow Go convention by making context.Context the first parameter (change to
tlsHandshakeWithTimeout(ctx context.Context, tlsConn *tls.Conn) error) and
update all call sites that invoke tlsHandshakeWithTimeout to pass the context as
the first argument; ensure the internal use of ctx remains unchanged and rebuild
to catch any missed call site updates (refer to function name
tlsHandshakeWithTimeout and its callers to locate changes).
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 81402191-4fc9-48ae-b022-0b57abf109e4
📒 Files selected for processing (2)
internal/pdcp/writer.gopkg/tlsx/ztls/ztls.go
Closes #859
Summary
Fixes issue #859 - comma-separated items in list file are not processed correctly.
Problem
When using
-lflag with a file containing comma-separated values on a single line, only the first value was processed instead of splitting and processing each item.Solution
Split input lines by comma in
normalizeAndQueueInputsfunction, matching the behavior of the-uflag which already handles comma-separated values.Changes
Testing
Verified that comma-separated values on a single line are now processed as individual items. Compatible with existing single-value-per-line format.
Bounty Submission
Crypto wallets for bounty:
Summary by CodeRabbit
New Features
Bug Fixes
Tests