Fix missing beat receiver trace logs from diagnostic bundle#14716
Fix missing beat receiver trace logs from diagnostic bundle#14716macdewee wants to merge 5 commits into
Conversation
|
This pull request does not have a backport label. Could you fix it @macdewee? 🙏
|
|
Can you add an integration test showing that we do, in fact, include these in the bundle? Modifying the existing filebeat diagnostic tests should work. |
|
Pinging @elastic/elastic-agent-control-plane (Team:Elastic-Agent-Control-Plane) |
Signed-off-by: Dominik Rosiek <dominik.rosiek@elastic.co>
77cbcf8 to
415485c
Compare
This comment has been minimized.
This comment has been minimized.
| } | ||
|
|
||
| if err := collectServiceComponentsLogs(zw); err != nil { | ||
| return fmt.Errorf("failed to collect endpoint-security logs: %w", err) |
There was a problem hiding this comment.
Should we just log a warning here and continue if error? Otherwise we wouldn't get any logs
| // Beat receivers write trace logs under {pathsHome}/components/logs. | ||
| // Collect them under logs/<commitName>/ (same as agent logs) so the zip layout matches | ||
| // 8.19.x where trace files lived under {pathsHome}/logs directly. | ||
| return walkLogPath(filepath.Join(pathsHome, "components", "logs"), commitName, excludeEvents, zw, ts) |
There was a problem hiding this comment.
The problem with this approach is that if the same file exists in both {pathsHome}/logs/ and {pathsHome}/components/logs/ it is going to produce two zip entries with the same path logs/<commitName>/{file}.
unzip for example will replace the first one with the second which can be confusing when analyzing the diagnostics.
I would keep the original structure for component logs (inside components/logs) instead of putting everything in logs/.
| // zipLogs walks paths.Logs() and copies the file structure into zw in "logs/" | ||
| func zipLogsWithPath(pathsHome, commitName string, collectServices, excludeEvents bool, zw *zip.Writer, ts time.Time) error { | ||
| // zipLogsWithPath walks {pathsHome}/logs and {pathsHome}/components/logs and copies them into zw | ||
| // under "logs/<commitName>/" and "logs/<commitName>/components/" respectively. |
There was a problem hiding this comment.
This is only true if https://github.com/elastic/elastic-agent/pull/14716/changes#r3342415022 is implemented. Currently all logs are stored under logs/<commitName>/, not "logs/<commitName>/components/"
Signed-off-by: Dominik Rosiek <dominik.rosiek@elastic.co>
415485c to
d5ee5ac
Compare
…crit errors Signed-off-by: Dominik Rosiek <dominik.rosiek@elastic.co>
Signed-off-by: Dominik Rosiek <dominik.rosiek@elastic.co>
This comment has been minimized.
This comment has been minimized.
Signed-off-by: Dominik Rosiek <dominik.rosiek@elastic.co>
TL;DRBoth failing steps ( Remediation
Investigation detailsRoot CauseThe new integration test case defines an expected trace pattern that does not match the archive layout produced by this PR:
But this PR’s diagnostics logic and unit tests clearly place beat receiver trace files under
So the integration test expectation is currently off by one path segment ( Evidence
Verification
Follow-up
Note 🔒 Integrity filter blocked 1 itemThe following item were blocked because they don't meet the GitHub integrity level.
To allow these resources, lower tools:
github:
min-integrity: approved # merged | approved | unapproved | noneWhat is this? | From workflow: PR Buildkite Detective Give us feedback! React with 🚀 if perfect, 👍 if helpful, 👎 if not. |
⏳ Build in-progress, with failures
Failed CI Steps
History
cc @macdewee |
What does this PR do?
In 9.x, beats run as in-process OTel receivers inside
elastic-otel-collector. Theirpath.homeis set topaths.Components()(i.e.{home}/components/), so theirpath.logsdefaults to{home}/components/logs/. Input trace logs (e.g. fromhttpjson,cel,http_endpoint,entity-analytics) are written to{home}/components/logs/{input}/.The old
zipLogsWithPathonly walked{pathsHome}/logs/— missing{pathsHome}/components/logs/entirely.Changes:
zipLogsWithPathnow walks both{pathsHome}/logs/and{pathsHome}/components/logs/, placing trace logs underlogs/{version}/{input}/in the zip alongside the agent's own logs.collectServiceComponentsLogsand thelogs/root header moved out ofzipLogsWithPathintozipLogsso they run once, not once per versioned home.walkLogPath+zipLogWalkFunchelpers so both roots share the same walk callback.Why is it important?
Without this fix,
elastic-agent diagnosticsbundles are silently missing the HTTP request trace logs that are essential for debugginghttpjson,cel, andentity-analyticsintegrations.The root cause is that beats resolve trace log paths via
ResolvePathInLogsFor, which places files under{home}/components/logs/{input}/. The diagnostics code only walked{pathsHome}/logs/and never walked{pathsHome}/components/logs/.The regression was introduced by elastic/beats#48909 (merged 2026-02-18), which added
ResolvePathInLogsForto fix tracer path validation under managed agents. Before that change, relative tracer paths were resolved against the process CWD, placing files under{ROOT}/logs/{input}/which the old diagnostics walk did cover. After it, paths resolve againstpath.logs, placing files under{home}/components/logs/{input}/which was never walked. This affects 9.x and later 8.19.x patch releases.Fixes #14359.
Checklist
[ ] I have made corresponding changes to the documentation[ ] I have made corresponding change to the default configuration files./changelog/fragmentsusing the changelog tool[ ] I have added an integration test or an E2E testDisruptive User Impact
None. The zip layout for existing log files is unchanged. Trace log files are now present in the bundle where they were previously missing.
How to test this PR locally
httpjsoninput that hasrequest.tracer.filenameconfigured.{home}/components/logs/httpjson/.elastic-agent diagnostics.logs/{version-hash}/httpjson/.Unit tests can be run directly:
go test ./internal/pkg/diagnostics/... -run TestZipLogs -vRelated issues