winlogbeat: replace Python system tests with Go testscript tests#49012

Draft
andrewkroh wants to merge 2 commits into elastic:main from
andrewkroh:winlogbeat/feat/replace-python-tests

@andrewkroh andrewkroh commented Feb 20, 2026

Proposed commit message

Delete the Python-based system test suite (tests/system/) and replace it
with pure Go tests using rogpeppe/go-internal/testscript. There is no
more Python in Winlogbeat's test infrastructure.

The new test suite lives in winlogbeat/tests/testscript/ with txtar
scripts organized into subdirectories: export/ (cross-platform),
config/ (Windows), eventlog/ (Windows), and evtx/ (Windows). Each
subdirectory runs as a subtest for targeted execution.

Test commands implemented for txtar scripts:
- write-event, write-multiline-event, clear-event-log: event log setup
- check-event-count, check-event-field, check-event-field-exists,
  check-event-field-absent, check-event-field-contains: assertions
- wait-for-event-count: polling for async event delivery
- envsubst, sleep: utilities
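
To make the assertion commands more concrete, here is a minimal sketch of what a check-event-field style check could look like, assuming the events are available as newline-delimited JSON. The function names, argument order, and output format are assumptions for illustration and may differ from the PR's implementation.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// lookupField walks a dotted path such as "winlog.event_id" through a
// decoded JSON event and reports whether the field exists.
func lookupField(event map[string]any, path string) (any, bool) {
	var cur any = event
	for _, key := range strings.Split(path, ".") {
		obj, ok := cur.(map[string]any)
		if !ok {
			return nil, false
		}
		cur, ok = obj[key]
		if !ok {
			return nil, false
		}
	}
	return cur, true
}

// checkEventField asserts that the event at index has path equal to want.
// Values are compared by their default string formatting.
func checkEventField(ndjson string, index int, path, want string) error {
	lines := strings.Split(strings.TrimSpace(ndjson), "\n")
	if index < 0 || index >= len(lines) {
		return fmt.Errorf("event %d out of range (have %d)", index, len(lines))
	}
	var event map[string]any
	if err := json.Unmarshal([]byte(lines[index]), &event); err != nil {
		return fmt.Errorf("decoding event %d: %w", index, err)
	}
	got, ok := lookupField(event, path)
	if !ok {
		return fmt.Errorf("field %q is absent", path)
	}
	if s := fmt.Sprint(got); s != want {
		return fmt.Errorf("field %q = %q, want %q", path, s, want)
	}
	return nil
}

func main() {
	// Hypothetical output with a single event.
	out := `{"winlog":{"event_id":"100","channel":"Application"}}`
	fmt.Println(checkEventField(out, 0, "winlog.event_id", "100"))
	fmt.Println(checkEventField(out, 0, "winlog.user.name", "someone"))
}
```

The exists, absent, and contains variants would follow the same shape, differing only in the final comparison.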

The write-event command defaults to the current process user's SID,
matching the Python write_event_log() behavior, so that user identity
fields (winlog.user.*) are present in events and testable.

While developing these tests, we identified and fixed a bug in
runner.go where io.EOF from no_more_events: stop was checked after
the backoff handler rather than before it. Since IsRecoverable()
returns true for io.EOF, this caused infinite retries and silently
dropped the final batch of events returned alongside the EOF. This
turned out to be the same issue reported in #47388. An evtx test
(read_all_events.txtar) reads sysmon-9.01.evtx with batch_read_size=5,
forcing 7 Read() calls where the final call returns 2 records with
io.EOF, proving the fix end-to-end.

Additional changes:
- Remove Python mage targets from winlogbeat and x-pack/winlogbeat
- Simplify make.bat to use 'go run' for mage
- Remove TestSystem from main_test.go

Checklist

  • My code follows the style guidelines of this project
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • I have made corresponding changes to the default configuration files
  • I have added tests that prove my fix is effective or that my feature works. Where relevant, I have used the stresstest.sh script to run them under stress conditions and race detector to verify their stability.
  • I have added an entry in ./changelog/fragments using the changelog tool.

How to test this PR locally

cd winlogbeat
.\make.bat unitTest

Related issues

@andrewkroh andrewkroh added the Winlogbeat and backport-skip labels Feb 20, 2026
@botelastic botelastic bot added the needs_team label Feb 20, 2026
@andrewkroh andrewkroh force-pushed the winlogbeat/feat/replace-python-tests branch 3 times, most recently from a851dbf to 6e8efe9, February 20, 2026 17:08
@andrewkroh andrewkroh added the Team:Security-Windows Platform label Feb 20, 2026
@botelastic botelastic bot removed the needs_team label Feb 20, 2026
@andrewkroh andrewkroh force-pushed the winlogbeat/feat/replace-python-tests branch 4 times, most recently from d6ff03a to c43682c, February 20, 2026 19:57

Fixes elastic#47388
@andrewkroh andrewkroh force-pushed the winlogbeat/feat/replace-python-tests branch from c43682c to 36516b9, February 20, 2026 20:03
On busy CI agents (observed on Windows 2019), events written by
ReportEvent may not be immediately visible to event log readers.
If winlogbeat starts before the events are committed, it sees an
empty channel, triggers no_more_events: stop, and exits with 0
events — causing spurious test failures.

Add a wait-for-event-log command that polls the Windows event log
via EvtQuery until the expected number of events are visible. Insert
it between write-event and exec winlogbeat in all eventlog tests.

Also add explanations to all //nolint:errcheck directives to satisfy
the nolintlint linter.
@andrewkroh andrewkroh force-pushed the winlogbeat/feat/replace-python-tests branch from 36516b9 to dea09cd, February 20, 2026 20:44
Labels

  • backport-skip: Skip notification from the automated backport with mergify
  • Team:Security-Windows Platform: Windows Platform Team in Security Solution
  • Winlogbeat

Development

Successfully merging this pull request may close these issues.

[winlogbeat] when ingesting evtx file processing stops at 512

1 participant