tests: Add automated VM testing environment with QEMU #5074
base: main
Add automated VM testing infrastructure for running Tracee integration and e2e tests in isolated Vagrant VMs using QEMU/KVM.

Key features:
- Automated VM lifecycle management (start, test, cleanup)
- Selective test execution (--unit, --integration, --e2e-inst, --e2e-net)
- Automatic VM conflict detection and resolution
- Log collection and VM preservation on failure
- Provisioning optimization: first run 10-15min, subsequent runs ~2-3min

Scripts:
- tests/run-vm-tests.sh: Host-side orchestration
- tests/run-tests-in-vm.sh: In-VM test execution

Documentation:
- docs/contributing/vm-testing.md: Comprehensive guide
- docs/contributing/setup-development-machine-with-vagrant.md: Updated
- tests/README.md: Quick reference
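The selective execution flags listed in the description could be handled along the following lines. This is a minimal sketch, not the PR's actual script: the function name `parse_test_flags` and its output format are hypothetical, and only the four flag names come from the description.

```bash
# Hypothetical sketch: parse_test_flags is an illustrative name, not from the PR.
# It maps the documented flags to suite names and rejects anything else.
parse_test_flags() {
    selected=""
    for arg in "$@"; do
        case "${arg}" in
            --unit)        suite="unit" ;;
            --integration) suite="integration" ;;
            --e2e-inst)    suite="e2e-inst" ;;
            --e2e-net)     suite="e2e-net" ;;
            *)
                echo "unknown option: ${arg}" >&2
                return 1
                ;;
        esac
        # Append the suite, separating entries with a single space.
        selected="${selected:+${selected} }${suite}"
    done
    echo "${selected}"
}

# Example: prints the suites selected by two flags.
parse_test_flags --integration --e2e-net
```

Rejecting unknown options up front means a typo such as `--e2e` fails before any VM is started rather than silently running nothing.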
Pull request overview
This PR introduces an automated VM testing infrastructure for running Tracee integration and e2e tests in isolated Vagrant VMs, addressing the need for safe test execution that won't impact developers' host systems.
Key changes:
- Automated VM lifecycle management with conflict detection and automatic cleanup
- Selective test execution supporting --unit, --integration, --e2e-inst, and --e2e-net flags
- Enhanced provisioning efficiency with run: "once" directives reducing subsequent runs from 10-15min to ~2-3min
Reviewed changes
Copilot reviewed 6 out of 7 changed files in this pull request and generated 7 comments.
| File | Description |
|---|---|
| tests/run-vm-tests.sh | Host-side orchestration script handling VM lifecycle, test execution, and cleanup |
| tests/run-tests-in-vm.sh | In-VM test executor with selective test suite support and log collection |
| tests/README.md | Quick reference for test types and VM testing approach |
| docs/contributing/vm-testing.md | Comprehensive guide covering setup, usage, troubleshooting, and best practices |
| docs/contributing/setup-development-machine-with-vagrant.md | Updated with VM testing section and improved QEMU setup instructions |
| Vagrantfile | Added run: "once" provisioning optimization and improved 9p mount options |
    # Colors for output
    RED='\033[0;31m'
    GREEN='\033[0;32m'
    YELLOW='\033[1;33m'
Copilot
AI
Nov 23, 2025
[nitpick] The YELLOW color code uses bold (1;33) while other colors don't. For consistency, consider using '\033[0;33m' to match the pattern of RED and GREEN, or add comments explaining why YELLOW needs to be bold.
Suggested change:

    - YELLOW='\033[1;33m'
    + YELLOW='\033[0;33m'
    if vagrant status 2>/dev/null | grep -E "tracee-(dev|test)-vm.*running" | grep -v "${VM_NAME}" > /dev/null; then
        warn "Another Tracee VM is running. Stopping it first..."
        vagrant halt
Copilot
AI
Nov 23, 2025
This grep pattern will always filter out ${VM_NAME} ('tracee-test-vm'), but then the code attempts to stop VMs. If only the test VM is running, this check will pass (no other VMs found) but then 'vagrant halt' will still stop the test VM. The logic should either skip the halt if no conflicting VMs exist, or check for any running VM including the test VM.
Suggested change:

    - if vagrant status 2>/dev/null | grep -E "tracee-(dev|test)-vm.*running" | grep -v "${VM_NAME}" > /dev/null; then
    -     warn "Another Tracee VM is running. Stopping it first..."
    -     vagrant halt
    + running_vms=$(vagrant status 2>/dev/null | grep -E "tracee-(dev|test)-vm.*running" | awk '{print $1}')
    + other_running_vms=$(echo "$running_vms" | grep -v "^${VM_NAME}$" || true)
    + if [[ -n "$other_running_vms" ]]; then
    +     warn "Another Tracee VM is running. Stopping it first..."
    +     vagrant halt $other_running_vms
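The filtering step in this suggestion can be tried without a running VM. The sketch below is illustrative only: `other_running_vms` as a function and the sample status text are assumptions, and real `vagrant status` output may be laid out differently.

```bash
# Hypothetical helper: given `vagrant status`-style text in $1, print the names
# of running Tracee VMs other than the one named in $2.
other_running_vms() {
    printf '%s\n' "$1" \
        | grep -E '^tracee-(dev|test)-vm[[:space:]]+running' \
        | awk '{print $1}' \
        | grep -v -x -F -- "$2" || true
}

# Illustrative sample; not captured from a real session.
sample_status='Current machine states:

tracee-dev-vm             running (qemu)
tracee-test-vm            running (qemu)'

others=$(other_running_vms "${sample_status}" "tracee-test-vm")
if [ -n "${others}" ]; then
    # A real script would pass these names to `vagrant halt`.
    echo "would halt: ${others}"
fi
```

Matching the VM name as a whole line (`-x -F`) avoids excluding a VM whose name merely contains the test VM's name, and halting by name leaves the test VM itself untouched.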
    info "Running go test (without coverage to avoid 9p permission issues)..."
    # Run the same test command as the Makefile but without -coverprofile
    GOOS=linux CC=clang GOARCH=amd64 GOFIPS140=off \
    CGO_CFLAGS="-I/vagrant/dist/libbpf/include" \
    CGO_LDFLAGS="-L/vagrant/dist/libbpf/obj -lbpf" \
    go test \
        -tags core,ebpf,lsmsupport \
        -ldflags="-s=false -w=false -extldflags \"-lelf -lz\" -X main.version=\"$(cat /vagrant/VERSION)\"" \
        -shuffle on \
        -timeout 20m \
        -race \
        -v \
        -p 1 \
        -count=1 \
        ./tests/integration/... 2>&1 | tee "${integration_log}" | tee -a "${LOG_FILE}"
Copilot
AI
Nov 23, 2025
This duplicates the integration test command from the Makefile. If the Makefile's test-integration target changes, this needs manual synchronization. Consider either: 1) invoking the Makefile target with coverage disabled via an environment variable, or 2) extracting this command to a shared script that both can use.
Suggested change:

    - info "Running go test (without coverage to avoid 9p permission issues)..."
    - # Run the same test command as the Makefile but without -coverprofile
    - GOOS=linux CC=clang GOARCH=amd64 GOFIPS140=off \
    - CGO_CFLAGS="-I/vagrant/dist/libbpf/include" \
    - CGO_LDFLAGS="-L/vagrant/dist/libbpf/obj -lbpf" \
    - go test \
    -     -tags core,ebpf,lsmsupport \
    -     -ldflags="-s=false -w=false -extldflags \"-lelf -lz\" -X main.version=\"$(cat /vagrant/VERSION)\"" \
    -     -shuffle on \
    -     -timeout 20m \
    -     -race \
    -     -v \
    -     -p 1 \
    -     -count=1 \
    -     ./tests/integration/... 2>&1 | tee "${integration_log}" | tee -a "${LOG_FILE}"
    + info "Running integration tests via Makefile (coverage disabled for VM)..."
    + # Use the Makefile's test-integration target with COVERAGE=0 to avoid coverage on 9p mounts
    + COVERAGE=0 make test-integration 2>&1 | tee "${integration_log}" | tee -a "${LOG_FILE}"
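One way to keep a single source of truth for the test flags is a shared helper that both the Makefile and the VM script call. The sketch below is an assumption-laden illustration: the function name `integration_test_args`, the coverage switch passed as an argument, and the coverage file name are all hypothetical, and the `-ldflags` and environment settings from the original command are left out for brevity.

```bash
# Hypothetical shared helper: prints the `go test` arguments for the
# integration suite. $1 = 0 disables coverage; anything else enables it.
integration_test_args() {
    coverage="${1:-1}"
    args="-tags core,ebpf,lsmsupport -shuffle on -timeout 20m -race -v -p 1 -count=1"
    if [ "${coverage}" != "0" ]; then
        # The output file name is an assumption, not taken from the Makefile.
        args="${args} -coverprofile=integration-coverage.txt"
    fi
    echo "${args} ./tests/integration/..."
}

# Inside the VM, coverage would be switched off to avoid writing to the 9p mount.
integration_test_args 0
```

With a helper like this, a change to the flag set happens in one place, and the VM path differs from the host path only in the coverage argument.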
    **Solution**:
    - Increase CPU/RAM allocation:
      ```bash
      VM_PROC=8 VM_MEM=16 vagrant up
Copilot
AI
Nov 23, 2025
The environment variable name should be VM_CPUS, not VM_PROC. The script uses VM_CPUS throughout (line 29 of run-vm-tests.sh and line 313 in this same file).
Suggested change:

    - VM_PROC=8 VM_MEM=16 vagrant up
    + VM_CPUS=8 VM_MEM=16 vagrant up
    ./tests/run-vm-tests.sh

    # Custom resources
    VM_PROC=8 VM_MEM=16 ./tests/run-vm-tests.sh
Copilot
AI
Nov 23, 2025
The environment variable name should be VM_CPUS, not VM_PROC. This is inconsistent with the script's actual variable name used in run-vm-tests.sh.
Suggested change:

    - VM_PROC=8 VM_MEM=16 ./tests/run-vm-tests.sh
    + VM_CPUS=8 VM_MEM=16 ./tests/run-vm-tests.sh
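A script can make this kind of naming mismatch visible at run time. The following sketch assumes, as the review comment states, that the scripts read `VM_CPUS`; the function name `resolve_vm_resources` and the default values 4 and 8 are placeholders, not values taken from the PR.

```bash
# Hypothetical sketch: resolve resource settings from the environment and warn
# about the VM_PROC spelling discussed in this review.
resolve_vm_resources() {
    if [ -n "${VM_PROC:-}" ]; then
        echo "warning: VM_PROC is set but not read here; use VM_CPUS" >&2
    fi
    # Defaults below are placeholders for illustration only.
    echo "VM_CPUS=${VM_CPUS:-4} VM_MEM=${VM_MEM:-8}"
}

resolve_vm_resources
```

The warning goes to stderr so that callers capturing the resolved values from stdout are not affected.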
    - **macOS**: HVF (Hypervisor Framework) acceleration on Intel/Apple Silicon
    - **Fallback**: TCG software emulation (slower but works on all systems)
    - [vagrant-qemu plugin]: Install with `vagrant plugin install vagrant-qemu`
    - [Vagrant] (2.2+)
Copilot
AI
Nov 23, 2025
The markdown link reference [Vagrant] is used but not defined. Either define the link reference at the bottom of the document (e.g., '[Vagrant]: https://www.vagrantup.com/downloads') or use an inline link format.
    echo "Mounting 9p filesystem directly to ${SYNCED_FOLDER}"
    mount -t 9p -o trans=virtio,version=9p2000.L,dfltuid=1000,dfltgid=1000,uname=vagrant,access=any shared ${SYNCED_FOLDER} || echo "Note: 9p mount may already be active"
    # Use mmap and cache=loose for better performance and permissions
    # access=client allows root to write files
Copilot
AI
Nov 23, 2025
[nitpick] The mount command now uses 'access=client' (changed from 'access=any'). While the comment on line 417 mentions this allows root to write files, it would be clearer to add a comment explaining why 'access=client' is preferred over 'access=any' in this context, especially for future maintainers.
Suggested change:

    - # access=client allows root to write files
    + # access=client ensures that file ownership and permissions on the mounted 9p filesystem
    + # match the client (VM) user, allowing root to write files and avoiding permission issues
    + # that can occur with access=any. This is important for development workflows where root
    + # or other users in the VM need to modify files in the shared folder.
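Keeping the access mode as a single parameter makes the difference between the two variants easy to see and to document. This is a rough sketch: `ninep_mount_opts` is a hypothetical helper, the option list is copied from the `access=any` command shown in the diff, the cache-related options mentioned in the new comment are omitted because their exact spelling is not shown, and `/vagrant` as the default mount point is an assumption.

```bash
# Hypothetical helper: build the 9p mount options with a configurable access mode.
ninep_mount_opts() {
    access_mode="${1:-client}"
    echo "trans=virtio,version=9p2000.L,dfltuid=1000,dfltgid=1000,uname=vagrant,access=${access_mode}"
}

# Print the command instead of running it; mounting requires root inside the VM.
echo "mount -t 9p -o $(ninep_mount_opts client) shared ${SYNCED_FOLDER:-/vagrant}"
```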
Codecov Report

✅ All modified and coverable lines are covered by tests.

Coverage diff (main vs. #5074):

| Metric | main | #5074 | +/- |
|---|---|---|---|
| Coverage | 29.72% | 30.31% | +0.58% |
| Files | 234 | 232 | -2 |
| Lines | 26186 | 25924 | -262 |
| Hits | 7785 | 7860 | +75 |
| Misses | 17864 | 17526 | -338 |
| Partials | 537 | 538 | +1 |
I'm taking this over.