Test Runner Review - test_runner.zig

Executive Summary

This review analyzes /Users/williamcory/guillotine-mini/test_runner.zig and its helper module /Users/williamcory/guillotine-mini/test/utils.zig. The test runner is a comprehensive test execution system with support for parallel execution, multiple output formats, and progress tracking. While generally well-implemented, several issues were identified ranging from critical security concerns to code quality improvements.

Overall Assessment: Good foundation with room for improvement in error handling, resource management, and security.

1. Incomplete Features

1.1 Missing JSON/XML Escaping (CRITICAL)

Location: test/utils.zig:697-705

fn escapeJSON(s: []const u8) []const u8 {
    // For simplicity, return as-is. In production, should escape quotes, backslashes, etc.
    return s;
}

fn escapeXML(s: []const u8) []const u8 {
    // For simplicity, return as-is. In production, should escape <, >, &, quotes, etc.
    return s;
}

Issue: These stub implementations could produce invalid JSON/XML output when test names or error messages contain special characters (", \, <, >, &).

Impact:

Broken JSON output parsing in CI/CD pipelines
XML injection vulnerabilities in JUnit reports
Data corruption when test names contain quotes or special characters

Recommendation: Implement proper escaping or use Zig's standard library JSON/XML utilities.

fn escapeJSON(allocator: std.mem.Allocator, s: []const u8) ![]const u8 {
    var result = std.ArrayList(u8).init(allocator);
    for (s) |c| {
        switch (c) {
            '"' => try result.appendSlice("\\\""),
            '\\' => try result.appendSlice("\\\\"),
            '\n' => try result.appendSlice("\\n"),
            '\r' => try result.appendSlice("\\r"),
            '\t' => try result.appendSlice("\\t"),
            else => try result.append(c),
        }
    }
    return result.toOwnedSlice();
}

1.2 Incomplete Progress Reporting

Location: test_runner.zig:118-122

Progress reporting only works when has_tty is true, meaning CI/CD environments get no progress feedback during sequential execution.

Recommendation: Add periodic logging for non-TTY environments:

if (output_format == .pretty) {
    if (has_tty) {
        try utils.printProgress(stdout, i + 1, test_indices.items.len, suite_name);
    } else if ((i + 1) % 10 == 0) {
        // Log every 10 tests for CI/CD
        try stdout.print("[{d}/{d}] Running {s}...\n", .{i + 1, test_indices.items.len, suite_name});
    }
}

2. TODOs and Comments

No explicit TODO comments found, but the stub functions above serve as implicit TODOs.

3. Bad Code Practices

3.1 Hardcoded Timezone Offset (BUG)

Location: test_runner.zig:289

const hours: u32 = @intCast(@mod(@divTrunc(now_s, 3600) - 8, 24)); // PST

Issues:

Hardcoded PST timezone (UTC-8)
Breaks for users in other timezones
Doesn't account for DST
Uses current machine time instead of UTC

Recommendation: Either use UTC or detect the system timezone:

// Option 1: Use UTC
const hours: u32 = @intCast(@mod(@divTrunc(now_s, 3600), 24));
try writer.print("{s}{d:0>2}:{d:0>2}:{d:0>2} UTC{s}\n", .{...});

// Option 2: Remove timestamp entirely (duration is more useful)
// Just remove the "Start at" section

3.2 Magic Numbers

Location: Multiple instances

stdout_buffer = try std.ArrayList(u8).initCapacity(allocator, 8192);  // Line 15
while (i < full_name.len - 5) : (i += 1) {  // Line 70, 92 - magic number 5
const timeout_ns: i128 = 60 * std.time.ns_per_s;  // Line 202 - hardcoded 60s
std.Thread.sleep(3 * std.time.ns_per_ms);  // Line 231 - magic number 3
const bar_width = 20;  // Line 139 - magic number

Recommendation: Extract as named constants:

const STDOUT_BUFFER_SIZE = 8192;
const TEST_DOT_LENGTH = ".test.".len;
const TEST_TIMEOUT_SECONDS = 60;
const POLL_INTERVAL_MS = 3;
const PROGRESS_BAR_WIDTH = 20;

3.3 Silent Error Handling

Location: test/utils.zig:757-760

const result = runTestInProcess(ctx.allocator, task.index) catch |err| {
    std.debug.print("Error running test {d}: {}\n", .{ task.index, err });
    continue;  // Silently skip the test
};

Issue: Test failures are only printed to debug output and the test is silently skipped. The result array won't contain this test, potentially hiding failures.

Recommendation: Create a failed result entry:

const result = runTestInProcess(ctx.allocator, task.index) catch |err| {
    const t = builtin.test_functions[task.index];
    const error_msg = std.fmt.allocPrint(ctx.allocator,
        "Internal error running test: {}", .{err}) catch "Internal error";
    task.result = TestResult{
        .name = t.name,
        .suite = extractSuiteName(t.name),
        .test_name = extractTestName(t.name),
        .passed = false,
        .error_msg = error_msg,
        .duration_ns = 0,
    };
    continue;
};

3.4 Inconsistent Allocator Usage

Location: test/utils.zig:627, 879

var suite_map = std.StringHashMap(std.ArrayList(TestResult)).init(std.heap.page_allocator);
var sorted: std.ArrayList(TestResult) = .{};  // Uses page_allocator implicitly

Issue: Uses std.heap.page_allocator directly instead of the passed allocator. This bypasses memory tracking and limits testability.

Recommendation: Always use the passed allocator:

var suite_map = std.StringHashMap(std.ArrayList(TestResult)).init(allocator);
var sorted = std.ArrayList(TestResult).init(allocator);

3.5 Potential Race Condition in Progress Display

Location: test/utils.zig:807-817

for (0..tasks.len) |i| {
    tasks[i].mutex.lock();
    if (tasks[i].result) |r| {
        if (r.passed) {
            passed += 1;
        } else {
            failed += 1;
        }
    }
    tasks[i].mutex.unlock();
}

Issue: Between counting passed/failed and printing marks (lines 820-835), test states can change, causing marks to be printed incorrectly.

Recommendation: Snapshot the results while holding locks:

var snapshot = std.ArrayList(bool).init(allocator);
defer snapshot.deinit();
for (0..tasks.len) |i| {
    tasks[i].mutex.lock();
    if (tasks[i].result) |r| {
        try snapshot.append(r.passed);
    }
    tasks[i].mutex.unlock();
}
// Now print from snapshot

3.6 Unused Variable

Location: test/utils.zig:784

_ = i;

Issue: The loop variable i is declared but immediately discarded. This suggests the loop should either use _ directly or the variable serves no purpose.

Recommendation:

for (threads) |*thread| {  // Remove index entirely
    thread.* = try std.Thread.spawn(.{}, workerFn, .{...});
}

4. Missing Test Coverage

4.1 Unit Tests Missing

The test runner itself has no unit tests. Key areas that should be tested:

String Parsing Functions
- extractSuiteName() with various input formats
- extractTestName() edge cases
- matchesFilter() with special characters
Duration Formatting
- formatDuration() boundary values (0, very large numbers, etc.)
Output Formatting
- JSON output structure validation
- JUnit XML schema compliance
- Escaping special characters (once implemented)
Parallel Execution
- Thread safety of result collection
- Correct handling of worker failures
- Progress tracking accuracy

Example Test Structure:

test "extractSuiteName - standard format" {
    const result = extractSuiteName("module.test.my_test");
    try std.testing.expectEqualStrings("module", result);
}

test "extractSuiteName - nested modules" {
    const result = extractSuiteName("a.b.c.test.foo");
    try std.testing.expectEqualStrings("a.b.c", result);
}

test "matchesFilter - partial match" {
    try std.testing.expect(matchesFilter("foo.test.bar", "bar"));
    try std.testing.expect(!matchesFilter("foo.test.baz", "bar"));
}

4.2 Integration Tests Missing

No tests verify:

End-to-end test execution with various filters
Parallel vs sequential execution produces same results
Output format correctness
Timeout handling
Memory leak detection (exit code 2)

5. Other Issues

5.1 Memory Management Concerns

5.1.1 No Cleanup on Early Return

Location: test_runner.zig:15-17

var stdout_buffer = try std.ArrayList(u8).initCapacity(allocator, 8192);
defer stdout_buffer.deinit(allocator);

Issue: The defer correctly uses allocator, but if any error occurs before the defer, memory leaks. However, since we use a GPA at the top level, this is caught.

Note: This is actually fine given the GPA usage, but the pattern is unusual.

5.1.2 Leaked Error Messages in Worker

Location: test/utils.zig:757-760

If runTestInProcess fails, the error message is never freed because the result is not stored.

5.2 Platform-Specific Code Without Guards

Location: test/utils.zig:180-194

const pid = if (!nofork) try std.posix.fork() else 0;

Issue: fork() is POSIX-only and will fail on Windows. The code doesn't compile on Windows.

Recommendation: Add platform checks:

const supports_fork = builtin.os.tag != .windows;
const nofork = std.posix.getenv("TEST_NOFORK") != null or !supports_fork;

if (!nofork) {
    // Only fork on POSIX systems
    const pid = try std.posix.fork();
    // ...
}

5.3 Potential Integer Overflow

Location: test_runner.zig:137

const total_duration = @as(u64, @intCast(end_time - start_time));

Issue: If end_time < start_time (clock adjustment), this will panic. While unlikely, system clock adjustments can happen.

Recommendation: Use saturating arithmetic or monotonic time:

const total_duration = if (end_time >= start_time)
    @as(u64, @intCast(end_time - start_time))
    else 0;

5.4 Non-Atomic Test Index Access

Location: test/utils.zig:746-753

ctx.mutex.lock();
const task_idx = ctx.next_idx.*;
if (task_idx >= ctx.tasks_len) {
    ctx.mutex.unlock();
    break;
}
ctx.next_idx.* += 1;
ctx.mutex.unlock();

Issue: While correctly guarded by mutex, this could be simplified using atomic operations for better performance.

Recommendation: Consider std.atomic.Atomic(usize) for lock-free task dispatch.

5.5 Hardcoded Path in Header

Location: test_runner.zig:69

try std.fmt.format(stdout, " {s}{s}~/guillotine{s}\n", .{...});

Issue: Hardcoded path that doesn't reflect actual working directory.

Recommendation:

const cwd = std.fs.cwd().realpathAlloc(allocator, ".") catch "unknown";
defer allocator.free(cwd);
try std.fmt.format(stdout, " {s}{s}{s}{s}\n", .{
    Color.cyan, Icons.arrow, cwd, Color.reset
});

5.6 Busy-Wait in Test Timeout

Location: test/utils.zig:208-232

while (true) {
    wait_result = std.posix.waitpid(pid, std.posix.W.NOHANG);
    if (wait_result.pid == pid) break;
    // ... timeout check ...
    std.Thread.sleep(3 * std.time.ns_per_ms);  // Busy-wait every 3ms
}

Issue: Polling every 3ms is excessive and wastes CPU. The timeout is 60 seconds, but we're checking 20,000 times.

Recommendation: Increase sleep interval to 100ms or use blocking wait with signals.

5.7 Missing Error Context

Location: Throughout

When errors occur, context about which test was being run is often missing. For example:

const parallel_results = try utils.runTestsParallel(allocator, test_indices.items, max_workers);

If this fails, we don't know which test caused the failure.

Recommendation: Wrap critical operations with error context:

const parallel_results = utils.runTestsParallel(allocator, test_indices.items, max_workers) catch |err| {
    std.debug.print("Failed to run tests in parallel: {}\n", .{err});
    return err;
};

6. Code Quality Observations

6.1 Good Practices ✓

Proper Resource Management: Extensive use of defer for cleanup
Configuration via Environment Variables: Flexible runtime configuration
Comprehensive Output Formats: JSON, JUnit, and pretty printing
Memory Leak Detection: Exit code 2 for leaks
Timeout Protection: 60-second timeout prevents hanging tests
Parallel Execution: Utilizes multiple cores efficiently
Progress Feedback: Real-time progress bars and test markers

6.2 Documentation

Missing: Module-level documentation explaining architecture
Missing: Function-level docs for public API functions
Missing: Examples of environment variable usage

Recommendation: Add comprehensive documentation:

//! Test Runner for guillotine-mini
//!
//! This module provides a comprehensive test execution framework with:
//! - Parallel test execution with worker threads
//! - Multiple output formats (pretty, JSON, JUnit)
//! - Test filtering by name
//! - Memory leak detection
//! - Timeout protection
//!
//! Environment Variables:
//! - TEST_FILTER: Filter tests by substring match
//! - TEST_FORMAT: Output format (json, junit, pretty)
//! - TEST_SEQUENTIAL: Disable parallel execution
//! - TEST_WORKERS: Number of worker threads (default: CPU count)
//! - TEST_NOFORK: Disable process forking for debugging

7. Security Considerations

7.1 Incomplete Input Sanitization (MEDIUM)

Test names come from builtin.test_functions, which is compiler-generated and trusted. However, error messages from tests are not sanitized before being output to JSON/XML.

Risk: If a test deliberately includes shell metacharacters or ANSI escape sequences in error messages, these could be injected into output consumed by other tools.

Recommendation: Sanitize all user-controlled strings before output.

7.2 Process Management (LOW)

Location: test/utils.zig:221-224

std.posix.kill(pid, std.posix.SIG.KILL) catch |err| {
    std.debug.print("Warning: Failed to kill timed-out test process {d}: {}\n", .{ pid, err });
};

Issue: If kill fails, the process may become a zombie or orphan.

Recommendation: Use a more robust cleanup strategy with fallback options.

8. Performance Considerations

8.1 Buffer Pre-allocation

Location: test_runner.zig:15

var stdout_buffer = try std.ArrayList(u8).initCapacity(allocator, 8192);

Observation: 8KB is reasonable for most output, but large test suites may cause multiple reallocations.

Recommendation: Consider making this configurable or dynamically sizing based on test count.

8.2 Progress Update Frequency

Location: test/utils.zig:853

std.Thread.sleep(50 * std.time.ns_per_ms); // Update every 50ms

Observation: 50ms gives smooth visual feedback but may be excessive for very fast test suites.

Recommendation: Acceptable as-is, but could be adaptive based on average test duration.

8.3 String Allocations in Hot Path

Location: Multiple calls to extractSuiteName and extractTestName

These functions are called for every test and don't allocate, which is good. However, they perform linear searches for ".test." which could be optimized.

Recommendation: Consider caching results or using compile-time computation if possible.

9. Recommendations Summary

High Priority (Should Fix)

✅ Implement escapeJSON() and escapeXML() properly
✅ Fix hardcoded PST timezone (use UTC or remove)
✅ Handle worker thread errors properly (don't silently skip tests)
✅ Add platform guards for fork() to support Windows

Medium Priority (Should Consider)

✅ Use passed allocator consistently (avoid page_allocator)
✅ Extract magic numbers to named constants
✅ Add unit tests for utility functions
✅ Add module and function documentation
✅ Fix race condition in progress display
✅ Add CI-friendly progress logging for non-TTY

Low Priority (Nice to Have)

✅ Use atomic operations for task index
✅ Increase sleep interval in timeout polling
✅ Add integration tests
✅ Make stdout buffer size configurable
✅ Add error context to all error returns

10. Conclusion

The test runner is a well-structured, feature-rich system that provides excellent developer experience. The main concerns are:

Incomplete implementations (JSON/XML escaping) that could cause issues in production
Platform portability (Windows support)
Error handling that sometimes silently suppresses failures
Code quality issues like magic numbers and hardcoded values

With the recommended fixes, particularly the high-priority items, this would be a production-ready test runner suitable for CI/CD integration.

Estimated Effort:

High priority fixes: 4-6 hours
Medium priority improvements: 8-10 hours
Low priority enhancements: 6-8 hours
Total: 18-24 hours for complete remediation

FilesExpand file tree

test_runner.zig.md

Latest commit

History