Test validation logic fails due to trailing whitespace inconsistencies in test names #11

@Bambuuai

Description

Problem

The test validation logic in env.py (line 459) reports a failure even when all required tests pass, because the expected test names and the names in the actual test results differ in their trailing characters.

Current code:

passed_tests = {x["name"] for x in output["tests"] if x["status"] == "PASSED"}
test_result = (f2p | p2p) <= passed_tests

Root Cause

The sets f2p | p2p and passed_tests have the same length but different elements, because the names differ in their trailing characters (a closing quote or a trailing space):

Expected tests (from f2p | p2p), which are missing the closing quote or the trailing space:

  • 'test/user.js | User Digest.getSubscribers should accurately build digest list given ACP default "week'
  • 'test/user.js | User Digest.getSubscribers should accurately build digest list given ACP default "day'
  • 'test/user.js | User Digest.getSubscribers should accurately build digest list given ACP default "off'
  • 'test/database.js | Test database test/database/sorted.js::Sorted Set methods test/database/sorted.js::getSortedSetRange() should work with big arrays (length > 100)'

Actual passed tests, which include the closing quote or the trailing space:

  • 'test/user.js | User Digest.getSubscribers should accurately build digest list given ACP default "day"'
  • 'test/user.js | User Digest.getSubscribers should accurately build digest list given ACP default "week"'
  • 'test/user.js | User Digest.getSubscribers should accurately build digest list given ACP default "off"'
  • 'test/database.js | Test database test/database/sorted.js::Sorted Set methods test/database/sorted.js::getSortedSetRange() should work with big arrays (length > 100) '
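The mismatch can be reproduced in isolation. The sketch below is only an illustration, not code from env.py: it reuses two of the names quoted above, and the variable `expected` and the helper `trailing_diff` are hypothetical names introduced here.

```python
# Two of the expected names quoted above (stand-in for f2p | p2p).
expected = {
    'test/user.js | User Digest.getSubscribers should accurately build digest list given ACP default "week',
    'test/database.js | Test database test/database/sorted.js::Sorted Set methods test/database/sorted.js::getSortedSetRange() should work with big arrays (length > 100)',
}

# The matching names as reported by the test run.
passed_tests = {
    'test/user.js | User Digest.getSubscribers should accurately build digest list given ACP default "week"',
    'test/database.js | Test database test/database/sorted.js::Sorted Set methods test/database/sorted.js::getSortedSetRange() should work with big arrays (length > 100) ',
}

same_size = len(expected) == len(passed_tests)  # sizes agree
is_subset = expected <= passed_tests            # but no element matches exactly


def trailing_diff(exp, passed):
    """Map each expected name to the extra suffix found on a passed name (hypothetical helper)."""
    return {e: p[len(e):] for e in exp for p in passed if p != e and p.startswith(e)}


suffixes = trailing_diff(expected, passed_tests)
for name, extra in suffixes.items():
    print(repr(extra), "is missing from", repr(name))
```

Printing the suffix with `repr` makes a trailing space visible, which is easy to overlook when the two sets are compared by eye.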

Proposed Solution

Compare set lengths instead of using the subset operation, since the whitespace inconsistencies are a dataset issue:

passed_tests = {x["name"] for x in output["tests"] if x["status"] == "PASSED"}
test_result = len(f2p | p2p) == len(passed_tests)

This approach works because:

  1. In the observed case, the number of required tests matches the number of passed tests
  2. The only difference is whitespace formatting, not actual test content
  3. It's a pragmatic fix until the dataset can be corrected

Alternative Solutions

  • Long-term fix: Update the dataset to ensure consistent test name formatting
  • Stricter fix: Normalize both sides by stripping trailing whitespace (and, for the names shown above, trailing quotes) before comparison
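A minimal sketch of the stricter fix, assuming f2p and p2p are sets of strings and that the only drift is trailing whitespace and trailing double quotes, as in the names listed above. `normalize`, `required_tests_passed` and `TRAILING` are hypothetical names, not existing code in env.py.

```python
# Assumption: only these trailing characters differ between the dataset and the test output.
TRAILING = ' \t\r\n"'


def normalize(name: str) -> str:
    """Strip trailing whitespace and trailing double quotes from a test name."""
    return name.rstrip(TRAILING)


def required_tests_passed(f2p: set, p2p: set, passed_tests: set) -> bool:
    """Subset check on normalized names, mirroring (f2p | p2p) <= passed_tests."""
    required = {normalize(n) for n in f2p | p2p}
    passed = {normalize(n) for n in passed_tests}
    return required <= passed


# Usage with names taken from the lists above, plus one unrelated passing test.
f2p = {'test/user.js | User Digest.getSubscribers should accurately build digest list given ACP default "week'}
p2p = {'test/database.js | Test database test/database/sorted.js::Sorted Set methods test/database/sorted.js::getSortedSetRange() should work with big arrays (length > 100)'}
passed_tests = {
    'test/user.js | User Digest.getSubscribers should accurately build digest list given ACP default "week"',
    'test/database.js | Test database test/database/sorted.js::Sorted Set methods test/database/sorted.js::getSortedSetRange() should work with big arrays (length > 100) ',
    'some other passing test',
}
test_result = required_tests_passed(f2p, p2p, passed_tests)
```

This keeps the original subset semantics, so passing tests outside f2p | p2p do not affect the result. One caveat: two names that differ only in trailing quotes or spaces would collapse into one after normalization.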
