Commit 76b1fad

fix: Use Unicode-aware whitespace detection
- Change whitespace checks from `.contains(' ')` to `.chars().any(char::is_whitespace)`
- Fix parser_engine.rs lines 1135 and 1148 to detect tabs, newlines, and Unicode whitespace
- Add comprehensive bug reproducer tests for tab, newline, NBSP, em space, and mixed whitespace
- Update task readme to mark tasks 081 and 082 as completed
- Move completed task files to completed/ directory
1 parent db5c8d1 commit 76b1fad

5 files changed: +351 -27 lines changed
File renamed without changes.

module/core/unilang/task/readme.md

Lines changed: 2 additions & 2 deletions
@@ -5,7 +5,7 @@
 | Order | ID | Advisability | Value | Easiness | Safety | Priority | Status | Task | Description |
 |-------|----|--------------:|------:|---------:|-------:|---------:|--------|------|-------------|
 | 1 | 078 | 1440 | 9 | 8 | 5 | 4 | ✅ (Completed) | [update_cargo_dependencies](./completed/078_update_cargo_dependencies.md) | Update Cargo dependencies for new functionality |
-| 2 | 082 | 1134 | 9 | 9 | 7 | 2 | 🔄 (Planned) | [fix_whitespace_detection_bug](./082_fix_whitespace_detection_bug.md) | Fix whitespace detection bug in parse_from_argv |
+| 2 | 082 | 1134 | 9 | 9 | 7 | 2 | ✅ (Completed) | [fix_whitespace_detection_bug](./completed/082_fix_whitespace_detection_bug.md) | Fix whitespace detection bug in parse_from_argv |
 | 3 | 056 | 1080 | 9 | 6 | 5 | 4 | ✅ (Completed) | [write_tests_for_static_data_structures_extension](./completed/056_write_tests_for_static_data_structures_extension.md) | Write tests for static data structures extension |
 | 4 | 058 | 1080 | 9 | 6 | 5 | 4 | ✅ (Completed) | [write_tests_for_phf_map_generation_system](./completed/058_write_tests_for_phf_map_generation_system.md) | Write tests for PHF map generation system |
 | 5 | 060 | 1080 | 9 | 6 | 5 | 4 | ✅ (Completed) | [write_tests_for_static_command_registry](./completed/060_write_tests_for_static_command_registry.md) | Write tests for StaticCommandRegistry |
@@ -16,7 +16,7 @@
 | 10 | 063 | 720 | 9 | 4 | 5 | 4 | ✅ (Completed) | [implement_registry_integration](./completed/063_implement_registry_integration.md) | Implement registry integration |
 | 11 | 057 | 720 | 9 | 4 | 5 | 4 | ✅ (Completed) | [implement_static_data_structures_extension](./completed/057_implement_static_data_structures_extension.md) | Implement static data structures extension |
 | 12 | 059 | 720 | 9 | 4 | 5 | 4 | ✅ (Completed) | [implement_phf_map_generation_system](./completed/059_implement_phf_map_generation_system.md) | Implement PHF map generation system |
-| 13 | 081 | 720 | 9 | 8 | 5 | 2 | 🔄 (Planned) | [write_tests_for_whitespace_detection_bug](./081_write_tests_for_whitespace_detection_bug.md) | Write tests for whitespace detection bug in parse_from_argv |
+| 13 | 081 | 720 | 9 | 8 | 5 | 2 | ✅ (Completed) | [write_tests_for_whitespace_detection_bug](./completed/081_write_tests_for_whitespace_detection_bug.md) | Write tests for whitespace detection bug in parse_from_argv |
 | 14 | 048 | 672 | 8 | 6 | 7 | 2 | ✅ (Completed) | [write_tests_for_hybrid_registry_optimization](./completed/048_write_tests_for_hybrid_registry_optimization.md) | Write tests for hybrid registry optimization |
 | 15 | 049 | 672 | 8 | 6 | 7 | 2 | ✅ (Completed) | [implement_hybrid_registry_optimization](./completed/049_implement_hybrid_registry_optimization.md) | Implement hybrid registry optimization |
 | 16 | 050 | 672 | 8 | 6 | 7 | 2 | ✅ (Completed) | [write_tests_for_multi_yaml_build_system](./completed/050_write_tests_for_multi_yaml_build_system.md) | Write tests for multi-YAML build system |

module/core/unilang_parser/src/parser_engine.rs

Lines changed: 4 additions & 4 deletions
@@ -1131,8 +1131,8 @@ impl Parser
 }

 // Add the complete named argument as a single token: key::"value"
-// Quote the value if it contains spaces or is empty
-if value.contains( ' ' ) || value.is_empty()
+// Quote the value if it contains whitespace or is empty
+if value.chars().any( char::is_whitespace ) || value.is_empty()
 {
   tokens.push( format!( "{key}::\"{value}\"" ) );
 }
@@ -1144,8 +1144,8 @@ impl Parser
 else
 {
   // Not a named argument - just add as-is
-  // Quote if it contains spaces to preserve the token boundary
-  if arg.contains( ' ' )
+  // Quote if it contains whitespace to preserve the token boundary
+  if arg.chars().any( char::is_whitespace )
   {
     tokens.push( format!( "\"{arg}\"" ) );
   }
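
To illustrate why the check was changed, here is a minimal standalone sketch in Rust. It is not code from this commit: `has_ascii_space`, `has_any_whitespace`, `quote_positional`, and `quote_named` are hypothetical helpers that only mirror the two conditions in the hunks above. The unquoted `key::value` fallback in `quote_named` is an assumption, since that branch is not visible in the diff.

```rust
/// Hypothetical helper: the pre-fix check, which only sees U+0020 (ASCII space).
fn has_ascii_space( s : &str ) -> bool
{
  s.contains( ' ' )
}

/// Hypothetical helper: the post-fix check, which matches any character with
/// the Unicode White_Space property (tab, newline, NBSP, em space, ...).
fn has_any_whitespace( s : &str ) -> bool
{
  s.chars().any( char::is_whitespace )
}

/// Hypothetical helper mirroring the positional-argument branch of the diff:
/// wrap the argument in quotes when it contains whitespace.
fn quote_positional( arg : &str ) -> String
{
  if has_any_whitespace( arg )
  {
    format!( "\"{arg}\"" )
  }
  else
  {
    arg.to_string()
  }
}

/// Hypothetical helper mirroring the named-argument branch of the diff.
/// ASSUMPTION: the unquoted fallback `key::value` is not shown in the diff.
fn quote_named( key : &str, value : &str ) -> String
{
  if has_any_whitespace( value ) || value.is_empty()
  {
    format!( "{key}::\"{value}\"" )
  }
  else
  {
    format!( "{key}::{value}" )
  }
}
```

With these helpers, a value such as `"a\tb"` fails the old check (`has_ascii_space` returns `false`) but passes the new one, so it gets quoted and stays a single token. The same holds for `"a\u{00A0}b"` (NBSP) and `"a\u{2003}b"` (em space), which are the cases the commit message lists for its reproducer tests.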
