Commit afdc243

http: declare unsigned-char alphabet in ragel parsers

ragel 6.10 -G2 generates incorrect code for the obs_text = 0x80..0xff range when the build host's default `char` is unsigned (notably aarch64). The state machine collapses byte 127 (DEL) and the legitimately allowed obs_text range into a single forbidden interval (`127 <= *p -> goto error`), so any HTTP field value containing bytes 0x80..0xff is rejected even though RFC 7230 §3.2.6 permits them.

x86_64 happens to produce correct code via a different range-merging path. The .rl source is identical; only the host's default `char` signedness changes ragel's codegen.

The fix uses two ragel directives per machine:

    alphtype unsigned char;          # picks the correct codegen path
    getkey ((unsigned char)(*p));    # forces fetched bytes unsigned

`alphtype unsigned char;` is the load-bearing fix: it makes ragel treat the alphabet as 0..255 regardless of host, so it picks the correct emit, which lists `case 127: goto st0` plus a 0..31 ctrl-range check and lets 0x80..0xff fall through to the accept branch.

`getkey ((unsigned char)(*p))` is required because ragel now emits character literals with the `u` suffix (e.g. `case 13u:`). On x86 the input pointer's `*p` is a signed char, and comparing a signed char to an unsigned literal triggers -Wsign-compare, which seastar promotes to an error. The getkey directive casts the fetched byte to unsigned so that both sides of each comparison are unsigned.

Affects seastar's HTTP server (http_request_parser via httpd.hh) and HTTP client (http_response_parser via client.cc) on arm64; the chunked-transfer trailer parser (http_chunk_trailer_parser) gets the same fix.

Unblocks three existing tests that fail on aarch64 once the obs_text input is exercised:

- tests/unit/chunk_parsers_test.cc :: test_trailer_headers_parsing
- tests/unit/request_parser_test.cc :: test_header_parsing
- tests/unit/httpd_test.cc :: test_full_chunk_format

Fixes scylladb#3413.
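
To make the two codegen shapes from the message concrete, here is a minimal C++ sketch. It is not ragel output and not seastar code: `accepts_intended`, `accepts_collapsed`, and `count_accepted` are hypothetical names, and the byte classes follow only what the message states (reject 0..31 and 127, accept 0x80..0xff). Any special-casing the real grammar applies to individual control bytes such as HTAB is left out.

```cpp
#include <cassert>
#include <cstdio>

// Hypothetical helpers for illustration only; not generated by ragel.

// Intended shape: reject the 0..31 control range and DEL (127),
// let everything else through, including obs_text (0x80..0xff).
constexpr bool accepts_intended(unsigned char c) {
    if (c == 127) return false;  // DEL
    if (c <= 31) return false;   // control range
    return true;                 // printable ASCII and obs_text
}

// Faulty shape: DEL and obs_text merged into one forbidden interval,
// matching the `127 <= *p -> goto error` check quoted in the message.
constexpr bool accepts_collapsed(unsigned char c) {
    if (c <= 31) return false;
    if (127 <= c) return false;  // also rejects 0x80..0xff
    return true;
}

// 0xE9 is an arbitrary obs_text byte chosen for the example.
static_assert(accepts_intended(0xE9), "obs_text must be accepted");
static_assert(!accepts_collapsed(0xE9), "collapsed interval rejects obs_text");
static_assert(!accepts_intended(127) && !accepts_collapsed(127), "DEL is rejected by both");

// Counts how many of the 256 possible byte values a predicate accepts.
inline int count_accepted(bool (*accepts)(unsigned char)) {
    int n = 0;
    for (int b = 0; b < 256; ++b) {
        if (accepts(static_cast<unsigned char>(b))) ++n;
    }
    return n;
}

// Prints the accepted-byte count for each shape.
inline void print_accept_counts() {
    std::printf("intended=%d collapsed=%d\n",
                count_accepted(accepts_intended),
                count_accepted(accepts_collapsed));
}
```

Calling `print_accept_counts()` from a `main` shows the gap between the two shapes: the collapsed interval drops all 128 obs_text bytes on top of DEL.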
1 parent 510f314 commit afdc243

3 files changed

Lines changed: 12 additions & 0 deletions

src/http/chunk_parsers.rl

Lines changed: 6 additions & 0 deletions
@@ -32,6 +32,9 @@ namespace seastar {
 %%{
 
 access _fsm_;
+# alphtype + getkey: workaround for #3413
+alphtype unsigned char;
+getkey ((unsigned char)(*p));
 
 action mark {
 g.mark_start(p);
@@ -152,6 +155,9 @@ public:
 %%{
 
 access _fsm_;
+# alphtype + getkey: workaround for #3413
+alphtype unsigned char;
+getkey ((unsigned char)(*p));
 
 action mark {
 g.mark_start(p);

src/http/request_parser.rl

Lines changed: 3 additions & 0 deletions
@@ -33,6 +33,9 @@ namespace seastar {
 %%{
 
 access _fsm_;
+# alphtype + getkey: workaround for #3413
+alphtype unsigned char;
+getkey ((unsigned char)(*p));
 
 action mark {
 g.mark_start(p);

src/http/response_parser.rl

Lines changed: 3 additions & 0 deletions
@@ -35,6 +35,9 @@ namespace seastar {
 %%{
 
 access _fsm_;
+# alphtype + getkey: workaround for #3413
+alphtype unsigned char;
+getkey ((unsigned char)(*p));
 
 action mark {
 g.mark_start(p);
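
The `getkey ((unsigned char)(*p))` line added to each machine affects more than the warning. The sketch below is a hypothetical illustration, not ragel output, of what a comparison against a `u`-suffixed literal does when the fetched byte is a signed char. The function names are invented, and the implicit conversion is spelled out with explicit casts so that the sketch itself compiles without tripping -Wsign-compare.

```cpp
#include <cassert>

// Hypothetical illustration only; names are not from seastar or ragel.

// Without the getkey cast: in `byte == 233u` the signed char is first
// promoted to int (negative for bytes >= 0x80) and then converted to
// unsigned int. The casts below spell out that implicit conversion;
// the implicit form is what -Wsign-compare flags.
inline bool matches_without_cast(signed char byte) {
    return static_cast<unsigned int>(static_cast<int>(byte)) == 233u;
}

// With the getkey cast: the byte is narrowed to unsigned char first,
// so it compares as a value in 0..255.
inline bool matches_with_cast(signed char byte) {
    return static_cast<unsigned char>(byte) == 233u;
}

// Returns true when both forms agree for the given byte.
inline bool forms_agree(signed char byte) {
    return matches_without_cast(byte) == matches_with_cast(byte);
}
```

For the byte 0xE9, which is -23 when held in a signed char, `matches_with_cast` sees 233 while `matches_without_cast` sees a large unsigned value, so the two forms disagree. For ASCII bytes below 0x80 they always agree.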
