API ergonomics: intervention signal + bulk request-header ABI

While building a Python SDK on top of libcoraza (Flask / FastAPI / Starlette middleware) I ran into three API shapes that cost real cycles or correctness at runtime and tripped me up during development. They're unrelated enough to fix independently but landing them together is cheap and gives downstream bindings a much cleaner shape.

Profile (Ryzen 7 7800X3D, CRS v4.11, 550 rules, benign `GET /search?q=hello+world`, Python 3.13, cffi):

| Phase | Cost |
|---|---:|
| `coraza_process_request_body` | 224 µs |
| `coraza_process_request_headers` | 124 µs |
| `coraza_process_logging` + `coraza_intervention` polling | 17 µs |
| Other libcoraza calls (connection, uri, 5× `add_request_header`, new/free tx) | ~4 µs |
| Python wrapper layer | 39 µs |
| **Total** | **408 µs/txn** |

None of the three items below is on the critical path, but all three are friction points a binding author hits immediately.

---

## 1. Interruption signaling is out-of-band

Each phase function returns `0` on success and `0` on disruption. The only way to detect that Coraza decided to block is to poll `coraza_intervention()` after every interruptible phase (1, 2, 3, 4). That:

- **Confused every binding author I know.** When I first wrote the Python wrapper I assumed `rc == 1` meant interrupted (the way coraza-proxy-wasm signals). Integration tests passed against benign traffic but silently failed to block attacks, because the middleware never detected the disruption. This is a bypass-shaped bug in every downstream binding that follows the same wrong assumption.
- **Forces a round-trip per phase.** `coraza_intervention()` is a cgo boundary cross + a Go lookup + an alloc of the returned struct (plus a mandatory `coraza_free_intervention`). Each call is cheap (~5 µs) but we pay it on every phase on every request, including the 99% of requests that aren't disrupted.

### Proposed fix — return 1 on disruption

`ProcessRequestHeaders`, `ProcessRequestBody`, `ProcessResponseHeaders`, `ProcessResponseBody` already have a thread-local disruption flag internally (`tx.interrupted`). Plumb it to the return code:

```go
//export coraza_process_request_headers
func coraza_process_request_headers(t C.coraza_transaction_t) C.int {
    tx := fromRaw[types.Transaction](t)
    tx.ProcessRequestHeaders()
    if tx.IsInterrupted() {
        return 1
    }
    return 0
}
```

Happy path: zero behavior change, same alloc profile.
Disrupted path: caller learns at phase-return time, calls `coraza_intervention()` *once*, and can skip subsequent phases.

This is a breaking ABI change for return codes. Easiest rollout:

- Add `coraza_process_request_headers_v2` (returning 0/1 properly), deprecate the old one.
- OR bump the major, document \"return codes: 0 = ok, 1 = interrupted, -1 = error\" like most C WAF wrappers do.

Either way, the cgo return value becomes meaningful and downstream bindings stop needing their per-phase polling loop.

---

## 2. Bulk request-header ABI

`coraza_add_request_header` is one call per header. For a typical request with 5 headers, that's 5 cgo boundary crossings + 5 `C.GoString` allocations. Each boundary is ~300 ns cgo overhead + a Go-heap string alloc of (name+value) bytes.

Relative cost in my profile is low (~2 µs / 1.5% of txn time) but the pattern forces binding authors to loop in Python/Ruby/whatever and pay N times the round-trip. A bulk API would let the binding send one packed buffer.

### Proposed signature

```c
int coraza_add_request_headers(
    coraza_transaction_t t,
    const char *packed,
    size_t packed_len,
    int count
);
```

Encoding (one byte-buffer, single alloc on the caller side):

```
[name_len u16][name_bytes][value_len u32][value_bytes]   × count
```

`u16` for name (HTTP disallows >64KB header names; 255 would be tight) and `u32` for value because some auth headers get huge.

Go side parses the buffer in one goroutine, calls the existing `tx.AddRequestHeader(name, value)` N times, returns 0 on success.

Additive only. No breaking change. Minor bump.

Same idea would apply to `coraza_add_response_headers` and `coraza_add_get_args` but #1 is the one that matters for hot-path workloads.

---

## 3. `coraza_matched_rule_get_error_log` has no paired `coraza_free_string`

`coraza_matched_rule_get_error_log` returns a `char *` with a comment stating that the caller is responsible for freeing it. No `coraza_free_string` is exported. Callers are expected to call libc `free()` on the returned pointer.

Problems this causes for binding authors:

- **Allocator mismatch on Windows.** The Go runtime builds with the MinGW CRT when compiled as a c-shared library, while Python on Windows is built against the MSVC CRT. `free()` from MSVC on a pointer allocated by `C.CString` in MinGW's Go runtime is undefined behavior and crashes in release builds. Same trap exists for Rust bindings built with MSVC toolchain. This isn't theoretical — it's the single most common trap with Go c-shared DLLs on Windows.
- **Harder to bind correctly in safe languages.** Rust's `CString::from_raw` wants a pointer that was allocated by Rust, not a Go-side `C.CString`. Same for managed languages that wrap the pointer in a `SafeHandle`-style finalizer — they need a known `free` function to call, not libc's.
- **No way to interpose.** If an allocator is different (e.g. jemalloc preloaded, tcmalloc), the mismatch on free becomes a corrupted heap.

A paired free function also makes the ownership contract obvious at the symbol level — you see `coraza_matched_rule_get_error_log` in the header and immediately know to look for `coraza_free_*` next to it.

### Proposed fix

Add a symmetric `coraza_free_string(char *s)` that calls Go's `C.free(unsafe.Pointer(s))` (which in turn routes through Go's cgo-allocated memory correctly regardless of host CRT):

```go
//export coraza_free_string
func coraza_free_string(s *C.char) {
    C.free(unsafe.Pointer(s))
}
```

Document `coraza_matched_rule_get_error_log` explicitly: \"returned pointer must be released with `coraza_free_string`; do not call libc free directly.\"

Additive only. No breaking change. Minor bump.

Also consider: if there are other string-returning APIs we don't have today but might add (rule metadata, tag enumeration, debug dumps), having `coraza_free_string` on the table early saves a future compatibility break.

---

## Why this matters for downstream bindings

coraza-node (WASM) hides most of this at the TS layer with a custom ABI. For C-linked bindings (Python/Ruby/Java/Rust via SWIG or FFI), we're on the public C surface, and every language ecosystem hits the same sharp edges above. #1 is correctness (bypass-shaped), #2 is ergonomics (~1% perf), #3 is portability (Windows + safe-language bindings).

I'm happy to prototype the three changes against the current `main` and send a PR if you'd like. Flag your preference for migration path on #1 (versioned symbol vs ABI major bump) and I'll match it. #2 and #3 are purely additive.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

API ergonomics: intervention signal + bulk request-header ABI #93

1. Interruption signaling is out-of-band

Proposed fix — return 1 on disruption

2. Bulk request-header ABI

Proposed signature

3. `coraza_matched_rule_get_error_log` has no paired `coraza_free_string`

Proposed fix

Why this matters for downstream bindings

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Phase	Cost
`coraza_process_request_body`	224 µs
`coraza_process_request_headers`	124 µs
`coraza_process_logging` + `coraza_intervention` polling	17 µs
Other libcoraza calls (connection, uri, 5× `add_request_header`, new/free tx)	~4 µs
Python wrapper layer	39 µs
Total	408 µs/txn

API ergonomics: intervention signal + bulk request-header ABI #93

Description

1. Interruption signaling is out-of-band

Proposed fix — return 1 on disruption

2. Bulk request-header ABI

Proposed signature

3. coraza_matched_rule_get_error_log has no paired coraza_free_string

Proposed fix

Why this matters for downstream bindings

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions

3. `coraza_matched_rule_get_error_log` has no paired `coraza_free_string`