Commit 4d94b2c

jasisz and claude committed
wasm-gc: specialize Int comparison against an i64 constant to a native compare
When a boxed `$AverInt` is compared against an Int literal that fits i64 (a bound check `n <= 100`, a wall test `head.x < 1`), the comparison now lowers to a tag-branch instead of allocating the constant as an `$AverInt` and calling the general `__aint_cmp`:

- read `$magf`; if null (Small), load `$small` and do the native signed i64 compare against the constant;
- if non-null (Big), the result is fixed by `$sign` alone: a Big lies strictly outside `[i64::MIN, i64::MAX]`, so it is above every i64 constant when positive and below every i64 constant when negative; equality against an i64 constant is always false.

Both operand orders are handled by flipping the relation (`K < x` ≡ `x > K`).

The carrier invariant makes this sound: `__aint_normalize` / `__aint_from_i64` keep every value in `[i64::MIN, i64::MAX]` as a Small (normalization demotes exactly -2^63 to Small) and everything outside as a Big, so the Small branch is exact and the Big branch is sign-determined.

Fail-closed: a `$AverInt`-vs-`$AverInt` comparison (no literal operand) stays on `__aint_cmp` byte-for-byte. The non-literal operand is stashed in a dedicated `(ref null $AverInt)` scratch slot (reserved whenever bignum is active) so neither field read re-evaluates the operand expression.

An exhaustive VM-vs-wasm-gc differential is the soundness oracle: every one of the six comparisons, both operand orders, and constants {0, 1, -1, 100, -100, i64::MAX, i64::MIN+1} against Small-at-boundary, Small-far, Big-positive (`a*a*a`) and Big-negative (`0 - a*a*a`) operands must decode to the same boolean the VM produces.

The bound-check program whose only comparison is against a constant DCEs `__aint_cmp` + `__aint_decompose` under `--optimize size` (~330 B drop vs the non-literal-bound twin).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
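The tag-branch rule described in the commit message can be modelled in plain Rust, away from the wasm emitter. The sketch below is only an illustration under stated assumptions: the `AverInt` enum, `normalize`, `cmp_const`, `cmp_reference`, and `differential_agrees` are hypothetical names invented here, `i128` stands in for the arbitrary-precision value, and the real carrier is a wasm-gc struct with `$small`, `$magf`, and `$sign` fields rather than a Rust enum.

```rust
/// Hypothetical stand-in for the boxed `$AverInt` carrier (not the real layout).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum AverInt {
    /// Value lies in [i64::MIN, i64::MAX] (models `$magf == null`).
    Small(i64),
    /// Value lies outside the i64 range; only its sign (+1 / -1) is modelled.
    Big { sign: i32 },
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Cmp {
    Eq,
    Neq,
    Lt,
    Gt,
    Lte,
    Gte,
}

const ALL_OPS: [Cmp; 6] = [Cmp::Eq, Cmp::Neq, Cmp::Lt, Cmp::Gt, Cmp::Lte, Cmp::Gte];

/// Classify a wide integer the way the carrier invariant describes:
/// anything that fits i64 (including i64::MIN) is Small, the rest is Big.
fn normalize(v: i128) -> AverInt {
    match i64::try_from(v) {
        Ok(small) => AverInt::Small(small),
        Err(_) => AverInt::Big { sign: if v < 0 { -1 } else { 1 } },
    }
}

/// Decide `x OP k` with the tag-branch: native compare for Small, sign for Big.
fn cmp_const(x: AverInt, op: Cmp, k: i64) -> bool {
    match (x, op) {
        (AverInt::Small(s), Cmp::Eq) => s == k,
        (AverInt::Small(s), Cmp::Neq) => s != k,
        (AverInt::Small(s), Cmp::Lt) => s < k,
        (AverInt::Small(s), Cmp::Gt) => s > k,
        (AverInt::Small(s), Cmp::Lte) => s <= k,
        (AverInt::Small(s), Cmp::Gte) => s >= k,
        // A Big-negative is below every i64 constant.
        (AverInt::Big { sign }, Cmp::Lt | Cmp::Lte) => sign < 0,
        // A Big-positive is above every i64 constant.
        (AverInt::Big { sign }, Cmp::Gt | Cmp::Gte) => sign > 0,
        // A Big never equals an i64 constant.
        (AverInt::Big { .. }, Cmp::Eq) => false,
        (AverInt::Big { .. }, Cmp::Neq) => true,
    }
}

/// Reference semantics: compare the exact values in i128.
fn cmp_reference(v: i128, op: Cmp, k: i64) -> bool {
    let k = i128::from(k);
    match op {
        Cmp::Eq => v == k,
        Cmp::Neq => v != k,
        Cmp::Lt => v < k,
        Cmp::Gt => v > k,
        Cmp::Lte => v <= k,
        Cmp::Gte => v >= k,
    }
}

/// Small-scale differential over the constants named in the commit message.
fn differential_agrees() -> bool {
    let a: i128 = 3_000_000_000; // a*a*a is about 2.7e28, far beyond i64::MAX
    let ks = [0, 1, -1, 100, -100, i64::MAX, i64::MIN + 1];
    let xs = [
        i128::from(i64::MIN),
        i128::from(i64::MAX),
        i128::from(i64::MAX) + 1,
        i128::from(i64::MIN) - 1,
        0,
        99,
        100,
        101,
        a * a * a,
        0 - a * a * a,
    ];
    ALL_OPS.iter().all(|&op| {
        ks.iter().all(|&k| {
            xs.iter()
                .all(|&x| cmp_const(normalize(x), op, k) == cmp_reference(x, op, k))
        })
    })
}
```

Calling `differential_agrees()` walks every operator, constant, and operand in the sample sets and reports whether the tag-branch and the exact comparison ever disagree. It mirrors the shape of the VM-vs-wasm-gc oracle but is not a substitute for it.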
1 parent 1eadafa commit 4d94b2c

3 files changed

Lines changed: 598 additions & 1 deletion

src/codegen/wasm_gc/body/from_mir/builtins.rs

Lines changed: 187 additions & 0 deletions
@@ -1806,6 +1806,50 @@ pub(crate) fn emit_mir_numeric_binop(
     } else {
         l_ty
     };
+    // bignum const-compare specialization — an `$AverInt` compared
+    // against an i64-fitting Int LITERAL lowers to a tag-branch (Small
+    // → native i64 compare; Big → sign-determined) instead of
+    // allocating the constant as an `$AverInt` and calling
+    // `__aint_cmp`. Fail-closed: only fires when both operands are the
+    // `$AverInt` ref type, exactly one is a literal that `fits_i64`,
+    // and the op is one of the six comparisons. A `$AverInt`-vs-
+    // `$AverInt` comparison (`p.x == q.x`) is NOT a literal on either
+    // side, so it stays on `__aint_cmp` byte-for-byte.
+    if ctx.registry.bignum
+        && ctx.registry.aint_struct_idx.is_some()
+        && operand != Some(ValType::F64)
+        && operand != Some(ValType::I32)
+        && l_ty
+            == ctx
+                .registry
+                .aint_struct_idx
+                .map(crate::codegen::wasm_gc::types::struct_ref)
+        && r_ty
+            == ctx
+                .registry
+                .aint_struct_idx
+                .map(crate::codegen::wasm_gc::types::struct_ref)
+        && matches!(
+            bop.op,
+            BinOp::Eq | BinOp::Neq | BinOp::Lt | BinOp::Gt | BinOp::Lte | BinOp::Gte
+        )
+    {
+        // `const_on_left` records which side the constant sat on so
+        // the tag-branch can flip the comparison (`K < x` ≡ `x > K`).
+        let lit = mir_int_literal(l)
+            .map(|k| (k, true))
+            .or_else(|| mir_int_literal(r).map(|k| (k, false)));
+        if let Some((k, const_on_left)) = lit {
+            // The non-literal operand (the `$AverInt` ref) goes on the
+            // stack; the literal is folded into the branch as an
+            // `i64.const`.
+            let non_lit = if const_on_left { r } else { l };
+            if emit_mir_expr(func, non_lit, slots, ctx)?.is_none() {
+                return Ok(None);
+            }
+            return emit_aint_cmp_const(func, bop.op, k, const_on_left, slots, ctx);
+        }
+    }
     if emit_mir_expr(func, l, slots, ctx)?.is_none() {
         return Ok(None);
     }
@@ -1934,3 +1978,146 @@ fn emit_aint_binop(
     }
     Ok(Some(()))
 }
+
+/// If `expr` is a literal `Int`, return its `i64` value. Used by the
+/// const-compare specialization to peel the constant operand of an
+/// `$AverInt`-vs-constant comparison. `Literal::Int` is an `i64` in the
+/// AST, so the value is always `fits_i64` by construction — there is no
+/// out-of-range Int literal to reject. (A source literal that exceeds
+/// `i64` is rejected at lex/parse time, before MIR.)
+fn mir_int_literal(expr: &Spanned<MirExpr>) -> Option<i64> {
+    match &expr.node {
+        MirExpr::Literal(l) => match l.node {
+            crate::ast::Literal::Int(n) => Some(n),
+            _ => None,
+        },
+        _ => None,
+    }
+}
+
+/// bignum const-compare specialization — the non-literal `$AverInt`
+/// operand is already on the stack; emit a tag-branch comparison
+/// against the i64 constant `k` and leave an i32 bool, replacing the
+/// general `__aint_cmp` call.
+///
+/// Soundness rests on the carrier invariant (`__aint_normalize` /
+/// `__aint_from_i64`): a value is **Small** (`$magf == null`) iff it is
+/// in `[i64::MIN, i64::MAX]` — the FULL i64 range, including `i64::MIN`
+/// which `normalize` deliberately demotes to a Small. A value is
+/// **Big** (`$magf != null`) iff `|value| > i64::MAX`, i.e. strictly
+/// outside the i64 range, with `$sign ∈ {-1, +1}`. Therefore:
+/// - Small → load `$small` and do the NATIVE signed i64 compare
+///   against `k`.
+/// - Big → the relation against any i64 constant is fixed by `$sign`
+///   alone: a Big-positive is `> k` for every i64 `k`, a Big-negative
+///   is `< k`; and a Big NEVER equals an i64 constant.
+///
+/// `const_on_left` flips the comparison so the `$AverInt` is always the
+/// left operand of the EFFECTIVE relation (`k < x` ≡ `x > k`).
+fn emit_aint_cmp_const(
+    func: &mut Function,
+    op: BinOp,
+    k: i64,
+    const_on_left: bool,
+    slots: &SlotTable,
+    ctx: &EmitCtx<'_>,
+) -> Result<Option<()>, WasmGcError> {
+    let aint_idx = ctx.registry.aint_struct_idx.ok_or(WasmGcError::Validation(
+        "const-compare specialization requires the $AverInt struct slot".into(),
+    ))?;
+    let scratch = slots.const_cmp_scratch.ok_or(WasmGcError::Validation(
+        "const-compare specialization needs a scratch slot but none was reserved".into(),
+    ))?;
+
+    // Normalize so the `$AverInt` is the LEFT operand of the effective
+    // relation. The source spelled `const OP aint` when `const_on_left`,
+    // which is equivalent to `aint OP_flipped const`.
+    let eff = if const_on_left { flip_cmp(op) } else { op };
+
+    // Stash the operand so both field reads (`$magf`, then `$small` /
+    // `$sign`) read the same value without re-emitting the operand
+    // expression.
+    func.instruction(&Instruction::LocalSet(scratch));
+
+    let block_ty = wasm_encoder::BlockType::Result(ValType::I32);
+    // if ($magf == null) → Small
+    func.instruction(&Instruction::LocalGet(scratch));
+    func.instruction(&Instruction::StructGet {
+        struct_type_index: aint_idx,
+        field_index: 1,
+    });
+    func.instruction(&Instruction::RefIsNull);
+    func.instruction(&Instruction::If(block_ty));
+
+    // ── Small: native i64 compare of `$small` against `k` ──
+    func.instruction(&Instruction::LocalGet(scratch));
+    func.instruction(&Instruction::StructGet {
+        struct_type_index: aint_idx,
+        field_index: 0,
+    });
+    func.instruction(&Instruction::I64Const(k));
+    func.instruction(&match eff {
+        BinOp::Eq => Instruction::I64Eq,
+        BinOp::Neq => Instruction::I64Ne,
+        BinOp::Lt => Instruction::I64LtS,
+        BinOp::Gt => Instruction::I64GtS,
+        BinOp::Lte => Instruction::I64LeS,
+        BinOp::Gte => Instruction::I64GeS,
+        _ => unreachable!("emit_aint_cmp_const gated to the six comparisons"),
+    });
+
+    func.instruction(&Instruction::Else);
+
+    // ── Big: the result is determined by `$sign` alone ──
+    // A Big-positive (`$sign > 0`) exceeds every i64 `k`; a Big-negative
+    // (`$sign < 0`) is below every i64 `k`; a Big never equals `k`.
+    match eff {
+        // x < k ⟺ x is Big-negative ⟺ $sign < 0
+        // x <= k ⟺ same (Big never == k)
+        BinOp::Lt | BinOp::Lte => {
+            func.instruction(&Instruction::LocalGet(scratch));
+            func.instruction(&Instruction::StructGet {
+                struct_type_index: aint_idx,
+                field_index: 2,
+            });
+            func.instruction(&Instruction::I32Const(0));
+            func.instruction(&Instruction::I32LtS);
+        }
+        // x > k ⟺ x is Big-positive ⟺ $sign > 0
+        // x >= k ⟺ same (Big never == k)
+        BinOp::Gt | BinOp::Gte => {
+            func.instruction(&Instruction::LocalGet(scratch));
+            func.instruction(&Instruction::StructGet {
+                struct_type_index: aint_idx,
+                field_index: 2,
+            });
+            func.instruction(&Instruction::I32Const(0));
+            func.instruction(&Instruction::I32GtS);
+        }
+        // A Big never equals an i64 constant.
+        BinOp::Eq => {
+            func.instruction(&Instruction::I32Const(0));
+        }
+        BinOp::Neq => {
+            func.instruction(&Instruction::I32Const(1));
+        }
+        _ => unreachable!("emit_aint_cmp_const gated to the six comparisons"),
+    }
+
+    func.instruction(&Instruction::End);
+    Ok(Some(()))
+}
+
+/// Flip a comparison operator across its operands: `k OP x` ≡
+/// `x flip(OP) k`. Equality / inequality are symmetric and unchanged.
+fn flip_cmp(op: BinOp) -> BinOp {
+    match op {
+        BinOp::Lt => BinOp::Gt,
+        BinOp::Gt => BinOp::Lt,
+        BinOp::Lte => BinOp::Gte,
+        BinOp::Gte => BinOp::Lte,
+        BinOp::Eq => BinOp::Eq,
+        BinOp::Neq => BinOp::Neq,
+        other => other,
+    }
+}
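The operand-order handling in `flip_cmp` relies on the identity `k OP x` ≡ `x flip(OP) k`. As a rough, standalone check of that identity, the sketch below uses a hypothetical `Rel` enum in place of the compiler's `BinOp`, plus helper names (`flip`, `holds`, `flip_preserves_meaning`) invented for this illustration. It makes no claim about how the repository tests this.

```rust
/// Hypothetical mirror of the six comparison operators (stands in for `BinOp`).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Rel {
    Eq,
    Neq,
    Lt,
    Gt,
    Lte,
    Gte,
}

const RELS: [Rel; 6] = [Rel::Eq, Rel::Neq, Rel::Lt, Rel::Gt, Rel::Lte, Rel::Gte];

/// Swap the operands of a relation: `k OP x` becomes `x flip(OP) k`.
/// This is a mirror, not a negation: `<` maps to `>`, never to `>=`.
fn flip(op: Rel) -> Rel {
    match op {
        Rel::Lt => Rel::Gt,
        Rel::Gt => Rel::Lt,
        Rel::Lte => Rel::Gte,
        Rel::Gte => Rel::Lte,
        Rel::Eq | Rel::Neq => op,
    }
}

/// Evaluate `a OP b` on plain i64 values.
fn holds(a: i64, op: Rel, b: i64) -> bool {
    match op {
        Rel::Eq => a == b,
        Rel::Neq => a != b,
        Rel::Lt => a < b,
        Rel::Gt => a > b,
        Rel::Lte => a <= b,
        Rel::Gte => a >= b,
    }
}

/// Check `k OP x == x flip(OP) k` over a small grid that includes both
/// i64 extremes and equal operands.
fn flip_preserves_meaning() -> bool {
    let samples = [i64::MIN, -100, -1, 0, 1, 100, i64::MAX];
    RELS.iter().all(|&op| {
        samples.iter().all(|&k| {
            samples
                .iter()
                .all(|&x| holds(k, op, x) == holds(x, flip(op), k))
        })
    })
}
```

The grid includes equal operands on purpose: a mistaken "flip" that negates the relation (mapping `<` to `>=`) agrees with the correct one whenever the operands differ and only shows up when `k == x`.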

src/codegen/wasm_gc/body/slots.rs

Lines changed: 31 additions & 1 deletion
@@ -18,7 +18,7 @@ use crate::ir::hir::{
 use crate::types::Type;
 
 use super::super::WasmGcError;
-use super::super::types::{TypeRegistry, aver_to_wasm};
+use super::super::types::{TypeRegistry, aver_to_wasm, struct_ref};
 use super::FnMap;
 use super::infer::{
     arm_is_option_pattern_resolved, arm_is_result_pattern_resolved, aver_type_str_of,
@@ -93,6 +93,18 @@ pub(super) struct SlotTable {
     /// `Random.int(readBound(), 10)`). Stash `min` once after first
     /// emit and reuse via LocalGet.
     pub(super) random_int_wasip2_min_scratch: Option<u32>,
+    /// Scratch `(ref null $AverInt)` slot for the const-compare
+    /// specialization (`$AverInt` compared against an i64-fitting
+    /// literal). The non-literal operand is stashed here once so the
+    /// tag-branch can read its `$magf` (Small/Big discriminant) and
+    /// then its `$small`/`$sign` field without re-emitting the operand
+    /// expression — which would double-run any side effect in it.
+    /// Reserved whenever `bignum` is active (one unused ref local when
+    /// no const-comparison is present — cheaper than walking the MIR
+    /// body here, which would couple slot allocation to the body walk
+    /// and break the differential-gate invariant that `build_for_fn`
+    /// reads only the resolver tables, not the MIR body).
+    pub(super) const_cmp_scratch: Option<u32>,
 }
 
 impl SlotTable {
@@ -298,6 +310,23 @@ impl SlotTable {
         } else {
             None
         };
+        // Const-compare specialization scratch: a single `(ref null
+        // $AverInt)` slot to stash the non-literal operand of an
+        // `$AverInt`-vs-i64-constant comparison. Reserved whenever
+        // bignum is active so the slot index is a pure function of the
+        // registry flag (not the MIR body), preserving the invariant
+        // that `build_for_fn` keys only off the resolver tables.
+        let const_cmp_scratch = if registry.bignum {
+            if let Some(aint_idx) = registry.aint_struct_idx {
+                let idx = by_slot.len() as u32;
+                by_slot.push(struct_ref(aint_idx));
+                Some(idx)
+            } else {
+                None
+            }
+        } else {
+            None
+        };
         Ok(Self {
             by_slot,
             subject_scratch,
307336
args_get_wasip2_retptr_scratch,
308337
env_get_wasip2_scratch,
309338
random_int_wasip2_min_scratch,
339+
const_cmp_scratch,
310340
})
311341
}
312342

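The slot-reservation rule in `slots.rs` (append one scratch local whenever bignum is active, independent of the function body) can be illustrated with a trimmed-down model. Everything in this sketch is hypothetical: `Registry`, `Local`, and `reserve_const_cmp_scratch` are simplified stand-ins invented here, not the repository's `TypeRegistry`, `ValType`, or `SlotTable::build_for_fn`.

```rust
/// Hypothetical, trimmed-down registry: only the two fields the rule reads.
struct Registry {
    bignum: bool,
    aint_struct_idx: Option<u32>,
}

/// Hypothetical local-variable type for the model.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Local {
    I64,
    /// Models `(ref null $T)` where the payload is the struct type index.
    RefNull(u32),
}

/// Append the scratch local iff bignum is on and the `$AverInt` struct exists.
/// The returned index depends only on the registry and on how many locals
/// were already reserved, never on the function body.
fn reserve_const_cmp_scratch(registry: &Registry, by_slot: &mut Vec<Local>) -> Option<u32> {
    if !registry.bignum {
        return None;
    }
    let aint_idx = registry.aint_struct_idx?;
    let idx = by_slot.len() as u32;
    by_slot.push(Local::RefNull(aint_idx));
    Some(idx)
}
```

In this model a function with two existing locals gets slot index 2 for the scratch when bignum is active, and no extra local otherwise. The cost of this approach, as the doc comment in the diff notes, is one unused ref local in functions that contain no constant comparison.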