Description
Nodes and symbols currently store an atomic integer representing their ID, which is lazily assigned upon first use. I was playing with using the pointer addresses as their ID, reducing the size of those structures and avoiding atomic reads on access.
func GetNodeId(node *Node) NodeId {
	return NodeId(uint64(uintptr(unsafe.Pointer(node))))
}

func GetSymbolId(symbol *Symbol) SymbolId {
	return SymbolId(uint64(uintptr(unsafe.Pointer(symbol))))
}
A potential downside is that KeyBuilder will observe larger numbers, requiring additional bytes to represent various keys. It may also affect ordering as IDs will no longer be strictly increasing, but I believe this shouldn't matter given that ID assignment is already non-deterministic.
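To illustrate the key-size concern, here is a minimal sketch. It assumes a variable-length integer encoding (Go's standard uvarint); how KeyBuilder actually encodes IDs is not shown in this thread, and the address-sized value below is purely illustrative.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// uvarintLen returns how many bytes v occupies in Go's standard uvarint
// encoding. This encoding is an assumption for illustration only; it is not
// claimed to be what KeyBuilder uses.
func uvarintLen(v uint64) int {
	var buf [binary.MaxVarintLen64]byte
	return binary.PutUvarint(buf[:], v)
}

func main() {
	sequentialID := uint64(42)            // a small counter-style ID
	addressLikeID := uint64(0xc000123456) // an illustrative address-sized value
	fmt.Println(uvarintLen(sequentialID), uvarintLen(addressLikeID))
}
```

Under that assumption the small ID takes one byte and the address-sized value takes six, so every key that embeds an ID would grow accordingly.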
A quick attempt passes the hereby test and shows a slight reduction in overall memory usage, but I don't have good insight into the actual performance benefits/downsides of this approach.
I am opening this as an issue instead of a PR to discuss, and possibly to learn that this has already been considered and rejected for some reason. I'd be happy to open a PR if desired.
Activity
jakebailey commented on Mar 17, 2025
I don't think we want to do this; we currently have no unsafe code (avoiding a compliance problem). The benefit would have to be pretty significant to want to do this.
To test, I'd do:
Then post the results. Hoping to add some checker benchmarks soon.
With 64-bit IDs, I think it'd be a better idea to just unconditionally construct them with an ID and then never change it. It won't be lazy anymore, but we're not going to "run out".
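A minimal sketch of what eager assignment could look like. `NewNode`, `nextNodeId`, and the one-field `Node` struct are hypothetical stand-ins, not the project's actual code; only `NodeId` and `GetNodeId` are names taken from the issue.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// NodeId is the ID type named in the issue.
type NodeId uint64

// nextNodeId is a hypothetical process-wide counter.
var nextNodeId atomic.Uint64

// Node is a stand-in for the real node type.
type Node struct {
	id NodeId // written once in NewNode, never modified afterwards
}

// NewNode (hypothetical constructor) assigns the ID eagerly, paying one
// atomic add at construction time.
func NewNode() *Node {
	return &Node{id: NodeId(nextNodeId.Add(1))}
}

// GetNodeId is a plain field read: no atomics and no unsafe.
func GetNodeId(node *Node) NodeId {
	return node.id
}

func main() {
	a, b := NewNode(), NewNode()
	fmt.Println(GetNodeId(a) != GetNodeId(b)) // IDs are distinct
}
```

The plain read is only sound if the field is never written after construction and the node reaches other goroutines through normal synchronization.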
JoostK commented on Mar 17, 2025
I just read this:
Which makes this a no-go (pun intended) altogether.
jakebailey commented on Mar 17, 2025
Yes, they've been thinking about doing a moving GC, so it's definitely a dangerous operation.
JoostK commented on Mar 21, 2025
old.txt is a7e0eb4
address.txt is using pointer addresses as node and symbol IDs
eager.txt is eagerly assigning node and symbol IDs during construction (still an atomic add, but reads become non-atomic loads)
$ benchstat old.txt address.txt
$ benchstat old.txt eager.txt
$ benchstat eager.txt address.txt
The effects are mostly expected; eager slows down the binder because of the atomic operations, as its benefits will only become noticeable in the check phase. Exploiting pointer addresses does show a meaningful improvement in these tests, but this is fundamentally at odds with Go's runtime guarantees, so it may not be a desired way forward. With checker benchmarks this may start to show a more faithful picture of the overall impact.
jakebailey commented on Mar 28, 2025
I think we're going to want to do the eager version of this.
For the public API we're going to need IDs to fit within JS's Number.MAX_SAFE_INTEGER, which means we cannot use the pointer value itself as an ID (outside of it generally not being safe).
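As a rough sketch of that constraint: Number.MAX_SAFE_INTEGER is 2^53 - 1, so any ID exposed to JS has to stay at or below it. The helper name `fitsInJSNumber` is hypothetical and only serves to illustrate the bound.

```go
package main

import "fmt"

// maxSafeInteger is JavaScript's Number.MAX_SAFE_INTEGER, i.e. 2^53 - 1.
const maxSafeInteger uint64 = 1<<53 - 1

// fitsInJSNumber (hypothetical helper) reports whether id can be represented
// exactly as a JS number.
func fitsInJSNumber(id uint64) bool {
	return id <= maxSafeInteger
}

func main() {
	fmt.Println(fitsInJSNumber(1_000_000))        // a counter-style ID
	fmt.Println(fitsInJSNumber(maxSafeInteger))   // the largest safe value
	fmt.Println(fitsInJSNumber(maxSafeInteger+1)) // one past the bound
}
```

A counter that starts at 1 stays within this range for any realistic number of nodes, whereas a raw 64-bit value carries no such guarantee.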