Skip to content

Commit 450d0e7

Browse files
authored
fix: Support self-recursive types in Weaver.fromSurface (#515) (#517)
## Summary Fixes #515: `Weaver.fromSurface` threw `java.lang.IllegalStateException: Recursive update` from `ConcurrentHashMap.computeIfAbsent` when building a Weaver for a self-recursive type (e.g. `class T(children: List[T])`). The fix mirrors `airframe-codec`'s `LazyCodec` design (which is why the analogous wvlet code didn't hit this bug there) and adapts it for uni's globally-shared cache. ## What changed - **Drop `getOrElseUpdate` / nested `computeIfAbsent`.** Use direct `ConcurrentHashMap.get` + `putIfAbsent` so cache writes never nest (which is what Java's CHM blocks for recursive computations on the same key). - **Per-thread pending-build map.** Each top-level `fromSurface` accumulates its sub-builds in a thread-local `LinkedHashMap`. Entries are committed to the shared cache only after the outer-most call returns, so concurrent readers never observe a partial weaver tree. - **`LazyWeaver` placeholder with direct resolution.** When recursion re-enters a surface that's still on the build stack, a `LazyWeaver` is returned. Its target is wired in via `resolve(...)` *immediately* after that surface's `buildWeaver` returns — not via cache lookup at use time. So visibility to other threads is independent of commit order. - **Track recursion by `Surface.fullName`** (not the Surface instance). `LazySurface` (uni's compile-time recursive-surface placeholder) and the corresponding `GenericSurface` have different equality, so a Set-of-Surface guard would miss the cycle in the common `Node(next: Option[Node])` case. `WeaverFactory` and the public surface of `Weaver` are unchanged. ## Test plan - [x] New `RecursiveSurfaceWeaverTest` covers: self-recursive abstract class with `List[Self]` (the `DataType` shape from the issue), abstract field embedded in a concrete case class, self-recursive case class via `Option`, mutually recursive case classes, a round-trip through the `LazyWeaver` path, and a 64-task concurrent stress test on the same recursive type. - [x] `coreJVM/test` + `uniJVM/test` — 1595 tests pass (4 pre-existing pending), including all 17 existing `WeaverFromSurfaceTest` cases. - [x] `scalafmtAll` clean. - [x] `codex review --uncommitted` — issues identified and addressed across iterations. Closes #515. 🤖 Generated with [Claude Code](https://claude.com/claude-code)
1 parent 6efb6a3 commit 450d0e7

4 files changed

Lines changed: 338 additions & 7 deletions

File tree

Lines changed: 154 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,154 @@
1+
# Weaver: Support self-recursive types in `fromSurface`
2+
3+
Tracking: https://github.com/wvlet/uni/issues/515
4+
5+
## Problem
6+
7+
`Weaver.fromSurface` (added in #513) crashes on self-recursive types with:
8+
9+
```
10+
java.lang.IllegalStateException: Recursive update
11+
at java.util.concurrent.ConcurrentHashMap.computeIfAbsent
12+
```
13+
14+
Reproducer (wvlet's `DataType` is the canonical case):
15+
16+
```scala
17+
abstract class DataType(val typeName: TypeName, val typeParams: List[DataType])
18+
```
19+
20+
Building `Weaver[DataType]` walks its fields, hits `typeParams: List[DataType]`,
21+
and recursively calls `fromSurface(DataType)` while the outer
22+
`computeIfAbsent` for that key is still on the stack — which Java's
23+
`ConcurrentHashMap` explicitly disallows.
24+
25+
## Why airframe-codec doesn't hit this
26+
27+
`MessageCodecFactory.ofSurface` threads a `seen: Set[Surface]` and returns a
28+
`LazyCodec` placeholder when recursion sees the same surface twice. The
29+
placeholder's `ref` is `lazy val` and looks up the per-factory cache only
30+
when `pack`/`unpack` is called. Crucially that cache is a *plain `Map`
31+
private to the factory instance*, so it is single-threaded — no other
32+
thread ever observes a partial weaver tree.
33+
34+
uni's `Weaver.fromSurface` uses a single global cache, so adapting the
35+
airframe approach has to be more careful about cross-thread visibility.
36+
37+
## Final design
38+
39+
1. **Drop `getOrElseUpdate` / nested `computeIfAbsent`.** Use direct
40+
`ConcurrentHashMap.get` + `putIfAbsent` so cache writes never nest
41+
(which is what Java's CHM blocks for recursive computations on the
42+
same key).
43+
2. **Per-thread pending-build map.** Each top-level `fromSurface`
44+
accumulates its sub-builds in a `ThreadLocal[LinkedHashMap]`. Entries
45+
are committed to the shared cache only after the outer-most call
46+
returns — concurrent readers never observe a partial weaver tree.
47+
3. **`LazyWeaver` placeholder with direct resolution.** When recursion
48+
re-enters a surface still on the build stack, we return a
49+
`LazyWeaver`. After that surface's `buildWeaver` finishes, we wire the
50+
placeholder's target in directly via `resolve(...)`*not* via cache
51+
lookup at use time. So visibility to other threads is independent of
52+
commit order.
53+
4. **Track recursion by `Surface.fullName`** (not the Surface instance).
54+
`LazySurface` (uni's compile-time recursive-surface placeholder) and
55+
the corresponding `GenericSurface` have different equality, so a
56+
Set-of-Surface guard would miss the cycle in the common
57+
`Node(next: Option[Node])` case.
58+
59+
`WeaverFactory` and the rest of the public surface of `Weaver` are
60+
unchanged: factories don't need a `seen` parameter because recursion is
61+
handled inside `fromSurface` itself.
62+
63+
## Worked sketch
64+
65+
```scala
66+
// Each pending entry holds the placeholder that recursive callers see, plus
67+
// the final weaver once buildWeaver completes for that surface.
68+
private final class PendingEntry(val placeholder: LazyWeaver):
69+
var built: Weaver[?] = null
70+
71+
private val pendingBuild
72+
: ThreadLocal[mutable.LinkedHashMap[String, PendingEntry]] = ...
73+
74+
def fromSurface(surface: Surface): Weaver[?] =
75+
val key = surface.fullName
76+
val cached = surfaceWeaverCache.get(key)
77+
if cached != null then return cached
78+
79+
val pending = pendingBuild.get()
80+
pending.get(key) match
81+
case Some(entry) if entry.built != null => entry.built
82+
case Some(entry) => entry.placeholder // cycle
83+
case None =>
84+
val isOuterMost = pending.isEmpty
85+
val entry = PendingEntry(LazyWeaver())
86+
pending(key) = entry
87+
try
88+
val built = buildWeaver(surface)
89+
// Wire any captured placeholder *before* the weaver escapes the build.
90+
entry.placeholder.resolve(built)
91+
entry.built = built
92+
if isOuterMost then
93+
for (k, e) <- pending do surfaceWeaverCache.putIfAbsent(k, e.built)
94+
built
95+
finally
96+
if isOuterMost then pending.clear()
97+
98+
// LazyWeaver holds a directly-set ref. By the time the build returns, every
99+
// placeholder has been resolved — so consumers never see an unresolved one,
100+
// regardless of what order entries land in surfaceWeaverCache.
101+
private final class LazyWeaver extends Weaver[Any]:
102+
@volatile private var ref: Weaver[Any] = null
103+
def resolve(w: Weaver[?]): Unit = ref = w.asInstanceOf[Weaver[Any]]
104+
override def pack(...) = ref.pack(...)
105+
override def unpack(...) = ref.unpack(...)
106+
```
107+
108+
## Iterations and lessons (from codex review)
109+
110+
The fix went through three drafts. Each round revealed a subtler issue:
111+
112+
1. **Draft 1**`seen: Set[Surface]` threaded through factories; cache
113+
committed entry-by-entry.
114+
- codex: `seen.contains(surface)` misses `LazySurface` because
115+
`LazySurface` and `GenericSurface` have different case-class
116+
equality even when they describe the same type.
117+
- codex: a sub-weaver containing a `LazyWeaver` is published to the
118+
shared cache mid-build, so a parallel reader can grab it before
119+
the LazyWeaver's target is in the cache.
120+
121+
2. **Draft 2** — atomic commit at outer-most return; key recursion by
122+
`fullName` via per-thread pending map.
123+
- codex: when the outer-most surface is a *wrapper* around the
124+
recursive type (e.g. `case class Wrap(node: Node)`), insertion-order
125+
commit publishes Wrap before Node. Wrap's structure embeds
126+
CaseClassWeaver(Node) inline, which embeds `LazyWeaver("Node")`. A
127+
parallel reader hits IllegalStateException before Node is published.
128+
129+
3. **Draft 3** (final) — `LazyWeaver` holds a directly-set reference,
130+
wired in by the outer build before commit. Use-time no longer touches
131+
the cache, so commit order is irrelevant for correctness.
132+
133+
**Lesson:** when adapting a known design (airframe-codec's `LazyCodec`)
134+
into a different concurrency model (single-instance cache → globally
135+
shared cache), the original cache-lookup pattern stops being safe. The
136+
fix is to *eliminate* the cache lookup at use time, not to engineer the
137+
commit order around it.
138+
139+
## Test
140+
141+
`RecursiveSurfaceWeaverTest` covers:
142+
- self-recursive abstract class with `List[Self]` (the `DataType` shape),
143+
- abstract field embedded in a concrete case class,
144+
- self-recursive case class via `Option`,
145+
- mutually recursive case classes,
146+
- round-trip through the `LazyWeaver` path,
147+
- a 64-task concurrent stress test on the same recursive type — the
148+
regression test for the cross-thread race that draft 2 still had.
149+
150+
## Out of scope
151+
152+
- Surface caching itself (a separate concern).
153+
- Sealed-trait dispatch on the abstract base — the field stays lossy
154+
(empty-object fallback) per #511 / #513.
Lines changed: 31 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,31 @@
1+
package wvlet.uni.weaver
2+
3+
import wvlet.uni.surface.Surface
4+
import wvlet.uni.test.UniTest
5+
6+
object ConcurrentRecursiveSurfaceWeaverTest:
7+
case class Node(value: Int, next: Option[Node])
8+
9+
class ConcurrentRecursiveSurfaceWeaverTest extends UniTest:
10+
import ConcurrentRecursiveSurfaceWeaverTest.*
11+
12+
test("concurrent fromSurface for the same recursive type never observes a partial tree") {
13+
// Regression: an earlier draft committed sub-weavers (e.g. Option[Node]) to the
14+
// shared cache mid-build, so a parallel reader could grab one whose embedded
15+
// LazyWeaver(Node) had no cache target yet → IllegalStateException at pack time.
16+
// JVM-only because java.util.concurrent isn't available on Scala.js.
17+
val pool = java.util.concurrent.Executors.newFixedThreadPool(8)
18+
try
19+
val task: Runnable =
20+
() =>
21+
val w = Weaver.fromSurface(Surface.of[Node]).asInstanceOf[Weaver[Any]]
22+
val node = Node(1, Some(Node(2, None)))
23+
// Round-trip exercises every embedded LazyWeaver.
24+
w.fromJson(w.toJson(node)) shouldBe node
25+
val futures = (1 to 64).map(_ => pool.submit(task))
26+
futures.foreach(_.get())
27+
finally
28+
pool.shutdown()
29+
}
30+
31+
end ConcurrentRecursiveSurfaceWeaverTest

uni/src/main/scala/wvlet/uni/weaver/Weaver.scala

Lines changed: 88 additions & 7 deletions
Original file line numberDiff line numberDiff line change
@@ -16,7 +16,6 @@ import wvlet.uni.weaver.codec.PrimitiveWeaver
1616
import java.time.Instant
1717
import java.util.UUID
1818
import java.util.concurrent.ConcurrentHashMap
19-
import scala.jdk.CollectionConverters.*
2019

2120
trait Weaver[A]:
2221
def weave(v: A, config: WeaverConfig = WeaverConfig()): MsgPack = toMsgPack(v, config)
@@ -176,25 +175,82 @@ object Weaver:
176175
export PrimitiveWeaver.TupleElementWeaver
177176
export PrimitiveWeaver.{emptyTupleElementWeaver, nonEmptyTupleElementWeaver}
178177

179-
// Cache for weavers created from Surface
180-
private val surfaceWeaverCache = ConcurrentHashMap[String, Weaver[?]]().asScala
178+
// Cache for weavers created from Surface. Only fully-resolved weaver trees are committed.
179+
private val surfaceWeaverCache = ConcurrentHashMap[String, Weaver[?]]()
180+
181+
// Per-thread accumulator for the current top-level fromSurface call. Each entry holds
182+
// the LazyWeaver placeholder that was returned for recursive requests at that key, plus
183+
// the final built weaver once the build completes. Entries are published to
184+
// surfaceWeaverCache only after the outer-most fromSurface call returns, by which time
185+
// every LazyWeaver placeholder has had its target set directly via `resolve`. This
186+
// means consumers reading the cache never have to look up a placeholder's target — it's
187+
// already wired in — so concurrent threads observing a sub-weaver in cache cannot see
188+
// an unresolved LazyWeaver, regardless of commit order.
189+
private final class PendingEntry(val placeholder: LazyWeaver):
190+
var built: Weaver[?] = null
191+
192+
private val pendingBuild
193+
: ThreadLocal[scala.collection.mutable.LinkedHashMap[String, PendingEntry]] =
194+
new ThreadLocal[scala.collection.mutable.LinkedHashMap[String, PendingEntry]]:
195+
override def initialValue() = scala.collection.mutable.LinkedHashMap.empty
181196

182197
/**
183198
* Create a Weaver from Surface at runtime. Uses Surface type information to look up or build
184199
* appropriate Weaver by composing existing weavers.
185200
*
186201
* This is used by RPC framework to derive weavers for method parameters and return types without
187202
* requiring compile-time type information.
203+
*
204+
* Self-recursive types (e.g. `class T(children: List[T])`) are supported via [[LazyWeaver]]:
205+
* when a recursive request is made for a surface still on the current build stack, a placeholder
206+
* is returned. The placeholder's target is wired in directly once the surface's weaver is built,
207+
* so it never depends on cache lookups at use time.
188208
*/
189-
def fromSurface(surface: Surface): Weaver[?] = surfaceWeaverCache.getOrElseUpdate(
190-
surface.fullName,
191-
buildWeaver(surface)
192-
)
209+
def fromSurface(surface: Surface): Weaver[?] =
210+
val key = surface.fullName
211+
val cached = surfaceWeaverCache.get(key)
212+
if cached != null then
213+
return cached
214+
215+
val pending = pendingBuild.get()
216+
pending.get(key) match
217+
case Some(entry) if entry.built != null =>
218+
// Already finished within this build — return the same instance for sibling references.
219+
entry.built
220+
case Some(entry) =>
221+
// Recursive request: parent is still being built. Return its placeholder; it will be
222+
// resolved when the parent's buildWeaver call returns, before any consumer sees it.
223+
entry.placeholder
224+
case None =>
225+
val isOuterMost = pending.isEmpty
226+
val entry = PendingEntry(LazyWeaver())
227+
pending(key) = entry
228+
try
229+
val built = buildWeaver(surface)
230+
// Wire the placeholder directly to the built weaver so any structures that
231+
// captured the placeholder during recursion now have a working target.
232+
entry.placeholder.resolve(built)
233+
entry.built = built
234+
if isOuterMost then
235+
val it = pending.iterator
236+
while it.hasNext do
237+
val (k, e) = it.next()
238+
surfaceWeaverCache.putIfAbsent(k, e.built)
239+
built
240+
finally
241+
if isOuterMost then
242+
pending.clear()
243+
244+
end fromSurface
193245

194246
/**
195247
* Extensible weaver factories for runtime weaver construction. Each factory is a partial
196248
* function that maps a Surface to a Weaver. Factories are tried in order until one matches.
197249
*
250+
* Recursion is handled transparently via [[fromSurface]]: factories may call it directly when
251+
* descending into child surfaces; cycle detection is keyed on `Surface.fullName` so a
252+
* `LazySurface` placeholder resolves to the same key as the corresponding `GenericSurface`.
253+
*
198254
* This can be extended by platform-specific code to add support for additional types.
199255
*/
200256
type WeaverFactory = PartialFunction[Surface, Weaver[?]]
@@ -321,6 +377,31 @@ object Weaver:
321377
throw IllegalArgumentException(s"Cannot create Weaver for type: ${surface.fullName}")
322378
)
323379

380+
/**
381+
* Placeholder Weaver used to break cycles when deriving weavers for self-recursive types. Its
382+
* target is wired in by the outer build via [[resolve]] before any consumer can use it; pack and
383+
* unpack delegate directly to the resolved weaver. Holding a direct reference (rather than
384+
* re-looking up a global cache) means visibility to other threads doesn't depend on commit order
385+
* — the placeholder is always fully wired by the time it escapes the build.
386+
*/
387+
private final class LazyWeaver extends Weaver[Any]:
388+
@volatile
389+
private var ref: Weaver[Any] = null
390+
391+
def resolve(w: Weaver[?]): Unit = ref = w.asInstanceOf[Weaver[Any]]
392+
393+
override def pack(p: Packer, v: Any, config: WeaverConfig): Unit =
394+
val w = ref
395+
if w == null then
396+
throw IllegalStateException("LazyWeaver used before its target was resolved")
397+
w.pack(p, v, config)
398+
399+
override def unpack(u: Unpacker, context: WeaverContext): Unit =
400+
val w = ref
401+
if w == null then
402+
throw IllegalStateException("LazyWeaver used before its target was resolved")
403+
w.unpack(u, context)
404+
324405
// Weaver factory methods for composing weavers at runtime
325406

326407
private val unitWeaver: Weaver[Unit] =
Lines changed: 65 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,65 @@
1+
package wvlet.uni.weaver
2+
3+
import wvlet.uni.surface.Surface
4+
import wvlet.uni.test.UniTest
5+
6+
object RecursiveSurfaceWeaverTest:
7+
// Self-recursive abstract class — mirrors wvlet's `DataType` shape that triggered #515:
8+
// building Weaver[TypeNode] needs Weaver[List[TypeNode]] needs Weaver[TypeNode]…
9+
abstract class TypeNode(val name: String, val children: List[TypeNode])
10+
11+
// Concrete subtype to exercise round-trip on a non-abstract field that itself
12+
// contains the recursive abstract type.
13+
case class TypeColumn(label: String, node: TypeNode) derives Weaver
14+
15+
// Self-recursive concrete case class via Option to break the compile-time cycle.
16+
case class Node(value: Int, next: Option[Node])
17+
18+
// Mutually recursive case classes via Option (avoids divergent implicit search at derive time).
19+
case class Branch(name: String, child: Option[Leaf])
20+
case class Leaf(value: Int, parent: Option[Branch])
21+
end RecursiveSurfaceWeaverTest
22+
23+
class RecursiveSurfaceWeaverTest extends UniTest:
24+
import RecursiveSurfaceWeaverTest.*
25+
26+
test("fromSurface does not throw for self-recursive abstract class with self-typed children") {
27+
// Without the LazyWeaver fix this throws java.lang.IllegalStateException: Recursive update
28+
// from ConcurrentHashMap.computeIfAbsent during the recursive build of Weaver[TypeNode].
29+
val w = Weaver.fromSurface(Surface.of[TypeNode])
30+
w shouldNotBe null
31+
}
32+
33+
test("fromSurface for surface containing a self-recursive abstract field") {
34+
val w = Weaver.fromSurface(Surface.of[TypeColumn])
35+
w shouldNotBe null
36+
}
37+
38+
test("fromSurface for self-recursive case class") {
39+
val w = Weaver.fromSurface(Surface.of[Node])
40+
w shouldNotBe null
41+
}
42+
43+
test("fromSurface for mutually recursive case classes round-trips") {
44+
val w1 = Weaver.fromSurface(Surface.of[Branch]).asInstanceOf[Weaver[Any]]
45+
val w2 = Weaver.fromSurface(Surface.of[Leaf]).asInstanceOf[Weaver[Any]]
46+
w1 shouldNotBe null
47+
w2 shouldNotBe null
48+
49+
val data = Branch("root", Some(Leaf(1, Some(Branch("child", None)))))
50+
val json = w1.toJson(data)
51+
val restored = w1.fromJson(json)
52+
restored shouldBe data
53+
}
54+
55+
test("LazyWeaver round-trips data through self-recursive case class") {
56+
// Exercises LazyWeaver.pack/unpack: the cache is populated by the time the
57+
// placeholder is touched at runtime.
58+
val w = Weaver.fromSurface(Surface.of[Node]).asInstanceOf[Weaver[Any]]
59+
val node = Node(1, Some(Node(2, Some(Node(3, None)))))
60+
val json = w.toJson(node)
61+
val back = w.fromJson(json)
62+
back shouldBe node
63+
}
64+
65+
end RecursiveSurfaceWeaverTest

0 commit comments

Comments
 (0)