fix: update how the error message looks (#1553)

avik-pal · web-flow · commit 15e62d3c8924 · 2025-11-14T13:16:23.000-05:00
diff --git a/ext/LuxReactantExt/training.jl b/ext/LuxReactantExt/training.jl
@@ -9,9 +9,10 @@ Invalid TrainState construction using a compiled function.
 
 `TrainState` is being constructed with a reactant compiled function, i.e. a
 `Reactant.Compiler.Thunk`. This is likely a mistake as the model should be
-passed in directly without being compiled first.
+passed in directly without being compiled first. When `single_train_step` or other
+functions are called on the `TrainState`, the model will be compiled automatically.
 
-This is likely originating from the following style of usage:
+The correct usage is:
 
 ```julia
 using Lux, Reactant, Random, Optimisers
@@ -22,17 +23,25 @@ model = Dense(10, 10)
 ps, st = Lux.setup(Random.default_rng(), model) |> rdev
 x = rand(10) |> rdev
 
-model_compiled = @compile model(x, ps, st)
-
-train_state = Training.TrainState(model_compiled, ps, st, Adam())
+train_state = TrainState(model, ps, st, Adam())
 ```
 
-Instead avoid compiling the model and pass it directly to `TrainState`. When
-`single_train_step` or other functions are called on the `TrainState`, the
-model will be compiled automatically.
+This error occurs because the model was compiled before being passed to
+`TrainState`, which is not supported. **The following is the incorrect usage,
+which can cause this error.**
 
 ```julia
-train_state = Training.TrainState(model, ps, st, Adam())
+using Lux, Reactant, Random, Optimisers
+
+rdev = reactant_device()
+
+model = Dense(10, 10)
+ps, st = Lux.setup(Random.default_rng(), model)
+x = rand(10) |> rdev
+
+model_compiled = @compile model(x, ps, st)
+
+train_state = Training.TrainState(model_compiled, ps, st, Adam())
 ```
 
 For end-to-end usage example refer to the documentation:
diff --git a/src/helpers/training.jl b/src/helpers/training.jl
@@ -69,30 +69,31 @@ end
 
 function Adapt.adapt_structure(to::ReactantDevice, ts::TrainState)
     @warn """
-    Moving `TrainState` to `ReactantDevice` might lead to unwanted behaviour. This
-    potentially originates from the following style of usage:
+    Moving `TrainState` to `ReactantDevice` might lead to unwanted behaviour.
+
+    Move the `ps` and `st` to the device before constructing the `TrainState`.
+    This ensures the optimizer state and other internal states are on the device on
+    construction. Prefer using the following style:
 
     ```julia
     rdev = reactant_device()
 
-    ps, st = Lux.setup(rng, model)
+    ps, st = Lux.setup(rng, model) |> rdev
     train_state = TrainState(model, ps, st, opt)
-    train_state = train_state |> rdev
     ```
 
-    Specifically, `ps` and `st` we on the host device when `train_state` is being
-    constructed and later `train_state` is moved to the device. Instead it is recommended
-    to do the following:
+    This warning potentially originates from having `ps` and `st` on the host when
+    constructing the `TrainState`, and later moving the `TrainState` to the device.
+    **The following is the incorrect way, which potentially causes this warning to
+    appear.**
 
     ```julia
     rdev = reactant_device()
 
-    ps, st = Lux.setup(rng, model) |> rdev
+    ps, st = Lux.setup(rng, model)
     train_state = TrainState(model, ps, st, opt)
+    train_state = train_state |> rdev
     ```
-
-    This ensures the optimizer state and other internal states are on the device on
-    construction.
     """
     return @invoke Adapt.adapt_structure(to::AbstractDevice, ts::TrainState)
 end