# Neural Demo - Using neural.slang

This demo showcases how to use Slang's `neural.slang` standard module to build a neural network for image reconstruction. The network learns to map UV coordinates to RGB colors, reconstructing a reference image through gradient-based optimization.
It is a re-creation of the texture example from the https://github.com/shader-slang/neural-shading-s25 course.

## Overview

The demo uses an MLP (Multi-Layer Perceptron) with the following architecture:
- **Input**: 4 latent features sampled from a learnable texture
- **Layer 0**: 4 → 32 neurons + LeakyReLU
- **Layer 1**: 32 → 32 neurons + LeakyReLU
- **Layer 2**: 32 → 3 neurons + Exp (for positive RGB output)

## neural.slang Types Used

| Type | Description |
|------|-------------|
| `InlineVector<T, N>` | Fixed-size vector type with compile-time `.Size` constant |
| `StructuredBufferStorage<T>` | GPU buffer storage implementing the `IStorage<T>` interface |
| `FFLayer<T, InVec, OutVec, Storage, Activation, HasBias>` | Feed-forward neural network layer |
| `IdentityActivation<T>` | Pass-through activation (no transformation) |
| `NoParam()` | Empty parameter for activations that don't need configuration |

## Before/After Comparison

### Vector Types

| Before (Manual) | After (neural.slang) |
|-----------------|---------------------|
| `float[4]` / `float4` | `InlineVector<float, 4>` |
| `float[32]` | `InlineVector<float, 32>` |
| `float[3]` / `float3` | `InlineVector<float, 3>` |
| Manual size tracking | `Vec4.Size` compile-time constant |

**Before:**
```slang
static const int INPUT_SIZE = 4;
static const int HIDDEN_SIZE = 32;
static const int OUTPUT_SIZE = 3;

float[32] hidden;
```

**After:**
```slang
typealias Vec4 = InlineVector<float, 4>;
typealias Vec32 = InlineVector<float, 32>;
typealias Vec3 = InlineVector<float, 3>;

static const int INPUT_SIZE = Vec4.Size;   // 4
static const int HIDDEN_SIZE = Vec32.Size; // 32
static const int OUTPUT_SIZE = Vec3.Size;  // 3

Vec32 hidden;
```

### Parameter Storage

| Before (Manual) | After (neural.slang) |
|-----------------|---------------------|
| Separate weight/bias buffers | `StructuredBufferStorage<T>` wrapper |
| Manual offset calculation | `Storage.getOffset()` method |
| Manual parameter count | `FFLayer.ParameterCount` constant |

**Before:**
```slang
struct Layer
{
    RWStructuredBuffer<float> weights; // [out * in]
    RWStructuredBuffer<float> biases;  // [out]

    static const int PARAM_COUNT = 32 * 4 + 32; // Manual calculation
}
```

**After:**
```slang
typealias Storage = StructuredBufferStorage<float>;
typealias Layer0Type = FFLayer<float, Vec4, Vec32, Storage, Act, true>;

// Parameter count computed automatically from layer dimensions
static const int LAYER0_PARAMS = Layer0Type.ParameterCount; // 4*32 + 32 = 160

struct MLPNetwork
{
    // Single buffer per layer: [weights row-major, biases]
    RWStructuredBuffer<float> layer0_params;
}
```
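The same constant scales to the other layers. A quick sketch of the per-layer counts, using the `Layer1Type` and `Layer2Type` aliases defined under Network Definition below:

```slang
// Per-layer parameter counts (weights + biases), derived entirely from
// the layer type aliases; the arithmetic on the right is just a check.
static const int LAYER0_PARAMS = Layer0Type.ParameterCount; // 4*32  + 32 = 160
static const int LAYER1_PARAMS = Layer1Type.ParameterCount; // 32*32 + 32 = 1056
static const int LAYER2_PARAMS = Layer2Type.ParameterCount; // 32*3  + 3  = 99
```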

### Layer Forward Pass

| Before (Manual) | After (neural.slang) |
|-----------------|---------------------|
| Manual matrix multiply | `FFLayer.eval()` using `linearTransform` |
| Explicit loops | Optimized internal implementation |
| Manual bias addition | Handled by `FFLayer` |

**Before:**
```slang
[Differentiable]
float[32] layer_forward(float[4] input)
{
    float[32] output;
    for (int row = 0; row < 32; ++row)
    {
        float sum = biases[row];
        for (int col = 0; col < 4; ++col)
            sum += weights[row * 4 + col] * input[col];
        output[row] = sum;
    }
    return output;
}
```

**After:**
```slang
Vec3 forward(Vec4 input)
{
    // Create storage wrapper around buffer
    let storage0 = Storage(layer0_params);

    // Create FFLayer instance
    // FFLayer(storage, weightAddress, biasAddress)
    let ff0 = Layer0Type(storage0, 0u, INPUT_SIZE * HIDDEN_SIZE);

    // Forward pass: y = W*x + b (linearTransform inside eval)
    Vec32 h0 = ff0.eval(NoParam(), input);

    // Apply activation...
}
```
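Because the layer aliases use `IdentityActivation`, the demo applies activations manually after `eval()`. Below is a minimal sketch of a LeakyReLU helper for the forward-only rendering path; the helper is an illustration, not part of `neural.slang`, and it indexes the vector directly, so keep it out of differentiable code (see the InlineVector subscript limitation below):

```slang
// Hypothetical elementwise LeakyReLU for the forward-only rendering path.
// NOTE: uses InlineVector's subscript, which has no backward derivative,
// so the training path must route through the array converters instead.
Vec32 leakyReLU(Vec32 x)
{
    Vec32 y;
    [ForceUnroll]
    for (int i = 0; i < 32; ++i)
        y[i] = x[i] > 0.0f ? x[i] : 0.01f * x[i];
    return y;
}
```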

### Network Definition

| Before (Manual) | After (neural.slang) |
|-----------------|---------------------|
| Custom struct with manual layout | Type aliases for layers |
| Hardcoded dimensions | Dimensions from vector types |
| Manual weight indexing | Automatic address calculation |

**Before:**
```slang
struct Network
{
    RWStructuredBuffer<float> layer0_weights; // 4*32 floats
    RWStructuredBuffer<float> layer0_biases;  // 32 floats
    RWStructuredBuffer<float> layer1_weights; // 32*32 floats
    RWStructuredBuffer<float> layer1_biases;  // 32 floats
    RWStructuredBuffer<float> layer2_weights; // 32*3 floats
    RWStructuredBuffer<float> layer2_biases;  // 3 floats

    [Differentiable]
    float3 forward(float4 input) { /* manual implementation */ }
}
```

**After:**
```slang
import neural;

// Type definitions using neural.slang
typealias Vec4 = InlineVector<float, 4>;
typealias Vec32 = InlineVector<float, 32>;
typealias Vec3 = InlineVector<float, 3>;
typealias Storage = StructuredBufferStorage<float>;
typealias Act = IdentityActivation<float>;

typealias Layer0Type = FFLayer<float, Vec4, Vec32, Storage, Act, true>;
typealias Layer1Type = FFLayer<float, Vec32, Vec32, Storage, Act, true>;
typealias Layer2Type = FFLayer<float, Vec32, Vec3, Storage, Act, true>;

struct MLPNetwork
{
    // One buffer per layer: [weights, biases] contiguous
    RWStructuredBuffer<float> layer0_params;
    RWStructuredBuffer<float> layer1_params;
    RWStructuredBuffer<float> layer2_params;

    Vec3 forward(Vec4 input)
    {
        let storage0 = Storage(layer0_params);
        let ff0 = Layer0Type(storage0, 0u, INPUT_SIZE * HIDDEN_SIZE);
        Vec32 h0 = ff0.eval(NoParam(), input);
        // ...
    }
}
```
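For completeness, here is one way the elided body could continue, chaining all three layers and the final `Exp`. This is a sketch under the assumptions above (each layer's bias address equals its weight count, and the hypothetical `leakyReLU` helper from earlier), not the demo's verbatim code:

```slang
// Sketch: full three-layer forward pass for the rendering network.
Vec3 forward(Vec4 input)
{
    let ff0 = Layer0Type(Storage(layer0_params), 0u, INPUT_SIZE * HIDDEN_SIZE);
    let ff1 = Layer1Type(Storage(layer1_params), 0u, HIDDEN_SIZE * HIDDEN_SIZE);
    let ff2 = Layer2Type(Storage(layer2_params), 0u, HIDDEN_SIZE * OUTPUT_SIZE);

    Vec32 h0 = leakyReLU(ff0.eval(NoParam(), input)); // 4 -> 32
    Vec32 h1 = leakyReLU(ff1.eval(NoParam(), h0));    // 32 -> 32
    Vec3 o = ff2.eval(NoParam(), h1);                 // 32 -> 3

    // Exp activation keeps the RGB output positive (see Overview).
    Vec3 rgb;
    [ForceUnroll]
    for (int i = 0; i < 3; ++i)
        rgb[i] = exp(o[i]);
    return rgb;
}
```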

## Python-Side Parameter Management

| Before (Manual) | After (neural.slang) |
|-----------------|---------------------|
| Separate numpy arrays | `FFLayerParams` class matching FFLayer layout |
| Manual buffer creation | Automatic `[weights, biases]` concatenation |
| Manual gradient tracking | Linked `TrainableLayerParams` |

**Before:**
```python
class Layer:
    def __init__(self, inputs, outputs):
        self.weights = np.random.randn(outputs, inputs)
        self.biases = np.zeros(outputs)
        self.weights_buffer = create_buffer(self.weights)
        self.biases_buffer = create_buffer(self.biases)
```

**After:**
```python
class FFLayerParams:
    """Parameters matching FFLayer's expected buffer layout."""

    def __init__(self, inputs: int, outputs: int):
        # Xavier initialization
        scale = np.sqrt(6.0 / (inputs + outputs))
        self.weights_np = np.random.uniform(-scale, scale, (outputs, inputs))
        self.biases_np = np.zeros(outputs)

        # Create single buffer: [weights row-major, biases]
        params = np.concatenate([self.weights_np.flatten(), self.biases_np])
        self.buffer = create_buffer(params)
```
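Usage then mirrors the Slang type aliases; a short sketch:

```python
# One FFLayerParams per layer, matching the 4 -> 32 -> 32 -> 3 architecture.
layer0 = FFLayerParams(4, 32)   # 160 floats
layer1 = FFLayerParams(32, 32)  # 1056 floats
layer2 = FFLayerParams(32, 3)   # 99 floats
```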

## Architecture Notes

### Why Two Network Types?

The demo uses two network structs:

1. **`MLPNetwork`** (FFLayer-based) - For rendering
   - Uses `FFLayer.eval()` with `StructuredBufferStorage`
   - Forward-only evaluation (non-differentiable struct)
   - Fast inference using optimized `linearTransform`

2. **`TrainableMLPNetwork`** (Tensor-based) - For training
   - Uses explicit weight/bias Tensors with gradient accumulation
   - Implements the same `W*x + b` computation
   - Gradients accumulate via `AtomicTensor`

This separation is needed because `FFLayer.eval()` has `[NoDiffThis]`: gradients flow through the storage via `atomicAdd`, which requires a specific differential storage setup that is complex to wire up from Python. The Tensor-based approach gives us explicit control over gradient flow.

After each optimization step, weights are synced from `TrainableLayerParams` back to `FFLayerParams.buffer`, so the FFLayer-based `MLPNetwork` renders with the updated weights. A sketch of this sync step follows.
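A minimal sketch of that sync, assuming `TrainableLayerParams` exposes updated `weights_np`/`biases_np` arrays and the buffer supports `copy_from_numpy`; the names here are illustrative, not the demo's exact API:

```python
def sync_to_fflayer(trainable, ff_params):
    # Rebuild the [weights row-major, biases] layout FFLayer expects
    # and upload it to the single per-layer parameter buffer.
    packed = np.concatenate([
        trainable.weights_np.flatten(),
        trainable.biases_np,
    ]).astype(np.float32)
    ff_params.buffer.copy_from_numpy(packed)
```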

### InlineVector Subscript Limitation

The `InlineVector<T, N>` subscript operator (`operator[]`) doesn't have a backward derivative in the current implementation. This means:

```slang
// This would break gradient flow:
Vec32 v;
float x = v[0]; // No backward derivative for subscript!
```

**Workaround**: Convert between `InlineVector` and arrays using custom converters with an explicit `[BackwardDerivative]`:

```slang
[BackwardDerivative(vec32ToArrBwd)]
float[32] vec32ToArr(Vec32 v)
{
    float[32] a;
    [ForceUnroll] for (int i = 0; i < 32; ++i) a[i] = v[i];
    return a;
}

void vec32ToArrBwd(inout DifferentialPair<Vec32> dv, float[32] da)
{
    Vec32 d;
    [ForceUnroll] for (int i = 0; i < 32; ++i) d[i] = da[i];
    dv = diffPair(dv.p, d);
}
```
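The reverse direction follows the same pattern; a sketch:

```slang
// Sketch: array -> InlineVector converter with an explicit backward pass.
[BackwardDerivative(arrToVec32Bwd)]
Vec32 arrToVec32(float[32] a)
{
    Vec32 v;
    [ForceUnroll] for (int i = 0; i < 32; ++i) v[i] = a[i];
    return v;
}

void arrToVec32Bwd(inout DifferentialPair<float[32]> da, Vec32 dv)
{
    float[32] d;
    [ForceUnroll] for (int i = 0; i < 32; ++i) d[i] = dv[i];
    da = diffPair(da.p, d);
}
```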

## Running the Demo

```bash
cd slangpy-samples/examples/neural-demo
python neural-demo.py
```

The demo displays three panels:
1. **Reference image** - Target to reconstruct
2. **Network output** - Current reconstruction using the FFLayer-based network
3. **Loss visualization** - Per-pixel error

Loss values are printed to the console and should decrease over time as the network learns.

## Key Files

- `neural-demo.slang` - Shader code with FFLayer types and network definitions
- `neural-demo.py` - Python host code with parameter management and training loop
- `slangstars.png` - Reference image to reconstruct

## Dependencies

Requires the `neural` module to compile:
```slang
import neural; // Required! Demo won't compile without this
```

This import provides `InlineVector`, `StructuredBufferStorage`, `FFLayer`, `IdentityActivation`, `NoParam`, and other neural network primitives.