# Neural Demo - Using neural.slang

This demo showcases how to use Slang's `neural.slang` standard module to build a neural network for image reconstruction. The network learns to map UV coordinates to RGB colors, reconstructing a reference image through gradient-based optimization.

This is a re-creation of the texture example from the https://github.com/shader-slang/neural-shading-s25 course.
## Overview

The demo uses an MLP (Multi-Layer Perceptron) with the following architecture (sketched in code below):

- **Input**: 4 latent features sampled from a learnable texture
- **Layer 0**: 4 → 32 neurons + LeakyReLU
- **Layer 1**: 32 → 32 neurons + LeakyReLU
- **Layer 2**: 32 → 3 neurons + Exp (for positive RGB output)
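
As a minimal sketch of how those layers chain, assuming element-wise `leakyRelu` and `expVec` helpers (hypothetical names; the demo's `FFLayer` instances use `IdentityActivation`, so activations are applied separately):

```slang
// Sketch only: ff0/ff1/ff2 are FFLayer instances like the one constructed
// in MLPNetwork.forward (shown later); ff1/ff2 are built analogously.
Vec3 forward(Vec4 latent)
{
    Vec32 h0 = leakyRelu(ff0.eval(NoParam(), latent)); // 4 -> 32
    Vec32 h1 = leakyRelu(ff1.eval(NoParam(), h0));     // 32 -> 32
    return expVec(ff2.eval(NoParam(), h1));            // 32 -> 3, positive RGB
}
```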

## neural.slang Types Used

| Type | Description |
|------|-------------|
| `InlineVector<T, N>` | Fixed-size vector type with a compile-time `.Size` constant |
| `StructuredBufferStorage<T>` | GPU buffer storage implementing the `IStorage<T>` interface |
| `FFLayer<T, InVec, OutVec, Storage, Activation, HasBias>` | Feed-forward neural network layer |
| `IdentityActivation<T>` | Pass-through activation (no transformation) |
| `NoParam()` | Empty parameter for activations that don't need configuration |

## Before/After Comparison

### Vector Types

| Before (Manual) | After (neural.slang) |
|-----------------|----------------------|
| `float[4]` / `float4` | `InlineVector<float, 4>` |
| `float[32]` | `InlineVector<float, 32>` |
| `float[3]` / `float3` | `InlineVector<float, 3>` |
| Manual size tracking | `Vec4.Size` compile-time constant |

**Before:**

```slang
static const int INPUT_SIZE = 4;
static const int HIDDEN_SIZE = 32;
static const int OUTPUT_SIZE = 3;

float[32] hidden;
```

**After:**

```slang
typealias Vec4 = InlineVector<float, 4>;
typealias Vec32 = InlineVector<float, 32>;
typealias Vec3 = InlineVector<float, 3>;

static const int INPUT_SIZE = Vec4.Size;   // 4
static const int HIDDEN_SIZE = Vec32.Size; // 32
static const int OUTPUT_SIZE = Vec3.Size;  // 3

Vec32 hidden;
```
### Parameter Storage

| Before (Manual) | After (neural.slang) |
|-----------------|----------------------|
| Separate weight/bias buffers | `StructuredBufferStorage<T>` wrapper |
| Manual offset calculation | `Storage.getOffset()` method |
| Manual parameter count | `FFLayer.ParameterCount` constant |

**Before:**

```slang
struct Layer
{
    RWStructuredBuffer<float> weights; // [out * in]
    RWStructuredBuffer<float> biases;  // [out]

    static const int PARAM_COUNT = 32 * 4 + 32; // Manual calculation
}
```

**After:**

```slang
typealias Storage = StructuredBufferStorage<float>;
typealias Layer0Type = FFLayer<float, Vec4, Vec32, Storage, Act, true>;

// Parameter count computed automatically from layer dimensions
static const int LAYER0_PARAMS = Layer0Type.ParameterCount; // 4*32 + 32 = 160

struct MLPNetwork
{
    // Single buffer per layer: [weights row-major, biases]
    RWStructuredBuffer<float> layer0_params;
}
```
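
The same pattern covers the other layers; the counts follow directly from the layer dimensions (a sketch using the `Layer1Type`/`Layer2Type` aliases defined under "Network Definition" below):

```slang
static const int LAYER0_PARAMS = Layer0Type.ParameterCount; // 4*32 + 32  = 160
static const int LAYER1_PARAMS = Layer1Type.ParameterCount; // 32*32 + 32 = 1056
static const int LAYER2_PARAMS = Layer2Type.ParameterCount; // 32*3 + 3   = 99
```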

### Layer Forward Pass

| Before (Manual) | After (neural.slang) |
|-----------------|----------------------|
| Manual matrix multiply | `FFLayer.eval()` using `linearTransform` |
| Explicit loops | Optimized internal implementation |
| Manual bias addition | Handled by `FFLayer` |

**Before:**

```slang
[Differentiable]
float[32] layer_forward(float[4] input)
{
    float[32] output;
    for (int row = 0; row < 32; ++row)
    {
        float sum = biases[row];
        for (int col = 0; col < 4; ++col)
            sum += weights[row * 4 + col] * input[col];
        output[row] = sum;
    }
    return output;
}
```

**After:**

```slang
Vec3 forward(Vec4 input)
{
    // Create storage wrapper around buffer
    let storage0 = Storage(layer0_params);

    // Create FFLayer instance
    // FFLayer(storage, weightAddress, biasAddress)
    let ff0 = Layer0Type(storage0, 0u, INPUT_SIZE * HIDDEN_SIZE);

    // Forward pass: y = W*x + b (linearTransform inside eval)
    Vec32 h0 = ff0.eval(NoParam(), input);

    // Apply activation...
}
```

The bias address `INPUT_SIZE * HIDDEN_SIZE` points just past the weights: each layer's buffer packs the `4*32` row-major weights first, followed by the biases.

### Network Definition

| Before (Manual) | After (neural.slang) |
|-----------------|----------------------|
| Custom struct with manual layout | Type aliases for layers |
| Hardcoded dimensions | Dimensions from vector types |
| Manual weight indexing | Automatic address calculation |

**Before:**

```slang
struct Network
{
    RWStructuredBuffer<float> layer0_weights; // 4*32 floats
    RWStructuredBuffer<float> layer0_biases;  // 32 floats
    RWStructuredBuffer<float> layer1_weights; // 32*32 floats
    RWStructuredBuffer<float> layer1_biases;  // 32 floats
    RWStructuredBuffer<float> layer2_weights; // 32*3 floats
    RWStructuredBuffer<float> layer2_biases;  // 3 floats

    [Differentiable]
    float3 forward(float4 input) { /* manual implementation */ }
}
```

**After:**

```slang
import neural;

// Type definitions using neural.slang
typealias Vec4 = InlineVector<float, 4>;
typealias Vec32 = InlineVector<float, 32>;
typealias Vec3 = InlineVector<float, 3>;
typealias Storage = StructuredBufferStorage<float>;
typealias Act = IdentityActivation<float>;

typealias Layer0Type = FFLayer<float, Vec4, Vec32, Storage, Act, true>;
typealias Layer1Type = FFLayer<float, Vec32, Vec32, Storage, Act, true>;
typealias Layer2Type = FFLayer<float, Vec32, Vec3, Storage, Act, true>;

struct MLPNetwork
{
    // One buffer per layer: [weights, biases] contiguous
    RWStructuredBuffer<float> layer0_params;
    RWStructuredBuffer<float> layer1_params;
    RWStructuredBuffer<float> layer2_params;

    Vec3 forward(Vec4 input)
    {
        let storage0 = Storage(layer0_params);
        let ff0 = Layer0Type(storage0, 0u, INPUT_SIZE * HIDDEN_SIZE);
        Vec32 h0 = ff0.eval(NoParam(), input);
        // ...
    }
}
```

## Python-Side Parameter Management

| Before (Manual) | After (neural.slang) |
|-----------------|----------------------|
| Separate numpy arrays | `FFLayerParams` class matching the FFLayer layout |
| Manual buffer creation | Automatic `[weights, biases]` concatenation |
| Manual gradient tracking | Linked `TrainableLayerParams` |

**Before:**

```python
class Layer:
    def __init__(self, inputs, outputs):
        self.weights = np.random.randn(outputs, inputs)
        self.biases = np.zeros(outputs)
        self.weights_buffer = create_buffer(self.weights)
        self.biases_buffer = create_buffer(self.biases)
```

**After:**

```python
class FFLayerParams:
    """Parameters matching FFLayer's expected buffer layout."""

    def __init__(self, inputs: int, outputs: int):
        # Xavier initialization
        scale = np.sqrt(6.0 / (inputs + outputs))
        self.weights_np = np.random.uniform(-scale, scale, (outputs, inputs))
        self.biases_np = np.zeros(outputs)

        # Create single buffer: [weights row-major, biases]
        params = np.concatenate([self.weights_np.flatten(), self.biases_np])
        self.buffer = create_buffer(params)
```
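
Instantiating one `FFLayerParams` per layer of the 4 → 32 → 32 → 3 network then looks like this (a sketch; the per-layer float counts are `outputs * inputs + outputs`):

```python
layer0 = FFLayerParams(inputs=4, outputs=32)   # 4*32 + 32  = 160 floats
layer1 = FFLayerParams(inputs=32, outputs=32)  # 32*32 + 32 = 1056 floats
layer2 = FFLayerParams(inputs=32, outputs=3)   # 32*3 + 3   = 99 floats
```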

## Architecture Notes

### Why Two Network Types?

The demo uses two network structs:

1. **`MLPNetwork`** (FFLayer-based) - For rendering
   - Uses `FFLayer.eval()` with `StructuredBufferStorage`
   - Forward-only evaluation (non-differentiable struct)
   - Fast inference using optimized `linearTransform`

2. **`TrainableMLPNetwork`** (Tensor-based) - For training
   - Uses explicit weight/bias Tensors with gradient accumulation
   - Implements the same `W*x + b` computation
   - Gradients accumulate via `AtomicTensor`

This separation is needed because `FFLayer.eval()` has `[NoDiffThis]`: gradients flow through the storage via `atomicAdd`, which requires a specific differential storage setup that is complex to wire up from Python. The Tensor-based approach gives us explicit control over gradient flow.

After each optimization step, weights are synced from `TrainableLayerParams` back to `FFLayerParams.buffer`, so the FFLayer-based `MLPNetwork` renders with the updated weights.
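
A minimal sketch of that sync step, assuming `weights_np`/`biases_np` arrays on the trainable side and slangpy's `copy_from_numpy` buffer upload (names are illustrative; the demo's actual helpers may differ):

```python
import numpy as np

def sync_to_fflayer(trainable, ff_params):
    """Repack trained weights/biases into FFLayer's [weights row-major, biases] layout."""
    packed = np.concatenate([
        trainable.weights_np.flatten(),
        trainable.biases_np,
    ]).astype(np.float32)  # GPU buffer expects 32-bit floats
    ff_params.buffer.copy_from_numpy(packed)  # upload to the GPU-side buffer
```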

### InlineVector Subscript Limitation

The `InlineVector<T, N>` subscript operator (`operator[]`) doesn't have a backward derivative in the current implementation. This means:

```slang
// This would break gradient flow:
Vec32 v;
float x = v[0]; // No backward derivative for subscript!
```

**Workaround**: Convert between `InlineVector` and arrays using custom converters with an explicit `[BackwardDerivative]`:

```slang
[BackwardDerivative(vec32ToArrBwd)]
float[32] vec32ToArr(Vec32 v)
{
    float[32] a;
    [ForceUnroll] for (int i = 0; i < 32; ++i) a[i] = v[i];
    return a;
}

void vec32ToArrBwd(inout DifferentialPair<Vec32> dv, float[32] da)
{
    Vec32 d;
    [ForceUnroll] for (int i = 0; i < 32; ++i) d[i] = da[i];
    dv = diffPair(dv.p, d);
}
```
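
With the converter in place, element access can live inside differentiable code, as in this sketch (`sumFirstTwo` is illustrative, not from the demo):

```slang
[Differentiable]
float sumFirstTwo(Vec32 v)
{
    // Gradients flow back through vec32ToArrBwd.
    float[32] a = vec32ToArr(v);
    return a[0] + a[1]; // array subscript is differentiable
}
```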

## Running the Demo

```bash
cd slangpy-samples/examples/neural-demo
python neural-demo.py
```

The demo displays three panels:

1. **Reference image** - Target to reconstruct
2. **Network output** - Current reconstruction using the FFLayer-based network
3. **Loss visualization** - Per-pixel error

Loss values are printed to the console and should decrease over time as the network learns.
## Key Files

- `neural-demo.slang` - Shader code with FFLayer types and network definitions
- `neural-demo.py` - Python host code with parameter management and training loop
- `slangstars.png` - Reference image to reconstruct
## Dependencies

Requires the `neural` module to compile:

```slang
import neural; // Required! Demo won't compile without this
```

This import provides `InlineVector`, `StructuredBufferStorage`, `FFLayer`, `IdentityActivation`, `NoParam`, and other neural network primitives.
