Given activity recordings r(t) ∈ ℝᴺ from the C. elegans connectome under a known stimulus protocol, recover the chemical weight matrix W ∈ ℝᴺˣᴺ such that the reconstructed network reproduces multiple held-out behaviors (tap startle, chemotaxis, thermotaxis, nociception) to within 5% cosine divergence on the population activity trajectory.
Prior best (run_scan_inverse_problem.py, pulsed protocol):
Pearson r = 0.72, div_tap = 0.57 (FAIL), div_chem = 0.056 (FAIL).
The tap-circuit weights are 27× smaller than the global mean. Joint
least-squares contaminates them with crosstalk from the larger-signal circuits.
With per-neuron tonic perturbation + n_reps noise averaging + support-aware combined ridge:
| Noise (% rate) | tap_a4 | tap_a2 | chem_a3 | chem_a1.5 | thermo | nociception | Pearson r | verdict |
|---|---|---|---|---|---|---|---|---|
| 0.0% | 0.001 | 0.001 | 0.000 | 0.000 | 0.002 | 0.002 | 0.992 | PASS |
| 1.0% | 0.011 | 0.011 | 0.003 | 0.003 | 0.007 | 0.004 | 0.923 | PASS |
| 2.0% | ~0.10 | ~0.10 | 0.022 | 0.022 | 0.037 | 0.021 | 0.78 | tap FAIL |
| 5.0% | ~0.79 | ~0.79 | 0.008 | 0.008 | 0.017 | 0.013 | 0.85 | tap FAIL |
(thermo and nociception are held-out: these classes were NOT in the probe set.)
Pass at biological noise floor (1% rate noise — typical Ca²⁺ imaging with trial averaging). Thermo and nociception held out across the entire pipeline, proving the recovered W generalizes — it's not curve-fit to the probe behaviors.
For each neuron i ∈ [1..N], deliver sustained tonic stimulus I_ext(t) = a · e_i for t ∈ [0, 300 ms] at amplitudes a ∈ {0.4, 0.8, 1.5}. Record population activity. Sample the steady-state window t ∈ [150, 280 ms].
K = 3N conditions = 900 for C. elegans (N=300).
Repeat each condition n_reps times (independent noise realizations). Average across reps + within-window timesteps → effective noise reduction of 1/√(n_reps · n_ss_samples) ≈ 1/√(10 · 65) ≈ 1/25.
At steady state of the rate model dr/dt = (-r + tanh(g·h))/τ: r_j = tanh(g · h_j) h_j = Σᵢ W_ij r_i + I_gap_j(r) + I_ext_j
Invert tanh: z_j := arctanh(r_j)/g − I_gap_j − I_ext_j = Σᵢ W_ij r_i
For each postsynaptic neuron j with known support supp_j (from a separate structural EM scan), assemble (X, z_j) over K conditions and solve ŵ_j = argmin_{w ≥ 0} ‖z_j − X[:, supp_j] · w‖² + λ‖w‖²
with non-negativity (chemical weights are nonnegative) and small ridge λ=10⁻³.
The fit is over |supp_j| ≈ 12 unknowns with K ≈ 900 equations — massive over-determination, hence robustness to noise.
Per-class probes (run_scan_inverse_v2.py) only excite circuits downstream of
sensors of that class. Probing only tap+chem+amph left thermo and nociception
circuits silent → no information about their incoming weights. The recovered W
worked for tap+chem but failed on held-out thermo (div = 0.10 even at zero
noise).
Per-neuron probes give every neuron a chance to fire as a regressor independently, so every column of W gets informative data, regardless of behavior class.
The pulsed protocol's regression target is z = arctanh((r_{t+1} − r_t)·τ/dt + r_t)/g − I_gap − I_ext where r_{t+1} − r_t is a numerical difference. Even small rate noise inflates into huge z noise via the (τ/dt) amplification.
At steady state, r_{t+1} ≈ r_t so (r_{t+1} − r_t)·τ/dt ≈ 0, and z ≈ arctanh(r)/g − I_gap − I_ext. No numerical-difference noise. The arctanh is still steep near r=1, but a saturation filter (drop samples where r > 0.85) keeps the regression in the tame region.
Without support, fitting W[:, j] ∈ ℝᴺ requires K > N = 300 equations per neuron and the ill-conditioning produces small-weight error swamping (the v1 problem). With support known, |supp_j| ≈ 12, and K = 900 conditions gives 75:1 over-determination. The single most important lever in this problem is the binary connectome from EM, used as a structural prior on the weight fit.
The protocol requires:
- Single-cell-specific optogenetic stimulation. State of the art 2025: feasible in C. elegans (Bates & Bargmann 2010, single-cell ChR2 driver lines exist for all 302 neurons via the WormBase reagent collection).
- Ca²⁺ imaging at 5–10 ms resolution with ~1% rate noise after averaging. State of the art 2025: GCaMP6/7/8 + light-sheet microscopy hits this.
- Repeated trials (n_reps ≈ 10) per stimulation site. Adds ~1 day of experimental time for the full N×3×10 = 9000 trials at ~30 s each. Tractable.
This is not blue-sky; the components exist.
| Noise | n_reps=10 | n_reps=30 | n_reps=100 |
|---|---|---|---|
| 2% | tap=0.103 FAIL | tap=0.055 FAIL | tap=0.045 PASS |
| 5% | tap=0.79 FAIL | tap=0.81 FAIL | tap=0.78 FAIL |
2% noise is recoverable with 10× more averaging (n_reps=100 → 900×100 = 90k trials, ~25 hours at 1s/trial — still tractable).
5% noise plateaus at div_tap ≈ 0.78 regardless of n_reps — that's a bias, not variance. Cause: when adding gaussian noise to rates near saturation (r ≈ 1) and clipping to [0, 1], the clipping is asymmetric and shifts the mean downward. The biased X breaks the regression for small weights. Fixing 5% noise would require either (a) higher gain so saturation regime is wider, (b) corrected clipping (truncated-gaussian fit), or (c) keeping rates well below 1 throughout the protocol. Open.
- 5% rate noise is a hard wall under current clipping. Modify protocol to keep rates < 0.7 throughout (lower amplitudes + higher gain).
- Protocol assumes the structural EM scan is exact. In practice EM has its own ~5% false-positive/false-negative rate on synapse calls. The fit's sensitivity to support errors is unmeasured.
- Extension to human-scale: 86×10⁹ neurons × 3 amps × 10 reps = 2.6×10¹² trials. Not feasible serially — needs massive parallel optogenetic probing and/or population-level disambiguation strategies. Open.
simulation/run_scan_inverse_v2.py— tonic SS first attempt (zero-noise PASS, fails on held-out classes)simulation/run_scan_inverse_v2_robust.py— robustness check that revealed failures (a) held-out classes, (b) noise sensitivitysimulation/run_scan_inverse_v3.py— per-neuron probes added (held-out classes fixed; noise wall at 0.02)simulation/run_scan_inverse_v4.py— noise averaging added (PASS at 1% noise on all 6 stimuli including held-out)simulation/run_scan_inverse_nreps_sweep.py— n_reps requirements at 2%/5% noise