Direction 1 — Scan Inverse Problem (resolved at biological noise)

Statement

Given activity recordings r(t) ∈ ℝᴺ from the C. elegans connectome under a known stimulus protocol, recover the chemical weight matrix W ∈ ℝᴺˣᴺ such that the reconstructed network reproduces multiple held-out behaviors (tap startle, chemotaxis, thermotaxis, nociception) to within 5% cosine divergence on the population activity trajectory.

Prior best (run_scan_inverse_problem.py, pulsed protocol): Pearson r = 0.72, div_tap = 0.57 (FAIL), div_chem = 0.056 (FAIL). The tap-circuit weights are 27× smaller than the global mean. Joint least-squares contaminates them with crosstalk from the larger-signal circuits.

Result (`run_scan_inverse_v4.py`)

With per-neuron tonic perturbation + n_reps noise averaging + support-aware combined ridge:

Noise (% rate)	tap_a4	tap_a2	chem_a3	chem_a1.5	thermo	nociception	Pearson r	verdict
0.0%	0.001	0.001	0.000	0.000	0.002	0.002	0.992	PASS
1.0%	0.011	0.011	0.003	0.003	0.007	0.004	0.923	PASS
2.0%	~0.10	~0.10	0.022	0.022	0.037	0.021	0.78	tap FAIL
5.0%	~0.79	~0.79	0.008	0.008	0.017	0.013	0.85	tap FAIL

(thermo and nociception are held-out: these classes were NOT in the probe set.)

Pass at biological noise floor (1% rate noise — typical Ca²⁺ imaging with trial averaging). Thermo and nociception held out across the entire pipeline, proving the recovered W generalizes — it's not curve-fit to the probe behaviors.

Protocol

For each neuron i ∈ [1..N], deliver sustained tonic stimulus I_ext(t) = a · e_i for t ∈ [0, 300 ms] at amplitudes a ∈ {0.4, 0.8, 1.5}. Record population activity. Sample the steady-state window t ∈ [150, 280 ms].

K = 3N conditions = 900 for C. elegans (N=300).

Repeat each condition n_reps times (independent noise realizations). Average across reps + within-window timesteps → effective noise reduction of 1/√(n_reps · n_ss_samples) ≈ 1/√(10 · 65) ≈ 1/25.

Math

At steady state of the rate model dr/dt = (-r + tanh(g·h))/τ: r_j = tanh(g · h_j) h_j = Σᵢ W_ij r_i + I_gap_j(r) + I_ext_j

Invert tanh: z_j := arctanh(r_j)/g − I_gap_j − I_ext_j = Σᵢ W_ij r_i

For each postsynaptic neuron j with known support supp_j (from a separate structural EM scan), assemble (X, z_j) over K conditions and solve ŵ_j = argmin_{w ≥ 0} ‖z_j − X[:, supp_j] · w‖² + λ‖w‖²

with non-negativity (chemical weights are nonnegative) and small ridge λ=10⁻³.

The fit is over |supp_j| ≈ 12 unknowns with K ≈ 900 equations — massive over-determination, hence robustness to noise.

Why per-neuron beats per-class

Per-class probes (run_scan_inverse_v2.py) only excite circuits downstream of sensors of that class. Probing only tap+chem+amph left thermo and nociception circuits silent → no information about their incoming weights. The recovered W worked for tap+chem but failed on held-out thermo (div = 0.10 even at zero noise).

Per-neuron probes give every neuron a chance to fire as a regressor independently, so every column of W gets informative data, regardless of behavior class.

Why tonic beats pulsed

The pulsed protocol's regression target is z = arctanh((r_{t+1} − r_t)·τ/dt + r_t)/g − I_gap − I_ext where r_{t+1} − r_t is a numerical difference. Even small rate noise inflates into huge z noise via the (τ/dt) amplification.

At steady state, r_{t+1} ≈ r_t so (r_{t+1} − r_t)·τ/dt ≈ 0, and z ≈ arctanh(r)/g − I_gap − I_ext. No numerical-difference noise. The arctanh is still steep near r=1, but a saturation filter (drop samples where r > 0.85) keeps the regression in the tame region.

Why structural support known is essential

Without support, fitting W[:, j] ∈ ℝᴺ requires K > N = 300 equations per neuron and the ill-conditioning produces small-weight error swamping (the v1 problem). With support known, |supp_j| ≈ 12, and K = 900 conditions gives 75:1 over-determination. The single most important lever in this problem is the binary connectome from EM, used as a structural prior on the weight fit.

Biological feasibility

The protocol requires:

Single-cell-specific optogenetic stimulation. State of the art 2025: feasible in C. elegans (Bates & Bargmann 2010, single-cell ChR2 driver lines exist for all 302 neurons via the WormBase reagent collection).
Ca²⁺ imaging at 5–10 ms resolution with ~1% rate noise after averaging. State of the art 2025: GCaMP6/7/8 + light-sheet microscopy hits this.
Repeated trials (n_reps ≈ 10) per stimulation site. Adds ~1 day of experimental time for the full N×3×10 = 9000 trials at ~30 s each. Tractable.

This is not blue-sky; the components exist.

n_reps sweep at higher noise

Noise	n_reps=10	n_reps=30	n_reps=100
2%	tap=0.103 FAIL	tap=0.055 FAIL	tap=0.045 PASS
5%	tap=0.79 FAIL	tap=0.81 FAIL	tap=0.78 FAIL

2% noise is recoverable with 10× more averaging (n_reps=100 → 900×100 = 90k trials, ~25 hours at 1s/trial — still tractable).

5% noise plateaus at div_tap ≈ 0.78 regardless of n_reps — that's a bias, not variance. Cause: when adding gaussian noise to rates near saturation (r ≈ 1) and clipping to [0, 1], the clipping is asymmetric and shifts the mean downward. The biased X breaks the regression for small weights. Fixing 5% noise would require either (a) higher gain so saturation regime is wider, (b) corrected clipping (truncated-gaussian fit), or (c) keeping rates well below 1 throughout the protocol. Open.

Remaining open

5% rate noise is a hard wall under current clipping. Modify protocol to keep rates < 0.7 throughout (lower amplitudes + higher gain).
Protocol assumes the structural EM scan is exact. In practice EM has its own ~5% false-positive/false-negative rate on synapse calls. The fit's sensitivity to support errors is unmeasured.
Extension to human-scale: 86×10⁹ neurons × 3 amps × 10 reps = 2.6×10¹² trials. Not feasible serially — needs massive parallel optogenetic probing and/or population-level disambiguation strategies. Open.

Files

simulation/run_scan_inverse_v2.py — tonic SS first attempt (zero-noise PASS, fails on held-out classes)
simulation/run_scan_inverse_v2_robust.py — robustness check that revealed failures (a) held-out classes, (b) noise sensitivity
simulation/run_scan_inverse_v3.py — per-neuron probes added (held-out classes fixed; noise wall at 0.02)
simulation/run_scan_inverse_v4.py — noise averaging added (PASS at 1% noise on all 6 stimuli including held-out)
simulation/run_scan_inverse_nreps_sweep.py — n_reps requirements at 2%/5% noise

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Direction 1 — Scan Inverse Problem (resolved at biological noise)

Statement

Result (`run_scan_inverse_v4.py`)

Protocol

Math

Why per-neuron beats per-class

Why tonic beats pulsed

Why structural support known is essential

Biological feasibility

n_reps sweep at higher noise

Remaining open

Files

FilesExpand file tree

direction1_scan_inverse_solved.md

Latest commit

History

direction1_scan_inverse_solved.md

File metadata and controls

Direction 1 — Scan Inverse Problem (resolved at biological noise)

Statement

Result (run_scan_inverse_v4.py)

Protocol

Math

Why per-neuron beats per-class

Why tonic beats pulsed

Why structural support known is essential

Biological feasibility

n_reps sweep at higher noise

Remaining open

Files

Result (`run_scan_inverse_v4.py`)