Last updated: 2026-02-12 Previous version: docs/mneme_project_plan_v1_original.md
Mneme is an exploratory research system designed to detect field-like, emergent memory structures embedded in biological systems. Starting with planarian regeneration and bioelectric data, it seeks to uncover attractor states, regulatory logic, and latent architectures not captured by sequence-based models alone.
Biological systems, particularly those capable of regeneration, may encode memory not just genetically but as persistent spatial or field-based attractors. Mneme aims to identify, model, and interpret these attractor dynamics using machine learning, field theory, and topological methods.
- No prior overfitting to clean systems (Drosophila bias avoided)
- Sensitivity to emergent, nonlinear, and self-correcting behavior
- Respect for biological memory as distributed and dynamic
- Simulation-first: generate rich data through computational biology before requiring wet-lab access
- Interdisciplinary: bridge topology, dynamical systems, and bioelectricity
Objective: Build and validate a modular analysis pipeline on synthetic field data inspired by planarian voltage maps and regenerative logic.
| Task | Status | Implementation |
|---|---|---|
| Synthetic 2D field data with noise and attractors | Done | SyntheticFieldGenerator, generate_planarian_bioelectric_sequence() |
| IFT / field reconstruction | Done | SparseGPReconstructor (default, O(nm^2)), DenseIFTReconstructor, Neural Fields |
| Dimensionality reduction / latent spaces | Done | Convolutional VAE (FieldAutoencoder) with training, encoding, interpolation |
| TDA for persistent structures | Done | Full GUDHI integration: cubical, Rips, Alpha complexes; persistence diagrams/landscapes/images |
| Symbolic regression for field rules | Done | PySR integration with discover_field_dynamics() for automatic PDE discovery |
| Validate internal coherence | Partial | Pipeline runs end-to-end; formal cross-method validation report still needed |
Key validation result: PhysioNet ECG heart rate variability analysis yielded lambda_1 = +0.12/s, D_KY = 2.35, predictability horizon ~8s -- matching published literature on cardiac chaos.
Objective: Apply Mneme to real and simulated biological data, starting with BETSE bioelectric tissue simulations from published Levin Lab papers, then expanding to self-collected ECG data and experimental datasets.
BETSE (BioElectric Tissue Simulation Engine) is a 2D bioelectric tissue simulator used by Levin Lab. It produces spatiotemporal voltage fields, ion concentrations, and current densities -- exactly the kind of data Mneme is built to analyze.
Data pipeline: BETSE CSV exports --> betse_loader.py (interpolation to regular grid) --> Mneme Field objects --> full analysis pipeline
| Task | Status | Notes |
|---|---|---|
| Install and validate BETSE | Done | Ran betse try locally; confirmed CSV output format |
| Build BETSE data loader | Done | betse_loader.py: handles irregular cell data, multi-frame time series, metadata |
| Build BETSE analysis script | Done | scripts/analyze_betse.py: topology, attractors, recurrence, Lyapunov |
| Run paper simulation configs on AWS | Done | c6i.xlarge on-demand instance; 4 configs completed (attractors x2, physiology, patterns) |
| Analyze attractor configs (2016 Frontiers) | Done | sim_1 and sim_2 analyzed: rank-2 PCA, distinct Wasserstein profiles, VAE basin separation |
| Analyze pattern configs (2018 PBMB) | Done | 824-cell elliptical tissue, 5-mode PCA (vs rank-2), Wasserstein drift 84.1, symbolic regression |
| Analyze physiology configs (2018 PBMB) | Done | 7-cell cluster, topology-preserving 56 mV voltage excursion |
| Cross-simulation comparison | Done | Full comparison in docs/BETSE_ANALYSIS_REPORT.md: topology, Wasserstein, PCA, VAE, recurrence |
| Parameter sweep experiments | Not started | Vary gap junction conductance, ion channel expression; track attractor bifurcations |
AWS compute details:
- Instance: c6i.xlarge (4 vCPU, 8 GB), on-demand ~$0.17/hr (switched from spot after termination)
- 4 configs completed: attractors_1 (sim_1+sim_2), physiology, patterns
- Patterns re-run with Vmem CSV export enabled (26 min)
- Total AWS cost: ~$2-3
Hardware: 2x AD8232 Single-Lead ECG sensor modules (acquired).
| Task | Status | Notes |
|---|---|---|
| Design ECG recording protocol | Not started | Duration, sampling rate, electrode placement, activity states |
| Build Arduino/RPi data acquisition script | Not started | Serial read from AD8232, timestamp, save to CSV |
| Build ECG data loader for Mneme | Not started | Adapt PhysioNet loader patterns to raw AD8232 output |
| Baseline recordings (resting, breathing patterns) | Not started | |
| Compare self-data to PhysioNet HRV analysis | Not started | Validate against known lambda_1 ~ +0.12/s result |
| Longitudinal tracking | Not started | Daily recordings to detect attractor drift over time |
| Source | Data Type | Status |
|---|---|---|
| PhysioNet MIT-BIH | ECG / HRV time series | Done -- validated Lyapunov pipeline |
| Planform Database | Planarian phenotype images | Not started |
| ModelDB | Computational neuroscience models | Not started |
| Figshare / Dryad | Supplementary data from Levin Lab papers | Not started |
| SmedGD | Planarian gene expression | Not started |
- Reproducible analysis of 4 BETSE paper simulation configs (attractors x2, physiology, patterns)
- Topological comparison across simulation conditions (attractors vs patterns vs physiology)
- Self-collected ECG data with Lyapunov/attractor analysis
- Cross-method validation report: do topology, Lyapunov, recurrence, and VAE latent space tell consistent stories?
- Cleaned dataset repository with provenance metadata
- Per-paper validation document (see Phase 2.5 below)
Objective: Publish Mneme as a peer-reviewed research tool in the Journal of Open Source Software. JOSS reviews the software itself -- code quality, tests, documentation, and demonstrated utility. This is distinct from the Phase 3 whitepaper, which is about the scientific findings.
Why JOSS, why now:
- JOSS is specifically for research software. The review process evaluates your code, not just your claims.
- A DOI makes the tool citable. Labs won't adopt software they can't cite.
- Peer review of the software catches bugs and design issues before you build Phase 3 on top of it.
- It's the credibility bridge between "GitHub repo" and "research tool" -- and the thing that makes outreach to Levin Lab or other groups a professional conversation rather than a cold pitch.
JOSS requirements vs current status:
| Requirement | Status | Gap |
|---|---|---|
| Open source repo | Done | None |
| Documentation sufficient for new users | Done | MkDocs site with API reference from docstrings |
| Tests that can be run | Done (113 tests, 38.4% coverage) | Push coverage toward 70%+ |
| Community guidelines (CONTRIBUTING, CODE_OF_CONDUCT) | Done | Contributor Covenant v2.1 |
| Statement of need (why this software matters) | Not written | Draft from validation results |
| Example usage | Partial | Scripts exist; need a clean tutorial/notebook |
| Validation on real data | Done | BETSE (4 configs analyzed) + PhysioNet ECG = multi-system validation |
Tasks:
| Task | Status | Notes |
|---|---|---|
| Write per-paper validation docs | Not started | For each BETSE config: paper citation, their values, your values, percent error, caveats |
| Push test coverage to 70%+ | Not started | Focus on BETSE loader, pipeline orchestration, edge cases |
| Generate API reference docs | Done | MkDocs + mkdocstrings; mkdocs build generates site/ |
| Add CONTRIBUTING.md and CODE_OF_CONDUCT.md | Done | Contributor Covenant v2.1; CONTRIBUTING.md updated |
| Create a clean Jupyter notebook walkthrough | Not started | End-to-end: load data, reconstruct field, analyze, visualize |
| Draft JOSS paper (1-2 pages) | Not started | Statement of need, summary, validation highlights, references |
| Submit to JOSS | Not started | After all above; expect 1-3 month review cycle |
| Respond to reviewer feedback | Not started | JOSS reviewers file GitHub issues; fix and respond |
JOSS paper outline (short -- typically 1,000 words max):
- Summary -- What Mneme does in two sentences
- Statement of need -- No existing toolkit combines field reconstruction, Lyapunov analysis, TDA, and symbolic regression for bioelectric data. Researchers using BETSE or similar simulators currently write ad-hoc analysis scripts.
- Key features -- Sparse GP reconstruction, Wolf algorithm Lyapunov spectrum, GUDHI TDA integration, PySR equation discovery, BETSE data loader
- Validation -- PhysioNet ECG (lambda_1 matches literature), BETSE paper configs (N papers, quantitative comparison)
- References -- BETSE, GUDHI, PySR, Wolf algorithm, PhysioNet, the 4 papers you validated against
Post-publication outreach (after JOSS acceptance or submission):
- Contact Levin Lab with: "I built a validated analysis toolkit for BETSE output; here's the JOSS paper; would any of your students/postdocs find it useful?"
- Post in BETSE GitHub discussions
- Cross-post to relevant communities (computational biology, bioelectric signaling)
- The DOI and peer review convert "random person's GitHub repo" into "published research software"
Compute-for-collaboration (ongoing after JOSS):
Troll arXiv and journals for papers with open computational questions Mneme can answer. Target:
- Papers citing BETSE/PLIMBO that lack topology or attractor analysis
- Bioelectric patterning studies that describe "stability" or "robustness" qualitatively but don't quantify (Lyapunov spectrum answers this)
- Regeneration papers with spatiotemporal voltage data and uncharacterized pattern memory
- Gap junction perturbation studies observing bistability without attractor landscape characterization
The pitch: "I ran your published simulation configs through a peer-reviewed analysis toolkit. Here are the topological signatures and Lyapunov spectra. Want to co-author a follow-up?" This converts compute time into co-authorships and builds the collaborative network needed for Phase 3 theory work.
Objective: Formalize insights into a model of distributed memory encoding via fields in biological tissue.
| Task | Status | Dependencies |
|---|---|---|
| Identify recurring attractor geometries across datasets | Not started | Phase 2 results |
| Map topological signatures to biological function | Not started | BETSE analysis + literature review |
| Compare field attractors to Hopfield networks | Not started | Phase 2 cross-method validation |
| Compare to morphogenetic field models (Turing, positional information) | Not started | |
| Propose theoretical framework for field memory dynamics | Not started | All above |
| Draft whitepaper or preprint | Not started | Framework + results |
| Design follow-up experiments (perturbation predictions) | Not started | Framework |
Potential extensions:
- Cross-organism validation (zebrafish, axolotl bioelectric data if available)
- Inclusion of behavior/fear response patterning in planaria
- Field inheritance modeling across generations
- Collaboration with experimental labs for perturbation testing
| Component | Status | Details |
|---|---|---|
Python package (pyproject.toml) |
Done | PEP 621, editable install, optional deps for TDA/PySR/GPU |
| CLI | Done | mneme analyze, mneme info via Click |
| Test suite | Done | pytest with unit + integration tests, coverage reporting |
| CI/CD | Done | GitHub Actions: pytest, mypy, flake8, coverage |
| Code quality | Done | Black, isort, flake8, mypy, pre-commit hooks |
| Documentation | Done | README, CLAUDE.md, CHANGELOG, course, MkDocs API reference, CODE_OF_CONDUCT |
| Cloud compute | Active | AWS EC2 spot instances for BETSE simulations |
| Data loaders | Active | BETSE loader done; ECG loader needs work; PhysioNet validated |
Retrieve AWS simulation results-- Done. 4 configs retrieved and analyzed.Run full Mneme analysis on each config-- Done. PCA, Wasserstein, Lyapunov, recurrence, VAE, symbolic regression.Cross-simulation comparison-- Done. Seedocs/BETSE_ANALYSIS_REPORT.md.- ECG data acquisition setup -- Get the AD8232 sensors producing clean data
- Cross-method validation -- Formal report on whether topology, Lyapunov, recurrence, and latent space analysis agree
- Push test coverage to 70%+ -- Currently 38.4% (113 tests); focus on pipeline, visualization, edge cases
- Per-paper validation docs -- For each BETSE config: compare Mneme results to published values
These are the scientific questions driving the current work:
-
Do BETSE attractor simulations produce topologically distinct persistence diagrams compared to pattern/physiology simulations? If the topology of the voltage field differs between conditions designed to show attractor behavior vs. pattern formation, that's evidence that TDA can distinguish biologically meaningful states.
-
Can we discover governing PDEs from BETSE output using symbolic regression? If
discover_field_dynamics()recovers equations resembling known bioelectric models (gap junction diffusion, Nernst potentials), that validates the approach for use on experimental data where the equations are unknown. -
Do VAE latent spaces organize biologically meaningful states? When we encode BETSE voltage field time series, do nearby points in latent space correspond to similar biological conditions? Do trajectories through latent space reflect known developmental/regenerative paths?
-
Is cardiac HRV chaos (from ECG) topologically similar to bioelectric tissue chaos (from BETSE)? Both are bioelectric systems. If their attractor geometries share topological features, that suggests something universal about bioelectric memory encoding.
-
Can we predict perturbation outcomes? Given a trained model of the attractor landscape, can we predict what happens when a gap junction blocker is applied (simulated in BETSE)?
| Period | Focus |
|---|---|
| Feb 2026 | Retrieve and analyze BETSE simulation results; write per-paper validation docs; begin ECG setup |
| Mar 2026 | Cross-simulation comparison report; first self-ECG recordings; push test coverage to 70%+ |
| Apr 2026 | Parameter sweep experiments in BETSE; ECG longitudinal tracking; generate API docs; create tutorial notebook |
| May 2026 | Cross-method validation report; draft JOSS paper; add CONTRIBUTING/CODE_OF_CONDUCT |
| Jun 2026 | Submit JOSS paper; begin Phase 3 interpretation while review is pending |
| Jul-Aug 2026 | Respond to JOSS reviewer feedback; theory development; outreach to Levin Lab / BETSE community |
| Sep-Oct 2026 | JOSS publication (expected); draft science whitepaper |
Note: The JOSS paper (Phase 2.5) and the science whitepaper (Phase 3) are different documents with different audiences. JOSS says "here is a tool, it works, it's tested." The whitepaper says "here is what we found using the tool." JOSS should go first -- it establishes the tool's credibility before you make claims with it.