 biota

Distributed Flow-Lenia discovery platform. MAP-Elites search, behavioral archive, ecosystem simulation, chemical signal field.

biota runs MAP-Elites searches across a Ray cluster, dispatching batches of Flow-Lenia simulations as vectorized PyTorch forward passes to stateless GPU workers and producing a structured behavioral archive of distinct artificial life-forms. The full experimental loop is: configure behavioral descriptors, search the parameter space, explore the archive, and seed ecosystem simulations from selected creatures. An optional signal field adds a shared chemical medium to both search and ecosystem runs: each creature emits into and senses from a 16-channel field, enabling signal-mediated interaction alongside mass dynamics.

How it works

Flow-Lenia is a continuous cellular automaton where matter is conserved by construction. Mass conservation prevents the explode/collapse failure modes that dominate vanilla Lenia, producing stable solitons across a much wider range of parameters.

MAP-Elites searches that parameter space for behavioral diversity rather than a single optimum. Instead of one best creature, it fills a grid where each cell holds the highest-quality creature with a particular phenotypic fingerprint: an atlas of qualitatively distinct life-forms.

CVT-MAP-Elites archive: calibration survivors (grey dots) and occupied Voronoi cells (magma color = quality)

The driver owns the archive and the search loop. Each Ray task evaluates B creatures as a single (B, H, W) vectorized forward pass. One task fills one GPU. Workers are stateless; nothing persistent lives on the cluster between tasks.

Search loop and Ray dispatch

CVT-MAP-Elites two-phase algorithm: calibration fits k centroids, search loop inserts via nearest-centroid lookup
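
The insertion step named in the caption above can be sketched in a few lines. This is a minimal illustration, not biota's implementation: the function names, the dict-based archive, and the toy centroids are all hypothetical, and calibration (fitting the k centroids) is assumed to have already happened.

```python
import math

def nearest_centroid(descriptor, centroids):
    """Return the index of the centroid closest to `descriptor` (Euclidean)."""
    return min(range(len(centroids)), key=lambda i: math.dist(descriptor, centroids[i]))

def try_insert(archive, centroids, descriptor, quality, params):
    """Place a creature in the Voronoi cell of its nearest centroid.

    The cell is overwritten only when it is empty or the newcomer has
    strictly higher quality. Returns True when the archive changed.
    """
    cell = nearest_centroid(descriptor, centroids)
    incumbent = archive.get(cell)
    if incumbent is None or quality > incumbent["quality"]:
        archive[cell] = {"quality": quality, "descriptor": descriptor, "params": params}
        return True
    return False

# Toy archive with two centroids in a three-axis descriptor space (hypothetical values).
centroids = [(0.1, 0.1, 0.1), (0.9, 0.9, 0.9)]
archive = {}
try_insert(archive, centroids, (0.2, 0.1, 0.0), 0.70, "creature-a")
try_insert(archive, centroids, (0.0, 0.2, 0.1), 0.85, "creature-b")  # same cell, higher quality
try_insert(archive, centroids, (0.8, 1.0, 0.9), 0.40, "creature-c")
print({cell: entry["params"] for cell, entry in archive.items()})
```

Keeping one elite per cell is what turns the search into an atlas: a mediocre creature in an empty region is kept, while a good creature in a crowded region must beat the incumbent.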

--workers N controls how many batches are in flight simultaneously. --workers 1 is synchronous MAP-Elites (maximally fresh archive). Higher values trade freshness for throughput on multi-node setups.
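
The in-flight limit can be pictured with a small dispatch loop. The sketch below is an assumption-laden stand-in: a standard-library thread pool replaces the Ray cluster, and `evaluate_batch` and `run_search` are hypothetical names, not biota functions.

```python
from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait

def evaluate_batch(batch_id):
    """Hypothetical stand-in for one rollout batch; returns (batch_id, result)."""
    return batch_id, batch_id * batch_id

def run_search(n_batches, workers):
    """Keep at most `workers` batches in flight; collect results as they complete."""
    results = []
    next_id = 0
    with ThreadPoolExecutor(max_workers=workers) as pool:
        in_flight = set()
        while next_id < n_batches or in_flight:
            # Top up until the in-flight limit is reached.
            while next_id < n_batches and len(in_flight) < workers:
                in_flight.add(pool.submit(evaluate_batch, next_id))
                next_id += 1
            # Block until at least one batch finishes, then fold it in.
            done, in_flight = wait(in_flight, return_when=FIRST_COMPLETED)
            results.extend(future.result() for future in done)
    return results

print(run_search(5, workers=1))  # one batch at a time, results arrive in submission order
```

With `workers=1` every batch is folded into the archive before the next one is generated, which is the "maximally fresh" case. With more workers, batches submitted together are all generated from the same older archive state.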

Ecosystem simulation

Once the archive is populated, biota ecosystem takes specific archive cells and runs them on a shared grid to see how creatures interact. A homogeneous run spawns N copies of one species. A heterogeneous run mixes two or more species, each with its own full parameter set (kernel radii, growth windows, weights), using species-indexed LocalizedFlowLenia: per-cell species ownership tracks which lineage owns the local mass, blends growth fields by ownership, and advects with the flow.

After the simulation, a suite of spatial observables is computed from the captured snapshots -- no re-simulation required. For heterogeneous runs: patch count per species over time, interface area per species pair, center-of-mass distance per pair, and spatial entropy per species. For homogeneous runs: patch count over time, spatial entropy, and patch size distribution. Interaction coefficients are gated to snapshot windows where species actually co-occur, so they measure contact dynamics rather than spatial separation. A temporal outcome classifier assigns per-species labeled windows (coexistence, exclusion, merger, fragmentation for heterogeneous; stable_isolation, full_merger, partial_clustering, cannibalism, fragmentation for homogeneous) and derives a dominant run-level label shown as a badge in the viewer.

Ecosystem dispatch is Ray-correct: each experiment is a self-contained payload. The driver loads creatures from its local archive and ships them with the config; workers simulate and render to bytes; the driver materializes outputs locally. No shared filesystem is assumed at any step, so experiments run correctly on real multi-node clusters without NFS or rsync setup.

Ecosystem dispatch: driver loads creatures, workers simulate and render, driver materializes

Ecosystem outcome taxonomy: coexistence, exclusion, merger, fragmentation, stable isolation, cannibalism

Behavioral descriptors

The archive grid has three axes, each a scalar measured empirically from the rollout. Choose any three from the built-in library of eighteen:

| Descriptor | What it captures |
| --- | --- |
| velocity | Mean COM displacement per step over the trailing 50 steps |
| gyradius | Mass-weighted RMS distance from the center of mass |
| spectral_entropy | Shannon entropy of the radially-averaged FFT spectrum |
| oscillation | Variance of bounding-box fraction over the trace tail |
| compactness | Mass inside bounding box / total mass at the final step |
| mass_asymmetry | Directional bias of motion: straight movers vs orbiters |
| png_compressibility | PNG compressed/uncompressed ratio of the final state |
| rotational_symmetry | Angular variance of radial mass profile |
| persistence_score | Max descriptor drift across the trace tail |
| displacement_ratio | Total displacement / total path length (0 = orbiter, 1 = glider) |
| angular_velocity | Mean absolute angular speed of COM motion |
| growth_gradient | Mass-weighted mean spatial gradient magnitude (internal edge density) |
| morphological_instability | Variance of gyradius over the trace tail (shape stability) |
| activity | Mean absolute gyradius change per step (internal work rate) |
| spatial_entropy | Shannon entropy of coarse spatial mass distribution |
| signal_field_variance | Spatial variance of the total signal field at end of rollout. High = signal concentrated near creature body; low = diffused. Signal-only. |
| signal_mass_ratio | Final signal mass / initial signal mass. Measures chemical accumulation relative to the background field. Signal-only. |
| dominant_channel_fraction | Fraction of signal mass in the dominant channel. High = chemical specialist; low ≈ 1/C = generalist. Signal-only. |

With 18 built-ins (15 general + 3 signal-only) there are C(18,3) = 816 possible archive configurations. Signal-only descriptors require --signal-field and work best combined with at least one morphological axis -- pure signal-only combos require larger budgets since creatures cluster more tightly in signal space. Supply custom descriptors via --descriptor-module. The archive viewer renders all three axes with histogram and correlation panels.
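
As a worked example of how one descriptor reduces a rollout to a scalar, the sketch below computes displacement_ratio from a list of center-of-mass positions, following the table's definition (total displacement over total path length). It is illustrative only: the function name and trajectories are made up, torus wrap-around is ignored, and treating a stationary creature as 0 is an assumption. The last line checks the configuration count quoted above.

```python
import math

def displacement_ratio(com_positions):
    """Net displacement divided by total path length of a COM trajectory."""
    path = sum(math.dist(a, b) for a, b in zip(com_positions, com_positions[1:]))
    if path == 0.0:
        return 0.0  # stationary creature: treated as 0 in this sketch (assumption)
    return math.dist(com_positions[0], com_positions[-1]) / path

# A glider moves in a straight line; an orbiter returns to where it started.
glider = [(float(t), 0.0) for t in range(10)]
orbiter = [(math.cos(2 * math.pi * t / 8), math.sin(2 * math.pi * t / 8)) for t in range(9)]

print(displacement_ratio(glider))   # 1.0
print(displacement_ratio(orbiter))  # very close to 0
print(math.comb(18, 3))             # 816
```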

Quickstart

git clone https://github.com/rkv0id/biota
cd biota
uv sync
uv run biota search --preset dev --budget 50

Runs 50 rollouts synchronously on CPU. Then build the viewer:

uv run python scripts/build_index.py --output-dir archive
open archive/index.html

Every creature is rendered as an animated magma-colorized thumbnail with hover tooltips, lineage highlighting, and a click-through modal with full parameters. No server required, fully self-contained HTML.

Ecosystem simulation

Once you have an archive, define one or more ecosystem experiments in a YAML config and run them:

biota ecosystem --config experiments.yaml --device cuda

A minimal config defines what to spawn, on what grid, for how long:

experiments:
  - name: dense-population
    grid: 512                     # 512 for square, [192, 512] for rectangular
    steps: 5000
    snapshot_every: 35
    border: torus                 # 'torus' or 'wall'
    output_format: gif            # 'gif' or 'frames'
    spawn:
      min_dist: 55                # min pixel distance between spawn centers
      patch: 32                   # initial random patch side length
      seed: 0
    sources:
      - run: 20260413-134355-hazy-creek
        cell: [5, 23, 13]
        n: 8

A heterogeneous experiment lists multiple sources. Each source contributes its own creature with its own full parameter set; species ownership is tracked per-cell and growth fields blend by ownership weight:

experiments:
  - name: predator-prey
    grid: 512
    steps: 8000
    snapshot_every: 50
    border: torus
    output_format: gif
    spawn:
      min_dist: 60
      patch: 28              # default patch size for all sources
      seed: 42
    sources:
      - run: 20260413-134355-hazy-creek
        cell: [5, 23, 13]
        n: 6
      - archive_dir: archive-secondary    # optional per-source override
        run: 20260414-091122-still-pond
        cell: [22, 8, 11]
        n: 6
        patch: 48            # this species spawns at a larger scale

Each source can override the experiment's spawn.patch with its own value. This is useful when species in a heterogeneous run have different natural scales, for example a small fast glider mixed with a large dense colony. The Poisson disk margin uses the largest patch in the run so creatures still fit safely inside the wall border. When a source omits patch, it falls back to the experiment's spawn.patch.
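
The fallback rule is simple enough to state as code. The sketch below works on a plain dict shaped like the predator-prey YAML above; `resolve_patches` is a hypothetical helper, not part of biota, and how the margin is derived from the largest patch is left out.

```python
def resolve_patches(experiment):
    """Return (per-source patch sizes, largest patch) for one experiment dict.

    A source without its own `patch` inherits the experiment's spawn.patch.
    """
    default = experiment["spawn"]["patch"]
    patches = [source.get("patch", default) for source in experiment["sources"]]
    return patches, max(patches)

experiment = {
    "spawn": {"min_dist": 60, "patch": 28, "seed": 42},
    "sources": [
        {"run": "20260413-134355-hazy-creek", "cell": [5, 23, 13], "n": 6},
        {"run": "20260414-091122-still-pond", "cell": [22, 8, 11], "n": 6, "patch": 48},
    ],
}

patches, largest = resolve_patches(experiment)
print(patches, largest)  # [28, 48] 48
```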

Multiple experiments in a single file run sequentially. After running, rebuild the index to include ecosystem results in the atlas:

python scripts/build_index.py \
    --output-dir archive \
    --ecosystem-dir ecosystem \
    --publish

Signal field

The signal field is an optional chemical communication layer that operates on top of the mass dynamics. Enable it at search time with --signal-field:

biota search --preset standard --budget 2000 \
    --device cuda --batch-size 64 --workers 3 \
    --signal-field

This adds the following signal parameters to each creature's searchable parameter space:

| Parameter | Shape | Range | Description |
| --- | --- | --- | --- |
| emission_vector | (16,) | [0, 1] | How emitted signal is distributed across the 16 channels |
| receptor_profile | (16,) | [-1, 1] | Channel weights for sensing. Negative values produce inhibitory (aversive) responses |
| emission_rate | scalar | [0.0001, 0.01] | Base signal emission rate per step. Modulated by beta_modulation and G_pos (positive growth activity) |
| decay_rates | (16,) | [0, 0.9] | Per-channel decay rate applied each step. Creatures with low decay on key channels maintain longer-range chemical gradients |
| alpha_coupling | scalar | [-1, 1] | Reception-to-growth coupling. Positive = chemotaxis (grow into favorable signal, enables cross-species predation). Negative = chemorepulsion. Zero = no coupling |
| beta_modulation | scalar | [-1, 1] | Adaptive emission. Positive = quorum sensing (amplify emission when receiving signal). Negative = feedback inhibition (suppress emission). Zero = static rate |
| signal_kernel_r | scalar | [0.2, 1.0] | Signal kernel radius scale |
| signal_kernel_a/b/w | (3,) each | same as mass kernels | Ring function parameters for signal diffusion |

Signal field mechanics: per-step mass and signal field update cycle

Inter-species signal coupling: dot products between emission vectors and receptor profiles determine chemotaxis, chemorepulsion, pursuit, or blind interaction

Physics. At each step:

  1. Convolve mass to get G(H,W).
  2. Convolve the signal field.
  3. Compute reception: dot(convolved_signal, receptor_profile).
  4. Apply alpha_coupling: G *= (1 + alpha * reception).clamp(min=0). Positive alpha is chemotaxis (grow into favorable signal, including other species' territory, enabling cross-species predation); negative alpha is chemorepulsion.
  5. Modulate the emission rate via beta_modulation: rate_eff = rate * (1 + beta * mean(reception)), clipped to [0, 0.1]. Positive beta is quorum sensing; negative beta is feedback inhibition.
  6. Emit G_pos * rate_eff * emission_vector, draining mass into the signal field.
  7. Reintegrate mass.
  8. Decay signal at decay_rates.

Note: signal mass decays each step by design, so total mass+signal is not conserved. Creature mass alone is conserved modulo emission (which transfers mass into the signal field).
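
Steps 3 to 6 of the update can be traced by hand for a single grid cell. The sketch below uses plain Python scalars and two channels instead of (H, W, 16) tensors, so it shows the arithmetic only. The helper names are hypothetical, and reading G_pos as max(G, 0) is an assumption based on its description as positive growth activity.

```python
def clip(x, lo, hi):
    """Clamp x to the closed interval [lo, hi]."""
    return max(lo, min(hi, x))

def reception(convolved_signal, receptor_profile):
    """Step 3: dot product of the convolved signal with the receptor profile."""
    return sum(s * r for s, r in zip(convolved_signal, receptor_profile))

def coupled_growth(G, alpha, rec):
    """Step 4: scale growth by (1 + alpha * reception), with the factor floored at zero."""
    return G * max(1.0 + alpha * rec, 0.0)

def effective_rate(rate, beta, mean_reception):
    """Step 5: adaptive emission rate, clipped to [0, 0.1]."""
    return clip(rate * (1.0 + beta * mean_reception), 0.0, 0.1)

def emission(G, rate_eff, emission_vector):
    """Step 6: per-channel emission; G_pos = max(G, 0) is an assumption of this sketch."""
    g_pos = max(G, 0.0)
    return [g_pos * rate_eff * e for e in emission_vector]

# One cell, two channels (toy values).
rec = reception([0.5, 0.25], [1.0, -1.0])     # attracted to channel 0, averse to channel 1
G = coupled_growth(0.5, alpha=1.0, rec=rec)   # chemotaxis boosts growth where reception > 0
rate_eff = effective_rate(0.01, beta=1.0, mean_reception=rec)
print(rec, G, rate_eff, emission(G, rate_eff, [1.0, 0.0]))
```

The zero floor in step 4 means a strongly aversive signal can suppress growth entirely but never flip its sign.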

Archive compatibility. An archive produced with --signal-field is tagged "signal_field": true in manifest.json. Ecosystem runs detect this automatically from the creature params -- no YAML flag needed. If any source creature comes from a signal-enabled archive, all sources must too; mixing signal and non-signal archives raises an error at load time.

Quality metric. Three hard filters gate entry: (1) creature mass within [0.5, 2.0] × initial (signal field decays by design and is excluded from this check); (2) bounding-box fraction < 0.6 (not scattered); (3) descriptor drift across adjacent 50-step windows ≤ 0.2. Survivors are ranked by a two-component score, or a three-component score when the signal field is enabled:

non-signal:  q = 0.6 × compactness  +  0.4 × stability
signal:      q = 0.5 × compactness  +  0.3 × stability  +  0.2 × signal_activity

compactness      = min( compact(state_T/2),  compact(state_T) )
stability        = clip( 1 − drift / 0.2,  0,  1 )
signal_activity  = clip( final_signal_mass / initial_signal_mass,  0,  1 )

The two-point compactness term is the key addition. Almost all viable solitons score >0.95 at the final step, making a single-snapshot metric nearly constant across the population. Taking the minimum with the midpoint state catches creatures that peak early and gradually become diffuse, which is the type of instability that matters most for long ecosystem runs. The signal_activity term rewards creatures that maintain or replenish the background chemical field rather than letting it decay away, selecting for genuine emitters over chemically passive creatures. The initial signal field is low-frequency Gaussian noise (~0.01 amplitude) per channel. Signal searches auto-select a longer rollout (800 steps under the standard preset).
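
The filters and score above translate directly into code. The sketch below is a plain-Python restatement of those formulas, with hypothetical function names and scalar inputs assumed to be measured elsewhere. It is not the project's own scoring code.

```python
def clip(x, lo, hi):
    """Clamp x to the closed interval [lo, hi]."""
    return max(lo, min(hi, x))

def passes_filters(mass_ratio, bbox_fraction, drift):
    """Hard gates: mass within [0.5, 2.0] x initial, bbox fraction < 0.6, drift <= 0.2."""
    return 0.5 <= mass_ratio <= 2.0 and bbox_fraction < 0.6 and drift <= 0.2

def quality(compact_mid, compact_final, drift, signal_ratio=None):
    """Quality score; pass `signal_ratio` (final / initial signal mass) for signal runs."""
    compactness = min(compact_mid, compact_final)  # two-point term: worst of T/2 and T
    stability = clip(1.0 - drift / 0.2, 0.0, 1.0)
    if signal_ratio is None:
        return 0.6 * compactness + 0.4 * stability
    signal_activity = clip(signal_ratio, 0.0, 1.0)
    return 0.5 * compactness + 0.3 * stability + 0.2 * signal_activity

# A creature that was compact at the midpoint but has since become more diffuse.
print(round(quality(0.98, 0.80, drift=0.05), 2))                    # 0.78
print(round(quality(0.90, 0.90, drift=0.10, signal_ratio=0.5), 2))  # 0.7
```

In the first example the final-step compactness of 0.80 sets the score even though the midpoint value was 0.98: taking the minimum penalizes a creature for whichever of the two snapshots is worse.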

Relationship to related work

The closest published work is Plantec et al. 2025, Exploring Flow-Lenia Universes. Both efforts run multi-rule Flow-Lenia on a shared grid, but the framing and mechanism are different.

Plantec's setup is a universe search: random P-field initialization, random kernel sets, and per-cell parameter embeddings that drift under the dynamics. Speciation emerges in-simulation because the P field itself evolves and can carve out distinct regions over time. Only the growth-window vector h is localized; kernel parameters (R, r, a, b, w) are shared across the grid because spatial variation in those would break the FFT factorization the step relies on.

biota's heterogeneous mode is a curated gene-pool ecosystem. The creatures are not random; they come from a MAP-Elites archive built by the search loop, each one a behavioral variant validated by descriptors and quality. A heterogeneous run picks specific archive cells, treats each as a species, and gives every species its own complete parameter set: R, r, a, b, w, and the h vector. The cost is one FFT pass per species per step; the upshot is that the species in the run are interpretable, reproducible, and selectable from the same descriptor space the atlas exposes. Species ownership is tracked per cell as a simplex weight that advects with the mass; growth fields blend by ownership. There is no in-simulation speciation in the current implementation: the species count is fixed at the start of the run.
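
Ownership-weighted blending can be illustrated for a single cell. The sketch below assumes the blend is a convex combination of per-species growth values, with weights normalized onto the simplex. The function name and the handling of empty cells are hypothetical, and advection of ownership with the mass is not shown.

```python
def blend_growth(ownership, growth_per_species):
    """Blend per-species growth values at one cell by ownership weight."""
    total = sum(ownership)
    if total <= 0.0:
        return 0.0  # empty cell: no owner, no growth in this sketch (assumption)
    weights = [w / total for w in ownership]  # normalize onto the simplex
    return sum(w * g for w, g in zip(weights, growth_per_species))

# A cell that is 75% species A and 25% species B, where A grows and B shrinks.
print(blend_growth([0.75, 0.25], [0.4, -0.8]))
```

Because each species contributes its own growth value, each needs its own kernel convolution, which is where the one-FFT-pass-per-species cost comes from.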

The two approaches answer different questions. Plantec asks "what kinds of universes does Flow-Lenia generate from random initial conditions?". biota asks "what happens when these specific creatures, found by search, are placed together?". The first is open-ended exploration of universe space; the second is hypothesis-driven study of the archive. They are complementary, and the heterogeneous code path here borrows the per-cell ownership idea from Plantec while keeping each species' parameters intact.

Running on a cluster

# On every node
just cluster-install && source ~/.biota-runtime/bin/activate

# Head node
ray start --head --node-ip-address=<ip> --port=6379 --num-gpus=1

# Worker nodes
ray start --address=<ip>:6379 --num-gpus=1

# Search
biota search --ray-address <ip>:6379 \
    --preset standard --budget 500 \
    --device cuda --batch-size 64 --workers 3

# With a custom descriptor set
biota search --ray-address <ip>:6379 \
    --preset standard --budget 2000 \
    --device cuda --batch-size 64 --workers 3 \
    --descriptors oscillation,compactness,png_compressibility

Three presets: dev (64×64, 200 steps), standard (192×192, 300 steps), pretty (384×384, 500 steps). Signal searches (--signal-field) automatically override to 500/800/1200 steps respectively.
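
The preset table and the signal override can be summarized as a lookup. The values below come from the paragraph above; the dict layout and the `rollout_steps` helper are hypothetical and only illustrate the rule.

```python
# Grid side length and rollout steps per preset, with the signal-field override.
PRESETS = {
    "dev":      {"grid": 64,  "steps": 200, "signal_steps": 500},
    "standard": {"grid": 192, "steps": 300, "signal_steps": 800},
    "pretty":   {"grid": 384, "steps": 500, "signal_steps": 1200},
}

def rollout_steps(preset, signal_field=False):
    """Return the rollout length for a preset, honoring the signal-field override."""
    entry = PRESETS[preset]
    return entry["signal_steps"] if signal_field else entry["steps"]

print(rollout_steps("standard"))                     # 300
print(rollout_steps("standard", signal_field=True))  # 800
```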

CLI reference

biota search

| Flag | Default | Description |
| --- | --- | --- |
| --preset | standard | dev, standard, or pretty |
| --budget | 500 | Total rollouts |
| --random-phase | 200 | Uniform random rollouts before mutation |
| --centroids | 1024 | CVT archive capacity (number of Voronoi cells) |
| --batch-size | 1 | Rollouts per dispatch. 32-128 on cuda/mps |
| --workers | 1 | Concurrent batch dispatches. 1 = synchronous MAP-Elites |
| --device | cpu | cpu, mps, or cuda |
| --local-ray | off | Start a fresh local Ray instance |
| --ray-address | none | Attach to an existing Ray cluster |
| --base-seed | 0 | Reproducibility seed |
| --checkpoint-every | 100 | Checkpoint cadence in rollouts |
| --descriptors | velocity,gyradius,spectral_entropy | Three descriptor names, comma-separated |
| --descriptor-module | none | Path to a Python file defining custom Descriptor objects |
| --signal-field | off | Enable signal field parameters (emission, reception, kernel) in search. Produces a signal-enabled archive tagged in manifest.json. Signal and non-signal archives cannot be mixed in ecosystem runs |
| --output-dir | archive | Directory for run output |

biota ecosystem

Experiment-level parameters (grid, steps, sources, spawn) live in the YAML config. CLI flags carry only infrastructure.

| Flag | Description |
| --- | --- |
| --config | Path to a YAML file defining one or more experiments (required) |
| --archive-dir | Default archive directory; sources may override per-entry. Default: archive |
| --output-dir | Root directory for ecosystem run output. Default: ecosystem |
| --device | cpu, mps, or cuda. Default: cpu |
| --local-ray | Start a fresh local Ray instance and run experiments in parallel. Mutually exclusive with --ray-address |
| --ray-address | Attach to an existing Ray cluster at HOST[:PORT] (or ray://host:port for the Client protocol) and run experiments in parallel |
| --workers | Maximum experiments running concurrently when Ray is active. Defaults to detected CUDA GPU count, or 1 |
| --gpu-fraction | Fraction of a GPU each worker reserves. Defaults from --device: 1.0 for cuda (one worker per GPU), 0 for cpu and mps. Set explicitly to pack workers per GPU (e.g. 0.5 with --device cuda runs two workers per GPU). Combinations like --device cuda --gpu-fraction 0 are rejected as contradictory; combinations like --device cpu --gpu-fraction 0.5 print a warning since they idle GPU resources |
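
The --gpu-fraction rules can be read as a small validation function. The sketch below is a hypothetical reconstruction from the flag description, not the CLI's actual code. In particular, warning for mps as well as cpu is an assumption, since the description only names the cpu case.

```python
import warnings

def resolve_gpu_fraction(device, gpu_fraction=None):
    """Resolve the per-worker GPU reservation from --device and --gpu-fraction."""
    if device not in ("cpu", "mps", "cuda"):
        raise ValueError(f"unknown device: {device!r}")
    if gpu_fraction is None:
        # Defaults: one worker per GPU on cuda, no reservation otherwise.
        return 1.0 if device == "cuda" else 0.0
    if device == "cuda" and gpu_fraction == 0:
        raise ValueError("--device cuda with --gpu-fraction 0 is contradictory")
    if device != "cuda" and gpu_fraction > 0:
        # Assumption: the same warning applies to mps as to cpu.
        warnings.warn(f"--device {device} with --gpu-fraction {gpu_fraction} idles GPU resources")
    return float(gpu_fraction)

print(resolve_gpu_fraction("cuda"))       # 1.0
print(resolve_gpu_fraction("cuda", 0.5))  # 0.5, i.e. two workers per GPU
print(resolve_gpu_fraction("cpu"))        # 0.0
```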

biota doctor checks Python, torch, device availability, Ray, and module health (search, ray_compat, ecosystem).

Run output

Archive runs

archive/20260413-134355-hazy-creek/
├── manifest.json       # run metadata, biota version, preset, descriptors used
├── config.json         # exact SearchConfig serialized
├── archive.pkl         # MAP-Elites archive, rewritten on checkpoint
├── events.jsonl        # append-only log of every rollout outcome
├── thumbs/             # per-cell animated GIFs (--publish mode)
├── view.html           # interactive archive viewer
└── index.html          # top-level atlas (in archive/ root)

Ecosystem runs

ecosystem/20260415-104007-096-dense-population/
├── config.json         # resolved experiment configuration
├── summary.json        # mode, sources, measures (mass history, spatial observables,
│                       # interaction coefficients, outcome label and temporal sequence)
├── ecosystem.gif       # animated GIF output (gif mode)
├── frames/             # individual PNG snapshots (frames mode)
├── trajectory.npy      # raw float32 mass snapshots (n_snapshots, H, W)
└── view.html           # ecosystem viewer: mass/territory/patch count/entropy charts,
                        #   interface area and COM distance per pair, interaction heatmap,
                        #   outcome timeline

Development

just check       # ruff + pyright + pytest (422 tests, 0 warnings)
just smoke-ray   # local-Ray integration smoke test

The test suite runs entirely in no-Ray mode. just smoke-ray exercises the Ray code path and should be run after any change to ray_compat.py.

Roadmap

  • v0.1.0 - Flow-Lenia PyTorch port, mass conservation verified against JAX reference
  • v0.2.0 - Driver, Ray runtime, search loop, multi-node GPU verified
  • v0.3.0 - Descriptor rework, visual pipeline, static index, per-run metrics
  • v0.4.0 - Batched rollout engine, 3.5x cluster speedup
  • v1.0.0 - Lineage view, atlas site, public launch at biota-atlas.pages.dev
  • v1.1.0 - 9 built-in descriptors, --descriptors CLI, per-axis archive filtering, custom descriptor API
  • v2.0.0 - Ecosystem simulation: spawn archive creatures on a shared grid, animated GIF output, rectangular grids
  • v2.1.0 - 15 general built-in descriptors (displacement ratio, angular velocity, growth gradient, morphological instability, activity, spatial entropy)
  • v2.2.0 - Heterogeneous ecosystems: multi-source YAML configs, species-indexed parameter localization, per-cell ownership tracking
  • v2.3.0 - Per-source patch override; parallel ecosystem dispatch via Ray (--local-ray, --ray-address, --workers, --gpu-fraction); sidebar layout with pan/zoom canvas
  • v2.4.0 - Cluster-safe ecosystem dispatch: driver-side creature loading and driver-side output materialization; transport×device smoke test grid
  • v2.5.0 - Species-colored ecosystem rendering, per-species territory and mass charts, mobile layout overhaul
  • v3.0.0 - Growth field capture, empirical S×S interaction coefficient matrix, ecosystem outcome classification, interaction heatmap in viewer
  • v3.1.0 - Spatial observables for both run modes: patch count, interface area, COM distance, spatial entropy (from existing snapshots, no new simulation code); interaction coefficients gated to contact windows; blended pair colors in viewer
  • v3.2.0 - Temporal outcome classifier: per-species labeled windows, patch-count-based fragmentation, separate taxonomies for homogeneous and heterogeneous runs, outcome timeline in viewer
  • v3.3.0 - Signal field: per-creature emission and sensing in a shared (H, W, 16) chemical field; switchable via --signal-field; archive-level tagging; quality filter updated for mass+signal conservation; both homo and hetero ecosystem paths signal-aware
  • v3.4.0 - Signal physics corrected: per-creature emission_rate and decay_rates (searchable); standard preset 500 steps, signal_preset 800 steps; CREATURE_MASS_FLOOR 0.2; signal_retention quality term; signal observables (total history, mass fraction, receptor alignment, emission-reception matrix); SIGNAL badge on archives; signal params in creature modal; signal overlay checkbox on ecosystem GIF; outcome label tooltips
  • v3.5.0 - Chemical coupling (alpha_coupling [-1,1]: multiplicative growth, enables cross-species predation) + adaptive emission (beta_modulation [-1,1]: quorum sensing / feedback inhibition); descriptor library 15→18
  • v4.0.0 - CVT-MAP-Elites archive (calibration phase fits k-means centroids from observed descriptor distribution; per-axis scale normalization; creature_id replaces grid coords); signal descriptors rebuilt (signal_field_variance, signal_mass_ratio, dominant_channel_fraction replace emission_activity/receptor_sensitivity/signal_retention -- new descriptors measure final field state and work correctly for equilibrium solitons); batched signal physics in rollout_batch; archive viewer rebuilt (card list + histogram + Pearson correlation panels)

References
