"You speak for the whole planet, do you? For the common consciousness of every dewdrop, of every pebble, of even the liquid central core of the planet?"
"I do, and so can any portion of the planet in which the intensity of the common consciousness is great enough."
— Isaac Asimov, Foundation and Earth
This project introduces and experiments with a decentralized control framework for systems described by partial differential equations (PDEs). We use Tesseract-JAX to implement the PDE solver as a differentiable layer, and we apply the Differentiable Predictive Control (DPC) framework so that autonomous agents can interact with the physical field for trajectory tracking.
This project was ideated and evaluated by Pietro Zanotta¹, Dibakar Roy¹, and Honghui Zheng¹ as part of the Tesseract Hackathon 2025.
Contacts:
- Pietro Zanotta: pzanott1@jhu.edu
- Dibakar Roy: droysar1@jh.edu
- Honghui Zheng: hzheng39@jh.edu
1: shared first authorship
- Differentiable Operator Learning for Control: We recast policy synthesis for PDE systems as an operator learning problem using the DeepONet framework. By treating the PDE solver as a differentiable layer through the Tesseract differentiable programming library, we compute exact sensitivity gradients, which are then used for policy optimization within the Differentiable Predictive Control framework.
- Zero-Shot Scalability: Policies trained on a fixed swarm size $N$ generalize to unseen cardinalities $M$ (e.g., training on 20 agents and deploying on 60) without further tuning, which provides resilience to actuator failure.
- Communication-Free Coordination: We test the scenario in which agents operate with local-only sensing and zero inter-agent communication. We observe an emergent self-normalization property, arising from stigmergic interaction, that prevents overactuation.
- Theoretical Gradient Consistency: We provide a theorem ensuring that discrete policy gradients converge to the mean-field limit as the swarm size $N \rightarrow \infty$.
- Parameter Efficiency: In our toy examples, the decentralized approach uses 48% fewer parameters in the 1D cases and 76% fewer in the 2D case than centralized benchmarks while maintaining competitive performance.
For a more rigorous discussion about all the above points we suggest reading through our technical document.
- Multi-Agent Differentiable Predictive Control for Zero-Shot PDE Scalability
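The self-normalization property listed among the contributions can be illustrated with a toy calculation. The sketch below is not the project's code: it assumes Gaussian actuator footprints of a hypothetical width `sigma`, evenly spaced agents on $[0, 1]$, and per-agent amplitudes that scale as `c / N`. Under those assumptions it only shows that the peak of the total forcing stays roughly constant as the number of agents grows.

```python
import math

def total_forcing(n_agents, c=1.0, sigma=0.1, n_grid=101):
    """Sum of N Gaussian actuator footprints on [0, 1], each with amplitude c / N.

    c, sigma and the Gaussian shape are illustrative assumptions.
    """
    xs = [j / (n_grid - 1) for j in range(n_grid)]
    # Evenly spaced agent positions in the interior of the domain.
    positions = [(i + 0.5) / n_agents for i in range(n_agents)]
    amplitude = c / n_agents  # individual effort shrinks like 1/N
    return [
        sum(amplitude * math.exp(-((x - p) ** 2) / (2 * sigma ** 2)) for p in positions)
        for x in xs
    ]

def peak(values):
    """Largest absolute value of the forcing on the grid."""
    return max(abs(v) for v in values)

peaks = {n: peak(total_forcing(n)) for n in (20, 60, 180)}
for n, p in peaks.items():
    print(f"N = {n:3d}  per-agent amplitude = {1.0 / n:.4f}  peak total forcing = {p:.4f}")
```

In the project the scaling is reported to emerge from training rather than being imposed; here it is hard-coded purely to show why a $1/N$ individual effort keeps the aggregate bounded.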
This research explores the intersection of Differentiable Programming, Operator Learning, and Swarm Intelligence. We demonstrate that treating a PDE solver as a neural network layer allows for the training of highly efficient, decentralized control policies. In this section we provide a brief introduction to the problem formulation. For a more rigorous discussion we refer to our technical document.
The control objective is to find an optimal control sequence
where
System Dynamics (PDE): The state field
where the total forcing
Actuator Kinematics: Each mobile actuator
Constraints:
- Control Saturation: $|u_i(t)| \le u_{\max}$
- Kinematic Limits: $|v_i(t)| \le v_{\max}$
- Boundary Containment: $\xi_i(t) \in \Omega$
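One simple way to respect these three constraints in a simulation loop is to project the raw policy outputs onto the admissible sets. The sketch below assumes a 1D domain $\Omega = [x_{\min}, x_{\max}]$, a first-order position update $\xi_i \leftarrow \xi_i + \Delta t\, v_i$, and hypothetical names (`enforce_constraints`, `u_max`, `v_max`, `dt`). Whether the project enforces the constraints by clipping, by penalty terms in the loss, or by another mechanism is not stated here.

```python
def clip(value, low, high):
    """Clamp a scalar to the closed interval [low, high]."""
    return max(low, min(high, value))

def enforce_constraints(u, v, xi, u_max, v_max, x_min, x_max, dt):
    """Project raw actions onto the constraint sets and advance positions.

    u, v, xi are lists with one entry per agent (forcing intensity,
    velocity, position). The Euler position update is an assumption.
    """
    u_safe = [clip(ui, -u_max, u_max) for ui in u]      # |u_i| <= u_max
    v_safe = [clip(vi, -v_max, v_max) for vi in v]      # |v_i| <= v_max
    xi_next = [clip(x + dt * vi, x_min, x_max)          # xi_i stays in Omega
               for x, vi in zip(xi, v_safe)]
    return u_safe, v_safe, xi_next

u_safe, v_safe, xi_next = enforce_constraints(
    u=[0.5, -3.0], v=[2.0, -0.1], xi=[0.95, 0.5],
    u_max=1.0, v_max=1.0, x_min=0.0, x_max=1.0, dt=0.1,
)
print(u_safe, v_safe, xi_next)
```

Hard clipping has zero gradient wherever a constraint is active, so a smooth alternative (for example a `tanh` squashing of the policy output) may be preferable when the projection sits inside a differentiable training loop.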
To synthesize a policy approximating the optimal control sequence
- Forward Pass: The current state $z_k$ and control actions $u_k$ are passed through a differentiable operator $\Psi$ (the PDE solver) to predict the future state $z_{k+1}$. The solver is built with Tesseract, which is what makes the simulation differentiable.
- Sensitivity Analysis: By applying the chain rule through the solver, we compute exact sensitivity gradients of the future state with respect to the policy parameters $\theta$.
- Policy Optimization: These gradients are used to update the neural network, minimizing the total loss $\mathcal{J}$ over a trajectory of length $K$.
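The three steps can be traced by hand on a deliberately tiny problem. The sketch below is an illustration under strong assumptions, not the project's training code: the PDE solver $\Psi$ is replaced by a scalar linear map $z_{k+1} = a z_k + b u_k$, the DeepONet policy by a one-parameter proportional rule $u_k = \theta (r - z_k)$, and automatic differentiation by a manually propagated sensitivity $s_k = \partial z_k / \partial \theta$. All names and constants are hypothetical.

```python
def rollout_with_sensitivity(theta, z0=0.0, ref=1.0, a=1.0, b=0.1, horizon=20):
    """Roll out z_{k+1} = Psi(z_k, u_k) = a*z_k + b*u_k under u_k = theta*(ref - z_k).

    Returns the tracking loss J and dJ/dtheta, obtained by propagating the
    sensitivity s_k = dz_k/dtheta through the solver with the chain rule.
    """
    z, s = z0, 0.0          # state and its sensitivity to theta
    loss, grad = 0.0, 0.0
    for _ in range(horizon):
        u = theta * (ref - z)                 # policy evaluation
        du_dtheta = (ref - z) - theta * s     # total derivative of u_k w.r.t. theta
        z_next = a * z + b * u                # forward pass through the solver
        s_next = a * s + b * du_dtheta        # chain rule through the solver
        z, s = z_next, s_next
        loss += (z - ref) ** 2                # accumulate tracking loss
        grad += 2.0 * (z - ref) * s           # accumulate dJ/dtheta
    return loss, grad

theta, learning_rate = 1.0, 0.1
initial_loss, initial_grad = rollout_with_sensitivity(theta)
for _ in range(50):
    _, grad = rollout_with_sensitivity(theta)
    theta -= learning_rate * grad             # policy optimization step
final_loss, _ = rollout_with_sensitivity(theta)
print(f"theta = {theta:.3f}, loss {initial_loss:.4f} -> {final_loss:.4f}")
```

With an automatic differentiation framework the `du_dtheta` and `s_next` lines disappear, because the same chain rule is applied by the framework once the solver exposes its derivatives.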
Note that part of the theoretical results on Zero-Shot Scalability relies on a conjecture that we validate only empirically. For a more rigorous discussion of all the above points, we suggest reading our technical document.
Algorithm pseudocodes can be found below:
- Centralized Policy Pseudocode:
- Decentralized Policy Pseudocode:
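To make the decentralized case concrete, here is a minimal sketch of local-only sensing with a policy shared by all agents. It is an illustration under assumptions, not the authors' algorithm: the learned policy is replaced by a hypothetical proportional rule (`shared_local_policy`), the field is a plain list on a 1D grid, and the window size `half_width` is arbitrary. Because every agent evaluates the same function on its own window, the same code runs unchanged for any number of agents.

```python
def local_window(field, center_index, half_width):
    """Field values in a window around an agent, padded with edge values."""
    last = len(field) - 1
    return [field[min(max(center_index + k, 0), last)]
            for k in range(-half_width, half_width + 1)]

def shared_local_policy(error_window, gain=0.5):
    """Hypothetical stand-in for the learned policy: act on the mean local error."""
    return gain * sum(error_window) / len(error_window)

def decentralized_actions(field, target, agent_indices, half_width=2):
    """Each agent sees only its own window; no agent reads another agent's data."""
    error = [t - f for f, t in zip(field, target)]
    return [shared_local_policy(local_window(error, i, half_width))
            for i in agent_indices]

field = [0.0] * 50
target = [1.0] * 50
# Same policy, two different swarm sizes.
actions_small = decentralized_actions(field, target, agent_indices=[5, 25, 45])
actions_large = decentralized_actions(field, target, agent_indices=list(range(0, 50, 5)))
print(len(actions_small), len(actions_large))
```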
The framework was validated on two primary physical systems:
- Linear Heat Equation: Focused on temperature tracking and heat spreading.
- Nonlinear Fisher-KPP Equation: Modeled population dynamics and chemical fronts, where agents must overcome natural growth to achieve stability.
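For readers unfamiliar with the two equations, the sketch below advances each of them by explicit finite differences in 1D. It assumes the standard forms $z_t = \nu z_{xx} + f$ (heat) and $z_t = \nu z_{xx} + \rho\, z (1 - z) + f$ (Fisher-KPP) with an additive forcing $f$, zero-flux boundaries, and arbitrary constants. The discretization, boundary conditions, and parameter values used by the project's solvers may differ.

```python
def laplacian_1d(z, dx):
    """Second-order central difference with zero-flux (Neumann) boundaries."""
    n = len(z)
    out = []
    for i in range(n):
        left = z[i - 1] if i > 0 else z[1]
        right = z[i + 1] if i < n - 1 else z[n - 2]
        out.append((left - 2.0 * z[i] + right) / dx ** 2)
    return out

def heat_step(z, forcing, nu, dx, dt):
    """One explicit Euler step of z_t = nu * z_xx + forcing."""
    lap = laplacian_1d(z, dx)
    return [zi + dt * (nu * li + fi) for zi, li, fi in zip(z, lap, forcing)]

def fisher_kpp_step(z, forcing, nu, growth, dx, dt):
    """One explicit Euler step of z_t = nu * z_xx + growth * z * (1 - z) + forcing."""
    lap = laplacian_1d(z, dx)
    return [zi + dt * (nu * li + growth * zi * (1.0 - zi) + fi)
            for zi, li, fi in zip(z, lap, forcing)]

n, dx, dt, nu = 21, 0.05, 0.01, 0.1   # dt <= dx**2 / (2 * nu) keeps the scheme stable
z_heat = [1.0 if i == n // 2 else 0.0 for i in range(n)]
z_fkpp = [0.1] * n
no_forcing = [0.0] * n
for _ in range(100):
    z_heat = heat_step(z_heat, no_forcing, nu, dx, dt)
    z_fkpp = fisher_kpp_step(z_fkpp, no_forcing, nu, growth=1.0, dx=dx, dt=dt)
print(f"heat peak: {max(z_heat):.4f}, Fisher-KPP level: {z_fkpp[0]:.4f}")
```

Without forcing, the heat spike simply spreads out, while the Fisher-KPP state keeps growing toward 1. That growth term is what the agents have to counteract in the nonlinear case.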
| Metric | Heat 1d (Centr.) | Heat 1d (Decentr.) | Heat 2d (Centr.) | Heat 2d (Decentr.) | Fisher-KPP (Centr.) | Fisher-KPP (Decentr.) |
|---|---|---|---|---|---|---|
| Branch Input Dim | 200 | 40 | 1024 | 144 | 200 | 40 |
| Total Parameters | 21,794 | 11,298 | 2,116,003 | 158,531 | 21,794 | 11,298 |
| Final Tracking Loss | 5.2e-3 | 6.4e-3 | 7.8e-3 | 9.0e-3 | 7.0e-3 | 8.3e-3 |
| Scalability | Zero-shot | Zero-shot | Zero-shot | Zero-shot | Zero-shot | Zero-shot |
| Communication | Global | None | Global | None | Global | None |
| Training Time (500 ep.) | ~1 min | ~1 min | ~4 min | ~3 min | ~3 min | ~3 min |
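The parameter savings can be recomputed from the "Total Parameters" row of the table above. The helper name `reduction_percent` is hypothetical; the counts are copied from the table. With these counts the 1D cases come out at about 48%, while the 2D ratio computed this way is about 92.5%, which differs from the 76% quoted in the contributions list, so that figure may rest on a different count or configuration described in the technical document.

```python
def reduction_percent(centralized, decentralized):
    """Percentage of parameters saved by the decentralized model."""
    return 100.0 * (centralized - decentralized) / centralized

# Total parameter counts copied from the results table.
counts = {
    "Heat 1d": (21_794, 11_298),
    "Heat 2d": (2_116_003, 158_531),
    "Fisher-KPP": (21_794, 11_298),
}
reductions = {name: reduction_percent(c, d) for name, (c, d) in counts.items()}
for name, value in reductions.items():
    print(f"{name}: {value:.1f}% fewer parameters")
```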
tesseract-hackathon/
├── examples/ # High-level scripts for specific PDE problems
│ ├── fkpp1d/ # Fisher-KPP 1D reaction-diffusion examples
│ │ ├── centralized/ # Training and visualization for global control
│ │ └── decentralized/ # Multi-agent/local control versions
│ ├── heat1d/ # 1D Heat Equation examples
│ │ ├── centralized/
│ │ └── decentralized/
│ └── heat2D/ # 2D Heat Equation examples
│ ├── centralized/
│ └── decentralized/
│
├── models/ # Core neural network architectures
│ └── policy.py # JAX implementation of the DPC policies
│
├── tesseracts/ # The "Legacy" Simulator Wrappers
│ ├── solverFKPP_.../ # Solvers specifically for FKPP problems
│ ├── solverHeat_.../ # Solvers specifically for Heat problems (both 1d and 2d)
│ │ ├── solver.py # The underlying physics engine logic
│ │ ├── tesseract_api.py # Interface defining 'apply' and 'vjp' for JAX
│ │ └── tesseract_config.yaml
│ └── ...
│
├── requirements.txt # Python dependencies
└── README.md # Project documentation
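The tree mentions an interface that defines `apply` and `vjp`. As a reminder of what a vector-Jacobian product is, the sketch below writes both by hand for a toy solver step and checks the result against a finite difference. These are plain Python functions that merely reuse the two names; they are not the Tesseract API, whose actual signatures and schemas are not reproduced here, and the toy dynamics are an assumption.

```python
def apply(x, u, dt=0.1):
    """Toy solver step on a periodic grid: y_i = x_i + dt*(x_{i+1} - x_i) + dt*u_i."""
    n = len(x)
    return [x[i] + dt * (x[(i + 1) % n] - x[i]) + dt * u[i] for i in range(n)]

def vjp(x, u, cotangent, dt=0.1):
    """Vector-Jacobian products of `apply` with respect to x and u.

    Row i of the Jacobian w.r.t. x has (1 - dt) at column i and dt at column
    i + 1, so the transpose sends cotangent[j - 1] and cotangent[j] to slot j.
    """
    n = len(x)
    grad_x = [(1.0 - dt) * cotangent[j] + dt * cotangent[(j - 1) % n] for j in range(n)]
    grad_u = [dt * c for c in cotangent]
    return grad_x, grad_u

def dot(a, b):
    """Euclidean inner product of two equal-length lists."""
    return sum(ai * bi for ai, bi in zip(a, b))

x, u, v = [1.0, 2.0, 4.0], [0.5, 0.0, -1.0], [1.0, -2.0, 3.0]
grad_x, grad_u = vjp(x, u, v)

# Finite-difference check of d/dx_0 <v, apply(x, u)>.
eps = 1e-6
x_plus = [x[0] + eps] + x[1:]
x_minus = [x[0] - eps] + x[1:]
fd = (dot(v, apply(x_plus, u)) - dot(v, apply(x_minus, u))) / (2 * eps)
print(grad_x[0], fd)
```

Reverse-mode differentiation of a loss through a rollout only ever needs this product, never the full Jacobian, which is why a solver that exposes a VJP can be used as a differentiable layer.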
- Clone the repository:
git clone https://github.com/PietroZanotta/Multi-Agent-DPC
cd Multi-Agent-DPC
- Set up a Python virtual environment:
python -m venv .venv
- Activate the virtual environment:
  - Linux/macOS:
source .venv/bin/activate
  - Windows (PowerShell):
.venv\Scripts\activate
- Install dependencies:
pip install -r requirements.txt
Tip: If you have GPU access and want to accelerate training, also install JAX with CUDA:
pip install "jax[cuda12]"
- Verify the Tesseract installation:
which tesseract # Linux/macOS
# or
where tesseract # Windows
Note for Mac users: If tesseract build conflicts with the Tesseract OCR binary, use the full path:
/path/to/venv/bin/tesseract build .
Build the differentiable PDE solvers (required only once). This step containerizes each solver with its neural network policy as a differentiable layer.
# Build Heat Equation (1D)
cd tesseracts/solverHeat_centralized && tesseract build .
cd ../solverHeat_decentralized && tesseract build .
# Build Fisher-KPP (1D reaction-diffusion)
cd ../solverFKPP_centralized && tesseract build .
cd ../solverFKPP_decentralized && tesseract build .
# Build Heat Equation (2D)
cd ../solverHeat2D_centralized && tesseract build .
cd ../solverHeat2D_decentralized && tesseract build .
# Return to project root
cd ../..
Note: Each tesseract build command creates a Docker image containing the PDE solver and its trained policy. Subsequent builds are cached. You can verify built images with:
docker images | grep solver
Pre-trained policy weights are included, so you can visualize results immediately without training:
Centralized policy, Heat 1D (global sensing):
cd examples/heat1d/centralized
python visualize_conference.py
# Generates: heat_dpc_visualization_*.png, heat_dpc_agents_*.png
Decentralized policy, Heat 1D (local sensing, communication-free):
cd ../decentralized
python visualize_conference.py
# Generates: heat_dpc_decentralized_visualization_*.png
[Figure: example output (Heat 1D, centralized)]
Centralized policy, Fisher-KPP 1D:
cd ../../fkpp1d/centralized
python visualize_conference.py
# Generates: fkpp_dpc_visualization_*.png
Decentralized policy, Fisher-KPP 1D:
cd ../decentralized
python visualize_conference.py
# Generates: fkpp_dpc_decentralized_visualization_*.png
Centralized policy, Heat 2D:
cd ../../heat2D/centralized
python visualize.py
# Generates: heat2d_centralized_visualization.png/pdf
Decentralized policy, Heat 2D:
cd ../decentralized
python visualize.py
# Generates: heat2d_decentralized_visualization.png/pdf
[Figure: example output (Heat 2D, centralized)]
Create animated trajectories (.gif and .mp4) demonstrating the policy performance:
# Heat 1D - Centralized
cd examples/heat1d/centralized && python animate.py
# Generates: heat_dpc_animation.gif, heat_dpc_animation.mp4
# Fisher-KPP - Decentralized
cd ../../fkpp1d/decentralized && python animate.py
# Generates: fkpp_dpc_animation.gif, fkpp_dpc_animation.mp4
# Heat 2D - Centralized
cd ../../heat2D/centralized && python animate.py
# Generates: heat2d_animation.gif, heat2d_animation.mp4
Note: Animation generation requires FFmpeg. On most systems:
# Ubuntu/Debian
sudo apt-get install ffmpeg
# macOS
brew install ffmpeg
# Windows (with Chocolatey)
choco install ffmpeg
Example animations:
- Fisher-KPP - Centralized
- Fisher-KPP - Decentralized
- Heat 2D - Centralized
- Heat 2D - Decentralized
To train policies on new datasets or to modify the architectures, use the training scripts. Training is compute-intensive, and a GPU is recommended.
Make sure you are at project root:
# Example: Train 1D Heat centralized policy
cd examples/heat1d/centralized
python train.py # Generates dataset and trains for 500 epochs (saves centralized_params.msgpack)
python visualize_conference.py # Visualize results
python animate.py # Create animated trajectories
Full workflow for all experiments:
Make sure you are at project root:
# Heat 1D
for variant in centralized decentralized; do
  cd examples/heat1d/$variant
  python train.py && python visualize_conference.py && python animate.py
  cd ../../.. # back to project root
done
# Fisher-KPP 1D
for variant in centralized decentralized; do
  cd examples/fkpp1d/$variant
  python train.py && python visualize_conference.py && python animate.py
  cd ../../.. # back to project root
done
# Heat 2D
for variant in centralized decentralized; do
  cd examples/heat2D/$variant
  python train.py && python visualize.py && python animate.py
  cd ../../.. # back to project root
done
For decentralized policies, explore the self-normalization property and zero-shot scalability empirically:
Make sure you are at project root:
cd examples/fkpp1d/decentralized
# Analyze control effort across different effort penalty weights
python visualize_lambda_effort.py
# Tests how control effort scales as the number of agents increases beyond training size.
# Validates the self-normalization conjecture: individual control efforts u_i ~ O(1/N),
# so the total forcing norm ||B|| remains bounded as N increases.
# Test zero-shot scalability: deploy policy trained on N agents on M agents (M ≠ N)
python visualize_comparison.py
# Evaluates tracking MSE and control effort as agent count varies from training conditions.
# Demonstrates that policies generalize to unseen swarm sizes without retraining.
| Issue | Solution |
|---|---|
| Image solver_X:latest not found | Run tesseract build tesseracts/solverX/ first |
| tesseract command not found on Mac | Use full path: /path/to/venv/bin/tesseract build . |
| Training is slow on CPU | Install jax[cuda12] and verify GPU is detected: python -c "import jax; print(jax.devices())" |
| Out of memory errors | Reduce batch_size in train.py (default: 32) |
| Animations won't generate | Install FFmpeg (see section above) |
Several research directions can stem from this project. We consider the following the most promising:
- Understanding the benefits and limitations of casting policy synthesis as an operator learning problem.
- Extending our theoretical analysis to a wider class of PDEs and formally proving our self-normalization conjecture.
- Implementing shared-memory strategies (e.g. /dev/shm) to minimize the serialization cost of communication between the Python script and the Tesseract during policy training.
- Processor: Intel Core Ultra 9 275HX (24 cores, up to 5.4 GHz)
- GPU: NVIDIA GeForce RTX 5090 Laptop GPU (24GB GDDR7 VRAM)
- Operating System: Ubuntu 22.04 running under Windows Subsystem for Linux (WSL2)
- Main Frameworks: JAX (v0.8.1) for numerical computing; Tesseract-JAX (v0.2.2) for differentiable PDE solvers
- Hardware Acceleration: CUDA backend with NVIDIA driver v581.57
See our technical document for details about our experimental setup.




