MARL-Dyson

Multi-Agent Reinforcement Learning for Dynamic Resource Optimization: A Dyson Swarm Simulation

Project Status: Early Development

This project is actively under development. Features and documentation will be updated regularly.

Overview

MARL-Dyson simulates resource optimization using multi-agent reinforcement learning. It models autonomous agents optimizing their positions around a central energy source, inspired by the concept of a Dyson swarm.

Energy Distribution Generation

The energy distribution is generated by creating a uniform random field across a spherical coordinate grid, with values between 0 and 1. A Gaussian smoothing filter is then applied to this random field, creating continuous regions of varying energy levels. The smoothing parameter controls the transition gradient between these regions. The final distribution is normalized to ensure all values remain in the [0,1] range.

Discretization of the Spherical Domain

theta = np.linspace(0, np.pi, resolution)
phi = np.linspace(0, 2*np.pi, resolution)
self.theta, self.phi = np.meshgrid(theta, phi)

This creates a discretized grid over the entire spherical surface using standard spherical coordinates:

θ ∈ [0, π] spans from the north pole (θ=0) to south pole (θ=π)
φ ∈ [0, 2π] covers the full longitudinal rotation

Sampling each angle at resolution uniformly spaced points sets up a computational tradeoff: higher values give finer spatial accuracy, while the number of grid points, and with it memory and compute, grows as O(resolution²).
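As a minimal sketch of the grid construction above (the variable names and the small resolution value here are illustrative assumptions, not the project's settings), the following builds the grid outside the class, maps it to Cartesian points on the unit sphere, and checks the array layout that np.meshgrid produces:

```python
import numpy as np

resolution = 8  # hypothetical small value, chosen only for illustration

theta = np.linspace(0, np.pi, resolution)    # polar angle, pole to pole
phi = np.linspace(0, 2 * np.pi, resolution)  # azimuth, both endpoints included

# With the default 'xy' indexing, rows follow phi and columns follow theta.
theta_grid, phi_grid = np.meshgrid(theta, phi)

# Map every grid point onto the unit sphere.
x = np.sin(theta_grid) * np.cos(phi_grid)
y = np.sin(theta_grid) * np.sin(phi_grid)
z = np.cos(theta_grid)

n_points = theta_grid.size  # grows as resolution**2
```

Because phi includes both 0 and 2π, the first and last rows describe the same meridian. The grid is also uniform in angle rather than in surface area, so points crowd together near the poles.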

Stochastic Field Generation

random_field = np.random.rand(self.resolution, self.resolution)

This initializes a random field following a uniform distribution U(0,1). Mathematically, each point (i,j) in the field is assigned a value:

$$F_0(i,j) \sim \mathcal{U}(0,1)$$

Because every value is drawn independently, this is a white noise process with zero spatial correlation. The uniform distribution was selected over alternatives such as Gaussian or power-law distributions because it is the maximum-entropy distribution on a bounded interval, so the initial state carries no built-in structure.
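The lack of spatial correlation can be checked numerically. The sketch below is an illustration under assumptions: it uses a seeded np.random.default_rng generator for reproducibility instead of the np.random.rand call shown above, and the helper name lag1_correlation is hypothetical.

```python
import numpy as np

rng = np.random.default_rng(42)  # seeded only so the sketch is reproducible
resolution = 128                 # hypothetical value for illustration

# Independent draws from U(0, 1) at every grid point.
random_field = rng.random((resolution, resolution))

def lag1_correlation(field):
    """Pearson correlation between each value and its right-hand neighbor."""
    left = field[:, :-1].ravel()
    right = field[:, 1:].ravel()
    return np.corrcoef(left, right)[0, 1]

neighbor_corr = lag1_correlation(random_field)
```

For white noise, neighbor_corr should sit close to zero, with the residual shrinking as the grid grows.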

Spatial Correlation via Gaussian Filtering

smoothed_field = gaussian_filter(random_field, sigma=self.smoothing)

This operation transforms the white noise field into a correlated field through convolution with a Gaussian kernel:

$$F_1(i,j) = \sum_{k,l} F_0(k,l) \cdot G_\sigma(i-k, j-l)$$

Where the Gaussian kernel is defined as:

$$G_\sigma(x,y) = \frac{1}{2\pi\sigma^2} e^{-\frac{x^2+y^2}{2\sigma^2}}$$

The parameter sigma (σ) has profound effects on the resulting field:

It establishes the correlation length scale between points
It controls the characteristic size of energy "features" on the sphere
It determines the smoothness of gradients in the field
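To illustrate how sigma sets the correlation length, the sketch below smooths the same noise field at a few sigma values and measures the correlation between neighboring grid points. The sigma values, grid size, and helper name are assumptions chosen for illustration.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

rng = np.random.default_rng(0)
noise = rng.random((64, 64))  # white-noise field, as in the previous step

def lag1_correlation(field):
    """Pearson correlation between each value and its right-hand neighbor."""
    return np.corrcoef(field[:, :-1].ravel(), field[:, 1:].ravel())[0, 1]

# Larger sigma -> wider kernel -> neighboring points become more alike.
correlations = {
    sigma: lag1_correlation(gaussian_filter(noise, sigma=sigma))
    for sigma in (0.5, 2.0, 5.0)
}
```

Note that gaussian_filter uses mode='reflect' at the array edges by default, so the filter treats the grid as a flat image: φ = 0 and φ = 2π are not smoothed as neighbors unless a different boundary mode is chosen.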

Normalization for Scale Invariance

return (smoothed_field - smoothed_field.min()) / (smoothed_field.max() - smoothed_field.min())

This performs min-max normalization, transforming the field to the range [0,1]:

$$F_2(i,j) = \frac{F_1(i,j) - \min(F_1)}{\max(F_1) - \min(F_1)}$$

This normalization serves several purposes:

Creates a dimensionless energy measure independent of absolute scale
Ensures consistency across different random initializations
Simplifies agent reward calculations by bounding all possible values
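A standalone sketch of the normalization step is shown below. The function name normalize_field and the guard for a constant field are assumptions added for illustration; the one-line return shown above would divide by zero if every value in the field were equal.

```python
import numpy as np

def normalize_field(field):
    """Min-max normalize a field to [0, 1]."""
    lo, hi = field.min(), field.max()
    span = hi - lo
    if span == 0:
        # Assumption: map a perfectly flat field to all zeros.
        return np.zeros_like(field, dtype=float)
    return (field - lo) / span

raw = np.array([[2.0, 4.0], [6.0, 10.0]])
normalized = normalize_field(raw)

# Rescaling or shifting the input leaves the normalized field unchanged.
rescaled = normalize_field(3.0 * raw + 7.0)
```

The last line illustrates the scale independence mentioned above: any positive rescaling or offset of the smoothed field yields the same normalized result.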

Position-Based Energy Lookup

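def get_energy_at_position(self, theta, phi):
    # Nearest grid index along each axis, using the same axes that built the meshgrid.
    theta_idx = np.argmin(np.abs(np.linspace(0, np.pi, self.resolution) - theta))
    phi_idx = np.argmin(np.abs(np.linspace(0, 2*np.pi, self.resolution) - phi))
    # The field is indexed [phi, theta] because np.meshgrid defaults to 'xy' indexing.
    return self.energy_field[phi_idx, theta_idx]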

This implements nearest-neighbor interpolation in the spherical coordinate space. For any continuous position (θ,φ), it finds the closest discretized grid point along each axis independently by minimizing the absolute difference:

$$\text{idx}_\theta = \arg\min_i |\theta_i - \theta|$$

$$\text{idx}_\phi = \arg\min_j |\phi_j - \phi|$$

The lookup returns $F_2(\text{idx}_\phi, \text{idx}_\theta)$, effectively creating a piecewise-constant function over the sphere. This approach was chosen for computational efficiency, though alternative interpolation methods (bilinear, cubic) could provide smoother transitions at a higher computational cost.
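For comparison, the sketch below places a nearest-neighbor lookup next to a bilinear alternative built on scipy.interpolate.RegularGridInterpolator. It is an illustration, not the project's implementation: the random stand-in field, the precomputed axes, and the function names nearest_energy and bilinear_energy are all assumptions.

```python
import numpy as np
from scipy.interpolate import RegularGridInterpolator

resolution = 16  # hypothetical value for illustration
theta_axis = np.linspace(0, np.pi, resolution)
phi_axis = np.linspace(0, 2 * np.pi, resolution)

# Stand-in energy field, indexed [phi, theta] like the meshgrid-based field.
rng = np.random.default_rng(0)
energy_field = rng.random((resolution, resolution))

def nearest_energy(theta, phi):
    """Piecewise-constant lookup: value at the closest grid point."""
    theta_idx = np.argmin(np.abs(theta_axis - theta))
    phi_idx = np.argmin(np.abs(phi_axis - phi))
    return energy_field[phi_idx, theta_idx]

# Axis order must match the array layout: first phi, then theta.
_bilinear = RegularGridInterpolator(
    (phi_axis, theta_axis), energy_field, method="linear"
)

def bilinear_energy(theta, phi):
    """Bilinear lookup: blends the four grid points surrounding (theta, phi)."""
    return float(_bilinear([[phi, theta]])[0])

# Halfway between two theta nodes the bilinear value is their average,
# while the nearest-neighbor lookup snaps to one of the two nodes.
mid_theta = 0.5 * (theta_axis[3] + theta_axis[4])
blended = bilinear_energy(mid_theta, phi_axis[5])
snapped = nearest_energy(mid_theta, phi_axis[5])
```

Precomputing the axes once, as done here, also avoids rebuilding the np.linspace arrays on every lookup.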

References

Sherman, Michael. Spatial Statistics and Spatio-Temporal Data: Covariance Functions and Directional Properties. Hoboken, N.J.: Wiley, 2010.

Swarm Agent Definition

The initial development of the MARL system is based on the class definition of the swarm agent, which defines coordinate location, energy collection, directionality, and movement. This is a simple implementation of the swarm agent rule set, and more rules will be experimented with in the future. These future experiments include a rule under which only a single agent can occupy a coordinate location at a time, limiting movement and creating a more dynamic environment.
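As a rough sketch of what such an agent class might look like, the example below covers the four responsibilities listed above. Everything in it is hypothetical: the class name SwarmAgent, its fields, and the choices to clamp θ at the poles and wrap φ around the sphere are assumptions, not the project's actual definition.

```python
import math
from dataclasses import dataclass

@dataclass
class SwarmAgent:
    """Hypothetical agent: position, heading, and collected energy."""
    theta: float                   # polar angle in [0, pi]
    phi: float                     # azimuth in [0, 2*pi)
    d_theta: float = 0.0           # heading: change in theta per step
    d_phi: float = 0.0             # heading: change in phi per step
    collected_energy: float = 0.0

    def move(self):
        # Assumption: clamp theta at the poles and wrap phi around the sphere.
        self.theta = min(max(self.theta + self.d_theta, 0.0), math.pi)
        self.phi = (self.phi + self.d_phi) % (2 * math.pi)

    def collect(self, energy_lookup):
        # energy_lookup is any callable taking (theta, phi) and returning a float,
        # for example a bound get_energy_at_position method.
        self.collected_energy += energy_lookup(self.theta, self.phi)

agent = SwarmAgent(theta=3.0, phi=6.0, d_theta=0.5, d_phi=0.5)
agent.move()
agent.collect(lambda theta, phi: 0.25)  # constant stand-in for the energy field
```

A single-occupancy rule could be layered on top by having the environment reject a move whose target grid cell is already taken, though how that is enforced is left open here.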
