This repository contains the implementation for the paper "Spatial Representations Emerge in a Model Linking Global Maps to First-Person Perspectives" by Hin Wai Lui, Elizabeth R. Chrastil, Douglas A. Nitz, and Jeffrey L. Krichmar.
This research investigates the computational mechanisms underlying perspective transformations between first-person perspectives (FPP) and global map perspectives (GMP). The project demonstrates how various spatial representations, similar to those found in the mammalian brain (such as place cells, head direction cells, border cells, and corner cells), emerge spontaneously in a model trained to perform perspective transformation.
The model is based on a Variational Autoencoder (VAE) architecture enhanced with Recurrent Neural Networks (RNNs) and attention mechanisms. The project includes simulations in the Webots robotics environment, where a Khepera robot navigates an arena with colored walls and cylinders.
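The README does not spell out the exact layer sizes or module layout, so the following is only a minimal PyTorch sketch of the idea behind the RNN variant: a convolutional encoder maps each input-perspective frame to a latent distribution, a GRU carries the latent state across the sequence, and a deconvolutional decoder renders the opposite perspective. The 64×64 image size, channel widths, and the class name `PerspectiveVAE` are illustrative assumptions, and the attention mechanism is omitted for brevity.

```python
import torch
import torch.nn as nn

class PerspectiveVAE(nn.Module):
    """Sketch of a sequence VAE for FPP <-> GMP transformation (illustrative, not the repo's code)."""

    def __init__(self, latent_size=32):
        super().__init__()
        # Convolutional encoder: one frame -> flat feature vector
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 32, 4, stride=2, padding=1), nn.ReLU(),    # 64x64 -> 32x32
            nn.Conv2d(32, 64, 4, stride=2, padding=1), nn.ReLU(),   # 32x32 -> 16x16
            nn.Conv2d(64, 128, 4, stride=2, padding=1), nn.ReLU(),  # 16x16 -> 8x8
            nn.Flatten(),
        )
        self.fc_mu = nn.Linear(128 * 8 * 8, latent_size)
        self.fc_logvar = nn.Linear(128 * 8 * 8, latent_size)
        # Recurrence over the latent sequence (the "RNN" architecture)
        self.rnn = nn.GRU(latent_size, latent_size, batch_first=True)
        # Deconvolutional decoder: latent -> frame in the other perspective
        self.fc_dec = nn.Linear(latent_size, 128 * 8 * 8)
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(128, 64, 4, stride=2, padding=1), nn.ReLU(),   # 8x8 -> 16x16
            nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1), nn.ReLU(),    # 16x16 -> 32x32
            nn.ConvTranspose2d(32, 3, 4, stride=2, padding=1), nn.Sigmoid(),  # 32x32 -> 64x64
        )

    def forward(self, x):
        # x: (batch, seq_len, 3, 64, 64) frames in the source perspective
        b, t, c, h, w = x.shape
        feats = self.encoder(x.reshape(b * t, c, h, w))
        mu = self.fc_mu(feats).view(b, t, -1)
        logvar = self.fc_logvar(feats).view(b, t, -1)
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)  # reparameterization trick
        z, _ = self.rnn(z)                                       # temporal continuity across frames
        dec = self.fc_dec(z.reshape(b * t, -1)).view(b * t, 128, 8, 8)
        recon = self.decoder(dec).view(b, t, c, h, w)
        return recon, mu, logvar
```

Training such a model would pair a reconstruction loss on the target-perspective frames with the standard KL divergence between the encoder's distribution and a unit Gaussian prior.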
- Implementation of three different VAE architectures:
  - Single: Processes individual images
  - Stacked: Processes sequences of images simultaneously
  - RNN: Processes sequences with temporal continuity
- Analysis tools for identifying emergent spatial representations (see the rate-map sketch after this feature list):
  - Place cells
  - Head direction cells
  - Border cells
  - Corner cells
  - Object vector cells
- Perturbation studies to evaluate the importance of different environmental cues
- Attention mechanisms to visualize which parts of the environment the model focuses on
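As a concrete illustration of how place-like tuning can be assessed (a hedged sketch, not the repository's analysis code): bin the arena into a spatial grid, average a latent unit's activation in each bin, and look for a localized peak in the resulting rate map. The bin count, arena size, and function name `rate_map` are assumptions.

```python
import numpy as np

def rate_map(positions, activations, arena_size=1.0, n_bins=20):
    """Average one latent unit's activation in each spatial bin.

    positions:   (N, 2) robot x, y coordinates (illustrative units)
    activations: (N,)   activation of one latent unit at each sample
    Returns an (n_bins, n_bins) map; NaN marks unvisited bins.
    """
    edges = np.linspace(0.0, arena_size, n_bins + 1)
    occupancy, _, _ = np.histogram2d(positions[:, 0], positions[:, 1], bins=[edges, edges])
    summed, _, _ = np.histogram2d(positions[:, 0], positions[:, 1],
                                  bins=[edges, edges], weights=activations)
    with np.errstate(invalid="ignore", divide="ignore"):
        return summed / occupancy  # NaN where the robot never visited

# Synthetic check: a unit with a Gaussian "place field" near (0.7, 0.7)
rng = np.random.default_rng(0)
pos = rng.uniform(0.0, 1.0, size=(5000, 2))
act = np.exp(-np.sum((pos - 0.7) ** 2, axis=1) / 0.02)
print(np.nanargmax(rate_map(pos, act)))  # flat index of the peak bin
```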
- Set up your workspace:

```bash
# Create a project directory
mkdir spatial-representations
cd spatial-representations
```

- Install dependencies:

```bash
pip install -r requirements.txt
```

First, run the Webots simulation to collect paired FPP and GMP images:
- Install Webots from cyberbotics.com
- Open the provided Webots world file
- Run the simulation to collect data
The simulation will generate paired images from both perspectives along with metadata about the robot's position and orientation.
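The exact output layout of the simulation is not described here, so the loader below is only a hedged sketch: it assumes paired PNG frames and a pose CSV with hypothetical names (`fpp_0000.png`, `gmp_0000.png`, `poses.csv`) and columns (`index`, `x`, `y`, `theta`) that should be adapted to the files the simulation actually writes.

```python
import csv
from pathlib import Path

import numpy as np
from PIL import Image

def load_paired_dataset(data_dir):
    """Load paired FPP/GMP frames and robot poses (hypothetical file layout)."""
    data_dir = Path(data_dir)
    fpp, gmp, poses = [], [], []
    with open(data_dir / "poses.csv") as f:          # assumed metadata file
        for row in csv.DictReader(f):
            idx = int(row["index"])
            fpp.append(np.asarray(Image.open(data_dir / f"fpp_{idx:04d}.png")))
            gmp.append(np.asarray(Image.open(data_dir / f"gmp_{idx:04d}.png")))
            poses.append((float(row["x"]), float(row["y"]), float(row["theta"])))
    return np.stack(fpp), np.stack(gmp), np.asarray(poses)
```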
Run the parallel training script:

```bash
python run_parallel_vae_train.py
```

This script will train different VAE architectures (single, stacked, RNN) for perspective transformations in both directions (FPP→GMP and GMP→FPP).
You can customize the training process with various arguments:
- `--seq_arch`: Model architecture type (single, stacked, rnn)
- `--latent_size`: Dimensionality of the latent space
- `--batch_size`: Batch size for training
- `--epochs`: Number of training epochs
- `--background`: Environment background (empty_office, stadium, entrance_hall)
- `--attn`: Enable the attention mechanism
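For example, a run using the RNN architecture with attention in the empty_office environment might look like `python run_parallel_vae_train.py --seq_arch rnn --latent_size 32 --batch_size 64 --epochs 100 --background empty_office --attn` (the flag values shown are illustrative, not defaults taken from the paper).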
After training, analyze the latent variables:

```bash
python latent_variable_analysis.py
```

This will:
- Generate visualizations of place fields, head direction tuning, and other spatial representations (a tuning-curve sketch follows this list)
- Perform perturbation studies to test the importance of different environmental cues
- Create summary plots and statistics for the paper figures
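As a hedged illustration of the head-direction analysis (again, not the repository's actual code), one can bin the robot's heading into angular bins and average a latent unit's activation per bin; a single-peaked tuning curve indicates head-direction-like coding. The bin count and function name are assumptions.

```python
import numpy as np

def hd_tuning_curve(headings, activations, n_bins=36):
    """Average a latent unit's activation per heading bin (headings in radians, [-pi, pi))."""
    edges = np.linspace(-np.pi, np.pi, n_bins + 1)
    bin_idx = np.digitize(headings, edges) - 1
    return np.array([activations[bin_idx == i].mean() if np.any(bin_idx == i) else np.nan
                     for i in range(n_bins)])

# Synthetic check: a unit tuned to a heading of +pi/2
rng = np.random.default_rng(1)
theta = rng.uniform(-np.pi, np.pi, 10000)
act = np.cos(theta - np.pi / 2) + 1.0              # peaks at +pi/2
print(np.nanargmax(hd_tuning_curve(theta, act)))   # bin near +pi/2
```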
The project tests multiple configurations:
- 3 different background environments
- 3 random seeds per environment
- 3 different VAE architectures
- 2 transformation directions (FPP→GMP and GMP→FPP)
- Various perturbation conditions (arena rotation, cylinder rotation, background rotation)
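That is 3 × 3 × 3 × 2 = 54 base training runs before the perturbation conditions. A small sketch of how such a grid could be enumerated (the lists mirror the configurations above; the specific seed values are an assumption):

```python
from itertools import product

backgrounds = ["empty_office", "stadium", "entrance_hall"]
seeds = [0, 1, 2]                       # assumed seed values
architectures = ["single", "stacked", "rnn"]
directions = ["fpp_to_gmp", "gmp_to_fpp"]

configs = list(product(backgrounds, seeds, architectures, directions))
print(len(configs))  # 3 * 3 * 3 * 2 = 54 base runs before perturbations
```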
If you use this code in your research, please cite our paper:
```bibtex
@article{lui2023spatial,
  title={Spatial Representations Emerge in a Model Linking Global Maps to First-Person Perspectives},
  author={Lui, Hin Wai and Chrastil, Elizabeth R. and Nitz, Douglas A. and Krichmar, Jeffrey L.},
  journal={},
  year={2023}
}
```