A universal neural foundation model for tokenized brain activity representation
Code is coming soon.
brainmu0 is a general-purpose foundation model framework for modeling brain activity across neural recording modalities. It is designed to provide a unified representation space for diverse neural signals, including EEG, calcium imaging, and in vivo electrophysiology.
Instead of building separate models for each modality or task, brainmu0 follows a token-based modeling paradigm: raw neural signals are first converted into compact neural activity tokens, and these tokens are then used by a sequence model for representation learning and downstream prediction.
The goal of brainmu0 is to serve as a reusable backbone for neuroscience research, enabling cross-modality, cross-task, cross-individual, and potentially cross-species neural data modeling.
Neural recordings are complex, high-dimensional, and heterogeneous. Different experimental systems often produce signals with different sampling rates, spatial layouts, noise profiles, and biological interpretations.
brainmu0 addresses this by introducing a shared intermediate representation:
raw neural signals → neural activity tokens → foundation model → task-specific predictions
This design allows different types of neural data to be modeled within a common computational framework.
At a high level, brainmu0 consists of three conceptual components:
- Brain Tokenizer: Converts raw neural recordings into sequences of discrete neural activity tokens.
- Neural Sequence Backbone: Learns generalizable representations from tokenized neural activity sequences.
- Task-Specific Heads: Adapts the learned representations to downstream neuroscience or biomedical prediction tasks.
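Since the brainmu0 code is not yet released, the following is only a minimal sketch of how the three components above could compose into the `raw neural signals → neural activity tokens → foundation model → task-specific predictions` flow. Every name here (`Pipeline`, `sign_tokenizer`, `histogram_backbone`, `argmax_head`) is hypothetical, and the toy stand-ins are deliberately trivial; they are not the project's actual tokenizer, backbone, or heads.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence

Tokens = List[int]
Features = List[float]


@dataclass
class Pipeline:
    """Hypothetical composition of tokenizer, backbone, and task head."""
    tokenizer: Callable[[Sequence[float]], Tokens]
    backbone: Callable[[Tokens], Features]
    head: Callable[[Features], int]

    def predict(self, raw_signal: Sequence[float]) -> int:
        tokens = self.tokenizer(raw_signal)   # raw signal -> tokens
        features = self.backbone(tokens)      # tokens -> representation
        return self.head(features)            # representation -> prediction


# Toy stand-ins (assumptions for illustration only).
def sign_tokenizer(signal: Sequence[float]) -> Tokens:
    """Map each sample to token 1 if positive, else token 0."""
    return [1 if x > 0 else 0 for x in signal]


def histogram_backbone(tokens: Tokens, vocab_size: int = 2) -> Features:
    """Represent a sequence by its normalized token counts."""
    counts = [0.0] * vocab_size
    for t in tokens:
        counts[t] += 1.0
    total = max(len(tokens), 1)
    return [c / total for c in counts]


def argmax_head(features: Features) -> int:
    """Predict the index of the largest feature (first index wins ties)."""
    return max(range(len(features)), key=lambda i: features[i])


pipeline = Pipeline(sign_tokenizer, histogram_backbone, argmax_head)
print(pipeline.predict([0.2, 0.5, -0.1]))  # 1
```

Because the pipeline only depends on the three call signatures, a different tokenizer or head could be swapped in without touching the backbone, which is the kind of reuse the component split is meant to allow.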
The framework supports a two-stage training strategy:
- Pre-training: learn general neural representations from large-scale multimodal neural datasets.
- Fine-tuning: adapt the model to specific downstream tasks, such as brain-state classification, event detection, or disease-related prediction.
The tokenizer is the key interface between biological neural signals and the foundation model.
Its role is to transform continuous, noisy, and modality-specific neural recordings into a compact sequence of neural activity tokens. These tokens preserve essential spatiotemporal information while making the data suitable for scalable sequence modeling.
The tokenizer is designed to:
- compress raw neural signals into a token sequence;
- capture local temporal dynamics and spatial structure;
- reduce modality-specific variability;
- provide a shared representation format for EEG, calcium imaging, and electrophysiology;
- support downstream transfer across tasks and datasets.
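To make the idea of compressing a continuous recording into a token sequence concrete, here is a small sketch under stated assumptions. It uses a stand-in technique (uniform quantization of non-overlapping window means on a single channel), not the brainmu0 tokenizer, whose design has not been published. The function name `tokenize_windows` and the parameters `window`, `n_bins`, `lo`, and `hi` are hypothetical.

```python
from typing import List, Sequence


def tokenize_windows(
    signal: Sequence[float],
    window: int,
    n_bins: int,
    lo: float,
    hi: float,
) -> List[int]:
    """Toy tokenizer: one token per non-overlapping window.

    Each window is summarized by its mean amplitude, which is clipped to
    [lo, hi] and mapped to one of n_bins equally wide bins. Trailing
    samples that do not fill a whole window are dropped.
    """
    if window <= 0 or n_bins <= 0 or hi <= lo:
        raise ValueError("window and n_bins must be positive, and hi > lo")

    tokens: List[int] = []
    for start in range(0, len(signal) - window + 1, window):
        chunk = signal[start:start + window]
        mean = sum(chunk) / window
        clipped = min(max(mean, lo), hi)
        index = int((clipped - lo) / (hi - lo) * n_bins)
        tokens.append(min(index, n_bins - 1))  # hi itself falls in the last bin
    return tokens


samples = [0.0, 0.0, 1.0, 1.0, -1.0, -1.0, 0.5]
print(tokenize_windows(samples, window=2, n_bins=4, lo=-1.0, hi=1.0))  # [2, 3, 0]
```

Seven samples become three tokens here, which illustrates the compression goal. A realistic tokenizer would also need to handle multiple channels, spatial layout, and differing sampling rates, none of which this sketch attempts.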
By representing neural activity as tokens, brainmu0 enables neural data to be processed using a unified modeling paradigm similar to modern foundation models in language, vision, and multimodal learning.
brainmu0 is designed around a general pre-training and fine-tuning workflow.
During pre-training, the model learns from large-scale neural recordings across modalities. The objective is to capture reusable neural dynamics and build general-purpose representations of brain activity.
During fine-tuning, the pre-trained model can be adapted to specific research tasks with task-specific prediction heads. Depending on the task, these heads may be used for classification, regression, event detection, or other supervised objectives.
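The fine-tuning stage can be illustrated with a small sketch in which the backbone stays fixed and only a task head is trained. This is an assumption made for illustration: the source does not say whether brainmu0 freezes the backbone during fine-tuning, and the backbone here is a hypothetical stand-in (a normalized token histogram) rather than a pre-trained sequence model. The head is a plain logistic-regression classifier trained with stochastic gradient descent; the names `frozen_backbone`, `fine_tune_head`, and `predict` are all hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def frozen_backbone(tokens: Sequence[int], vocab_size: int) -> List[float]:
    """Stand-in for a pre-trained backbone: normalized token histogram."""
    counts = [0.0] * vocab_size
    for t in tokens:
        counts[t] += 1.0
    n = max(len(tokens), 1)
    return [c / n for c in counts]


def fine_tune_head(
    examples: Sequence[Tuple[Sequence[int], int]],
    vocab_size: int,
    epochs: int = 200,
    lr: float = 0.5,
) -> Tuple[List[float], float]:
    """Train a binary logistic head; the backbone is never updated."""
    weights = [0.0] * vocab_size
    bias = 0.0
    # Backbone features are computed once because the backbone is fixed.
    data = [(frozen_backbone(tokens, vocab_size), label) for tokens, label in examples]
    for _ in range(epochs):
        for x, y in data:
            z = sum(w * xi for w, xi in zip(weights, x)) + bias
            p = 1.0 / (1.0 + math.exp(-z))
            err = p - y  # gradient of the log loss with respect to z
            weights = [w - lr * err * xi for w, xi in zip(weights, x)]
            bias -= lr * err
    return weights, bias


def predict(tokens: Sequence[int], weights: List[float], bias: float) -> int:
    """Classify a token sequence with the fine-tuned head."""
    x = frozen_backbone(tokens, len(weights))
    z = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1 if z > 0 else 0


# Toy labeled data: class 1 sequences contain mostly token 1.
train = [([0, 0, 0, 1], 0), ([0, 0, 1, 0], 0), ([1, 1, 1, 0], 1), ([1, 1, 0, 1], 1)]
weights, bias = fine_tune_head(train, vocab_size=2)
print([predict(tokens, weights, bias) for tokens, _ in train])
```

Training only the head keeps the number of task-specific parameters small, which is one common way to adapt a shared backbone to several downstream tasks. Updating the backbone as well is an equally plausible choice that this sketch does not cover.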
brainmu0 can be used as a foundation model backbone for a wide range of neuroscience tasks, including:
- EEG-based brain-state classification;
- memory-related sleep state analysis;
- calcium activity modeling;
- arousal and transition-state prediction;
- neural event detection;
- cross-dataset neural representation learning;
- biomedical and neurological signal analysis.
License information will be provided with the first public release.