Brainmu0

A universal neural foundation model for tokenized brain activity representation

Code is coming soon.

Overview

brainmu0 is a general-purpose foundation model framework for modeling brain activity across neural recording modalities. It is designed to provide a unified representation space for diverse neural signals, including EEG, calcium imaging, and in vivo electrophysiology.

Instead of building separate models for each modality or task, brainmu0 follows a token-based modeling paradigm: raw neural signals are first converted into compact neural activity tokens, and these tokens are then used by a sequence model for representation learning and downstream prediction.

The goal of brainmu0 is to serve as a reusable backbone for neuroscience research, enabling cross-modality, cross-task, cross-individual, and potentially cross-species neural data modeling.

Core Idea

Neural recordings are complex, high-dimensional, and heterogeneous. Different experimental systems often produce signals with different sampling rates, spatial layouts, noise profiles, and biological interpretations.

brainmu0 addresses this by introducing a shared intermediate representation:

raw neural signals → neural activity tokens → foundation model → task-specific predictions

This design allows different types of neural data to be modeled within a common computational framework.

Architecture Design

At a high level, brainmu0 consists of three conceptual components:

  1. Brain Tokenizer
    Converts raw neural recordings into sequences of discrete neural activity tokens.

  2. Neural Sequence Backbone
    Learns generalizable representations from tokenized neural activity sequences.

  3. Task-Specific Heads
    Adapt the learned representations to downstream neuroscience or biomedical prediction tasks.
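
Since the code is not yet released, the following is only a minimal sketch of how the three components could compose, not the actual brainmu0 API. The names `Pipeline`, `toy_tokenizer`, `toy_backbone`, and `toy_head`, as well as the channels-by-time signal layout, are assumptions made for illustration; the toy functions stand in for learned models.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence

# Hypothetical data layout: a recording is a list of channels,
# each channel a list of samples over time.
Signal = Sequence[Sequence[float]]
Tokens = List[int]
Embedding = List[float]


@dataclass
class Pipeline:
    """Chains tokenizer -> backbone -> head, mirroring the three components."""
    tokenizer: Callable[[Signal], Tokens]
    backbone: Callable[[Tokens], Embedding]
    head: Callable[[Embedding], str]

    def predict(self, signal: Signal) -> str:
        tokens = self.tokenizer(signal)
        features = self.backbone(tokens)
        return self.head(features)


def toy_tokenizer(signal: Signal) -> Tokens:
    # One token per time step: 1 if the mean across channels is positive, else 0.
    n_steps = len(signal[0])
    return [int(sum(ch[t] for ch in signal) / len(signal) > 0) for t in range(n_steps)]


def toy_backbone(tokens: Tokens) -> Embedding:
    # Stand-in "representation": the fraction of each token id in the sequence.
    n = len(tokens)
    return [tokens.count(0) / n, tokens.count(1) / n]


def toy_head(features: Embedding) -> str:
    # Stand-in task head: a fixed threshold instead of a trained classifier.
    return "active" if features[1] > 0.5 else "quiet"


pipeline = Pipeline(toy_tokenizer, toy_backbone, toy_head)
signal = [[0.2, -0.1, 0.4, 0.3], [0.1, -0.3, 0.2, 0.5]]
label = pipeline.predict(signal)
```

Keeping the three stages behind plain callables means a different head can be swapped in for a new task without touching the tokenizer or backbone, which is the reuse pattern the architecture describes.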

The framework supports a two-stage training strategy:

  • Pre-training: learn general neural representations from large-scale multimodal neural datasets.
  • Fine-tuning: adapt the model to specific downstream tasks, such as brain-state classification, event detection, or disease-related prediction.

Brain Tokenizer

The tokenizer is the key interface between biological neural signals and the foundation model.

Its role is to transform continuous, noisy, and modality-specific neural recordings into a compact sequence of neural activity tokens. These tokens preserve essential spatiotemporal information while making the data suitable for scalable sequence modeling.

The tokenizer is designed to:

  • compress raw neural signals into a token sequence;
  • capture local temporal dynamics and spatial structure;
  • reduce modality-specific variability;
  • provide a shared representation format for EEG, calcium imaging, and electrophysiology;
  • support downstream transfer across tasks and datasets.

By representing neural activity as tokens, brainmu0 enables neural data to be processed using a unified modeling paradigm similar to modern foundation models in language, vision, and multimodal learning.
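
The README does not specify how the tokenizer works. As one illustrative possibility, the sketch below cuts a single-channel signal into fixed-length patches and maps each patch to the index of its nearest codebook vector (vector quantization). The function names, the patch length, and the hand-written `codebook` are all assumptions; a real tokenizer would presumably learn its codebook from data.

```python
from typing import List, Sequence


def split_into_patches(signal: Sequence[float], patch_len: int) -> List[List[float]]:
    """Cut a 1-D signal into non-overlapping patches, dropping any trailing remainder."""
    n_patches = len(signal) // patch_len
    return [list(signal[i * patch_len:(i + 1) * patch_len]) for i in range(n_patches)]


def squared_distance(a: Sequence[float], b: Sequence[float]) -> float:
    return sum((x - y) ** 2 for x, y in zip(a, b))


def quantize(patches: List[List[float]], codebook: List[List[float]]) -> List[int]:
    """Map each patch to the index of its nearest codebook vector."""
    return [
        min(range(len(codebook)), key=lambda k: squared_distance(patch, codebook[k]))
        for patch in patches
    ]


# Hypothetical hand-written codebook (illustration only, not learned).
codebook = [
    [0.0, 0.0, 0.0, 0.0],    # token 0: flat
    [0.0, 1.0, 0.0, -1.0],   # token 1: one oscillation cycle
    [1.0, 1.0, 1.0, 1.0],    # token 2: sustained high activity
]

signal = [0.1, -0.1, 0.0, 0.1,
          0.1, 0.9, -0.1, -1.1,
          0.9, 1.1, 1.0, 0.8,
          0.3]  # trailing sample does not fill a patch and is dropped

tokens = quantize(split_into_patches(signal, patch_len=4), codebook)
```

Here 13 continuous samples are compressed into three integer tokens, each summarizing the local temporal shape of one patch.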

Training Paradigm

brainmu0 is designed around a general pre-training and fine-tuning workflow.

During pre-training, the model learns from large-scale neural recordings across modalities. The objective is to capture reusable neural dynamics and build general-purpose representations of brain activity.
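
The pre-training objective is not stated here. One common self-supervised choice for token sequences is masked-token prediction, where some tokens are hidden and the model is trained to recover them. The sketch below only prepares such training examples; `MASK_ID`, `mask_tokens`, and the mask ratio are hypothetical and not taken from brainmu0.

```python
import random
from typing import List, Tuple

MASK_ID = -1  # hypothetical reserved id marking a hidden position


def mask_tokens(
    tokens: List[int], mask_ratio: float, rng: random.Random
) -> Tuple[List[int], List[Tuple[int, int]]]:
    """Hide a random subset of tokens.

    Returns the corrupted sequence and a list of (position, original_token)
    pairs that a model would be trained to predict.
    """
    n_mask = max(1, round(len(tokens) * mask_ratio))
    positions = sorted(rng.sample(range(len(tokens)), n_mask))
    corrupted = list(tokens)
    targets = []
    for pos in positions:
        targets.append((pos, tokens[pos]))
        corrupted[pos] = MASK_ID
    return corrupted, targets


rng = random.Random(0)  # fixed seed so the example is reproducible
tokens = [5, 2, 7, 7, 1, 3, 0, 4]
corrupted, targets = mask_tokens(tokens, mask_ratio=0.25, rng=rng)
```

Because the targets come from the recording itself, this kind of objective needs no labels, which is what makes large unlabeled multimodal datasets usable for pre-training.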

During fine-tuning, the pre-trained model can be adapted to specific research tasks with task-specific prediction heads. Depending on the task, these heads may be used for classification, regression, event detection, or other supervised objectives.
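
To make the idea of task-specific heads concrete, the sketch below shows a classification head and a regression head reading from the same pooled backbone output. It assumes the backbone emits one embedding per token and that mean pooling is used to summarize a sequence; the function names, weights, and dimensions are hypothetical, and the gradient-based fitting of the head is omitted.

```python
import math
from typing import List


def mean_pool(token_embeddings: List[List[float]]) -> List[float]:
    """Average per-token embeddings into one vector for the whole sequence."""
    n = len(token_embeddings)
    return [sum(col) / n for col in zip(*token_embeddings)]


def linear(x: List[float], weights: List[List[float]], bias: List[float]) -> List[float]:
    return [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weights, bias)]


def softmax(logits: List[float]) -> List[float]:
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classification_head(pooled, weights, bias) -> List[float]:
    """Class probabilities, e.g. for brain-state classification."""
    return softmax(linear(pooled, weights, bias))


def regression_head(pooled, weights, bias) -> float:
    """A single continuous output."""
    return linear(pooled, [weights], [bias])[0]


# Stand-in backbone output: 2 tokens, each with a 2-dimensional embedding.
token_embeddings = [[1.0, 0.0], [3.0, 2.0]]
pooled = mean_pool(token_embeddings)

probs = classification_head(pooled, weights=[[1.0, 0.0], [0.0, 1.0]], bias=[0.0, 0.0])
score = regression_head(pooled, weights=[0.5, -1.0], bias=0.25)
```

Both heads consume the same `pooled` vector, so only the small head parameters differ between tasks while the backbone representation is shared.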

Example Applications

brainmu0 can be used as a foundation model backbone for a wide range of neuroscience tasks, including:

  • EEG-based brain-state classification;
  • memory-related sleep state analysis;
  • calcium activity modeling;
  • arousal and transition-state prediction;
  • neural event detection;
  • cross-dataset neural representation learning;
  • biomedical and neurological signal analysis.

License information will be provided with the first public release.