BAAI-Brain-Inspired-Group/Brainmu

Brainmu0

A universal neural foundation model for tokenized brain activity representation

Code is coming soon.

Overview

Brainmu0 is a general-purpose foundation model framework for modeling brain activity across neural recording modalities. It is designed to provide a unified representation space for diverse neural signals, including EEG, calcium imaging, and in vivo electrophysiology.

Instead of building separate models for each modality or task, brainmu0 follows a token-based modeling paradigm: raw neural signals are first converted into compact neural activity tokens, and these tokens are then used by a sequence model for representation learning and downstream prediction.

The goal of brainmu0 is to serve as a reusable backbone for neuroscience research, enabling cross-modality, cross-task, cross-individual, and potentially cross-species neural data modeling.

Core Idea

Neural recordings are complex, high-dimensional, and heterogeneous. Different experimental systems often produce signals with different sampling rates, spatial layouts, noise profiles, and biological interpretations.

brainmu0 addresses this by introducing a shared intermediate representation:

raw neural signals → neural activity tokens → foundation model → task-specific predictions

This design allows different types of neural data to be modeled within a common computational framework.
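Since no code has been released yet, the following is only a minimal sketch of how the four-step flow above could be wired together as a composition of three stages. All names (`make_pipeline`, `toy_tokenizer`, `toy_backbone`, `toy_head`) and the toy logic inside each stage are illustrative assumptions, not the brainmu0 API.

```python
from typing import Callable, List, Sequence

# Hypothetical type aliases; the real data formats are not published yet.
Signal = Sequence[Sequence[float]]   # channels x time samples
Tokens = List[int]                   # discrete neural activity tokens
Representation = List[float]         # backbone output


def make_pipeline(
    tokenizer: Callable[[Signal], Tokens],
    backbone: Callable[[Tokens], Representation],
    head: Callable[[Representation], str],
) -> Callable[[Signal], str]:
    """Chain the stages: raw signal -> tokens -> representation -> prediction."""
    def run(signal: Signal) -> str:
        tokens = tokenizer(signal)
        representation = backbone(tokens)
        return head(representation)
    return run


# Toy stand-ins for each stage (assumptions, for illustration only).
def toy_tokenizer(signal: Signal) -> Tokens:
    # One token per time step: 1 if the mean across channels is positive, else 0.
    n_channels = len(signal)
    n_samples = len(signal[0])
    return [
        1 if sum(ch[t] for ch in signal) / n_channels > 0 else 0
        for t in range(n_samples)
    ]


def toy_backbone(tokens: Tokens) -> Representation:
    # A one-dimensional "representation": the fraction of active tokens.
    return [sum(tokens) / len(tokens)]


def toy_head(representation: Representation) -> str:
    # Task-specific decision on top of the shared representation.
    return "active" if representation[0] >= 0.5 else "quiet"


predict = make_pipeline(toy_tokenizer, toy_backbone, toy_head)
signal = [[0.2, -0.1, 0.4, 0.3], [0.1, -0.3, 0.2, -0.5]]  # 2 channels, 4 samples
print(predict(signal))  # prints "active"
```

Because the stages only communicate through tokens and representations, a modality-specific tokenizer or a different head could in principle be swapped in without touching the other stages.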

Architecture Design

At a high level, brainmu0 consists of three conceptual components:

  1. Brain Tokenizer
    Converts raw neural recordings into sequences of discrete neural activity tokens.

  2. Neural Sequence Backbone
    Learns generalizable representations from tokenized neural activity sequences.

  3. Task-Specific Heads
    Adapt the learned representations to downstream neuroscience or biomedical prediction tasks.

The framework supports a two-stage training strategy:

  • Pre-training: learn general neural representations from large-scale multimodal neural datasets.
  • Fine-tuning: adapt the model to specific downstream tasks, such as brain-state classification, event detection, or disease-related prediction.
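The two stages can be illustrated with a deliberately tiny stand-in. The source does not state the pre-training objective, so this sketch assumes next-token prediction, implemented here as smoothed transition counts; the fine-tuning stage then fits a nearest-centroid head on a feature from the frozen "backbone". The functions `pretrain`, `embed`, and `finetune`, the labels, and the data are all hypothetical.

```python
import math
from collections import Counter, defaultdict


def pretrain(unlabeled, vocab_size):
    """Stage 1 (toy): estimate next-token probabilities from unlabeled sequences."""
    counts = defaultdict(Counter)
    for seq in unlabeled:
        for prev, nxt in zip(seq, seq[1:]):
            counts[prev][nxt] += 1

    def prob(prev, nxt):
        # Add-one smoothing so unseen transitions keep a non-zero probability.
        return (counts[prev][nxt] + 1) / (sum(counts[prev].values()) + vocab_size)

    return prob


def embed(prob, seq):
    """Frozen backbone feature: mean surprise (negative log-probability) per transition."""
    pairs = list(zip(seq, seq[1:]))
    return sum(-math.log(prob(a, b)) for a, b in pairs) / len(pairs)


def finetune(prob, labeled):
    """Stage 2 (toy): fit a nearest-centroid head on features from the frozen backbone."""
    features_by_label = defaultdict(list)
    for seq, label in labeled:
        features_by_label[label].append(embed(prob, seq))
    centroids = {
        label: sum(values) / len(values)
        for label, values in features_by_label.items()
    }

    def predict(seq):
        feature = embed(prob, seq)
        return min(centroids, key=lambda label: abs(centroids[label] - feature))

    return predict


# Stage 1: many unlabeled token sequences (here, two alternating patterns).
unlabeled = [[0, 1, 0, 1, 0, 1], [1, 0, 1, 0, 1, 0]]
backbone = pretrain(unlabeled, vocab_size=2)

# Stage 2: a handful of labeled sequences for one downstream task.
labeled = [([0, 1, 0, 1], "regular"), ([0, 0, 1, 1], "irregular")]
predict = finetune(backbone, labeled)

print(predict([1, 0, 1, 0, 1]))     # prints "regular"
print(predict([1, 1, 0, 0, 1, 1]))  # prints "irregular"
```

The point of the sketch is the division of labor: the first stage never sees labels, and the second stage only fits a small head on top of what the first stage learned.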

Brain Tokenizer

The tokenizer is the key interface between biological neural signals and the foundation model.

Its role is to transform continuous, noisy, and modality-specific neural recordings into a compact sequence of neural activity tokens. These tokens preserve essential spatiotemporal information while making the data suitable for scalable sequence modeling.

The tokenizer is designed to:

  • compress raw neural signals into a token sequence;
  • capture local temporal dynamics and spatial structure;
  • reduce modality-specific variability;
  • provide a shared representation format for EEG, calcium imaging, and electrophysiology;
  • support downstream transfer across tasks and datasets.

By representing neural activity as tokens, brainmu0 enables neural data to be processed using a unified modeling paradigm similar to modern foundation models in language, vision, and multimodal learning.
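As a rough illustration of the properties listed above, the sketch below z-scores each channel (reducing scale differences between recordings), averages over non-overlapping windows (compression), and quantizes the result into a small discrete vocabulary. This is an assumed, simplified scheme and not the brainmu0 tokenizer; the function name `tokenize` and the parameters `window`, `n_bins`, and `clip` are hypothetical. Spatial structure is collapsed here for brevity.

```python
import statistics
from typing import List, Sequence


def tokenize(
    signal: Sequence[Sequence[float]],
    window: int = 4,
    n_bins: int = 8,
    clip: float = 3.0,
) -> List[int]:
    """Toy tokenizer: per-channel z-score, window averaging, uniform quantization."""
    normalized = []
    for channel in signal:
        mean = statistics.fmean(channel)
        std = statistics.pstdev(channel) or 1.0  # flat channels: avoid dividing by zero
        normalized.append([(x - mean) / std for x in channel])

    n_samples = len(signal[0])
    tokens = []
    for start in range(0, n_samples - window + 1, window):
        # Pool all channels inside the window (spatial layout is ignored in this sketch).
        values = [x for ch in normalized for x in ch[start:start + window]]
        level = max(-clip, min(clip, statistics.fmean(values)))
        # Map the clipped level from [-clip, clip] onto token ids 0 .. n_bins - 1.
        token = min(n_bins - 1, int((level + clip) / (2 * clip) * n_bins))
        tokens.append(token)
    return tokens


# The same waveform recorded at two very different amplitude scales.
small_scale = [[0, 0, 0, 0, 1, 1, 1, 1]]
large_scale = [[0, 0, 0, 0, 1000, 1000, 1000, 1000]]

print(tokenize(small_scale))  # prints [2, 5]
print(tokenize(large_scale))  # prints [2, 5]
```

Eight samples become two tokens, and both recordings map to the same token sequence despite the thousandfold difference in amplitude, which is the kind of shared representation format the tokenizer is meant to provide.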

Training Paradigm

brainmu0 is designed around a general pre-training and fine-tuning workflow.

During pre-training, the model learns from large-scale neural recordings across modalities. The objective is to capture reusable neural dynamics and build general-purpose representations of brain activity.

During fine-tuning, the pre-trained model can be adapted to specific research tasks with task-specific prediction heads. Depending on the task, these heads may be used for classification, regression, event detection, or other supervised objectives.
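The head types mentioned above can be pictured as small functions that all read the same backbone representation. The sketch below uses hand-set linear weights rather than learned ones, and every name, label, and number in it is a hypothetical placeholder rather than part of brainmu0.

```python
from typing import List, Sequence


def linear_score(weights: Sequence[float], representation: Sequence[float]) -> float:
    """Dot product between a weight vector and a representation."""
    return sum(w * x for w, x in zip(weights, representation))


def classification_head(
    representation: Sequence[float],
    class_weights: Sequence[Sequence[float]],
    labels: Sequence[str],
) -> str:
    """Return the label whose linear score is highest."""
    scores = [linear_score(row, representation) for row in class_weights]
    return labels[scores.index(max(scores))]


def regression_head(
    representation: Sequence[float],
    weights: Sequence[float],
    bias: float,
) -> float:
    """Return a single continuous value."""
    return linear_score(weights, representation) + bias


def event_detection_head(
    representations_over_time: Sequence[Sequence[float]],
    weights: Sequence[float],
    threshold: float,
) -> List[int]:
    """Return the time-step indices whose score exceeds the threshold."""
    return [
        t
        for t, representation in enumerate(representations_over_time)
        if linear_score(weights, representation) > threshold
    ]


# One shared representation, three different task heads.
rep = [0.2, 0.9]
state = classification_head(rep, [[1.0, 0.0], [0.0, 1.0]], ["wake", "sleep"])
value = regression_head(rep, [0.5, 0.5], bias=0.0)
events = event_detection_head(
    [[0.1, 0.0], [0.8, 0.7], [0.2, 0.1]], weights=[1.0, 1.0], threshold=1.0
)

print(state)   # prints "sleep"
print(events)  # prints [1]
```

Keeping the heads this thin means that most of the capacity sits in the shared backbone, so adding a new task mostly amounts to adding a new head.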

Example Applications

brainmu0 is intended to serve as a foundation model backbone for a range of neuroscience tasks, including:

  • EEG-based brain-state classification;
  • memory-related sleep-state analysis;
  • calcium activity modeling;
  • arousal and transition-state prediction;
  • neural event detection;
  • cross-dataset neural representation learning;
  • biomedical and neurological signal analysis.

License information will be provided with the first public release.
