Publish Date | Title | Authors | URL | Abstract |
---|---|---|---|---|
2025-10-21 | A Unified Perspective on Optimization in Machine Learning and Neuroscience: From Gradient Descent to Neural Adaptation | Jesús García Fernández, Nasir Ahmad, Marcel van Gerven | Link | Iterative optimization is central to modern artificial intelligence (AI) and provides a crucial framework for understanding adaptive systems. This review provides a unified perspective on this subject, bridging classic theory with neural network training and biological learning. Although gradient-based methods, powered by the efficient but biologically implausible backpropagation (BP), dominate machine learning, their computational demands can hinder scalability in high-dimensional settings. In contrast, derivative-free or zeroth-order (ZO) optimization feature computationally lighter approaches that rely only on function evaluations and randomness. While generally less sample efficient, recent breakthroughs demonstrate that modern ZO methods can effectively approximate gradients and achieve performance competitive with BP in neural network models. This ZO paradigm is also particularly relevant for biology. Its core principles of random exploration (probing) and feedback-guided adaptation (reinforcing) parallel key mechanisms of biological learning, offering a mathematically principled perspective on how the brain learns. In this review, we begin by categorizing optimization approaches based on the order of derivative information they utilize, ranging from first-, second-, and higher-order gradient-based to ZO methods. We then explore how these methods are adapted to the unique challenges of neural network training and the resulting learning dynamics. Finally, we build upon these insights to view biological learning through an optimization lens, arguing that a ZO paradigm leverages the brain's intrinsic noise as a computational resource. This framework not only illuminates our understanding of natural intelligence but also holds vast implications for neuromorphic hardware, helping us design fast and energy-efficient AI systems that exploit intrinsic hardware noise. |
2025-10-21 | Topoformer: brain-like topographic organization in Transformer language models through spatial querying and reweighting | Taha Binhuraib, Greta Tuckute, Nicholas Blauch | Link | Spatial functional organization is a hallmark of biological brains: neurons are arranged topographically according to their response properties, at multiple scales. In contrast, representations within most machine learning models lack spatial biases, instead manifesting as disorganized vector spaces that are difficult to visualize and interpret. Here, we propose a novel form of self-attention that turns Transformers into "Topoformers" with topographic organization. We introduce spatial querying - where keys and queries are arranged on 2D grids, and local pools of queries are associated with a given key - and spatial reweighting, where we convert the standard fully connected layer of self-attention into a locally connected layer. We first demonstrate the feasibility of our approach by training a 1-layer Topoformer on a sentiment classification task. Training with spatial querying encourages topographic organization in the queries and keys, and spatial reweighting separately encourages topographic organization in the values and self-attention outputs. We then apply the Topoformer motifs at scale, training a BERT architecture with a masked language modeling objective. We find that the topographic variant performs on par with a non-topographic control model on NLP benchmarks, yet produces interpretable topographic organization as evaluated via eight linguistic test suites. Finally, analyzing an fMRI dataset of human brain responses to a large set of naturalistic sentences, we demonstrate alignment between low-dimensional topographic variability in the Topoformer model and human brain language network. Scaling up Topoformers further holds promise for greater interpretability in NLP research, and for more accurate models of the organization of linguistic information in the human brain. |
2025-10-21 | Decoding Dynamic Visual Experience from Calcium Imaging via Cell-Pattern-Aware SSL | Sangyoon Bae, Mehdi Azabou, Jiook Cha, Blake Richards | Link | Self-supervised learning (SSL) holds a great deal of promise for applications in neuroscience, due to the lack of large-scale, consistently labeled neural datasets. However, most neural datasets contain heterogeneous populations that mix stable, predictable cells with highly stochastic, stimulus-contingent ones, which has made it hard to identify consistent activity patterns during SSL. As a result, self-supervised pretraining has yet to show clear signs of benefits from scale on neural data. Here, we present a novel approach to self-supervised pretraining, POYO-SSL that exploits the heterogeneity of neural data to improve pre-training and achieve benefits of scale. Specifically, in POYO-SSL we pretrain only on predictable (statistically regular) neurons-identified on the pretraining split via simple higher-order statistics (skewness and kurtosis)-then we fine-tune on the unpredictable population for downstream tasks. On the Allen Brain Observatory dataset, this strategy yields approximately 12-13% relative gains over from-scratch training and exhibits smooth, monotonic scaling with model size. In contrast, existing state-of-the-art baselines plateau or destabilize as model size increases. By making predictability an explicit metric for crafting the data diet, POYO-SSL turns heterogeneity from a liability into an asset, providing a robust, biologically grounded recipe for scalable neural decoding and a path toward foundation models of neural dynamics. |
2025-10-21 | Entropy-Enhanced Conformal Features from Ricci Flow for Robust Alzheimer's Disease Classification | F. Ahmadi, B. Bidabad, H. Nasiri | Link | Background and Objective: In brain imaging, geometric surface models are essential for analyzing the 3D shapes of anatomical structures. Alzheimer's disease (AD) is associated with significant cortical atrophy, making such shape analysis a valuable diagnostic tool. The objective of this study is to introduce and validate a novel local surface representation method for the automated and accurate diagnosis of AD. Methods: The study utilizes T1-weighted MRI scans from 160 participants (80 AD patients and 80 healthy controls) from the Alzheimer's Disease Neuroimaging Initiative (ADNI). Cortical surface models were reconstructed from the MRI data using Freesurfer. Key geometric attributes were computed from the 3D meshes. Area distortion and conformal factor were derived using Ricci flow for conformal parameterization, while Gaussian curvature was calculated directly from the mesh geometry. Shannon entropy was applied to these three features to create compact and informative feature vectors. The feature vectors were used to train and evaluate a suite of classifiers (e.g. XGBoost, MLP, Logistic Regression, etc.). Results: Statistical significance of performance differences between classifiers was evaluated using paired Welch's t-test. The method proved highly effective in distinguishing AD patients from healthy controls. The Multi-Layer Perceptron (MLP) and Logistic Regression classifiers outperformed all others, achieving an accuracy and F$_1$ Score of 98.62%. Conclusions: This study confirms that the entropy of conformally-derived geometric features provides a powerful and robust metric for cortical morphometry. The high classification accuracy underscores the method's potential to enhance the study and diagnosis of Alzheimer's disease, offering a straightforward yet powerful tool for clinical research applications. |
2025-10-21 | Quantification of dual-state 5-ALA-induced PpIX fluorescence: Methodology and validation in tissue-mimicking phantoms | Silvère Ségaud, Charlie Budd, Matthew Elliot, Graeme Stasiuk, Jonathan Shapey, Yijing Xie, Tom Vercauteren | Link | Quantification of protoporphyrin IX (PpIX) fluorescence in human brain tumours has the potential to significantly improve patient outcomes in neuro-oncology, but represents a formidable imaging challenge. Protoporphyrin is a biological molecule which interacts with the tissue micro-environment to form two photochemical states in glioma. Each exhibits markedly different quantum efficiencies, with distinct but overlapping emission spectra that also overlap with tissue autofluorescence. Fluorescence emission is known to be distorted by the intrinsic optical properties of tissue, coupled with marked intra-tumoural heterogeneity as a hallmark of glioma tumours. Existing quantitative fluorescence systems are developed and validated using simplified phantoms that do not simultaneously mimic the complex interactions between fluorophores and tissue optical properties or micro-environment. Consequently, existing systems risk introducing systematic errors into PpIX quantification when used in tissue. In this work, we introduce a novel pipeline for quantification of PpIX in glioma, which robustly differentiates both emission states from background autofluorescence without reliance on a priori spectral information, and accounts for variations in their quantum efficiency. Unmixed PpIX emission forms are then corrected for wavelength-dependent optical distortions and weighted for accurate quantification. Significantly, this pipeline is developed and validated using novel tissue-mimicking phantoms replicating the optical properties of glioma tissues and photochemical variability of PpIX fluorescence in glioma. Our workflow achieves strong correlation with ground-truth PpIX concentrations (R2 = 0.918+-0.002), demonstrating its potential for robust, quantitative PpIX fluorescence imaging in clinical settings. |
2025-10-20 | MEG-GPT: A transformer-based foundation model for magnetoencephalography data | Rukuang Huang, Sungjun Cho, Chetan Gohil, Oiwi Parker Jones, Mark Woolrich | Link | Modelling the complex spatiotemporal patterns of large-scale brain dynamics is crucial for neuroscience, but traditional methods fail to capture the rich structure in modalities such as magnetoencephalography (MEG). Recent advances in deep learning have enabled significant progress in other domains, such as language and vision, by using foundation models at scale. Here, we introduce MEG-GPT, a transformer based foundation model that uses time-attention and next time-point prediction. To facilitate this, we also introduce a novel data-driven tokeniser for continuous MEG data, which preserves the high temporal resolution of continuous MEG signals without lossy transformations. We trained MEG-GPT on tokenised brain region time-courses extracted from a large-scale MEG dataset (N=612, eyes-closed rest, Cam-CAN data), and show that the learnt model can generate data with realistic spatio-spectral properties, including transient events and population variability. Critically, it performs well in downstream decoding tasks, improving downstream supervised prediction task, showing improved zero-shot generalisation across sessions (improving accuracy from 0.54 to 0.59) and subjects (improving accuracy from 0.41 to 0.49) compared to a baseline methods. Furthermore, we show the model can be efficiently fine-tuned on a smaller labelled dataset to boost performance in cross-subject decoding scenarios. This work establishes a powerful foundation model for electrophysiological data, paving the way for applications in computational neuroscience and neural decoding. |
2025-10-22 | Benchmarking Probabilistic Time Series Forecasting Models on Neural Activity | Ziyu Lu, Anna J. Li, Alexander E. Ladd, Pascha Matveev, Aditya Deole, Eric Shea-Brown, J. Nathan Kutz, Nicholas A. Steinmetz | Link | Neural activity forecasting is central to understanding neural systems and enabling closed-loop control. While deep learning has recently advanced the state-of-the-art in the time series forecasting literature, its application to neural activity forecasting remains limited. To bridge this gap, we systematically evaluated eight probabilistic deep learning models, including two foundation models, that have demonstrated strong performance on general forecasting benchmarks. We compared them against four classical statistical models and two baseline methods on spontaneous neural activity recorded from mouse cortex via widefield imaging. Across prediction horizons, several deep learning models consistently outperformed classical approaches, with the best model producing informative forecasts up to 1.5 seconds into the future. Our findings point toward future control applications and open new avenues for probing the intrinsic temporal structure of neural activity. |
2025-10-20 | Investigating Demographic Bias in Brain MRI Segmentation: A Comparative Study of Deep-Learning and Non-Deep-Learning Methods | Ghazal Danaee, Marc Niethammer, Jarrett Rushmore, Sylvain Bouix | Link | Deep-learning-based segmentation algorithms have substantially advanced the field of medical image analysis, particularly in structural delineations in MRIs. However, an important consideration is the intrinsic bias in the data. Concerns about unfairness, such as performance disparities based on sensitive attributes like race and sex, are increasingly urgent. In this work, we evaluate the results of three different segmentation models (UNesT, nnU-Net, and CoTr) and a traditional atlas-based method (ANTs), applied to segment the left and right nucleus accumbens (NAc) in MRI images. We utilize a dataset including four demographic subgroups: black female, black male, white female, and white male. We employ manually labeled gold-standard segmentations to train and test segmentation models. This study consists of two parts: the first assesses the segmentation performance of models, while the second measures the volumes they produce to evaluate the effects of race, sex, and their interaction. Fairness is quantitatively measured using a metric designed to quantify fairness in segmentation performance. Additionally, linear mixed models analyze the impact of demographic variables on segmentation accuracy and derived volumes. Training on the same race as the test subjects leads to significantly better segmentation accuracy for some models. ANTs and UNesT show notable improvements in segmentation accuracy when trained and tested on race-matched data, unlike nnU-Net, which demonstrates robust performance independent of demographic matching. Finally, we examine sex and race effects on the volume of the NAc using segmentations from the manual rater and from our biased models. Results reveal that the sex effects observed with manual segmentation can also be observed with biased models, whereas the race effects disappear in all but one model. |
2025-10-20 | Robobench: A Comprehensive Evaluation Benchmark for Multimodal Large Language Models as Embodied Brain | Yulin Luo, Chun-Kai Fan, Menghang Dong, Jiayu Shi, Mengdi Zhao, Bo-Wen Zhang, Cheng Chi, Jiaming Liu, Gaole Dai, Rongyu Zhang, Ruichuan An, Kun Wu, Zhengping Che, Shaoxuan Xie, Guocai Yao, Zhongxia Zhao, Pengwei Wang, Guang Liu, Zhongyuan Wang, Tiejun Huang, Shanghang Zhang | Link | Building robots that can perceive, reason, and act in dynamic, unstructured environments remains a core challenge. Recent embodied systems often adopt a dual-system paradigm, where System 2 handles high-level reasoning while System 1 executes low-level control. In this work, we refer to System 2 as the embodied brain, emphasizing its role as the cognitive core for reasoning and decision-making in manipulation tasks. Given this role, systematic evaluation of the embodied brain is essential. Yet existing benchmarks emphasize execution success, or when targeting high-level reasoning, suffer from incomplete dimensions and limited task realism, offering only a partial picture of cognitive capability. To bridge this gap, we introduce RoboBench, a benchmark that systematically evaluates multimodal large language models (MLLMs) as embodied brains. Motivated by the critical roles across the full manipulation pipeline, RoboBench defines five dimensions-instruction comprehension, perception reasoning, generalized planning, affordance prediction, and failure analysis-spanning 14 capabilities, 25 tasks, and 6092 QA pairs. To ensure realism, we curate datasets across diverse embodiments, attribute-rich objects, and multi-view scenes, drawing from large-scale real robotic data. For planning, RoboBench introduces an evaluation framework, MLLM-as-world-simulator. It evaluate embodied feasibility by simulating whether predicted plans can achieve critical object-state changes. Experiments on 14 MLLMs reveal fundamental limitations: difficulties with implicit instruction comprehension, spatiotemporal reasoning, cross-scenario planning, fine-grained affordance understanding, and execution failure diagnosis. RoboBench provides a comprehensive scaffold to quantify high-level cognition, and guide the development of next-generation embodied MLLMs. The project page is in https://robo-bench.github.io. |
2025-10-20 | Navigate in Demanding Missions: Integrating Human Intelligence and Brain-Inspired Intelligence | Xu He, Xiaolin Meng, Youdong Zhang, Lingfei Mo, Wenxuan Yin | Link | This perspective analyzes the intricate interplay among neuroscience, Brain-Inspired Intelligence (BII), and Brain-Inspired Navigation (BIN), revealing a current lack of cooperative relationship between Brain-Computer Interfaces (BCIs) and BIN fields. We advocate for the integration of neuromorphic-empowered BCI into BIN, thereby bolstering the unmanned systems' reliable navigation in demanding missions, such as deep space exploration, etc. We highlight that machine intelligence, reinforced by brain-inspired artificial consciousness, can extend human intelligence, with human intelligence mediated by neuromorphic-enabled BCI acting as a safeguard in case machine intelligence failures. This study also discusses the potentials of the proposed approach to enhance unmanned systems' capabilities and facilitate the diagnostics of spatial cognition disorders, while considering associated ethical and security concerns. |
Publish Date | Title | Authors | URL | Abstract |
---|---|---|---|---|
2025-10-20 | DAMSDAN: Distribution-Aware Multi-Source Domain Adaptation Network for Cross-Domain EEG-based Emotion Recognition | Fo Hu, Can Wang, Qinxu Zheng, Xusheng Yang, Bin Zhou, Gang Li, Yu Sun, Wen-an Zhang | Link | Significant inter-individual variability limits the generalization of EEG-based emotion recognition under cross-domain settings. We address two core challenges in multi-source adaptation: (1) dynamically modeling distributional heterogeneity across sources and quantifying their relevance to a target to reduce negative transfer; and (2) achieving fine-grained semantic consistency to strengthen class discrimination. We propose a distribution-aware multi-source domain adaptation network (DAMSDAN). DAMSDAN integrates prototype-based constraints with adversarial learning to drive the encoder toward discriminative, domain-invariant emotion representations. A domain-aware source weighting strategy based on maximum mean discrepancy (MMD) dynamically estimates inter-domain shifts and reweights source contributions. In addition, a prototype-guided conditional alignment module with dual pseudo-label interaction enhances pseudo-label reliability and enables category-level, fine-grained alignment, mitigating noise propagation and semantic drift. Experiments on SEED and SEED-IV show average accuracies of 94.86\% and 79.78\% for cross-subject, and 95.12\% and 83.15\% for cross-session protocols. On the large-scale FACED dataset, DAMSDAN achieves 82.88\% (cross-subject). Extensive ablations and interpretability analyses corroborate the effectiveness of the proposed framework for cross-domain EEG-based emotion recognition. |
2025-10-18 | NeurIPT: Foundation Model for Neural Interfaces | Zitao Fang, Chenxuan Li, Hongting Zhou, Shuyang Yu, Guodong Du, Ashwaq Qasem, Yang Lu, Jing Li, Junsong Zhang, Sim Kuan Goh | Link | Electroencephalography (EEG) has wide-ranging applications, from clinical diagnosis to brain-computer interfaces (BCIs). With the increasing volume and variety of EEG data, there has been growing interest in establishing foundation models (FMs) to scale up and generalize neural decoding. Despite showing early potential, applying FMs to EEG remains challenging due to substantial inter-subject, inter-task, and inter-condition variability, as well as diverse electrode configurations across recording setups. To tackle these open challenges, we propose NeurIPT, a foundation model developed for diverse EEG-based Neural Interfaces with a Pre-trained Transformer by capturing both homogeneous and heterogeneous spatio-temporal characteristics inherent in EEG signals. Temporally, we introduce Amplitude-Aware Masked Pretraining (AAMP), masking based on signal amplitude rather than random intervals, to learn robust representations across varying signal intensities beyond local interpolation. Moreover, this temporal representation is enhanced by a Progressive Mixture-of-Experts (PMoE) architecture, where specialized expert subnetworks are progressively introduced at deeper layers, adapting effectively to the diverse temporal characteristics of EEG signals. Spatially, NeurIPT leverages the 3D physical coordinates of electrodes, enabling effective transfer of embedding across varying EEG settings, and develops Intra-Inter Lobe Pooling (IILP) during fine-tuning to efficiently exploit regional brain features. Empirical evaluations across eight downstream BCI datasets, via fine-tuning, demonstrated NeurIPT consistently achieved state-of-the-art performance, highlighting its broad applicability and robust generalization. Our work pushes forward the state of FMs in EEG and offers insights into scalable and generalizable neural information processing systems. |
2025-10-17 | Decoding Listeners Identity: Person Identification from EEG Signals Using a Lightweight Spiking Transformer | Zheyuan Lin, Siqi Cai, Haizhou Li | Link | EEG-based person identification enables applications in security, personalized brain-computer interfaces (BCIs), and cognitive monitoring. However, existing techniques often rely on deep learning architectures at high computational cost, limiting their scope of applications. In this study, we propose a novel EEG person identification approach using spiking neural networks (SNNs) with a lightweight spiking transformer for efficiency and effectiveness. The proposed SNN model is capable of handling the temporal complexities inherent in EEG signals. On the EEG-Music Emotion Recognition Challenge dataset, the proposed model achieves 100% classification accuracy with less than 10% energy consumption of traditional deep neural networks. This study offers a promising direction for energy-efficient and high-performance BCIs. The source code is available at https://github.com/PatrickZLin/Decode-ListenerIdentity. |
2025-10-17 | Cortical-SSM: A Deep State Space Model for EEG and ECoG Motor Imagery Decoding | Shuntaro Suzuki, Shunya Nagashima, Masayuki Hirata, Komei Sugiura | Link | Classification of electroencephalogram (EEG) and electrocorticogram (ECoG) signals obtained during motor imagery (MI) has substantial application potential, including for communication assistance and rehabilitation support for patients with motor impairments. These signals remain inherently susceptible to physiological artifacts (e.g., eye blinking, swallowing), which pose persistent challenges. Although Transformer-based approaches for classifying EEG and ECoG signals have been widely adopted, they often struggle to capture fine-grained dependencies within them. To overcome these limitations, we propose Cortical-SSM, a novel architecture that extends deep state space models to capture integrated dependencies of EEG and ECoG signals across temporal, spatial, and frequency domains. We validated our method across three benchmarks: 1) two large-scale public MI EEG datasets containing more than 50 subjects, and 2) a clinical MI ECoG dataset recorded from a patient with amyotrophic lateral sclerosis. Our method outperformed baseline methods on the three benchmarks. Furthermore, visual explanations derived from our model indicate that it effectively captures neurophysiologically relevant regions of both EEG and ECoG signals. |
2025-10-16 | TRI-DEP: A Trimodal Comparative Study for Depression Detection Using Speech, Text, and EEG | Annisaa Fitri Nurfidausi, Eleonora Mancini, Paolo Torroni | Link | Depression is a widespread mental health disorder, yet its automatic detection remains challenging. Prior work has explored unimodal and multimodal approaches, with multimodal systems showing promise by leveraging complementary signals. However, existing studies are limited in scope, lack systematic comparisons of features, and suffer from inconsistent evaluation protocols. We address these gaps by systematically exploring feature representations and modelling strategies across EEG, together with speech and text. We evaluate handcrafted features versus pre-trained embeddings, assess the effectiveness of different neural encoders, compare unimodal, bimodal, and trimodal configurations, and analyse fusion strategies with attention to the role of EEG. Consistent subject-independent splits are applied to ensure robust, reproducible benchmarking. Our results show that (i) the combination of EEG, speech and text modalities enhances multimodal detection, (ii) pretrained embeddings outperform handcrafted features, and (iii) carefully designed trimodal models achieve state-of-the-art performance. Our work lays the groundwork for future research in multimodal depression detection. |
2025-10-16 | Using Information Geometry to Characterize Higher-Order Interactions in EEG | Eric Albers, Paul Marriott, Masami Tatsuno | Link | In neuroscience, methods from information geometry (IG) have been successfully applied in the modelling of binary vectors from spike train data, using the orthogonal decomposition of the Kullback-Leibler divergence and mutual information to isolate different orders of interaction between neurons. While spike train data is well-approximated with a binary model, here we apply these IG methods to data from electroencephalography (EEG), a continuous signal requiring appropriate discretization strategies. We developed and compared three different binarization methods and used them to identify third-order interactions in an experiment involving imagined motor movements. The statistical significance of these interactions was assessed using phase-randomized surrogate data that eliminated higher-order dependencies while preserving the spectral characteristics of the original signals. We validated our approach by implementing known second- and third-order dependencies in a forward model and quantified information attenuation at different steps of the analysis. This revealed that the greatest loss in information occurred when going from the idealized binary case to enforcing these dependencies using oscillatory signals. When applied to the real EEG dataset, our analysis detected statistically significant third-order interactions during the task condition despite the relatively sparse data (45 trials per condition). This work demonstrates that IG methods can successfully extract genuine higher-order dependencies from continuous neural recordings when paired with appropriate binarization schemes. |
2025-10-15 | DistilCLIP-EEG: Enhancing Epileptic Seizure Detection Through Multi-modal Learning and Knowledge Distillation | Zexin Wang, Lin Shi, Haoyu Wu, Junru Luo, Xiangzeng Kong, Jun Qi | Link | Epilepsy is a prevalent neurological disorder marked by sudden, brief episodes of excessive neuronal activity caused by abnormal electrical discharges, which may lead to some mental disorders. Most existing deep learning methods for epilepsy detection rely solely on unimodal EEG signals, neglecting the potential benefits of multimodal information. To address this, we propose a novel multimodal model, DistilCLIP-EEG, based on the CLIP framework, which integrates both EEG signals and text descriptions to capture comprehensive features of epileptic seizures. The model involves an EEG encoder based on the Conformer architecture as a text encoder, the proposed Learnable BERT (BERT-LP) as prompt learning within the encoders. Both operate in a shared latent space for effective cross-modal representation learning. To enhance efficiency and adaptability, we introduce a knowledge distillation method where the trained DistilCLIP-EEG serves as a teacher to guide a more compact student model to reduce training complexity and time. On the TUSZ, AUBMC, and CHB-MIT datasets, both the teacher and student models achieved accuracy rates exceeding 97%. Across all datasets, the F1-scores were consistently above 0.94, demonstrating the robustness and reliability of the proposed framework. Moreover, the student model's parameter count and model size are approximately 58.1% of those of the teacher model, significantly reducing model complexity and storage requirements while maintaining high performance. These results highlight the potential of our proposed model for EEG-based epilepsy detection and establish a solid foundation for deploying lightweight models in resource-constrained settings. |
2025-10-15 | Working Memory Functional Connectivity Analysis for Dementia Classification using EEG | Shivani Ranjan, Anant Jain, Robin Badal, Amit Kumar, Harshal Shende, Deepak Joshi, Pramod Yadav, Lalan Kumar | Link | Background: Dementia, particularly Alzheimer's Disease (AD), is a progressive neurodegenerative disorder marked by cognitive decline. Early detection, especially at the Mild Cognitive Impairment (MCI) stage, is essential for timely intervention. Working Memory (WM) impairment is a key early indicator of neurodegeneration, affecting higher cognitive processes. Electroencephalography (EEG), with its high temporal resolution, offers a cost-effective method to assess brain dynamics. This study investigates WM-related EEG functional connectivity (FC) to identify brain network alterations across dementia stages. Methods: EEG signals were recorded from 24 participants (8 AD, 8 MCI, and 8 healthy controls) during WM tasks, including encoding, recall, and retrieval stages. Data preprocessing involved noise reduction and feature extraction using Spherical and Head Harmonic Decomposition (SHD, HHD). FC was quantified using Cross-Plot Transition Entropy (CPTE) and Phase Lag Index (PLI). Network metrics such as Degree and Eigenvector Centrality were analyzed using Support Vector Machine, Random Forest, and XGBoost classifiers. Results: The CPTE-based connectivity metrics outperformed the traditional PLI approach in differentiating dementia stages, attaining a peak classification accuracy of 97.53% during the retrieval phase with the Random Forest model. A connectivity threshold of 0.5 was optimal for network discrimination. SHD and HHD features also demonstrated strong discriminative potential. AD subjects exhibited higher synchronization patterns during WM tasks than healthy controls. Conclusions: The integration of WM tasks with EEG-based FC analysis provides a robust framework for dementia classification. The proposed CPTE-based approach offers a robust, scalable, non-invasive, and effective diagnostic tool for early detection and monitoring of neurodegenerative diseases. |
2025-10-15 | NeuroRVQ: Multi-Scale EEG Tokenization for Generative Large Brainwave Models | Konstantinos Barmpas, Na Lee, Alexandros Koliousis, Yannis Panagakis, Dimitrios A. Adamos, Nikolaos Laskaris, Stefanos Zafeiriou | Link | Electroencephalography (EEG) captures neural activity across multiple temporal and spectral scales, yielding signals that are rich but complex for representation learning. Recently, EEG foundation models trained to predict masked signal-tokens have shown promise for learning generalizable representations. However, their performance is hindered by their signal tokenization modules. Existing neural tokenizers fail to preserve high-frequency dynamics, limiting their ability to reconstruct EEG signals with high fidelity. We introduce NeuroRVQ, a scalable Large Brainwave Model (LBM) centered on a codebook-based tokenizer. Our tokenizer integrates: (i) multi-scale feature extraction modules that capture the full frequency neural spectrum; (ii) hierarchical residual vector quantization (RVQ) codebooks for high-resolution encoding; and, (iii) an EEG signal phase- and amplitude-aware loss function for efficient training. This design enables efficient EEG compression while supporting accurate reconstruction across all frequency bands, leading to robust generative masked modeling. Our empirical results demonstrate that NeuroRVQ achieves lower reconstruction error and outperforms existing LBMs on a variety of downstream tasks. More broadly, NeuroRVQ tokenizer establishes a strong prior for codebook-based general-purpose brainwave models, enabling advances in neural decoding, generative modeling and multimodal biosignal integration. |
2025-10-14 | Deep Learning-Based Visual Fatigue Detection Using Eye Gaze Patterns in VR | Numan Zafar, Johnathan Locke, Shafique Ahmad Chaudhry | Link | Prolonged exposure to virtual reality (VR) systems leads to visual fatigue, impairs user comfort, performance, and safety, particularly in high-stakes or long-duration applications. Existing fatigue detection approaches rely on subjective questionnaires or intrusive physiological signals, such as EEG, heart rate, or eye-blink count, which limit their scalability and real-time applicability. This paper introduces a deep learning-based study for detecting visual fatigue using continuous eye-gaze trajectories recorded in VR. We use the GazeBaseVR dataset comprising binocular eye-tracking data from 407 participants across five immersive tasks, extract cyclopean eye-gaze angles, and evaluate six deep classifiers. Our results demonstrate that EKYT achieves up to 94% accuracy, particularly in tasks demanding high visual attention, such as video viewing and text reading. We further analyze gaze variance and subjective fatigue measures, indicating significant behavioral differences between fatigued and non-fatigued conditions. These findings establish eye-gaze dynamics as a reliable and nonintrusive modality for continuous fatigue detection in immersive VR, offering practical implications for adaptive human-computer interactions. |
Publish Date | Title | Authors | URL | Abstract |
---|---|---|---|---|
2025-10-20 | Navigate in Demanding Missions: Integrating Human Intelligence and Brain-Inspired Intelligence | Xu He, Xiaolin Meng, Youdong Zhang, Lingfei Mo, Wenxuan Yin | Link | This perspective analyzes the intricate interplay among neuroscience, Brain-Inspired Intelligence (BII), and Brain-Inspired Navigation (BIN), revealing a current lack of cooperative relationship between Brain-Computer Interfaces (BCIs) and BIN fields. We advocate for the integration of neuromorphic-empowered BCI into BIN, thereby bolstering the unmanned systems' reliable navigation in demanding missions, such as deep space exploration, etc. We highlight that machine intelligence, reinforced by brain-inspired artificial consciousness, can extend human intelligence, with human intelligence mediated by neuromorphic-enabled BCI acting as a safeguard in case machine intelligence failures. This study also discusses the potentials of the proposed approach to enhance unmanned systems' capabilities and facilitate the diagnostics of spatial cognition disorders, while considering associated ethical and security concerns. |
2025-10-18 | NeurIPT: Foundation Model for Neural Interfaces | Zitao Fang, Chenxuan Li, Hongting Zhou, Shuyang Yu, Guodong Du, Ashwaq Qasem, Yang Lu, Jing Li, Junsong Zhang, Sim Kuan Goh | Link | Electroencephalography (EEG) has wide-ranging applications, from clinical diagnosis to brain-computer interfaces (BCIs). With the increasing volume and variety of EEG data, there has been growing interest in establishing foundation models (FMs) to scale up and generalize neural decoding. Despite showing early potential, applying FMs to EEG remains challenging due to substantial inter-subject, inter-task, and inter-condition variability, as well as diverse electrode configurations across recording setups. To tackle these open challenges, we propose NeurIPT, a foundation model developed for diverse EEG-based Neural Interfaces with a Pre-trained Transformer by capturing both homogeneous and heterogeneous spatio-temporal characteristics inherent in EEG signals. Temporally, we introduce Amplitude-Aware Masked Pretraining (AAMP), masking based on signal amplitude rather than random intervals, to learn robust representations across varying signal intensities beyond local interpolation. Moreover, this temporal representation is enhanced by a Progressive Mixture-of-Experts (PMoE) architecture, where specialized expert subnetworks are progressively introduced at deeper layers, adapting effectively to the diverse temporal characteristics of EEG signals. Spatially, NeurIPT leverages the 3D physical coordinates of electrodes, enabling effective transfer of embedding across varying EEG settings, and develops Intra-Inter Lobe Pooling (IILP) during fine-tuning to efficiently exploit regional brain features. Empirical evaluations across eight downstream BCI datasets, via fine-tuning, demonstrated NeurIPT consistently achieved state-of-the-art performance, highlighting its broad applicability and robust generalization. Our work pushes forward the state of FMs in EEG and offers insights into scalable and generalizable neural information processing systems. |
2025-10-17 | Decoding Listeners Identity: Person Identification from EEG Signals Using a Lightweight Spiking Transformer | Zheyuan Lin, Siqi Cai, Haizhou Li | Link | EEG-based person identification enables applications in security, personalized brain-computer interfaces (BCIs), and cognitive monitoring. However, existing techniques often rely on deep learning architectures at high computational cost, limiting their scope of applications. In this study, we propose a novel EEG person identification approach using spiking neural networks (SNNs) with a lightweight spiking transformer for efficiency and effectiveness. The proposed SNN model is capable of handling the temporal complexities inherent in EEG signals. On the EEG-Music Emotion Recognition Challenge dataset, the proposed model achieves 100% classification accuracy with less than 10% energy consumption of traditional deep neural networks. This study offers a promising direction for energy-efficient and high-performance BCIs. The source code is available at https://github.com/PatrickZLin/Decode-ListenerIdentity. |
2025-10-15 | EEGChaT: A Transformer-Based Modular Channel Selector for SEEG Analysis | Chen Wang, Yansen Wang, Dongqi Han, Zilong Wang, Dongsheng Li | Link | Analyzing stereoelectroencephalography (SEEG) signals is critical for brain-computer interface (BCI) applications and neuroscience research, yet poses significant challenges due to the large number of input channels and their heterogeneous relevance. Traditional channel selection methods struggle to scale or provide meaningful interpretability for SEEG data. In this work, we propose EEGChaT, a novel Transformer-based channel selection module designed to automatically identify the most task-relevant channels in SEEG recordings. EEGChaT introduces Channel Aggregation Tokens (CATs) to aggregate information across channels, and leverages an improved Attention Rollout technique to compute interpretable, quantitative channel importance scores. We evaluate EEGChaT on the DuIN dataset, demonstrating that integrating EEGChaT with existing classification models consistently improves decoding accuracy, achieving up to 17\% absolute gains. Furthermore, the channel weights produced by EEGChaT show substantial overlap with manually selected channels, supporting the interpretability of the approach. Our results suggest that EEGChaT is an effective and generalizable solution for channel selection in high-dimensional SEEG analysis, offering both enhanced performance and insights into neural signal relevance. |
2025-10-14 | HEAR: An EEG Foundation Model with Heterogeneous Electrode Adaptive Representation | Zhige Chen, Chengxuan Qin, Wenlong You, Rui Liu, Congying Chu, Rui Yang, Kay Chen Tan, Jibin Wu | Link | Electroencephalography (EEG) is an essential technique for neuroscience research and brain-computer interface (BCI) applications. Recently, large-scale EEG foundation models have been developed, exhibiting robust generalization capabilities across diverse tasks and subjects. However, the heterogeneity of EEG devices not only hinders the widespread adoption of these models but also poses significant challenges to their further scaling and development. In this paper, we introduce HEAR, the first EEG foundation model explicitly designed to support heterogeneous EEG devices, accommodating varying electrode layouts and electrode counts. HEAR employs a learnable, coordinate-based spatial embedding to map electrodes with diverse layouts and varying counts into a unified representational space. This unified spatial representation is then processed by a novel spatially-guided transformer, which effectively captures spatiotemporal dependencies across electrodes. To support the development of HEAR, we construct a large-scale EEG dataset comprising 8,782 hours of data collected from over 150 distinct electrode layouts with up to 1,132 electrodes. Experimental results demonstrate that HEAR substantially outperforms existing EEG foundation models in supporting heterogeneous EEG devices and generalizing across diverse cognitive tasks and subjects. |
2025-10-12 | The Cost of Simplicity: How Reducing EEG Electrodes Affects Source Localization and BCI Accuracy | Eva Guttmann-Flury, Yanyan Wei, Shan Zhao, Jian Zhao, Mohamad Sawan | Link | Electrode density optimization in electroencephalography (EEG)-based Brain-Computer Interfaces (BCIs) requires balancing practical usability against signal fidelity, particularly for source localization. Reducing electrodes enhances portability but its effects on neural source reconstruction quality and source connectivity - treated as proxies to BCI performance - remain understudied. We address this gap through systematic evaluation of 62-, 32-, and 16-channel configurations using a fixed, fully automated processing pipeline applied to the well-characterized P300 potential. This approach's rationale is to minimize variability and bias inherent to EEG analysis by leveraging the P300's stimulus-locked reproducibility and pipeline standardization. Analyzing 63 sessions (31 subjects) from the Eye-BCI dataset with rigorous artifact correction and channel validation, we demonstrate: (1) Progressive degradation in source reconstruction quality with sparser configurations, including obscured deep neural generators and spatiotemporal distortions; (2) A novel sqrt(Re) scaling law linking electrode reduction ratio (Re) to localization accuracy - a previously unquantified relationship to the best of our knowledge; (3) While reduced configurations preserve basic P300 topography and may suffice for communicative BCIs, higher-density channels are essential for reliable deep source reconstruction. Overall, this study establishes a first step towards quantitative benchmarks for electrode selection, with critical implications for clinical BCIs requiring anatomical precision in applications like neurodegenerative disease monitoring, where compromised spatial resolution could mask pathological signatures. Most importantly, the sqrt(Re) scaling law may provide the first principled method to determine the minimal electrode density required based on acceptable error margins or expected effect sizes. |
2025-10-12 | Does Re-referencing Matter? Large Laplacian Filter Optimizes Single-Trial P300 BCI Performance | Eva Guttmann-Flury, Jian Zhao, Mohamad Sawan | Link | Electroencephalography (EEG) provides a non-invasive window into brain activity, enabling Brain-Computer Interfaces (BCIs) for communication and control. However, their performance is limited by signal fidelity issues, among which the choice of re-referencing strategy is a pervasive but often overlooked preprocessing bias. Addressing controversies about its necessity and optimal choice, we adopted a quantified approach to evaluate four strategies - no re-referencing, Common Average Reference (CAR), small Laplacian, and large Laplacian - using 62-channels EEG (31 subjects, 2,520 trials). To our knowledge, this is the first study systematically quantifying their impact on single-trial P300 classification accuracy. Our controlled pipeline isolated re-referencing effects for source-space reconstruction (eLORETA with Phase Lag Index) and anatomically constrained classification. The large Laplacian resolves distributed P3b networks while maintaining P3a specificity, achieving the best P300 peak classification accuracy (81.57% hybrid method; 75.97% majority regions of interest). Performance follows a consistent and statistically significant hierarchy: large Laplacian > CAR > no re-reference > small Laplacian, providing a foundation for unified methodological evaluation. |
2025-10-12 | FusionGen: Feature Fusion-Based Few-Shot EEG Data Generation | Yuheng Chen, Dingkun Liu, Xinyao Yang, Xinping Xu, Baicheng Chen, Dongrui Wu | Link | Brain-computer interfaces (BCIs) provide potential for applications ranging from medical rehabilitation to cognitive state assessment by establishing direct communication pathways between the brain and external devices via electroencephalography (EEG). However, EEG-based BCIs are severely constrained by data scarcity and significant inter-subject variability, which hinder the generalization and applicability of EEG decoding models in practical settings. To address these challenges, we propose FusionGen, a novel EEG data generation framework based on disentangled representation learning and feature fusion. By integrating features across trials through a feature matching fusion module and combining them with a lightweight feature extraction and reconstruction pipeline, FusionGen ensures both data diversity and trainability under limited data constraints. Extensive experiments on multiple publicly available EEG datasets demonstrate that FusionGen significantly outperforms existing augmentation techniques, yielding notable improvements in classification accuracy. |
2025-10-14 | BrainForm: a Serious Game for BCI Training and Data Collection | Michele Romani, Devis Zanoni, Elisabetta Farella, Luca Turchet | Link |
|
2025-10-11 | Bidirectional Time-Frequency Pyramid Network for Enhanced Robust EEG Classification | Jiahui Hong, Siqing Li, Muqing Jian, Luming Yang | Link | Existing EEG recognition models suffer from poor cross-paradigm generalization due to dataset-specific constraints and individual variability. To overcome these limitations, we propose BITE (Bidirectional Time-Freq Pyramid Network), an end-to-end unified architecture featuring robust multistream synergy, pyramid time-frequency attention (PTFA), and bidirectional adaptive convolutions. The framework uniquely integrates: 1) Aligned time-frequency streams maintaining temporal synchronization with STFT for bidirectional modeling, 2) PTFA-based multi-scale feature enhancement amplifying critical neural patterns, 3) BiTCN with learnable fusion capturing forward/backward neural dynamics. Demonstrating enhanced robustness, BITE achieves state-of-the-art performance across four divergent paradigms (BCICIV-2A/2B, HGD, SD-SSVEP), excelling in both within-subject accuracy and cross-subject generalization. As a unified architecture, it combines robust performance across both MI and SSVEP tasks with exceptional computational efficiency. Our work validates that paradigm-aligned spectral-temporal processing is essential for reliable BCI systems. Just as its name suggests, BITE "takes a bite out of EEG." The source code is available at https://github.com/cindy-hong/BiteEEG. |
Publish Date | Title | Authors | URL | Abstract |
---|---|---|---|---|
2025-10-21 | Topoformer: brain-like topographic organization in Transformer language models through spatial querying and reweighting | Taha Binhuraib, Greta Tuckute, Nicholas Blauch | Link | Spatial functional organization is a hallmark of biological brains: neurons are arranged topographically according to their response properties, at multiple scales. In contrast, representations within most machine learning models lack spatial biases, instead manifesting as disorganized vector spaces that are difficult to visualize and interpret. Here, we propose a novel form of self-attention that turns Transformers into "Topoformers" with topographic organization. We introduce spatial querying - where keys and queries are arranged on 2D grids, and local pools of queries are associated with a given key - and spatial reweighting, where we convert the standard fully connected layer of self-attention into a locally connected layer. We first demonstrate the feasibility of our approach by training a 1-layer Topoformer on a sentiment classification task. Training with spatial querying encourages topographic organization in the queries and keys, and spatial reweighting separately encourages topographic organization in the values and self-attention outputs. We then apply the Topoformer motifs at scale, training a BERT architecture with a masked language modeling objective. We find that the topographic variant performs on par with a non-topographic control model on NLP benchmarks, yet produces interpretable topographic organization as evaluated via eight linguistic test suites. Finally, analyzing an fMRI dataset of human brain responses to a large set of naturalistic sentences, we demonstrate alignment between low-dimensional topographic variability in the Topoformer model and human brain language network. Scaling up Topoformers further holds promise for greater interpretability in NLP research, and for more accurate models of the organization of linguistic information in the human brain. |
2025-10-20 | CausalMamba: Scalable Conditional State Space Models for Neural Causal Inference | Sangyoon Bae, Jiook Cha | Link | We introduce CausalMamba, a scalable framework that addresses fundamental limitations in fMRI-based causal inference: the ill-posed nature of inferring neural causality from hemodynamically distorted BOLD signals and the computational intractability of existing methods like Dynamic Causal Modeling (DCM). Our approach decomposes this complex inverse problem into two tractable stages: BOLD deconvolution to recover latent neural activity, followed by causal graph inference using a novel Conditional Mamba architecture. On simulated data, CausalMamba achieves 37% higher accuracy than DCM. Critically, when applied to real task fMRI data, our method recovers well-established neural pathways with 88% fidelity, whereas conventional approaches fail to identify these canonical circuits in over 99% of subjects. Furthermore, our network analysis of working memory data reveals that the brain strategically shifts its primary causal hub-recruiting executive or salience networks depending on the stimulus-a sophisticated reconfiguration that remains undetected by traditional methods. This work provides neuroscientists with a practical tool for large-scale causal inference that captures both fundamental circuit motifs and flexible network dynamics underlying cognitive function. |
2025-10-19 | Uncovering Brain-Like Hierarchical Patterns in Vision-Language Models through fMRI-Based Neural Encoding | Yudan Ren, Xinlong Wang, Kexin Wang, Tian Xia, Zihan Ma, Zhaowei Li, Xiangrong Bi, Xiao Li, Xiaowei He | Link | While brain-inspired artificial intelligence(AI) has demonstrated promising results, current understanding of the parallels between artificial neural networks (ANNs) and human brain processing remains limited: (1) unimodal ANN studies fail to capture the brain's inherent multimodal processing capabilities, and (2) multimodal ANN research primarily focuses on high-level model outputs, neglecting the crucial role of individual neurons. To address these limitations, we propose a novel neuron-level analysis framework that investigates the multimodal information processing mechanisms in vision-language models (VLMs) through the lens of human brain activity. Our approach uniquely combines fine-grained artificial neuron (AN) analysis with fMRI-based voxel encoding to examine two architecturally distinct VLMs: CLIP and METER. Our analysis reveals four key findings: (1) ANs successfully predict biological neurons (BNs) activities across multiple functional networks (including language, vision, attention, and default mode), demonstrating shared representational mechanisms; (2) Both ANs and BNs demonstrate functional redundancy through overlapping neural representations, mirroring the brain's fault-tolerant and collaborative information processing mechanisms; (3) ANs exhibit polarity patterns that parallel the BNs, with oppositely activated BNs showing mirrored activation trends across VLM layers, reflecting the complexity and bidirectional nature of neural information processing; (4) The architectures of CLIP and METER drive distinct BNs: CLIP's independent branches show modality-specific specialization, whereas METER's cross-modal design yields unified cross-modal activation, highlighting the architecture's influence on ANN brain-like properties. These results provide compelling evidence for brain-like hierarchical processing in VLMs at the neuronal level. |
2025-10-17 | Seeing Through the Brain: New Insights from Decoding Visual Stimuli with fMRI | Zheng Huang, Enpei Zhang, Yinghao Cai, Weikang Qiu, Carl Yang, Elynn Chen, Xiang Zhang, Rex Ying, Dawei Zhou, Yujun Yan | Link | Understanding how the brain encodes visual information is a central challenge in neuroscience and machine learning. A promising approach is to reconstruct visual stimuli, essentially images, from functional Magnetic Resonance Imaging (fMRI) signals. This involves two stages: transforming fMRI signals into a latent space and then using a pretrained generative model to reconstruct images. The reconstruction quality depends on how similar the latent space is to the structure of neural activity and how well the generative model produces images from that space. Yet, it remains unclear which type of latent space best supports this transformation and how it should be organized to represent visual stimuli effectively. We present two key findings. First, fMRI signals are more similar to the text space of a language model than to either a vision based space or a joint text image space. Second, text representations and the generative model should be adapted to capture the compositional nature of visual stimuli, including objects, their detailed attributes, and relationships. Building on these insights, we propose PRISM, a model that Projects fMRI sIgnals into a Structured text space as an interMediate representation for visual stimuli reconstruction. It includes an object centric diffusion module that generates images by composing individual objects to reduce object detection errors, and an attribute relationship search module that automatically identifies key attributes and relationships that best align with the neural activity. Extensive experiments on real world datasets demonstrate that our framework outperforms existing methods, achieving up to an 8% reduction in perceptual loss. These results highlight the importance of using structured text as the intermediate space to bridge fMRI signals and image reconstruction. |
2025-10-17 | Temporal Functional Factor Analysis of Brain Connectivity | Kyle Stanley, Nicole Lazar, Matthew Reimherr | Link | Many analyses of functional magnetic resonance imaging (fMRI) examine functional connectivity (FC), or the statistical dependencies among distant brain regions. These analyses are typically exploratory, guiding future confirmatory research. In this work, we present an approach based on factor analysis (FA) that is well-suited to studying FC. FA is appealing in this context because its flexible model assumptions permit a guided investigation of its target subspace consistent with the exploratory role of connectivity analyses. However, applying FA to fMRI data poses three problems: (1) its target subspace captures short-range spatial dependencies that should be treated as noise, (2) it requires factorization of a massive spatial covariance, and (3) it overlooks temporal dependencies in the data. To address these limitations, we develop a factor model within the framework of functional data analysis--a field which views certain data as arising from smooth underlying curves. The proposed approach (1) uses matrix completion techniques to filter short-range spatial dependencies out of its target subspace, (2) employs a distributed algorithm for factorizing large-scale covariance matrices, and (3) leverages functional regression to exploit temporal dynamics. Together, these innovations yield a comprehensive and scalable method for studying FC. |
2025-10-15 | Scaling Vision Transformers for Functional MRI with Flat Maps | Connor Lane, Daniel Z. Kaplan, Tanishq Mathew Abraham, Paul S. Scotti | Link | A key question for adapting modern deep learning architectures to functional MRI (fMRI) is how to represent the data for model input. To bridge the modality gap between fMRI and natural images, we transform the 4D volumetric fMRI data into videos of 2D fMRI activity flat maps. We train Vision Transformers on 2.3K hours of fMRI flat map videos from the Human Connectome Project using the spatiotemporal masked autoencoder (MAE) framework. We observe that masked fMRI modeling performance improves with dataset size according to a strict power scaling law. Downstream classification benchmarks show that our model learns rich representations supporting both fine-grained state decoding across subjects, as well as subject-specific trait decoding across changes in brain state. This work is part of an ongoing open science project to build foundation models for fMRI data. Our code and datasets are available at https://github.com/MedARC-AI/fmri-fm. |
2025-10-15 | Jacobian-Based Interpretation of Nonlinear Neural Encoding Model | Xiaohui Gao, Haoran Yang, Yue Cheng, Mengfei Zuo, Yiheng Liu, Peiyang Li, Xintao Hu | Link | In recent years, the alignment between artificial neural network (ANN) embeddings and blood oxygenation level dependent (BOLD) responses in functional magnetic resonance imaging (fMRI) via neural encoding models has significantly advanced research on neural representation mechanisms and interpretability in the brain. However, these approaches remain limited in characterizing the brain's inherently nonlinear response properties. To address this, we propose the Jacobian-based Nonlinearity Evaluation (JNE), an interpretability metric for nonlinear neural encoding models. JNE quantifies nonlinearity by statistically measuring the dispersion of local linear mappings (Jacobians) from model representations to predicted BOLD responses, thereby approximating the nonlinearity of BOLD signals. Centered on proposing JNE as a novel interpretability metric, we validated its effectiveness through controlled simulation experiments on various activation functions and network architectures, and further verified it on real fMRI data, demonstrating a hierarchical progression of nonlinear characteristics from primary to higher-order visual cortices, consistent with established cortical organization. We further extended JNE with Sample-Specificity (JNE-SS), revealing stimulus-selective nonlinear response patterns in functionally specialized brain regions. As the first interpretability metric for quantifying nonlinear responses, JNE provides new insights into brain information processing. Code available at https://github.com/Gaitxh/JNE. |
2025-10-12 | A compressed code for memory discrimination | Dale Zhou, Sharon Mina Noh, Nora C Harhen, Nidhi V Banavar, C. Brock Kirwan, Michael A Yassa, Aaron M Bornstein | Link | The ability to discriminate similar visual stimuli is an important index of memory function. This ability is widely thought to be supported by expanding the dimensionality of relevant neural codes, such that neural representations for similar stimuli are maximally distinct, or ``separated.'' An alternative hypothesis is that discrimination is supported by lossy compression of visual inputs, efficiently coding sensory information by discarding seemingly irrelevant details. A benefit of compression, relative to expansion, is that it allows individuals to retain fewer essential dimensions underlying stimulus variation -- a process linked to higher-order visual processing -- without hindering discrimination. Under this hypothesis, pattern separation is facilitated when more information from similar stimuli can be discarded, rather than preserved. We test the compression versus expansion hypotheses by predicting performance on the canonical mnemonic similarity task. We train neural networks to compress perceptual and semantic factors of stimuli, measuring lossiness using the mathematical framework underlying compression. Consistent with the compression hypothesis, and not the expansion hypothesis, greater lossiness predicts the ease and performance of lure discrimination, especially in deeper convolutional network layers that predict higher-order visual brain activity. We then confirm these predictions across two image sets, four behavioral datasets, and alternative lossiness metrics. Finally, using task fMRI, we identify signatures of lossy compression -- neural dimensionality reduction and information loss -- in higher-order visual regions V4 and IT and hippocampal DG/CA3 and CA1 linked to lure discrimination. These results suggest lossy compression supports mnemonic discrimination by discarding redundant and overlapping information. |
2025-10-10 | Estimating Brain Activity with High Spatial and Temporal Resolution using a Naturalistic MEG-fMRI Encoding Model | Beige Jerry Jin, Leila Wehbe | Link | Current non-invasive neuroimaging techniques trade off between spatial resolution and temporal resolution. While magnetoencephalography (MEG) can capture rapid neural dynamics and functional magnetic resonance imaging (fMRI) can spatially localize brain activity, a unified picture that preserves both high resolutions remains an unsolved challenge with existing source localization or MEG-fMRI fusion methods, especially for single-trial naturalistic data. We collected whole-head MEG when subjects listened passively to more than seven hours of narrative stories, using the same stimuli in an open fMRI dataset (LeBel et al., 2023). We developed a transformer-based encoding model that combines the MEG and fMRI from these two naturalistic speech comprehension experiments to estimate latent cortical source responses with high spatiotemporal resolution. Our model is trained to predict MEG and fMRI from multiple subjects simultaneously, with a latent layer that represents our estimates of reconstructed cortical sources. Our model predicts MEG better than the common standard of single-modality encoding models, and it also yields source estimates with higher spatial and temporal fidelity than classic minimum-norm solutions in simulation experiments. We validated the estimated latent sources by showing its strong generalizability across unseen subjects and modalities. Estimated activity in our source space predict electrocorticography (ECoG) better than an ECoG-trained encoding model in an entirely new dataset. By integrating the power of large naturalistic experiments, MEG, fMRI, and encoding models, we propose a practical route towards millisecond-and-millimeter brain mapping. |
2025-10-07 | Beyond Grid-Locked Voxels: Neural Response Functions for Continuous Brain Encoding | Haomiao Chen, Keith W Jamison, Mert R. Sabuncu, Amy Kuceyeski | Link | Neural encoding models aim to predict fMRI-measured brain responses to natural images. fMRI data is acquired as a 3D volume of voxels, where each voxel has a defined spatial location in the brain. However, conventional encoding models often flatten this volume into a 1D vector and treat voxel responses as independent outputs. This removes spatial context, discards anatomical information, and ties each model to a subject-specific voxel grid. We introduce the Neural Response Function (NRF), a framework that models fMRI activity as a continuous function over anatomical space rather than a flat vector of voxels. NRF represents brain activity as a continuous implicit function: given an image and a spatial coordinate (x, y, z) in standardized MNI space, the model predicts the response at that location. This formulation decouples predictions from the training grid, supports querying at arbitrary spatial resolutions, and enables resolution-agnostic analyses. By grounding the model in anatomical space, NRF exploits two key properties of brain responses: (1) local smoothness -- neighboring voxels exhibit similar response patterns; modeling responses continuously captures these correlations and improves data efficiency, and (2) cross-subject alignment -- MNI coordinates unify data across individuals, allowing a model pretrained on one subject to be fine-tuned on new subjects. In experiments, NRF outperformed baseline models in both intrasubject encoding and cross-subject adaptation, achieving high performance while reducing the data size needed by orders of magnitude. To our knowledge, NRF is the first anatomically aware encoding model to move beyond flattened voxels, learning a continuous mapping from images to brain responses in 3D space. |
Publish Date | Title | Authors | URL | Abstract |
---|---|---|---|---|
2025-10-20 | MEG-GPT: A transformer-based foundation model for magnetoencephalography data | Rukuang Huang, Sungjun Cho, Chetan Gohil, Oiwi Parker Jones, Mark Woolrich | Link | Modelling the complex spatiotemporal patterns of large-scale brain dynamics is crucial for neuroscience, but traditional methods fail to capture the rich structure in modalities such as magnetoencephalography (MEG). Recent advances in deep learning have enabled significant progress in other domains, such as language and vision, by using foundation models at scale. Here, we introduce MEG-GPT, a transformer based foundation model that uses time-attention and next time-point prediction. To facilitate this, we also introduce a novel data-driven tokeniser for continuous MEG data, which preserves the high temporal resolution of continuous MEG signals without lossy transformations. We trained MEG-GPT on tokenised brain region time-courses extracted from a large-scale MEG dataset (N=612, eyes-closed rest, Cam-CAN data), and show that the learnt model can generate data with realistic spatio-spectral properties, including transient events and population variability. Critically, it performs well in downstream decoding tasks, improving downstream supervised prediction task, showing improved zero-shot generalisation across sessions (improving accuracy from 0.54 to 0.59) and subjects (improving accuracy from 0.41 to 0.49) compared to a baseline methods. Furthermore, we show the model can be efficiently fine-tuned on a smaller labelled dataset to boost performance in cross-subject decoding scenarios. This work establishes a powerful foundation model for electrophysiological data, paving the way for applications in computational neuroscience and neural decoding. |
2025-10-10 | Estimating Brain Activity with High Spatial and Temporal Resolution using a Naturalistic MEG-fMRI Encoding Model | Beige Jerry Jin, Leila Wehbe | Link | Current non-invasive neuroimaging techniques trade off between spatial resolution and temporal resolution. While magnetoencephalography (MEG) can capture rapid neural dynamics and functional magnetic resonance imaging (fMRI) can spatially localize brain activity, a unified picture that preserves both high resolutions remains an unsolved challenge with existing source localization or MEG-fMRI fusion methods, especially for single-trial naturalistic data. We collected whole-head MEG when subjects listened passively to more than seven hours of narrative stories, using the same stimuli in an open fMRI dataset (LeBel et al., 2023). We developed a transformer-based encoding model that combines the MEG and fMRI from these two naturalistic speech comprehension experiments to estimate latent cortical source responses with high spatiotemporal resolution. Our model is trained to predict MEG and fMRI from multiple subjects simultaneously, with a latent layer that represents our estimates of reconstructed cortical sources. Our model predicts MEG better than the common standard of single-modality encoding models, and it also yields source estimates with higher spatial and temporal fidelity than classic minimum-norm solutions in simulation experiments. We validated the estimated latent sources by showing its strong generalizability across unseen subjects and modalities. Estimated activity in our source space predict electrocorticography (ECoG) better than an ECoG-trained encoding model in an entirely new dataset. By integrating the power of large naturalistic experiments, MEG, fMRI, and encoding models, we propose a practical route towards millisecond-and-millimeter brain mapping. |
2025-10-09 | BigCodeArena: Unveiling More Reliable Human Preferences in Code Generation via Execution | Terry Yue Zhuo, Xiaolong Jin, Hange Liu, Juyong Jiang, Tianyang Liu, Chen Gong, Bhupesh Bishnoi, Vaisakhi Mishra, Marek Suppa, Noah Ziems, Saiteja Utpala, Ming Xu, Guangyu Song, Kaixin Li, Yuhan Cao, Bo Liu, Zheng Liu, Sabina Abdurakhmanova, Wenhao Yu, Mengzhao Jia, Jihan Yao, Kenneth Hamilton, Kumar Shridhar, Minh Chien Vu, Dingmin Wang, Jiawei Liu, Zijian Wang, Qian Liu, Binyuan Hui, Meg Risdal, Ahsen Khaliq, Atin Sood, Zhenchang Xing, Wasi Uddin Ahmad, John Grundy, David Lo, Banghua Zhu, Xiaoning Du, Torsten Scholak, Leandro von Werra | Link | Crowdsourced model evaluation platforms, such as Chatbot Arena, enable real-time evaluation from human perspectives to assess the quality of model responses. In the coding domain, manually examining the quality of LLM-generated content is extremely challenging, as it requires understanding long chunks of raw code and deliberately simulating code execution. To this end, we introduce BigCodeArena, an open human evaluation platform for code generation backed by a comprehensive and on-the-fly execution environment. Built on top of Chatbot Arena, BigCodeArena enables the execution of LLM-generated code and allows humans to interact with the execution process and outcomes. We collected over 14,000 raw code-centric conversation sessions across 10 widely used LLMs, spanning 10 languages and 8 types of execution environments. Among these conversations, we identified more than 4,700 multi-turn samples with pairwise human preferences. Further analysis uncovers underexplored preferences of LLMs in fine-grained domains characterized by tasks, languages, and frameworks. To systematically examine code understanding and generation capabilities of frontier LLMs, we curated two benchmarks based on the collected data, namely BigCodeReward and AutoCodeArena. For BigCodeReward, we post-processed the 4,700 conversations and evaluated the consistency between reward models and human preferences. The evaluation shows that most LLMs have superior performance in judging coding preferences when the execution results are available. Inspired by these findings, we propose AutoCodeArena, an automatic Elo rating benchmark designed to assess the coding quality of LLMs without human involvement. We find that proprietary LLMs like GPT-5, Claude-Sonnet-4, and Claude-Opus-4 still lead in code generation performance among recent emerging models. |
2025-10-09 | Expanding the Landscape of Exotic Muon Decays | Admir Greljo, Ajdin Palavrić, Mirsad Tunja, Jure Zupan | Link | We chart new-physics models that produce exotic, high-multiplicity muon decays featuring prompt or displaced |
2025-10-07 | The Gamma-ray Luminosity Function of Flat-Spectrum Radio Quasars | Garima Rajguru, Lea Marcotulli, Marco Ajello, Mattia Di Mauro, Meg Urry | Link | We have utilized the largest sample of |
2025-09-18 | UMind: A Unified Multitask Network for Zero-Shot M/EEG Visual Decoding | Chengjian Xu, Yonghao Song, Zelin Liao, Haochuan Zhang, Qiong Wang, Qingqing Zheng | Link | Decoding visual information from time-resolved brain recordings, such as EEG and MEG, plays a pivotal role in real-time brain-computer interfaces. However, existing approaches primarily focus on direct brain-image feature alignment and are limited to single-task frameworks or task-specific models. In this paper, we propose a Unified MultItask Network for zero-shot M/EEG visual Decoding (referred to UMind), including visual stimulus retrieval, classification, and reconstruction, where multiple tasks mutually enhance each other. Our method learns robust neural-visual and semantic representations through multimodal alignment with both image and text modalities. The integration of both coarse and fine-grained texts enhances the extraction of these neural representations, enabling more detailed semantic and visual decoding. These representations then serve as dual conditional inputs to a pre-trained diffusion model, guiding visual reconstruction from both visual and semantic perspectives. Extensive evaluations on MEG and EEG datasets demonstrate the effectiveness, robustness, and biological plausibility of our approach in capturing spatiotemporal neural dynamics. Our approach sets a multitask pipeline for brain visual decoding, highlighting the synergy of semantic information in visual feature extraction. |
2025-09-17 | Quantum-like representation of neuronal networks' activity: modeling "mental entanglement" | Andrei Khrennikov, Makiko Yamada | Link | Quantum-like modeling (QLM) - quantum theory applications outside of physics - are intensively developed with applications in biology, cognition, psychology, and decision-making. For cognition, QLM should be distinguished from quantum reductionist models in the spirit of Hameroff and Penrose and well as Umezawa and Vitiello. QLM is not concerned with just quantum physical processes in the brain but also QL information processing by macroscopic neuronal structures. Although QLM of cognition and decision-making has seen some success, it suffers from a knowledge gap that exists between oscillatory neuronal network functioning in the brain and QL behavioral patterns. Recently, steps toward closing this gap have been taken using the generalized probability theory and prequantum classical statistical field theory (PCSFT) - a random field model beyond the complex Hilbert space formalism. PCSFT is used to move from the classical ``oscillatory cognition'' of the neuronal networks to QLM for decision.making. In this study, we addressed the most difficult problem within this construction: QLM for entanglement generation by classical networks, i.e., mental entanglement. We started with the observational approach to entanglement based on operator algebras describing local observables and bringing into being the tensor product structure in the space of QL states. Moreover, we applied the standard states entanglement approach: entanglement generation by spatially separated networks in the brain. Finally, we discussed possible future experiments on mental entanglement detection using the EEG/MEG technique. |
2025-09-17 | Multi-Source Neural Activity Indices and Spatial Filters for EEG/MEG Inverse Problem: An Extension to MNE-Python | Julia Jurkowska, Joanna Dreszer, Monika Lewandowska, Krzysztof Tołpa, Tomasz Piotrowski | Link | Accurate EEG/MEG source localization is essential for understanding brain function, yet remains challenging because the inverse problem is inherently ill-posed. In spatial filtering (beamforming) approaches, single-source LCMV spatial filters, though widely used, suffer from source cancellation when sources are correlated - a common experimental scenario. Multi-source frameworks, such as the multi-source minimum-variance pseudo-unbiased reduced-rank (MV-PURE) method, offer improved reconstruction and robust neural activity indices, yet their adoption has been limited by incomplete theory and lack of accessible implementations. In this paper, we present a rigorous derivation of multi-source neural activity indices and spatial filters, establishing a complete analytical framework with automated parameter selection. The resulting compact algebraic forms enable straightforward implementation. To facilitate adoption, we provide a full implementation extending MNE-Python, along with an accompanying tutorial, and demonstrate its utility on EEG experimental data, highlighting the practical advantages of multi-source spatial filtering for source localization and reconstruction. |
2025-09-23 | MEGS$^{2}$: Memory-Efficient Gaussian Splatting via Spherical Gaussians and Unified Pruning | Jiarui Chen, Yikeng Chen, Yingshuang Zou, Ye Huang, Peng Wang, Yuan Liu, Yujing Sun, Wenping Wang | Link | 3D Gaussian Splatting (3DGS) has emerged as a dominant novel-view synthesis technique, but its high memory consumption severely limits its applicability on edge devices. A growing number of 3DGS compression methods have been proposed to make 3DGS more efficient, yet most only focus on storage compression and fail to address the critical bottleneck of rendering memory. To address this problem, we introduce MEGS$^{2}$, a novel memory-efficient framework that tackles this challenge by jointly optimizing two key factors: the total primitive number and the parameters per primitive, achieving unprecedented memory compression. Specifically, we replace the memory-intensive spherical harmonics with lightweight, arbitrarily oriented spherical Gaussian lobes as our color representations. More importantly, we propose a unified soft pruning framework that models primitive-number and lobe-number pruning as a single constrained optimization problem. Experiments show that MEGS$^{2}$ achieves a 50% static VRAM reduction and a 40% rendering VRAM reduction compared to existing methods, while maintaining comparable rendering quality. Project page: https://megs-2.github.io/ |
2025-09-15 | A fast machine learning tool to predict the composition of astronomical ices from infrared absorption spectra | Andrés Megías, Izaskun Jiménez-Serra, François Dulieu, Julie Vitorino, Belén Maté, David Ciudad, Will R. M. Rocha, Marcos Martínez Jiménez, Jacobo Aguirre | Link | Current observations taken by James Webb Space Telescope (JWST) allow us to observe the absorption features of icy mantles that cover interstellar dust grains, which are mainly composed of |
Publish Date | Title | Authors | URL | Abstract |
---|---|---|---|---|
2025-10-09 | A Computational Perspective on NeuroAI and Synthetic Biological Intelligence | Dhruvik Patel, Md Sayed Tanveer, Jesus Gonzalez-Ferrer, Alon Loeffler, Brett J. Kagan, Mohammed A. Mostajo-Radji, Ge Wang | Link | NeuroAI is an emerging field at the intersection of neuroscience and artificial intelligence, where insights from brain function guide the design of intelligent systems. A central area within this field is synthetic biological intelligence (SBI), which combines the adaptive learning properties of biological neural networks with engineered hardware and software. SBI systems provide a platform for modeling neural computation, developing biohybrid architectures, and enabling new forms of embodied intelligence. In this review, we organize the NeuroAI landscape into three interacting domains: hardware, software, and wetware. We outline computational frameworks that integrate biological and non-biological systems and highlight recent advances in organoid intelligence, neuromorphic computing, and neuro-symbolic learning. These developments collectively point toward a new class of systems that compute through interactions between living neural tissue and digital algorithms. |
2025-07-09 | Quantifying Uncertainty in Error Consistency: Towards Reliable Behavioral Comparison of Classifiers | Thomas Klein, Sascha Meyen, Wieland Brendel, Felix A. Wichmann, Kristof Meding | Link | Benchmarking models is a key factor for the rapid progress in machine learning (ML) research. Thus, further progress depends on improving benchmarking metrics. A standard metric to measure the behavioral alignment between ML models and human observers is error consistency (EC). EC allows for more fine-grained comparisons of behavior than other metrics such as e.g. accuracy, and has been used in the influential Brain-Score benchmark to rank different DNNs by their behavioral consistency with humans. Previously, EC values have been reported without confidence intervals. However, empirically measured EC values are typically noisy -- thus, without confidence intervals, valid benchmarking conclusions are problematic. Here we improve on standard EC in two ways: First, we show how to obtain confidence intervals for EC using a bootstrapping technique, allowing us to derive significance tests for EC. Second, we propose a new computational model relating the EC between two classifiers to the implicit probability that one of them copies responses from the other. This view of EC allows us to give practical guidance to scientists regarding the number of trials required for sufficiently powerful, conclusive experiments. Finally, we use our methodology to revisit popular NeuroAI-results. We find that while the general trend of behavioral differences between humans and machines holds up to scrutiny, many reported differences between deep vision models are statistically insignificant. Our methodology enables researchers to design adequately powered experiments that can reliably detect behavioral differences between models, providing a foundation for more rigorous benchmarking of behavioral alignment. |
2025-07-02 | What Neuroscience Can Teach AI About Learning in Continuously Changing Environments | Daniel Durstewitz, Bruno Averbeck, Georgia Koppe | Link | Modern AI models, such as large language models, are usually trained once on a huge corpus of data, potentially fine-tuned for a specific task, and then deployed with fixed parameters. Their training is costly, slow, and gradual, requiring billions of repetitions. In stark contrast, animals continuously adapt to the ever-changing contingencies in their environments. This is particularly important for social species, where behavioral policies and reward outcomes may frequently change in interaction with peers. The underlying computational processes are often marked by rapid shifts in an animal's behaviour and rather sudden transitions in neuronal population activity. Such computational capacities are of growing importance for AI systems operating in the real world, like those guiding robots or autonomous vehicles, or for agentic AI interacting with humans online. Can AI learn from neuroscience? This Perspective explores this question, integrating the literature on continual and in-context learning in AI with the neuroscience of learning on behavioral tasks with shifting rules, reward probabilities, or outcomes. We will outline an agenda for how specifically insights from neuroscience may inform current developments in AI in this area, and - vice versa - what neuroscience may learn from AI, contributing to the evolving field of NeuroAI. |
2025-06-12 | NOBLE -- Neural Operator with Biologically-informed Latent Embeddings to Capture Experimental Variability in Biological Neuron Models | Luca Ghafourpour, Valentin Duruisseaux, Bahareh Tolooshams, Philip H. Wong, Costas A. Anastassiou, Anima Anandkumar | Link | Characterizing the diverse computational properties of human neurons via multimodal electrophysiological, transcriptomic, and morphological data provides the foundation for constructing and validating bio-realistic neuron models that can advance our understanding of fundamental mechanisms underlying brain function. However, current modeling approaches remain constrained by the limited availability and intrinsic variability of experimental neuronal data. To capture variability, ensembles of deterministic models are often used, but are difficult to scale as model generation requires repeating computationally expensive optimization for each neuron. While deep learning is becoming increasingly relevant in this space, it fails to capture the full biophysical complexity of neurons, their nonlinear voltage dynamics, and variability. To address these shortcomings, we introduce NOBLE, a neural operator framework that learns a mapping from a continuous frequency-modulated embedding of interpretable neuron features to the somatic voltage response induced by current injection. Trained on data generated from biophysically realistic neuron models, NOBLE predicts distributions of neural dynamics accounting for the intrinsic experimental variability. Unlike conventional bio-realistic neuron models, interpolating within the embedding space offers models whose dynamics are consistent with experimentally observed responses. NOBLE is the first scaled-up deep learning framework validated on real experimental data, enabling efficient generation of synthetic neurons that exhibit trial-to-trial variability and achieve a |
2025-05-21 | SynEVO: A neuro-inspired spatiotemporal evolutional framework for cross-domain adaptation | Jiayue Liu, Zhongchao Yi, Zhengyang Zhou, Qihe Huang, Kuo Yang, Xu Wang, Yang Wang | Link | Discovering regularities from spatiotemporal systems can benefit various scientific and social planning. Current spatiotemporal learners usually train an independent model from a specific source data that leads to limited transferability among sources, where even correlated tasks requires new design and training. The key towards increasing cross-domain knowledge is to enable collective intelligence and model evolution. In this paper, inspired by neuroscience theories, we theoretically derive the increased information boundary via learning cross-domain collective intelligence and propose a Synaptic EVOlutional spatiotemporal network, SynEVO, where SynEVO breaks the model independence and enables cross-domain knowledge to be shared and aggregated. Specifically, we first re-order the sample groups to imitate the human curriculum learning, and devise two complementary learners, elastic common container and task-independent extractor to allow model growth and task-wise commonality and personality disentanglement. Then an adaptive dynamic coupler with a new difference metric determines whether the new sample group should be incorporated into common container to achieve model evolution under various domains. Experiments show that SynEVO improves the generalization capacity by at most 42% under cross-domain scenarios and SynEVO provides a paradigm of NeuroAI for knowledge transfer and adaptation. |
2025-02-22 | Brain-Model Evaluations Need the NeuroAI Turing Test | Jenelle Feather, Meenakshi Khosla, N. Apurva Ratan Murty, Aran Nayebi | Link | What makes an artificial system a good model of intelligence? The classical test proposed by Alan Turing focuses on behavior, requiring that an artificial agent's behavior be indistinguishable from that of a human. While behavioral similarity provides a strong starting point, two systems with very different internal representations can produce the same outputs. Thus, in modeling biological intelligence, the field of NeuroAI often aims to go beyond behavioral similarity and achieve representational convergence between a model's activations and the measured activity of a biological system. This position paper argues that the standard definition of the Turing Test is incomplete for NeuroAI, and proposes a stronger framework called the ``NeuroAI Turing Test'', a benchmark that extends beyond behavior alone and \emph{additionally} requires models to produce internal neural representations that are empirically indistinguishable from those of a brain up to measured individual variability, i.e. the differences between a computational model and the brain is no more than the difference between one brain and another brain. While the brain is not necessarily the ceiling of intelligence, it remains the only universally agreed-upon example, making it a natural reference point for evaluating computational models. By proposing this framework, we aim to shift the discourse from loosely defined notions of brain inspiration to a systematic and testable standard centered on both behavior and internal representations, providing a clear benchmark for neuroscientific modeling and AI development. |
2025-01-04 | Asynchronous Hebbian/anti-Hebbian networks | Henrique Reis Aguiar, Matthias H. Hennig | Link | Lateral inhibition models coupled with Hebbian plasticity have been shown to learn factorised causal representations of input stimuli, for instance, oriented edges are learned from natural images. Currently, these models require the recurrent dynamics to settle into a stable state before weight changes can be applied, which is not only biologically implausible, but also impractical for real-time learning systems. Here, we propose a new Hebbian learning rule which is implemented using plausible biological mechanisms that have been observed experimentally. We find that this rule allows for efficient, time-continuous learning of factorised representations, very similar to the classic noncontinuous Hebbian/anti-Hebbian learning. Furthermore, we show that this rule naturally prevents catastrophic forgetting when stimuli from different distributions are shown sequentially. |
2025-04-03 | NeuroAI for AI Safety | Patrick Mineault, Niccolò Zanichelli, Joanne Zichen Peng, Anton Arkhipov, Eli Bingham, Julian Jara-Ettinger, Emily Mackevicius, Adam Marblestone, Marcelo Mattar, Andrew Payne, Sophia Sanborn, Karen Schroeder, Zenna Tavares, Andreas Tolias, Anthony Zador | Link | As AI systems become increasingly powerful, the need for safe AI has become more pressing. Humans are an attractive model for AI safety: as the only known agents capable of general intelligence, they perform robustly even under conditions that deviate significantly from prior experiences, explore the world safely, understand pragmatics, and can cooperate to meet their intrinsic goals. Intelligence, when coupled with cooperation and safety mechanisms, can drive sustained progress and well-being. These properties are a function of the architecture of the brain and the learning algorithms it implements. Neuroscience may thus hold important keys to technical AI safety that are currently underexplored and underutilized. In this roadmap, we highlight and critically evaluate several paths toward AI safety inspired by neuroscience: emulating the brain's representations, information processing, and architecture; building robust sensory and motor systems from imitating brain data and bodies; fine-tuning AI systems on brain data; advancing interpretability using neuroscience methods; and scaling up cognitively-inspired architectures. We make several concrete recommendations for how neuroscience can positively impact AI safety. |
2025-09-14 | Evaluating Representational Similarity Measures from the Lens of Functional Correspondence | Yiqing Bo, Ansh Soni, Sudhanshu Srivastava, Meenakshi Khosla | Link | Neuroscience and artificial intelligence (AI) both face the challenge of interpreting high-dimensional neural data, where the comparative analysis of such data is crucial for revealing shared mechanisms and differences between these complex systems. Despite the widespread use of representational comparisons and the abundance classes of comparison methods, a critical question remains: which metrics are most suitable for these comparisons? While some studies evaluate metrics based on their ability to differentiate models of different origins or constructions (e.g., various architectures), another approach is to assess how well they distinguish models that exhibit distinct behaviors. To investigate this, we examine the degree of alignment between various representational similarity measures and behavioral outcomes, employing group statistics and a comprehensive suite of behavioral metrics for comparison. In our evaluation of eight commonly used representational similarity metrics in the visual domain -- spanning alignment-based, Canonical Correlation Analysis (CCA)-based, inner product kernel-based, and nearest-neighbor methods -- we found that metrics like linear Centered Kernel Alignment (CKA) and Procrustes distance, which emphasize the overall geometric structure or shape of representations, excelled in differentiating trained from untrained models and aligning with behavioral measures, whereas metrics such as linear predictivity, commonly used in neuroscience, demonstrated only moderate alignment with behavior. These insights are crucial for selecting metrics that emphasize behaviorally meaningful comparisons in NeuroAI research. |
2024-09-09 | Evidence from fMRI Supports a Two-Phase Abstraction Process in Language Models | Emily Cheng, Richard J. Antonello | Link | Research has repeatedly demonstrated that intermediate hidden states extracted from large language models are able to predict measured brain response to natural language stimuli. Yet, very little is known about the representation properties that enable this high prediction performance. Why is it the intermediate layers, and not the output layers, that are most capable for this unique and highly general transfer task? In this work, we show that evidence from language encoding models in fMRI supports the existence of a two-phase abstraction process within LLMs. We use manifold learning methods to show that this abstraction process naturally arises over the course of training a language model and that the first "composition" phase of this abstraction process is compressed into fewer layers as training continues. Finally, we demonstrate a strong correspondence between layerwise encoding performance and the intrinsic dimensionality of representations from LLMs. We give initial evidence that this correspondence primarily derives from the inherent compositionality of LLMs and not their next-word prediction properties. |
Publish Date | Title | Authors | URL | Abstract |
---|---|---|---|---|
2025-10-21 | A Hybrid Enumeration Framework for Optimal Counterfactual Generation in Post-Acute COVID-19 Heart Failure | Jingya Cheng, Alaleh Azhir, Jiazi Tian, Hossein Estiri | Link | Counterfactual inference provides a mathematical framework for reasoning about hypothetical outcomes under alternative interventions, bridging causal reasoning and predictive modeling. We present a counterfactual inference framework for individualized risk estimation and intervention analysis, illustrated through a clinical application to post-acute sequelae of COVID-19 (PASC) among patients with pre-existing heart failure (HF). Using longitudinal diagnosis, laboratory, and medication data from a large health-system cohort, we integrate regularized predictive modeling with counterfactual search to identify actionable pathways to PASC-related HF hospital admissions. The framework combines exact enumeration with optimization-based methods, including the Nearest Instance Counterfactual Explanations (NICE) and Multi-Objective Counterfactuals (MOC) algorithms, to efficiently explore high-dimensional intervention spaces. Applied to more than 2700 individuals with confirmed SARS-CoV-2 infection and prior HF, the model achieved strong discriminative performance (AUROC: 0.88, 95% CI: 0.84-0.91) and generated interpretable, patient-specific counterfactuals that quantify how modifying comorbidity patterns or treatment factors could alter predicted outcomes. This work demonstrates how counterfactual reasoning can be formalized as an optimization problem over predictive functions, offering a rigorous, interpretable, and computationally efficient approach to personalized inference in complex biomedical systems. |
2025-10-21 | Improving the Generation and Evaluation of Synthetic Data for Downstream Medical Causal Inference | Harry Amad, Zhaozhi Qian, Dennis Frauen, Julianna Piskorz, Stefan Feuerriegel, Mihaela van der Schaar | Link | Causal inference is essential for developing and evaluating medical interventions, yet real-world medical datasets are often difficult to access due to regulatory barriers. This makes synthetic data a potentially valuable asset that enables these medical analyses, along with the development of new inference methods themselves. Generative models can produce synthetic data that closely approximate real data distributions, yet existing methods do not consider the unique challenges that downstream causal inference tasks, and specifically those focused on treatments, pose. We establish a set of desiderata that synthetic data containing treatments should satisfy to maximise downstream utility: preservation of (i) the covariate distribution, (ii) the treatment assignment mechanism, and (iii) the outcome generation mechanism. Based on these desiderata, we propose a set of evaluation metrics to assess such synthetic data. Finally, we present STEAM: a novel method for generating Synthetic data for Treatment Effect Analysis in Medicine that mimics the data-generating process of data containing treatments and optimises for our desiderata. We empirically demonstrate that STEAM achieves state-of-the-art performance across our metrics as compared to existing generative models, particularly as the complexity of the true data-generating process increases. |
2025-10-21 | Investigating LLM Capabilities on Long Context Comprehension for Medical Question Answering | Feras AlMannaa, Talia Tseriotou, Jenny Chim, Maria Liakata | Link | This study is the first to investigate LLM comprehension capabilities over long-context (LC) medical QA of clinical relevance. Our comprehensive assessment spans a range of content-inclusion settings based on their relevance, LLM models of varying capabilities and datasets across task formulations, revealing insights on model size effects, limitations, underlying memorization issues and the benefits of reasoning models. Importantly, we examine the effect of RAG on medical LC comprehension, uncover best settings in single versus multi-document reasoning datasets and showcase RAG strategies for improvements over LC. We shed light into some of the evaluation aspects using a multi-faceted approach. Our qualitative and error analyses address open questions on when RAG is beneficial over LC, revealing common failure cases. |
2025-10-21 | Exploring Membership Inference Vulnerabilities in Clinical Large Language Models | Alexander Nemecek, Zebin Yun, Zahra Rahmani, Yaniv Harel, Vipin Chaudhary, Mahmood Sharif, Erman Ayday | Link | As large language models (LLMs) become progressively more embedded in clinical decision-support, documentation, and patient-information systems, ensuring their privacy and trustworthiness has emerged as an imperative challenge for the healthcare sector. Fine-tuning LLMs on sensitive electronic health record (EHR) data improves domain alignment but also raises the risk of exposing patient information through model behaviors. In this work-in-progress, we present an exploratory empirical study on membership inference vulnerabilities in clinical LLMs, focusing on whether adversaries can infer if specific patient records were used during model training. Using a state-of-the-art clinical question-answering model, Llemr, we evaluate both canonical loss-based attacks and a domain-motivated paraphrasing-based perturbation strategy that more realistically reflects clinical adversarial conditions. Our preliminary findings reveal limited but measurable membership leakage, suggesting that current clinical LLMs provide partial resistance yet remain susceptible to subtle privacy risks that could undermine trust in clinical AI adoption. These results motivate continued development of context-aware, domain-specific privacy evaluations and defenses such as differential privacy fine-tuning and paraphrase-aware training, to strengthen the security and trustworthiness of healthcare AI systems. |
2025-10-21 | Prototyping an End-to-End Multi-Modal Tiny-CNN for Cardiovascular Sensor Patches | Mustafa Fuad Rifet Ibrahim, Tunc Alkanat, Maurice Meijer, Felix Manthey, Alexander Schlaefer, Peer Stelldinger | Link | The vast majority of cardiovascular diseases may be preventable if early signs and risk factors are detected. Cardiovascular monitoring with body-worn sensor devices like sensor patches allows for the detection of such signs while preserving the freedom and comfort of patients. However, the analysis of the sensor data must be robust, reliable, efficient, and highly accurate. Deep learning methods can automate data interpretation, reducing the workload of clinicians. In this work, we analyze the feasibility of applying deep learning models to the classification of synchronized electrocardiogram (ECG) and phonocardiogram (PCG) recordings on resource-constrained medical edge devices. We propose a convolutional neural network with early fusion of data to solve a binary classification problem. We train and validate our model on the synchronized ECG and PCG recordings from the Physionet Challenge 2016 dataset. Our approach reduces memory footprint and compute cost by three orders of magnitude compared to the state-of-the-art while maintaining competitive accuracy. We demonstrate the applicability of our proposed model on medical edge devices by analyzing energy consumption on a microcontroller and an experimental sensor device setup, confirming that on-device inference can be more energy-efficient than continuous data streaming. |
2025-10-21 | MorphModes: Non-rigid Registration via Adaptive Skinning Eigenmodes | Gabrielle Browne, Mengfei Liu, Eitan Grinspun, Otman Benchekroun | Link | Non-rigid registration is a crucial task with applications in medical imaging, industrial robotics, computer vision, and entertainment. Standard approaches accomplish this task using variations on the Non-Rigid Iterative Closest Point (NRICP) algorithms, which are prone to local minima and sensitive to initial conditions. We instead formulate the non-rigid registration problem as a Signed Distance Function (SDF) matching optimization problem, which provides richer shape information compared to traditional ICP methods. To avoid degenerate solutions, we propose to use a smooth Skinning Eigenmode subspace to parameterize the optimization problem. Finally, we propose an adaptive subspace optimization scheme to allow the resolution of localized deformations within the optimization. The result is a non-rigid registration algorithm that is more robust than NRICP, without the parameter sensitivity present in other SDF-matching approaches. |
2025-10-21 | Electromagnetic Field Exposure Assessment and Mitigation Strategies for Wireless Power Transfer Systems: A Review and Future Perspectives | Akimasa Hirata, Teruo Onishi, Naoki Shinohara, Valerio De Santis, Yinliang Diao, Mauro Feliziani, Takashi Hikage, Junqing Lan, Francesca Maradei, Keishi Miwa, Alexander Prokop, Yujun Shin, Seungyoung Ahn | Link | Wireless power transfer (WPT) technologies are increasingly being applied in fields ranging from consumer electronics and electric vehicles to space-based energy systems and medical implants. While WPT offers contactless power delivery, it introduces electromagnetic field (EMF) emissions, necessitating careful assessment to address safety and public health concerns. Exposure guidelines developed by ICNIRP and IEEE define frequency-dependent limits based on internal quantities, such as electric field strength and specific absorption rate, intended to prevent tissue nerve stimulation < 100 kHz and heating > 100 kHz, respectively. Complementing these guidelines, assessment standards including the International Electrotechnical Commission (IEC)/IEEE 63184 and IEC Technical Report 63377, provide practical procedures for evaluating the EMF exposure in WPT systems. This review offers a comparative overview of major WPT modalities, with a focus on recent developments in computational dosimetry and standardized assessment techniques for the complex, non-uniform fields typical of WPT environments. It also discusses electromagnetic interference with medical devices and exposure scenarios involving partial body proximity and various postures. A notable observation across modalities is the considerable variability, often spanning an order of magnitude, in the allowable transfer power, depending on the field distribution and assessment approach. Remaining challenges include the lack of harmonized guidance for intermediate frequencies and localized exposure, underscoring the importance of further coordination in international standardization efforts. Addressing these issues is essential for the safe and widespread deployment of WPT technologies. |
2025-10-21 | Privacy-Preserving Healthcare Data in IoT: A Synergistic Approach with Deep Learning and Blockchain | Behnam Rezaei Bezanjani, Seyyed Hamid Ghafouri, Reza Gholamrezaei | Link | The integration of Internet of Things (IoT) devices in healthcare has revolutionized patient care by enabling real-time monitoring, personalized treatments, and efficient data management. However, this technological advancement introduces significant security risks, particularly concerning the confidentiality, integrity, and availability of sensitive medical data. Traditional security measures are often insufficient to address the unique challenges posed by IoT environments, such as heterogeneity, resource constraints, and the need for real-time processing. To tackle these challenges, we propose a comprehensive three-phase security framework designed to enhance the security and reliability of IoT-enabled healthcare systems. In the first phase, the framework assesses the reliability of IoT devices using a reputation-based trust estimation mechanism, which combines device behavior analytics with off-chain data storage to ensure scalability. The second phase integrates blockchain technology with a lightweight proof-of-work mechanism, ensuring data immutability, secure communication, and resistance to unauthorized access. The third phase employs a lightweight Long Short-Term Memory (LSTM) model for anomaly detection and classification, enabling real-time identification of cyber threats. Simulation results demonstrate that the proposed framework outperforms existing methods, achieving a 2% increase in precision, accuracy, and recall, a 5% higher attack detection rate, and a 3% reduction in false alarm rate. These improvements highlight the framework's ability to address critical security concerns while maintaining scalability and real-time performance. |
2025-10-21 | IMB: An Italian Medical Benchmark for Question Answering | Antonio Romano, Giuseppe Riccio, Mariano Barone, Marco Postiglione, Vincenzo Moscato | Link | Online medical forums have long served as vital platforms where patients seek professional healthcare advice, generating vast amounts of valuable knowledge. However, the informal nature and linguistic complexity of forum interactions pose significant challenges for automated question answering systems, especially when dealing with non-English languages. We present two comprehensive Italian medical benchmarks: \textbf{IMB-QA}, containing 782,644 patient-doctor conversations from 77 medical categories, and \textbf{IMB-MCQA}, comprising 25,862 multiple-choice questions from medical specialty examinations. We demonstrate how Large Language Models (LLMs) can be leveraged to improve the clarity and consistency of medical forum data while retaining their original meaning and conversational style, and compare a variety of LLM architectures on both open and multiple-choice question answering tasks. Our experiments with Retrieval Augmented Generation (RAG) and domain-specific fine-tuning reveal that specialized adaptation strategies can outperform larger, general-purpose models in medical question answering tasks. These findings suggest that effective medical AI systems may benefit more from domain expertise and efficient information retrieval than from increased model scale. We release both datasets and evaluation frameworks in our GitHub repository to support further research on multilingual medical question answering: https://github.com/PRAISELab-PicusLab/IMB. |
2025-10-21 | Med-VRAgent: A Framework for Medical Visual Reasoning-Enhanced Agents | Guangfu Guo, Xiaoqian Lu, Yue Feng | Link | Visual Language Models (VLMs) achieve promising results in medical reasoning but struggle with hallucinations, vague descriptions, inconsistent logic and poor localization. To address this, we propose a agent framework named Medical Visual Reasoning Agent (\textbf{Med-VRAgent}). The approach is based on Visual Guidance and Self-Reward paradigms and Monte Carlo Tree Search (MCTS). By combining the Visual Guidance with tree search, Med-VRAgent improves the medical visual reasoning capabilities of VLMs. We use the trajectories collected by Med-VRAgent as feedback to further improve the performance by fine-tuning the VLMs with the proximal policy optimization (PPO) objective. Experiments on multiple medical VQA benchmarks demonstrate that our method outperforms existing approaches. |