Automatically updates arXiv papers about SOT & VLT, Multi-modal Learning, LLM and Video Understanding using GitHub Actions.

Xuchen-Li/cv-arxiv-daily

Updated on 2026.02.11

Table of Contents
  1. Single Object & Visual Language Tracking
  2. Large Language Model
  3. Video Understanding
  4. Multi-modal Learning

Single Object & Visual Language Tracking

Publish Date Title Authors PDF Code
2025-07-22 Explicit Context Reasoning with Supervision for Visual Tracking Fansheng Zeng et.al. 2507.16191 null
2025-07-21 Is Tracking really more challenging in First Person Egocentric Vision? Matteo Dunnhofer et.al. 2507.16015 null
2025-07-23 EndoControlMag: Robust Endoscopic Vascular Motion Magnification with Periodic Reference Resetting and Hierarchical Tissue-aware Dual-Mask Contro An Wang et.al. 2507.15292 null
2025-07-11 SAM2RL: Towards Reinforcement Learning Memory Control in Segment Anything Model 2 Alen Adamyan et.al. 2507.08548 null
2025-07-10 Temporal Unlearnable Examples: Preventing Personal Video Data from Unauthorized Exploitation by Object Tracking Qiangqiang Wu et.al. 2507.07483 null
2025-07-09 Token Bottleneck: One Token to Remember Dynamics Taekyung Kim et.al. 2507.06543 null
2025-07-08 What You Have is What You Track: Adaptive and Robust Multimodal Tracking Yuedong Tan et.al. 2507.05899 null
2025-07-08 Stable Tracking-in-the-Loop Control of Cable-Driven Surgical Manipulators under Erroneous Kinematic Chains Neelay Joglekar et.al. 2507.05663 null
2025-07-07 Spatial and Semantic Embedding Integration for Stereo Sound Event Localization and Detection in Regular Videos Davide Berghi et.al. 2507.04845 null
2025-07-05 Sensitive and accurate femtosecond pulse characterization via two-photon absorption in Fabry-Pérot laser diodes Adrian F. Chlebowski et.al. 2507.03978 null
2025-07-01 UMDATrack: Unified Multi-Domain Adaptive Tracking Under Adverse Weather Conditions Siyuan Yao et.al. 2507.00648 null
2025-07-01 ATSTrack: Enhancing Visual-Language Tracking by Aligning Temporal and Spatial Scales Yihao Zhen et.al. 2507.00454 null
2025-06-30 Mamba-FETrack V2: Revisiting State Space Model for Frame-Event based Visual Object Tracking Shiao Wang et.al. 2506.23783 null
2025-07-22 R1-Track: Direct Application of MLLMs to Visual Object Tracking via Reinforcement Learning Biao Wang et.al. 2506.21980 null
2025-06-25 Exploiting Lightweight Hierarchical ViT and Dynamic Framework for Efficient Visual Tracking Ben Kang et.al. 2506.20381 null
2025-06-17 Comparison of Two Methods for Stationary Incident Detection Based on Background Image Deepak Ghimire et.al. 2506.14256 null
2025-06-03 MVTD: A Benchmark Dataset for Maritime Visual Object Tracking Ahsan Baidar Bakht et.al. 2506.02866 null
2025-05-31 Towards Effective and Efficient Adversarial Defense with Diffusion Models for Robust Visual Tracking Long Xu et.al. 2506.00325 link
2025-05-29 CLDTracker: A Comprehensive Language Description for Visual Tracking Mohamad Alansari et.al. 2505.23704 link
2025-05-29 TrackVLA: Embodied Visual Tracking in the Wild Shaoan Wang et.al. 2505.23189 null
2025-05-28 TwinTrack: Bridging Vision and Contact Physics for Real-Time Tracking of Unknown Dynamic Objects Wen Yang et.al. 2505.22882 null
2025-05-27 Fully Spiking Neural Networks for Unified Frame-Event Object Tracking Jingjun Yang et.al. 2505.20834 null
2025-05-28 VLM Can Be a Good Assistant: Enhancing Embodied Visual Tracking with Self-Improving Vision-Language Models Kui Wu et.al. 2505.20718 null
2025-05-27 Hierarchical Instruction-aware Embodied Visual Tracking Kui Wu et.al. 2505.20710 null
2025-06-01 HAND Me the Data: Fast Robot Adaptation via Hand Path Retrieval Matthew Hong et.al. 2505.20455 null
2025-05-28 Progressive Scaling Visual Object Tracking Jack Hong et.al. 2505.19990 null
2025-05-23 Adapting SAM 2 for Visual Object Tracking: 1st Place Solution for MMVPR Challenge Multi-Modal Tracking Cheng-Yen Yang et.al. 2505.18111 null
2025-05-22 Efficient Motion Prompt Learning for Robust Visual Tracking Jie Zhao et.al. 2505.16321 link
2025-05-19 Towards Low-Latency Event Stream-based Visual Object Tracking: A Slow-Fast Approach Shiao Wang et.al. 2505.12903 link
2025-05-13 Towards Adaptive Meta-Gradient Adversarial Examples for Visual Tracking Wei-Long Tian et.al. 2505.08999 link
2025-05-11 DeepSORT-Driven Visual Tracking Approach for Gesture Recognition in Interactive Systems Tong Zhang et.al. 2505.07110 null
2025-05-09 CGTrack: Cascade Gating Network with Hierarchical Feature Aggregation for UAV Tracking Weihong Li et.al. 2505.05936 link
2025-05-07 Predicting Road Surface Anomalies by Visual Tracking of a Preceding Vehicle Petr Jahoda et.al. 2505.04392 null
2025-04-19 Adversarial Attack for RGB-Event based Visual Object Tracking Qiang Chen et.al. 2504.14423 link
2025-05-05 SAM2MOT: A Novel Paradigm of Multi-Object Tracking by Segmentation Junjie Jiang et.al. 2504.04519 link
2025-03-24 SPMTrack: Spatio-Temporal Parameter-Efficient Fine-Tuning with Mixture of Experts for Scalable Visual Tracking Wenrui Cai et.al. 2503.18338 link
2025-03-22 MUST: The First Dataset and Unified Framework for Multispectral UAV Single Object Tracking Haolin Qin et.al. 2503.17699 link
2025-03-21 Dynamic Attention Mechanism in Spatiotemporal Memory Networks for Object Tracking Meng Zhou et.al. 2503.16768 null
2025-03-17 UncTrack: Reliable Visual Object Tracking with Uncertainty-Aware Prototype Memory Network Siyuan Yao et.al. 2503.12888 link
2025-03-16 A Plug-and-Play Learning-based IMU Bias Factor for Robust Visual-Inertial Odometry Yang Yi et.al. 2503.12527 null
2025-03-14 Towards General Multimodal Visual Tracking Andong Lu et.al. 2503.11218 null
2025-03-09 Similarity-Guided Layer-Adaptive Vision Transformer for UAV Tracking Chaocan Xue et.al. 2503.06625 link
2025-03-09 Dynamic Updates for Language Adaptation in Visual-Language Tracking Xiaohai Li et.al. 2503.06621 link
2025-02-28 Technical Report for ReID-SAM on SkiTB Visual Tracking Challenge 2025 Kunjun Li et.al. 2503.01907 null
2025-03-01 Two-stream Beats One-stream: Asymmetric Siamese Network for Efficient Visual Tracking Jiawen Zhu et.al. 2503.00516 link
2025-02-27 MITracker: Multi-View Integration for Visual Object Tracking Mengjie Xu et.al. 2502.20111 null
2025-02-27 CFTrack: Enhancing Lightweight Visual Tracking through Contrastive Learning and Feature Matching Juntao Liang et.al. 2502.19705 null
2025-02-26 Enhanced Transformer-Based Tracking for Skiing Events: Overcoming Multi-Camera Challenges, Scale Variations and Rapid Motion -- SkiTB Visual Tracking Challenge 2025 Akhil Penta et.al. 2502.18867 null
2025-02-25 UASTrack: A Unified Adaptive Selection Framework with Modality-Customization in Single Object Tracking He Wang et.al. 2502.18220 null
2025-02-08 Event Stream-based Visual Object Tracking: HDETrack V2 and A High-Definition Benchmark Shiao Wang et.al. 2502.05574 link
2025-01-13 Robust Single Object Tracking in LiDAR Point Clouds under Adverse Weather Conditions Xiantong Zhao et.al. 2501.07133 null
2025-01-05 DeTrack: In-model Latent Denoising Learning for Visual Object Tracking Xinyu Zhou et.al. 2501.02467 null
2025-01-13 FusionSORT: Fusion Methods for Online Multi-object Visual Tracking Nathanael L. Baisa et.al. 2501.00843 link
2025-01-01 Less is More: Token Context-aware Learning for Object Tracking Chenlong Xu et.al. 2501.00758 link
2024-12-28 Learning Adaptive and View-Invariant Vision Transformer with Multi-Teacher Knowledge Distillation for Real-Time UAV Tracking You Wu et.al. 2412.20002 link
2024-12-26 SUTrack: Towards Simple and Unified Single Object Tracking Xin Chen et.al. 2412.19138 link
2024-12-15 Exploring Enhanced Contextual Information for Video-Level Object Tracking Ben Kang et.al. 2412.11023 link
2024-12-13 Visual Object Tracking across Diverse Data Modalities: A Review Mengmeng Wang et.al. 2412.09991 null
2025-03-07 MVCTrack: Boosting 3D Point Cloud Tracking via Multimodal-Guided Virtual Cues Zhaofeng Hu et.al. 2412.02734 link
2024-12-03 GSOT3D: Towards Generic 3D Single Object Tracking in the Wild Yifan Jiao et.al. 2412.02129 link
2025-02-06 Improving Accuracy and Generalization for Efficient Visual Tracking Ram Zaveri et.al. 2411.18855 null
2024-11-27 A comparison of extended object tracking with multi-modal sensors in indoor environment Jiangtao Shuai et.al. 2411.18476 null
2024-12-04 A Distractor-Aware Memory for Visual Object Tracking with SAM2 Jovana Videnovic et.al. 2411.17576 link
2024-11-23 How Texts Help? A Fine-grained Evaluation to Reveal the Role of Language in Vision-Language Tracking Xuchen Li et.al. 2411.15600 null
2024-11-24 ClickTrack: Towards Real-time Interactive Single Object Tracking Kuiran Wang et.al. 2411.13183 null
2024-11-30 SAMURAI: Adapting Segment Anything Model for Zero-Shot Visual Tracking with Motion-Aware Memory Cheng-Yen Yang et.al. 2411.11922 link
2024-12-09 Vision Eagle Attention: a new lens for advancing image classification Mahmudul Hasan et.al. 2411.10564 link
2024-11-14 MFTIQ: Multi-Flow Tracker with Independent Matching Quality Estimation Jonas Serych et.al. 2411.09551 link
2024-11-12 Visual Tracking with Intermittent Visibility: Switched Control Design and Implementation Yangge Li et.al. 2411.08144 null
2024-12-16 ChatTracker: Enhancing Visual Tracking Performance via Chatting with Multimodal Large Language Model Yiming Sun et.al. 2411.01756 null
2024-10-30 IP-MOT: Instance Prompt Learning for Cross-Domain Multi-Object Tracking Run Luo et.al. 2410.23907 null
2024-10-27 NT-VOT211: A Large-Scale Benchmark for Night-time Visual Object Tracking Yu Liu et.al. 2410.20421 link
2024-10-19 The Solution for Single Object Tracking Task of Perception Test Challenge 2024 Zhiqiang Zhong et.al. 2410.16329 null
2024-10-13 Gaussian Splatting Visual MPC for Granular Media Manipulation Wei-Cheng Tseng et.al. 2410.09740 null
2024-10-09 DTVLT: A Multi-modal Diverse Text Benchmark for Visual Language Tracking Based on LLM Xuchen Li et.al. 2410.02492 null
2024-09-30 Opt-in Camera: Person Identification in Video via UWB Localization and Its Application to Opt-in Systems Matthew Ishige et.al. 2409.19891 null
2024-09-27 Improving Visual Object Tracking through Visual Prompting Shih-Fang Chen et.al. 2409.18901 link
2024-09-26 General Compression Framework for Efficient Transformer Object Tracking Lingyi Hong et.al. 2409.17564 null
2024-09-25 Towards Underwater Camouflaged Object Tracking: An Experimental Evaluation of SAM and SAM 2 Chunhui Zhang et.al. 2409.16902 link
2024-09-25 Conditional Generative Denoiser for Nighttime UAV Tracking Yucheng Wang et.al. 2409.16834 link
2024-09-25 Progressive Representation Learning for Real-Time UAV Tracking Changhong Fu et.al. 2409.16652 link
2024-09-25 Enhancing Nighttime UAV Tracking with Light Distribution Suppression Liangliang Yao et.al. 2409.16631 link
2024-09-19 WeHelp: A Shared Autonomy System for Wheelchair Users Abulikemu Abuduweili et.al. 2409.12159 link
2024-09-18 Distilling Channels for Efficient Deep Tracking Shiming Ge et.al. 2409.11785 null
2024-09-13 Visual Language Tracking with Multi-modal Interaction: A Robust Benchmark Xuchen Li et.al. 2409.08887 null
2024-09-10 VBIT: Towards Enhancing Privacy Control Over IoT Devices Jad Al Aaraj et.al. 2409.06233 null
2024-09-03 Ultra-broadband room-temperature Fourier transform spectrometer with watt-level power consumption Jakub Mnich et.al. 2409.01875 null
2024-08-25 Camouflaged Object Tracking: A Benchmark Xiaoyu Guo et.al. 2408.13877 link
2024-08-21 Low-Light Object Tracking: A Benchmark Pengzhi Zhong et.al. 2408.11463 link
2024-08-20 MambaEVT: Event Stream based Visual Object Tracking using State Space Model Xiao Wang et.al. 2408.10487 link
2024-08-05 VoxelTrack: Exploring Voxel Representation for 3D Point Cloud Object Tracking Yuxuan Lu et.al. 2408.02263 null
2024-09-06 3D Single-object Tracking in Point Clouds with High Temporal Variation Qiao Wu et.al. 2408.02049 null
2024-09-09 SiamMo: Siamese Motion-Centric 3D Object Tracking Yuxiang Yang et.al. 2408.01688 link
2024-08-02 Visible-Thermal Multiple Object Tracking: Large-scale Video Dataset and Progressive Fusion Approach Yabin Zhu et.al. 2408.00969 link
2024-08-06 Broadband THz wave generation and detection in organic crystal PNPA at MHz repetition rates Lukasz A. Sterczewski et.al. 2407.20745 null
2024-07-16 Diff-Tracker: Text-to-Image Diffusion Models are Unsupervised Trackers Zhengbo Zhang et.al. 2407.08394 null
2024-07-11 PINN-Ray: A Physics-Informed Neural Network to Model Soft Robotic Fin Ray Fingers Xing Wang et.al. 2407.08222 null
2024-07-07 Addressing single object tracking in satellite imagery through prompt-engineered solutions Athena Psalta et.al. 2407.05518 null
2024-07-07 Learning Motion Blur Robust Vision Transformers with Dynamic Early Exit for Real-Time UAV Tracking You Wu et.al. 2407.05383 null
2024-07-09 P2P: Part-to-Part Motion Cues Guide a Strong Tracking Framework for LiDAR Point Clouds Jiahao Nie et.al. 2407.05238 link
2024-07-07 Tracking Reflected Objects: A Benchmark Xiaoyu Guo et.al. 2407.05235 null
2024-07-04 TrackPGD: A White-box Attack using Binary Masks against Robust Transformer Trackers Fatemeh Nourilenjan Nokabadi et.al. 2407.03946 link
2024-07-02 FlowTrack: Point-level Flow Network for 3D Single Object Tracking Shuo Li et.al. 2407.01959 null
2024-09-07 eMoE-Tracker: Environmental MoE-based Transformer for Robust Event-guided Object Tracking Yucheng Chen et.al. 2406.20024 null
2024-06-14 Constrained Motion Planning for a Robotic Endoscope Holder based on Hierarchical Quadratic Programming Jacinto Colan et.al. 2406.09982 null
2024-06-14 Robust compressive tracking via online weighted multiple instance learning Sandeep Singh Sengar et.al. 2406.09914 null
2024-07-01 Adaptively Bypassing Vision Transformer Blocks for Efficient Visual Tracking Xiangyang Yang et.al. 2406.08037 null
2024-06-07 Multi-Granularity Language-Guided Multi-Object Tracking Yuhao Li et.al. 2406.04844 link
2024-06-02 Robust Visual Tracking via Iterative Gradient Descent and Threshold Selection Zhuang Qi et.al. 2406.00589 null
2024-05-28 Reliable Object Tracking by Multimodal Hybrid Feature Extraction and Transformer-Based Fusion Hongze Sun et.al. 2405.17903 link
2024-05-27 LoReTrack: Efficient and Accurate Low-Resolution Transformer Tracking Shaohua Dong et.al. 2405.17660 null
2024-05-31 Awesome Multi-modal Object Tracking Chunhui Zhang et.al. 2405.14200 link
2024-05-20 DTLLM-VLT: Diverse Text Generation for Visual Language Tracking Based on LLM Xuchen Li et.al. 2405.12139 null
2024-05-16 A Novel Bounding Box Regression Method for Single Object Tracking Omar Abdelaziz et.al. 2405.10444 null
2024-05-16 Beyond Traditional Single Object Tracking: A Survey Omar Abdelaziz et.al. 2405.10439 null
2024-05-08 TENet: Targetness Entanglement Incorporating with Multi-Scale Pooling and Mutually-Guided Fusion for RGB-E Object Tracking Pengcheng Shao et.al. 2405.05004 link
2024-04-22 360VOTS: Visual Object Tracking and Segmentation in Omnidirectional Videos Yinzhe Xu et.al. 2404.13953 link
2024-05-25 An Experimental Study on Exploring Strong Lightweight Vision Transformers via Masked Image Modeling Pre-Training Jin Gao et.al. 2404.12210 link
2024-04-16 Attention-Aware Visualization: Tracking and Responding to User Perception Over Time Arvind Srinivasan et.al. 2404.10732 null
2024-04-15 Empowering Embodied Visual Tracking with Visual Foundation Models and Offline RL Fangwei Zhong et.al. 2404.09857 null
2024-04-15 Learning Tracking Representations from Single Point Annotations Qiangqiang Wu et.al. 2404.09504 null
2024-04-11 PillarTrack: Redesigning Pillar-based Transformer Network for Single Object Tracking on Point Clouds Weisheng Xu et.al. 2404.07495 link
2024-05-02 Longitudinal Analysis and Quantitative Assessment of Child Development through Mobile Interaction Juan Carlos Ruiz-Garcia et.al. 2404.06919 link
2024-04-09 LRR: Language-Driven Resamplable Continuous Representation against Adversarial Tracking Attacks Jianlang Chen et.al. 2404.06247 link
2024-04-08 Semi-Supervised Novelty Detection for Precise Ultra-Wideband Error Signal Prediction Umberto Albertin et.al. 2404.05351 null
2024-03-29 Context-Aware Integration of Language and Visual References for Natural Language Tracking Yanyan Shao et.al. 2403.19975 null
2024-03-27 TAFormer: A Unified Target-Aware Transformer for Video and Motion Joint Prediction in Aerial Scenes Liangyu Xu et.al. 2403.18238 null
2024-03-26 OmniVid: A Generative Framework for Universal Video Understanding Junke Wang et.al. 2403.17935 link
2024-03-26 Exploring Dynamic Transformer for Efficient Object Tracking Jiawen Zhu et.al. 2403.17651 null
2024-03-29 Elysium: Exploring Object-level Perception in Videos via MLLM Han Wang et.al. 2403.16558 link
2024-03-25 Multi-attention Associate Prediction Network for Visual Tracking Xinglong Sun et.al. 2403.16395 null
2024-03-28 SDSTrack: Self-Distillation Symmetric Adapter Learning for Multi-Modal Visual Object Tracking Xiaojun Hou et.al. 2403.16002 link
2024-03-23 Spatio-Temporal Bi-directional Cross-frame Memory for Distractor Filtering Point Cloud Single Object Tracking Shaoyu Sun et.al. 2403.15831 null
2024-03-19 TON-VIO: Online Time Offset Modeling Networks for Robust Temporal Alignment in High Dynamic Motion VIO Chaoran Xiong et.al. 2403.12504 link
2024-03-18 Pedestrian Tracking with Monocular Camera using Unconstrained 3D Motion Model Jan Krejčí et.al. 2403.11978 null
2024-03-16 A Spectrum-based Image Denoising Method with Edge Feature Enhancement Peter Luvton et.al. 2403.11036 null
2024-03-15 Autoregressive Queries for Adaptive Tracking with Spatio-Temporal Transformers Jinxia Xie et.al. 2403.10574 null
2024-03-14 OneTracker: Unifying Visual Object Tracking with Foundation Models and Efficient Tuning Lingyi Hong et.al. 2403.09634 null
2024-02-27 ACTrack: Adding Spatio-Temporal Condition for Visual Object Tracking Yushan Han et.al. 2403.07914 null
2024-04-03 Long-term Frame-Event Visual Tracking: Benchmark Dataset and Baseline Xiao Wang et.al. 2403.05839 link
2024-03-08 Tracking Meets LoRA: Faster Training, Larger Model, Stronger Performance Liting Lin et.al. 2403.05231 link
2024-03-08 Motion-Guided Dual-Camera Tracker for Low-Cost Skill Evaluation of Gastric Endoscopy Yuelin Zhang et.al. 2403.05146 link
2024-03-06 VastTrack: Vast Category Visual Object Tracking Liang Peng et.al. 2403.03493 link
2024-02-28 Enhancing Tracking Robustness with Auxiliary Adversarial Defense Networks Zhewei Wu et.al. 2402.17976 null
2024-02-26 SeqTrack3D: Exploring Sequence Information for Robust 3D Point Cloud Tracking Yu Lin et.al. 2402.16249 link
2024-02-26 Reading Relevant Feature from Global Representation Memory for Visual Object Tracking Xinyu Zhou et.al. 2402.14392 null
2024-02-13 Optimized Information Flow for Transformer Tracking Janani Kugarajeevan et.al. 2402.08195 link
2024-02-07 BioDrone: A Bionic Drone-based Single Object Tracking Benchmark for Robust Vision Xin Zhao et.al. 2402.04519 null
2024-02-04 Spatio-temporal Prompting Network for Robust Video Feature Extraction Guanxiong Sun et.al. 2402.02574 link
2024-01-24 Small Object Tracking in LiDAR Point Cloud: Learning the Target-awareness Prototype and Fine-grained Search Region Shengjing Tian et.al. 2401.13285 null
2024-01-23 Correlation-Embedded Transformer Tracking: A Single-Branch Framework Fei Xie et.al. 2401.12743 link
2024-01-20 Unifying Visual and Vision-Language Tracking via Contrastive Learning Yinchao Ma et.al. 2401.11228 link
2024-01-20 Towards Category Unification of 3D Single Object Tracking on Point Clouds Jiahao Nie et.al. 2401.11204 null
2024-01-18 Multi-task Learning for Joint Re-identification, Team Affiliation, and Role Classification for Sports Visual Tracking Amir M. Mansourian et.al. 2401.09942 null
2024-01-12 Dense Optical Flow Estimation Using Sparse Regularizers from Reduced Measurements Muhammad Wasim Nawaz et.al. 2401.06396 null
2024-01-18 Hold 'em and Fold 'em: Towards Human-scale, Feedback-Controlled Soft Origami Robots Immanuel Ampomah Mensah et.al. 2401.04650 null
2024-01-06 Explicit Visual Prompts for Visual Object Tracking Liangtao Shi et.al. 2401.03142 link
2024-01-03 ODTrack: Online Dense Temporal Token Learning for Visual Tracking Yaozong Zheng et.al. 2401.01686 link
2023-12-27 X Modality Assisting RGBT Object Tracking Zhaisheng Ding et.al. 2312.17273 null
2023-12-22 Cross-Modal Object Tracking via Modality-Aware Fusion Network and A Large-Scale Dataset Lei Liu et.al. 2312.14446 link
2023-12-18 Multi-Correlation Siamese Transformer Network with Dense Connection for 3D Single Object Tracking Shihao Feng et.al. 2312.11051 link
2023-12-17 Robust 3D Tracking with Quality-Aware Shape Completion Jingwen Zhang et.al. 2312.10608 null
2023-12-15 Tracking Skiers from the Top to the Bottom Matteo Dunnhofer et.al. 2312.09723 null
2023-12-11 M3SOT: Multi-frame, Multi-field, Multi-space 3D Single Object Tracking Jiaming Liu et.al. 2312.06117 link
2023-12-07 Instance Tracking in 3D Scenes from Egocentric Videos Yunhan Zhao et.al. 2312.04117 link
2024-02-19 Beyond Visual Cues: Synchronously Exploring Target-Centric Semantics for Vision-Language Tracking Jiawei Ge et.al. 2311.17085 null
2023-11-21 Visual tracking brain computer interface Changxing Huang et.al. 2311.12592 null
2024-01-10 ViKi-HyCo: A Hybrid-Control approach for complex car-like maneuvers Edison P. Velasco Sánchez et.al. 2311.07268 null

(back to top)

Large Language Model

Publish Date Title Authors PDF Code
2025-07-23 Pretraining on the Test Set Is No Longer All You Need: A Debate-Driven Approach to QA Benchmarks Linbo Cao et.al. 2507.17747 null
2025-07-23 Rubrics as Rewards: Reinforcement Learning Beyond Verifiable Domains Anisha Gunjal et.al. 2507.17746 null
2025-07-23 Megrez2 Technical Report Boxun Li et.al. 2507.17728 null
2025-07-23 BetterCheck: Towards Safeguarding VLMs for Automotive Perception Systems Malsha Ashani Mahawatta Dona et.al. 2507.17722 null
2025-07-23 AI Telephone Surveying: Automating Quantitative Data Collection with an AI Interviewer Danny D. Leybzon et.al. 2507.17718 null
2025-07-23 HydraOpt: Navigating the Efficiency-Performance Trade-off of Adapter Merging Taha Ceritli et.al. 2507.17706 null
2025-07-23 Towards Greater Leverage: Scaling Laws for Efficient Mixture-of-Experts Language Models Changxin Tian et.al. 2507.17702 null
2025-07-23 Thinking Isn't an Illusion: Overcoming the Limitations of Reasoning Models via Tool Augmentations Zhao Song et.al. 2507.17699 null
2025-07-23 Symbiotic Agents: A Novel Paradigm for Trustworthy AGI-driven Networks Ilias Chatzistefanidis et.al. 2507.17695 null
2025-07-23 Simulating multiple human perspectives in socio-ecological systems using large language models Yongchao Zeng et.al. 2507.17680 null
2025-07-23 See the Forest and the Trees: A Synergistic Reasoning Framework for Knowledge-Based Visual Question Answering Junjie Wang et.al. 2507.17659 null
2025-07-23 Who Attacks, and Why? Using LLMs to Identify Negative Campaigning in 18M Tweets across 19 Countries Victor Hartman et.al. 2507.17636 null
2025-07-23 A Hybrid Early-Exit Algorithm for Large Language Models Based on Space Alignment Decoding (SPADE) Bowen Zheng et.al. 2507.17618 null
2025-07-23 Decoding Consumer Preferences Using Attention-Based Language Models Joshua Foster et.al. 2507.17564 null
2025-07-23 BoSS: Beyond-Semantic Speech Qing Wang et.al. 2507.17563 null
2025-07-23 CodeReasoner: Enhancing the Code Reasoning Ability with Reinforcement Learning Lingxiao Tang et.al. 2507.17548 null
2025-07-23 Anticipate, Simulate, Reason (ASR): A Comprehensive Generative AI Framework for Combating Messaging Scams Xue Wen Tan et.al. 2507.17543 null
2025-07-23 AssertFlip: Reproducing Bugs via Inversion of LLM-Generated Passing Tests Lara Khatib et.al. 2507.17542 null
2025-07-23 Constructing Ophthalmic MLLM for Positioning-diagnosis Collaboration Through Clinical Cognitive Chain Reasoning Xinyao Liu et.al. 2507.17539 null
2025-07-23 InstructVLA: Vision-Language-Action Instruction Tuning from Understanding to Manipulation Shuai Yang et.al. 2507.17520 null
2025-07-22 Semi-off-Policy Reinforcement Learning for Vision-Language Slow-thinking Reasoning Junhao Shen et.al. 2507.16814 null
2025-07-22 LingBench++: A Linguistically-Informed Benchmark and Reasoning Framework for Multi-Step and Cross-Cultural Inference with LLMs Da-Chen Lian et.al. 2507.16809 null
2025-07-22 Rethinking LLM-Based RTL Code Optimization Via Timing Logic Metamorphosis Zhihao Xu et.al. 2507.16808 null
2025-07-22 Beyond Binary Rewards: Training LMs to Reason About Their Uncertainty Mehul Damani et.al. 2507.16806 null
2025-07-23 Agentar-Fin-R1: Enhancing Financial Intelligence through Domain Expertise, Training Efficiency, and Advanced Reasoning Yanjun Zheng et.al. 2507.16802 null
2025-07-23 Test-Time-Matching: Decouple Personality, Memory, and Linguistic Style in LLM-based Role-Playing Language Agent Xiaoyu Zhan et.al. 2507.16799 null
2025-07-22 Steering Out-of-Distribution Generalization with Concept Ablation Fine-Tuning Helena Casademunt et.al. 2507.16795 null
2025-07-22 ChatChecker: A Framework for Dialogue System Testing and Evaluation Through Non-cooperative User Simulation Roman Mayr et.al. 2507.16792 null
2025-07-22 Beyond Context Limits: Subconscious Threads for Long-Horizon Reasoning Hongyin Luo et.al. 2507.16784 null
2025-07-22 Cooling Matters: Benchmarking Large Language Models and Vision-Language Models on Liquid-Cooled Versus Air-Cooled H100 GPU Systems Imran Latif et.al. 2507.16781 null
2025-07-22 When LLMs Copy to Think: Uncovering Copy-Guided Attacks in Reasoning LLMs Yue Li et.al. 2507.16773 null
2025-07-22 WGRAMMAR: Leverage Prior Knowledge to Accelerate Structured Decoding Ran Wang et.al. 2507.16768 null
2025-07-22 Never Come Up Empty: Adaptive HyDE Retrieval for Improving LLM Developer Support Fangjian Lei et.al. 2507.16754 null
2025-07-22 CMP: A Composable Meta Prompt for SAM-Based Cross-Domain Few-Shot Segmentation Shuai Chen et.al. 2507.16753 null
2025-07-22 Collaborative Inference and Learning between Edge SLMs and Cloud LLMs: A Survey of Algorithms, Execution, and Open Challenges Senyao Li et.al. 2507.16731 null
2025-07-23 Deliberative Searcher: Improving LLM Reliability via Reinforcement Learning with constraints Zhenyun Yin et.al. 2507.16727 null
2025-07-22 SALM: Spatial Audio Language Model with Structured Embeddings for Understanding and Editing Jinbo Hu et.al. 2507.16724 null
2025-07-22 Enhancing Remote Sensing Vision-Language Models Through MLLM and LLM-Based High-Quality Image-Text Dataset Generation Yiguo He et.al. 2507.16716 null
2025-07-22 Experience is the Best Teacher: Grounding VLMs for Robotics through Self-Generated Memory Guowei Lan et.al. 2507.16713 null
2025-07-22 Advancing Risk and Quality Assurance: A RAG Chatbot for Improved Regulatory Compliance Lars Hillebrand et.al. 2507.16711 null
2025-07-21 Diffusion Beats Autoregressive in Data-Constrained Settings Mihir Prabhudesai et.al. 2507.15857 null
2025-07-21 Gemini 2.5 Pro Capable of Winning Gold at IMO 2025 Yichen Huang et.al. 2507.15855 null
2025-07-22 SeC: Advancing Complex Video Object Segmentation via Progressive Concept Construction Zhixiong Zhang et.al. 2507.15852 null
2025-07-21 The Other Mind: How Language Models Exhibit Human Temporal Cognition Lingyu Li et.al. 2507.15851 null
2025-07-21 3LM: Bridging Arabic, STEM, and Code through Benchmarking Basma El Amel Boussaha et.al. 2507.15850 null
2025-07-21 The Impact of Language Mixing on Bilingual LLM Reasoning Yihao Li et.al. 2507.15849 null
2025-07-21 FASTGEN: Fast and Cost-Effective Synthetic Tabular Data Generation with LLMs Anh Nguyen et.al. 2507.15839 null
2025-07-21 Just Ask for Music (JAM): Multimodal and Personalized Natural Language Music Recommendation Alessandro B. Melchiorre et.al. 2507.15826 null
2025-07-21 ACS: An interactive framework for conformal selection Yu Gui et.al. 2507.15825 null
2025-07-21 Can Your Model Separate Yolks with a Water Bottle? Benchmarking Physical Commonsense Understanding in Video Generation Models Enes Sanli et.al. 2507.15824 null
2025-07-21 Do AI models help produce verified bug fixes? Li Huang et.al. 2507.15822 null
2025-07-21 LLM Economist: Large Population Models and Mechanism Design in Multi-Agent Generative Simulacra Seth Karten et.al. 2507.15815 null
2025-07-21 True Multimodal In-Context Learning Needs Attention to the Visual Context Shuo Chen et.al. 2507.15807 null
2025-07-21 ConformalSAM: Unlocking the Potential of Foundational Segmentation Models in Semi-Supervised Semantic Segmentation with Conformal Prediction Danhui Chen et.al. 2507.15803 null
2025-07-21 Regularized Low-Rank Adaptation for Few-Shot Organ Segmentation Ghassen Baklouti et.al. 2507.15793 null
2025-07-21 Small LLMs Do Not Learn a Generalizable Theory of Mind via Reinforcement Learning Sneheel Sarangi et.al. 2507.15788 null
2025-07-21 Reservoir Computing as a Language Model Felix Köster et.al. 2507.15779 null
2025-07-21 Stabilizing Knowledge, Promoting Reasoning: Dual-Token Constraints for RLVR Jiakang Wang et.al. 2507.15778 null
2025-07-21 Left Leaning Models: AI Assumptions on Economic Policy Maxim Chupilkin et.al. 2507.15771 null
2025-07-21 A Framework for Analyzing Abnormal Emergence in Service Ecosystems Through LLM-based Agent Intention Mining Yifan Shen et.al. 2507.15770 null
2025-07-18 Franca: Nested Matryoshka Clustering for Scalable Visual Representation Learning Shashanka Venkataramanan et.al. 2507.14137 null
2025-07-18 CUDA-L1: Improving CUDA Optimization via Contrastive Reinforcement Learning Xiaoya Li et.al. 2507.14111 null
2025-07-18 Automated Interpretation of Non-Destructive Evaluation Contour Maps Using Large Language Models for Bridge Condition Assessment Viraj Nishesh Darji et.al. 2507.14107 null
2025-07-18 Generative AI-Driven High-Fidelity Human Motion Simulation Hari Iyer et.al. 2507.14097 null
2025-07-18 Lessons from the TREC Plain Language Adaptation of Biomedical Abstracts (PLABA) track Brian Ondov et.al. 2507.14096 null
2025-07-18 DPMT: Dual Process Multi-scale Theory of Mind Framework for Real-time Human-AI Collaboration Xiyun Li et.al. 2507.14088 null
2025-07-18 DENSE: Longitudinal Progress Note Generation with Temporal Modeling of Heterogeneous Clinical Notes Across Hospital Visits Garapati Keerthana et.al. 2507.14079 null
2025-07-18 VLA-Mark: A cross modal watermark for large vision-language alignment model Shuliang Liu et.al. 2507.14067 null
2025-07-18 Foundation Models as Class-Incremental Learners for Dermatological Image Classification Mohamed Elkhayat et.al. 2507.14050 null
2025-07-18 EdgeVLA: Efficient Vision-Language-Action Models Paweł Budzianowski et.al. 2507.14049 null
2025-07-18 Evaluating the Effectiveness of Cost-Efficient Large Language Models in Benchmark Biomedical Tasks Israt Jahan et.al. 2507.14045 null
2025-07-18 Architecting Human-AI Cocreation for Technical Services -- Interaction Modes and Contingency Factors Jochen Wulf et.al. 2507.14034 null
2025-07-18 KROMA: Ontology Matching with Knowledge Retrieval and Large Language Models Lam Nguyen et.al. 2507.14032 null
2025-07-18 Moodifier: MLLM-Enhanced Emotion-Driven Image Editing Jiarong Ye et.al. 2507.14024 null
2025-07-18 Efficient Temporal Tokenization for Mobility Prediction with Large Language Models Haoyu He et.al. 2507.14017 null
2025-07-18 OrthoInsight: Rib Fracture Diagnosis and Report Generation Based on Multi-Modal Large Models Ningyong Wu et.al. 2507.13993 null
2025-07-18 Leveraging Pathology Foundation Models for Panoptic Segmentation of Melanoma in H&E Images Jiaqi Lv et.al. 2507.13974 null
2025-07-18 Bottom-up Domain-specific Superintelligence: A Reliable Knowledge Graph is What We Need Bhishma Dedhia et.al. 2507.13966 null
2025-07-18 DUALRec: A Hybrid Sequential and Language Model Framework for Context-Aware Movie Recommendation Yitong Li et.al. 2507.13957 null
2025-07-18 Cross-modal Causal Intervention for Alzheimer's Disease Prediction Yutao Jin et.al. 2507.13956 null
2025-07-17 VideoITG: Multimodal Video Understanding with Instructed Temporal Grounding Shihao Wang et.al. 2507.13353 null
2025-07-17 VisionThink: Smart and Efficient Vision Language Model via Reinforcement Learning Senqiao Yang et.al. 2507.13348 null
2025-07-17 Comparing Apples to Oranges: A Dataset & Analysis of LLM Humour Understanding from Traditional Puns to Topical Jokes Tyler Loakman et.al. 2507.13335 null
2025-07-17 A Survey of Context Engineering for Large Language Models Lingrui Mei et.al. 2507.13334 null
2025-07-17 The Imitation Game: Turing Machine Imitator is Length Generalizable Reasoner Zhouqi Hua et.al. 2507.13332 null
2025-07-17 Vision-and-Language Training Helps Deploy Taxonomic Knowledge but Does Not Fundamentally Alter It Yulu Qin et.al. 2507.13328 null
2025-07-17 GeoReg: Weight-Constrained Few-Shot Regression for Socio-Economic Estimation using LLM Kyeongjin Ahn et.al. 2507.13323 null
2025-07-17 HapticCap: A Multimodal Dataset and Task for Understanding User Experience of Vibration Haptic Signals Guimin Hu et.al. 2507.13318 null
2025-07-17 Revisiting Reliability in the Reasoning-based Pose Estimation Benchmark Junsu Kim et.al. 2507.13314 null
2025-07-17 The Generative Energy Arena (GEA): Incorporating Energy Awareness in Large Language Model (LLM) Human Evaluations Carlos Arriaga et.al. 2507.13302 null
2025-07-17 AbGen: Evaluating Large Language Models in Ablation Study Design and Evaluation for Scientific Research Yilun Zhao et.al. 2507.13300 null
2025-07-17 Overview of the TalentCLEF 2025: Skill and Job Title Intelligence for Human Capital Management Luis Gasco et.al. 2507.13275 null
2025-07-17 Automating Steering for Safe Multimodal Large Language Models Lyucheng Wu et.al. 2507.13255 null
2025-07-17 HATS: Hindi Analogy Test Set for Evaluating Reasoning in Large Language Models Ashray Gupta et.al. 2507.13238 null
2025-07-17 Enhancing Cross-task Transfer of Large Language Models via Activation Steering Xinyu Tang et.al. 2507.13236 null
2025-07-18 MoTM: Towards a Foundation Model for Time Series Imputation based on Continuous Modeling Etienne Le Naour et.al. 2507.13207 null
2025-07-18 Automatically assessing oral narratives of Afrikaans and isiXhosa children Retief Louw et.al. 2507.13205 null
2025-07-17 GEMMAS: Graph-based Evaluation Metrics for Multi Agent Systems Jisoo Lee et.al. 2507.13190 null
2025-07-17 Black Box Deployed -- Functional Criteria for Artificial Moral Agents in the LLM Era Matthew E. Brophy et.al. 2507.13175 null
2025-07-17 Inverse Reinforcement Learning Meets Large Language Model Post-Training: Basics, Advances, and Opportunities Hao Sun et.al. 2507.13158 null
2025-07-16 Language Models Improve When Pretraining Data Matches Target Tasks David Mizrahi et.al. 2507.12466 null
2025-07-16 PhysX: Physical-Grounded 3D Asset Generation Ziang Cao et.al. 2507.12465 null
2025-07-16 CytoSAE: Interpretable Cell Embeddings for Hematology Muhammed Furkan Dasdelen et.al. 2507.12464 null
2025-07-16 Mitigating Object Hallucinations via Sentence-Level Early Intervention Shangpin Peng et.al. 2507.12455 null
2025-07-16 Characterizing State Space Model (SSM) and SSM-Transformer Hybrid Language Model Performance with Long Context Length Saptarshi Mitra et.al. 2507.12442 null
2025-07-16 Describe Anything Model for Visual Question Answering on Text-rich Images Yen-Linh Vu et.al. 2507.12441 null
2025-07-16 Can We Predict Alignment Before Models Finish Thinking? Towards Monitoring Misaligned Reasoning Models Yik Siu Chan et.al. 2507.12428 null
2025-07-16 Advancing Retrieval-Augmented Generation for Structured Enterprise and Internal Data Chandana Cheerla et.al. 2507.12425 null
2025-07-16 SWE-Perf: Can Language Models Optimize Code Performance on Real-World Repositories? Xinyi He et.al. 2507.12415 null
2025-07-16 AutoVDC: Automated Vision Data Cleaning Using Vision-Language Models Santosh Vasa et.al. 2507.12414 null
2025-07-16 ROC-n-reroll: How verifier imperfection affects test-time scaling Florian E. Dorner et.al. 2507.12399 null
2025-07-16 Assessing the Value of Visual Input: A Benchmark of Multimodal Large Language Models for Robotic Path Planning Jacinto Colan et.al. 2507.12391 null
2025-07-16 Probing for Arithmetic Errors in Language Models Yucheng Sun et.al. 2507.12379 null
2025-07-16 Developing Visual Augmented Q&A System using Scalable Vision Embedding Retrieval & Late Interaction Re-ranker Rachna Saxena et.al. 2507.12378 null
2025-07-16 Web-Browsing LLMs Can Access Social Media Profiles and Infer User Demographics Meysam Alizadeh et.al. 2507.12372 null
2025-07-16 Beyond Single Models: Enhancing LLM Detection of Ambiguity in Requests through Debate Ana Davila et.al. 2507.12370 null
2025-07-16 GitChameleon: Evaluating AI Code Generation Against Python Library Version Incompatibilities Diganta Misra et.al. 2507.12367 null
2025-07-16 Compositional Discrete Latent Code for High Fidelity, Productive Diffusion Models Samuel Lavoie et.al. 2507.12318 null
2025-07-16 Thought Purity: Defense Paradigm For Chain-of-Thought Attack Zihao Xue et.al. 2507.12314 null
2025-07-16 Chain-of-Descriptions: Improving Code LLMs for VHDL Code Generation and Summarization Prashanth Vijayaraghavan et.al. 2507.12308 null
2025-07-15 Towards Depth Foundation Model: Recent Trends in Vision-Based Depth Estimation Zhen Xu et.al. 2507.11540 null
2025-07-15 Streaming 4D Visual Geometry Transformer Dong Zhuo et.al. 2507.11539 null
2025-07-15 DrafterBench: Benchmarking Large Language Models for Tasks Automation in Civil Engineering Yinsheng Li et.al. 2507.11527 null
2025-07-15 LLM-based ambiguity detection in natural language instructions for collaborative surgical robots Ana Davila et.al. 2507.11525 null
2025-07-15 AirLLM: Diffusion Policy-based Adaptive LoRA for Remote Fine-Tuning of LLM over the Air Shiyi Yang et.al. 2507.11515 null
2025-07-15 LRMR: LLM-Driven Relational Multi-node Ranking for Lymph Node Metastasis Assessment in Rectal Cancer Yaoxian Dong et.al. 2507.11457 null
2025-07-16 Reasoning Strategies in Large Language Models: Can They Follow, Prefer, and Optimize? Yanjian Zhang et.al. 2507.11423 null
2025-07-15 Quantifying the Energy Consumption and Carbon Emissions of LLM Inference via Simulations Miray Özcan et.al. 2507.11417 null
2025-07-15 Seq vs Seq: An Open Suite of Paired Encoders and Decoders Orion Weller et.al. 2507.11412 null
2025-07-15 KisMATH: Do LLMs Have Knowledge of Implicit Structures in Mathematical Reasoning? Soumadeep Saha et.al. 2507.11408 null
2025-07-15 EXAONE 4.0: Unified Large Language Models Integrating Non-reasoning and Reasoning Modes LG AI Research et.al. 2507.11407 null
2025-07-15 DCR: Quantifying Data Contamination in LLMs Evaluation Cheng Xu et.al. 2507.11405 null
2025-07-15 Step-wise Policy for Rare-tool Knowledge (SPaRK): Offline RL that Drives Diverse Tool Use in LLMs Gabriel Bo et.al. 2507.11371 null
2025-07-15 From Chaos to Automation: Enabling the Use of Unstructured Data for Robotic Process Automation Kelly Kurowski et.al. 2507.11364 null
2025-07-15 What is the Best Process Model Representation? A Comparative Analysis for Process Modeling with Large Language Models Alexis Brissard et.al. 2507.11356 null
2025-07-15 Foundation Models for Logistics: Toward Certifiable, Conversational Planning Interfaces Yunhao Yang et.al. 2507.11352 null
2025-07-15 RefModel: Detecting Refactorings using Foundation Models Pedro Simões et.al. 2507.11346 null
2025-07-15 Guiding LLM Decision-Making with Fairness Reward Models Zara Hall et.al. 2507.11344 null
2025-07-15 MonoMVSNet: Monocular Priors Guided Multi-View Stereo Network Jianfei Jiang et.al. 2507.11333 null
2025-07-16 Automated Novelty Evaluation of Academic Paper: A Collaborative Approach Integrating Human and Large Language Model Knowledge Wenqing Wu et.al. 2507.11330 null
2025-07-14 EmbRACE-3K: Embodied Reasoning and Action in Complex Environments Mingxian Lin et.al. 2507.10548 null
2025-07-14 Fusing LLM Capabilities with Routing Data Tao Feng et.al. 2507.10540 null
2025-07-14 Graph World Model Tao Feng et.al. 2507.10539 null
2025-07-14 CodeJudgeBench: Benchmarking LLM-as-a-Judge for Coding Tasks Hongchao Jiang et.al. 2507.10535 null
2025-07-14 Reasoning or Memorization? Unreliable Results of Reinforcement Learning Due to Data Contamination Mingqi Wu et.al. 2507.10532 null
2025-07-14 Mixture-of-Recursions: Learning Dynamic Recursive Depths for Adaptive Token-Level Computation Sangmin Bae et.al. 2507.10524 null
2025-07-14 Chat with AI: The Surprising Turn of Real-time Video Communication from Human to AI Jiangkai Wu et.al. 2507.10510 null
2025-07-14 Scene-Aware Conversational ADAS with Generative AI for Real-Time Driver Assistance Kyungtae Han et.al. 2507.10500 null
2025-07-14 Can You Detect the Difference? İsmail Tarım et.al. 2507.10475 null
2025-07-14 MLAR: Multi-layer Large Language Model-based Robotic Process Automation Applicant Tracking Mohamed T. Younes et.al. 2507.10472 null
2025-07-14 An Empirical Evaluation of AI-Powered Non-Player Characters' Perceived Realism and Performance in Virtual Reality Environments Mikko Korkiakoski et.al. 2507.10469 null
2025-07-14 Logic layer Prompt Control Injection (LPCI): A Novel Security Vulnerability Class in Agentic Systems Hammad Atta et.al. 2507.10457 null
2025-07-14 CoralVQA: A Large-Scale Visual Question Answering Dataset for Coral Reef Image Understanding Hongyong Han et.al. 2507.10449 null
2025-07-15 Text-Visual Semantic Constrained AI-Generated Image Quality Assessment Qiang Li et.al. 2507.10432 null
2025-07-14 Towards Emotion Co-regulation with LLM-powered Socially Assistive Robots: Integrating LLM Prompts and Robotic Behaviors to Support Parent-Neurodivergent Child Dyads Jing Li et.al. 2507.10427 null
2025-07-14 Multiple Choice Learning of Low Rank Adapters for Language Modeling Victor Letzelter et.al. 2507.10419 null
2025-07-14 Zorse: Optimizing LLM Training Efficiency on Heterogeneous GPU Clusters Runsheng Benson Guo et.al. 2507.10392 null
2025-07-14 Extracting Important Tokens in E-Commerce Queries with a Tag Interaction-Aware Transformer Model Md. Ahsanul Kabir et.al. 2507.10385 null
2025-07-14 Test-Time Canonicalization by Foundation Models for Robust Perception Utkarsh Singhal et.al. 2507.10375 null
2025-07-14 Beyond Graph Model: Reliable VLM Fine-Tuning via Random Graph Adapter Bo Jiang et.al. 2507.10355 null
2025-07-11 The Non-Linear Representation Dilemma: Is Causal Abstraction Enough for Mechanistic Interpretability? Denis Sutter et.al. 2507.08802 null
2025-07-11 Lumos-1: On Autoregressive Video Generation from a Unified Model Perspective Hangjie Yuan et.al. 2507.08801 null
2025-07-11 KV Cache Steering for Inducing Reasoning in Small Language Models Max Belitsky et.al. 2507.08799 null
2025-07-11 One Token to Fool LLM-as-a-Judge Yulai Zhao et.al. 2507.08794 null
2025-07-11 From One to More: Contextual Part Latents for 3D Generation Shaocong Dong et.al. 2507.08772 null
2025-07-11 BlockFFN: Towards End-Side Acceleration-Friendly Mixture-of-Experts with Chunk-Level Activation Sparsity Chenyang Song et.al. 2507.08771 null
2025-07-11 EqualMotion: Accessible Motion Capture for the Creative Industries Clarice Hilton et.al. 2507.08744 null
2025-07-11 Multilingual Multimodal Software Developer for Code Generation Linzheng Chai et.al. 2507.08719 null
2025-07-11 Unreal is all you need: Multimodal ISAC Data Simulation with Only One Engine Kongwu Huang et.al. 2507.08716 null
2025-07-11 KG-Attention: Knowledge Graph-Guided Attention at Test-Time via Bidirectional Information Aggregation Songlin Zhai et.al. 2507.08704 null
2025-07-11 ByDeWay: Boost Your multimodal LLM with DEpth prompting in a Training-Free Way Rajarshi Roy et.al. 2507.08679 null
2025-07-11 LLMCup: Ranking-Enhanced Comment Updating with LLMs Hua Ge et.al. 2507.08671 null
2025-07-11 KELPS: A Framework for Verified Multi-Language Autoformalization via Semantic-Syntactic Alignment Jiyao Zhang et.al. 2507.08665 null
2025-07-11 Introspection of Thought Helps AI Agents Haoran Sun et.al. 2507.08664 null
2025-07-11 Leanabell-Prover-V2: Verifier-integrated Reasoning for Formal Theorem Proving via Reinforcement Learning Xingguang Ji et.al. 2507.08649 null
2025-07-11 DatasetAgent: A Novel Multi-Agent System for Auto-Constructing Datasets from Real-World Images Haoran Sun et.al. 2507.08648 null
2025-07-11 NL in the Middle: Code Translation with LLMs and Intermediate Representations Chi-en Amy Tai et.al. 2507.08627 null
2025-07-11 Adaptive Framework for Ambient Intelligence in Rehabilitation Assistance Gábor Baranyi et.al. 2507.08624 null
2025-07-11 A comprehensive study of LLM-based argument classification: from LLAMA through GPT-4o to Deepseek-R1 Marcin Pietroń et.al. 2507.08621 null
2025-07-11 Agentic Large Language Models for Conceptual Systems Engineering and Design Soheyl Massoudi et.al. 2507.08619 null
2025-07-10 Skip a Layer or Loop it? Test-Time Depth Adaptation of Pretrained LLMs Ziyue Li et.al. 2507.07996 null
2025-07-10 Multigranular Evaluation for Brain Visual Decoding Weihao Xia et.al. 2507.07993 null
2025-07-10 Multi-Granular Spatio-Temporal Token Merging for Training-Free Acceleration of Video LLMs Jeongseok Hyun et.al. 2507.07990 null
2025-07-10 Automating Expert-Level Medical Reasoning Evaluation of Large Language Models Shuang Zhou et.al. 2507.07988 null
2025-07-10 CLIP Won't Learn Object-Attribute Binding from Natural Data and Here is Why Bijay Gurung et.al. 2507.07985 null
2025-07-10 OST-Bench: Evaluating the Capabilities of MLLMs in Online Spatio-temporal Scene Understanding JingLi Lin et.al. 2507.07984 null
2025-07-10 Performance and Practical Considerations of Large and Small Language Models in Clinical Decision Support in Rheumatology Sabine Felde et.al. 2507.07983 null
2025-07-10 Geometry Forcing: Marrying Video Diffusion and 3D Representation for Consistent World Modeling Haoyu Wu et.al. 2507.07982 null
2025-07-10 Why is Your Language Model a Poor Implicit Reward Model? Noam Razin et.al. 2507.07981 null
2025-07-10 Defending Against Prompt Injection With a Few DefensiveTokens Sizhe Chen et.al. 2507.07974 null
2025-07-10 Scaling RL to Long Videos Yukang Chen et.al. 2507.07966 null
2025-07-10 MIRIX: Multi-Agent Memory System for LLM-Based Agents Yu Wang et.al. 2507.07957 null
2025-07-10 Dynamic Chunking for End-to-End Hierarchical Sequence Modeling Sukjun Hwang et.al. 2507.07955 null
2025-07-10 Input Conditioned Layer Dropping in Speech Foundation Models Abdul Hannan et.al. 2507.07954 null
2025-07-10 SAGE: A Visual Language Model for Anomaly Detection via Fact Enhancement and Entropy-aware Alignment Guoxin Zang et.al. 2507.07939 null
2025-07-10 Can Large Language Models Improve Phishing Defense? A Large-Scale Controlled Experiment on Warning Dialogue Explanations Federico Maria Cau et.al. 2507.07916 null
2025-07-10 MIRA: A Novel Framework for Fusing Modalities in Medical RAG Jinhong Wang et.al. 2507.07902 null
2025-07-10 An Integrated Framework of Prompt Engineering and Multidimensional Knowledge Graphs for Legal Dispute Analysis Mingda Zhang et.al. 2507.07893 null
2025-07-10 Automating MD simulations for Proteins using Large language Models: NAMD-Agent Achuth Chandrasekhar et.al. 2507.07887 null
2025-07-10 Opting Out of Generative AI: a Behavioral Experiment on the Role of Education in Perplexity AI Avoidance Roberto Ulloa et.al. 2507.07881 null
2025-07-09 Towards Multimodal Understanding via Stable Diffusion as a Task-Aware Feature Extractor Vatsal Agarwal et.al. 2507.07106 null
2025-07-09 4KAgent: Agentic Any Image to 4K Super-Resolution Yushen Zuo et.al. 2507.07105 null
2025-07-09 Vision-Language-Vision Auto-Encoder: Scalable Knowledge Distillation from Diffusion Models Tiezheng Zhang et.al. 2507.07104 null
2025-07-09 Small Batch Size Training for Language Models: When Vanilla SGD Works, and Why Gradient Accumulation Is Wasteful Martin Marek et.al. 2507.07101 null
2025-07-09 Evaluating Attribute Confusion in Fashion Text-to-Image Generation Ziyue Liu et.al. 2507.07079 null
2025-07-09 5C Prompt Contracts: A Minimalist, Creative-Friendly, Token-Efficient Design Framework for Individual and SME LLM Usage Ugur Ari et.al. 2507.07045 null
2025-07-09 UniConv: Unifying Retrieval and Response Generation for Large Language Models in Conversations Fengran Mo et.al. 2507.07030 null
2025-07-09 FlexOlmo: Open Language Models for Flexible Data Use Weijia Shi et.al. 2507.07024 null
2025-07-09 First Return, Entropy-Eliciting Explore Tianyu Zheng et.al. 2507.07017 null
2025-07-09 Integrating Pathology Foundation Models and Spatial Transcriptomics for Cellular Decomposition from Histology Images Yutong Sun et.al. 2507.07013 null
2025-07-09 GNN-ViTCap: GNN-Enhanced Multiple Instance Learning with Vision Transformers for Whole Slide Image Classification and Captioning S M Taslim Uddin Raju et.al. 2507.07006 null
2025-07-09 Learning Deliberately, Acting Intuitively: Unlocking Test-Time Reasoning in Multimodal LLMs Yahan Yu et.al. 2507.06999 null
2025-07-09 MCA-RG: Enhancing LLMs with Medical Concept Alignment for Radiology Report Generation Qilong Xing et.al. 2507.06992 null
2025-07-09 Are They All Good? Evaluating the Quality of CoTs in LLM-based Code Generation Binquan Zhang et.al. 2507.06980 null
2025-07-09 Free on the Fly: Enhancing Flexibility in Test-Time Adaptation with Online EM Qiyuan Dai et.al. 2507.06973 null
2025-07-09 Scaling Towards the Information Boundary of Instruction Set: InfinityInstruct-Subject Technical Report Li Du et.al. 2507.06968 null
2025-07-09 CheXPO: Preference Optimization for Chest X-ray VLMs with Counterfactual Rationale Xiao Liang et.al. 2507.06959 null
2025-07-09 Investigating the Robustness of Retrieval-Augmented Generation at the Query Level Sezen Perçin et.al. 2507.06956 null
2025-07-10 What Has a Foundation Model Found? Using Inductive Bias to Probe for World Models Keyon Vafa et.al. 2507.06952 null
2025-07-10 Rethinking Verification for LLM Code Generation: From Generation to Testing Zihan Ma et.al. 2507.06920 null
2025-07-08 RSRefSeg 2: Decoupling Referring Remote Sensing Image Segmentation with Foundation Models Keyan Chen et.al. 2507.06231 null
2025-07-08 Efficiency-Effectiveness Reranking FLOPs for LLM-based Rerankers Zhiyuan Peng et.al. 2507.06223 null
2025-07-08 Aligned Textual Scoring Rules Yuxuan Lu et.al. 2507.06221 null
2025-07-08 Is Diversity All You Need for Scalable Robotic Manipulation? Modi Shi et.al. 2507.06219 null
2025-07-08 CultureCLIP: Empowering CLIP with Cultural Awareness through Synthetic Images and Contextualized Captions Yuchen Huang et.al. 2507.06210 null
2025-07-08 Ontological differentiation as a measure of semantic accuracy Pablo Garcia-Cuadrillero et.al. 2507.06208 null
2025-07-08 Differential Mamba Nadav Schneider et.al. 2507.06204 null
2025-07-08 A Survey on Latent Reasoning Rui-Jie Zhu et.al. 2507.06203 null
2025-07-08 UQLM: A Python Package for Uncertainty Quantification in Large Language Models Dylan Bouchard et.al. 2507.06196 null
2025-07-08 SQLBarber: A System Leveraging Large Language Models to Generate Customized and Realistic SQL Workloads Jiale Lao et.al. 2507.06192 null
2025-07-08 The Delta Learning Hypothesis: Preference Tuning on Weak Data can Yield Strong Gains Scott Geng et.al. 2507.06187 null
2025-07-08 Hidden Prompts in Manuscripts Exploit AI-Assisted Peer Review Zhicheng Lin et.al. 2507.06185 null
2025-07-08 Enhancing Scientific Visual Question Answering through Multimodal Reasoning and Ensemble Modeling Prahitha Movva et.al. 2507.06183 null
2025-07-08 Data-Semantics-Aware Recommendation of Diverse Pivot Tables Whanhee Cho et.al. 2507.06171 null
2025-07-09 Skywork-R1V3 Technical Report Wei Shen et.al. 2507.06167 null
2025-07-08 Evaluation of Habitat Robotics using Large Language Models William Li et.al. 2507.06157 null
2025-07-08 Large Language Models Predict Human Well-being -- But Not Equally Everywhere Pat Pataranutaporn et.al. 2507.06141 null
2025-07-08 LangMamba: A Language-driven Mamba Framework for Low-dose CT Denoising with Vision-language Models Zhihao Chen et.al. 2507.06140 null
2025-07-08 Coding Triangle: How Does Large Language Model Understand Code? Taolin Zhang et.al. 2507.06138 null
2025-07-08 PrefixAgent: An LLM-Powered Design Framework for Efficient Prefix Adder Optimization Dongsheng Zuo et.al. 2507.06127 null
2025-07-07 Beyond Simple Edits: X-Planner for Complex Instruction-Based Image Editing Chun-Hsiao Yeh et.al. 2507.05259 null
2025-07-07 Spatio-Temporal LLM: Reasoning about Environments and Actions Haozhen Zheng et.al. 2507.05258 null
2025-07-07 Evaluating Memory in LLM Agents via Incremental Multi-Turn Interactions Yuanzhe Hu et.al. 2507.05257 null
2025-07-07 Open Vision Reasoner: Transferring Linguistic Cognitive Behavior for Visual Reasoning Yana Wei et.al. 2507.05255 null
2025-07-07 Response Attack: Exploiting Contextual Priming to Jailbreak Large Language Models Ziqi Miao et.al. 2507.05248 null
2025-07-07 When Chain of Thought is Necessary, Language Models Struggle to Evade Monitors Scott Emmons et.al. 2507.05246 null
2025-07-07 StreamVLN: Streaming Vision-and-Language Navigation via SlowFast Context Modeling Meng Wei et.al. 2507.05240 null
2025-07-07 Logit Reweighting for Topic-Focused Summarization Joschka Braun et.al. 2507.05235 null
2025-07-07 NavigScene: Bridging Local Perception and Global Navigation for Beyond-Visual-Range Autonomous Driving Qucheng Peng et.al. 2507.05227 null
2025-07-07 QuEst: Enhancing Estimates of Quantile-Based Distributional Measures Using Model Predictions Zhun Deng et.al. 2507.05220 null
2025-07-07 All in One: Visual-Description-Guided Unified Point Cloud Segmentation Zongyan Han et.al. 2507.05211 null
2025-07-07 MedGemma Technical Report Andrew Sellergren et.al. 2507.05201 null
2025-07-07 Train-before-Test Harmonizes Language Model Rankings Guanhua Zhang et.al. 2507.05195 null
2025-07-07 CREW-WILDFIRE: Benchmarking Agentic Multi-Agent Collaborations at Scale Jonathan Hyun et.al. 2507.05178 null
2025-07-08 OpenS2S: Advancing Fully Open-Source End-to-End Empathetic Large Speech Language Model Chen Wang et.al. 2507.05177 null
2025-07-07 Differential Attention for Multimodal Crisis Event Analysis Nusrat Munia et.al. 2507.05165 null
2025-07-07 InfoSteer: Steering Information Utility in Language Model Post-Training Chunyuan Deng et.al. 2507.05158 null
2025-07-07 AI Generated Text Detection Using Instruction Fine-tuned Large Language and Transformer-Based Models Chinnappa Guggilla et.al. 2507.05157 null
2025-07-07 Interpretable Mnemonic Generation for Kanji Learning via Expectation-Maximization Jaewook Lee et.al. 2507.05137 null
2025-07-07 LERa: Replanning with Visual Feedback in Instruction Following Svyatoslav Pchelintsev et.al. 2507.05135 null
2025-07-03 Bootstrapping Grounded Chain-of-Thought in Multimodal LLMs for Data-Efficient Model Adaptation Jiaer Xia et.al. 2507.02859 null
2025-07-03 Requirements Elicitation Follow-Up Question Generation Yuchen Shen et.al. 2507.02858 null
2025-07-03 Answer Matching Outperforms Multiple Choice for Language Model Evaluation Nikhil Chandak et.al. 2507.02856 null
2025-07-03 MOTIF: Modular Thinking via Reinforcement Fine-tuning in LLMs Purbesh Mitra et.al. 2507.02851 null
2025-07-03 LLM Hypnosis: Exploiting User Feedback for Unauthorized Knowledge Injection to All Users Almog Hilel et.al. 2507.02850 null
2025-07-03 Visual Contextual Attack: Jailbreaking MLLMs with Image-Driven Context Injection Ziqi Miao et.al. 2507.02844 null
2025-07-03 LLM-Driven Treatment Effect Estimation Under Inference Time Text Confounding Yuchen Ma et.al. 2507.02843 null
2025-07-03 StepHint: Multi-level Stepwise Hints Enhance Reinforcement Learning to Reason Kaiyi Zhang et.al. 2507.02841 null
2025-07-03 ExPO: Unlocking Hard Reasoning with Self-Explanation-Guided Reinforcement Learning Ruiyang Zhou et.al. 2507.02834 null
2025-07-03 Generalizing Verifiable Instruction Following Valentina Pyatkin et.al. 2507.02833 null
2025-07-03 SynapseRoute: An Auto-Route Switching Framework on Dual-State Large Language Model Wencheng Zhang et.al. 2507.02822 null
2025-07-03 Multimodal Mathematical Reasoning with Diverse Solving Perspective Wenhao Shi et.al. 2507.02804 null
2025-07-03 Is Reasoning All You Need? Probing Bias in the Age of Reasoning Language Models Riccardo Cantini et.al. 2507.02799 null
2025-07-03 No time to train! Training-Free Reference-Based Instance Segmentation Miguel Espinosa et.al. 2507.02798 null
2025-07-03 From Long Videos to Engaging Clips: A Human-Inspired Video Editing Framework with Multimodal Narrative Understanding Xiangfeng Wang et.al. 2507.02790 null
2025-07-03 Moral Responsibility or Obedience: What Do We Want from AI? Joseph Boland et.al. 2507.02788 null
2025-07-03 Self-Correction Bench: Revealing and Addressing the Self-Correction Blind Spot in LLMs Ken Tsui et.al. 2507.02778 null
2025-07-03 KERAP: A Knowledge-Enhanced Reasoning Approach for Accurate Zero-shot Diagnosis Prediction Using Multi-agent LLMs Yuzhang Xie et.al. 2507.02773 null
2025-07-03 DeSTA2.5-Audio: Toward General-Purpose Large Audio Language Model with Self-Generated Cross-Modal Alignment Ke-Han Lu et.al. 2507.02768 null
2025-07-03 Knowledge Protocol Engineering: A New Paradigm for AI in Domain-Specific Knowledge Work Guangwei Zhang et.al. 2507.02760 null
2025-07-02 How Well Does GPT-4o Understand Vision? Evaluating Multimodal Foundation Models on Standard Computer Vision Tasks Rahul Ramachandran et.al. 2507.01955 null
2025-07-02 Kwai Keye-VL Technical Report Kwai Keye Team et.al. 2507.01949 null
2025-07-02 SpecCLIP: Aligning and Translating Spectroscopic Measurements for Stars Xiaosheng Zhao et.al. 2507.01939 null
2025-07-02 The Thin Line Between Comprehension and Persuasion in LLMs Adrian de Wynter et.al. 2507.01936 null
2025-07-03 Large Language Model-Driven Closed-Loop UAV Operation with Semantic Observations Wenhao Wang et.al. 2507.01930 null
2025-07-02 A Survey on Vision-Language-Action Models: An Action Tokenization Perspective Yifan Zhong et.al. 2507.01925 null
2025-07-03 Decision-Oriented Text Evaluation Yu-Shiang Huang et.al. 2507.01923 null
2025-07-02 Gradient-Adaptive Policy Optimization: Towards Multi-Objective Alignment of Large Language Models Chengao Li et.al. 2507.01915 null
2025-07-02 Reasoning to Edit: Hypothetical Instruction-Based Image Editing with Visual Reasoning Qingdong He et.al. 2507.01908 null
2025-07-02 AI4Research: A Survey of Artificial Intelligence for Scientific Research Qiguang Chen et.al. 2507.01903 null
2025-07-02 High-Layer Attention Pruning with Rescaling Songtao Liu et.al. 2507.01900 null
2025-07-02 MiCoTA: Bridging the Learnability Gap with Intermediate CoT and Teacher Assistants Dongyi Ding et.al. 2507.01887 null
2025-07-02 A computationally frugal open-source foundation model for thoracic disease detection in lung cancer screening programs Niccolò McConnell et.al. 2507.01881 null
2025-07-02 Towards Foundation Auto-Encoders for Time-Series Anomaly Detection Gastón García González et.al. 2507.01875 null
2025-07-02 DIY-MKG: An LLM-Based Polyglot Language Learning System Kenan Tang et.al. 2507.01872 null
2025-07-02 Bridging UI Design and chatbot Interactions: Applying Form-Based Principles to Conversational Agents Sanjay Krishna Anbalagan et.al. 2507.01862 null
2025-07-02 TypeTele: Releasing Dexterity in Teleoperation by Dexterous Manipulation Types Yuhao Lin et.al. 2507.01857 null
2025-07-02 Eka-Eval : A Comprehensive Evaluation Framework for Large Language Models in Indian Languages Samridhi Raj Sinha et.al. 2507.01853 null
2025-07-02 Low-Perplexity LLM-Generated Sequences and Where To Find Them Arthur Wuhrmann et.al. 2507.01844 null
2025-07-02 MoIRA: Modular Instruction Routing Architecture for Multi-Task Robotics Dmytro Kuzmenko et.al. 2507.01843 null
2025-07-01 Teaching Time Series to See and Speak: Forecasting with Aligned Visual and Textual Perspectives Sixun Dong et.al. 2506.24124 null
2025-06-30 Calligrapher: Freestyle Text Image Customization Yue Ma et.al. 2506.24123 null
2025-06-30 Data Uniformity Improves Training Efficiency and More, with a Convergence Framework Beyond the NTK Regime Yuqing Wang et.al. 2506.24120 null
2025-07-01 SPIRAL: Self-Play on Zero-Sum Games Incentivizes Reasoning via Multi-Agent Multi-Turn Reinforcement Learning Bo Liu et.al. 2506.24119 null
2025-07-01 Intertextual Parallel Detection in Biblical Hebrew: A Transformer-Based Benchmark David M. Smiley et.al. 2506.24117 null
2025-06-30 On the Predictive Power of Representation Dispersion in Language Models Yanhong Li et.al. 2506.24106 null
2025-06-30 DenseWorld-1M: Towards Detailed Dense Grounded Caption in the Real World Xiangtai Li et.al. 2506.24102 null
2025-06-30 MotionGPT3: Human Motion as a Second Modality Bingfan Zhu et.al. 2506.24086 null
2025-06-30 Logit-Gap Steering: Efficient Short-Suffix Jailbreaks for Aligned Large Language Models Tung-Ling Li et.al. 2506.24056 null
2025-06-30 Agent.xpu: Efficient Scheduling of Agentic LLM Workloads on Heterogeneous SoC Xinming Wei et.al. 2506.24045 null
2025-06-30 A Survey on Vision-Language-Action Models for Autonomous Driving Sicong Jiang et.al. 2506.24044 null
2025-06-30 Foundation Models for Zero-Shot Segmentation of Scientific Images without AI-Ready Data Shubhabrata Mukherjee et.al. 2506.24039 null
2025-06-30 Ella: Embodied Social Agents with Lifelong Memory Hongxin Zhang et.al. 2506.24019 null
2025-06-30 EXPERT: An Explainable Image Captioning Evaluation Metric with Structured Explanations Hyunjong Kim et.al. 2506.24016 null
2025-06-30 Large Language Models Don't Make Sense of Word Problems. A Scoping Review from a Mathematics Education Perspective Anselm R. Strohmaier et.al. 2506.24006 null
2025-06-30 The Illusion of Progress? A Critical Look at Test-Time Adaptation for Vision-Language Models Lijun Sheng et.al. 2506.24000 null
2025-06-30 Auto-TA: Towards Scalable Automated Thematic Analysis (TA) via Multi-Agent Large Language Models with Reinforcement Learning Seungjun Yi et.al. 2506.23998 null
2025-06-30 StyleDrive: Towards Driving-Style Aware Benchmarking of End-To-End Autonomous Driving Ruiyang Hao et.al. 2506.23982 null
2025-06-30 TaP: A Taxonomy-Guided Framework for Automated and Scalable Preference Data Generation Renren Jin et.al. 2506.23979 null
2025-06-30 Visual and Memory Dual Adapter for Multi-Modal Object Tracking Boyue Xu et.al. 2506.23972 null
2025-06-27 MiCo: Multi-image Contrast for Reinforcement Visual Reasoning Xi Chen et.al. 2506.22434 null
2025-06-27 The Automated LLM Speedrunning Benchmark: Reproducing NanoGPT Improvements Bingchen Zhao et.al. 2506.22419 null
2025-06-27 Sequential Diagnosis with Language Models Harsha Nori et.al. 2506.22405 null
2025-06-27 HyperCLOVA X THINK Technical Report NAVER Cloud HyperCLOVA X Team et.al. 2506.22403 null
2025-06-27 Refining Czech GEC: Insights from a Multi-Experiment Approach Petr Pechman et.al. 2506.22402 null
2025-06-27 QuickSilver -- Speeding up LLM Inference through Dynamic Token Halting, KV Skipping, Contextual Token Fusion, and Adaptive Matryoshka Quantization Danush Khanna et.al. 2506.22396 null
2025-06-27 Test-Time Consistency in Vision Language Models Shih-Han Chou et.al. 2506.22395 null
2025-06-27 What Makes ChatGPT Effective for Software Issue Resolution? An Empirical Study of Developer-ChatGPT Conversations in GitHub Ramtin Ehsani et.al. 2506.22390 null
2025-06-27 Can Video Large Multimodal Models Think Like Doubters-or Double-Down: A Study on Defeasible Video Entailment Yue Zhang et.al. 2506.22385 null
2025-06-27 Probabilistic Optimality for Inference-time Scaling Youkang Wang et.al. 2506.22376 null
2025-06-27 Exploiting Vision Language Model for Training-Free 3D Point Cloud OOD Detection via Graph Score Propagation Tiankai Chen et.al. 2506.22375 null
2025-06-27 Towards Fair Rankings: Leveraging LLMs for Gender Bias Detection and Measurement Maryam Mousavian et.al. 2506.22372 null
2025-06-27 Can Large Language Models Help Students Prove Software Correctness? An Experimental Study with Dafny Carolina Carreira et.al. 2506.22370 null
2025-06-27 DiffSoundStream: Efficient Speech Tokenization via Diffusion Decoding Yang Yang et.al. 2506.22362 null
2025-06-27 Concept-Level AI for Telecom: Moving Beyond Large Language Models Viswanath Kumarskandpriya et.al. 2506.22359 null
2025-06-27 Optimal Estimation of Watermark Proportions in Hybrid AI-Human Texts Xiang Li et.al. 2506.22343 null
2025-06-27 Evaluating Scoring Bias in LLM-as-a-Judge Qingquan Li et.al. 2506.22316 null
2025-06-27 Detection of Personal Data in Structured Datasets Using a Large Language Model Albert Agisha Ntwali et.al. 2506.22305 null
2025-06-27 Rethinking Visual Token Reduction in LVLMs under Cross-modal Misalignment Rui Xu et.al. 2506.22283 null
2025-06-27 COOCO -- Common Objects Out-of-Context -- Semantic Violation in Scenes: Investigating Multimodal Context in Referential Communication Filippo Merlo et.al. 2506.22274 null
2025-06-26 Where to find Grokking in LLM Pretraining? Monitor Memorization-to-Generalization without Test Ziyue Li et.al. 2506.21551 null
2025-06-26 mTSBench: Benchmarking Multivariate Time Series Anomaly Detection and Model Selection at Scale Xiaona Zhou et.al. 2506.21550 null
2025-06-26 SAM4D: Segment Anything in Camera and LiDAR Streams Jianyun Xu et.al. 2506.21547 null
2025-06-26 Data Efficacy for Language Model Training Yalun Dai et.al. 2506.21545 null
2025-06-26 PsyLite Technical Report Fangjun Ding et.al. 2506.21536 null
2025-06-26 Exploring the Design Space of 3D MLLMs for CT Report Generation Mohammed Baharoon et.al. 2506.21535 null
2025-06-26 "What's Up, Doc?": Analyzing How Users Seek Health Information in Large-Scale Conversational AI Datasets Akshay Paruchuri et.al. 2506.21532 null
2025-06-26 Potemkin Understanding in Large Language Models Marina Mancoridis et.al. 2506.21521 null
2025-06-26 Assessing an evolutionary search engine for small language models, prompts, and evaluation metrics Cláudio Lúcio do Val Lopes et.al. 2506.21512 null
2025-06-26 Mitigating Hallucination of Large Vision-Language Models via Dynamic Logits Calibration Jiahe Chen et.al. 2506.21509 null
2025-06-26 skLEP: A Slovak General Language Understanding Benchmark Marek Šuppa et.al. 2506.21508 null
2025-06-26 Mind2Web 2: Evaluating Agentic Search with Agent-as-a-Judge Boyu Gou et.al. 2506.21506 null
2025-06-26 Bridging Offline and Online Reinforcement Learning for LLMs Jack Lanchantin et.al. 2506.21495 null
2025-06-26 Global and Local Entailment Learning for Natural World Imagery Srikumar Sastry et.al. 2506.21476 null
2025-06-26 TopK Language Models Ryosuke Takahashi et.al. 2506.21468 null
2025-06-26 Efficient and Reuseable Cloud Configuration Search Using Discovery Spaces Michael Johnston et.al. 2506.21467 null
2025-06-26 Aligning Spoken Dialogue Models from User Interactions Anne Wu et.al. 2506.21463 null
2025-06-26 Spatial Mental Modeling from Limited Views Baiqiao Yin et.al. 2506.21458 null
2025-06-26 ThinkSound: Chain-of-Thought Reasoning in Multimodal Large Language Models for Audio Generation and Editing Huadai Liu et.al. 2506.21448 null
2025-06-26 Text2Cypher Across Languages: Evaluating Foundational Models Beyond English Makbule Gulcin Ozsoy et.al. 2506.21445 null
2025-06-25 The Decrypto Benchmark for Multi-Agent Reasoning and Theory of Mind Andrei Lupu et.al. 2506.20664 null
2025-06-25 Memento: Note-Taking for Your Future Self Chao Wan et.al. 2506.20642 null
2025-06-25 Towards Community-Driven Agents for Machine Learning Engineering Sijie Li et.al. 2506.20640 null
2025-06-26 DiffuCoder: Understanding and Improving Masked Diffusion Models for Code Generation Shansan Gong et.al. 2506.20639 null
2025-06-25 Shape2Animal: Creative Animal Generation from Natural Silhouettes Quoc-Duy Tran et.al. 2506.20616 null
2025-06-25 AI Assistants to Enhance and Exploit the PETSc Knowledge Base Barry Smith et.al. 2506.20608 null
2025-06-25 Model Editing as a Double-Edged Sword: Steering Agent Ethical Behavior Toward Beneficence or Harm Baixiang Huang et.al. 2506.20606 null
2025-06-25 Video Perception Models for 3D Scene Synthesis Rui Huang et.al. 2506.20601 null
2025-06-25 HRIBench: Benchmarking Vision-Language Models for Real-Time Human Perception in Human-Robot Interaction Zhonghao Shi et.al. 2506.20566 null
2025-06-25 Large Language Model-Driven Code Compliance Checking in Building Information Modeling Soumya Madireddy et.al. 2506.20551 null
2025-06-25 When Life Gives You Samples: The Benefits of Scaling up Inference Compute for Multilingual LLMs Ammar Khairi et.al. 2506.20544 null
2025-06-25 WattsOnAI: Measuring, Analyzing, and Visualizing Energy and Carbon Footprint of AI Workloads Hongzhen Huang et.al. 2506.20535 null
2025-06-25 Case-based Reasoning Augmented Large Language Model Framework for Decision Making in Realistic Safety-Critical Driving Scenarios Wenbin Gan et.al. 2506.20531 null
2025-06-25 Asymmetric REINFORCE for off-Policy Reinforcement Learning: Balancing positive and negative rewards Charles Arnal et.al. 2506.20520 null
2025-06-25 OctoThinker: Mid-training Incentivizes Reinforcement Learning Scaling Zengzhi Wang et.al. 2506.20512 null
2025-06-25 BotHash: Efficient and Training-Free Bot Detection Through Approximate Nearest Neighbor Edoardo Di Paolo et.al. 2506.20503 null
2025-06-25 ReCode: Updating Code API Knowledge with Reinforcement Learning Haoze Wu et.al. 2506.20495 null
2025-06-25 Brains and language models converge on a shared conceptual space across different languages Zaid Zada et.al. 2506.20489 null
2025-06-25 Behavior Foundation Model: Towards Next-Generation Whole-Body Control System of Humanoid Robots Mingqi Yuan et.al. 2506.20487 null
2025-06-25 Counterfactual Influence as a Distributional Quantity Matthieu Meeus et.al. 2506.20481 null
2025-06-24 Unified Vision-Language-Action Model Yuqi Wang et.al. 2506.19850 null
2025-06-24 Orthogonal Finetuning Made Scalable Zeju Qiu et.al. 2506.19847 null
2025-06-24 JoyAgents-R1: Joint Evolution Dynamics for Versatile Multi-LLM Agents with Reinforcement Learning Ai Han et.al. 2506.19846 null
2025-06-24 MAM: Modular Multi-Agent Framework for Multi-Modal Medical Diagnosis via Role-Specialized Collaboration Yucheng Zhou et.al. 2506.19835 null
2025-06-24 Evaluating Compliance with Visualization Guidelines in Diagrams for Scientific Publications Using Large Vision Language Models Johannes Rückert et.al. 2506.19825 null
2025-06-24 Persona Features Control Emergent Misalignment Miles Wang et.al. 2506.19823 null
2025-06-24 CronusVLA: Transferring Latent Motion Across Time for Multi-Frame Prediction in Manipulation Hao Li et.al. 2506.19816 null
2025-06-24 Curating art exhibitions using machine learning Eurico Covas et.al. 2506.19813 null
2025-06-24 KnowRL: Exploring Knowledgeable Reinforcement Learning for Factuality Baochang Ren et.al. 2506.19807 null
2025-06-24 LLM-Based Social Simulations Require a Boundary Zengqing Wu et.al. 2506.19806 null
2025-06-24 KnowML: Improving Generalization of ML-NIDS with Attack Knowledge Graphs Xin Fan Guo et.al. 2506.19802 null
2025-06-24 Why Do Open-Source LLMs Struggle with Data Analysis? A Systematic Empirical Study Yuqi Zhu et.al. 2506.19794 null
2025-06-24 SAGE: Strategy-Adaptive Generation Engine for Query Rewriting Teng Wang et.al. 2506.19783 null
2025-06-24 Multi-Preference Lambda-weighted Listwise DPO for Dynamic Preference Alignment Yuhui Sun et.al. 2506.19780 null
2025-06-24 SRFT: A Single-Stage Method with Supervised and Reinforcement Fine-Tuning for Reasoning Yuqian Fu et.al. 2506.19767 null
2025-06-24 Arabic Dialect Classification using RNNs, Transformers, and Large Language Models: A Comparative Analysis Omar A. Essameldin et.al. 2506.19753 null
2025-06-24 Breaking Barriers: Do Reinforcement Post Training Gains Transfer To Unseen Domains? Chuxuan Hu et.al. 2506.19733 null
2025-06-24 LLM-Driven Medical Document Analysis: Enhancing Trustworthy Pathology and Differential Diagnosis Lei Kang et.al. 2506.19702 null
2025-06-24 Outlier-Safe Pre-Training for Robust 4-Bit Quantization of Large Language Models Jungwoo Park et.al. 2506.19697 null
2025-06-24 UltraAD: Fine-Grained Ultrasound Anomaly Classification via Few-Shot CLIP Adaptation Yue Zhou et.al. 2506.19694 null
2025-06-23 Vision as a Dialect: Unifying Visual Understanding and Generation via Text-Aligned Representations Jiaming Han et.al. 2506.18898 null
2025-06-23 ReasonFlux-PRM: Trajectory-Aware PRMs for Long Chain-of-Thought Reasoning in LLMs Jiaru Zou et.al. 2506.18896 null
2025-06-23 Steering Conceptual Bias via Transformer Latent-Subspace Activation Vansh Sharma et.al. 2506.18887 null
2025-06-23 Universal Video Temporal Grounding with Generative Multi-modal Large Language Models Zeqian Li et.al. 2506.18883 null
2025-06-23 OMEGA: Can LLMs Reason Outside the Box in Math? Evaluating Exploratory, Compositional, and Transformative Generalization Yiyou Sun et.al. 2506.18880 null
2025-06-23 CommVQ: Commutative Vector Quantization for KV Cache Compression Junyan Li et.al. 2506.18879 null
2025-06-23 OmniAvatar: Efficient Audio-Driven Avatar Video Generation with Adaptive Body Animation Qijun Gan et.al. 2506.18866 null
2025-06-23 TAMMs: Temporal-Aware Multimodal Model for Satellite Image Change Understanding and Forecasting Zhongbin Guo et.al. 2506.18862 null
2025-06-23 LongWriter-Zero: Mastering Ultra-Long Text Generation via Reinforcement Learning Yuhao Wu et.al. 2506.18841 null
2025-06-23 STU-PID: Steering Token Usage via PID Controller for Efficient Large Language Model Reasoning Aryasomayajula Ram Bharadwaj et.al. 2506.18831 null
2025-06-23 Understanding Software Engineering Agents: A Study of Thought-Action-Result Trajectories Islem Bouzenia et.al. 2506.18824 null
2025-06-23 RWESummary: A Framework and Test for Choosing Large Language Models to Summarize Real-World Evidence (RWE) Studies Arjun Mukerji et.al. 2506.18819 null
2025-06-23 Context-Aware CodeLLM Eviction for AI-assisted Coding Kishanthan Thangarajah et.al. 2506.18796 null
2025-06-23 TRIZ Agents: A Multi-Agent LLM Approach for TRIZ-Based Innovation Kamil Szczepanik et.al. 2506.18783 null
2025-06-23 Existing LLMs Are Not Self-Consistent For Simple Tasks Zhenru Lin et.al. 2506.18781 null
2025-06-23 Programming by Backprop: LLMs Acquire Reusable Algorithmic Abstractions During Code Training Jonathan Cook et.al. 2506.18777 null
2025-06-23 Towards Group Fairness with Multiple Sensitive Attributes in Federated Foundation Models Yuning Yang et.al. 2506.18732 null
2025-06-23 PARALLELPROMPT: Extracting Parallelism from Large Language Model Queries Steven Kolawole et.al. 2506.18728 null
2025-06-23 Multi-modal Anchor Gated Transformer with Knowledge Distillation for Emotion Recognition in Conversation Jie Li et.al. 2506.18716 link
2025-06-23 LLM-enhanced Interactions in Human-Robot Collaborative Drawing with Older Adults Marianne Bossema et.al. 2506.18711 null
2025-06-20 VLN-R1: Vision-Language Navigation via Reinforcement Fine-Tuning Zhangyang Qi et.al. 2506.17221 null
2025-06-20 No Free Lunch: Rethinking Internal Feedback for LLM Reasoning Yanzhi Zhang et.al. 2506.17219 null
2025-06-20 Machine Mental Imagery: Empower Multimodal Reasoning with Latent Visual Tokens Zeyuan Yang et.al. 2506.17218 link
2025-06-20 BREAD: Branched Rollouts from Expert Anchors Bridge SFT & RL for Reasoning Xuechen Zhang et.al. 2506.17211 null
2025-06-20 Fine-Tuning Lowers Safety and Disrupts Evaluation Consistency Kathleen C. Fraser et.al. 2506.17209 null
2025-06-20 Dissecting the SWE-Bench Leaderboards: Profiling Submitters and Architectures of LLM- and Agent-Based Repair Systems Matias Martinez et.al. 2506.17208 null
2025-06-20 DreamCube: 3D Panorama Generation via Multi-plane Synchronization Yukun Huang et.al. 2506.17206 null
2025-06-20 Confidence Scoring for LLM-Generated SQL in Supply Chain Data Extraction Jiekai Ma et.al. 2506.17203 null
2025-06-20 Detecting LLM-Generated Short Answers and Effects on Learner Performance Shambhavi Bhushan et.al. 2506.17196 link
2025-06-20 CLEAR-3K: Assessing Causal Explanatory Capabilities in Language Models Naiming Liu et.al. 2506.17180 null
2025-06-20 The MedPerturb Dataset: What Non-Content Perturbations Reveal About Human and Clinical LLM Decision Making Abinitha Gourabathina et.al. 2506.17163 null
2025-06-20 Analyzing PDFs like Binaries: Adversarially Robust PDF Malware Analysis via Intermediate Representation and Language Model Side Liu et.al. 2506.17162 null
2025-06-20 Do We Need Large VLMs for Spotting Soccer Actions? Ritabrata Chakraborty et.al. 2506.17144 null
2025-06-20 MeDi: Metadata-Guided Diffusion Models for Mitigating Biases in Tumor Classification David Jacob Drexlin et.al. 2506.17140 null
2025-06-20 Large Language Model Unlearning for Source Code Xue Jiang et.al. 2506.17125 null
2025-06-20 When Can Model-Free Reinforcement Learning be Enough for Thinking? Josiah P. Hanna et.al. 2506.17124 null
2025-06-20 Cache Me If You Can: How Many KVs Do You Need for Effective Long-Context LMs? Adithya Bhaskar et.al. 2506.17121 link
2025-06-20 Reassessing Code Authorship Attribution in the Era of Language Models Atish Kumar Dipongkor et.al. 2506.17120 null
2025-06-20 Are Bias Evaluation Methods Biased ? Lina Berrayana et.al. 2506.17111 null
2025-06-20 Towards Advanced Mathematical Reasoning for LLMs via First-Order Logic Theorem Proving Chuxue Cao et.al. 2506.17104 null
2025-06-18 PhantomHunter: Detecting Unseen Privately-Tuned LLM-Generated Text via Family-Aware Learning Yuhui Shi et.al. 2506.15683 null
2025-06-18 GenRecal: Generation after Recalibration from Large to Small Vision-Language Models Byung-Kwan Lee et.al. 2506.15681 null
2025-06-18 Dense SAE Latents Are Features, Not Bugs Xiaoqing Sun et.al. 2506.15679 null
2025-06-18 SwarmAgentic: Towards Fully Automated Agentic System Generation via Swarm Intelligence Yao Zhang et.al. 2506.15672 null
2025-06-18 CC-LEARN: Cohort-based Consistency Learning Xiao Ye et.al. 2506.15662 null
2025-06-18 PhishDebate: An LLM-Based Multi-Agent Framework for Phishing Website Detection Wenhao Li et.al. 2506.15656 null
2025-06-18 AutoRule: Reasoning Chain-of-thought Extracted Rule-based Rewards Improve Preference Learning Tevin Wang et.al. 2506.15651 null
2025-06-18 Dual-Stage Value-Guided Inference with Margin-Based Reward Adjustment for Fast and Faithful VLM Captioning Ankan Deria et.al. 2506.15649 null
2025-06-18 deepSURF: Detecting Memory Safety Vulnerabilities in Rust Through Fuzzing LLM-Augmented Harnesses Georgios Androutsopoulos et.al. 2506.15648 null
2025-06-18 Exploring and Exploiting the Inherent Efficiency within Large Reasoning Models for Self-Guided Efficiency Enhancement Weixiang Zhao et.al. 2506.15647 null
2025-06-18 Demystifying the Visual Quality Paradox in Multimodal Large Language Models Shuo Xing et.al. 2506.15645 null
2025-06-18 FindingDory: A Benchmark to Evaluate Memory in Embodied Agents Karmesh Yadav et.al. 2506.15635 null
2025-06-18 Revisiting Compositional Generalization Capability of Large Language Models Considering Instruction Following Ability Yusuke Sakai et.al. 2506.15629 null
2025-06-18 The Effect of State Representation on LLM Agent Behavior in Dynamic Routing Games Lyle Goodyear et.al. 2506.15624 null
2025-06-18 The Compositional Architecture of Regret in Large Language Models Xiangxiang Cui et.al. 2506.15617 null
2025-06-18 BoxFusion: Reconstruction-Free Open-Vocabulary 3D Object Detection via Real-Time Multi-View Box Fusion Yuqing Lan et.al. 2506.15610 null
2025-06-18 LoX: Low-Rank Extrapolation Robustifies LLM Safety Against Fine-tuning Gabrel J. Perin et.al. 2506.15606 link
2025-06-18 LiteGD: Lightweight and dynamic GPU Dispatching for Large-scale Heterogeneous Clusters Kunming Zhang et.al. 2506.15595 null
2025-06-18 WikiMixQA: A Multimodal Benchmark for Question Answering over Tables and Charts Negar Foroutan et.al. 2506.15594 link
2025-06-18 DiscoSG: Towards Discourse-Level Text Scene Graph Parsing through Iterative Graph Refinement Shaoqing Lin et.al. 2506.15583 link
2025-06-17 A Variational Framework for Improving Naturalness in Generative Spoken Language Models Li-Wei Chen et.al. 2506.14767 link
2025-06-17 ASCD: Attention-Steerable Contrastive Decoding for Reducing Hallucination in MLLM Yujun Wang et.al. 2506.14766 null
2025-06-17 Scaling-Up the Pretraining of the Earth Observation Foundation Model PhilEO to the MajorTOM Dataset Nikolaos Dionelis et.al. 2506.14765 link
2025-06-17 RobotSmith: Generative Robotic Tool Design for Acquisition of Complex Manipulation Skills Chunru Lin et.al. 2506.14763 null
2025-06-17 From Bytes to Ideas: Language Modeling with Autoregressive U-Nets Mathurin Videau et.al. 2506.14761 link
2025-06-17 Reasoning with Exploration: An Entropy Perspective Daixuan Cheng et.al. 2506.14758 null
2025-06-17 Large Language Models -- the Future of Fundamental Physics? Caroline Heneka et.al. 2506.14757 null
2025-06-17 Ring-lite: Scalable Reasoning via C3PO-Stabilized Reinforcement Learning for LLMs Ring Team et.al. 2506.14731 null
2025-06-17 AgentDistill: Training-Free Agent Distillation with Generalizable MCP Boxes Jiahao Qiu et.al. 2506.14728 null
2025-06-17 Casper: Inferring Diverse Intents for Assistive Teleoperation with Vision Language Models Huihan Liu et.al. 2506.14727 null
2025-06-17 Capacity Matters: a Proof-of-Concept for Transformer Memorization on Real-World Data Anton Changalidis et.al. 2506.14704 link
2025-06-17 AGENTSAFE: Benchmarking the Safety of Embodied Agents on Hazardous Instructions Aishan Liu et.al. 2506.14697 null
2025-06-17 Unified Software Engineering agent as AI Software Engineer Leonhard Applis et.al. 2506.14683 null
2025-06-17 AIRTBench: Measuring Autonomous AI Red Teaming Capabilities in Language Models Ads Dawson et.al. 2506.14682 link
2025-06-17 Massive Supervised Fine-tuning Experiments Reveal How Data, Layer, and Training Factors Shape LLM Alignment Quality Yuto Harada et.al. 2506.14681 null
2025-06-17 Recognition through Reasoning: Reinforcing Image Geo-localization with Large Vision-Language Models Ling Li et.al. 2506.14674 null
2025-06-17 StreetLens: Enabling Human-Centered AI Agents for Neighborhood Assessment from Street View Imagery Jina Kim et.al. 2506.14670 null
2025-06-17 GuiLoMo: Allocating Expert Number and Rank for LoRA-MoE via Bilevel Optimization with GuidedSelection Vectors Hengyuan Zhang et.al. 2506.14646 link
2025-06-17 Passing the Turing Test in Political Discourse: Fine-Tuning LLMs to Mimic Polarized Social Media Comments . Pazzaglia et.al. 2506.14645 null
2025-06-17 Revisiting Chain-of-Thought Prompting: Zero-shot Can Be Stronger than Few-shot Xiang Cheng et.al. 2506.14641 null
2025-06-16 Touch begins where vision ends: Generalizable policies for contact-rich manipulation Zifan Zhao et.al. 2506.13762 null
2025-06-16 Prompting with the Future: Open-World Model Predictive Control with Interactive Digital Twins Chuanruo Ning et.al. 2506.13761 null
2025-06-16 Discrete Diffusion in Large Language and Multimodal Models: A Survey Runpeng Yu et.al. 2506.13759 link
2025-06-16 AutoVLA: A Vision-Language-Action Model for End-to-End Autonomous Driving with Adaptive Reasoning and Reinforcement Fine-Tuning Zewei Zhou et.al. 2506.13757 link
2025-06-16 Steering LLM Thinking with Budget Guidance Junyan Li et.al. 2506.13752 link
2025-06-16 Evaluating Large Language Models for Phishing Detection, Self-Consistency, Faithfulness, and Explainability Shova Kuikel et.al. 2506.13746 link
2025-06-16 Instruction Following by Boosting Attention of Large Language Models Vitoria Guardieiro et.al. 2506.13734 null
2025-06-16 Attribution-guided Pruning for Compression, Circuit Discovery, and Targeted Correction in LLMs Sayed Mohammad Vakilzadeh Hatefi et.al. 2506.13727 link
2025-06-16 Weakest Link in the Chain: Security Vulnerabilities in Advanced Reasoning Models Arjun Krishna et.al. 2506.13726 null
2025-06-16 OTFusion: Bridging Vision-only and Vision-Language Models via Optimal Transport for Transductive Zero-Shot Learning Qiyu Xu et.al. 2506.13723 null
2025-06-16 TimeMaster: Training Time-Series Multimodal LLMs to Reason via Reinforcement Learning Junru Zhang et.al. 2506.13705 link
2025-06-16 Value-Free Policy Optimization via Reward Partitioning Bilal Faye et.al. 2506.13702 link
2025-06-16 Balancing Knowledge Delivery and Emotional Comfort in Healthcare Conversational Systems Shang-Chi Tsai et.al. 2506.13692 null
2025-06-16 What Happens During the Loss Plateau? Understanding Abrupt Learning in Transformers Pulkit Gopalani et.al. 2506.13688 link
2025-06-16 An LLM's Apology: Outsourcing Awkwardness in the Age of AI Twm Stone et.al. 2506.13685 link
2025-06-16 Turning Down the Heat: A Critical Analysis of Min-p Sampling in Language Models Rylan Schaeffer et.al. 2506.13681 null
2025-06-16 ROSA: Harnessing Robot States for Vision-Language and Action Alignment Yuqing Wen et.al. 2506.13679 null
2025-06-16 Prefix-Tuning+: Modernizing Prefix-Tuning through Attention Independent Prefix Data Haonan Wang et.al. 2506.13674 null
2025-06-16 We Should Identify and Mitigate Third-Party Safety Risks in MCP-Powered Agent Systems Junfeng Fang et.al. 2506.13666 link
2025-06-16 DesignCoder: Hierarchy-Aware and Self-Correcting UI Code Generation with Large Language Models Yunnong Chen et.al. 2506.13663 null
2025-06-13 EMLoC: Emulator-based Memory-efficient Fine-tuning with LoRA Correction Hsi-Che Lin et.al. 2506.12015 null
2025-06-13 code_transformed: The Influence of Large Language Models on Code Yuliang Xu et.al. 2506.12014 null
2025-06-13 Tracing LLM Reasoning Processes with Strategic Games: A Framework for Planning, Revision, and Resource-Constrained Decision Making Xiaopeng Yuan et.al. 2506.12012 null
2025-06-13 Affogato: Learning Open-Vocabulary Affordance Grounding with Automated Data Generation at Scale Junha Lee et.al. 2506.12009 null
2025-06-13 Generative Representational Learning of Foundation Models for Recommendation Zheli Zhou et.al. 2506.11999 null
2025-06-13 pLSTM: parallelizable Linear Source Transition Mark networks Korbinian Pöppel et.al. 2506.11997 null
2025-06-13 VGR: Visual Grounded Reasoning Jiacong Wang et.al. 2506.11991 null
2025-06-13 How Visual Representations Map to Language Feature Space in Multimodal LLMs Constantin Venhoff et.al. 2506.11976 null
2025-06-13 Improving Large Language Model Safety with Contrastive Representation Learning Samuel Simko et.al. 2506.11938 link
2025-06-13 Feedback Friction: LLMs Struggle to Fully Incorporate External Feedback Dongwei Jiang et.al. 2506.11930 null
2025-06-13 LiveCodeBench Pro: How Do Olympiad Medalists Judge LLMs in Competitive Programming? Zihan Zheng et.al. 2506.11928 null
2025-06-13 GeistBERT: Breathing Life into German NLP Raphael Scheible-Schmitt et.al. 2506.11903 null
2025-06-13 Beyond Homogeneous Attention: Memory-Efficient LLMs via Fourier-Approximated KV Cache Xiaoran Liu et.al. 2506.11886 null
2025-06-13 Addressing Bias in LLMs: Strategies and Application to Fair AI-based Recruitment Alejandro Peña et.al. 2506.11880 null
2025-06-13 A Short Survey on Formalising Software Requirements using Large Language Models Arshad Beg et.al. 2506.11874 null
2025-06-13 Post Persona Alignment for Multi-Session Dialogue Generation Yi-Pei Chen et.al. 2506.11857 null
2025-06-13 TrustGLM: Evaluating the Robustness of GraphLLMs Against Prompt, Text, and Structure Attacks Qihai Zhang et.al. 2506.11844 null
2025-06-13 Your Ride, Your Rules: Psychology and Cognition Enabled Automated Driving Systems Zhipeng Bao et.al. 2506.11842 null
2025-06-13 CLEAN-MI: A Scalable and Efficient Pipeline for Constructing High-Quality Neurodata in Motor Imagery Paradigm Dingkun Liu et.al. 2506.11830 null
2025-06-13 Revealing Political Bias in LLMs through Structured Multi-Agent Debate Aishwarya Bandaru et.al. 2506.11825 link
2025-06-12 AutoMind: Adaptive Knowledgeable Agent for Automated Data Science Yixin Ou et.al. 2506.10974 link
2025-06-12 Farseer: A Refined Scaling Law in Large Language Models Houyi Li et.al. 2506.10972 link
2025-06-12 Beyond Attention or Similarity: Maximizing Conditional Diversity for Token Pruning in MLLMs Qizhe Zhang et.al. 2506.10967 link
2025-06-12 GENMANIP: LLM-driven Simulation for Generalizable Instruction-Following Manipulation Ning Gao et.al. 2506.10966 null
2025-06-12 ChineseHarm-Bench: A Chinese Harmful Content Detection Benchmark Kangwei Liu et.al. 2506.10960 link
2025-06-12 Distillation of atomistic foundation models across architectures and chemical domains John L. A. Gardner et.al. 2506.10956 link
2025-06-12 SWE-Factory: Your Automated Factory for Issue Resolution Training Data and Evaluation Benchmarks Lianghong Guo et.al. 2506.10954 link
2025-06-12 Build the web for agents, not agents for the web Xing Han Lù et.al. 2506.10953 null
2025-06-12 Domain2Vec: Vectorizing Datasets to Find the Optimal Data Mixture without Training Mozhi Zhang et.al. 2506.10952 null
2025-06-12 Execution Guided Line-by-Line Code Generation Boaz Lavon et.al. 2506.10948 link
2025-06-12 GUARD: Guided Unlearning and Retention via Data Attribution for Large Language Models Evelyn Ma et.al. 2506.10946 null
2025-06-12 Self-Adapting Language Models Adam Zweiger et.al. 2506.10943 null
2025-06-12 Dynamic Epistemic Friction in Dialogue Timothy Obiso et.al. 2506.10934 null
2025-06-12 The Role of Generative AI in Facilitating Social Interactions: A Scoping Review T. T. J. E. Arets et.al. 2506.10927 null
2025-06-12 Robustly Improving LLM Fairness in Realistic Settings via Interpretability Adam Karvonen et.al. 2506.10922 link
2025-06-12 Decomposing MLP Activations into Interpretable Features via Semi-Nonnegative Matrix Factorization Or Shafran et.al. 2506.10920 link
2025-06-12 Sequential-Parallel Duality in Prefix Scannable Models Morris Yau et.al. 2506.10918 null
2025-06-12 Foundation Models for Causal Inference via Prior-Data Fitted Networks Yuchen Ma et.al. 2506.10914 null
2025-06-12 Breaking Bad Molecules: Are MLLMs Ready for Structure-Level Molecular Detoxification? Fei Lin et.al. 2506.10912 null
2025-06-12 NoLoCo: No-all-reduce Low Communication Training Method for Large Models Jari Kolehmainen et.al. 2506.10911 link
2025-06-11 Flipping Against All Odds: Reducing LLM Coin Flip Bias via Verbalized Rejection Sampling Tim Z. Xiao et.al. 2506.09998 null
2025-06-11 From Judgment to Interference: Early Stopping LLM Harmful Outputs via Streaming Content Monitoring Yang Li et.al. 2506.09996 null
2025-06-11 Large Language Models for Toxic Language Detection in Low-Resource Balkan Languages Amel Muminovic et.al. 2506.09992 link
2025-06-11 Multiverse: Your Language Models Secretly Decide How to Parallelize and Merge Generation Xinyu Yang et.al. 2506.09991 null
2025-06-11 EditInspector: A Benchmark for Evaluation of Text-Guided Image Edits Ron Yosef et.al. 2506.09988 null
2025-06-11 A Shortcut-aware Video-QA Benchmark for Physical Understanding via Minimal Video Pairs Benno Krojer et.al. 2506.09987 null
2025-06-11 V-JEPA 2: Self-Supervised Video Models Enable Understanding, Prediction and Planning Mido Assran et.al. 2506.09985 link
2025-06-11 Step-by-step Instructions and a Simple Tabular Output Format Improve the Dependency Parsing Accuracy of LLMs Hiroshi Matsuda et.al. 2506.09983 link
2025-06-11 AnimateAnyMesh: A Feed-Forward 4D Foundation Model for Text-Driven Universal Mesh Animation Zijie Wu et.al. 2506.09982 null
2025-06-11 SRLAgent: Enhancing Self-Regulated Learning Skills through Gamification and LLM Assistance Wentao Ge et.al. 2506.09968 null
2025-06-11 Resa: Transparent Reasoning Models via SAEs Shangshang Wang et.al. 2506.09967 link
2025-06-11 Reinforcing Spatial Reasoning in Vision-Language Models with Interwoven Thinking and Visual Drawing Junfei Wu et.al. 2506.09965 link
2025-06-11 Kvasir-VQA-x1: A Multimodal Dataset for Medical Reasoning and Robust MedVQA in Gastrointestinal Endoscopy Sushant Gautam et.al. 2506.09958 null
2025-06-11 LLMail-Inject: A Dataset from a Realistic Adaptive Prompt Injection Challenge Sahar Abdelnabi et.al. 2506.09956 link
2025-06-11 Query-Focused Retrieval Heads Improve Long-Context Reasoning and Re-ranking Wuwei Zhang et.al. 2506.09944 link
2025-06-11 VerIF: Verification Engineering for Reinforcement Learning in Instruction Following Hao Peng et.al. 2506.09942 link
2025-06-11 From Intention to Execution: Probing the Generalization Boundaries of Vision-Language-Action Models Irving Fang et.al. 2506.09930 null
2025-06-11 PersonaLens: A Benchmark for Personalization Evaluation in Conversational AI Assistants Zheng Zhao et.al. 2506.09902 link
2025-06-11 The Emergence of Abstract Thought in Large Language Models Beyond Any Language Yuxin Chen et.al. 2506.09890 null
2025-06-11 Attention Head Embeddings with Trainable Deep Kernels for Hallucination Detection in LLMs Rodion Oblovatny et.al. 2506.09886 null
2025-06-10 VIKI-R: Coordinating Embodied Multi-Agent Cooperation via Reinforcement Learning Li Kang et.al. 2506.09049 null
2025-06-10 Same Task, Different Circuits: Disentangling Modality-Specific Mechanisms in VLMs Yaniv Nikankin et.al. 2506.09047 link
2025-06-10 Agentic Neural Networks: Self-Evolving Multi-Agent Systems via Textual Backpropagation Xiaowen Ma et.al. 2506.09046 null
2025-06-10 Cosmos-Drive-Dreams: Scalable Synthetic Driving Data Generation with World Foundation Models Xuanchi Ren et.al. 2506.09042 link
2025-06-10 Autoregressive Semantic Visual Reconstruction Helps VLMs Understand Better Dianyi Wang et.al. 2506.09040 link
2025-06-10 AbstentionBench: Reasoning LLMs Fail on Unanswerable Questions Polina Kirichenko et.al. 2506.09038 link
2025-06-10 FZOO: Fast Zeroth-Order Optimizer for Fine-Tuning Large Language Models towards Adam-Scale Speed Sizhe Dang et.al. 2506.09034 null
2025-06-10 Router-R1: Teaching LLMs Multi-Round Routing and Aggregation via Reinforcement Learning Haozhen Zhang et.al. 2506.09033 link
2025-06-10 Do MIL Models Transfer? Daniel Shao et.al. 2506.09022 link
2025-06-10 SPEED-RL: Faster Training of Reasoning Models via Online Curriculum Learning Ruiqi Zhang et.al. 2506.09016 link
2025-06-10 Learning to Reason Across Parallel Samples for LLM Reasoning Jianing Qi et.al. 2506.09014 null
2025-06-10 Boosting Rust Unit Test Coverage through Hybrid Program Analysis and Large Language Models Bei Chu et.al. 2506.09002 null
2025-06-10 Efficient Medical Vision-Language Alignment Through Adapting Masked Vision Models Chenyu Lian et.al. 2506.08990 link
2025-06-10 SwS: Self-aware Weakness-driven Problem Synthesis in Reinforcement Learning for LLM Reasoning Xiao Liang et.al. 2506.08989 link
2025-06-10 On Finetuning Tabular Foundation Models Ivan Rubachev et.al. 2506.08982 link
2025-06-10 AdaDec: Uncertainty-Guided Adaptive Decoding for LLM-based Code Generation Kaifeng He et.al. 2506.08980 null
2025-06-10 Propositional Logic for Probing Generalization in Neural Networks Anna Langedijk et.al. 2506.08978 null
2025-06-10 Atomic-to-Compositional Generalization for Mobile Agents with A New Benchmark and Scheduling System Yuan Guo et.al. 2506.08972 null
2025-06-10 ADAM: Autonomous Discovery and Annotation Model using LLMs for Context-Aware Annotations Amirreza Rouhi et.al. 2506.08968 null
2025-06-10 Step-Audio-AQAA: a Fully End-to-End Expressive Large Audio Language Model Ailin Huang et.al. 2506.08967 null
2025-06-09 GUI-Reflection: Empowering Multimodal GUI Models with Self-Reflection Behavior Penghao Wu et.al. 2506.08012 null
2025-06-09 Play to Generalize: Learning to Reason Through Game Play Yunfei Xie et.al. 2506.08011 link
2025-06-09 Vision Transformers Don't Need Trained Registers Nick Jiang et.al. 2506.08010 link
2025-06-09 Hidden in plain sight: VLMs overlook their visual representations Stephanie Fu et.al. 2506.08008 null
2025-06-09 Reinforcement Pre-Training Qingxiu Dong et.al. 2506.08007 null
2025-06-09 Reparameterized LLM Training via Orthogonal Equivalence Transformation Zeju Qiu et.al. 2506.08001 null
2025-06-09 Supporting Construction Worker Well-Being with a Multi-Agent Conversational AI System Fan Yang et.al. 2506.07997 null
2025-06-09 HeuriGym: An Agentic Benchmark for LLM-Crafted Heuristics in Combinatorial Optimization Hongzheng Chen et.al. 2506.07972 link
2025-06-09 CyberV: Cybernetics for Test-time Scaling in Video Understanding Jiahao Meng et.al. 2506.07971 link
2025-06-09 SpaCE-10: A Comprehensive Benchmark for Multimodal Large Language Models in Compositional Spatial Intelligence Ziyang Gong et.al. 2506.07966 link
2025-06-09 Reinforcing Multimodal Understanding and Generation with Dual Self-rewards Jixiang Hong et.al. 2506.07963 null
2025-06-09 Correlated Errors in Large Language Models Elliot Kim et.al. 2506.07962 null
2025-06-09 BridgeVLA: Input-Output Alignment for Efficient 3D Manipulation Learning with Vision-Language Models Peiyan Li et.al. 2506.07961 null
2025-06-09 Language Models over Canonical Byte-Pair Encodings Tim Vieira et.al. 2506.07956 null
2025-06-09 TokenBreak: Bypassing Text Classification Models Through Token Manipulation Kasimir Schulz et.al. 2506.07948 null
2025-06-09 Statistical Hypothesis Testing for Auditing Robustness in Language Models Paulius Rauba et.al. 2506.07947 null
2025-06-09 ProtocolLLM: RTL Benchmark for SystemVerilog Generation of Communication Protocols Arnav Sheth et.al. 2506.07945 link
2025-06-09 Decoupling the Image Perception and Multimodal Reasoning for Reasoning Segmentation with Digital Twin Representations Yizhen Li et.al. 2506.07943 null
2025-06-09 Adversarial Attack Classification and Robustness Testing for Large Language Models for Code Yang Liu et.al. 2506.07942 null
2025-06-09 Gradients: When Markets Meet Fine-tuning -- A Distributed Approach to Model Optimisation Christopher Subia-Waud et.al. 2506.07940 null
2025-06-06 TerraFM: A Scalable Foundation Model for Unified Multisensor Earth Observation Muhammad Sohail Danish et.al. 2506.06281 null
2025-06-06 Eigenspectrum Analysis of Neural Networks without Aspect Ratio Bias Yuanzhe Hu et.al. 2506.06280 null
2025-06-06 CoMemo: LVLMs Need Image Context with Image Memory Shi Liu et.al. 2506.06279 null
2025-06-06 Movie Facts and Fibs (MF$^2$): A Benchmark for Long Movie Understanding Emmanouil Zaranis et.al. 2506.06275 null
2025-06-06 AdvSumm: Adversarial Training for Bias Mitigation in Text Summarization Mukur Gupta et.al. 2506.06273 null
2025-06-06 RecGPT: A Foundation Model for Sequential Recommendation Yangqin Jiang et.al. 2506.06270 link
2025-06-06 Cartridges: Lightweight and general-purpose long context representations via self-study Sabri Eyuboglu et.al. 2506.06266 null
2025-06-06 PersonaAgent: When Large Language Model Agents Meet Personalization at Test Time Weizhi Zhang et.al. 2506.06254 null
2025-06-06 DesignBench: A Comprehensive Benchmark for MLLM-based Front-end Code Generation Jingyu Xiao et.al. 2506.06251 link
2025-06-06 Visual Graph Arena: Evaluating Visual Conceptualization of Vision and Multimodal Large Language Models Zahra Babaiee et.al. 2506.06242 null
2025-06-06 Bridging External and Parametric Knowledge: Mitigating Hallucination of LLMs with Shared-Private Semantic Synergy in Dual-Stream Knowledge Yi Sui et.al. 2506.06240 null
2025-06-06 Explaining Matters: Leveraging Definitions and Semantic Expansion for Sexism Detection Sahrish Khan et.al. 2506.06238 null
2025-06-06 Challenging Vision-Language Models with Surgical Data: A New Dataset and Broad Benchmarking Study Leon Mayer et.al. 2506.06232 null
2025-06-06 CompilerGPT: Leveraging Large Language Models for Analyzing and Acting on Compiler Optimization Reports Peter Pirkelbauer et.al. 2506.06227 null
2025-06-06 PROVSYN: Synthesizing Provenance Graphs for Data Augmentation in Intrusion Detection Systems Yi Huang et.al. 2506.06226 null
2025-06-06 GenIR: Generative Visual Feedback for Mental Image Retrieval Diji Yang et.al. 2506.06220 null
2025-06-06 STSBench: A Spatio-temporal Scenario Benchmark for Multi-modal Large Language Models in Autonomous Driving Christian Fruhwirth-Reisinger et.al. 2506.06218 link
2025-06-06 Corrector Sampling in Language Models Itai Gat et.al. 2506.06215 null
2025-06-06 Can Theoretical Physics Research Benefit from Language Agents? Sirui Lu et.al. 2506.06214 null
2025-06-06 PuzzleWorld: A Benchmark for Multimodal, Open-Ended Reasoning in Puzzlehunts Hengzhi Li et.al. 2506.06211 null
2025-06-05 Why LLM Safety Guardrails Collapse After Fine-tuning: A Similarity Analysis Between Alignment and Fine-tuning Datasets Lei Hsiung et.al. 2506.05346 null
2025-06-05 SparseMM: Head Sparsity Emerges from Visual Concept Responses in MLLMs Jiahui Wang et.al. 2506.05344 link
2025-06-05 Direct Numerical Layout Generation for 3D Indoor Scene Synthesis via Spatial Reasoning Xingjian Ran et.al. 2506.05341 null
2025-06-05 Flattery, Fluff, and Fog: Diagnosing and Mitigating Idiosyncratic Biases in Preference Models Anirudh Bharadwaj et.al. 2506.05339 link
2025-06-05 VideoMolmo: Spatio-Temporal Grounding Meets Pointing Ghazi Shazan Ahmad et.al. 2506.05336 link
2025-06-05 Search Arena: Analyzing Search-Augmented LLMs Mihran Miroyan et.al. 2506.05334 link
2025-06-05 Unleashing Hour-Scale Video Training for Long Video-Language Understanding Jingyang Lin et.al. 2506.05332 null
2025-06-05 MINT-CoT: Enabling Interleaved Visual Tokens in Mathematical Chain-of-Thought Reasoning Xinyan Chen et.al. 2506.05331 link
2025-06-05 LSM-2: Learning from Incomplete Wearable Sensor Data Maxwell A. Xu et.al. 2506.05321 null
2025-06-06 Does Your 3D Encoder Really Work? When Pretrain-SFT from 2D VLMs Meets 3D VLMs Haoyuan Li et.al. 2506.05318 null
2025-06-05 Improving Data Efficiency for LLM Reinforcement Fine-tuning Through Difficulty-targeted Online Data Selection and Rollout Replay Yifan Sun et.al. 2506.05316 null
2025-06-05 Constrained Entropic Unlearning: A Primal-Dual Framework for Large Language Models Taha Entesari et.al. 2506.05314 null
2025-06-05 ProRefine: Inference-time Prompt Refinement with Textual Feedback Deepak Pandita et.al. 2506.05305 null
2025-06-05 Perceive Anything: Recognize, Explain, Caption, and Segment Anything in Images and Videos Weifeng Lin et.al. 2506.05302 null
2025-06-05 Power Law Guided Dynamic Sifting for Efficient Attention Nirav Koley et.al. 2506.05300 null
2025-06-05 Control Tax: The Price of Keeping AI in Check Mikhail Terekhov et.al. 2506.05296 null
2025-06-05 Sample Complexity and Representation Ability of Test-time Scaling Paradigms Baihe Huang et.al. 2506.05295 null
2025-06-05 EOC-Bench: Can MLLMs Identify, Recall, and Forecast Objects in an Egocentric World? Yuqian Yuan et.al. 2506.05287 null
2025-06-05 Micro-Act: Mitigate Knowledge Conflict in Question Answering via Actionable Self-Reasoning Nan Huo et.al. 2506.05278 null
2025-06-05 Teaming in the AI Era: AI-Augmented Frameworks for Forming, Simulating, and Optimizing Human Teams Mohammed Almutairi et.al. 2506.05265 null
2025-06-04 OWMM-Agent: Open World Mobile Manipulation With Multi-modal Agentic Data Synthesis Junting Chen et.al. 2506.04217 link
2025-06-04 Language-Image Alignment with Fixed Text Encoders Jingfeng Yang et.al. 2506.04209 null
2025-06-04 Advancing Multimodal Reasoning: From Optimized Cold Start to Staged Reinforcement Learning Shuang Chen et.al. 2506.04207 null
2025-06-04 EPiC: Towards Lossless Speedup for Reasoning Training through Edge-Preserving CoT Condensation Jinghan Jia et.al. 2506.04205 link
2025-06-04 Cascadia: A Cascade Serving System for Large Language Models Youhe Jiang et.al. 2506.04203 null
2025-06-04 TracLLM: A Generic Framework for Attributing Long Context LLMs Yanting Wang et.al. 2506.04202 link
2025-06-04 R-Search: Empowering LLM Reasoning with Search via Multi-Reward Reinforcement Learning Qingfei Zhao et.al. 2506.04185 link
2025-06-04 SuperWriter: Reflection-Driven Long-Form Generation with Large Language Models Yuhao Wu et.al. 2506.04180 null
2025-06-04 SkipGPT: Dynamic Layer Pruning Reinvented with Token Awareness and Module Decoupling Anhao Zhao et.al. 2506.04179 null
2025-06-04 Does Prompt Design Impact Quality of Data Imputation by LLMs? Shreenidhi Srinivasan et.al. 2506.04172 null
2025-06-04 VISCA: Inferring Component Abstractions for Automated End-to-End Testing Parsa Alian et.al. 2506.04161 null
2025-06-04 Image Editing As Programs with Diffusion Models Yujia Hu et.al. 2506.04158 null
2025-06-04 A Dataset for Addressing Patient's Information Needs related to Clinical Course of Hospitalization Sarvesh Soni et.al. 2506.04156 null
2025-06-04 Establishing Trustworthy LLM Evaluation via Shortcut Neuron Analysis Kejian Zhu et.al. 2506.04142 null
2025-06-04 MMR-V: What's Left Unsaid? A Benchmark for Multimodal Deep Reasoning in Videos Kejian Zhu et.al. 2506.04141 null
2025-06-04 TRiSM for Agentic AI: A Review of Trust, Risk, and Security Management in LLM-based Agentic Multi-Agent Systems Shaina Raza et.al. 2506.04133 null
2025-06-04 Recent Advances in Medical Image Classification Loan Dao et.al. 2506.04129 null
2025-06-04 Guided Speculative Inference for Efficient Test-Time Alignment of LLMs Jonathan Geuter et.al. 2506.04118 link
2025-06-05 Rectified Sparse Attention Yutao Sun et.al. 2506.04108 null
2025-06-04 TextAtari: 100K Frames Game Playing with Language Agents Wenhao Li et.al. 2506.04098 link
2025-06-03 Causal Estimation of Tokenisation Bias Pietro Lesci et.al. 2506.03149 null
2025-06-03 UniWorld: High-Resolution Semantic Encoders for Unified Visual Understanding and Generation Bin Lin et.al. 2506.03147 null
2025-06-03 Entity-Augmented Neuroscience Knowledge Retrieval Using Ontology and Semantic Understanding Capability of LLM Pralaypati Ta et.al. 2506.03145 null
2025-06-03 Not All Tokens Are Meant to Be Forgotten Xiangyu Zhou et.al. 2506.03142 null
2025-06-03 SVGenius: Benchmarking LLMs in SVG Understanding, Editing and Generation Siqi Chen et.al. 2506.03139 null
2025-06-03 OmniSpatial: Towards Comprehensive Spatial Reasoning Benchmark for Vision Language Models Mengdi Jia et.al. 2506.03135 null
2025-06-03 Native-Resolution Image Synthesis Zidong Wang et.al. 2506.03131 null
2025-06-03 AnimeShooter: A Multi-Shot Animation Dataset for Reference-Guided Video Generation Lu Qiu et.al. 2506.03126 null
2025-06-03 AUTOCIRCUIT-RL: Reinforcement Learning-Driven LLM for Automated Circuit Topology Generation Prashanth Vijayaraghavan et.al. 2506.03122 null
2025-06-03 Targeted Forgetting of Image Subgroups in CLIP Models Zeliang Zhang et.al. 2506.03117 null
2025-06-04 Critique-GRPO: Advancing LLM Reasoning with Natural Language and Numerical Feedback Xiaoying Zhang et.al. 2506.03106 null
2025-06-03 Beyond Text Compression: Evaluating Tokenizers Across Scales Jonas F. Lotz et.al. 2506.03101 null
2025-06-03 TalkingMachines: Real-Time Audio-Driven FaceTime-Style Video via Autoregressive Diffusion Models Chetwin Low et.al. 2506.03099 null
2025-06-03 EgoVLM: Policy Optimization for Egocentric Video Understanding Ashwin Vinod et.al. 2506.03097 link
2025-06-03 DPO Learning with LLMs-Judge Signal for Computer Use Agents Man Luo et.al. 2506.03095 null
2025-06-03 From Flat to Hierarchical: Extracting Sparse Representations with Matching Pursuit Valérie Costa et.al. 2506.03093 null
2025-06-03 Literary Evidence Retrieval via Long-Context Language Models Katherine Thai et.al. 2506.03090 null
2025-06-03 StreamBP: Memory-Efficient Exact Backpropagation for Long Sequence Training of LLMs Qijun Luo et.al. 2506.03077 null
2025-06-03 LEG-SLAM: Real-Time Language-Enhanced Gaussian Splatting for SLAM Roman Titkov et.al. 2506.03073 null
2025-06-03 EDITOR: Effective and Interpretable Prompt Inversion for Text-to-Image Diffusion Models Mingzhe Li et.al. 2506.03067 null
2025-05-30 ReasonGen-R1: CoT for Autoregressive Image generation models through SFT and RL Yu Zhang et.al. 2505.24875 null
2025-05-30 The Road to Generalizable Neuro-Symbolic Learning Should be Paved with Foundation Models Adam Stein et.al. 2505.24874 link
2025-05-30 ProxyThinker: Test-Time Guidance through Small Visual Reasoners Zilin Xiao et.al. 2505.24872 link
2025-05-30 MoDoMoDo: Multi-Domain Data Mixtures for Multimodal LLM Reinforcement Learning Yiqing Liang et.al. 2505.24871 null
2025-05-30 GenSpace: Benchmarking Spatially-Aware Image Generation Zehan Wang et.al. 2505.24870 null
2025-05-30 SiLVR: A Simple Language-based Video Reasoning Framework Ce Zhang et.al. 2505.24869 link
2025-05-30 Time Blindness: Why Video-Language Models Can't See What Humans Can? Ujjwal Upadhyay et.al. 2505.24867 null
2025-05-30 ProRL: Prolonged Reinforcement Learning Expands Reasoning Boundaries in Large Language Models Mingjie Liu et.al. 2505.24864 link
2025-05-30 Beyond Multiple Choice: Evaluating Steering Vectors for Adaptive Free-Form Summarization Joschka Braun et.al. 2505.24859 null
2025-05-30 Accelerated Sampling from Masked Diffusion Models via Entropy Bounded Unmasking Heli Ben-Hamu et.al. 2505.24857 null
2025-05-30 MiCRo: Mixture Modeling and Context-aware Routing for Personalized Preference Learning Jingyan Shen et.al. 2505.24846 null
2025-05-30 Chameleon: A Flexible Data-mixing Framework for Language Model Pretraining and Finetuning Wanyun Xie et.al. 2505.24844 link
2025-05-30 Cascading Adversarial Bias from Injection to Distillation in Language Models Harsh Chaudhari et.al. 2505.24842 null
2025-05-30 Vision LLMs Are Bad at Hierarchical Visual Understanding, and LLMs Are the Bottleneck Yuwen Tan et.al. 2505.24840 null
2025-05-30 VideoCAD: A Large-Scale Video Dataset for Learning UI Interactions and 3D Reasoning from CAD Software Brandon Man et.al. 2505.24838 link
2025-06-02 How much do language models memorize? John X. Morris et.al. 2505.24832 null
2025-05-30 Improving Reliability and Explainability of Medical Question Answering through Atomic Fact Checking in Retrieval-Augmented LLMs Juraj Vladika et.al. 2505.24830 null
2025-05-30 LegalEval-Q: A New Benchmark for The Quality Evaluation of LLM-Generated Legal Text Li yunhan et.al. 2505.24826 link
2025-05-30 PhySense: Principle-Based Physics Reasoning Benchmarking for Large Language Models Yinggan Xu et.al. 2505.24823 null
2025-05-30 Bi-Manual Joint Camera Calibration and Scene Representation Haozhan Tang et.al. 2505.24819 null
2025-05-29 TextRegion: Text-Aligned Region Tokens from Frozen Image-Text Models Yao Xiao et.al. 2505.23769 link
2025-05-29 Argus: Vision-Centric Reasoning with Grounded Chain-of-Thought Yunze Man et.al. 2505.23766 null
2025-05-29 From Chat Logs to Collective Insights: Aggregative Question Answering Wentao Zhang et.al. 2505.23765 null
2025-05-29 MMSI-Bench: A Benchmark for Multi-Image Spatial Intelligence Sihan Yang et.al. 2505.23764 null
2025-05-29 ZeroGUI: Automating Online GUI Learning at Zero Human Cost Chenyu Yang et.al. 2505.23762 link
2025-05-29 Differential Information: An Information-Theoretic Perspective on Preference Optimization Yunjae Won et.al. 2505.23761 null
2025-05-29 Puzzled by Puzzles: When Vision-Language Models Can't Take a Hint Heekyung Lee et.al. 2505.23759 link
2025-05-29 DeepTheorem: Advancing LLM Reasoning for Theorem Proving Through Natural Language and Reinforcement Learning Ziyin Zhang et.al. 2505.23754 link
2025-05-29 ThinkGeo: Evaluating Tool-Augmented Agents for Remote Sensing Tasks Akashah Shabbir et.al. 2505.23752 link
2025-05-29 Distortion of AI Alignment: Does Preference Optimization Optimize for Preferences? Paul Gölz et.al. 2505.23749 null
2025-05-29 Spatial-MLLM: Boosting MLLM Capabilities in Visual-based Spatial Intelligence Diankun Wu et.al. 2505.23747 null
2025-05-29 To Trust Or Not To Trust Your Vision-Language Model's Prediction Hao Dong et.al. 2505.23745 link
2025-05-29 LayerPeeler: Autoregressive Peeling for Layer-wise Image Vectorization Ronghuan Wu et.al. 2505.23740 null
2025-05-29 ATLAS: Learning to Optimally Memorize the Context at Test Time Ali Behrouz et.al. 2505.23735 null
2025-05-29 Bounded Rationality for LLMs: Satisficing Alignment at Inference-Time Mohamad Chehade et.al. 2505.23729 null
2025-05-29 PixelThink: Towards Efficient Chain-of-Pixel Reasoning Song Wang et.al. 2505.23727 null
2025-05-29 FMG-Det: Foundation Model Guided Robust Object Detection Darryl Hannan et.al. 2505.23726 null
2025-05-29 MuLoCo: Muon is a practical inner optimizer for DiLoCo Benjamin Thérien et.al. 2505.23725 null
2025-05-29 SC-LoRA: Balancing Efficient Fine-tuning and Knowledge Preservation via Subspace-Constrained LoRA Minrui Luo et.al. 2505.23724 null
2025-05-29 ML-Agent: Reinforcing LLM Agents for Autonomous Machine Learning Engineering Zexi Liu et.al. 2505.23723 link
2025-05-28 Zero-Shot Vision Encoder Grafting via LLM Surrogates Kaiyu Yue et.al. 2505.22664 link
2025-05-28 Training Free Stylized Abstraction Aimon Rahman et.al. 2505.22663 null
2025-05-28 AutoL2S: Auto Long-Short Reasoning for Efficient Large Language Models Feng Luo et.al. 2505.22662 null
2025-05-28 GuessArena: Guess Who I Am? A Self-Adaptive Framework for Evaluating LLMs in Domain-Specific Knowledge and Reasoning Qingchen Yu et.al. 2505.22661 null
2025-05-28 Maximizing Confidence Alone Improves Reasoning Mihir Prabhudesai et.al. 2505.22660 null
2025-05-28 3DLLM-Mem: Long-Term Spatial-Temporal Memory for Embodied 3D Large Language Model Wenbo Hu et.al. 2505.22657 null
2025-05-28 Position: Uncertainty Quantification Needs Reassessment for Large-language Model Agents Michael Kirchhof et.al. 2505.22655 null
2025-05-28 VScan: Rethinking Visual Token Reduction for Efficient Large Vision-Language Models Ce Zhang et.al. 2505.22654 null
2025-05-28 The Climb Carves Wisdom Deeper Than the Summit: On the Noisy Rewards in Learning to Reason Ang Lv et.al. 2505.22653 null
2025-05-28 Sherlock: Self-Correcting Reasoning in Vision-Language Models Yi Ding et.al. 2505.22651 null
2025-05-28 Characterizing Bias: Benchmarking Large Language Models in Simplified versus Traditional Chinese Hanjia Lyu et.al. 2505.22645 link
2025-05-28 Understanding (Un)Reliability of Steering Vectors in Language Models Joschka Braun et.al. 2505.22637 null
2025-05-28 Learning Composable Chains-of-Thought Fangcong Yin et.al. 2505.22635 null
2025-05-28 Spatial Knowledge Graph-Guided Multimodal Synthesis Yida Xue et.al. 2505.22633 null
2025-05-28 Stochastic Chameleons: Irrelevant Context Hallucinations Reveal Class-Based (Mis)Generalization in LLMs Ziling Cheng et.al. 2505.22630 null
2025-05-28 Principled Out-of-Distribution Generalization via Simplicity Jiawei Ge et.al. 2505.22622 null
2025-05-28 Fast-dLLM: Training-free Acceleration of Diffusion LLM by Enabling KV Cache and Parallel Decoding Chengyue Wu et.al. 2505.22618 null
2025-05-28 The Entropy Mechanism of Reinforcement Learning for Reasoning Language Models Ganqu Cui et.al. 2505.22617 null
2025-05-28 RICO: Improving Accuracy and Completeness in Image Recaptioning via Visual Reconstruction Yuchi Wang et.al. 2505.22613 null
2025-05-28 Effective and Efficient One-pass Compression of Speech Foundation Models Using Sparsity-aware Self-pinching Gates Haoning Xu et.al. 2505.22608 null
2025-05-27 Silence is Not Consensus: Disrupting Agreement Bias in Multi-Agent LLMs via Catfish Agent for Clinical Decision Making Yihan Wang et.al. 2505.21503 null
2025-05-27 ViewSpatial-Bench: Evaluating Multi-perspective Spatial Localization in Vision-Language Models Dingming Li et.al. 2505.21500 null
2025-05-27 AdInject: Real-World Black-Box Attacks on Web Agents via Advertising Delivery Haowei Wang et.al. 2505.21499 link
2025-05-27 Adversarial Attacks against Closed-Source MLLMs via Feature Optimal Alignment Xiaojun Jia et.al. 2505.21494 link
2025-05-27 Reinforcing General Reasoning without Verifiers Xiangxin Zhou et.al. 2505.21493 link
2025-05-27 Robust Hypothesis Generation: LLM-Automated Language Bias for Inductive Logic Programming Yang Yang et.al. 2505.21486 null
2025-05-27 Are Language Models Consequentialist or Deontological Moral Reasoners? Keenan Samway et.al. 2505.21479 null
2025-05-27 Policy Optimized Text-to-Image Pipeline Design Uri Gadot et.al. 2505.21478 null
2025-05-27 Mitigating Hallucination in Large Vision-Language Models via Adaptive Attention Calibration Mehrdad Fazli et.al. 2505.21472 null
2025-05-27 Scaling External Knowledge Input Beyond Context Windows of LLMs via Multi-Agent Collaboration Zijun Liu et.al. 2505.21471 link
2025-05-27 Accelerating Diffusion Language Model Inference via Efficient KV Caching and Guided Diffusion Zhanqiu Hu et.al. 2505.21467 null
2025-05-27 ID-Align: RoPE-Conscious Position Remapping for Dynamic High-Resolution Adaptation in Vision-Language Models Bozhou Li et.al. 2505.21465 null
2025-05-27 LazyVLM: Neuro-Symbolic Approach to Video Analytics Xiangru Jian et.al. 2505.21459 null
2025-05-27 Do LLMs Need to Think in One Language? Correlation between Latent Language and Task Performance Shintaro Ozaki et.al. 2505.21458 null
2025-05-27 Active-O3: Empowering Multimodal Large Language Models with Active Perception via GRPO Muzhi Zhu et.al. 2505.21457 null
2025-05-27 Can Large Reasoning Models Self-Train? Sheikh Shafayat et.al. 2505.21444 null
2025-05-27 Towards Better Instruction Following Retrieval Models Yuchen Zhuang et.al. 2505.21439 null
2025-05-27 Hume: Introducing System-2 Thinking in Visual-Language-Action Model Haoming Song et.al. 2505.21432 null
2025-05-27 Policy Induction: Predicting Startup Success via Explainable Memory-Augmented In-Context Learning Xianling Mu et.al. 2505.21427 null
2025-05-27 GUARD: Dual-Agent based Backdoor Defense on Chain-of-Thought in Neural Code Generation Naizhu Jin et.al. 2505.21425 null
2025-05-26 Pangu Light: Weight Re-Initialization for Pruning and Accelerating LLMs Hanting Chen et.al. 2505.20155 null
2025-05-26 UORA: Uniform Orthogonal Reinitialization Adaptation in Parameter-Efficient Fine-Tuning of Large Models Xueyan Zhang et.al. 2505.20154 null
2025-05-26 MineAnyBuild: Benchmarking Spatial Planning for Open-world AI Agents Ziming Wei et.al. 2505.20148 link
2025-05-26 FUDOKI: Discrete Flow-based Unified Understanding and Generation via Kinetic-Optimal Velocities Jin Wang et.al. 2505.20147 null
2025-05-26 SeMe: Training-Free Language Model Merging via Semantic Alignment Jian Gu et.al. 2505.20144 null
2025-05-26 StructEval: Benchmarking LLMs' Capabilities to Generate Structural Outputs Jialin Yang et.al. 2505.20139 null
2025-05-26 AweDist: Attention-aware Embedding Distillation for New Input Token Embeddings Konstantin Dobler et.al. 2505.20133 null
2025-05-26 Agentic 3D Scene Generation with Spatially Contextualized VLMs Xinhang Liu et.al. 2505.20129 null
2025-05-26 Iterative Self-Incentivization Empowers Large Language Models as Agentic Searchers Zhengliang Shi et.al. 2505.20128 link
2025-05-26 Agentic AI Process Observability: Discovering Behavioral Variability Fabiana Fournier et.al. 2505.20127 null
2025-05-26 MEBench: A Novel Benchmark for Understanding Mutual Exclusivity Bias in Vision-Language Models Anh Thai et.al. 2505.20122 null
2025-05-27 TrojanStego: Your Language Model Can Secretly Be A Steganographic Privacy Leaking Agent Dominik Meier et.al. 2505.20118 link
2025-05-26 Named Entity Recognition in Historical Italian: The Case of Giacomo Leopardi's Zibaldone Cristian Santini et.al. 2505.20113 null
2025-05-26 ResSVD: Residual Compensated SVD for Large Language Model Compression Haolei Bai et.al. 2505.20112 null
2025-05-26 Language-Agnostic Suicidal Risk Detection Using Large Language Models June-Woo Kim et.al. 2505.20109 null
2025-05-26 Adaptive Deep Reasoning: Triggering Deep Thinking When Needed Yunhao Wang et.al. 2505.20101 null
2025-05-26 AdaTP: Attention-Debiased Token Pruning for Video Large Language Models Fengyuan Sun et.al. 2505.20100 null
2025-05-26 Large Language Models Meet Knowledge Graphs for Question Answering: Synthesis and Opportunities Chuangtao Ma et.al. 2505.20099 link
2025-05-26 S2LPP: Small-to-Large Prompt Prediction across LLMs Liang Cheng et.al. 2505.20097 null
2025-05-26 Multi-Domain Explainability of Preferences Nitay Calderon et.al. 2505.20088 null
2025-05-26 Safety Through Reasoning: An Empirical Study of Reasoning Guardrail Models Makesh Narsimhan Sreedhar et.al. 2505.20087 null
2025-05-26 Inference-time Alignment in Continuous Space Yige Yuan et.al. 2505.20081 link
2025-05-23 Fann or Flop: A Multigenre, Multiera Benchmark for Arabic Poetry Understanding in LLMs Wafa Alghallabi et.al. 2505.18152 link
2025-05-23 First Finish Search: Efficient Test-Time Scaling in Large Language Models Aradhye Agarwal et.al. 2505.18149 null
2025-05-23 Lost in the Haystack: Smaller Needles are More Difficult for LLMs to Find Owen Bianchi et.al. 2505.18148 null
2025-05-23 Graph-Linguistic Fusion: Using Language Models for Wikidata Vandalism Detection Mykola Trokhymovych et.al. 2505.18136 null
2025-05-23 Gaming Tool Preferences in Agentic LLMs Kazem Faghih et.al. 2505.18135 link
2025-05-23 VideoGameBench: Can Vision-Language Models complete popular video games? Alex L. Zhang et.al. 2505.18134 null
2025-05-23 One RL to See Them All: Visual Triple Unified Reinforcement Learning Yan Ma et.al. 2505.18129 null
2025-05-23 Reward Model Overoptimisation in Iterated RLHF Lorenz Wolf et.al. 2505.18126 null
2025-05-23 TabSTAR: A Foundation Tabular Model With Semantically Target-Aware Representations Alan Arazi et.al. 2505.18125 null
2025-05-23 UNJOIN: Enhancing Multi-Table Text-to-SQL Generation via Schema Simplification Poojah Ganesan et.al. 2505.18122 null
2025-05-23 ProgRM: Build Better GUI Agents with Progress Rewards Danyang Zhang et.al. 2505.18121 null
2025-05-23 Bidirectional Knowledge Distillation for Enhancing Sequential Recommendation with Large Language Models Jiongran Wu et.al. 2505.18120 null
2025-05-23 Watch and Listen: Understanding Audio-Visual-Speech Moments with Multimodal LLM Zinuo Li et.al. 2505.18110 null
2025-05-23 ManuSearch: Democratizing Deep Search in Large Language Models with a Transparent and Open Multi-Agent Framework Lisheng Huang et.al. 2505.18105 link
2025-05-23 How Can I Publish My LLM Benchmark Without Giving the True Answers Away? Takashi Ishida et.al. 2505.18102 null
2025-05-23 Planning without Search: Refining Frontier LLMs with Offline Goal-Conditioned RL Joey Hong et.al. 2505.18098 null
2025-05-23 QwenLong-CPRS: Towards $\infty$-LLMs with Dynamic Context Optimization Weizhou Shen et.al. 2505.18092 null
2025-05-23 Data Mixing Can Induce Phase Transitions in Knowledge Acquisition Xinran Gu et.al. 2505.18091 null
2025-05-23 CXReasonBench: A Benchmark for Evaluating Structured Diagnostic Reasoning in Chest X-rays Hyungyung Lee et.al. 2505.18087 link
2025-05-23 Deep Video Discovery: Agentic Search with Tool Use for Long-form Video Understanding Xiaoyi Zhang et.al. 2505.18079 null
2025-05-22 CrossLMM: Decoupling Long Video Sequences from LMMs via Dual Cross-Attention Mechanisms Shilin Yan et.al. 2505.17020 link
2025-05-22 Let Androids Dream of Electric Sheep: A Human-like Image Implication Understanding and Reasoning Framework Chenhao Zhang et.al. 2505.17019 link
2025-05-22 SophiaVL-R1: Reinforcing MLLMs Reasoning with Thinking Reward Kaixuan Fan et.al. 2505.17018 link
2025-05-22 Delving into RL for Image Generation with CoT: A Study on DPO vs. GRPO Chengzhuo Tong et.al. 2505.17017 link
2025-05-22 Multi-SpatialMLLM: Multi-Frame Spatial Understanding with Multi-Modal Large Language Models Runsen Xu et.al. 2505.17015 null
2025-05-22 SpatialScore: Towards Unified Evaluation for Multimodal Spatial Understanding Haoning Wu et.al. 2505.17012 link
2025-05-22 R1-Searcher++: Incentivizing the Dynamic Knowledge Acquisition of LLMs via Reinforcement Learning Huatong Song et.al. 2505.17005 link
2025-05-22 Do Large Language Models Excel in Complex Logical Reasoning with Formal Language? Jin Jiang et.al. 2505.16998 link
2025-05-22 DecoupledESC: Enhancing Emotional Support Generation via Strategy-Response Decoupled Preference Optimization Chao Zhang et.al. 2505.16995 null
2025-05-22 Dimple: Discrete Diffusion Multimodal Large Language Model with Parallel Decoding Runpeng Yu et.al. 2505.16990 link
2025-05-22 T1: A Tool-Oriented Conversational Dataset for Multi-Turn Agentic Planning Amartya Chakraborty et.al. 2505.16986 null
2025-05-22 UFT: Unifying Supervised and Reinforcement Fine-Tuning Mingyang Liu et.al. 2505.16984 link
2025-05-22 LLM as Effective Streaming Processor: Bridging Streaming-Batch Mismatches with Group Position Encoding Junlong Tong et.al. 2505.16983 link
2025-05-22 Beyond Correlation: Towards Causal Large Language Model Agents in Biomedicine Adib Bazgir et.al. 2505.16982 null
2025-05-22 HyGenar: An LLM-Driven Hybrid Genetic Algorithm for Few-Shot Grammar Generation Weizhi Tang et.al. 2505.16978 link
2025-05-22 SWE-Dev: Evaluating and Training Autonomous Feature-Driven Software Development Yaxin Du et.al. 2505.16975 link
2025-05-22 CASS: Nvidia to AMD Transpilation with Data, Models, and Benchmark Ahmed Heakl et.al. 2505.16968 link
2025-05-22 Invisible Prompts, Visible Threats: Malicious Font Injection in External Resources for Large Language Models Junjie Xiong et.al. 2505.16957 null
2025-05-22 On Multilingual Encoder Language Model Compression for Low-Resource Languages Daniil Gurgurov et.al. 2505.16956 null
2025-05-22 A Comprehensive Evaluation of Contemporary ML-Based Solvers for Combinatorial Optimization Shengyu Feng et.al. 2505.16952 null
2025-05-21 InstructSAM: A Training-Free Framework for Instruction-Oriented Remote Sensing Object Recognition Yijie Zheng et.al. 2505.15818 link
2025-05-21 On the creation of narrow AI: hierarchy and nonlocality of neural network skills Eric J. Michaud et.al. 2505.15811 link
2025-05-21 MMaDA: Multimodal Large Diffusion Language Models Ling Yang et.al. 2505.15809 link
2025-05-21 The Atlas of In-Context Learning: How Attention Heads Shape In-Context Retrieval Augmentation Patrick Kahardipraja et.al. 2505.15807 link
2025-05-21 Keep Security! Benchmarking Security Policy Preservation in Large Language Model Contexts Against Indirect Attacks in Question Answering Hwan Chang et.al. 2505.15805 link
2025-05-21 STAR-R1: Spacial TrAnsformation Reasoning by Reinforcing Multimodal LLMs Zongzhao Li et.al. 2505.15804 link
2025-05-21 VerifyBench: Benchmarking Reference-based Reward Systems for Large Language Models Yuchen Yan et.al. 2505.15801 null
2025-05-21 Model Merging is Secretly Certifiable: Non-Vacuous Generalisation Bounds for Low-Shot Learning Taehoon Kim et.al. 2505.15798 null
2025-05-21 Reverse Engineering Human Preferences with Reinforcement Learning Lisa Alazraki et.al. 2505.15795 null
2025-05-21 HCRMP: A LLM-Hinted Contextual Reinforcement Learning Framework for Autonomous Driving Zhiwen Chen et.al. 2505.15793 null
2025-05-21 Large Language Models as Computable Approximations to Solomonoff Induction Jun Wan et.al. 2505.15784 null
2025-05-21 dKV-Cache: The Cache for Diffusion Language Models Xinyin Ma et.al. 2505.15781 link
2025-05-21 ConvSearch-R1: Enhancing Query Reformulation for Conversational Search with Reasoning via Reinforcement Learning Changtai Zhu et.al. 2505.15776 link
2025-05-21 Beyond Hard and Soft: Hybrid Context Compression for Balancing Local and Global Information Retention Huanxuan Liao et.al. 2505.15774 link
2025-05-21 MIKU-PAL: An Automated and Standardized Multi-Modal Method for Speech Paralinguistic and Affect Labeling Cheng Yifan et.al. 2505.15772 null
2025-05-21 An Empirical Analysis of Vulnerability Detection Tools for Solidity Smart Contracts Using Line Level Manually Annotated Vulnerabilities Francesco Salzano et.al. 2505.15756 null
2025-05-21 Exploring The Visual Feature Space for Multimodal Neural Decoding Weihao Xia et.al. 2505.15755 null
2025-05-21 Scalable Defense against In-the-wild Jailbreaking Attacks with Safety Context Retrieval Taiye Chen et.al. 2505.15753 null
2025-05-21 Multi-modal Integration Analysis of Alzheimer's Disease Using Large Language Models and Knowledge Graphs Kanan Kiguchi et.al. 2505.15747 null
2025-05-21 Evolutionary Computation and Large Language Models: A Survey of Methods, Synergies, and Applications Dikshit Chauhan et.al. 2505.15741 null
2025-05-20 Language Models use Lookbacks to Track Beliefs Nikhil Prakash et.al. 2505.14685 null
2025-05-20 Mind the Gap: Bridging Thought Leap for Improved Chain-of-Thought Tuning Haolei Xu et.al. 2505.14684 null
2025-05-20 Emerging Properties in Unified Multimodal Pretraining Chaorui Deng et.al. 2505.14683 null
2025-05-20 UniGen: Enhanced Training & Test-Time Strategies for Unified Multimodal Understanding and Generation Rui Tian et.al. 2505.14682 null
2025-05-20 UltraEdit: Training-, Subject-, and Memory-Free Lifelong Editing in Large Language Models Xiaojie Gu et.al. 2505.14679 link
2025-05-20 Visionary-R1: Mitigating Shortcuts in Visual Reasoning with Reinforcement Learning Jiaer Xia et.al. 2505.14677 null
2025-05-20 Reward Reasoning Model Jiaxin Guo et.al. 2505.14674 null
2025-05-20 UniCTokens: Boosting Personalized Understanding and Generation via Unified Concept Tokens Ruichuan An et.al. 2505.14671 link
2025-05-20 Quartet: Native FP4 Training Can Be Optimal for Large Language Models Roberto L. Castro et.al. 2505.14669 link
2025-05-20 ContextAgent: Context-Aware Proactive LLM Agents with Open-World Sensory Perceptions Bufang Yang et.al. 2505.14668 null
2025-05-20 Beyond Words: Multimodal LLM Knows When to Speak Zikai Liao et.al. 2505.14654 null
2025-05-20 General-Reasoner: Advancing LLM Reasoning Across All Domains Xueguang Ma et.al. 2505.14652 null
2025-05-20 Vox-Profile: A Speech Foundation Model Benchmark for Characterizing Diverse Speaker and Speech Traits Tiantian Feng et.al. 2505.14648 link
2025-05-20 CAD-Coder: An Open-Source Vision-Language Model for Computer-Aided Design Code Generation Anna C. Doris et.al. 2505.14646 link
2025-05-20 Think Only When You Need with Large Hybrid-Reasoning Models Lingjie Jiang et.al. 2505.14631 null
2025-05-20 KERL: Knowledge-Enhanced Personalized Recipe Recommendation using Large Language Models Fnu Mohbat et.al. 2505.14629 link
2025-05-20 Debating for Better Reasoning: An Unsupervised Multimodal Approach Ashutosh Adhikari et.al. 2505.14627 null
2025-05-20 TinyV: Reducing False Negatives in Verification Improves RL for LLM Reasoning Zhangchen Xu et.al. 2505.14625 link
2025-05-20 Enhancing Learned Knowledge in LoRA Adapters Through Efficient Contrastive Decoding on Ascend NPUs Morgan Lindsay Heisler et.al. 2505.14620 null
2025-05-20 Linear Control of Test Awareness Reveals Differential Compliance in Reasoning Models Sahar Abdelnabi et.al. 2505.14617 link
2025-05-19 CIE: Controlling Language Model Text Generations Using Continuous Signals Vinay Samuel et.al. 2505.13448 link
2025-05-19 Trust, But Verify: A Self-Verification Approach to Reinforcement Learning with Verifiable Rewards Xiaoyuan Liu et.al. 2505.13445 link
2025-05-19 ChartMuseum: Testing Visual Reasoning Capabilities of Large Vision-Language Models Liyan Tang et.al. 2505.13444 null
2025-05-19 GraspMolmo: Generalizable Task-Oriented Grasping via Large-Scale Synthetic Data Generation Abhay Deshpande et.al. 2505.13441 null
2025-05-19 Optimizing Anytime Reasoning via Budget Relative Policy Optimization Penghui Qi et.al. 2505.13438 link
2025-05-19 SMOTExT: SMOTE meets Large Language Models Mateusz Bystroński et.al. 2505.13434 null
2025-05-19 Fine-tuning Quantized Neural Networks with Zeroth-order Optimization Sifeng Shang et.al. 2505.13430 link
2025-05-19 MM-PRM: Enhancing Multimodal Mathematical Reasoning with Scalable Step-Level Supervision Lingxiao Du et.al. 2505.13427 link
2025-05-19 G1: Bootstrapping Perception and Reasoning Abilities of Vision-Language Model via Reinforcement Learning Liang Chen et.al. 2505.13426 link
2025-05-19 Learnware of Language Models: Specialized Small Language Models Can Do Big Zhi-Hao Tan et.al. 2505.13425 link
2025-05-19 Make Still Further Progress: Chain of Thoughts for Tabular Data Leaderboard Si-Yang Liu et.al. 2505.13421 null
2025-05-19 FEALLM: Advancing Facial Emotion Analysis in Multimodal Large Language Models with Emotional Synergy and Reasoning Zhuozhao Hu et.al. 2505.13419 link
2025-05-19 CoT-Kinetics: A Theoretical Modeling Assessing LRM Reasoning Process Jinhe Bi et.al. 2505.13408 null
2025-05-19 AutoMathKG: The automated mathematical knowledge graph based on LLM and vector database Rong Bian et.al. 2505.13406 null
2025-05-19 MR. Judge: Multimodal Reasoner as a Judge Renjie Pi et.al. 2505.13403 null
2025-05-19 R3: Robust Rubric-Agnostic Reward Models David Anugraha et.al. 2505.13388 link
2025-05-19 CompeteSMoE -- Statistically Guaranteed Mixture of Experts Training via Competition Nam V. Nguyen et.al. 2505.13380 link
2025-05-19 Thinkless: LLM Learns When to Think Gongfan Fang et.al. 2505.13379 link
2025-05-19 Seeing, Saying, Solving: An LLM-to-TL Framework for Cooperative Robots Dan BW Choe et.al. 2505.13376 null
2025-05-19 Multi-Armed Bandits Meet Large Language Models Djallel Bouneffouf et.al. 2505.13355 null
2025-05-16 Modeling cognitive processes of natural reading with transformer-based Language Models Bruno Bianchi et.al. 2505.11485 null
2025-05-16 msf-CNN: Patch-based Multi-Stage Fusion with Convolutional Neural Networks for TinyML Zhaolan Huang et.al. 2505.11483 link
2025-05-16 Improving Assembly Code Performance with Large Language Models via Reinforcement Learning Anjiang Wei et.al. 2505.11480 null
2025-05-16 HelpSteer3-Preference: Open Human-Annotated Preference Data across Diverse Tasks and Languages Zhilin Wang et.al. 2505.11475 null
2025-05-16 Disentangling Reasoning and Knowledge in Medical Large Language Models Rahul Thapa et.al. 2505.11462 null
2025-05-16 ProxyPrompt: Securing System Prompts against Prompt Extraction Attacks Zhixiong Zhuang et.al. 2505.11459 null
2025-05-16 LLMs unlock new paths to monetizing exploits Nicholas Carlini et.al. 2505.11449 null
2025-05-16 Is Compression Really Linear with Code Intelligence? Xianzhen Luo et.al. 2505.11441 null
2025-05-16 GODBench: A Benchmark for Multimodal Large Language Models in Video Comment Art Chenkai Zhang et.al. 2505.11436 link
2025-05-16 MegaScale-MoE: Large-Scale Communication-Efficient Training of Mixture-of-Experts Models in Production Chao Jin et.al. 2505.11432 null
2025-05-16 Mergenetic: a Simple Evolutionary Model Merging Library Adrian Robert Minut et.al. 2505.11427 link
2025-05-16 When Thinking Fails: The Pitfalls of Reasoning for Instruction-Following in LLMs Xiaomin Li et.al. 2505.11423 null
2025-05-16 Towards Cultural Bridge by Bahnaric-Vietnamese Translation Using Transfer Learning of Sequence-To-Sequence Pre-training Language Model Phan Tran Minh Dat et.al. 2505.11421 null
2025-05-16 EdgeWisePersona: A Dataset for On-Device User Profiling from Natural Language Interactions Patryk Bartkowiak et.al. 2505.11417 link
2025-05-16 MoE-CAP: Benchmarking Cost, Accuracy and Performance of Sparse Mixture-of-Experts Systems Yinsicheng Jiang et.al. 2505.11415 null
2025-05-16 CARES: Comprehensive Evaluation of Safety and Adversarial Robustness in Medical LLMs Sijia Chen et.al. 2505.11413 null
2025-05-16 Visual Planning: Let's Think Only with Images Yi Xu et.al. 2505.11409 link
2025-05-16 Large Language Model Use Impact Locus of Control Jenny Xiyu Fu et.al. 2505.11406 null
2025-05-16 EmotionHallucer: Evaluating Emotion Hallucinations in Multimodal Large Language Models Bohao Xing et.al. 2505.11405 link
2025-05-16 Patho-R1: A Multimodal Reinforcement Learning-Based Pathology Expert Reasoner Wenchuan Zhang et.al. 2505.11404 link
2025-05-15 End-to-End Vision Tokenizer Tuning Wenxuan Wang et.al. 2505.10562 null
2025-05-15 Neural Thermodynamic Laws for Large Language Model Training Ziming Liu et.al. 2505.10559 null
2025-05-15 Does Feasibility Matter? Understanding the Impact of Feasibility on Synthetic Training Data Yiwen Liu et.al. 2505.10551 link
2025-05-15 Real-Time Out-of-Distribution Failure Prevention via Multi-Modal Reasoning Milan Ganai et.al. 2505.10547 null
2025-05-15 Towards a Deeper Understanding of Reasoning Capabilities in Large Language Models Annie Wong et.al. 2505.10543 link
2025-05-15 Exploring Implicit Visual Misunderstandings in Multimodal Large Language Models through Attention Analysis Pengfei Wang et.al. 2505.10541 link
2025-05-15 S3C2 Summit 2024-09: Industry Secure Software Supply Chain Summit Imranur Rahman et.al. 2505.10538 null
2025-05-15 WorldPM: Scaling Human Preference Modeling Binghai Wang et.al. 2505.10527 link
2025-05-15 MASSV: Multimodal Adaptation and Self-Data Distillation for Speculative Decoding of Vision-Language Models Mugilan Ganesan et.al. 2505.10526 null
2025-05-15 Multi-Token Prediction Needs Registers Anastasios Gerontopoulos et.al. 2505.10518 link
2025-05-15 RouteNator: A Router-Based Multi-Modal Architecture for Generating Synthetic Training Data for Function Calling LLMs Vibha Belavadi et.al. 2505.10495 null
2025-05-15 Can You Really Trust Code Copilots? Evaluating Large Language Models from a Code Security Perspective Yutao Mou et.al. 2505.10494 link
2025-05-15 CL-RAG: Bridging the Gap in Retrieval-Augmented Generation with Curriculum Learning Shaohan Wang et.al. 2505.10493 null
2025-05-15 Campus AI vs Commercial AI: A Late-Breaking Study on How LLM As-A-Service Customizations Shape Trust and Usage Patterns Leon Hannig et.al. 2505.10490 null
2025-05-15 Parallel Scaling Law for Language Models Mouxiang Chen et.al. 2505.10475 link
2025-05-15 Large Language Models for Cancer Communication: Evaluating Linguistic Quality, Safety, and Accessibility in Generative AI Agnik Saha et.al. 2505.10472 null
2025-05-15 AI Agents vs. Agentic AI: A Conceptual Taxonomy, Applications and Challenges Ranjan Sapkota et.al. 2505.10468 null
2025-05-15 Superposition Yields Robust Neural Scaling Yizhou Liu et.al. 2505.10465 link
2025-05-15 Vision language models have difficulty recognizing virtual objects Tyler Tran et.al. 2505.10453 null
2025-05-15 Reinforcing the Diffusion Chain of Lateral Thought with Diffusion Language Models Zemin Huang et.al. 2505.10446 null
2025-05-14 Language Agents Mirror Human Causal Reasoning Biases. How Can We Help Them Think Like Scientists? Anthony GX-Chen et.al. 2505.09614 null
2025-05-14 Customizing a Large Language Model for VHDL Design of High-Performance Microprocessors Nicolas Dupuis et.al. 2505.09610 null
2025-05-14 Adversarial Suffix Filtering: a Defense Pipeline for LLMs David Khachaturov et.al. 2505.09602 null
2025-05-14 How Hungry is AI? Benchmarking Energy, Water, and Carbon Footprint of LLM Inference Nidhal Jegham et.al. 2505.09598 null
2025-05-14 WorldView-Bench: A Benchmark for Evaluating Global Cultural Perspectives in Large Language Models Abdullah Mushtaq et.al. 2505.09595 null
2025-05-14 Variational Visual Question Answering Tobias Jan Wieczorek et.al. 2505.09591 null
2025-05-15 Beyond Likes: How Normative Feedback Complements Engagement Signals on Social Media Yuchen Wu et.al. 2505.09583 null
2025-05-14 VTLA: Vision-Tactile-Language-Action Model with Preference Learning for Insertion Manipulation Chaofan Zhang et.al. 2505.09577 null
2025-05-14 Ethics and Persuasion in Reinforcement Learning from Human Feedback: A Procedural Rhetorical Approach Shannon Lodoen et.al. 2505.09576 null
2025-05-14 MIGRATION-BENCH: Repository-Level Code Migration Benchmark from Java 8 Linbo Liu et.al. 2505.09569 link
2025-05-14 Using Foundation Models as Pseudo-Label Generators for Pre-Clinical 4D Cardiac CT Segmentation Anne-Marie Rickmann et.al. 2505.09564 null
2025-05-14 WavReward: Spoken Dialogue Models With Generalist Reward Evaluators Shengpeng Ji et.al. 2505.09558 link
2025-05-14 PT-MoE: An Efficient Finetuning Framework for Integrating Mixture-of-Experts into Prompt Tuning Zongqian Li et.al. 2505.09519 link
2025-05-15 Towards Fair In-Context Learning with Tabular Foundation Models Patrik Kenfack et.al. 2505.09503 null
2025-05-14 Layered Unlearning for Adversarial Relearning Timothy Qian et.al. 2505.09500 link
2025-05-14 Flash-VL 2B: Optimizing Vision-Language Model Performance for Ultra-Low Latency and High Throughput Bo Zhang et.al. 2505.09498 null
2025-05-14 Card Sorting Simulator: Augmenting Design of Logical Information Architectures with Large Language Models Eduard Kuric et.al. 2505.09478 null
2025-05-14 Deploying Foundation Model-Enabled Air and Ground Robots in the Field: Challenges and Opportunities Zachary Ravichandran et.al. 2505.09477 null
2025-05-14 Evaluating GPT- and Reasoning-based Large Language Models on Physics Olympiad Problems: Surpassing Human Performance and Implications for Educational Assessment Paul Tschisgale et.al. 2505.09438 null
2025-05-14 CXMArena: Unified Dataset to benchmark performance in realistic CXM Scenarios Raghav Garg et.al. 2505.09436 link
2025-05-13 CodePDE: An Inference Framework for LLM-driven PDE Solver Generation Shanda Li et.al. 2505.08783 link
2025-05-13 HealthBench: Evaluating Large Language Models Towards Improved Human Health Rahul K. Arora et.al. 2505.08775 link
2025-05-14 Towards Autonomous UAV Visual Object Search in City Space: Benchmark and Agentic Methodology Yatai Ji et.al. 2505.08765 null
2025-05-13 Aya Vision: Advancing the Frontier of Multilingual Multimodality Saurabh Dash et.al. 2505.08751 null
2025-05-13 AC-Reason: Towards Theory-Guided Actual Causality Reasoning with Large Language Models Yanxi Zhang et.al. 2505.08750 link
2025-05-13 DeepMath-Creative: A Benchmark for Evaluating Mathematical Creativity of Large Language Models Xiaoyang Chen et.al. 2505.08744 link
2025-05-13 Probability Consistency in Large Language Models: Theoretical Foundations Meet Empirical Discrepancies Xiaoliang Luo et.al. 2505.08739 link
2025-05-13 Towards Foundation Models for Experimental Readout Systems Combining Discrete and Continuous Data James Giroux et.al. 2505.08736 link
2025-05-13 NurValues: Real-World Nursing Values Evaluation for Large Language Models in Clinical Context Ben Yao et.al. 2505.08734 null
2025-05-13 Securing RAG: A Risk Assessment and Mitigation Framework Lukas Ammann et.al. 2505.08728 null
2025-05-13 Memorization-Compression Cycles Improve Generalization Fangyuan Yu et.al. 2505.08727 null
2025-05-13 Extending Large Vision-Language Model for Diverse Interactive Tasks in Autonomous Driving Zongchuang Zhao et.al. 2505.08725 link
2025-05-13 TiMo: Spatiotemporal Foundation Model for Satellite Image Time Series Xiaolei Qin et.al. 2505.08723 link
2025-05-13 PWC-MoE: Privacy-Aware Wireless Collaborative Mixture of Experts Yang Su et.al. 2505.08719 null
2025-05-13 Controllable Image Colorization with Instance-aware Texts and Masks Yanru An et.al. 2505.08705 null
2025-05-13 LLM-based Prompt Ensemble for Reliable Medical Entity Recognition from EHRs K M Sajjadul Islam et.al. 2505.08704 null
2025-05-14 Granite-speech: open-source speech-aware LLMs with strong English ASR capabilities George Saon et.al. 2505.08699 null
2025-05-13 VizCV: AI-assisted visualization of researchers' publications tracks Vladimír Lazárik et.al. 2505.08691 null
2025-05-13 Adaptive Schema-aware Event Extraction with Retrieval-Augmented Generation Sheng Liang et.al. 2505.08690 null
2025-05-13 A Social Robot with Inner Speech for Dietary Guidance Valerio Belcamino et.al. 2505.08664 link
2025-05-12 DanceGRPO: Unleashing GRPO on Visual Generation Zeyue Xue et.al. 2505.07818 null
2025-05-12 Imagine, Verify, Execute: Memory-Guided Agentic Exploration with Vision-Language Models Seungjae Lee et.al. 2505.07815 null
2025-05-12 Learning Dynamics in Continual Pre-Training for Large Language Models Xingjin Wang et.al. 2505.07796 null
2025-05-12 Domain Regeneration: How well do LLMs match syntactic properties of text domains? Da Ju et.al. 2505.07784 null
2025-05-12 Relative Overfitting and Accept-Reject Framework Yanxin Liu et.al. 2505.07783 null
2025-05-12 MLE-Dojo: Interactive Environments for Empowering LLM Agents in Machine Learning Engineering Rushi Qiang et.al. 2505.07782 link
2025-05-12 Must Read: A Systematic Survey of Computational Persuasion Nimet Beyza Bozdag et.al. 2505.07775 link
2025-05-12 Agent RL Scaling Law: Agent RL with Spontaneous Code Execution for Mathematical Problem Solving Xinji Mai et.al. 2505.07773 link
2025-05-12 Enhancing Code Generation via Bidirectional Comment-Level Mutual Grounding Yifeng Di et.al. 2505.07768 link
2025-05-12 BodyGPS: Anatomical Positioning System Halid Ziya Yerebakan et.al. 2505.07744 null
2025-05-12 Assessing the Chemical Intelligence of Large Language Models Nicholas T. Runcie et.al. 2505.07735 link
2025-05-12 Spoken Language Understanding on Unseen Tasks With In-Context Learning Neeraj Agrawal et.al. 2505.07731 null
2025-05-12 Reproducibility, Replicability, and Insights into Visual Document Retrieval with Late Interaction Jingfen Qiao et.al. 2505.07730 link
2025-05-12 Circuit Partitioning Using Large Language Models for Quantum Compilation and Simulations Pranav Sinha et.al. 2505.07711 null
2025-05-12 Through the Looking Glass: Common Sense Consistency Evaluation of Weird Images Elisei Rykov et.al. 2505.07704 null
2025-05-12 PatchTrack: A Comprehensive Analysis of ChatGPT's Influence on Pull Request Outcomes Daniel Ogenrwot et.al. 2505.07700 null
2025-05-12 Beyond CLIP Generalization: Against Forward&Backward Forgetting Adapter for Continual Learning of Vision-Language Models Songlin Dong et.al. 2505.07690 null
2025-05-12 S-GRPO: Early Exit via Reinforcement Learning in Reasoning Models Muzhi Dai et.al. 2505.07686 null
2025-05-12 Multimodal Survival Modeling in the Age of Foundation Models Steven Song et.al. 2505.07683 link
2025-05-12 SpecRouter: Adaptive Routing for Multi-Level Speculative Decoding in Large Language Models Hang Wu et.al. 2505.07680 null
2025-05-09 Towards a Unified Representation Evaluation Framework Beyond Downstream Tasks Christos Plachouras et.al. 2505.06224 link
2025-05-09 Adapting a Segmentation Foundation Model for Medical Image Classification Pengfei Gu et.al. 2505.06217 null
2025-05-09 From Millions of Tweets to Actionable Insights: Leveraging LLMs for User Profiling Vahid Rahimzadeh et.al. 2505.06184 null
2025-05-09 A Large Language Model-Enhanced Q-learning for Capacitated Vehicle Routing Problem with Time Windows Linjiang Cao et.al. 2505.06178 null
2025-05-09 MonetGPT: Solving Puzzles Enhances MLLMs' Image Retouching Skills Niladri Shekhar Dutt et.al. 2505.06176 null
2025-05-09 Turbo-ICL: In-Context Learning-Based Turbo Equalization Zihang Song et.al. 2505.06175 null
2025-05-09 MM-Skin: Enhancing Dermatology Vision-Language Model with an Image-Text Dataset Derived from Textbooks Wenqi Zeng et.al. 2505.06152 link
2025-05-09 A Scaling Law for Token Efficiency in LLM Fine-Tuning Under Fixed Compute Budgets Ryan Lagasse et.al. 2505.06150 null
2025-05-09 Can Prompting LLMs Unlock Hate Speech Detection across Languages? A Zero-shot and Few-shot Study Faeze Ghorbanpour et.al. 2505.06149 null
2025-05-09 LLMs Get Lost In Multi-Turn Conversation Philippe Laban et.al. 2505.06120 link
2025-05-09 LLMs Outperform Experts on Challenging Biology Benchmarks Lennart Justen et.al. 2505.06108 null
2025-05-09 Free and Fair Hardware: A Pathway to Copyright Infringement-Free Verilog Generation using LLMs Sam Bush et.al. 2505.06096 null
2025-05-09 Assessing Tenstorrent's RISC-V MatMul Acceleration Capabilities Hiari Pizzini Cavagna et.al. 2505.06085 null
2025-05-09 Healthy LLMs? Benchmarking LLM Knowledge of UK Government Public Health Information Joshua Harris et.al. 2505.06046 null
2025-05-09 Short-circuiting Shortcuts: Mechanistic Investigation of Shortcuts in Text Classification Leon Eshuijs et.al. 2505.06032 link
2025-05-09 Unilogit: Robust Machine Unlearning for LLMs Using Uniform-Target Self-Distillation Stefan Vasilev et.al. 2505.06027 null
2025-05-09 ArtRAG: Retrieval-Augmented Generation with Structured Context for Visual Art Understanding Shuai Wang et.al. 2505.06020 null
2025-05-09 Exploring the Feasibility of Multilingual Grammatical Error Correction with a Single LLM up to 9B parameters: A Comparative Study of 17 Models Dawid Wisniewski et.al. 2505.06004 link
2025-05-09 Task-Adapter++: Task-specific Adaptation with Order-aware Alignment for Few-shot Action Recognition Congqi Cao et.al. 2505.06002 link
2025-05-09 Towards Developmentally Plausible Rewards: Communicative Success as a Learning Signal for Interactive Language Models Lennart Stöpler et.al. 2505.05970 null
2025-05-08 Mogao: An Omni Foundation Model for Interleaved Multi-Modal Generation Chao Liao et.al. 2505.05472 null
2025-05-08 Generating Physically Stable and Buildable LEGO Designs from Text Ava Pun et.al. 2505.05469 link
2025-05-08 StreamBridge: Turning Your Offline Video Large Language Model into a Proactive Streaming Assistant Haibo Wang et.al. 2505.05467 null
2025-05-08 ComPO: Preference Alignment via Comparison Oracles Peter Chen et.al. 2505.05465 null
2025-05-08 Bring Reason to Vision: Understanding Perception and Reasoning through Model Merging Shiqi Chen et.al. 2505.05464 link
2025-05-08 UKElectionNarratives: A Dataset of Misleading Narratives Surrounding Recent UK General Elections Fatima Haouari et.al. 2505.05459 null
2025-05-08 SITE: towards Spatial Intelligence Thorough Evaluation Wenqi Wang et.al. 2505.05456 null
2025-05-08 Conversational Process Model Redesign Nataliia Klievtsova et.al. 2505.05453 null
2025-05-08 clem:todd: A Framework for the Systematic Benchmarking of LLM-Based Task-Oriented Dialogue System Realisations Chalamalasetti Kranti et.al. 2505.05445 null
2025-05-08 GesPrompt: Leveraging Co-Speech Gestures to Augment LLM-Based Interaction in Virtual Reality Xiyun Hu et.al. 2505.05441 null
2025-05-09 EcoAgent: An Efficient Edge-Cloud Collaborative Multi-Agent Framework for Mobile Automation Biao Yi et.al. 2505.05440 null
2025-05-08 Ultra-FineWeb: Efficient Data Filtering and Verification for High-Quality LLM Training Data Yudong Wang et.al. 2505.05427 null
2025-05-09 LiTransProQA: an LLM-based Literary Translation evaluation metric with Professional Question Answering Ran Zhang et.al. 2505.05423 link
2025-05-08 Crosslingual Reasoning through Test-Time Scaling Zheng-Xin Yong et.al. 2505.05408 link
2025-05-08 Frame In, Frame Out: Do LLMs Generate More Biased News Headlines than Humans? Valeria Pastorino et.al. 2505.05406 null
2025-05-08 A Pain Assessment Framework based on multimodal data and Deep Machine Learning methods Stefanos Gkikas et.al. 2505.05396 null
2025-05-08 DSDrive: Distilling Large Language Model for Lightweight End-to-End Autonomous Driving with Unified Reasoning and Planning Wenru Liu et.al. 2505.05360 null
2025-05-08 Hearing and Seeing Through CLIP: A Framework for Self-Supervised Sound Source Localization Sooyoung Park et.al. 2505.05343 link
2025-05-08 FLAM: Frame-Wise Language-Audio Modeling Yusong Wu et.al. 2505.05335 null
2025-05-08 ICon: In-Context Contribution for Automatic Data Selection Yixin Yang et.al. 2505.05327 null
2025-05-07 EchoInk-R1: Exploring Audio-Visual Reasoning in Multimodal LLMs via Reinforcement Learning Zhenghao Xing et.al. 2505.04623 link
2025-05-07 On Path to Multimodal Generalist: General-Level and General-Bench Hao Fei et.al. 2505.04620 null
2025-05-07 OmniGIRL: A Multilingual and Multimodal Benchmark for GitHub Issue Resolution Lianghong Guo et.al. 2505.04606 link
2025-05-07 OpenVision: A Fully-Open, Cost-Effective Family of Advanced Vision Encoders for Multimodal Learning Xianhang Li et.al. 2505.04601 null
2025-05-08 MonoCoP: Chain-of-Prediction for Monocular 3D Object Detection Zhihao Zhang et.al. 2505.04594 null
2025-05-07 ZeroSearch: Incentivize the Search Capability of LLMs without Searching Hao Sun et.al. 2505.04588 link
2025-05-07 SlideItRight: Using AI to Find Relevant Slides and Provide Feedback for Open-Ended Questions Chloe Qianhui Zhao et.al. 2505.04584 link
2025-05-07 Fight Fire with Fire: Defending Against Malicious RL Fine-Tuning via Reward Neutralization Wenjun Cao et.al. 2505.04578 null
2025-05-07 Communication-Efficient Federated Fine-Tuning of Language Models via Dynamic Update Schedules Michail Theologitis et.al. 2505.04535 link
2025-05-07 Overcoming Data Scarcity in Generative Language Modelling for Low-Resource Languages: A Systematic Review Josh McGiff et.al. 2505.04531 null
2025-05-07 Comparative Analysis of Carbon Footprint in Manual vs. LLM-Assisted Code Development Kuen Sum Cheung et.al. 2505.04521 null
2025-05-07 Pangu Ultra MoE: How to Train Your Big MoE on Ascend NPUs Yehui Tang et.al. 2505.04519 null
2025-05-07 "I Can See Forever!": Evaluating Real-time VideoLLMs for Assisting Individuals with Visual Impairments Ziyi Zhang et.al. 2505.04488 null
2025-05-07 CAD-Llama: Leveraging Large Language Models for Computer-Aided Design Parametric 3D Model Generation Jiahao Li et.al. 2505.04481 null
2025-05-07 TrajEvo: Designing Trajectory Prediction Heuristics via LLM-driven Evolution Zhikai Zhao et.al. 2505.04480 link
2025-05-07 Miipher-2: A Universal Speech Restoration Model for Million-Hour Scale Data Restoration Shigeki Karita et.al. 2505.04457 link
2025-05-07 M2Rec: Multi-scale Mamba for Efficient Sequential Recommendation Qianru Zhang et.al. 2505.04445 null
2025-05-07 Towards Effectively Leveraging Execution Traces for Program Repair with Code LLMs Mirazul Haque et.al. 2505.04441 null
2025-05-07 OBLIVIATE: Robust and Practical Machine Unlearning for Large Language Models Xiaoyu Xu et.al. 2505.04416 null
2025-05-07 DeCLIP: Decoupled Learning for Open-Vocabulary Dense Perception Junjie Wang et.al. 2505.04410 link
2025-05-06 VITA-Audio: Fast Interleaved Cross-Modal Token Generation for Efficient Large Speech-Language Model Zuwei Long et.al. 2505.03739 link
2025-05-06 Decentralized Nonconvex Optimization under Heavy-Tailed Noise: Normalization and Optimal Convergence Shuhua Yu et.al. 2505.03736 null
2025-05-06 Meta-Optimization and Program Search using Language Models for Task and Motion Planning Denis Shcherba et.al. 2505.03725 null
2025-05-06 Fill the Gap: Quantifying and Reducing the Modality Gap in Image-Text Representation Learning François Role et.al. 2505.03703 null
2025-05-06 Fairness of Automatic Speech Recognition in Cleft Lip and Palate Speech Susmita Bhattacharjee et.al. 2505.03697 null
2025-05-06 Graph Drawing for LLMs: An Empirical Evaluation Walter Didimo et.al. 2505.03678 null
2025-05-06 Distribution-Conditional Generation: From Class Distribution to Creative Generation Fu Feng et.al. 2505.03667 null
2025-05-06 Binding threshold units with artificial oscillatory neurons Vladimir Fanaskov et.al. 2505.03648 link
2025-05-06 PhysLLM: Harnessing Large Language Models for Cross-Modal Remote Physiological Sensing Yiping Xie et.al. 2505.03621 null
2025-05-06 Learning Unknown Spoof Prompts for Generalized Face Anti-Spoofing Using Only Real Face Images Fangling Jiang et.al. 2505.03611 null
2025-05-06 Learning Knowledge-based Prompts for Robust 3D Mask Presentation Attack Detection Fangling Jiang et.al. 2505.03610 null
2025-05-06 DyGEnc: Encoding a Sequence of Textual Scene Graphs to Reason and Answer Questions in Dynamic Scenes Sergey Linok et.al. 2505.03581 link
2025-05-06 LlamaFirewall: An open source guardrail system for building secure AI agents Sahana Chennabasappa et.al. 2505.03574 null
2025-05-06 Say It Another Way: A Framework for User-Grounded Paraphrasing Cléa Chataigner et.al. 2505.03563 null
2025-05-06 A Comprehensive Survey of Large AI Models for Future Communications: Foundations, Applications and Challenges Feibo Jiang et.al. 2505.03556 link
2025-05-06 A Hashgraph-Inspired Consensus Mechanism for Reliable Multi-Model Reasoning Kolawole E. Ogunsina et.al. 2505.03553 null
2025-05-06 STORY2GAME: Generating (Almost) Everything in an Interactive Fiction Game Eric Zhou et.al. 2505.03547 null
2025-05-06 Faster MoE LLM Inference for Extremely Large Models Haoqi Yang et.al. 2505.03531 null
2025-05-06 Ruled by the Representation Space: On the University's Embrace of Large Language Models Katia Schwerzmann et.al. 2505.03513 null
2025-05-06 BadLingual: A Novel Lingual-Backdoor Attack against Large Language Models Zihan Wang et.al. 2505.03501 null
2025-05-05 Scenethesis: A Language and Vision Agentic Framework for 3D Scene Generation Lu Ling et.al. 2505.02836 null
2025-05-05 R1-Reward: Training Multimodal Reward Model Through Stable Reinforcement Learning Yi-Fan Zhang et.al. 2505.02835 link
2025-05-05 No Other Representation Component Is Needed: Diffusion Transformers Can Provide Representation Guidance by Themselves Dengyang Jiang et.al. 2505.02831 link
2025-05-05 LISAT: Language-Instructed Segmentation Assistant for Satellite Imagery Jerome Quenum et.al. 2505.02829 null
2025-05-05 ReplaceMe: Network Simplification via Layer Pruning and Linear Transformations Dmitriy Shopkhoev et.al. 2505.02819 link
2025-05-05 Knowing You Don't Know: Learning When to Continue Search in Multi-round RAG through Self-Practicing Diji Yang et.al. 2505.02811 link
2025-05-05 Towards Quantifying the Hessian Structure of Neural Networks Zhaorui Dong et.al. 2505.02809 link
2025-05-05 Generating HomeAssistant Automations Using an LLM-based Chatbot Mathyas Giudici et.al. 2505.02802 null
2025-05-05 HSplitLoRA: A Heterogeneous Split Parameter-Efficient Fine-Tuning Framework for Large Language Models Zheng Lin et.al. 2505.02795 null
2025-05-05 Giving Simulated Cells a Voice: Evolving Prompt-to-Intervention Models for Cellular Control Nam H. Le et.al. 2505.02766 null
2025-05-05 Bye-bye, Bluebook? Automating Legal Procedure with Large Language Models Matthew Dahl et.al. 2505.02763 null
2025-05-05 Using Knowledge Graphs to harvest datasets for efficient CLIP model training Simon Ging et.al. 2505.02746 link
2025-05-06 Knowledge Graphs for Enhancing Large Language Models in Entity Disambiguation Gerard Pons et.al. 2505.02737 null
2025-05-05 FormalMATH: Benchmarking Formal Mathematical Reasoning of Large Language Models Zhouliang Yu et.al. 2505.02735 link
2025-05-05 Enhancing LLMs' Clinical Reasoning with Real-World Data from a Nationwide Sepsis Registry Junu Kim et.al. 2505.02722 link
2025-05-05 Less is More: Efficient Weight Farcasting with 1-Layer Neural Network Xiao Shou et.al. 2505.02714 null
2025-05-05 Technical Report: Evaluating Goal Drift in Language Model Agents Rauno Arike et.al. 2505.02709 null
2025-05-05 Voila: Voice-Language Foundation Models for Real-Time Autonomous Interaction and Voice Role-Play Yemin Shi et.al. 2505.02707 link
2025-05-05 AI Standardized Patient Improves Human Conversations in Advanced Cancer Care Kurtis Haut et.al. 2505.02694 link
2025-05-05 Predicting Movie Hits Before They Happen with LLMs Shaghayegh Agah et.al. 2505.02693 null
2025-05-02 How Effective are Large Time Series Models in Hydrology? A Study on Water Level Forecasting in Everglades Rahuul Rangaraj et.al. 2505.01415 null
2025-05-02 Dynamic Robot Tool Use with Vision Language Models Noah Trupin et.al. 2505.01399 null
2025-05-02 FreeInsert: Disentangled Text-Guided Object Insertion in 3D Gaussian Scene without Spatial Priors Chenxi Li et.al. 2505.01322 null
2025-05-02 Helping Big Language Models Protect Themselves: An Enhanced Filtering and Summarization System Sheikh Samit Muhaimin et.al. 2505.01315 null
2025-05-02 Enhancing SPARQL Query Rewriting for Complex Ontology Alignments Anicet Lepetit Ondo et.al. 2505.01309 null
2025-05-02 Document Retrieval Augmented Fine-Tuning (DRAFT) for safety-critical software assessments Regan Bolton et.al. 2505.01307 null
2025-05-02 FlowDubber: Movie Dubbing with LLM-based Semantic-aware Learning and Flow Matching based Voice Enhancing Gaoxiang Cong et.al. 2505.01263 null
2025-05-02 Digital Pathway Curation (DPC): a comparative pipeline to assess the reproducibility, consensus and accuracy across Gemini, PubMed, and scientific reviewers in biomedical research Flavio Lichtenstein et.al. 2505.01259 null
2025-05-02 Can Foundation Models Really Segment Tumors? A Benchmarking Odyssey in Lung CT Imaging Elena Mulero Ayllón et.al. 2505.01239 null
2025-05-02 CaReAQA: A Cardiac and Respiratory Audio Question Answering Model for Open-Ended Diagnostic Reasoning Tsai-Ning Wang et.al. 2505.01199 null
2025-05-02 Gender Bias in Explainability: Investigating Performance Disparity in Post-hoc Methods Mahdi Dhaini et.al. 2505.01198 link
2025-05-02 TSTMotion: Training-free Scene-aware Text-to-motion Generation Ziyan Guo et.al. 2505.01182 null
2025-05-02 LLM Security: Vulnerabilities, Attacks, Defenses, and Countermeasures Francisco Aguilera-Martínez et.al. 2505.01177 null
2025-05-02 On the Limitations of Steering in Language Model Alignment Chebrolu Niranjan et.al. 2505.01162 null
2025-05-02 Methodological Foundations for AI-Driven Survey Question Generation Ted K. Mburu et.al. 2505.01150 null
2025-05-02 Retrieval-Augmented Generation in Biomedicine: A Survey of Technologies, Datasets, and Clinical Applications Jiawei He et.al. 2505.01146 null
2025-05-02 MateICL: Mitigating Attention Dispersion in Large-Scale In-Context Learning Murtadha Ahmed et.al. 2505.01110 null
2025-05-02 Self-Supervision Enhances Instance-based Multiple Instance Learning Methods in Digital Pathology: A Benchmark Study Ali Mammadov et.al. 2505.01109 link
2025-05-02 Nesterov Method for Asynchronous Pipeline Parallel Optimization Thalaiyasingam Ajanthan et.al. 2505.01099 link
2025-05-02 Evaluating Vision Language Model Adaptations for Radiology Report Generation in Low-Resource Languages Marco Salmè et.al. 2505.01096 null
2025-05-01 T2I-R1: Reinforcing Image Generation with Collaborative Semantic-level and Token-level CoT Dongzhi Jiang et.al. 2505.00703 link
2025-05-01 Robotic Visual Instruction Yanbang Li et.al. 2505.00693 null
2025-05-01 Visual Test-time Scaling for GUI Agent Grounding Tiange Luo et.al. 2505.00684 link
2025-05-01 Steering Large Language Models with Register Analysis for Arbitrary Style Transfer Xinchen Yang et.al. 2505.00679 null
2025-05-01 Rethinking Memory in AI: Taxonomy, Operations, Topics, and Future Directions Yiming Du et.al. 2505.00675 link
2025-05-01 DeepCritic: Deliberate Critique with Large Language Models Wenkai Yang et.al. 2505.00662 link
2025-05-01 On the generalization of language models from in-context learning and finetuning: a controlled study Andrew K. Lampinen et.al. 2505.00661 null
2025-05-01 Large Language Models Understanding: an Inherent Ambiguity Barrier Daniel N. Nissani et.al. 2505.00654 null
2025-05-01 Open-Source LLM-Driven Federated Transformer for Predictive IoV Management Yazan Otoum et.al. 2505.00651 null
2025-05-01 Investigating Task Arithmetic for Zero-Shot Information Retrieval Marco Braga et.al. 2505.00649 link
2025-05-01 Brain Foundation Models with Hypergraph Dynamic Adapter for Brain Disease Analysis Zhongying Deng et.al. 2505.00627 null
2025-05-01 The Illusion of Role Separation: Hidden Shortcuts in LLM Role Learning (and How to Fix Them) Zihao Wang et.al. 2505.00626 null
2025-05-01 FineScope: Precision Pruning for Domain-Specialized Large Language Models Using SAE-Guided Self-Data Cultivation Chaitali Bhattacharyya et.al. 2505.00624 null
2025-05-01 Pixel3DMM: Versatile Screen-Space Priors for Single-Image 3D Face Reconstruction Simon Giebenhain et.al. 2505.00615 null
2025-05-01 Combining LLMs with Logic-Based Framework to Explain MCTS Ziyan An et.al. 2505.00610 null
2025-05-01 Can LLMs Help Improve Analogical Reasoning For Strategic Decisions? Experimental Evidence from Humans and GPT-4 Phanish Puranam et.al. 2505.00603 null
2025-05-02 Fast and Low-Cost Genomic Foundation Models via Outlier Removal Haozheng Luo et.al. 2505.00598 link
2025-05-01 Block Circulant Adapter for Large Language Models Xinyu Ding et.al. 2505.00582 null
2025-05-01 Parameter-Efficient Fine-Tuning with Circulant and Diagonal Vectors Xinyu Ding et.al. 2505.00580 null
2025-05-01 FreqKV: Frequency Domain Key-Value Compression for Efficient Context Window Extension Jushi Kai et.al. 2505.00570 null
2025-04-30 TRUST: An LLM-Based Dialogue System for Trauma Understanding and Structured Assessments Sichang Tu et.al. 2504.21851 null
2025-04-30 COMPACT: COMPositional Atomic-to-Complex Visual Capability Tuning Xindi Wu et.al. 2504.21850 null
2025-04-30 Early Exit and Multi Stage Knowledge Distillation in VLMs for Video Summarization Anas Anwarul Haq Khan et.al. 2504.21831 null
2025-04-30 Why Compress What You Can Generate? When GPT-4o Generation Ushers in Image Compression Fields Yixin Gao et.al. 2504.21814 null
2025-04-30 A simple and effective approach for body part recognition on CT scans based on projection estimation Franko Hrzic et.al. 2504.21810 null
2025-04-30 An Empirical Study on the Effectiveness of Large Language Models for Binary Code Understanding Xiuwei Shang et.al. 2504.21803 null
2025-04-30 DeepSeek-Prover-V2: Advancing Formal Mathematical Reasoning via Reinforcement Learning for Subgoal Decomposition Z. Z. Ren et.al. 2504.21801 link
2025-04-30 SWE-smith: Scaling Data for Software Engineering Agents John Yang et.al. 2504.21798 null
2025-04-30 MAC-Tuning: LLM Multi-Compositional Problem Reasoning with Enhanced Knowledge Boundary Awareness Junsheng Huang et.al. 2504.21773 null
2025-04-30 LASHED: LLMs And Static Hardware Analysis for Early Detection of RTL Bugs Baleegh Ahmad et.al. 2504.21770 null
2025-04-30 LLM-based Interactive Imitation Learning for Robotic Manipulation Jonas Werner et.al. 2504.21769 link
2025-04-30 Investigating Literary Motifs in Ancient and Medieval Novels with Large Language Models Emelie Hallenberg et.al. 2504.21742 null
2025-04-30 TheraQuest: A Gamified, LLM-Powered Simulation for Massage Therapy Training Shengqian Wang et.al. 2504.21735 null
2025-04-30 XBreaking: Explainable Artificial Intelligence for Jailbreaking LLMs Marco Arazzi et.al. 2504.21700 null
2025-04-30 Visual Text Processing: A Comprehensive Review and Unified Evaluation Yan Shu et.al. 2504.21682 link
2025-04-30 Hoist with His Own Petard: Inducing Guardrails to Facilitate Denial-of-Service Attacks on Retrieval-Augmented Generation of LLMs Pan Suo et.al. 2504.21680 null
2025-04-30 Traceback of Poisoning Attacks to Retrieval-Augmented Generation Baolei Zhang et.al. 2504.21668 null
2025-04-30 From Precision to Perception: User-Centred Evaluation of Keyword Extraction Algorithms for Internet-Scale Contextual Advertising Jingwen Cai et.al. 2504.21667 null
2025-04-30 AdaR1: From Long-CoT to Hybrid-CoT via Bi-Level Adaptive Reasoning Optimization Haotian Luo et.al. 2504.21659 link
2025-04-30 Sadeed: Advancing Arabic Diacritization Through Small Language Model Zeina Aldallal et.al. 2504.21635 null
2025-04-29 Toward Efficient Exploration by Large Language Model Agents Dilip Arumugam et.al. 2504.20997 null
2025-04-29 X-Fusion: Introducing New Modality to Frozen Large Language Models Sicheng Mo et.al. 2504.20996 null
2025-04-29 ACE: A Security Architecture for LLM-Integrated App Systems Evan Li et.al. 2504.20984 null
2025-04-29 Real-Time Wayfinding Assistant for Blind and Low-Vision Users Dabbrata Das et.al. 2504.20976 null
2025-04-29 SetKE: Knowledge Editing for Knowledge Elements Overlap Yifan Wei et.al. 2504.20972 null
2025-04-29 OSVBench: Benchmarking LLMs on Specification Generation Tasks for Operating System Verification Shangyu Li et.al. 2504.20964 link
2025-04-29 Information Gravity: A Field-Theoretic Model for Token Selection in Large Language Models Maryna Vyshnyvetska et.al. 2504.20951 null
2025-04-29 Trace-of-Thought: Enhanced Arithmetic Problem Solving via Reasoning Distillation From Large to Small Language Models Tyler McDonald et.al. 2504.20946 null
2025-04-29 ChestX-Reasoner: Advancing Radiology Foundation Models with Reasoning through Step-by-Step Verification Ziqing Fan et.al. 2504.20930 link
2025-04-29 An Empirical Study on the Capability of LLMs in Decomposing Bug Reports Zhiyuan Chen et.al. 2504.20911 null
2025-04-29 Classifier-to-Bias: Toward Unsupervised Automatic Bias Detection for Visual Classifiers Quentin Guimard et.al. 2504.20902 null
2025-04-29 LELANTE: LEveraging LLM for Automated ANdroid TEsting Shamit Fatin et.al. 2504.20896 null
2025-04-29 FedMVP: Federated Multi-modal Visual Prompt Tuning for Vision-Language Models Mainak Singha et.al. 2504.20860 null
2025-04-29 X-Cross: Dynamic Integration of Language Models for Cross-Domain Sequential Recommendation Guy Hadad et.al. 2504.20859 null
2025-04-29 JaccDiv: A Metric and Benchmark for Quantifying Diversity of Generated Marketing Text in the Music Industry Anum Afzal et.al. 2504.20849 null
2025-04-29 Language Model for Large-Text Transmission in Noisy Quantum Communications Yuqi Li et.al. 2504.20842 null
2025-04-29 Universal language model with the intervention of quantum theory D. -F. Qin et.al. 2504.20839 null
2025-04-29 Enhancing Non-Core Language Instruction-Following in Speech LLMs via Semi-Implicit Cross-Lingual CoT Reasoning Hongfei Xue et.al. 2504.20835 null
2025-04-29 Reinforcement Learning for LLM Reasoning Under Memory Constraints Alan Lee et.al. 2504.20834 null
2025-04-30 Ascendra: Dynamic Request Prioritization for Efficient LLM Serving Azam Ikram et.al. 2504.20828 null
2025-04-28 Learning Streaming Video Representation via Multitask Training Yibin Yan et.al. 2504.20041 null
2025-04-28 AutoJudge: Judge Decoding Without Manual Annotation Roman Garipov et.al. 2504.20039 null
2025-04-28 SpatialReasoner: Towards Explicit and Generalizable 3D Spatial Reasoning Wufei Ma et.al. 2504.20024 null
2025-04-28 Better To Ask in English? Evaluating Factual Accuracy of Multilingual LLMs in English and Low-Resource Languages Pritika Rohera et.al. 2504.20022 null
2025-04-28 Modular Machine Learning: An Indispensable Path towards New-Generation Large Language Models Xin Wang et.al. 2504.20020 null
2025-04-28 LLM-Generated Fake News Induces Truth Decay in News Ecosystem: A Case Study on Neural News Recommendation Beizhe Hu et.al. 2504.20013 null
2025-04-28 Towards Automated Scoping of AI for Social Good Projects Jacob Emmerson et.al. 2504.20010 null
2025-04-28 Knowledge Distillation of Domain-adapted LLMs for Question-Answering in Telecom Rishika Sen et.al. 2504.20000 null
2025-04-28 HJRNO: Hamilton-Jacobi Reachability with Neural Operators Yankai Li et.al. 2504.19989 null
2025-04-28 TD-EVAL: Revisiting Task-Oriented Dialogue Evaluation by Combining Turn-Level Precision with Dialogue-Level Comparisons Emre Can Acikgoz et.al. 2504.19982 null
2025-04-28 Accurate and Diverse LLM Mathematical Reasoning via Automated PRM-Guided GFlowNets Adam Younsi et.al. 2504.19981 null
2025-04-29 From Concept to Practice: an Automated LLM-aided UVM Machine for RTL Verification Junhao Ye et.al. 2504.19959 null
2025-04-28 Enhancing Surgical Documentation through Multimodal Visual-Temporal Transformers and Generative AI Hugo Georgenthum et.al. 2504.19918 null
2025-04-28 Can AI Agents Design and Implement Drug Discovery Pipelines? Khachik Smbatyan et.al. 2504.19912 null
2025-04-28 GenCLS++: Pushing the Boundaries of Generative Classification in LLMs Through Comprehensive SFT and RL Studies Across Diverse Datasets Mingqian He et.al. 2504.19898 null
2025-04-28 CineVerse: Consistent Keyframe Synthesis for Cinematic Scene Composition Quynh Phung et.al. 2504.19894 null
2025-04-28 semi-PD: Towards Efficient LLM Serving via Phase-Wise Disaggregated Computation and Unified Storage Ke Hong et.al. 2504.19867 null
2025-04-28 CoherenDream: Boosting Holistic Text Coherence in 3D Generation via Multimodal Large Language Models Feedback Chenhan Jiang et.al. 2504.19860 null
2025-04-28 Efficient Domain-adaptive Continual Pretraining for the Process Industry in the German Language Anastasia Zhukova et.al. 2504.19856 null
2025-04-29 The Automation Advantage in AI Red Teaming Rob Mulla et.al. 2504.19855 null
2025-04-25 Generalization Capability for Imitation Learning Yixiao Wang et.al. 2504.18538 null
2025-04-25 TRACE Back from the Future: A Probabilistic Reasoning Approach to Controllable Language Generation Gwen Yidou Weng et.al. 2504.18535 null
2025-04-25 Eval3D: Interpretable and Fine-grained Evaluation for 3D Generation Shivam Duggal et.al. 2504.18509 null
2025-04-25 Investigating Co-Constructive Behavior of Large Language Models in Explanation Dialogues Leandra Fichtel et.al. 2504.18483 null
2025-04-25 Generative Induction of Dialogue Task Schemas with Streaming Refinement and Simulated Interactions James D. Finch et.al. 2504.18474 null
2025-04-25 Fast-Slow Thinking for Large Vision-Language Model Reasoning Wenyi Xiao et.al. 2504.18458 null
2025-04-25 Pseudo-Asynchronous Local SGD: Robust and Efficient Data-Parallel Training Hiroki Naganuma et.al. 2504.18454 null
2025-04-25 Reason Like a Radiologist: Chain-of-Thought and Reinforcement Learning for Verifiable Report Generation Peiyuan Jing et.al. 2504.18453 null
2025-04-25 Kimi-Audio Technical Report KimiTeam et.al. 2504.18425 link
2025-04-25 LLMpatronous: Harnessing the Power of LLMs For Vulnerability Detection Rajesh Yarra et.al. 2504.18423 null
2025-04-25 BitNet v2: Native 4-bit Activations with Hadamard Transformation for 1-bit LLMs Hongyu Wang et.al. 2504.18415 null
2025-04-25 An Empirical Study of Evaluating Long-form Question Answering Ning Xian et.al. 2504.18413 link
2025-04-25 Expressing stigma and inappropriate responses prevents LLMs from safely replacing mental health providers Jared Moore et.al. 2504.18412 link
2025-04-25 HRScene: How Far Are VLMs from Effective High-Resolution Image Understanding? Yusen Zhang et.al. 2504.18406 null
2025-04-25 Unsupervised Visual Chain-of-Thought Reasoning via Preference Optimization Kesen Zhao et.al. 2504.18397 link
2025-04-25 Bridge the Domains: Large Language Models Enhanced Cross-domain Sequential Recommendation Qidong Liu et.al. 2504.18383 null
2025-04-25 Pushing the boundary on Natural Language Inference Pablo Miralles-González et.al. 2504.18376 null
2025-04-25 Auto-SLURP: A Benchmark Dataset for Evaluating Multi-Agent Frameworks in Smart Personal Assistant Lei Shen et.al. 2504.18373 link
2025-04-25 ThreMoLIA: Threat Modeling of Large Language Model-Integrated Applications Felix Viktor Jedrzejewski et.al. 2504.18369 null
2025-04-25 Testing Individual Fairness in Graph Neural Networks Roya Nasiri et.al. 2504.18353 null
2025-04-24 Token-Shuffle: Towards High-Resolution Image Generation with Autoregressive Models Xu Ma et.al. 2504.17789 null
2025-04-24 Replay to Remember: Retaining Domain Knowledge in Streaming Language Models Sneh Pillai et.al. 2504.17780 null
2025-04-24 Conversational Assistants to support Heart Failure Patients: comparing a Neurosymbolic Architecture with ChatGPT Anuja Tayal et.al. 2504.17753 null
2025-04-24 Towards Robust LLMs: an Adversarial Robustness Measurement Framework Natan Levy et.al. 2504.17723 null
2025-04-24 Multilingual Performance Biases of Large Language Models in Education Vansh Gupta et.al. 2504.17720 null
2025-04-24 PICO: Reconstructing 3D People In Contact with Objects Alpár Cseke et.al. 2504.17695 null
2025-04-24 Ensemble Bayesian Inference: Leveraging Small Language Models to Achieve LLM-level Accuracy in Profile Matching Tasks Haru-Tada Sato et.al. 2504.17685 null
2025-04-24 INSIGHT: Bridging the Student-Teacher Gap in Times of Large Language Models Jarne Thys et.al. 2504.17677 null
2025-04-24 Energy Considerations of Large Language Model Inference and Efficiency Optimizations Jared Fernandez et.al. 2504.17674 null
2025-04-24 Cross-region Model Training with Communication-Computation Overlapping and Delay Compensation Ying Zhu et.al. 2504.17672 null
2025-04-25 Data-Driven Calibration of Prediction Sets in Large Vision-Language Models Based on Inductive Conformal Prediction Yuanchang Ye et.al. 2504.17671 null
2025-04-24 Towards a HIPAA Compliant Agentic AI System in Healthcare Subash Neupane et.al. 2504.17669 null
2025-04-24 Evaluating Grounded Reasoning by Code-Assisted Large Language Models for Mathematics Zena Al-Khalili et.al. 2504.17665 null
2025-04-24 Effortless, Simulation-Efficient Bayesian Inference using Tabular Foundation Models Julius Vetter et.al. 2504.17660 null
2025-04-24 Portability of Optimizations from SC to TSO Akshay Gopalakrishnan et.al. 2504.17646 null
2025-04-24 L3: DIMM-PIM Integrated Architecture and Coordination for Scalable Long-Context LLM Inference Qingyuan Liu et.al. 2504.17584 null
2025-04-25 DeepDistill: Enhancing LLM Reasoning Capabilities via Large-Scale Difficulty-Graded Data Training Xiaoyu Tian et.al. 2504.17565 null
2025-04-24 When Does Metadata Conditioning (NOT) Work for Language Model Pre-Training? A Study with Context-Free Grammars Rei Higuchi et.al. 2504.17562 null
2025-04-24 HalluLens: LLM Hallucination Benchmark Yejin Bang et.al. 2504.17550 null
2025-04-24 A Comprehensive Survey of Knowledge-Based Vision Question Answering Systems: The Lifecycle of Knowledge in Visual Reasoning Task Jiaqi Deng et.al. 2504.17547 null
2025-04-23 Generalized Neighborhood Attention: Multi-dimensional Sparse Attention at the Speed of Light Ali Hassani et.al. 2504.16922 link
2025-04-23 IberBench: LLM Evaluation on Iberian Languages José Ángel González et.al. 2504.16921 null
2025-04-23 Tracing Thought: Using Chain-of-Thought Reasoning to Identify the LLM Behind AI-Generated Text Shifali Agrahari et.al. 2504.16913 null
2025-04-23 Do Large Language Models know who did what to whom? Joseph M. Denning et.al. 2504.16884 null
2025-04-23 Enhancing Critical Thinking with AI: A Tailored Warning System for RAG Models Xuyang Zhu et.al. 2504.16883 null
2025-04-23 Context-Enhanced Vulnerability Detection Based on Large Language Model Yixin Yang et.al. 2504.16877 null
2025-04-23 Exploring How LLMs Capture and Represent Domain-Specific Knowledge Mirian Hipolito Garcia et.al. 2504.16871 null
2025-04-23 Common Functional Decompositions Can Mis-attribute Differences in Outcomes Between Populations Manuel Quintero et.al. 2504.16864 null
2025-04-23 Planning with Diffusion Models for Target-Oriented Dialogue Systems Hanwen Du et.al. 2504.16858 null
2025-04-23 Emo Pillars: Knowledge Distillation to Support Fine-Grained Context-Aware and Context-Less Emotion Classification Alexander Shvets et.al. 2504.16856 null
2025-04-23 Monte Carlo Planning with Large Language Model for Text-Based Game Agents Zijing Shi et.al. 2504.16855 null
2025-04-23 Improving Significant Wave Height Prediction Using Chronos Models Yilin Zhai et.al. 2504.16834 null
2025-04-23 LRASGen: LLM-based RESTful API Specification Generation Sida Deng et.al. 2504.16833 null
2025-04-23 GreenMind: A Next-Generation Vietnamese Large Language Model for Structured and Logical Reasoning Luu Quy Tung et.al. 2504.16832 null
2025-04-23 Decoupled Global-Local Alignment for Improving Compositional Understanding Xiaoxing Hu et.al. 2504.16801 null
2025-04-23 MOOSComp: Improving Lightweight Long-Context Compressor via Mitigating Over-Smoothing and Incorporating Outlier Scores Fengwei Zhou et.al. 2504.16786 null
2025-04-23 Graph2Nav: 3D Object-Relation Graph Generation to Robot Navigation Tixiao Shan et.al. 2504.16782 null
2025-04-23 How Effective are Generative Large Language Models in Performing Requirements Classification? Waad Alhoshan et.al. 2504.16768 null
2025-04-23 Lightweight Latent Verifiers for Efficient Meta-Generation Strategies Bartosz Piotrowski et.al. 2504.16760 null
2025-04-23 HEMA: A Hippocampus-Inspired Extended Memory Architecture for Long-Context AI Conversations Kwangseob Ahn et.al. 2504.16754 null
2025-04-22 TTRL: Test-Time Reinforcement Learning Yuxin Zuo et.al. 2504.16084 link
2025-04-22 MMInference: Accelerating Pre-filling for Long-Context VLMs via Modality-Aware Permutation Sparse Attention Yucheng Li et.al. 2504.16083 null
2025-04-22 MR. Video: "MapReduce" is the Principle for Long Video Understanding Ziqi Pang et.al. 2504.16082 null
2025-04-22 From Reflection to Perfection: Scaling Inference-Time Optimization for Text-to-Image Diffusion Models via Reflection Tuning Le Zhuo et.al. 2504.16080 null
2025-04-22 LLMs are Greedy Agents: Effects of RL Fine-tuning on Decision-Making Abilities Thomas Schmied et.al. 2504.16078 null
2025-04-22 PHYBench: Holistic Evaluation of Physical Perception and Reasoning in Large Language Models Shi Qiu et.al. 2504.16074 null
2025-04-22 Guiding VLM Agents with Process Rewards at Inference Time for GUI Navigation Zhiyuan Hu et.al. 2504.16073 null
2025-04-22 Describe Anything: Detailed Localized Image and Video Captioning Long Lian et.al. 2504.16072 null
2025-04-22 A Python Tool for Reconstructing Full News Text from GDELT A. Fronzetti Colladon et.al. 2504.16063 link
2025-04-22 Vision language models are unreliable at trivial spatial cognition Sangeet Khemlani et.al. 2504.16061 null
2025-04-22 Vision-Language Models Are Not Pragmatically Competent in Referring Expression Generation Ziqiao Ma et.al. 2504.16060 link
2025-04-22 Automated Static Vulnerability Detection via a Holistic Neuro-symbolic Approach Penghui Li et.al. 2504.16057 null
2025-04-22 Honey, I Shrunk the Language Model: Impact of Knowledge Distillation Methods on Performance and Explainability Daniel Hendriks et.al. 2504.16056 null
2025-04-22 LongMamba: Enhancing Mamba's Long Context Capabilities via Training-Free Receptive Field Enlargement Zhifan Ye et.al. 2504.16053 link
2025-04-22 Evaluating Vision Language Models (VLMs) for Radiology: A Comprehensive Analysis Frank Li et.al. 2504.16047 null
2025-04-22 Certified Mitigation of Worst-Case LLM Copyright Infringement Jingyu Zhang et.al. 2504.16046 null
2025-04-22 LLMs meet Federated Learning for Scalable and Secure IoT Management Yazan Otoum et.al. 2504.16032 null
2025-04-22 LiveCC: Learning Video LLM with Streaming Speech Transcription at Scale Joya Chen et.al. 2504.16030 null
2025-04-22 Benchmarking LLM for Code Smells Detection: OpenAI GPT-4.0 vs DeepSeek-V3 Ahmed R. Sadik et.al. 2504.16027 null
2025-04-22 Efficient Temporal Consistency in Diffusion-Based Video Editing with Adaptor Modules: A Theoretical Framework Xinyuan Song et.al. 2504.16016 null
2025-04-21 Seeing from Another Perspective: Evaluating Multi-View Understanding in MLLMs Chun-Hsiao Yeh et.al. 2504.15280 link
2025-04-21 VisuLogic: A Benchmark for Evaluating Visual Reasoning in Multi-modal Large Language Models Weiye Xu et.al. 2504.15279 null
2025-04-21 Stop Summation: Min-Form Credit Assignment Is All Process Reward Model Needs for Reasoning Jie Cheng et.al. 2504.15275 link
2025-04-21 Eagle 2.5: Boosting Long-Context Post-Training for Frontier Vision-Language Models Guo Chen et.al. 2504.15271 null
2025-04-21 Roll the dice & look before you leap: Going beyond the creative limits of next-token prediction Vaishnavh Nagarajan et.al. 2504.15266 link
2025-04-21 Interpretable Locomotion Prediction in Construction Using a Memory-Driven LLM Agent With Chain-of-Thought Reasoning Ehsan Ahmadi et.al. 2504.15263 null
2025-04-21 Leveraging Language Models for Automated Patient Record Linkage Mohammad Beheshti et.al. 2504.15261 null
2025-04-21 CRUST-Bench: A Comprehensive Benchmark for C-to-safe-Rust Transpilation Anirudh Khatry et.al. 2504.15254 link
2025-04-21 Evaluating Judges as Evaluators: The JETTS Benchmark of LLM-as-Judges as Test-Time Scaling Evaluators Yilun Zhou et.al. 2504.15253 link
2025-04-21 MR. Guard: Multilingual Reasoning Guardrail using Curriculum Learning Yahan Yang et.al. 2504.15241 null
2025-04-21 Values in the Wild: Discovering and Analyzing Values in Real-World Language Model Interactions Saffron Huang et.al. 2504.15236 null
2025-04-21 A Self-Improving Coding Agent Maxime Robeyns et.al. 2504.15228 null
2025-04-21 EvalAgent: Discovering Implicit Evaluation Criteria from the Web Manya Wadhwa et.al. 2504.15219 null
2025-04-21 Integrating Symbolic Execution into the Fine-Tuning of Code-Generating LLMs Marina Sakharova et.al. 2504.15210 null
2025-04-21 Compute-Optimal LLMs Provably Generalize Better With Scale Marc Finzi et.al. 2504.15208 null
2025-04-21 Support Evaluation for the TREC 2024 RAG Track: Comparing Human versus LLM Judges Nandan Thakur et.al. 2504.15205 null
2025-04-22 Synergistic Weak-Strong Collaboration by Aligning Preferences Yizhu Jiao et.al. 2504.15188 null
2025-04-21 DSPO: Direct Semantic Preference Optimization for Real-World Image Super-Resolution Miaomiao Cai et.al. 2504.15176 null
2025-04-21 The Synthetic Imputation Approach: Generating Optimal Synthetic Texts For Underrepresented Categories In Supervised Classification Tasks Joan C. Timoneda et.al. 2504.15160 null
2025-04-21 KGMEL: Knowledge Graph-Enhanced Multimodal Entity Linking Juyeon Kim et.al. 2504.15135 link
2025-04-18 Generative AI Act II: Test Time Scaling Drives Cognition Engineering Shijie Xia et.al. 2504.13828 link
2025-04-18 Feature Alignment and Representation Transfer in Knowledge Distillation for Large Language Models Junjie Yang et.al. 2504.13825 null
2025-04-18 CheXWorld: Exploring Image World Modeling for Radiograph Representation Learning Yang Yue et.al. 2504.13820 link
2025-04-18 Not All Rollouts are Useful: Down-Sampling Rollouts in LLM Reinforcement Learning Yixuan Even Xu et.al. 2504.13818 null
2025-04-18 BadApex: Backdoor Attack Based on Adaptive Optimization Mechanism of Black-box Large Language Models Zhengxian Wu et.al. 2504.13775 null
2025-04-18 DP2Unlearning: An Efficient and Guaranteed Unlearning Framework for LLMs Tamim Al Mahmud et.al. 2504.13774 link
2025-04-18 Detecting Malicious Source Code in PyPI Packages with LLMs: Does RAG Come in Handy? Motunrayo Ibiyo et.al. 2504.13769 null
2025-04-18 Decoding Vision Transformers: the Diffusion Steering Lens Ryota Takatsuki et.al. 2504.13763 link
2025-04-18 Scaling sparse feature circuit finding for in-context learning Dmitrii Kharlapenko et.al. 2504.13756 null
2025-04-18 Learning to Attribute with Attention Benjamin Cohen-Wang et.al. 2504.13752 link
2025-04-18 Controlled Territory and Conflict Tracking (CONTACT): (Geo-)Mapping Occupied Territory from Open Source Intelligence Paul K. Mandal et.al. 2504.13730 link
2025-04-18 OpenDeception: Benchmarking and Investigating AI Deceptive Behaviors via Open-ended Interaction Simulation Yichen Wu et.al. 2504.13707 null
2025-04-18 Exploring Multimodal Prompt for Visualization Authoring with Large Language Models Zhen Wen et.al. 2504.13700 null
2025-04-18 Analysing the Robustness of Vision-Language-Models to Common Corruptions Muhammad Usama et.al. 2504.13690 null
2025-04-18 Intelligent Interaction Strategies for Context-Aware Cognitive Augmentation Xiangrong et.al. 2504.13684 null
2025-04-18 Revisiting Uncertainty Quantification Evaluation in Language Models: Spurious Interactions with Response Length Bias Results Andrea Santilli et.al. 2504.13677 null
2025-04-18 Large Language Models Will Change The Way Children Think About Technology And Impact Every Interaction Paradigm Russell Beale et.al. 2504.13667 null
2025-04-18 Do Prompt Patterns Affect Code Quality? A First Empirical Assessment of ChatGPT-Generated Code Antonio Della Porta et.al. 2504.13656 null
2025-04-18 EyecareGPT: Boosting Comprehensive Ophthalmology Understanding with Tailored Dataset, Benchmark and Model Sijing Li et.al. 2504.13650 link
2025-04-18 Exploring the Potential for Large Language Models to Demonstrate Rational Probabilistic Beliefs Gabriel Freedman et.al. 2504.13644 link
2025-04-17 Perception Encoder: The best visual embeddings are not at the output of the network Daniel Bolya et.al. 2504.13181 null
2025-04-17 PerceptionLM: Open-Access Data and Models for Detailed Visual Understanding Jang Hyun Cho et.al. 2504.13180 link
2025-04-17 It's All Connected: A Journey Through Test-Time Memorization, Attentional Bias, Retention, and Online Optimization Ali Behrouz et.al. 2504.13173 null
2025-04-17 Sleep-time Compute: Beyond Inference Scaling at Test-time Kevin Lin et.al. 2504.13171 link
2025-04-17 Generate, but Verify: Reducing Hallucination in Vision-Language Models with Retrospective Resampling Tsung-Han Wu et.al. 2504.13169 link
2025-04-17 CLIMB: CLustering-based Iterative Data Mixture Bootstrapping for Language Model Pre-training Shizhe Diao et.al. 2504.13161 null
2025-04-17 Digital Twin Generation from Visual Data: A Survey Andrew Melnik et.al. 2504.13159 link
2025-04-17 MIB: A Mechanistic Interpretability Benchmark Aaron Mueller et.al. 2504.13151 link
2025-04-17 Exploring Expert Failures Improves LLM Agent Tuning Li-Cheng Lan et.al. 2504.13145 null
2025-04-17 Syntactic and Semantic Control of Large Language Models via Sequential Monte Carlo João Loula et.al. 2504.13139 null
2025-04-17 Energy-Based Reward Models for Robust Language Model Alignment Anamika Lochab et.al. 2504.13134 link
2025-04-17 LLMs Meet Finance: Fine-Tuning Foundation Models for the Open FinLLM Leaderboard Varun Rao et.al. 2504.13125 null
2025-04-17 Low-hallucination Synthetic Captions for Large-Scale Vision-Language Model Pre-training Xinsong Zhang et.al. 2504.13123 null
2025-04-17 VistaDPO: Video Hierarchical Spatial-Temporal Direct Preference Optimization for Large Video Models Haojian Huang et.al. 2504.13122 link
2025-04-17 Probing and Inducing Combinational Creativity in Vision-Language Models Yongqian Peng et.al. 2504.13120 null
2025-04-17 Object-Driven Narrative in AR: A Scenario-Metaphor Framework with VLM Integration Yusi Sun et.al. 2504.13119 null
2025-04-17 Uncertainty-Aware Trajectory Prediction via Rule-Regularized Heteroscedastic Deep Classification Kumar Manas et.al. 2504.13111 link
2025-04-17 EventVAD: Training-Free Event-Aware Video Anomaly Detection Yihua Shao et.al. 2504.13092 null
2025-04-17 Retrieval-Augmented Generation with Conflicting Evidence Han Wang et.al. 2504.13079 link
2025-04-18 SkyReels-V2: Infinite-length Film Generative Model Guibin Chen et.al. 2504.13074 link
2025-04-16 BitNet b1.58 2B4T Technical Report Shuming Ma et.al. 2504.12285 null
2025-04-16 HLS-Eval: A Benchmark and Framework for Evaluating LLMs on High-Level Synthesis Design Tasks Stefan Abi-Karam et.al. 2504.12268 link
2025-04-16 FLIP Reasoning Challenge Andreas Plesner et.al. 2504.12256 link
2025-04-16 AnomalyGen: An Automated Semantic Log Sequence Generation Framework with LLM for Anomaly Detection Xinyu Li et.al. 2504.12250 null
2025-04-16 MOS: Towards Effective Smart Contract Vulnerability Detection through Mixture-of-Experts Tuning of Large Language Models Hang Yuan et.al. 2504.12234 null
2025-04-16 Watermarking Needs Input Repetition Masking David Khachaturov et.al. 2504.12229 null
2025-04-16 d1: Scaling Reasoning in Diffusion Large Language Models via Reinforcement Learning Siyan Zhao et.al. 2504.12216 null
2025-04-16 What Do Large Language Models Know? Tacit Knowledge as a Potential Causal-Explanatory Structure Céline Budding et.al. 2504.12187 null
2025-04-16 SALAD: Improving Robustness and Generalization through Contrastive Learning with Structure-Aware and LLM-Driven Augmented Data Suyoung Bae et.al. 2504.12185 null
2025-04-16 Trusting CHATGPT: how minor tweaks in the prompts lead to major differences in sentiment classification Jaime E. Cuellar et.al. 2504.12180 null
2025-04-16 Multilingual Contextualization of Large Language Models for Document-Level Machine Translation Miguel Moura Ramos et.al. 2504.12140 null
2025-04-16 Efficient Contrastive Decoding with Probabilistic Hallucination Detection - Mitigating Hallucinations in Large Vision Language Models - Laura Fieback et.al. 2504.12137 null
2025-04-16 Clarifying Ambiguities: on the Role of Ambiguity Types in Prompting Methods for Clarification Generation Anfu Tang et.al. 2504.12113 null
2025-04-16 Entropy-Guided Watermarking for LLMs: A Test-Time Framework for Robust and Traceable Text Generation Shizhan Cai et.al. 2504.12108 null
2025-04-16 Logits DeConfusion with CLIP for Few-Shot Learning Shuo Li et.al. 2504.12104 link
2025-04-16 Gauging Overprecision in LLMs: An Empirical Study Adil Bahaj et.al. 2504.12098 null
2025-04-16 Reasoning-Based AI for Startup Evaluation (R.A.I.S.E.): A Memory-Augmented, Multi-Step Decision Framework Jack Preuveneers et.al. 2504.12090 null
2025-04-16 Self-alignment of Large Video Language Models with Refined Regularized Preference Optimization Pritam Sarkar et.al. 2504.12083 null
2025-04-16 Selective Demonstration Retrieval for Improved Implicit Hate Speech Detection Yumin Kim et.al. 2504.12082 null
2025-04-16 Subitizing-Inspired Large Language Models for Floorplanning Shao-Chien Lu et.al. 2504.12076 null
2025-04-16 Elucidating the Design Space of Multimodal Protein Language Models Cheng-Yen Hsieh et.al. 2504.11454 null
2025-04-15 TextArena Leon Guertler et.al. 2504.11442 link
2025-04-15 Masculine Defaults via Gendered Discourse in Podcasts and Large Language Models Maria Teleki et.al. 2504.11431 link
2025-04-15 A Dual-Space Framework for General Knowledge Distillation of Large Language Models Xue Zhang et.al. 2504.11426 null
2025-04-15 Reinforcing Compositional Retrieval: Retrieving Step-by-Step for Composing Informative Contexts Quanyu Long et.al. 2504.11420 null
2025-04-15 Efficient Hybrid Language Model Compression through Group-Aware SSM Pruning Ali Taghibakhshi et.al. 2504.11409 null
2025-04-15 DataDecide: How to Predict Best Pretraining Data with Small Experiments Ian Magnusson et.al. 2504.11393 null
2025-04-15 RankAlign: A Ranking View of the Generator-Validator Gap in Large Language Models Juan Diego Rodriguez et.al. 2504.11381 link
2025-04-15 Cancer-Myth: Evaluating AI Chatbot on Patient Questions with False Presuppositions Wang Bill Zhu et.al. 2504.11373 link
2025-04-15 OpenTuringBench: An Open-Model-based Benchmark and Framework for Machine-Generated Text Detection and Attribution Lucio La Cava et.al. 2504.11369 null
2025-04-15 From Gaze to Insight: Bridging Human Visual Attention and Vision Language Model Explanation for Weakly-Supervised Medical Image Segmentation Jingkun Chen et.al. 2504.11368 null
2025-04-15 Teaching Large Language Models to Reason through Learning and Forgetting Tianwei Ni et.al. 2504.11364 link
2025-04-15 Kimina-Prover Preview: Towards Large Formal Reasoning Models with Reinforcement Learning Haiming Wang et.al. 2504.11354 link
2025-04-15 Seedream 3.0 Technical Report Yu Gao et.al. 2504.11346 null
2025-04-15 A Minimalist Approach to LLM Reasoning: from Rejection Sampling to Reinforce Wei Xiong et.al. 2504.11343 link
2025-04-15 REWARD CONSISTENCY: Improving Multi-Objective Alignment from a Data-Centric Perspective Zhihao Xu et.al. 2504.11337 null
2025-04-15 Looking beyond the next token Abitha Thankaraj et.al. 2504.11336 null
2025-04-15 Optimizing LLM Inference: Fluid-Guided Online Scheduling with Memory Constraints Ruicheng Ao et.al. 2504.11320 link
2025-04-15 Learning to Be A Doctor: Searching for Effective Medical Agent Architectures Yangyang Zhuang et.al. 2504.11301 null
2025-04-15 Automated Python Translation Joshua Otten et.al. 2504.11290 null
2025-04-14 InternVL3: Exploring Advanced Training and Test-Time Recipes for Open-Source Multimodal Models Jinguo Zhu et.al. 2504.10479 link
2025-04-14 Weight Ensembling Improves Reasoning in Language Models Xingyu Dang et.al. 2504.10478 null
2025-04-14 MIEB: Massive Image Embedding Benchmark Chenghao Xiao et.al. 2504.10471 link
2025-04-14 Pixel-SAIL: Single Transformer For Pixel-Grounded Understanding Tao Zhang et.al. 2504.10465 link
2025-04-14 The Scalability of Simplicity: Empirical Analysis of Vision-Language Learning with a Single Transformer Weixian Lei et.al. 2504.10462 link
2025-04-14 GUI-R1: A Generalist R1-Style Vision-Language Action Model For GUI Agents Xiaobo Xia et.al. 2504.10458 null
2025-04-14 M1: Towards Scalable Test-Time Compute with Mamba Reasoning Models Junxiong Wang et.al. 2504.10449 link
2025-04-14 Multimodal Long Video Modeling Based on Temporal Dynamic Context Haoran Hao et.al. 2504.10443 link
2025-04-14 LLM Can be a Dangerous Persuader: Empirical Study of Persuasion Safety in Large Language Models Minqian Liu et.al. 2504.10430 null
2025-04-14 Foundation models for electronic health records: representation dynamics and transferability Michael C. Burkhart et.al. 2504.10422 link
2025-04-14 Can We Edit LLMs for Long-Tail Biomedical Knowledge? Xinhao Yi et.al. 2504.10421 link
2025-04-15 Unchecked and Overlooked: Addressing the Checkbox Blind Spot in Large Language Models with CheckboxQA Michał Turski et.al. 2504.10419 link
2025-04-14 CliniChat: A Multi-Source Knowledge-Driven Framework for Clinical Interview Dialogue Reconstruction and Evaluation Jing Chen et.al. 2504.10418 null
2025-04-14 LLM-SRBench: A New Benchmark for Scientific Equation Discovery with Large Language Models Parshin Shojaee et.al. 2504.10415 link
2025-04-14 Performance of Large Language Models in Supporting Medical Diagnosis and Treatment Diogo Sousa et.al. 2504.10405 null
2025-04-14 Satellite Federated Fine-Tuning for Foundation Models in Space Computing Power Networks Yan zhu et.al. 2504.10403 null
2025-04-14 Can LLMs Assist Expert Elicitation for Probabilistic Causal Modeling? Olha Shaposhnyk et.al. 2504.10397 null
2025-04-14 SymRTLO: Enhancing RTL Code Optimization with LLMs and Neuron-Inspired Symbolic Reasoning Yiting Wang et.al. 2504.10369 null
2025-04-14 DICE: A Framework for Dimensional and Contextual Evaluation of Language Models Aryan Shrivastava et.al. 2504.10359 null
2025-04-14 Pseudo-Autoregressive Neural Codec Language Models for Efficient Zero-Shot Text-to-Speech Synthesis Yifan Yang et.al. 2504.10352 null
2025-04-11 Quantum Large Language Model Fine-Tuning Sang Hyub Kim et.al. 2504.08732 null
2025-04-11 DocAgent: A Multi-Agent System for Automated Code Documentation Generation Dayu Yang et.al. 2504.08725 link
2025-04-11 SWAN-GPT: An Efficient and Scalable Approach for Long-Context Language Modeling Krishna C. Puvvada et.al. 2504.08719 null
2025-04-11 SWE-PolyBench: A multi-language benchmark for repository level evaluation of coding agents Muhammad Shihab Rashid et.al. 2504.08703 link
2025-04-11 Large Language Models as Span Annotators Zdeněk Kasner et.al. 2504.08697 null
2025-04-11 TP-RAG: Benchmarking Retrieval-Augmented Large Language Model Agents for Spatiotemporal-Aware Travel Planning Hang Ni et.al. 2504.08694 null
2025-04-11 Fast-Slow-Thinking: Complex Task Solving with Large Language Models Yiliu Sun et.al. 2504.08690 null
2025-04-11 Voice Interaction With Conversational AI Could Facilitate Thoughtful Reflection and Substantive Revision in Writing Jiho Kim et.al. 2504.08687 null
2025-04-11 Seaweed-7B: Cost-Effective Training of Video Generation Foundation Model Team Seawead et.al. 2504.08685 null
2025-04-11 Variability-Driven User-Story Generation using LLM and Triadic Concept Analysis Alexandre Bazin et.al. 2504.08666 null
2025-04-11 Quality evaluation of Tabby coding assistant using real source code snippets Marta Borek et.al. 2504.08650 link
2025-04-11 Do LLMs trust AI regulation? Emerging behaviour of game-theoretic LLM agents Alessio Buscemi et.al. 2504.08640 null
2025-04-11 Latent Diffusion Autoencoders: Toward Efficient and Meaningful Unsupervised Representation Learning in Medical Imaging Gabriele Lozupone et.al. 2504.08635 link
2025-04-11 MooseAgent: A LLM Based Multi-agent Framework for Automating Moose Simulation Tao Zhang et.al. 2504.08621 link
2025-04-11 Analyzing 16,193 LLM Papers for Fun and Profits Zhiqiu Xia et.al. 2504.08619 null
2025-04-11 Playpen: An Environment for Exploring Learning Through Conversational Interaction Nicola Horst et.al. 2504.08590 link
2025-04-11 AstroLLaVA: towards the unification of astronomical data and natural language Sharaf Zaman et.al. 2504.08583 null
2025-04-11 UoB-NLP at SemEval-2025 Task 11: Leveraging Adapters for Multilingual and Cross-Lingual Emotion Detection Frances Laureano De Leon et.al. 2504.08543 null
2025-04-11 Embodied Image Captioning: Self-supervised Learning Agents for Spatially Coherent Image Descriptions Tommaso Galliena et.al. 2504.08531 null
2025-04-11 On The Landscape of Spoken Language Models: A Comprehensive Survey Siddhant Arora et.al. 2504.08528 null
2025-04-10 Cat, Rat, Meow: On the Alignment of Language Model and Human Term-Similarity Judgments Lorenz Linhardt et.al. 2504.07965 null
2025-04-10 C3PO: Critical-Layer, Core-Expert, Collaborative Pathway Optimization for Test-Time Expert Re-Mixing Zhongyang Li et.al. 2504.07964 link
2025-04-10 GLUS: Global-Local Reasoning Unified into A Single Large Language Model for Video Segmentation Lang Lin et.al. 2504.07962 null
2025-04-10 Detect Anything 3D in the Wild Hanxue Zhang et.al. 2504.07958 link
2025-04-10 MM-IFEngine: Towards Multimodal Instruction Following Shengyuan Ding et.al. 2504.07957 link
2025-04-10 VCR-Bench: A Comprehensive Evaluation Framework for Video Chain-of-Thought Reasoning Yukun Qi et.al. 2504.07956 null
2025-04-10 Dynamic Cheatsheet: Test-Time Learning with Adaptive Memory Mirac Suzgun et.al. 2504.07952 link
2025-04-10 We Are All Creators: Generative AI, Collective Knowledge, and the Path Towards Human-AI Synergy Jordi Linares-Pellicer et.al. 2504.07936 null
2025-04-10 Echo Chamber: RL Post-training Amplifies Behaviors Learned in Pretraining Rosie Zhao et.al. 2504.07912 link
2025-04-10 Porting an LLM based Application from ChatGPT to an On-Premise Environment Teemu Paloniemi et.al. 2504.07907 null
2025-04-10 Redefining Machine Translation on Social Network Services with Large Language Models Hongcheng Guo et.al. 2504.07901 link
2025-04-10 How do Large Language Models Understand Relevance? A Mechanistic Interpretability Perspective Qi Liu et.al. 2504.07898 link
2025-04-10 Fast Adaptation with Behavioral Foundation Models Harshit Sikchi et.al. 2504.07896 null
2025-04-10 Benchmarking Adversarial Robustness to Bias Elicitation in Large Language Models: Scalable Automated Assessment with LLM-as-a-Judge Riccardo Cantini et.al. 2504.07887 link
2025-04-11 An LLM-Driven Multi-Agent Debate System for Mendelian Diseases Xinyang Zhou et.al. 2504.07881 null
2025-04-10 Token Level Routing Inference System for Edge Devices Jianshu She et.al. 2504.07878 null
2025-04-10 SAMJAM: Zero-Shot Video Scene Graph Generation for Egocentric Kitchen Videos Joshua Li et.al. 2504.07867 null
2025-04-11 Pangu Ultra: Pushing the Limits of Dense Large Language Models on Ascend NPUs Yichun Yin et.al. 2504.07866 null
2025-04-10 Robust Hallucination Detection in LLMs via Adaptive Token Selection Mengjia Niu et.al. 2504.07863 null
2025-04-10 2D-Curri-DPO: Two-Dimensional Curriculum Learning for Direct Preference Optimization Mengyang Li et.al. 2504.07856 null
2025-04-09 Sculpting Subspaces: Constrained Full Fine-Tuning in LLMs for Continual Learning Nikhil Shivakumar Nayak et.al. 2504.07097 link
2025-04-09 OLMoTrace: Tracing Language Model Outputs Back to Trillions of Training Tokens Jiacheng Liu et.al. 2504.07096 null
2025-04-09 Are We Done with Object-Centric Learning? Alexander Rubinstein et.al. 2504.07092 link
2025-04-09 KG-LLM-Bench: A Scalable Benchmark for Evaluating LLM Reasoning on Textualized Knowledge Graphs Elan Markowitz et.al. 2504.07087 null
2025-04-09 A Sober Look at Progress in Language Model Reasoning: Pitfalls and Paths to Reproducibility Andreas Hochlehnert et.al. 2504.07086 null
2025-04-09 Self-Steering Language Models Gabriel Grand et.al. 2504.07081 null
2025-04-09 DeduCE: Deductive Consistency as a Framework to Evaluate LLM Reasoning Atharva Pandey et.al. 2504.07080 null
2025-04-09 Kaleidoscope: In-language Exams for Massively Multilingual Vision Evaluation Israfel Salazar et.al. 2504.07072 null
2025-04-09 A Survey on Personalized and Pluralistic Preference Alignment in Large Language Models Zhouhang Xie et.al. 2504.07070 null
2025-04-09 HalluciNot: Hallucination Detection Through Context and Common Knowledge Verification Bibek Paudel et.al. 2504.07069 null
2025-04-09 Teaching pathology foundation models to accurately predict gene expression with parameter efficient knowledge transfer Shi Pan et.al. 2504.07061 null
2025-04-09 TASTE: Text-Aligned Speech Tokenization and Embedding for Spoken Language Modeling Liang-Hsuan Tseng et.al. 2504.07053 link
2025-04-09 To Backtrack or Not to Backtrack: When Sequential Search Limits Model Reasoning Tian Qin et.al. 2504.07052 null
2025-04-09 Evaluating Retrieval Augmented Generative Models for Document Queries in Transportation Safety Chad Melton et.al. 2504.07022 null
2025-04-09 LLM-IFT: LLM-Powered Information Flow Tracking for Secure Hardware Nowfel Mashnoor et.al. 2504.07015 null
2025-04-09 Towards LLMs Robustness to Changes in Prompt Format Styles Lilian Ngweta et.al. 2504.06969 null
2025-04-09 Efficient Self-Supervised Learning for Earth Observation via Dynamic Dataset Curation Thomas Kerdreux et.al. 2504.06962 null
2025-04-09 VideoChat-R1: Enhancing Spatio-Temporal Perception via Reinforcement Fine-Tuning Xinhao Li et.al. 2504.06958 null
2025-04-09 Adaptive Computation Pruning for the Forgetting Transformer Zhixuan Lin et.al. 2504.06949 null
2025-04-09 RuOpinionNE-2024: Extraction of Opinion Tuples from Russian News Texts Natalia Loukachevitch et.al. 2504.06947 link
2025-04-08 GOLLuM: Gaussian Process Optimized LLMs -- Reframing LLM Finetuning through Bayesian Optimization Bojana Ranković et.al. 2504.06265 link
2025-04-08 OmniSVG: A Unified Scalable Vector Graphics Generation Model Yiying Yang et.al. 2504.06263 null
2025-04-08 Hogwild! Inference: Parallel LLM Generation via Concurrent Attention Gleb Rodionov et.al. 2504.06261 link
2025-04-08 FEABench: Evaluating Language Models on Multiphysics Reasoning Ability Nayantara Mudur et.al. 2504.06260 link
2025-04-08 Orb-v3: atomistic simulation at scale Benjamin Rhodes et.al. 2504.06231 link
2025-04-08 LExT: Towards Evaluating Trustworthiness of Natural Language Explanations Krithi Shailya et.al. 2504.06227 null
2025-04-08 Encoder-Decoder Gemma: Improving the Quality-Efficiency Trade-Off via Adaptation Biao Zhang et.al. 2504.06225 null
2025-04-09 Earth-Adapter: Bridge the Geospatial Domain Gaps with Mixture of Frequency Adaptation Xiaoxing Hu et.al. 2504.06220 link
2025-04-08 Can Performant LLMs Be Ethical? Quantifying the Impact of Web Crawling Opt-Outs Dongyang Fan et.al. 2504.06219 null
2025-04-08 From 128K to 4M: Efficient Training of Ultra-Long Context Large Language Models Chejian Xu et.al. 2504.06214 null
2025-04-08 TxGemma: Efficient and Agentic LLMs for Therapeutics Eric Wang et.al. 2504.06196 null
2025-04-08 A Self-Supervised Framework for Space Object Behaviour Characterisation Ian Groves et.al. 2504.06176 null
2025-04-08 Assessing how hyperparameters impact Large Language Models' sarcasm detection performance Montgomery Gole et.al. 2504.06166 null
2025-04-09 Navigating the Rabbit Hole: Emergent Biases in LLM-Generated Attack Narratives Targeting Mental Health Groups Rijul Magu et.al. 2504.06160 null
2025-04-08 A Large-Scale Analysis on Contextual Self-Supervised Video Representation Learning Akash Kumar et.al. 2504.06153 null
2025-04-08 V-MAGE: A Game Evaluation Framework for Assessing Visual-Centric Capabilities in Multimodal Large Language Models Xiangxi Zheng et.al. 2504.06148 link
2025-04-08 ARLO: A Tailorable Approach for Transforming Natural Language Software Requirements into Architecture using LLMs Tooraj Helmi et.al. 2504.06143 null
2025-04-08 Adversarial Training of Reward Models Alexander Bukharin et.al. 2504.06141 null
2025-04-08 A Multimedia Analytics Model for the Foundation Model Era Marcel Worring et.al. 2504.06138 null
2025-04-08 QGen Studio: An Adaptive Question-Answer Generation, Training and Evaluation Platform Movina Moses et.al. 2504.06136 null
2025-04-07 URECA: Unique Region Caption Anything Sangbeom Lim et.al. 2504.05305 null
2025-04-07 InteractVLM: 3D Interaction Reasoning from 2D Foundational Models Sai Kumar Dwivedi et.al. 2504.05303 link
2025-04-07 SmolVLM: Redefining small and efficient multimodal models Andrés Marafioti et.al. 2504.05299 null
2025-04-07 Truthful or Fabricated? Using Causal Attribution to Mitigate Reward Hacking in Explanations Pedro Ferreira et.al. 2504.05294 null
2025-04-07 The challenge of uncertainty quantification of large language models in medicine Zahra Atf et.al. 2504.05278 null
2025-04-07 Enhancing LLM-Based Short Answer Grading with Retrieval-Augmented Generation Yucheng Chu et.al. 2504.05276 null
2025-04-07 Do PhD-level LLMs Truly Grasp Elementary Addition? Probing Rule Learning vs. Memorization in Large Language Models Yang Yan et.al. 2504.05262 null
2025-04-07 Learning to Reason Over Time: Timeline Self-Reflection for Improved Temporal Reasoning in Language Models Adrián Bazaga et.al. 2504.05258 null
2025-04-07 Explaining Low Perception Model Competency with High-Competency Counterfactuals Sara Pohland et.al. 2504.05254 null
2025-04-07 LLM-based Automated Grading with Human-in-the-Loop Hang Li et.al. 2504.05239 null
2025-04-07 NoveltyBench: Evaluating Creativity and Diversity in Language Models Yiming Zhang et.al. 2504.05228 null
2025-04-07 A Reality Check of Vision-Language Pre-training in Radiology: Have We Progressed Using Text? Julio Silva-Rodríguez et.al. 2504.05227 null
2025-04-07 Vision-Language Model Predictive Control for Manipulation Planning and Trajectory Generation Jiaming Chen et.al. 2504.05225 link
2025-04-08 Leveraging LLMs for Utility-Focused Annotation: Reducing Manual Effort for Retrieval and RAG Hengran Zhang et.al. 2504.05220 null
2025-04-07 Unleashing the Power of LLMs in Dense Retrieval with Query Likelihood Modeling Hengran Zhang et.al. 2504.05216 null
2025-04-07 Post-Training Language Models for Continual Relation Extraction Sefika Efeoglu et.al. 2504.05214 null
2025-04-07 Quantum Program Linting with LLMs: Emerging Results from a Comparative Study Seung Yeob Shin et.al. 2504.05204 null
2025-04-07 Training state-of-the-art pathology foundation models with orders of magnitude less data Mikhail Karasikov et.al. 2504.05186 null
2025-04-07 Concise Reasoning via Reinforcement Learning Mehdi Fatemi et.al. 2504.05185 link
2025-04-07 BRIDGES: Bridging Graph Modality and Large Language Models within EDA Tasks Wei Li et.al. 2504.05180 null
2025-04-04 Shape My Moves: Text-Driven Shape-Aware Synthesis of Human Motions Ting-Hsuan Liao et.al. 2504.03639 null
2025-04-04 Do Larger Language Models Imply Better Reasoning? A Pretraining Scaling Law for Reasoning Xinyi Wang et.al. 2504.03635 null
2025-04-04 Align to Structure: Aligning Large Language Models with Structural Information Zae Myung Kim et.al. 2504.03622 null
2025-04-04 VISTA-OCR: Towards generative and interactive end to end OCR models Laziz Hamdi et.al. 2504.03621 null
2025-04-04 Multilingual Retrieval-Augmented Generation for Knowledge-Intensive Task Leonardo Ranaldi et.al. 2504.03616 null
2025-04-04 AIR: A Systematic Analysis of Annotations, Instructions, and Response Pairs in Preference Dataset Bingxiang He et.al. 2504.03612 null
2025-04-04 MedSAM2: Segment Anything in 3D Medical Images and Videos Jun Ma et.al. 2504.03600 link
2025-04-04 EnrichIndex: Using LLMs to Enrich Retrieval Indices Offline Peter Baile Chen et.al. 2504.03598 null
2025-04-04 PF3Det: A Prompted Foundation Feature Assisted Visual LiDAR 3D Detector Kaidong Li et.al. 2504.03563 null
2025-04-04 Agentic Knowledgeable Self-awareness Shuofei Qiao et.al. 2504.03553 link
2025-04-04 RANa: Retrieval-Augmented Navigation Gianluca Monaci et.al. 2504.03524 null
2025-04-04 Neutralizing the Narrative: AI-Powered Debiasing of Online News Articles Chen Wei Kuo et.al. 2504.03520 null
2025-04-04 SpectR: Dynamically Composing LM Experts with Spectral Routing William Fleshman et.al. 2504.03454 null
2025-04-04 Optimizing Specific and Shared Parameters for Efficient Parameter Tuning Van-Anh Nguyen et.al. 2504.03450 null
2025-04-04 LLMSched: Uncertainty-Aware Workload Scheduling for Compound LLM Applications Botao Zhu et.al. 2504.03444 null
2025-04-04 Know What You do Not Know: Verbalized Uncertainty Estimation Robustness on Corrupted Images in Vision-Language Models Mirko Borszukovszki et.al. 2504.03440 null
2025-04-04 Locations of Characters in Narratives: Andersen and Persuasion Datasets Batuhan Ozyurt et.al. 2504.03434 link
2025-04-04 Online Difficulty Filtering for Reasoning Oriented Reinforcement Learning Sanghwan Bae et.al. 2504.03380 null
2025-04-04 MultiClear: Multimodal Soft Exoskeleton Glove for Transparent Object Grasping Assistance Chen Hu et.al. 2504.03379 null
2025-04-04 Sustainable LLM Inference for Edge AI: Evaluating Quantized LLMs for Energy Efficiency, Output Accuracy, and Inference Latency Erik Johannes Husom et.al. 2504.03360 null
2025-04-03 STING-BEE: Towards Vision-Language Model for Real-World X-ray Baggage Security Inspection Divya Velayudhan et.al. 2504.02823 null
2025-04-03 Sparse Autoencoders Learn Monosemantic Features in Vision-Language Models Mateusz Pach et.al. 2504.02821 link
2025-04-03 Generative Evaluation of Complex Reasoning in Large Language Models Haowei Lin et.al. 2504.02810 link
2025-04-03 MegaMath: Pushing the Limits of Open Math Corpora Fan Zhou et.al. 2504.02807 link
2025-04-03 F-ViTA: Foundation Model Guided Visible to Thermal Translation Jay N. Paranjape et.al. 2504.02801 link
2025-04-04 A Survey of Large Language Models in Mental Health Disorder Detection on Social Media Zhuohan Ge et.al. 2504.02800 null
2025-04-03 Systematic Evaluation of Large Vision-Language Models for Surgical Artificial Intelligence Anita Rau et.al. 2504.02799 null
2025-04-03 A Framework for Situating Innovations, Opportunities, and Challenges in Advancing Vertical Systems with Large AI Models Gaurav Verma et.al. 2504.02793 null
2025-04-03 Unified World Models: Coupling Video and Action Diffusion for Pretraining on Large Robotic Datasets Chuning Zhu et.al. 2504.02792 null
2025-04-03 A Framework for Robust Cognitive Evaluation of LLMs Karin de Langis et.al. 2504.02789 null
2025-04-03 From Consumption to Collaboration: Measuring Interaction Patterns to Augment Human Cognition in Open-Ended Tasks Joshua Holstein et.al. 2504.02780 null
2025-04-03 BT-ACTION: A Test-Driven Approach for Modular Understanding of User Instruction Leveraging Behaviour Trees and LLMs Alexander Leszczynski et.al. 2504.02779 link
2025-04-03 How Deep Do Large Language Models Internalize Scientific Literature and Citation Practices? Andres Algaba et.al. 2504.02767 link
2025-04-03 Robot-Led Vision Language Model Wellbeing Assessment of Children Nida Itrat Abbasi et.al. 2504.02765 null
2025-04-03 Enhancing LLM Robustness to Perturbed Instructions: An Empirical Study Aryan Agrawal et.al. 2504.02733 link
2025-04-04 Why do LLMs attend to the first token? Federico Barbero et.al. 2504.02732 null
2025-04-03 ERPO: Advancing Safety Alignment via Ex-Ante Reasoning Preference Optimization Kehua Feng et.al. 2504.02725 null
2025-04-03 TeleMoM: Consensus-Driven Telecom Intelligence via Mixture of Models Xinquan Wang et.al. 2504.02712 null
2025-04-03 The Hidden Space of Safety: Understanding Preference-Tuned LLMs in Multilingual context Nikhil Verma et.al. 2504.02708 null
2025-04-03 LLM for Complex Reasoning Task: An Exploratory Study in Fermi Problems Zishuo Liu et.al. 2504.02671 null
2025-04-02 Slot-Level Robotic Placement via Visual Imitation from Single Human Video Dandan Shan et.al. 2504.01959 null
2025-04-02 Towards Unified Referring Expression Segmentation Across Omni-Level Visual Target Granularities Jing Liu et.al. 2504.01954 null
2025-04-02 The LLM Wears Prada: Analysing Gender Bias and Stereotypes through Online Shopping Data Massimiliano Luca et.al. 2504.01951 null
2025-04-02 Efficient Federated Learning Tiny Language Models for Mobile Network Feature Prediction Daniel Becking et.al. 2504.01947 null
2025-04-02 OpenCodeReasoning: Advancing Data Distillation for Competitive Coding Wasi Uddin Ahmad et.al. 2504.01943 null
2025-04-02 Critical Thinking: Which Kinds of Complexity Govern Optimal Reasoning Length? Celine Lee et.al. 2504.01935 link
2025-04-02 A thorough benchmark of automatic text classification: From traditional approaches to large language models Washington Cunha et.al. 2504.01930 link
2025-04-02 Gen-C: Populating Virtual Worlds with Generative Crowds Andreas Panayiotou et.al. 2504.01924 null
2025-04-02 Is Less Really More? Fake News Detection with Limited Information Zhaoyang Cao et.al. 2504.01922 link
2025-04-02 Bridging the Linguistic Divide: A Survey on Leveraging Large Language Models for Machine Translation Baban Gain et.al. 2504.01919 null
2025-04-02 FineLIP: Extending CLIP's Reach via Fine-Grained Alignment with Longer Text Inputs Mothilal Asokan et.al. 2504.01916 link
2025-04-02 Advancing AI-Scientist Understanding: Making LLM Think Like a Physicist with Interpretable Reasoning Yinggan Xu et.al. 2504.01911 null
2025-04-02 Is Temporal Prompting All We Need For Limited Labeled Action Recognition? Shreyank N Gowda et.al. 2504.01890 null
2025-04-02 TransientTables: Evaluating LLMs' Reasoning on Temporally Evolving Semi-structured Tables Abhilash Shankarampeta et.al. 2504.01879 null
2025-04-02 From Code Generation to Software Testing: AI Copilot with Context-Based RAG Yuchen Wang et.al. 2504.01866 null
2025-04-02 Cross-Lingual Consistency: A Novel Inference Framework for Advancing Reasoning in Large Language Models Zhiwei Yu et.al. 2504.01857 null
2025-04-02 Code Red! On the Harmfulness of Applying Off-the-shelf Large Language Models to Programming Tasks Ali Al-Kaswan et.al. 2504.01850 null
2025-04-02 LARGE: Legal Retrieval Augmented Generation Evaluation Tool Minhu Park et.al. 2504.01840 link
2025-04-02 Prompting Medical Vision-Language Models to Mitigate Diagnosis Bias by Generating Realistic Dermoscopic Images Nusrat Munia et.al. 2504.01838 link
2025-04-02 YourBench: Easy Custom Evaluation Sets for Everyone Sumuk Shashidhar et.al. 2504.01833 link
2025-03-31 Any2Caption: Interpreting Any Condition to Caption for Controllable Video Generation Shengqiong Wu et.al. 2503.24379 null
2025-03-31 ACPBench Hard: Unrestrained Reasoning about Action, Change, and Planning Harsha Kokel et.al. 2503.24378 null
2025-03-31 Harnessing the Reasoning Economy: A Survey of Efficient Reasoning for Large Language Models Rui Wang et.al. 2503.24377 link
2025-03-31 Exploring the Effect of Reinforcement Learning on Video Understanding: Insights from SEED-Bench-R1 Yi Chen et.al. 2503.24376 link
2025-03-31 Effectively Controlling Reasoning Models through Thinking Intervention Tong Wu et.al. 2503.24370 null
2025-03-31 Adapting Vision Foundation Models for Real-time Ultrasound Image Segmentation Xiaoran Zhang et.al. 2503.24368 null
2025-03-31 ORAL: Prompting Your Large-Scale LoRAs via Conditional Recurrent Diffusion Rana Muhammad Shahroz Khan et.al. 2503.24354 null
2025-03-31 PathOrchestra: A Comprehensive Foundation Model for Computational Pathology with Over 100 Diverse Clinical-Grade Tasks Fang Yan et.al. 2503.24345 null
2025-03-31 Can Test-Time Scaling Improve World Foundation Model? Wenyan Cong et.al. 2503.24320 link
2025-03-31 BEATS: Bias Evaluation and Assessment Test Suite for Large Language Models Alok Abhishek et.al. 2503.24310 null
2025-03-31 A Systematic Evaluation of LLM Strategies for Mental Health Text Analysis: Fine-tuning vs. Prompt Engineering vs. RAG Arshia Kermani et.al. 2503.24307 null
2025-03-31 Rec-R1: Bridging Generative Large Language Models and User-Centric Recommendation Systems via Reinforcement Learning Jiacheng Lin et.al. 2503.24289 link
2025-03-31 Style Quantization for Data-Efficient GAN Training Jian Wang et.al. 2503.24282 null
2025-03-31 Evaluating and Designing Sparse Autoencoders by Approximating Quasi-Orthogonality Sewoong Lee et.al. 2503.24277 link
2025-03-31 Enhancing Large Language Models (LLMs) for Telecommunications using Knowledge Graphs and Retrieval-Augmented Generation Dun Yuan et.al. 2503.24245 null
2025-03-31 What, How, Where, and How Well? A Survey on Test-Time Scaling in Large Language Models Qiyuan Zhang et.al. 2503.24235 link
2025-03-31 Synthetic News Generation for Fake News Classification Abdul Sittar et.al. 2503.24206 null
2025-03-31 TwT: Thinking without Tokens by Habitual Reasoning Distillation with Multi-Teachers' Guidance Jingxian Xu et.al. 2503.24198 null
2025-03-31 Text2Tracks: Prompt-based Music Recommendation via Generative Retrieval Enrico Palumbo et.al. 2503.24193 null
2025-03-31 Output Constraints as Attack Surface: Exploiting Structured Generation to Bypass LLM Safety Mechanisms Shuoming Zhang et.al. 2503.24191 null
2025-03-28 Q-Insight: Understanding Image Quality via Visual Reinforcement Learning Weiqi Li et.al. 2503.22679 link
2025-03-28 QuestBench: Can LLMs ask the right question to acquire information in reasoning tasks? Belinda Z. Li et.al. 2503.22674 link
2025-03-28 Exploring the Effectiveness of Multi-stage Fine-tuning for Cross-encoder Re-rankers Francesca Pezzuti et.al. 2503.22672 link
2025-03-28 Understanding Co-speech Gestures in-the-wild Sindhu B Hegde et.al. 2503.22668 null
2025-03-28 Unicorn: Text-Only Data Synthesis for Vision Language Model Training Xiaomin Yu et.al. 2503.22655 link
2025-03-28 Evaluating Multimodal Language Models as Visual Assistants for Visually Impaired Users Antonia Karamolegkou et.al. 2503.22610 null
2025-03-28 On the Alignment of Post-Publication Reviews & Bibliometric and Altmetric Impact -- A Case Study on Expert Statements from the Science Media Center Germany Dirk Tunger et.al. 2503.22594 null
2025-03-28 LLM-enabled Instance Model Generation Fengjunjie Pan et.al. 2503.22587 null
2025-03-28 Historical Ink: Exploring Large Language Models for Irony Detection in 19th-Century Spanish Kevin Cohen et.al. 2503.22585 link
2025-03-28 Beyond Vanilla Fine-Tuning: Leveraging Multistage, Multilingual, and Domain-Specific Methods for Low-Resource Machine Translation Sarubi Thillainathan et.al. 2503.22582 null
2025-03-28 Breaking Language Barriers in Visual Language Models via Multilingual Textual Regularization Iñigo Pikabea et.al. 2503.22577 null
2025-03-28 Niyama: Breaking the Silos of LLM Inference Serving Kanishk Goel et.al. 2503.22562 null
2025-03-28 Bridging the Dimensional Chasm: Uncover Layer-wise Dimensional Reduction in Transformers through Token Correlation Zhuo-Yang Song et.al. 2503.22547 null
2025-03-28 Exploiting Mixture-of-Experts Redundancy Unlocks Multimodal Generative Abilities Raman Dutt et.al. 2503.22517 null
2025-03-28 Assessing Foundation Models for Sea Ice Type Segmentation in Sentinel-1 SAR Imagery Samira Alkaee Taleghan et.al. 2503.22516 null
2025-03-28 Probabilistic Uncertain Reward Model: A Natural Generalization of Bradley-Terry Reward Model Wangtao Sun et.al. 2503.22480 null
2025-03-28 WorkTeam: Constructing Workflows from Natural Language with Multi-Agents Hanchao Liu et.al. 2503.22473 null
2025-03-28 Evaluating LLM-based Agents for Multi-Turn Conversations: A Survey Shengyue Guan et.al. 2503.22458 null
2025-03-28 Entropy-guided sequence weighting for efficient exploration in RL-based LLM fine-tuning Abdullah Vanlioglu et.al. 2503.22456 null
2025-03-28 STADE: Standard Deviation as a Pruning Metric Diego Coello de Portugal Mecke et.al. 2503.22451 link
2025-03-27 Mobile-VideoGPT: Fast and Accurate Video Understanding Language Model Abdelrahman Shaker et.al. 2503.21782 link
2025-03-27 Video-R1: Reinforcing Video Reasoning in MLLMs Kaituo Feng et.al. 2503.21776 link
2025-03-27 Stable-SCore: A Stable Registration-based Framework for 3D Shape Correspondence Haolin Liu et.al. 2503.21766 null
2025-03-27 Uni4D: Unifying Visual Foundation Models for 4D Modeling from a Single Video David Yifan Yao et.al. 2503.21761 link
2025-03-27 MemInsight: Autonomous Memory Augmentation for LLM Agents Rana Salama et.al. 2503.21760 null
2025-03-27 Fwd2Bot: LVLM Visual Token Compression with Double Forward Bottleneck Adrian Bulat et.al. 2503.21757 null
2025-03-27 GateLens: A Reasoning-Enhanced LLM Agent for Automotive Software Release Analytics Arsham Gholamzadeh Khoee et.al. 2503.21735 null
2025-03-27 Effective Skill Unlearning through Intervention and Abstention Yongce Li et.al. 2503.21730 link
2025-03-27 Collab: Controlled Decoding using Mixture of Agents for LLM Alignment Souradip Chakraborty et.al. 2503.21720 null
2025-03-27 Outlier dimensions favor frequent tokens in language model Iuri Macocco et.al. 2503.21718 null
2025-03-27 As easy as PIE: understanding when pruning causes language models to disagree Pietro Tropeano et.al. 2503.21714 link
2025-03-27 Enhancing Repository-Level Software Repair via Repository-Aware Knowledge Graphs Boyang Yang et.al. 2503.21710 null
2025-03-27 LLM-Gomoku: A Large Language Model-Based System for Strategic Gomoku with Self-Play and Reinforcement Learning Hui Wang et.al. 2503.21683 null
2025-03-27 JiraiBench: A Bilingual Benchmark for Evaluating Large Language Models' Detection of Human Self-Destructive Behavior Content in Jirai Community Yunze Xiao et.al. 2503.21679 null
2025-03-27 How do language models learn facts? Dynamics, curricula and hallucinations Nicolas Zucchet et.al. 2503.21676 null
2025-03-27 Intelligent IoT Attack Detection Design via ODLLM with Feature Ranking-based Knowledge Base Satvik Verma et.al. 2503.21674 link
2025-03-27 Model Assembly Learning with Heterogeneous Layer Weight Merging Yi-Kai Zhang et.al. 2503.21657 null
2025-03-27 UI-R1: Enhancing Action Prediction of GUI Agents by Reinforcement Learning Zhengxi Lu et.al. 2503.21620 link
2025-03-27 Leveraging Language Models for Analyzing Longitudinal Experiential Data in Education Ahatsham Hayat et.al. 2503.21617 null
2025-03-27 Evaluating book summaries from internal knowledge in Large Language Models: a cross-model and semantic consistency approach Javier Coronado-Blázquez et.al. 2503.21613 null
2025-03-26 Mobile-MMLU: A Mobile Intelligence Language Understanding Benchmark Sondos Mahmoud Bsharat et.al. 2503.20786 link
2025-03-26 Free4D: Tuning-free 4D Scene Generation with Spatial-Temporal Consistency Tianqi Liu et.al. 2503.20785 link
2025-03-26 Feature4X: Bridging Any Monocular Video to 4D Agentic AI with Versatile Gaussian Feature Fields Shijie Zhou et.al. 2503.20776 null
2025-03-26 ASGO: Adaptive Structured Gradient Optimization Kang An et.al. 2503.20762 null
2025-03-26 MCTS-RAG: Enhancing Retrieval-Augmented Generation with Monte Carlo Tree Search Yunhai Hu et.al. 2503.20757 null
2025-03-27 Reason-RFT: Reinforcement Fine-Tuning for Visual Reasoning Huajie Tan et.al. 2503.20752 null
2025-03-26 UniSTD: Towards Unified Spatio-Temporal Learning across Diverse Disciplines Chen Tang et.al. 2503.20748 null
2025-03-26 MATHGLANCE: Multimodal Large Language Models Do Not Know Where to Look in Mathematical Diagrams Yanpeng Sun et.al. 2503.20745 null
2025-03-26 Dynamic Motion Blending for Versatile Motion Editing Nan Jiang et.al. 2503.20724 null
2025-03-26 From Annotation to Adaptation: Metrics, Synthetic Data, and Aspect Extraction for Aspect-Based Sentiment Analysis with Large Language Models Nikita Neveditsin et.al. 2503.20715 null
2025-03-26 MMMORRF: Multimodal Multilingual Modularized Reciprocal Rank Fusion Saron Samuel et.al. 2503.20698 null
2025-03-26 Graph-Enhanced Model-Free Reinforcement Learning Agents for Efficient Power Grid Topological Control Eloy Anguiano Batanero et.al. 2503.20688 null
2025-03-27 Flip Learning: Weakly Supervised Erase to Segment Nodules in Breast Ultrasound Yuhao Huang et.al. 2503.20685 null
2025-03-27 Mitigating Low-Level Visual Hallucinations Requires Self-Awareness: Database, Model and Training Strategy Yinan Sun et.al. 2503.20673 null
2025-03-26 TAMA: A Human-AI Collaborative Thematic Analysis Framework Using Multi-Agent LLMs for Clinical Interviews Huimin Xu et.al. 2503.20666 null
2025-03-26 AutoRad-Lung: A Radiomic-Guided Prompting Autoregressive Vision-Language Model for Lung Nodule Malignancy Prediction Sadaf Khademi et.al. 2503.20662 null
2025-03-26 AccidentSim: Generating Physically Realistic Vehicle Collision Videos from Real-World Accident Reports Xiangwen Zhang et.al. 2503.20654 null
2025-03-26 Unlocking Efficient Long-to-Short LLM Reasoning with Model Merging Han Wu et.al. 2503.20641 link
2025-03-26 Collaborative Storytelling and LLM: A Linguistic Analysis of Automatically-Generated Role-Playing Game Sessions Alessandro Maisto et.al. 2503.20623 null
2025-03-26 IAP: Improving Continual Learning of Vision-Language Models via Instance-Aware Prompting Hao Fu et.al. 2503.20612 link
2025-03-25 SuperFlow++: Enhanced Spatiotemporal Consistency for Cross-Modal Data Pretraining Xiang Xu et.al. 2503.19912 link
2025-03-25 CoLLM: A Large Language Model for Composed Image Retrieval Chuong Huynh et.al. 2503.19910 link
2025-03-25 FullDiT: Multi-Task Video Generative Foundation Model with Full Attention Xuan Ju et.al. 2503.19907 null
2025-03-25 CAFe: Unifying Representation and Generation with Contrastive-Autoregressive Finetuning Hao Yu et.al. 2503.19900 link
2025-03-25 A Multi-Agent Framework Integrating Large Language Models and Generative AI for Accelerated Metamaterial Design Jie Tian et.al. 2503.19889 null
2025-03-25 CausalRAG: Integrating Causal Graphs into Retrieval-Augmented Generation Nengbo Wang et.al. 2503.19878 null
2025-03-25 Scaling Evaluation-time Compute with Reasoning Models as Process Evaluators Seungone Kim et.al. 2503.19877 null
2025-03-25 SLA-Awareness for AI-assisted coding Kishanthan Thangarajah et.al. 2503.19876 null
2025-03-25 Think Twice: Enhancing LLM Reasoning by Scaling Multi-round Test-time Thinking Xiaoyu Tian et.al. 2503.19855 null
2025-03-25 Towards Online Multi-Modal Social Interaction Understanding Xinpeng Li et.al. 2503.19851 link
2025-03-25 FALCONEye: Finding Answers and Localizing Content in ONE-hour-long videos with multi-modal LLMs Carlos Plou et.al. 2503.19850 null
2025-03-25 A Comparative Analysis of Word Segmentation, Part-of-Speech Tagging, and Named Entity Recognition for Historical Chinese Sources, 1900-1950 Zhao Fang et.al. 2503.19844 null
2025-03-25 FireEdit: Fine-grained Instruction-based Image Editing via Region-aware Vision Language Model Jun Zhou et.al. 2503.19839 null
2025-03-25 Domain-incremental White Blood Cell Classification with Privacy-aware Continual Learning Pratibha Kumari et.al. 2503.19819 null
2025-03-25 SeLIP: Similarity Enhanced Contrastive Language Image Pretraining for Multi-modal Head MRI Zhiyang Liu et.al. 2503.19801 null
2025-03-25 SemEval-2025 Task 9: The Food Hazard Detection Challenge Korbinian Randl et.al. 2503.19800 null
2025-03-25 PAVE: Patching and Adapting Video Large Language Models Zhuoming Liu et.al. 2503.19794 link
2025-03-25 Fine-Grained Erasure in Text-to-Image Diffusion-based Foundation Models Kartik Thakral et.al. 2503.19783 null
2025-03-25 LPOSS: Label Propagation Over Patches and Pixels for Open-vocabulary Semantic Segmentation Vladan Stojnić et.al. 2503.19777 link
2025-03-25 OpenLex3D: A New Evaluation Benchmark for Open-Vocabulary 3D Scene Representations Christina Kassab et.al. 2503.19764 null
2025-03-24 DINO in the Room: Leveraging 2D Foundation Models for 3D Segmentation Karim Abou Zeid et.al. 2503.18944 link
2025-03-24 SlowFast-LLaVA-1.5: A Family of Token-Efficient Video Large Language Models for Long-Form Video Understanding Mingze Xu et.al. 2503.18943 null
2025-03-24 Video-T1: Test-Time Scaling for Video Generation Fangfu Liu et.al. 2503.18942 null
2025-03-24 Exploring Training and Inference Scaling Laws in Generative Retrieval Hongru Cai et.al. 2503.18941 link
2025-03-24 CoMP: Continual Multimodal Pre-training for Vision Foundation Models Yitong Chen et.al. 2503.18931 link
2025-03-24 Trajectory Balance with Asynchrony: Decoupling Exploration and Learning for Fast, Scalable LLM Post-Training Brian R. Bartoldson et.al. 2503.18929 null
2025-03-24 Video SimpleQA: Towards Factuality Evaluation in Large Video Language Models Meng Cao et.al. 2503.18923 null
2025-03-24 FFN Fusion: Rethinking Sequential Computation in Large Language Models Akhiad Bercovich et.al. 2503.18908 null
2025-03-24 xKV: Cross-Layer SVD for KV-Cache Compression Chi-Chih Chang et.al. 2503.18893 link
2025-03-24 AgentDropout: Dynamic Agent Elimination for Token-Efficient and High-Performance LLM-Based Multi-Agent Collaboration Zhexuan Wang et.al. 2503.18891 link
2025-03-24 Toward building next-generation Geocoding systems: a systematic review Zhengcong Yin et.al. 2503.18888 null
2025-03-24 I Have Covered All the Bases Here: Interpreting Reasoning Features in Large Language Models via Sparse Autoencoders Andrey Galichin et.al. 2503.18878 link
2025-03-24 Efficient Self-Supervised Adaptation for Medical Image Analysis Moein Sorkhei et.al. 2503.18873 link
2025-03-24 Reimagining Memory Access for LLM Inference: Compression-Aware Memory Controller Design Rui Xie et.al. 2503.18869 null
2025-03-24 Reasoning to Learn from Latent Thoughts Yangjun Ruan et.al. 2503.18866 null
2025-03-24 Structuring Scientific Innovation: A Framework for Modeling and Discovering Impactful Knowledge Combinations Junlan Chen et.al. 2503.18865 null
2025-03-24 MC-LLaVA: Multi-Concept Personalized Vision-Language Model Ruichuan An et.al. 2503.18854 link
2025-03-24 Enhanced OoD Detection through Cross-Modal Alignment of Multi-Modal Representations Jeonghyeon Kim et.al. 2503.18817 link
2025-03-24 Defeating Prompt Injections by Design Edoardo Debenedetti et.al. 2503.18813 null
2025-03-24 SKDU at De-Factify 4.0: Vision Transformer with Data Augmentation for AI-Generated Image Detection Shrikant Malviya et.al. 2503.18812 link
2025-03-21 Dancing with Critiques: Enhancing LLM Reasoning with Stepwise Natural Language Self-Critique Yansi Li et.al. 2503.17363 null
2025-03-21 HCAST: Human-Calibrated Autonomy Software Tasks David Rein et.al. 2503.17354 link
2025-03-21 NdLinear Is All You Need for Representation Learning Alex Reneau et.al. 2503.17353 link
2025-03-21 OpenVLThinker: An Early Exploration to Complex Vision-Language Reasoning via Iterative Self-Improvement Yihe Deng et.al. 2503.17352 link
2025-03-21 Beyond Semantics: Rediscovering Spatial Awareness in Vision-Language Models Jianing Qi et.al. 2503.17349 null
2025-03-21 Capturing Individual Human Preferences with Reward Features André Barreto et.al. 2503.17338 null
2025-03-21 Efficient Intent-Based Filtering for Multi-Party Conversations Using Knowledge Distillation from LLMs Reem Gody et.al. 2503.17336 null
2025-03-21 CVE-Bench: A Benchmark for AI Agents' Ability to Exploit Real-World Web Application Vulnerabilities Yuxuan Zhu et.al. 2503.17332 link
2025-03-21 LLM+MAP: Bimanual Robot Task Planning using Large Language Models and Planning Domain Definition Language Kun Chu et.al. 2503.17309 link
2025-03-21 Bugdar: AI-Augmented Secure Code Review for GitHub Pull Requests John Naulty et.al. 2503.17302 null
2025-03-21 FastCuRL: Curriculum Reinforcement Learning with Progressive Context Extension for Efficient Training R1-like Reasoning Models Mingyang Song et.al. 2503.17287 link
2025-03-21 CASE -- Condition-Aware Sentence Embeddings for Conditional Semantic Textual Similarity Measurement Gaifan Zhang et.al. 2503.17279 null
2025-03-21 Revisiting End To End Sparse Autoencoder Training -- A Short Finetune is All You Need Adam Karvonen et.al. 2503.17272 link
2025-03-21 SafeMERGE: Preserving Safety Alignment in Fine-Tuned Large Language Models via Selective Layer-Wise Model Merging Aladin Djuhera et.al. 2503.17239 link
2025-03-21 Slide-Level Prompt Learning with Vision Language Models for Few-Shot Multiple Instance Learning in Histopathology Devavrat Tomar et.al. 2503.17238 link
2025-03-21 FactSelfCheck: Fact-Level Black-Box Hallucination Detection for LLMs Albert Sawczyn et.al. 2503.17229 null
2025-03-21 Automating Adjudication of Cardiovascular Events Using Large Language Models Sonish Sivarajkumar et.al. 2503.17222 null
2025-03-21 A Language Anchor-Guided Method for Robust Noisy Domain Generalization Zilin Dai et.al. 2503.17211 null
2025-03-21 TreeSynth: Synthesizing Diverse Data from Scratch via Tree-Guided Subspace Partitioning Sheng Wang et.al. 2503.17195 null
2025-03-21 LLMs Love Python: A Study of LLMs' Bias for Programming Languages and Libraries Lukas Twist et.al. 2503.17181 link
2025-03-20 DynamicVis: An Efficient and General Visual Foundation Model for Remote Sensing Image Understanding Keyan Chen et.al. 2503.16426 link
2025-03-20 Stop Overthinking: A Survey on Efficient Reasoning for Large Language Models Yang Sui et.al. 2503.16419 link
2025-03-20 M3: 3D-Spatial MultiModal Memory Xueyan Zou et.al. 2503.16413 link
2025-03-20 The Emperor's New Clothes in Benchmarking? A Rigorous Examination of Mitigation Strategies for LLM Benchmark Data Contamination Yifan Sun et.al. 2503.16402 link
2025-03-20 Exploring the Hidden Reasoning Process of Large Language Models by Misleading Them Guanyu Chen et.al. 2503.16401 null
2025-03-20 Deconstructing Long Chain-of-Thought: A Structured Reasoning Optimization Framework for Long CoT Distillation Yijia Luo et.al. 2503.16385 link
2025-03-20 LaPIG: Cross-Modal Generation of Paired Thermal and Visible Facial Images Leyang Wang et.al. 2503.16376 null
2025-03-20 JARVIS-VLA: Post-Training Large-Scale Vision Language Models to Play Visual Games with Keyboards and Mouse Muyao Li et.al. 2503.16365 null
2025-03-20 CaKE: Circuit-aware Editing Enables Generalizable Knowledge Learners Yunzhi Yao et.al. 2503.16356 link
2025-03-20 Lyra: An Efficient and Expressive Subquadratic Architecture for Modeling Biological Sequences Krithik Ramesh et.al. 2503.16351 null
2025-03-20 LLM Braces: Straightening Out LLM Predictions with Relevant Sub-Updates Ying Shen et.al. 2503.16334 null
2025-03-20 OmniGeo: Towards a Multimodal Large Language Models for Geospatial Artificial Intelligence Long Yuan et.al. 2503.16326 null
2025-03-20 Issue2Test: Generating Reproducing Test Cases from Issue Reports Noor Nashid et.al. 2503.16320 null
2025-03-20 Bridging Technology and Humanities: Evaluating the Impact of Large Language Models on Social Sciences Research with DeepSeek-R1 Peiran Gu et.al. 2503.16304 null
2025-03-20 Generalized Few-shot 3D Point Cloud Segmentation with Vision-Language Model Zhaochong An et.al. 2503.16282 link
2025-03-20 Uni-3DAR: Unified 3D Generation and Understanding via Autoregression on Compressed Spatial Tokens Shuqi Lu et.al. 2503.16278 link
2025-03-20 Chain of Functions: A Programmatic Pipeline for Fine-Grained Chart Reasoning Data Zijian Li et.al. 2503.16260 null
2025-03-20 Plug-and-Play 1.x-Bit KV Cache Quantization for Video Large Language Models Keda Tao et.al. 2503.16257 null
2025-03-21 Fin-R1: A Large Language Model for Financial Reasoning through Reinforcement Learning Zhaowei Liu et.al. 2503.16252 link
2025-03-20 Reinforcement Learning for Reasoning in Small LLMs: What Works and What Doesn't Quy-Anh Dang et.al. 2503.16219 link
2025-03-19 TULIP: Towards Unified Language-Image Pretraining Zineng Tang et.al. 2503.15485 null
2025-03-19 SWEET-RL: Training Multi-Turn LLM Agents on Collaborative Reasoning Tasks Yifei Zhou et.al. 2503.15478 link
2025-03-19 What Makes a Reward Model a Good Teacher? An Optimization Perspective Noam Razin et.al. 2503.15477 link
2025-03-19 Cube: A Roblox View of 3D Intelligence Foundation AI Team et.al. 2503.15475 link
2025-03-19 EgoDTM: Towards 3D-Aware Egocentric Video-Language Pretraining Boshen Xu et.al. 2503.15470 link
2025-03-19 From 1,000,000 Users to Every User: Scaling Up Personalized Preference for User-level Alignment Jia-Nan Li et.al. 2503.15463 link
2025-03-19 SkyLadder: Better and Faster Pretraining via Context Window Scheduling Tongyao Zhu et.al. 2503.15450 link
2025-03-19 VenusFactory: A Unified Platform for Protein Engineering Data Retrieval and Language Model Fine-Tuning Yang Tan et.al. 2503.15438 link
2025-03-19 Visual Position Prompt for MLLM based Visual Grounding Wei Tang et.al. 2503.15426 link
2025-03-19 Probing the topology of the space of tokens with structured prompts Michael Robinson et.al. 2503.15421 null
2025-03-19 Visual Persona: Foundation Model for Full-Body Human Customization Jisu Nam et.al. 2503.15406 null
2025-03-19 FedSCA: Federated Tuning with Similarity-guided Collaborative Aggregation for Heterogeneous Medical Image Segmentation Yumin Zhang et.al. 2503.15390 null
2025-03-19 EfficientLLaVA: Generalizable Auto-Pruning for Large Vision-language Models Yinan Liang et.al. 2503.15369 null
2025-03-19 SemEval-2025 Task 1: AdMIRe -- Advancing Multimodal Idiomaticity Representation Thomas Pickard et.al. 2503.15358 null
2025-03-19 SPILL: Domain-Adaptive Intent Clustering based on Selection and Pooling with Large Language Models I-Fan Lin et.al. 2503.15351 null
2025-03-19 TruthLens: A Training-Free Paradigm for DeepFake Detection Ritabrata Chakraborty et.al. 2503.15342 null
2025-03-19 Uncertainty-Guided Chain-of-Thought for Code Generation with LLMs Yuqi Zhu et.al. 2503.15341 null
2025-03-19 Solla: Towards a Speech-Oriented LLM That Hears Acoustic Context Junyi Ao et.al. 2503.15338 link
2025-03-19 Recover and Match: Open-Vocabulary Multi-Label Recognition through Knowledge-Constrained Optimal Transport Hao Tan et.al. 2503.15337 link
2025-03-19 Euclid Quick Data Release (Q1) Exploring galaxy properties with a multi-modal foundation model Euclid Collaboration et.al. 2503.15312 link
2025-03-18 Aligning Multimodal LLM with Human Preference: A Survey Tao Yu et.al. 2503.14504 link
2025-03-18 Engineering Scientific Assistants using Interactive Structured Induction of Programs Shraddha Surana et.al. 2503.14488 null
2025-03-18 Gricean Norms as a Basis for Effective Collaboration Fardin Saad et.al. 2503.14484 link
2025-03-18 Creation-MMBench: Assessing Context-Aware Creative Intelligence in MLLM Xinyu Fang et.al. 2503.14478 link
2025-03-18 Characterizing Data Visualization Literacy: a Systematic Literature Review Sara Beschi et.al. 2503.14468 null
2025-03-18 RWKV-7 "Goose" with Expressive Dynamic State Evolution Bo Peng et.al. 2503.14456 link
2025-03-18 EnvBench: A Benchmark for Automated Environment Setup Aleksandra Eliseeva et.al. 2503.14443 link
2025-03-18 LLM-FE: Automated Feature Engineering for Tabular Data with LLMs as Evolutionary Optimizers Nikhil Abhyankar et.al. 2503.14434 link
2025-03-18 PLAY2PROMPT: Zero-shot Tool Instruction Optimization for LLM Agents via Tool Play Wei Fang et.al. 2503.14432 null
2025-03-18 ExDDV: A New Dataset for Explainable Deepfake Detection in Video Vlad Hondru et.al. 2503.14421 link
2025-03-18 Unifying Text Semantics and Graph Structures for Temporal Text-attributed Graphs with Large Language Models Siwei Zhang et.al. 2503.14411 null
2025-03-18 Large Language Models for Virtual Human Gesture Selection Parisa Ghanad Torshizi et.al. 2503.14408 null
2025-03-18 DUNE: Distilling a Universal Encoder from Heterogeneous 2D and 3D Teachers Mert Bulent Sariyildiz et.al. 2503.14405 null
2025-03-18 From "Hallucination" to "Suture": Insights from Language Philosophy to Enhance Large Language Models Qiantong Wang et.al. 2503.14392 null
2025-03-18 How much do LLMs learn from negative examples? Shadi Hamdan et.al. 2503.14391 link
2025-03-18 Good/Evil Reputation Judgment of Celebrities by LLMs via Retrieval Augmented Generation Rikuto Tsuchida et.al. 2503.14382 null
2025-03-18 On the Standard Performance Criteria for Applied Control Design: PID, MPC or Machine Learning Controller? Pouria Sarhadi et.al. 2503.14379 link
2025-03-18 Tiled Flash Linear Attention: More Efficient Linear RNN and xLSTM Kernels Maximilian Beck et.al. 2503.14376 link
2025-03-18 MAST-Pro: Dynamic Mixture-of-Experts for Adaptive Segmentation of Pan-Tumors with Knowledge-Driven Prompts Runqi Meng et.al. 2503.14355 null
2025-03-19 MoonCast: High-Quality Zero-Shot Podcast Generation Zeqian Ju et.al. 2503.14345 link
2025-03-17 MetaScale: Test-Time Scaling with Evolving Meta-Thoughts Qin Liu et.al. 2503.13447 null
2025-03-17 MoManipVLA: Transferring Vision-language-action Models for General Mobile Manipulation Zhenyu Wu et.al. 2503.13446 null
2025-03-17 Faithfulness of LLM Self-Explanations for Commonsense Tasks: Larger Is Better, and Instruction-Tuning Allows Trade-Offs but Not Pareto Dominance Noah Y. Siegel et.al. 2503.13445 null
2025-03-17 VideoMind: A Chain-of-LoRA Agent for Long Video Reasoning Ye Liu et.al. 2503.13444 link
2025-03-17 DPC: Dual-Prompt Collaboration for Tuning Vision-Language Models Haoyang Li et.al. 2503.13443 link
2025-03-18 MaTVLM: Hybrid Mamba-Transformer for Efficient Vision-Language Modeling Yingyue Li et.al. 2503.13440 link
2025-03-17 xLSTM 7B: A Recurrent LLM for Fast and Efficient Inference Maximilian Beck et.al. 2503.13427 link
2025-03-17 SuperBPE: Space Travel for Language Models Alisa Liu et.al. 2503.13423 null
2025-03-17 A Comprehensive Survey on Multi-Agent Cooperative Decision-Making: Scenarios, Approaches, Challenges and Perspectives Weiqiang Jin et.al. 2503.13415 null
2025-03-18 DLPO: Towards a Robust, Efficient, and Generalizable Prompt Optimization Framework from a Deep-Learning Perspective Dengyun Peng et.al. 2503.13413 link
2025-03-17 Using the Tools of Cognitive Science to Understand Large Language Models at Different Levels of Analysis Alexander Ku et.al. 2503.13401 null
2025-03-17 MicroVQA: A Multimodal Reasoning Benchmark for Microscopy-Based Scientific Research James Burgess et.al. 2503.13399 link
2025-03-17 Aligned Probing: Relating Toxic Behavior and Model Internals Andreas Waldis et.al. 2503.13390 null
2025-03-17 Cream of the Crop: Harvesting Rich, Scalable and Transferable Multi-Modal Data for Instruction Fine-Tuning Mengyao Lyu et.al. 2503.13383 null
2025-03-17 Sightation Counts: Leveraging Sighted User Feedback in Building a BLV-aligned Dataset of Diagram Descriptions Wan Ju Kang et.al. 2503.13369 null
2025-03-17 Mitigating Visual Forgetting via Take-along Visual Conditioning for Multi-modal Long CoT Reasoning Hai-Long Sun et.al. 2503.13360 null
2025-03-17 Agents Play Thousands of 3D Video Games Zhongwen Xu et.al. 2503.13356 null
2025-03-17 Valid Text-to-SQL Generation with Unification-based DeepStochLog Ying Jiao et.al. 2503.13342 link
2025-03-17 LearnMate: Enhancing Online Education with LLM-Powered Personalized Learning Plans and Support Xinyu Jessica Wang et.al. 2503.13340 null
2025-03-17 Reliable and Efficient Amortized Model-based Evaluation Sang Truong et.al. 2503.13335 null
2025-03-14 Tit-for-Tat: Safeguarding Large Vision-Language Models Against Jailbreak Attacks via Adversarial Defense Shuyang Hao et.al. 2503.11619 null
2025-03-14 ASMA-Tune: Unlocking LLMs' Assembly Code Comprehension via Structural-Semantic Instruction Tuning Xinyi Wang et.al. 2503.11617 link
2025-03-14 Rethinking Few-Shot Adaptation of Vision-Language Models in Two Stages Matteo Farina et.al. 2503.11609 link
2025-03-14 Do Construction Distributions Shape Formal Language Learning In German BabyLMs? Bastian Bunzeck et.al. 2503.11593 null
2025-03-14 Pathology Image Compression with Pre-trained Autoencoders Srikar Yellapragada et.al. 2503.11591 null
2025-03-14 Broaden your SCOPE! Efficient Multi-turn Conversation Planning for LLMs using Semantic Space Zhiliang Chen et.al. 2503.11586 link
2025-03-14 SmolDocling: An ultra-compact vision-language model for end-to-end multi-modal document conversion Ahmed Nassar et.al. 2503.11576 null
2025-03-14 Synthesizing Access Control Policies using Large Language Models Adarsh Vatsa et.al. 2503.11573 null
2025-03-14 Implicit Bias-Like Patterns in Reasoning Models Messi H. J. Lee et.al. 2503.11572 null
2025-03-14 VERIFY: A Benchmark of Visual Explanation and Reasoning for Investigating Multimodal Reasoning Fidelity Jing Bi et.al. 2503.11557 null
2025-03-14 Similarity-Aware Token Pruning: Your VLM but Faster Ahmadreza Jeddi et.al. 2503.11549 link
2025-03-14 Potential of large language model-powered nudges for promoting daily water and energy conservation Zonghan Li et.al. 2503.11531 null
2025-03-14 Exploring Typographic Visual Prompts Injection Threats in Cross-Modality Generation Models Hao Cheng et.al. 2503.11519 null
2025-03-14 HiTVideo: Hierarchical Tokenizers for Enhancing Text-to-Video Generation with Autoregressive Large Language Models Ziqin Zhou et.al. 2503.11513 null
2025-03-14 V-STaR: Benchmarking Video-LLMs on Video Spatio-Temporal Reasoning Zixu Cheng et.al. 2503.11495 null
2025-03-14 A Review of DeepSeek Models' Key Innovative Techniques Chengen Wang et.al. 2503.11486 null
2025-03-14 Integrating LLMs in Gamified Systems Carlos J. Costa et.al. 2503.11458 null
2025-03-14 D3: Diversity, Difficulty, and Dependability-Aware Data Selection for Sample-Efficient LLM Instruction Tuning Jia Zhang et.al. 2503.11441 null
2025-03-14 Text Compression for Efficient Language Generation David Gu et.al. 2503.11426 null
2025-03-14 Empowering Time Series Analysis with Synthetic Data: A Survey and Outlook in the Era of Foundation Models Xu Liu et.al. 2503.11411 null
2025-03-13 GoT: Unleashing Reasoning Capability of Multimodal Large Language Model for Visual Generation and Editing Rongyao Fang et.al. 2503.10639 link
2025-03-13 A Frustratingly Simple Yet Highly Effective Attack Baseline: Over 90% Success Rate Against the Strong Black-box Models of GPT-4.5/4o/o1 Zhaoyi Li et.al. 2503.10635 link
2025-03-13 HybridVLA: Collaborative Diffusion and Autoregression in a Unified Vision-Language-Action Model Jiaming Liu et.al. 2503.10631 null
2025-03-13 UniGoal: Towards Universal Zero-shot Goal-oriented Navigation Hang Yin et.al. 2503.10630 null
2025-03-13 Transformers without Normalization Jiachen Zhu et.al. 2503.10622 null
2025-03-13 From TOWER to SPIRE: Adding the Speech Modality to a Text-Only LLM Kshitij Ambilduke et.al. 2503.10620 link
2025-03-13 Siege: Autonomous Multi-Turn Jailbreaking of Large Language Models with Tree Search Andy Zhou et.al. 2503.10619 null
2025-03-13 Compositional Subspace Representation Fine-tuning for Adaptive Large Language Models Andy Zhou et.al. 2503.10617 null
2025-03-13 R1-Onevision: Advancing Generalized Multimodal Reasoning through Cross-Modal Formalization Yi Yang et.al. 2503.10615 link
2025-03-13 CoSTA$\ast$: Cost-Sensitive Toolpath Agent for Multi-turn Image Editing Advait Gupta et.al. 2503.10613 link
2025-03-13 TruthPrInt: Mitigating LVLM Object Hallucination Via Latent Truthful-Guided Pre-Intervention Jinhao Duan et.al. 2503.10602 link
2025-03-13 GroundingSuite: Measuring Complex Multi-Granular Pixel Grounding Rui Hu et.al. 2503.10596 link
2025-03-13 Unlock the Power of Unlabeled Data in Language Driving Model Chaoqun Wang et.al. 2503.10586 null
2025-03-13 VisualWebInstruct: Scaling up Multimodal Instruction Data through Web Search Yiming Jia et.al. 2503.10582 null
2025-03-13 Unveiling the Mathematical Reasoning in DeepSeek Models: A Comparative Study of Large Language Models Afrar Jahin et.al. 2503.10573 null
2025-03-13 ASIDE: Architectural Separation of Instructions and Data in Language Models Egor Zverev et.al. 2503.10566 null
2025-03-13 Short-term AI literacy intervention does not reduce over-reliance on incorrect ChatGPT recommendations Brett Puppart et.al. 2503.10556 null
2025-03-13 KUDA: Keypoints to Unify Dynamics Learning and Visual Prompting for Open-Vocabulary Robotic Manipulation Zixian Liu et.al. 2503.10546 null
2025-03-13 DP-GPL: Differentially Private Graph Prompt Learning Jing Xu et.al. 2503.10544 null
2025-03-13 Language Models, Graph Searching, and Supervision Adulteration: When More Supervision is Less and How to Make More More Arvid Frydenlund et.al. 2503.10542 null
2025-03-12 MoC: Mixtures of Text Chunking Learners for Retrieval-Augmented Generation System Jihao Zhao et.al. 2503.09600 link
2025-03-12 How to Protect Yourself from 5G Radiation? Investigating LLM Responses to Implicit Misinformation Ruohao Guo et.al. 2503.09598 link
2025-03-12 SimLingo: Vision-Only Closed-Loop Autonomous Driving with Language-Action Alignment Katrin Renz et.al. 2503.09594 null
2025-03-12 BIMBA: Selective-Scan Compression for Long-Range Video Question Answering Md Mohaiminul Islam et.al. 2503.09590 link
2025-03-12 Cost-Optimal Grouped-Query Attention for Long-Context LLMs Yingfa Chen et.al. 2503.09579 link
2025-03-12 Block Diffusion: Interpolating Between Autoregressive and Diffusion Language Models Marianne Arriola et.al. 2503.09573 link
2025-03-12 Plan-and-Act: Improving Planning of Agents for Long-Horizon Tasks Lutfi Eren Erdogan et.al. 2503.09572 null
2025-03-13 Towards Reasoning Era: A Survey of Long Chain-of-Thought for Reasoning Large Language Models Qiguang Chen et.al. 2503.09567 null
2025-03-12 PolyPythias: Stability and Outliers across Fifty Language Model Pre-Training Runs Oskar van der Wal et.al. 2503.09543 link
2025-03-13 Large Language Models for Multi-Facility Location Mechanism Design Nguyen Thach et.al. 2503.09533 null
2025-03-13 SAEBench: A Comprehensive Benchmark for Sparse Autoencoders in Language Model Interpretability Adam Karvonen et.al. 2503.09532 null
2025-03-12 Search-R1: Training LLMs to Reason and Leverage Search Engines with Reinforcement Learning Bowen Jin et.al. 2503.09516 link
2025-03-12 Reinforcement Learning is all You Need Yongsheng Lian et.al. 2503.09512 null
2025-03-12 ReMA: Learning to Meta-think for LLMs with Multi-Agent Reinforcement Learning Ziyu Wan et.al. 2503.09501 link
2025-03-12 MindGYM: Enhancing Vision-Language Models via Synthetic Self-Challenging Questions Zhe Xu et.al. 2503.09499 link
2025-03-12 Parameter-Efficient Adaptation of Geospatial Foundation Models through Embedding Deflection Romain Thoreau et.al. 2503.09493 null
2025-03-12 Project-Probe-Aggregate: Efficient Fine-Tuning for Group Robustness Beier Zhu et.al. 2503.09487 null
2025-03-12 BAMBI: Developing Baby Language Models for Italian Alice Suozzi et.al. 2503.09481 null
2025-03-12 SurgicalVLM-Agent: Towards an Interactive AI Co-Pilot for Pituitary Surgery Jiayuan Huang et.al. 2503.09474 null
2025-03-12 Explicit Learning and the LLM in Machine Translation Malik Marmonier et.al. 2503.09454 link
2025-03-11 QuoTA: Query-oriented Token Assignment via CoT Query Decouple for Long Video Comprehension Yongdong Luo et.al. 2503.08689 link
2025-03-11 Randomness, Not Representation: The Unreliability of Evaluating Cultural Alignment in LLMs Ariba Khan et.al. 2503.08688 link
2025-03-11 Perplexity Trap: PLM-Based Retrievers Overrate Low Perplexity Documents Haoyu Wang et.al. 2503.08684 link
2025-03-11 Self-Taught Self-Correction for Small Language Models Viktor Moskvoretskii et.al. 2503.08681 null
2025-03-11 Understanding and Mitigating Distribution Shifts For Machine Learning Force Fields Tobias Kreiman et.al. 2503.08674 null
2025-03-11 Generating Robot Constitutions & Benchmarks for Semantic Safety Pierre Sermanet et.al. 2503.08663 null
2025-03-11 Exploring the Word Sense Disambiguation Capabilities of Large Language Models Pierpaolo Basile et.al. 2503.08662 null
2025-03-11 YuE: Scaling Open Foundation Models for Long-Form Music Generation Ruibin Yuan et.al. 2503.08638 link
2025-03-11 LightGen: Efficient Image Generation through Knowledge Distillation and Direct Preference Optimization Xianfeng Wu et.al. 2503.08619 link
2025-03-11 EMMOE: A Comprehensive Benchmark for Embodied Mobile Manipulation in Open Environments Dongping Li et.al. 2503.08604 link
2025-03-11 NSF-SciFy: Mining the NSF Awards Database for Scientific Claims Delip Rao et.al. 2503.08600 null
2025-03-11 Proc4Gem: Foundation models for physical agency through procedural generation Yixin Lin et.al. 2503.08593 null
2025-03-11 BiasEdit: Debiasing Stereotyped Language Models via Model Editing Xin Xu et.al. 2503.08588 link
2025-03-11 HierarQ: Task-Aware Hierarchical Q-Former for Enhanced Video Understanding Shehreen Azad et.al. 2503.08585 null
2025-03-11 RAG-Adapter: A Plug-and-Play RAG-enhanced Framework for Long Video Understanding Xichen Tan et.al. 2503.08576 null
2025-03-11 DeepReview: Improving LLM-based Paper Review with Human-like Deep Thinking Process Minjun Zhu et.al. 2503.08569 null
2025-03-11 Reasoning and Sampling-Augmented MCQ Difficulty Prediction via LLMs Wanyong Feng et.al. 2503.08551 null
2025-03-11 Transferring Extreme Subword Style Using Ngram Model-Based Logit Scaling Craig Messner et.al. 2503.08550 null
2025-03-11 Graph of AI Ideas: Leveraging Knowledge Graphs and LLMs for AI Research Idea Generation Xian Gao et.al. 2503.08549 null
2025-03-11 TLA: Tactile-Language-Action Model for Contact-Rich Manipulation Peng Hao et.al. 2503.08548 null
2025-03-10 Robusto-1 Dataset: Comparing Humans and VLMs on real out-of-distribution Autonomous Driving VQA from Peru Dunant Cusipuma et.al. 2503.07587 null
2025-03-10 Talking to GDELT Through Knowledge Graphs Audun Myers et.al. 2503.07584 null
2025-03-10 VisBias: Measuring Explicit and Implicit Social Biases in Vision Language Models Jen-tse Huang et.al. 2503.07575 link
2025-03-10 AutoSpatial: Visual-Language Reasoning for Social Robot Navigation through Efficient Spatial Reasoning Learning Yangzhe Kong et.al. 2503.07557 null
2025-03-10 Junior Software Developers' Perspectives on Adopting LLMs for Software Engineering: a Systematic Literature Review Samuel Ferino et.al. 2503.07556 null
2025-03-10 KSOD: Knowledge Supplement for LLMs On Demand Haoran Li et.al. 2503.07550 null
2025-03-10 Bi-Directional Mental Model Reconciliation for Human-Robot Interaction with Large Language Models Nina Moorman et.al. 2503.07547 null
2025-03-10 Queueing, Predictions, and LLMs: Challenges and Open Problems Michael Mitzenmacher et.al. 2503.07545 null
2025-03-10 XIFBench: Evaluating Large Language Models on Multilingual Instruction Following Zhenyu Li et.al. 2503.07539 null
2025-03-10 Building English ASR model with regional language support Purvi Agrawal et.al. 2503.07522 null
2025-03-10 GRITHopper: Decomposition-Free Multi-Hop Dense Retrieval Justus-Jonas Erker et.al. 2503.07519 link
2025-03-10 TokenButler: Token Importance is Predictable Yash Akhauri et.al. 2503.07518 link
2025-03-10 Language Models Fail to Introspect About Their Knowledge of Language Siyuan Song et.al. 2503.07513 link
2025-03-10 Plume: Scaffolding Text Composition in Dashboards Maxim Lisnic et.al. 2503.07512 null
2025-03-10 Sometimes the Model doth Preach: Quantifying Religious Bias in Open LLMs through Demographic Analysis in Asian Nations Hari Shankar et.al. 2503.07510 link
2025-03-10 Think Before You Segment: High-Quality Reasoning Segmentation with GPT Chain of Thoughts Shiu-hong Kao et.al. 2503.07503 null
2025-03-10 V2Flow: Unifying Visual Tokenization and Large Language Model Vocabularies for Autoregressive Image Generation Guiwei Zhang et.al. 2503.07493 link
2025-03-10 LLaVA-RadZ: Can Multimodal Large Language Models Effectively Tackle Zero-shot Radiology Recognition? Bangyan Li et.al. 2503.07487 null
2025-03-10 Chameleon: Fast-slow Neuro-symbolic Lane Topology Extraction Zongzheng Zhang et.al. 2503.07485 link
2025-03-10 VLRMBench: A Comprehensive and Challenging Benchmark for Vision-Language Reward Models Jiacheng Ruan et.al. 2503.07478 link
2025-03-10 Advancing Vietnamese Information Retrieval with Learning Objective and Benchmark Phu-Vinh Nguyen et.al. 2503.07470 null
2025-03-10 YOLOE: Real-Time Seeing Anything Ao Wang et.al. 2503.07465 link
2025-03-10 GenAIReading: Augmenting Human Cognition with Interactive Digital Textbooks Using Large Language Models and Image Generation Models Ryugo Morita et.al. 2503.07463 null
2025-03-10 MedAgentsBench: Benchmarking Thinking Models and Agent Frameworks for Complex Medical Reasoning Xiangru Tang et.al. 2503.07459 link
2025-03-10 LLMs syntactically adapt their language use to their conversational partner Florian Kandra et.al. 2503.07457 null
2025-03-10 Is a Good Foundation Necessary for Efficient Reinforcement Learning? The Computational Role of the Base Model in Exploration Dylan J. Foster et.al. 2503.07453 null
2025-03-10 From Idea to Implementation: Evaluating the Influence of Large Language Models in Software Development -- An Opinion Paper Sargam Yadav et.al. 2503.07450 null
2025-03-10 From Text to Visuals: Using LLMs to Generate Math Diagrams with Vector Graphics Jaewook Lee et.al. 2503.07429 null
2025-03-10 RePO: ReLU-based Preference Optimization Junkang Wu et.al. 2503.07426 link
2025-03-10 REF-VLM: Triplet-Based Referring Paradigm for Unified Visual Decoding Yan Tai et.al. 2503.07413 link
2025-03-10 Towards Safe Robot Foundation Models Maximilian Tölle et.al. 2503.07404 null
2025-03-10 Keeping Representation Similarity in Finetuning for Medical Image Analysis Wenqiang Zu et.al. 2503.07399 null
2025-03-10 Revisiting Noise in Natural Language Processing for Computational Social Science Nadav Borenstein et.al. 2503.07395 null
2025-03-10 Is My Text in Your AI Model? Gradient-based Membership Inference Test applied to LLMs Gonzalo Mancera et.al. 2503.07384 null
2025-03-10 Process-Supervised LLM Recommenders via Flow-guided Tuning Chongming Gao et.al. 2503.07377 link
2025-03-10 Artificial Utopia: Simulation and Intelligent Agents for a Democratised Future Yannick Oswald et.al. 2503.07364 null
2025-03-07 Fairness-Aware Low-Rank Adaptation Under Demographic Privacy Constraints Parameswaran Kamalaruban et.al. 2503.05684 null
2025-03-07 Understanding the Limits of Lifelong Knowledge Editing in LLMs Lukas Thede et.al. 2503.05683 null
2025-03-07 A Survey of Large Language Model Empowered Agents for Recommendation and Search: Towards Next-Generation Information Retrieval Yu Zhang et.al. 2503.05659 link
2025-03-07 Learning LLM Preference over Intra-Dialogue Pairs: A Framework for Utterance-level Understandings Xuanqing Liu et.al. 2503.05620 null
2025-03-07 A Survey on Sparse Autoencoders: Interpreting the Internal Mechanisms of Large Language Models Dong Shu et.al. 2503.05613 null
2025-03-07 From Theory to Application: A Practical Introduction to Neural Operators in Scientific Computing Prashant K. Jha et.al. 2503.05598 link
2025-03-07 R1-Searcher: Incentivizing the Search Capability in LLMs via Reinforcement Learning Huatong Song et.al. 2503.05592 null
2025-03-07 Quantifying the Robustness of Retrieval-Augmented Language Models Against Spurious Features in Grounding Data Shiping Yang et.al. 2503.05587 null
2025-03-07 Evaluating open-source Large Language Models for automated fact-checking Nicolo' Fontana et.al. 2503.05565 null
2025-03-07 Revitalizing Saturated Benchmarks: A Weighted Metric Approach for Differentiating Large Language Model Performance Bryan Etzine et.al. 2503.05551 null
2025-03-07 Leveraging Approximate Caching for Faster Retrieval-Augmented Generation Shai Bergman et.al. 2503.05530 null
2025-03-07 PoSSUM: A Protocol for Surveying Social-media Users with Multimodal LLMs Roberto Cerina et.al. 2503.05529 null
2025-03-07 Cognitive Bias Detection Using Advanced Prompt Engineering Frederic Lemieux et.al. 2503.05516 null
2025-03-07 Grammar-Based Code Representation: Is It a Worthy Pursuit for LLMs? Qingyuan Liang et.al. 2503.05507 null
2025-03-07 Statistical Guarantees of Correctness Coverage for Medical Multiple-Choice Question Answering Yusong Ke et.al. 2503.05505 null
2025-03-07 Benchmarking LLMs in Recommendation Tasks: A Comparative Evaluation with Conventional Recommenders Qijiong Liu et.al. 2503.05493 null
2025-03-07 Maximum Hallucination Standards for Domain-Specific Large Language Models Tingmingke Lu et.al. 2503.05481 null
2025-03-07 The Society of HiveMind: Multi-Agent Optimization of Foundation Model Swarms to Unlock the Potential of Collective Intelligence Noah Mamie et.al. 2503.05473 null
2025-03-07 Soft Policy Optimization: Online Off-Policy RL for Sequence Models Taco Cohen et.al. 2503.05453 null
2025-03-07 LLM-based Iterative Approach to Metamodeling in Automotive Nenad Petrovic et.al. 2503.05449 null
2025-03-06 L$^2$M: Mutual Information Scaling Law for Long-Context Language Modeling Zhuo Chen et.al. 2503.04725 link
2025-03-06 LLMVoX: Autoregressive Streaming Text-to-Speech Model for Any LLM Sambal Shikhar et.al. 2503.04724 null
2025-03-07 Shifting Long-Context LLMs Research from Input to Output Yuhao Wu et.al. 2503.04723 null
2025-03-06 Enough Coin Flips Can Make LLMs Act Bayesian Ritwik Gupta et.al. 2503.04722 null
2025-03-06 Full-Duplex-Bench: A Benchmark to Evaluate Full-duplex Spoken Dialogue Models on Turn-taking Capabilities Guan-Ting Lin et.al. 2503.04721 link
2025-03-06 Predictable Scale: Part I -- Optimal Hyperparameter Scaling Law in Large Language Model Pretraining Houyi Li et.al. 2503.04715 null
2025-03-06 Scaling Rich Style-Prompted Text-to-Speech Datasets Anuj Diwan et.al. 2503.04713 link
2025-03-06 Universality of Layer-Level Entropy-Weighted Quantization Beyond Model Architecture and Size Alireza Behtash et.al. 2503.04704 null
2025-03-06 L1: Controlling How Long A Reasoning Model Thinks With Reinforcement Learning Pranjal Aggarwal et.al. 2503.04697 null
2025-03-06 UIPE: Enhancing LLM Unlearning by Removing Knowledge Related to Forgetting Targets Wenyu Wang et.al. 2503.04693 null
2025-03-06 Quantifying the Reasoning Abilities of LLMs on Real-world Clinical Cases Pengcheng Qiu et.al. 2503.04691 null
2025-03-06 LLM-guided Plan and Retrieval: A Strategic Alignment for Interpretable User Satisfaction Estimation in Dialogue Sangyeop Kim et.al. 2503.04675 null
2025-03-06 An Information-theoretic Multi-task Representation Learning Framework for Natural Language Understanding Dou Hu et.al. 2503.04667 link
2025-03-06 CLDyB: Towards Dynamic Benchmarking for Continual Learning with Pre-trained Models Shengzhuang Chen et.al. 2503.04655 link
2025-03-06 Transferable Foundation Models for Geometric Tasks on Point Cloud Representations: Geometric Neural Operators Blaine Quackenbush et.al. 2503.04649 link
2025-03-06 Implicit Cross-Lingual Rewarding for Efficient Multilingual Preference Alignment Wen Yang et.al. 2503.04647 link
2025-03-06 Enhancing SAM with Efficient Prompting and Preference Optimization for Semi-supervised Medical Image Segmentation Aishik Konwer et.al. 2503.04639 null
2025-03-06 Mark Your LLM: Detecting the Misuse of Open-Source Large Language Models via Watermarking Yijie Xu et.al. 2503.04636 null
2025-03-06 Better Process Supervision with Bi-directional Rewarding Signals Wenxiang Chen et.al. 2503.04618 null
2025-03-06 Towards Data-Efficient Language Models: A Child-Inspired Approach to Language Learning Mohammad Amin Ghanizadeh et.al. 2503.04611 null
2025-03-05 The MASK Benchmark: Disentangling Honesty From Accuracy in AI Systems Richard Ren et.al. 2503.03750 null
2025-03-05 Process-based Self-Rewarding Language Models Shimao Zhang et.al. 2503.03746 link
2025-03-05 CHOP: Mobile Operating Assistant with Constrained High-frequency Optimized Subtask Planning Yuqi Zhou et.al. 2503.03743 link
2025-03-05 Towards Understanding Distilled Reasoning Models: A Representational Approach David D. Baek et.al. 2503.03730 null
2025-03-05 Improving LLM Safety Alignment with Dual-Objective Optimization Xuandong Zhao et.al. 2503.03710 link
2025-03-05 Effective LLM Knowledge Learning via Model Generalization Mingkang Zhu et.al. 2503.03705 null
2025-03-05 A Practical Memory Injection Attack against LLM Agents Shen Dong et.al. 2503.03704 null
2025-03-05 Developing and Utilizing a Large-Scale Cantonese Dataset for Multi-Tasking in Large Language Models Jiyue Jiang et.al. 2503.03702 null
2025-03-05 Addressing Overprescribing Challenges: Fine-Tuning Large Language Models for Medication Recommendation Tasks Zihao Zhao et.al. 2503.03687 link
2025-03-05 Attentive Reasoning Queries: A Systematic Method for Optimizing Instruction-Following in Large Language Models Bar Karov et.al. 2503.03669 link
2025-03-05 Analogical Reasoning Inside Large Language Models: Concept Vectors and the Limits of Abstraction Gustaw Opiełka et.al. 2503.03666 link
2025-03-05 Robust Learning of Diverse Code Edits Tushar Aggarwal et.al. 2503.03656 null
2025-03-05 Improving Neutral Point of View Text Generation through Parameter-Efficient Reinforcement Learning and a Small-Scale High-Quality Dataset Jessica Hoffmann et.al. 2503.03654 null
2025-03-05 Token-Level Privacy in Large Language Models Re'em Harel et.al. 2503.03652 null
2025-03-05 Psy-Copilot: Visual Chain of Thought for Counseling Keqi Chen et.al. 2503.03645 null
2025-03-05 Large language models in finance: estimating financial sentiment for stock prediction Kemal Kirtac et.al. 2503.03612 null
2025-03-05 Enhancing the Accuracy and Comprehensibility in Architectural Tactics Detection via Small Model-Augmented Prompt Engineering Lingli Cao et.al. 2503.03609 link
2025-03-05 Psy-Insight: Explainable Multi-turn Bilingual Dataset for Mental Health Counseling Keqi Chen et.al. 2503.03607 null
2025-03-05 Feature-Level Insights into Artificial Text Detection with Sparse Autoencoders Kristian Kuznetsov et.al. 2503.03601 null
2025-03-05 Small but Mighty: Enhancing Time Series Forecasting with Lightweight LLMs Haoran Fan et.al. 2503.03594 link
2025-03-04 Wikipedia in the Era of LLMs: Evolution and Risks Siming Huang et.al. 2503.02879 link
2025-03-04 Language Models can Self-Improve at State-Value Estimation for Better Search Ethan Mendes et.al. 2503.02878 link
2025-03-04 SPIDER: A Comprehensive Multi-Organ Supervised Pathology Dataset and Baseline Models Dmitry Nechaev et.al. 2503.02876 link
2025-03-04 The First Few Tokens Are All You Need: An Efficient and Effective Unsupervised Prefix Fine-Tuning Method for Reasoning Models Ke Ji et.al. 2503.02875 null
2025-03-04 Prompting Generative AI with Interaction-Augmented Instructions Leixian Shen et.al. 2503.02874 null
2025-03-04 FairSense-AI: Responsible AI Meets Sustainability Shaina Raza et.al. 2503.02865 null
2025-03-04 Calibrating LLM Confidence with Semantic Steering: A Multi-Prompt Aggregation Framework Ziang Zhou et.al. 2503.02863 null
2025-03-04 Privacy and Accuracy-Aware AI/ML Model Deduplication Hong Guan et.al. 2503.02862 null
2025-03-04 (How) Do Language Models Track State? Belinda Z. Li et.al. 2503.02854 null
2025-03-04 Shakespearean Sparks: The Dance of Hallucination and Creativity in LLMs' Decoding Layers Zicong He et.al. 2503.02851 link
2025-03-04 Mask-DPO: Generalizable Fine-grained Factuality Alignment of LLMs Yuzhe Gu et.al. 2503.02846 link
2025-03-04 Beyond Cosine Decay: On the effectiveness of Infinite Learning Rate Schedule for Continual Pre-training Paul Janson et.al. 2503.02844 null
2025-03-04 AlignDistil: Token-Level Language Model Alignment as Adaptive Policy Distillation Songming Zhang et.al. 2503.02832 null
2025-03-04 Developing a PET/CT Foundation Model for Cross-Modal Anatomical and Functional Imaging Yujin Oh et.al. 2503.02824 null
2025-03-04 "What If Smart Homes Could See Our Homes?": Exploring DIY Smart Home Building Experiences with VLM-Based Camera Sensors Sojeong Yun et.al. 2503.02816 null
2025-03-04 Q-Filters: Leveraging QK Geometry for Efficient KV Cache Compression Nathan Godey et.al. 2503.02812 link
2025-03-04 RAAD-LLM: Adaptive Anomaly Detection Using LLMs and RAG Integration Alicia Russell-Gilbert et.al. 2503.02800 null
2025-03-04 Multimodal AI predicts clinical outcomes of drug combinations from preclinical data Yepeng Huang et.al. 2503.02781 link
2025-03-04 Implicit Bias in LLMs: A Survey Xinru Lin et.al. 2503.02776 null
2025-03-04 InSerter: Speech Instruction Following with Unsupervised Interleaved Pre-training Dingdong Wang et.al. 2503.02769 null
2025-02-28 LLM Post-Training: A Deep Dive into Reasoning Large Language Models Komal Kumar et.al. 2502.21321 link
2025-02-28 Raccoon: Multi-stage Diffusion Training with Coarse-to-Fine Curating Videos Zhiyu Tan et.al. 2502.21314 null
2025-02-28 FANformer: Improving Large Language Models Through Effective Periodicity Modeling Yihong Dong et.al. 2502.21309 link
2025-02-28 Contextualizing biological perturbation experiments through language Menghua Wu et.al. 2502.21290 link
2025-02-28 Adaptive Keyframe Sampling for Long Video Understanding Xi Tang et.al. 2502.21271 null
2025-03-03 Foundation Models -- A Panacea for Artificial Intelligence in Pathology? Nita Mulliqi et.al. 2502.21264 null
2025-02-28 Modeling Human Beliefs about AI Behavior for Scalable Oversight Leon Lang et.al. 2502.21262 null
2025-02-28 PET Image Denoising via Text-Guided Diffusion: Integrating Anatomical Priors through Text Prompts Boxiao Yu et.al. 2502.21260 null
2025-02-28 RoboBrain: A Unified Brain Model for Robotic Manipulation from Abstract to Concrete Yuheng Ji et.al. 2502.21257 null
2025-02-28 TimesBERT: A BERT-Style Foundation Model for Time Series Understanding Haoran Zhang et.al. 2502.21245 null
2025-03-04 Semantic Volume: Quantifying and Detecting both External and Internal Uncertainty in LLMs Xiaomin Li et.al. 2502.21239 null
2025-02-28 Transforming Tuberculosis Care: Optimizing Large Language Models For Enhanced Clinician-Patient Communication Daniil Filienko et.al. 2502.21236 null
2025-02-28 ByteScale: Efficient Scaling of LLM Training with a 2048K Context Length on More Than 12,000 GPUs Hao Ge et.al. 2502.21231 null
2025-03-03 ECLeKTic: a Novel Challenge Set for Evaluation of Cross-Lingual Knowledge Transfer Omer Goldman et.al. 2502.21228 null
2025-02-28 Transformers Learn to Implement Multi-step Gradient Descent with Chain of Thought Jianhao Huang et.al. 2502.21212 null
2025-02-28 Chronologically Consistent Large Language Models Songrun He et.al. 2502.21206 null
2025-02-28 $Δ$-model correction of Foundation Model based on the models own understanding Mads-Peter Verner Christiansen et.al. 2502.21179 null
2025-03-03 Causality Is Key to Understand and Balance Multiple Goals in Trustworthy ML and Foundation Models Ruta Binkyte et.al. 2502.21123 null
2025-02-28 Optimizing Large Language Models for ESG Activity Detection in Financial Texts Mattia Birti et.al. 2502.21112 link
2025-02-28 Large Language Model-Based Benchmarking Experiment Settings for Evolutionary Multi-Objective Optimization Lie Meng Pang et.al. 2502.21108 null
2025-02-27 R2-T2: Re-Routing in Test-Time for Multimodal Mixture-of-Experts Zhongyang Li et.al. 2502.20395 link
2025-02-27 Why Are Web AI Agents More Vulnerable Than Standalone LLMs? A Security Analysis Jeffrey Yang Fan Chiang et.al. 2502.20383 null
2025-02-27 Multi-Agent Verification: Scaling Test-Time Compute with Multiple Verifiers Shalev Lifshitz et.al. 2502.20379 null
2025-02-27 PhantomWiki: On-Demand Datasets for Reasoning and Retrieval Evaluation Albert Gong et.al. 2502.20377 link
2025-02-27 Bridging Legal Knowledge and AI: Retrieval-Augmented Generation with Vector Stores, Knowledge Graphs, and Hierarchical Non-negative Matrix Factorization Ryan C. Barron et.al. 2502.20364 link
2025-02-27 Bridging the Creativity Understanding Gap: Small-Scale Human Alignment Enables Expert-Level Humor Ranking in LLMs Kuan Lok Zhou et.al. 2502.20356 null
2025-02-27 KEDRec-LM: A Knowledge-distilled Explainable Drug Recommendation Large Language Model Kai Zhang et.al. 2502.20350 null
2025-02-27 Sparse Auto-Encoder Interprets Linguistic Features in Large Language Models Yi Jing et.al. 2502.20344 null
2025-02-27 Thinking Slow, Fast: Scaling Inference Compute with Distilled Reasoners Daniele Paliotta et.al. 2502.20339 null
2025-02-27 Expertise Is What We Want Alan Ashworth et.al. 2502.20335 null
2025-02-27 Emergent Symbolic Mechanisms Support Abstract Reasoning in Large Language Models Yukang Yang et.al. 2502.20332 null
2025-02-27 Long-Context Inference with Retrieval-Augmented Speculative Decoding Guanzheng Chen et.al. 2502.20330 link
2025-02-27 LangProBe: a Language Programs Benchmark Shangyin Tan et.al. 2502.20315 null
2025-02-27 EAIRA: Establishing a Methodology for Evaluating AI Models as Scientific Research Assistants Franck Cappello et.al. 2502.20309 link
2025-02-27 M^3Builder: A Multi-Agent System for Automated Machine Learning in Medical Imaging Jinghao Feng et.al. 2502.20301 null
2025-02-27 An exploration of features to improve the generalisability of fake news detection models Nathaniel Hoy et.al. 2502.20299 null
2025-02-27 Judge a Book by its Cover: Investigating Multi-Modal LLMs for Multi-Page Handwritten Document Transcription Benjamin Gutteridge et.al. 2502.20295 link
2025-02-27 Visual Adaptive Prompting for Compositional Zero-Shot Learning Kyle Stein et.al. 2502.20292 null
2025-02-27 Conformal Tail Risk Control for Large Language Model Alignment Catherine Yu-Chi Chen et.al. 2502.20285 null
2025-02-27 Evaluating Human Trust in LLM-Based Planners: A Preliminary Study Shenghui Chen et.al. 2502.20284 null
2025-02-26 Hi Robot: Open-Ended Instruction Following with Hierarchical Vision-Language-Action Models Lucy Xiaoyang Shi et.al. 2502.19417 null
2025-02-26 Norm Growth and Stability Challenges in Localized Sequential Knowledge Editing Akshat Gupta et.al. 2502.19416 null
2025-02-26 Can Language Models Falsify? Evaluating Algorithmic Reasoning with Counterexample Creation Shiven Sinha et.al. 2502.19414 link
2025-02-26 Project Alexandria: Towards Freeing Scientific Knowledge from Copyright Burdens via LLMs Christoph Schuhmann et.al. 2502.19413 null
2025-02-26 Code to Think, Think to Code: A Survey on Code-Enhanced Reasoning and Reasoning-Driven Code Intelligence in LLMs Dayu Yang et.al. 2502.19411 link
2025-02-26 Less or More: Towards Glanceable Explanations for LLM Recommendations Using Ultra-Small Devices Xinru Wang et.al. 2502.19410 null
2025-02-26 ImageChain: Advancing Sequential Image-to-Text Reasoning in Multimodal Large Language Models Danae Sánchez Villegas et.al. 2502.19409 null
2025-02-26 Learning Code-Edit Embedding to Model Student Debugging Behavior Hasnain Heickal et.al. 2502.19407 null
2025-02-26 General Reasoning Requires Learning to Reason from the Get-go Seungwook Han et.al. 2502.19402 null
2025-02-26 TheoremExplainAgent: Towards Multimodal Explanations for LLM Theorem Understanding Max Ku et.al. 2502.19400 null
2025-02-26 LiDAR Registration with Visual Foundation Models Niclas Vödisch et.al. 2502.19374 null
2025-02-26 Deep Learning For Time Series Analysis With Application On Human Motion Ali Ismail-Fawaz et.al. 2502.19364 null
2025-02-26 DataMan: Data Manager for Pre-training Large Language Models Ru Peng et.al. 2502.19363 null
2025-02-26 Can Large Language Models Detect Errors in Long Chain-of-Thought Reasoning? Yancheng He et.al. 2502.19361 link
2025-02-26 Controlled Diversity: Length-optimized Natural Language Generation Diana Marie Schenke et.al. 2502.19347 null
2025-02-26 Evaluating LLMs and Pre-trained Models for Text Summarization Across Diverse Datasets Tohida Rehman et.al. 2502.19339 null
2025-02-26 I Know What I Don't Know: Improving Model Cascades Through Confidence Tuning Stephan Rabanser et.al. 2502.19335 null
2025-02-26 Agentic Reward Modeling: Integrating Human Preferences with Verifiable Correctness Signals for Reliable Reward Systems Hao Peng et.al. 2502.19328 link
2025-02-26 Shh, don't say that! Domain Certification in LLMs Cornelius Emde et.al. 2502.19320 null
2025-02-26 Rethinking LLM Unlearning Objectives: A Gradient Perspective and Go Beyond Qizhou Wang et.al. 2502.19301 null
2025-02-25 DRAMA: Diverse Augmentation from Large Language Models to Smaller Dense Retrievers Xueguang Ma et.al. 2502.18460 link
2025-02-25 LLM-Based Design Pattern Detection Christian Schindler et.al. 2502.18458 null
2025-02-25 Evaluating the Effectiveness of Small Language Models in Detecting Refactoring Bugs Rohit Gheyi et.al. 2502.18454 null
2025-02-25 FRIDA to the Rescue! Analyzing Synthetic Data Effectiveness in Object-Based Common Sense Reasoning for Disaster Response Mollie Shichman et.al. 2502.18452 null
2025-02-25 SWE-RL: Advancing LLM Reasoning via Reinforcement Learning on Open Software Evolution Yuxiang Wei et.al. 2502.18449 null
2025-02-25 olmOCR: Unlocking Trillions of Tokens in PDFs with Vision Language Models Jake Poznanski et.al. 2502.18443 link
2025-02-25 MAPoRL: Multi-Agent Post-Co-Training for Collaborative Large Language Models with Reinforcement Learning Chanwoo Park et.al. 2502.18439 null
2025-02-25 Reversal Blessing: Thinking Backward May Outpace Thinking Forward in Multi-choice Questions Yizhe Zhang et.al. 2502.18435 null
2025-02-25 Exploring Gender Disparities in Automatic Speech Recognition Technology Hend ElGhazaly et.al. 2502.18434 null
2025-02-25 TextGames: Learning to Self-Play Text-Based Puzzle Games via Language Model Reasoning Frederikus Hudi et.al. 2502.18431 link
2025-02-25 PyEvalAI: AI-assisted evaluation of Jupyter Notebooks for immediate personalized feedback Nils Wandel et.al. 2502.18425 null
2025-02-25 Compressing Language Models for Specialized Domains Miles Williams et.al. 2502.18424 null
2025-02-25 Rank1: Test-Time Compute for Reranking in Information Retrieval Orion Weller et.al. 2502.18418 link
2025-02-25 OmniAlign-V: Towards Enhanced Alignment of MLLMs with Human Preference Xiangyu Zhao et.al. 2502.18411 link
2025-02-25 Enhancing DNA Foundation Models to Address Masking Inefficiencies Monireh Safari et.al. 2502.18405 null
2025-02-25 Monte Carlo Temperature: a robust sampling strategy for LLM's uncertainty quantification methods Nicola Cecere et.al. 2502.18389 null
2025-02-25 How Far are LLMs from Real Search? A Comprehensive Study on Efficiency, Completeness, and Inherent Capabilities Minhua Lin et.al. 2502.18387 null
2025-02-25 MindMem: Multimodal for Predicting Advertisement Memorability Using LLMs and Deep Learning Sepehr Asgarian et.al. 2502.18371 null
2025-02-25 Responsible AI Agents Deven R. Desai et.al. 2502.18359 null
2025-02-25 Which Contributions Deserve Credit? Perceptions of Attribution in Human-AI Co-Creation Jessica He et.al. 2502.18357 null
2025-02-24 Introducing Visual Perception Token into Multimodal Large Language Model Runpeng Yu et.al. 2502.17425 link
2025-02-24 MLLMs Know Where to Look: Training-free Perception of Small Visual Details with Multimodal LLMs Jiarui Zhang et.al. 2502.17422 link
2025-02-24 LongSpec: Long-Context Speculative Decoding with Efficient Drafting and Verification Penghui Yang et.al. 2502.17421 link
2025-02-24 The Geometry of Refusal in Large Language Models: Concept Cones and Representational Independence Tom Wollschläger et.al. 2502.17420 null
2025-02-24 From System 1 to System 2: A Survey of Reasoning Large Language Models Zhong-Zhi Li et.al. 2502.17419 link
2025-02-24 Reasoning with Latent Thoughts: On the Power of Looped Transformers Nikunj Saunshi et.al. 2502.17416 null
2025-02-24 COSMOS: A Hybrid Adaptive Optimizer for Memory-Efficient Training of LLMs Liming Liu et.al. 2502.17410 link
2025-02-24 Large Language Models are Powerful EHR Encoders Stefan Hegselmann et.al. 2502.17403 link
2025-02-24 Big-Math: A Large-Scale, High-Quality Math Dataset for Reinforcement Learning in Language Models Alon Albalak et.al. 2502.17387 link
2025-02-24 Bridging Gaps in Natural Language Processing for Yorùbá: A Systematic Review of a Decade of Progress and Prospects Toheeb A. Jimoh et.al. 2502.17364 null
2025-02-24 A Closer Look at TabPFN v2: Strength, Limitation, and Extension Han-Jia Ye et.al. 2502.17361 null
2025-02-24 RELICT: A Replica Detection Framework for Medical Image Generation Orhun Utku Aydin et.al. 2502.17360 link
2025-02-24 DIS-CO: Discovering Copyrighted Content in VLMs Training Data André V. Duarte et.al. 2502.17358 link
2025-02-24 Distributional Scaling Laws for Emergent Capabilities Rosie Zhao et.al. 2502.17356 null
2025-02-24 On Relation-Specific Neurons in Large Language Models Yihong Liu et.al. 2502.17355 link
2025-02-24 How Scientists Use Large Language Models to Program Gabrielle O'Brien et.al. 2502.17348 null
2025-02-24 Time series forecasting based on optimized LLM for fault prediction in distribution power grid insulators João Pedro Matos-Carvalho et.al. 2502.17341 null
2025-02-24 Tokenized SAEs: Disentangling SAE Reconstructions Thomas Dooms et.al. 2502.17332 null
2025-02-24 HIPPO: Enhancing the Table Understanding Capability of Large Language Models through Hybrid-Modal Preference Optimization Zhenghao Liu et.al. 2502.17315 link
2025-02-24 'Generalization is hallucination' through the lens of tensor completions Liang Ze Wong et.al. 2502.17305 null
2025-02-21 ELIP: Enhanced Visual-Language Foundation Models for Image Retrieval Guanqi Zhan et.al. 2502.15682 null
2025-02-21 Privacy Ripple Effects from Adding or Removing Personal Information in Language Model Training Jaydeep Borkar et.al. 2502.15680 link
2025-02-21 BOSS: Benchmark for Observation Space Shift in Long-Horizon Task Yue Yang et.al. 2502.15679 null
2025-02-21 Testing the limits of fine-tuning to improve reasoning in vision language models Luca M. Schulze Buschoff et.al. 2502.15678 null
2025-02-21 FLEKE: Federated Locate-then-Edit Knowledge Editing Zongkai Zhao et.al. 2502.15677 link
2025-02-21 AutoToM: Automated Bayesian Inverse Planning and Model Discovery for Open-ended Theory of Mind Zhining Zhang et.al. 2502.15676 link
2025-02-21 Almost AI, Almost Human: The Challenge of Detecting AI-Polished Writing Shoumik Saha et.al. 2502.15666 link
2025-02-21 Machine-generated text detection prevents language model collapse George Drayson et.al. 2502.15654 link
2025-02-21 Empowering LLMs with Logical Reasoning: A Comprehensive Survey Fengxiang Cheng et.al. 2502.15652 null
2025-02-21 Steering into New Embedding Spaces: Analyzing Cross-Lingual Alignment Induced by Model Interventions in Multilingual Language Models Anirudh Sundar et.al. 2502.15639 null
2025-02-21 Mantis: Lightweight Calibrated Foundation Model for User-Friendly Time Series Classification Vasilii Feofanov et.al. 2502.15637 link
2025-02-21 The Relationship Between Reasoning and Performance in Large Language Models -- o3 (mini) Thinks Harder, Not Longer Marthe Ballon et.al. 2502.15631 link
2025-02-21 Extraction multi-étiquettes de relations en utilisant des couches de Transformer Ngoc Luyen Le et.al. 2502.15619 null
2025-02-21 Probe Pruning: Accelerating LLMs through Dynamic Pruning via Model-Probing Qi Le et.al. 2502.15618 link
2025-02-21 PDeepPP: A Deep learning framework with Pretrained Protein language for peptide classification Jixiu Zhai et.al. 2502.15610 link
2025-02-21 On the Robustness of Transformers against Context Hijacking for Linear Classification Tianle Li et.al. 2502.15609 null
2025-02-21 Cross-Format Retrieval-Augmented Generation in XR with LLMs for Context-Aware Maintenance Assistance Akos Nagy et.al. 2502.15604 null
2025-02-21 Do Multilingual LLMs Think In English? Lisa Schut et.al. 2502.15603 null
2025-02-21 WorldCraft: Photo-Realistic 3D World Creation and Customization via LLM Agents Xinhang Liu et.al. 2502.15601 null
2025-02-21 SafeInt: Shielding Large Language Models from Jailbreak Attacks via Safety-Aware Representation Intervention Jiaqi Wu et.al. 2502.15594 null
2025-02-20 LServe: Efficient Long-sequence LLM Serving with Unified Sparse Attention Shang Yang et.al. 2502.14866 link
2025-02-20 Aligning LLMs to Ask Good Questions A Case Study in Clinical Reasoning Shuyue Stella Li et.al. 2502.14860 link
2025-02-20 FR-Spec: Accelerating Large-Vocabulary Language Models via Frequency-Ranked Speculative Sampling Weilin Zhao et.al. 2502.14856 null
2025-02-20 Prompt-to-Leaderboard Evan Frick et.al. 2502.14855 link
2025-02-20 GATE: Graph-based Adaptive Tool Evolution Across Diverse Tasks Jianwen Luo et.al. 2502.14848 link
2025-02-20 Red-Teaming LLM Multi-Agent Systems via Communication Attacks Pengfei He et.al. 2502.14847 null
2025-02-20 Scaling Text-Rich Image Understanding via Code-Guided Synthetic Multimodal Data Generation Yue Yang et.al. 2502.14846 null
2025-02-20 Revealing and Mitigating Over-Attention in Knowledge Editing Pinzheng Wang et.al. 2502.14838 link
2025-02-20 LongWriter-V: Enabling Ultra-Long and High-Fidelity Generation in Vision-Language Models Shangqing Tu et.al. 2502.14834 link
2025-02-20 Middle-Layer Representation Alignment for Cross-Lingual Transfer in Fine-Tuned LLMs Danni Liu et.al. 2502.14830 link
2025-02-20 Measuring Faithfulness of Chains of Thought by Unlearning Reasoning Steps Martin Tutek et.al. 2502.14829 link
2025-02-20 Exploring Advanced Techniques for Visual Question Answering: A Comprehensive Comparison Aiswarya Baby et.al. 2502.14827 null
2025-02-20 A Survey of Model Architectures in Information Retrieval Zhichao Xu et.al. 2502.14822 null
2025-02-20 eC-Tab2Text: Aspect-Based Text Generation from e-Commerce Product Tables Luis Antonio Gutiérrez Guanilo et.al. 2502.14820 null
2025-02-20 Dynamic Low-Rank Sparse Adaptation for Large Language Models Weizhong Huang et.al. 2502.14816 link
2025-02-20 FetalCLIP: A Visual-Language Foundation Model for Fetal Ultrasound Image Analysis Fadillah Maani et.al. 2502.14807 link
2025-02-20 From RAG to Memory: Non-Parametric Continual Learning for Large Language Models Bernal Jiménez Gutiérrez et.al. 2502.14802 link
2025-02-20 A Multi-Agent Perspective on Modern Information Retrieval Haya Nachimovsky et.al. 2502.14796 null
2025-02-20 Rapid Word Learning Through Meta In-Context Learning Wentao Wang et.al. 2502.14791 null
2025-02-20 SigLIP 2: Multilingual Vision-Language Encoders with Improved Semantic Understanding, Localization, and Dense Features Michael Tschannen et.al. 2502.14786 link
2025-02-19 Where's the Bug? Attention Probing for Scalable Fault Localization Adam Stein et.al. 2502.13966 null
2025-02-19 Autellix: An Efficient Serving Engine for LLM Agents as General Programs Michael Luo et.al. 2502.13965 null
2025-02-19 MuDAF: Long-Context Multi-Document Attention Focusing through Contrastive Learning on Attention Heads Weihao Liu et.al. 2502.13963 link
2025-02-19 Is That Your Final Answer? Test-Time Scaling Improves Selective Question Answering William Jurayj et.al. 2502.13962 null
2025-02-19 LIDDIA: Language-based Intelligent Drug Discovery Agent Reza Averly et.al. 2502.13959 null
2025-02-19 Neurosymbolic artificial intelligence via large language models and coherence-driven inference Steve Huntsman et.al. 2502.13953 null
2025-02-19 Why Safeguarded Ships Run Aground? Aligned Large Language Models' Safety Mechanisms Tend to Be Anchored in The Template Region Chak Tou Leong et.al. 2502.13946 null
2025-02-19 A Chain-of-Thought Subspace Meta-Learning for Few-shot Image Captioning with Large Vision and Language Models Hao Huang et.al. 2502.13942 null
2025-02-19 Symmetrical Visual Contrastive Optimization: Aligning Vision-Language Models with Minimal Contrastive Images Shengguang Wu et.al. 2502.13928 null
2025-02-19 Beyond Single Frames: Can LMMs Comprehend Temporal and Contextual Narratives in Image Sequences? Xiaochen Wang et.al. 2502.13925 null
2025-02-19 LongPO: Long Context Self-Evolution of Large Language Models through Short-to-Long Preference Optimization Guanzheng Chen et.al. 2502.13922 link
2025-02-19 Exploring Code Language Models for Automated HLS-based Hardware Generation: Benchmark, Infrastructure and Analysis Jiahao Gai et.al. 2502.13921 null
2025-02-19 Exploring Personalized Health Support through Data-Driven, Theory-Guided LLMs: A Case Study in Sleep Health Xingbo Wang et.al. 2502.13920 link
2025-02-19 TESS 2: A Large-Scale Generalist Diffusion Language Model Jaesung Tae et.al. 2502.13917 link
2025-02-19 How Do LLMs Perform Two-Hop Reasoning in Context? Tianyu Guo et.al. 2502.13913 null
2025-02-19 Lost in Sequence: Do Large Language Models Understand Sequential Recommendation? Sein Kim et.al. 2502.13909 link
2025-02-19 Judging the Judges: A Collection of LLM-Generated Relevance Judgements Hossein A. Rahmani et.al. 2502.13908 link
2025-02-19 DataSciBench: An LLM Agent Benchmark for Data Science Dan Zhang et.al. 2502.13897 link
2025-02-19 NavigateDiff: Visual Predictors are Zero-Shot Navigation Assistants Yiran Qin et.al. 2502.13894 null
2025-02-19 Refining embeddings with fill-tuning: data-efficient generalised performance improvements for materials foundation models Matthew P. Wilson et.al. 2502.13886 link
2025-02-18 Re-Align: Aligning Vision Language Models via Retrieval-Augmented Direct Preference Optimization Shuo Xing et.al. 2502.13146 link
2025-02-18 Multimodal Mamba: Decoder-only Multimodal State Space Model via Quadratic to Linear Distillation Bencheng Liao et.al. 2502.13145 link
2025-02-18 Pre-training Auto-regressive Robotic Models with 4D Representations Dantong Niu et.al. 2502.13142 null
2025-02-18 UniGuardian: A Unified Defense for Detecting Prompt Injection, Backdoor Attacks and Adversarial Attacks in Large Language Models Huawei Lin et.al. 2502.13141 link
2025-02-18 AIDE: AI-Driven Exploration in the Space of Code Zhengyao Jiang et.al. 2502.13138 link
2025-02-18 Theorem Prover as a Judge for Synthetic Data Generation Joshua Ong Jun Leang et.al. 2502.13137 null
2025-02-18 Sleepless Nights, Sugary Days: Creating Synthetic Users with Health Conditions for Realistic Coaching Agent Interactions Taedong Yun et.al. 2502.13135 null
2025-02-18 Learning to Defer for Causal Discovery with Imperfect Experts Oscar Clivio et.al. 2502.13132 null
2025-02-18 Rethinking Diverse Human Preference Learning through Principal Component Analysis Feng Luo et.al. 2502.13131 null
2025-02-18 Magma: A Foundation Model for Multimodal AI Agents Jianwei Yang et.al. 2502.13130 link
2025-02-18 Facilitating Long Context Understanding via Supervised Chain-of-Thought Reasoning Jingyang Lin et.al. 2502.13127 null
2025-02-18 RuozhiBench: Evaluating LLMs with Logical Fallacies and Misleading Premises Zenan Zhai et.al. 2502.13125 link
2025-02-18 Adapting Psycholinguistic Research for LLMs: Gender-inclusive Language in a Coreference Context Marion Bartl et.al. 2502.13120 null
2025-02-18 STEER-ME: Assessing the Microeconomic Reasoning of Large Language Models Narun Raman et.al. 2502.13119 null
2025-02-18 Performance Evaluation of Large Language Models in Statistical Programming Xinyi Song et.al. 2502.13117 link
2025-02-18 MatterChat: A Multi-Modal LLM for Material Science Yingheng Tang et.al. 2502.13107 null
2025-02-18 Understanding and Rectifying Safety Perception Distortion in VLMs Xiaohan Zou et.al. 2502.13095 null
2025-02-18 Text2World: Benchmarking Large Language Models for Symbolic World Model Generation Mengkang Hu et.al. 2502.13092 null
2025-02-18 KAPPA: A Generic Patent Analysis Framework with Keyphrase-Based Portraits Xin Xia et.al. 2502.13076 null
2025-02-18 Cramming 1568 Tokens into a Single Vector and Back Again: Exploring the Limits of Embedding Space Capacity Yuri Kuratov et.al. 2502.13063 link
2025-02-17 Idiosyncrasies in Large Language Models Mingjie Sun et.al. 2502.12150 link
2025-02-17 HermesFlow: Seamlessly Closing the Gap in Multimodal Understanding and Generation Ling Yang et.al. 2502.12148 link
2025-02-17 Fast or Better? Balancing Accuracy and Cost in Retrieval-Augmented Generation with Flexible User Control Jinyan Su et.al. 2502.12145 link
2025-02-17 Small Models Struggle to Learn from Strong Reasoners Yuetai Li et.al. 2502.12143 null
2025-02-17 SoftCoT: Soft Chain-of-Thought for Efficient Reasoning with LLMs Yige Xu et.al. 2502.12134 link
2025-02-17 Transformer Dynamics: A neuroscientific approach to interpretability of large language models Jesseba Fernando et.al. 2502.12131 null
2025-02-17 Scaling Autonomous Agents via Automatic Reward Modeling And Planning Zhenfang Chen et.al. 2502.12130 null
2025-02-17 On the Query Complexity of Verifier-Assisted Language Generation Edoardo Botta et.al. 2502.12123 null
2025-02-17 Minimal Ranks, Maximum Confidence: Parameter-efficient Uncertainty Quantification for LoRA Patryk Marszałek et.al. 2502.12122 link
2025-02-17 LLMs on the Line: Data Determines Loss-to-Loss Scaling Laws Prasanna Mayilvahanan et.al. 2502.12120 null
2025-02-17 PRISM: Self-Pruning Intrinsic Selection Method for Training-Free Multimodal Data Selection Jinhe Bi et.al. 2502.12119 null
2025-02-17 A-MEM: Agentic Memory for LLM Agents Wujiang Xu et.al. 2502.12110 link
2025-02-17 Personality Structured Interview for Large Language Model Simulation in Personality Research Pengda Wang et.al. 2502.12109 null
2025-02-17 Relational Norms for Human-AI Cooperation Brian D. Earp et.al. 2502.12102 null
2025-02-17 Token Communications: A Unified Framework for Cross-modal Context-aware Semantic Communications Li Qiao et.al. 2502.12096 null
2025-02-17 Descriminative-Generative Custom Tokens for Vision-Language Models Pramuditha Perera et.al. 2502.12095 null
2025-02-17 Meta-Statistical Learning: Supervised Learning of Statistical Inference Maxime Peyrard et.al. 2502.12088 null
2025-02-17 APB: Accelerating Distributed Long-Context Inference by Passing Compressed Context Blocks across GPUs Yuxiang Huang et.al. 2502.12085 link
2025-02-17 VLM$^2$-Bench: A Closer Look at How Well VLMs Implicitly Link Explicit Matching Visual Cues Jianshu Zhang et.al. 2502.12084 null
2025-02-17 AdaSplash: Adaptive Sparse Flash Attention Nuno Gonçalves et.al. 2502.12082 link
2025-02-14 MM-RLHF: The Next Step Forward in Multimodal LLM Alignment Yi-Fan Zhang et.al. 2502.10391 null
2025-02-14 Aspect-Oriented Summarization for Psychiatric Short-Term Readmission Prediction WonJin Yoon et.al. 2502.10388 null
2025-02-14 Unknown Word Detection for English as a Second Language (ESL) Learners Using Gaze and Pre-trained Language Models Jiexin Ding et.al. 2502.10378 null
2025-02-14 Robustness tests for biomedical foundation models should tailor to specification R. Patrick Xian et.al. 2502.10374 link
2025-02-14 Enhancing Multilingual LLM Pretraining with Model-Based Data Selection Bettina Messmer et.al. 2502.10361 null
2025-02-14 Organize the Web: Constructing Domains Enhances Pre-Training Data Curation Alexander Wettig et.al. 2502.10341 null
2025-02-14 Evaluating the Meta- and Object-Level Reasoning of Large Language Models for Question Answering Nick Ferguson et.al. 2502.10338 null
2025-02-14 LLM-Powered Preference Elicitation in Combinatorial Assignment Ermis Soumalias et.al. 2502.10308 null
2025-02-14 SPIRIT: Short-term Prediction of solar IRradIance for zero-shot Transfer learning using Foundation Models Aditya Mishra et.al. 2502.10307 null
2025-02-14 Open-Source AI-Powered Optimization in Scalene: Advancing Python Performance Profiling with DeepSeek-R1 and LLaMA 3.2 Saem Hasan et.al. 2502.10299 null
2025-02-14 DeltaProduct: Increasing the Expressivity of DeltaNet Through Products of Householders Julien Siems et.al. 2502.10297 link
2025-02-14 Probing Perceptual Constancy in Large Vision Language Models Haoran Sun et.al. 2502.10273 null
2025-02-14 Are Large Language Models the future crowd workers of Linguistics? Iris Ferrazzo et.al. 2502.10266 null
2025-02-14 Large Language Models and Synthetic Data for Monitoring Dataset Mentions in Research Papers Aivin V. Solatorio et.al. 2502.10263 link
2025-02-14 VisCon-100K: Leveraging Contextual Web Data for Fine-tuning Vision Language Models Gokul Karthik Kumar et.al. 2502.10250 null
2025-02-14 Step-Video-T2V Technical Report: The Practice, Challenges, and Future of Video Foundation Model Guoqing Ma et.al. 2502.10248 link
2025-02-14 Efficient Zero-Order Federated Finetuning of Language Models for Resource-Constrained Devices Mohamed Aboelenien Ahmed et.al. 2502.10239 null
2025-02-14 AdaPTS: Adapting Univariate Foundation Models to Probabilistic Multivariate Time Series Forecasting Abdelhakim Benechehab et.al. 2502.10235 link
2025-02-14 Do Large Language Models Reason Causally Like Us? Even Better? Hanna M. Dettki et.al. 2502.10215 null
2025-02-14 Can Post-Training Quantization Benefit from an Additional QLoRA Integration? Xiliang Zhu et.al. 2502.10202 null
2025-02-13 Theoretical Benefit and Limitation of Diffusion Language Model Guhao Feng et.al. 2502.09622 null
2025-02-13 MME-CoT: Benchmarking Chain-of-Thought in Large Multimodal Models for Reasoning Quality, Robustness, and Efficiency Dongzhi Jiang et.al. 2502.09621 null
2025-02-13 Exploring the Potential of Encoder-free Architectures in 3D LMMs Yiwen Tang et.al. 2502.09620 link
2025-02-13 Human-LLM Coevolution: Evidence from Academic Writing Mingmeng Geng et.al. 2502.09606 null
2025-02-13 SelfCite: Self-Supervised Alignment for Context Attribution in Large Language Models Yung-Sung Chuang et.al. 2502.09604 link
2025-02-13 GAIA: A Global, Multi-modal, Multi-scale Vision-Language Dataset for Remote Sensing Image Analysis Angelos Zavras et.al. 2502.09598 link
2025-02-13 Do LLMs Recognize Your Preferences? Evaluating Personalized Preference Following in LLMs Siyan Zhao et.al. 2502.09597 link
2025-02-13 KIMAs: A Configurable Knowledge Integrated Multi-Agent System Zitao Li et.al. 2502.09596 null
2025-02-13 Logical forms complement probability in understanding language model (and human) performance Yixuan Wang et.al. 2502.09589 null
2025-02-13 Polymind: Parallel Visual Diagramming with Large Language Models to Support Prewriting Through Microtasks Qian Wan et.al. 2502.09577 null
2025-02-13 MorphNLI: A Stepwise Approach to Natural Language Inference Using Text Morphing Vlad Andrei Negru et.al. 2502.09567 null
2025-02-13 Zero-shot generation of synthetic neurosurgical data with large language models Austin A. Barr et.al. 2502.09566 link
2025-02-13 MDCrow: Automating Molecular Dynamics Workflows with Large Language Models Quintina Campbell et.al. 2502.09565 link
2025-02-13 EmbodiedBench: Comprehensive Benchmarking Multi-modal Large Language Models for Vision-Driven Embodied Agents Rui Yang et.al. 2502.09560 null
2025-02-13 Explainable AI-assisted Optimization for Feynman Integral Reduction Zhuo-Yang Song et.al. 2502.09544 null
2025-02-13 Mind the Gap! Choice Independence in Using Multilingual LLMs for Persuasive Co-Writing Tasks in Different Languages Shreyan Biswas et.al. 2502.09532 null
2025-02-13 When and How Does CLIP Enable Domain and Compositional Generalization? Elias Kempf et.al. 2502.09507 link
2025-02-13 Improve LLM-based Automatic Essay Scoring with Linguistic Features Zhaoyi Joey Hou et.al. 2502.09497 null
2025-02-13 Foundation Neural-Network Quantum States Riccardo Rende et.al. 2502.09488 null
2025-02-13 Objective quantification of mood states using large language models Jakub Onysk et.al. 2502.09487 null
2025-02-12 SwiftSketch: A Diffusion Model for Image-to-Vector Sketch Generation Ellie Arar et.al. 2502.08642 null
2025-02-12 Examining Multilingual Embedding Models Cross-Lingually Through LLM-Generated Adversarial Examples Andrianos Michail et.al. 2502.08638 null
2025-02-12 Ensemble based approach to quantifying uncertainty of LLM based classifications Srijith Rajamohan et.al. 2502.08631 null
2025-02-12 Continuous Cardiac Arrest Prediction in ICU using PPG Foundation Model Saurabh Kataria et.al. 2502.08612 null
2025-02-12 Causal Analysis of ASR Errors for Children: Quantifying the Impact of Physiological, Cognitive, and Extrinsic Factors Vishwanath Pratap Singh et.al. 2502.08587 null
2025-02-12 Commercial LLM Agents Are Already Vulnerable to Simple Yet Dangerous Attacks Ang Li et.al. 2502.08586 null
2025-02-12 COAST: Intelligent Time-Adaptive Neural Operators Zhikai Wu et.al. 2502.08574 null
2025-02-12 QA-Expand: Multi-Question Answer Generation for Enhanced Query Expansion in Information Retrieval Wonduk Seo et.al. 2502.08557 null
2025-02-12 Human-Centric Foundation Models: Perception, Generation and Agentic Modeling Shixiang Tang et.al. 2502.08556 link
2025-02-12 Fostering Appropriate Reliance on Large Language Models: The Role of Explanations, Sources, and Inconsistencies Sunnie S. Y. Kim et.al. 2502.08554 null
2025-02-12 LLMs can implicitly learn from mistakes in-context Lisa Alazraki et.al. 2502.08550 null
2025-02-12 Representation Learning to Advance Multi-institutional Studies with Electronic Health Record Data Doudou Zhou et.al. 2502.08547 null
2025-02-12 Moment of Untruth: Dealing with Negative Queries in Video Moment Retrieval Kevin Flanagan et.al. 2502.08544 link
2025-02-12 LLM Pretraining with Continuous Concepts Jihoon Tack et.al. 2502.08524 null
2025-02-12 The Paradox of Stochasticity: Limited Creativity and Computational Decoupling in Temperature-Varied LLM Outputs of Structured Fictional Data Evgenii Evstafev et.al. 2502.08515 null
2025-02-12 Faithful, Unfaithful or Ambiguous? Multi-Agent Debate with Initial Stance for Summary Evaluation Mahnaz Koupaee et.al. 2502.08514 link
2025-02-12 Measuring Diversity in Synthetic Datasets Yuchang Zhu et.al. 2502.08512 link
2025-02-12 Explanation based In-Context Demonstrations Retrieval for Multilingual Grammatical Error Correction Wei Li et.al. 2502.08507 link
2025-02-12 Salamandra Technical Report Aitor Gonzalez-Agirre et.al. 2502.08489 link
2025-02-12 One-Shot Federated Learning with Classifier-Free Diffusion Models Obaidullah Zaland et.al. 2502.08488 null
2025-02-11 DarwinLM: Evolutionary Structured Pruning of Large Language Models Shengkun Tang et.al. 2502.07780 link
2025-02-11 Auditing Prompt Caching in Language Model APIs Chenchen Gu et.al. 2502.07776 link
2025-02-11 Automatic Robot Task Planning by Integrating Large Language Model with Genetic Programming Azizjon Kobilov et.al. 2502.07772 null
2025-02-11 Breaking Down Bias: On The Limits of Generalizable Pruning Strategies Sibo Ma et.al. 2502.07771 null
2025-02-11 Great Power Brings Great Responsibility: Personalizing Conversational AI for Diverse Problem-Solvers Italo Santos et.al. 2502.07763 null
2025-02-11 Scalable Fingerprinting of Large Language Models Anshul Nasery et.al. 2502.07760 null
2025-02-11 Towards Efficient Optimizer Design for LLM via Structured Fisher Approximation with a Low-Rank Extension Wenbo Gong et.al. 2502.07752 null
2025-02-11 WHODUNIT: Evaluation benchmark for culprit detection in mystery stories Kshitij Gupta et.al. 2502.07747 link
2025-02-11 The Economics of Large Language Models: Token Allocation, Fine-Tuning, and Optimal Pricing Dirk Bergemann et.al. 2502.07736 null
2025-02-11 Economics of Sourcing Human Data Sebastin Santy et.al. 2502.07732 null
2025-02-11 Verifying LLM-Generated Code in the Context of Software Verification with Ada/SPARK Marcos Cramer et.al. 2502.07728 null
2025-02-11 Making Language Models Robust Against Negation MohammadHossein Rezaei et.al. 2502.07717 link
2025-02-11 Magic 1-For-1: Generating One Minute Video Clips within One Minute Hongwei Yi et.al. 2502.07701 link
2025-02-11 A Framework for LLM-powered Design Assistants Swaroop Panda et.al. 2502.07698 null
2025-02-11 Large Language Models as Proxies for Theories of Human Linguistic Cognition Imry Ziv et.al. 2502.07687 null
2025-02-11 SymGPT: Auditing Smart Contracts via Combining Symbolic Execution with Large Language Models Shihao Xia et.al. 2502.07644 null
2025-02-11 FoQA: A Faroese Question-Answering Dataset Annika Simonsen et.al. 2502.07642 null
2025-02-11 Goedel-Prover: A Frontier Model for Open-Source Automated Theorem Proving Yong Lin et.al. 2502.07640 link
2025-02-11 Exploring Mobile Touch Interaction with Large Language Models Tim Zindulka et.al. 2502.07629 null
2025-02-11 Scaling Pre-training to One Hundred Billion Data for Vision Language Models Xiao Wang et.al. 2502.07617 null
2025-02-10 EVEv2: Improved Baselines for Encoder-Free Vision-Language Models Haiwen Diao et.al. 2502.06788 link
2025-02-10 Visual Agentic AI for Spatial Reasoning with a Dynamic API Damiano Marsili et.al. 2502.06787 null
2025-02-10 DeepCrossAttention: Supercharging Transformer Residual Connections Mike Heddes et.al. 2502.06785 null
2025-02-10 Towards Internet-Scale Training For Agents Brandon Trabucco et.al. 2502.06776 null
2025-02-10 Enhancing Trust in Language Model-Based Code Optimization through RLHF: A Research Design Jingzhi Gong et.al. 2502.06769 null
2025-02-10 Exploiting Sparsity for Long Context Inference: Million Token Contexts on Commodity GPUs Ryan Synk et.al. 2502.06766 link
2025-02-10 Rationalization Models for Text-to-SQL Gaetano Rossiello et.al. 2502.06759 null
2025-02-10 Accelerating Data Processing and Benchmarking of AI Models for Pathology Andrew Zhang et.al. 2502.06750 link
2025-02-10 Gradient Multi-Normalization for Stateless and Scalable LLM Training Meyer Scetbon et.al. 2502.06742 null
2025-02-10 VersaPRM: Multi-Domain Process Reward Model via Synthetic Reasoning Data Thomas Zeng et.al. 2502.06737 null
2025-02-10 Dynamic Loss-Based Sample Reweighting for Improved Large Language Model Pretraining Daouda Sow et.al. 2502.06733 null
2025-02-10 Can 1B LLM Surpass 405B LLM? Rethinking Compute-Optimal Test-Time Scaling Runze Liu et.al. 2502.06703 link
2025-02-10 EquiTabPFN: A Target-Permutation Equivariant Prior Fitted Networks Michael Arbel et.al. 2502.06684 null
2025-02-10 Boosting Self-Efficacy and Performance of Large Language Models via Verbal Efficacy Stimulations Rui Chen et.al. 2502.06669 null
2025-02-10 Automatic Evaluation of Healthcare LLMs Beyond Question-Answering Anna Arias-Duart et.al. 2502.06666 null
2025-02-10 Evaluation of Deep Audio Representations for Hearables Fabian Gröger et.al. 2502.06664 null
2025-02-10 EfficientLLM: Scalable Pruning-Aware Pretraining for Architecture-Agnostic Edge Language Models Xingrun Xing et.al. 2502.06663 null
2025-02-10 Unbiased Evaluation of Large Language Models from a Causal Perspective Meilin Chen et.al. 2502.06655 null
2025-02-10 In-Context Learning (and Unlearning) of Length Biases Stephanie Schoch et.al. 2502.06653 null
2025-02-10 Transparent NLP: Using RAG and LLM Alignment for Privacy Q&A Anna Leschanowsky et.al. 2502.06652 null
2025-02-07 Long-VITA: Scaling Large Multi-modal Models to 1 Million Tokens with Leading Short-Context Accuracy Yunhang Shen et.al. 2502.05177 link
2025-02-07 Scaling up Test-Time Compute with Latent Reasoning: A Recurrent Depth Approach Jonas Geiping et.al. 2502.05171 link
2025-02-07 NoLiMa: Long-Context Evaluation Beyond Literal Matching Ali Modarressi et.al. 2502.05167 link
2025-02-07 Multitwine: Multi-Object Compositing with Text and Layout Control Gemma Canet Tarrés et.al. 2502.05165 null
2025-02-07 DuoGuard: A Two-Player RL-Driven Framework for Multilingual LLM Guardrails Yihe Deng et.al. 2502.05163 link
2025-02-07 A Lightweight Method to Disrupt Memorized Sequences in LLM Parjanya Prajakta Prashant et.al. 2502.05159 null
2025-02-07 Transforming Science with Large Language Models: A Survey on AI-assisted Scientific Discovery, Experimentation, Content Generation, and Evaluation Steffen Eger et.al. 2502.05151 link
2025-02-07 CodeSCM: Causal Analysis for Multi-Modal Code Generation Mukur Gupta et.al. 2502.05150 link
2025-02-07 An Annotated Reading of 'The Singer of Tales' in the LLM Era Kush R. Varshney et.al. 2502.05148 null
2025-02-07 Chest X-ray Foundation Model with Global and Local Representations Integration Zefan Yang et.al. 2502.05142 link
2025-02-07 Refining Integration-by-Parts Reduction of Feynman Integrals with Machine Learning Matt von Hippel et.al. 2502.05121 null
2025-02-07 Flexible and Efficient Grammar-Constrained Decoding Kanghee Park et.al. 2502.05111 null
2025-02-07 Lost in Time: Clock and Calendar Understanding Challenges in Multimodal LLMs Rohit Saxena et.al. 2502.05092 null
2025-02-07 DCFormer: Efficient 3D Vision-Language Modeling with Decomposed Convolutions Gorkem Can Ates et.al. 2502.05091 null
2025-02-07 Mitigating Unintended Memorization with LoRA in Federated Learning for LLMs Thierry Bossy et.al. 2502.05087 link
2025-02-07 Causality can systematically address the monsters under the bench(marks) Felix Leeb et.al. 2502.05085 null
2025-02-07 ChallengeMe: An Adversarial Learning-enabled Text Summarization Framework Xiaoyu Deng et.al. 2502.05084 null
2025-02-07 Adaptive Graph of Thoughts: Test-Time Adaptive Reasoning Unifying Chain, Tree, and Graph Structures Tushar Pandey et.al. 2502.05078 link
2025-02-07 nvAgent: Automated Data Visualization from Natural Language via Collaborative Agent Workflow Geliang Ouyang et.al. 2502.05036 link
2025-02-07 EnseSmells: Deep ensemble and programming language models for automated code smells detection Anh Ho et.al. 2502.05012 link
2025-02-06 Ola: Pushing the Frontiers of Omni-Modal Language Model with Progressive Modality Alignment Zuyan Liu et.al. 2502.04328 link
2025-02-06 Speak Easy: Eliciting Harmful Jailbreaks from LLMs with Simple Interactions Yik Siu Chan et.al. 2502.04322 link
2025-02-06 ConceptAttention: Diffusion Transformers Learn Highly Interpretable Features Alec Helbling et.al. 2502.04320 link
2025-02-06 sshELF: Single-Shot Hierarchical Extrapolation of Latent Features for 3D Reconstruction from Sparse-Views Eyvaz Najafli et.al. 2502.04318 null
2025-02-06 ChamaleonLLM: Batch-Aware Dynamic Low-Rank Adaptation via Inference-Time Clusters Kamer Ali Yuksel et.al. 2502.04315 link
2025-02-06 Great Models Think Alike and this Undermines AI Oversight Shashwat Goel et.al. 2502.04313 link
2025-02-06 ScoreFlow: Mastering LLM Agent Workflows via Score-based Preference Optimization Yinjie Wang et.al. 2502.04306 link
2025-02-06 Beyond Prompt Content: Enhancing LLM Performance via Content-Format Integrated Prompt Optimization Yuanye Liu et.al. 2502.04295 link
2025-02-06 PILAF: Optimal Human Preference Sampling for Reward Modeling Yunzhen Feng et.al. 2502.04270 null
2025-02-06 How does a Multilingual LM Handle Multiple Languages? Santhosh Kakarla et.al. 2502.04269 null
2025-02-06 Cross the Gap: Exposing the Intra-modal Misalignment in CLIP via Modality Inversion Marco Mistretta et.al. 2502.04263 link
2025-02-06 Efficient Randomized Experiments Using Foundation Models Piersilvio De Bartolomeis et.al. 2502.04262 link
2025-02-06 MAGA: MAssive Genre-Audience Reformulation to Pretraining Corpus Expansion Xintong Hao et.al. 2502.04235 null
2025-02-06 Can LLMs Hack Enterprise Networks? Autonomous Assumed Breach Penetration-Testing Active Directory Networks Andreas Happe et.al. 2502.04227 link
2025-02-06 Keep It Light! Simplifying Image Clustering Via Text-Free Adapters Yicen Li et.al. 2502.04226 null
2025-02-06 Éclair -- Extracting Content and Layout with Integrated Reading Order for Documents Ilia Karmanov et.al. 2502.04223 null
2025-02-06 Sports and Women's Sports: Gender Bias in Text Generation with Olympic Data Laura Biester et.al. 2502.04218 null
2025-02-06 Algorithmic causal structure emerging through compression Liang Wendong et.al. 2502.04210 null
2025-02-06 "Short-length" Adversarial Training Helps LLMs Defend "Long-length" Jailbreak Attacks: Theoretical and Empirical Evidence Shaopeng Fu et.al. 2502.04204 link
2025-02-06 The Best Instruction-Tuning Data are Those That Fit Dylan Zhang et.al. 2502.04194 null
2025-02-05 Do Large Language Model Benchmarks Test Reliability? Joshua Vendrow et.al. 2502.03461 link
2025-02-05 Adapt-Pruner: Adaptive Structural Pruning for Efficient Small Language Model Training Boyao Wang et.al. 2502.03460 null
2025-02-05 SKI Models: Skeleton Induced Vision-Language Embeddings for Understanding Activities of Daily Living Arkaprava Sinha et.al. 2502.03459 null
2025-02-05 A Schema-Guided Reason-while-Retrieve framework for Reasoning on Scene Graphs with Large-Language-Models (LLMs) Yiye Chen et.al. 2502.03450 null
2025-02-05 BFS-Prover: Scalable Best-First Tree Search for LLM-based Automatic Theorem Proving Ran Xin et.al. 2502.03438 null
2025-02-05 On Fairness of Unified Multimodal Large Language Model for Image Generation Ming Liu et.al. 2502.03429 null
2025-02-05 Harnessing Large Language Models for Curated Code Reviews Oussama Ben Sghaier et.al. 2502.03425 link
2025-02-05 Think or Step-by-Step? UnZIPping the Black Box in Zero-Shot Prompts Nikta Gohari Sadr et.al. 2502.03418 null
2025-02-05 SPRI: Aligning Large Language Models with Context-Situated Principles Hongli Zhan et.al. 2502.03397 null
2025-02-05 Benchmarking Time Series Forecasting Models: From Statistical Techniques to Foundation Models in Real-World Applications Issar Arab et.al. 2502.03395 null
2025-02-05 LIMO: Less is More for Reasoning Yixin Ye et.al. 2502.03387 link
2025-02-05 Transformers and Their Roles as Time Series Foundation Models Dennis Wu et.al. 2502.03383 null
2025-02-05 High-Fidelity Simultaneous Speech-To-Speech Translation Tom Labiausse et.al. 2502.03382 link
2025-02-05 Demystifying Long Chain-of-Thought Reasoning in LLMs Edward Yeo et.al. 2502.03373 link
2025-02-05 PalimpChat: Declarative and Interactive AI analytics Chunwei Liu et.al. 2502.03368 null
2025-02-05 Minerva: A Programmable Memory Test Benchmark for Language Models Menglin Xia et.al. 2502.03358 null
2025-02-05 RadVLM: A Multitask Conversational Vision-Language Model for Radiology Nicolas Deperrois et.al. 2502.03333 null
2025-02-05 ECM: A Unified Electronic Circuit Model for Explaining the Emergence of In-Context Learning and Chain-of-Thought in Large Language Model Qiguang Chen et.al. 2502.03325 null
2025-02-05 Out-of-Distribution Detection using Synthetic Data Generation Momin Abbas et.al. 2502.03323 null
2025-02-05 Simplifying Formal Proof-Generating Models with ChatGPT and Basic Searching Techniques Sangjun Han et.al. 2502.03321 null
2025-02-04 Articulate AnyMesh: Open-Vocabulary 3D Articulated Objects Modeling Xiaowen Qiu et.al. 2502.02590 null
2025-02-04 COCONut-PanCap: Joint Panoptic Segmentation and Grounded Captions for Fine-Grained Understanding and Generation Xueqing Deng et.al. 2502.02589 null
2025-02-04 A comparison of translation performance between DeepL and Supertext Alex Flückiger et.al. 2502.02577 link
2025-02-04 Are Language Models Up to Sequential Optimization Problems? From Evaluation to a Hegelian-Inspired Enhancement Soheil Abbasloo et.al. 2502.02573 null
2025-02-04 Learning the RoPEs: Better 2D and 3D Position Encodings with STRING Connor Schenck et.al. 2502.02562 null
2025-02-04 Mosaic3D: Foundation Dataset and Model for Open-Vocabulary 3D Segmentation Junha Lee et.al. 2502.02548 null
2025-02-04 LLMs for Generation of Architectural Components: An Exploratory Empirical Study in the Serverless World Shrikara Arun et.al. 2502.02539 null
2025-02-04 Adaptive Self-improvement LLM Agentic System for ML Library Development Genghan Zhang et.al. 2502.02534 link
2025-02-04 Multi-Agent Design: Optimizing Agents with Better Prompts and Topologies Han Zhou et.al. 2502.02533 null
2025-02-04 Satori: Reinforcement Learning with Chain-of-Action-Thought Enhances LLM Reasoning via Autoregressive Search Maohao Shen et.al. 2502.02508 null
2025-02-04 Analyzing Similarity Metrics for Data Selection for Language Model Pretraining Dylan Sam et.al. 2502.02494 null
2025-02-04 EasySpec: Layer-Parallel Speculative Decoding for Efficient Multi-GPU Utilization Yize Wu et.al. 2502.02493 null
2025-02-04 Multilingual Machine Translation with Open Large Language Models at Practical Scale: An Empirical Study Menglong Cui et.al. 2502.02481 null
2025-02-04 Mind the Gap: Evaluating Patch Embeddings from General-Purpose and Histopathology Foundation Models for Cell Segmentation and Classification Valentina Vadori et.al. 2502.02471 link
2025-02-04 Modular Training of Neural Networks aids Interpretability Satvik Golechha et.al. 2502.02470 null
2025-02-04 SAISA: Towards Multimodal Large Language Models with Both Training and Inference Efficiency Qianhao Yuan et.al. 2502.02458 link
2025-02-04 IMDPrompter: Adapting SAM to Image Manipulation Detection by Cross-View Automated Prompt Learning Quan Zhang et.al. 2502.02454 null
2025-02-04 Personalization Toolkit: Training Free Personalization of Large Vision Language Models Soroush Seifi et.al. 2502.02452 null
2025-02-04 Beyond English: Evaluating Automated Measurement of Moral Foundations in Non-English Discourse with a Chinese Case Study Calvin Yixiang Cheng et.al. 2502.02451 link
2025-02-04 Generative Psycho-Lexical Approach for Constructing Value Systems in Large Language Models Haoran Ye et.al. 2502.02444 null
2025-01-31 Low-Rank Adapting Models for Sparse Autoencoders Matthew Chen et.al. 2501.19406 link
2025-01-31 Vintix: Action Model via In-Context Reinforcement Learning Andrey Polubarov et.al. 2501.19400 link
2025-01-31 Scalable-Softmax Is Superior for Attention Ken M. Nakanishi et.al. 2501.19399 null
2025-01-31 Do LLMs Strategically Reveal, Conceal, and Infer Information? A Theoretical and Empirical Analysis in The Chameleon Game Mustafa O. Karabag et.al. 2501.19398 link
2025-02-03 s1: Simple test-time scaling Niklas Muennighoff et.al. 2501.19393 link
2025-01-31 Cache Me If You Must: Adaptive Key-Value Quantization for Large Language Models Alina Shutova et.al. 2501.19392 link
2025-01-31 Federated Sketching LoRA: On-Device Collaborative Fine-Tuning of Large Language Models Wenzhi Fang et.al. 2501.19389 link
2025-01-31 Decoding-based Regression Xingyou Song et.al. 2501.19383 link
2025-01-31 TableMaster: A Recipe to Advance Table Understanding with Language Models Lang Cao et.al. 2501.19378 null
2025-02-03 SELMA: A Speech-Enabled Language Model for Virtual Assistant Interactions Dominik Wagner et.al. 2501.19377 null
2025-01-31 We're Different, We're the Same: Creative Homogeneity Across LLMs Emily Wenger et.al. 2501.19361 null
2025-01-31 Mechanical Properties of the Meninges: Large Language Model Assisted Systematic Review of over 25,000 Studies Brandon P. Chelstrom et.al. 2501.19359 null
2025-01-31 The Energy Loss Phenomenon in RLHF: A New Perspective on Mitigating Reward Hacking Yuchun Miao et.al. 2501.19358 null
2025-01-31 Towards Adaptive Self-Improvement for Smarter Energy Systems Alexander Sommer et.al. 2501.19340 null
2025-01-31 PixelWorld: Towards Perceiving Everything as Pixels Zhiheng Lyu et.al. 2501.19339 null
2025-01-31 Homogeneity Bias as Differential Sampling Uncertainty in Language Models Messi H. J. Lee et.al. 2501.19337 null
2025-01-31 Reward-Guided Speculative Decoding for Efficient LLM Reasoning Baohao Liao et.al. 2501.19324 null
2025-01-31 MINDSTORES: Memory-Informed Neural Decision Synthesis for Task-Oriented Reinforcement in Embodied Systems Anirudh Chari et.al. 2501.19318 null
2025-01-31 LLM-based Affective Text Generation Quality Based on Different Quantization Values Yarik Menchaca Resendiz et.al. 2501.19317 null
2025-01-31 An Efficient Approach for Machine Translation on Low-resource Languages: A Case Study in Vietnamese-Chinese Tran Ngoc Son et.al. 2501.19314 null
2025-01-30 Foundational Models for 3D Point Clouds: A Survey and Outlook Vishal Thengane et.al. 2501.18594 null
2025-01-30 Advances in Multimodal Adaptation and Generalization: From Traditional Approaches to Foundation Models Hao Dong et.al. 2501.18592 link
2025-01-30 Thoughts Are All Over the Place: On the Underthinking of o1-Like LLMs Yue Wang et.al. 2501.18585 null
2025-01-30 Prediction-Powered Inference with Imputed Covariates and Nonuniform Sampling Dan M. Kluger et.al. 2501.18577 link
2025-01-30 Token-Hungry, Yet Precise: DeepSeek R1 Highlights the Need for Multi-Step Reasoning Over Speed in MATH Evgenii Evstafev et.al. 2501.18576 null
2025-01-30 BounTCHA: A CAPTCHA Utilizing Boundary Identification in AI-extended Videos Lehao Lin et.al. 2501.18565 null
2025-01-30 SAM2Act: Integrating Visual Foundation Model with A Memory Architecture for Robotic Manipulation Haoquan Fang et.al. 2501.18564 link
2025-01-30 Semantic Web and Creative AI -- A Technical Report from ISWS 2023 Raia Abu Ahmad et.al. 2501.18542 null
2025-01-30 Loss Functions and Operators Generated by f-Divergences Vincent Roulet et.al. 2501.18537 null
2025-01-30 Illusions of Relevance: Using Content Injection Attacks to Deceive Retrievers, Rerankers, and LLM Judges Manveer Singh Tamber et.al. 2501.18536 link
2025-01-30 Rethinking Bottlenecks in Safety Fine-Tuning of Vision Language Models Yi Ding et.al. 2501.18533 null
2025-01-30 Differentially Private Steering for Large Language Model Alignment Anmol Goel et.al. 2501.18532 link
2025-01-30 Learn from the Past: Language-conditioned Object Rearrangement with Large Language Models Guanqun Cao et.al. 2501.18516 null
2025-01-30 Streaming DiLoCo with overlapping communication: Towards a Distributed Free Lunch Arthur Douillard et.al. 2501.18512 null
2025-01-30 WILDCHAT-50M: A Deep Dive Into the Role of Synthetic Data in Post-Training Benjamin Feuer et.al. 2501.18511 link
2025-01-30 CLEAR: Cue Learning using Evolution for Accurate Recognition Applied to Sustainability Data Extraction Peter J. Bentley et.al. 2501.18504 null
2025-01-30 A Tool for In-depth Analysis of Code Execution Reasoning of Large Language Models Changshu Liu et.al. 2501.18482 null
2025-01-30 CLoQ: Enhancing Fine-Tuning of Quantized LLMs via Calibrated LoRA Initialization Yanxia Deng et.al. 2501.18475 null
2025-01-30 Tuning Vision Foundation Model via Test-Time Prompt-Guided Training for VFSS Segmentations Chengxi Zeng et.al. 2501.18474 null
2025-01-30 A Benchmark and Evaluation for Real-World Out-of-Distribution Detection Using Vision-Language Models Shiho Noda et.al. 2501.18463 link
2025-01-29 Learning Beyond the Surface: How Far Can Continual Pre-Training with LoRA Enhance LLMs' Domain-Specific Insight Learning? Pouya Pezeshkpour et.al. 2501.17840 link
2025-01-29 Matrix Product Sketching via Coordinated Sampling Majid Daliri et.al. 2501.17836 null
2025-01-29 Aggregation Schemes for Single-Vector WSI Representation Learning in Digital Pathology Sobhan Hemati et.al. 2501.17822 null
2025-01-29 Leveraging Multimodal LLM for Inspirational User Interface Search Seokhyeon Park et.al. 2501.17799 link
2025-01-29 BreezyVoice: Adapting TTS for Taiwanese Mandarin with Enhanced Polyphone Disambiguation -- Challenges and Insights Chan-Jan Hsu et.al. 2501.17790 null
2025-01-29 Reasoning Over the Glyphs: Evaluation of LLM's Decipherment of Rare Scripts Yu-Fei Shih et.al. 2501.17785 null
2025-01-29 AdditiveLLM: Large Language Models Predict Defects in Additive Manufacturing Peter Pak et.al. 2501.17784 null
2025-01-29 2SSP: A Two-Stage Framework for Structured Pruning of LLMs Fabrizio Sandri et.al. 2501.17771 link
2025-01-29 Hybrid Graphs for Table-and-Text based Question Answering using LLMs Ankush Agarwal et.al. 2501.17767 null
2025-01-29 On the Partitioning of GPU Power among Multi-Instances Tirth Vamja et.al. 2501.17752 null
2025-01-29 Early External Safety Testing of OpenAI's o3-mini: Insights from the Pre-Deployment Evaluation Aitor Arrieta et.al. 2501.17749 null
2025-01-29 A technical review of multi-omics data integration methods: from classical statistical to deep generative approaches Ana R. Baião et.al. 2501.17729 null
2025-01-29 Using Code Generation to Solve Open Instances of Combinatorial Design Problems Christopher D. Rosin et.al. 2501.17725 link
2025-01-29 RICoTA: Red-teaming of In-the-wild Conversation with Test Attempts Eujeong Choi et.al. 2501.17715 link
2025-01-29 Critique Fine-Tuning: Learning to Critique is More Effective than Learning to Imitate Yubo Wang et.al. 2501.17703 null
2025-01-29 Planning with Vision-Language Models and a Use Case in Robot-Assisted Teaching Xuzhe Dang et.al. 2501.17665 null
2025-01-29 Exploring Vision Language Models for Multimodal and Multilingual Stance Detection Jake Vasilakes et.al. 2501.17654 null
2025-01-29 Tonguescape: Exploring Language Models Understanding of Vowel Articulation Haruki Sakajo et.al. 2501.17643 link
2025-01-29 Efficient Redundancy Reduction for Open-Vocabulary Semantic Segmentation Lin Chen et.al. 2501.17642 null
2025-01-29 In-Context Meta LoRA Generation Yihua Shao et.al. 2501.17635 null
2025-01-28 SFT Memorizes, RL Generalizes: A Comparative Study of Foundation Model Post-training Tianzhe Chu et.al. 2501.17161 null
2025-01-28 AxBench: Steering LLMs? Even Simple Baselines Outperform Sparse Autoencoders Zhengxuan Wu et.al. 2501.17148 link
2025-01-28 FactCG: Enhancing Fact Checkers with Graph-Based Multi-Hop Data Deren Lei et.al. 2501.17144 link
2025-01-28 ASTRAL: Automated Safety Testing of Large Language Models Miriam Ugarte et.al. 2501.17132 null
2025-01-28 Scenario Understanding of Traffic Scenes Through Large Visual Language Models Rivera Esteban et.al. 2501.17131 null
2025-01-28 Histoires Morales: A French Dataset for Assessing Moral Alignment Thibaud Leteno et.al. 2501.17117 link
2025-01-28 Optimizing Large Language Model Training Using FP4 Quantization Ruizhe Wang et.al. 2501.17116 null
2025-01-28 Unlocking Transparent Alignment Through Enhanced Inverse Constitutional AI for Principle Extraction Carl-Leander Henneking et.al. 2501.17112 null
2025-01-28 COS(M+O)S: Curiosity and RL-Enhanced MCTS for Exploring Story Space via Language Models Tobias Materzok et.al. 2501.17104 null
2025-01-28 Token-by-Token Regeneration and Domain Biases: A Benchmark of LLMs on Advanced Mathematical Problem-Solving Evgenii Evstafev et.al. 2501.17084 null
2025-01-28 Contextual Self-paced Learning for Weakly Supervised Spatio-Temporal Video Grounding Akash Kumar et.al. 2501.17053 null
2025-01-28 How Linguistics Learned to Stop Worrying and Love the Language Models Richard Futrell et.al. 2501.17047 null
2025-01-28 Enhanced Retrieval of Long Documents: Leveraging Fine-Grained Block Representations with Large Language Models Minghan Li et.al. 2501.17039 null
2025-01-28 Challenges in Ensuring AI Safety in DeepSeek-R1 Models: The Shortcomings of Reinforcement Learning Strategies Manojkumar Parmar et.al. 2501.17030 null
2025-01-28 Automated Refactoring of Non-Idiomatic Python Code: A Differentiated Replication with LLMs Alessandro Midolo et.al. 2501.17024 link
2025-01-28 Mobile Manipulation Instruction Generation from Multiple Images with Automatic Metric Enhancement Kei Katsumata et.al. 2501.17022 link
2025-01-28 Large Language Models for Code Generation: The Practitioners Perspective Zeeshan Rasheed et.al. 2501.16998 link
2025-01-28 Artificial Intelligence Clones Annie Liang et.al. 2501.16996 null
2025-01-28 FedEFM: Federated Endovascular Foundation Model with Unseen Data Tuong Do et.al. 2501.16992 null
2025-01-28 Modulating CNN Features with Pre-Trained ViT Representations for Open-Vocabulary Object Detection Xiangyu Gao et.al. 2501.16981 null
2025-01-27 LUCY: Linguistic Understanding and Control Yielding Early Stage of Her Heting Gao et.al. 2501.16327 link
2025-01-27 Evaluating The Performance of Using Large Language Models to Automate Summarization of CT Simulation Orders in Radiation Oncology Meiyun Cao et.al. 2501.16309 null
2025-01-27 RAPID: Retrieval-Augmented Parallel Inference Drafting for Text-Based Video Event Retrieval Long Nguyen et.al. 2501.16303 null
2025-01-27 Matryoshka Re-Ranker: A Flexible Re-Ranking Architecture With Configurable Depth and Width Zheng Liu et.al. 2501.16302 null
2025-01-27 Large Models in Dialogue for Active Perception and Anomaly Detection Tzoulio Chamiti et.al. 2501.16300 link
2025-01-27 FALCON: Resolving Visual Redundancy and Fragmentation in High-resolution Multimodal Large Language Models via Visual Registers Renshan Zhang et.al. 2501.16297 null
2025-01-27 Brain-Adapter: Enhancing Neurological Disorder Analysis with Adapter-Tuning Multimodal Large Language Models Jing Zhang et.al. 2501.16282 null
2025-01-27 Do LLMs Have Visualization Literacy? An Evaluation on Modified Visualizations to Test Generalization in Data Interpretation Jiayi Hong et.al. 2501.16277 link
2025-01-27 URAG: Implementing a Unified Hybrid RAG for Precise Answers in University Admission Chatbots -- A Case Study at HCMUT Long Nguyen et.al. 2501.16276 null
2025-01-27 Return of the Encoder: Maximizing Parameter Efficiency for SLMs Mohamed Elfeki et.al. 2501.16273 link
2025-01-27 A foundation model for human-AI collaboration in medical literature mining Zifeng Wang et.al. 2501.16255 null
2025-01-27 Multi-Agent Geospatial Copilots for Remote Sensing Workflows Chaehong Lee et.al. 2501.16254 null
2025-01-27 Zero-Shot Decision Tree Construction via Large Language Models Lucas Carrasco et.al. 2501.16247 null
2025-01-27 CLISC: Bridging clip and sam by enhanced cam for unsupervised brain tumor segmentation Xiaochuan Ma et.al. 2501.16246 null
2025-01-27 Phase Transitions in Large Language Models and the $O(N)$ Model Youran Sun et.al. 2501.16241 null
2025-01-27 AiGet: Transforming Everyday Moments into Hidden Knowledge Discovery with AI Assistance on Smart Glasses Runze Cai et.al. 2501.16240 link
2025-01-27 Distilling foundation models for robust and efficient models in digital pathology Alexandre Filiot et.al. 2501.16239 null
2025-01-27 Language-Based Bayesian Optimization Research Assistant (BORA) Abdoulatif Cissé et.al. 2501.16224 null
2025-01-27 Enhancing Visual Inspection Capability of Multi-Modal Large Language Models on Medical Time Series with Supportive Conformalized and Interpretable Small Specialized Models Huayu Li et.al. 2501.16215 link
2025-01-27 Provence: efficient and robust context pruning for retrieval-augmented generation Nadezhda Chirkova et.al. 2501.16214 null
2025-01-24 HERMES: A Unified Self-Driving World Model for Simultaneous 3D Scene Understanding and Generation Xin Zhou et.al. 2501.14729 link
2025-01-24 Do LLMs Provide Consistent Answers to Health-Related Questions across Languages? Ipek Baris Schlicht et.al. 2501.14719 null
2025-01-24 Towards Better Understanding Table Instruction Tuning: Decoupling the Effects from Data versus Models Naihao Deng et.al. 2501.14717 null
2025-01-24 FlexiGPT: Pruning and Extending Large Language Models with Low-Rank Weight Sharing James Seale Smith et.al. 2501.14713 null
2025-01-24 The Karp Dataset Mason DiCicco et.al. 2501.14705 null
2025-01-24 Rethinking Table Instruction Tuning Naihao Deng et.al. 2501.14693 null
2025-01-24 Rethinking Foundation Models for Medical Image Classification through a Benchmark Study on MedMNIST Fuping Wu et.al. 2501.14685 null
2025-01-24 An Empirical Study on LLM-based Classification of Requirements-related Provisions in Food-safety Regulations Shabnam Hassani et.al. 2501.14683 null
2025-01-24 Diffusion based Text-to-Music Generation with Global and Local Text based Conditioning Jisi Zhang et.al. 2501.14680 null
2025-01-24 MedAgentBench: Dataset for Benchmarking LLMs as Agents in Medical Applications Yixing Jiang et.al. 2501.14654 link
2025-01-24 Investigating the (De)Composition Capabilities of Large Language Models in Natural-to-Formal Language Conversion Ziyao Xu et.al. 2501.14649 link
2025-01-24 Recommending Actionable Strategies: A Semantic Approach to Integrating Analytical Frameworks with Decision Heuristics Renato Ghisellini et.al. 2501.14634 null
2025-01-24 Extracting Problem Structure with LLMs for Optimized SAT Local Search André Schilder et.al. 2501.14630 null
2025-01-24 ReferDINO: Referring Video Object Segmentation with Visual Grounding Foundations Tianming Liang et.al. 2501.14607 null
2025-01-24 Knowledge Graphs Construction from Criminal Court Appeals: Insights from the French Cassation Court Alexander V. Belikov et.al. 2501.14579 null
2025-01-24 ZETA: Leveraging Z-order Curves for Efficient Top-k Attention Qiuhao Zeng et.al. 2501.14577 null
2025-01-24 Large-scale and Fine-grained Vision-language Pre-training for Enhanced CT Image Understanding Zhongyi Shui et.al. 2501.14548 link
2025-01-24 Leveraging ChatGPT's Multimodal Vision Capabilities to Rank Satellite Images by Poverty Level: Advancing Tools for Social Science Research Hamid Sarmadi et.al. 2501.14546 null
2025-01-24 VERUS-LM: a Versatile Framework for Combining LLMs with Symbolic Reasoning Benjamin Callewaert et.al. 2501.14540 null
2025-01-24 Design and Implementation of a Psychiatry Resident Training System Based on Large Language Models Zhenguang Zhong et.al. 2501.14530 link
2025-01-23 CRPO: Confidence-Reward Driven Preference Optimization for Machine Translation Guofeng Cui et.al. 2501.13927 null
2025-01-23 The Breeze 2 Herd of Models: Traditional Chinese LLMs Based on Llama with Vision-Aware and Function-Calling Capabilities Chan-Jan Hsu et.al. 2501.13921 link
2025-01-23 Analysis of Indic Language Capabilities in LLMs Aatman Vaidya et.al. 2501.13912 null
2025-01-23 Privacy-Preserving Personalized Federated Prompt Learning for Multimodal Large Language Models Linh Tran et.al. 2501.13904 null
2025-01-23 Exploring Finetuned Audio-LLM on Heart Murmur Features Adrian Florea et.al. 2501.13884 null
2025-01-23 The machine learning platform for developers of large systems Alexey Naikov et.al. 2501.13881 null
2025-01-23 A RAG-Based Institutional Assistant Gustavo Kuratomi et.al. 2501.13880 null
2025-01-23 Dual-Modal Prototype Joint Learning for Compositional Zero-Shot Learning Shiyu Zhang et.al. 2501.13859 null
2025-01-23 Large Vision-Language Models for Knowledge-Grounded Data Annotation of Memes Shiling Deng et.al. 2501.13851 link
2025-01-23 Think Outside the Data: Colonial Biases and Systemic Issues in Automated Moderation Pipelines for Low-Resource Languages Farhana Shahid et.al. 2501.13836 null
2025-01-23 On the Reasoning Capacity of AI Models and How to Quantify It Santosh Kumar Radha et.al. 2501.13833 null
2025-01-23 Predicting Compact Phrasal Rewrites with Large Language Models for ASR Post Editing Hao Zhang et.al. 2501.13831 null
2025-01-23 Hallucinations Can Improve Large Language Models in Drug Discovery Shuzhou Yuan et.al. 2501.13824 null
2025-01-23 Large Language Model driven Policy Exploration for Recommender Systems Jie Wang et.al. 2501.13816 null
2025-01-23 Enhancing LLMs for Governance with Human Oversight: Evaluating and Aligning LLMs on Expert Classification of Climate Misinformation for Detecting False or Misleading Claims about Climate Change Mowafak Allaham et.al. 2501.13802 null
2025-01-23 PromptMono: Cross Prompting Attention for Self-Supervised Monocular Depth Estimation in Challenging Environments Changhao Wang et.al. 2501.13796 null
2025-01-23 Training-Free Zero-Shot Temporal Action Detection with Vision-Language Models Chaolei Han et.al. 2501.13795 link
2025-01-23 Parameter-Efficient Fine-Tuning for Foundation Models Dan Zhang et.al. 2501.13787 link
2025-01-23 Not Every AI Problem is a Data Problem: We Should Be Intentional About Data Scaling Tanya Rodchenko et.al. 2501.13779 null
2025-01-23 Explainable XR: Understanding User Behaviors of XR Environments using LLM-assisted Analytics Framework Yoonsang Kim et.al. 2501.13778 link
2025-01-22 VideoLLaMA 3: Frontier Multimodal Foundation Models for Image and Video Understanding Boqiang Zhang et.al. 2501.13106 link
2025-01-22 Refining Input Guardrails: Enhancing LLM-as-a-Judge Efficiency Through Chain-of-Thought Fine-Tuning and Alignment Melissa Kazemi Rad et.al. 2501.13080 null
2025-01-22 Autonomy-of-Experts Models Ang Lv et.al. 2501.13074 null
2025-01-22 Does Table Source Matter? Benchmarking and Improving Multimodal Scientific Table Understanding and Reasoning Bohao Yang et.al. 2501.13042 link
2025-01-22 Pairwise RM: Perform Best-of-N Sampling with Knockout Tournament Yantao Liu et.al. 2501.13007 link
2025-01-22 Large Language Model-Based Semantic Communication System for Image Transmission Soheyb Ribouh et.al. 2501.12988 null
2025-01-22 LLM4WM: Adapting LLM for Wireless Multi-Tasking Xuanyu Liu et.al. 2501.12983 null
2025-01-22 OnionEval: An Unified Evaluation of Fact-conflicting Hallucination for Small-Large Language Models Chongren Sun et.al. 2501.12975 link
2025-01-22 Accessible Smart Contracts Verification: Synthesizing Formal Models with Tamed LLMs Jan Corazza et.al. 2501.12972 link
2025-01-22 It's complicated. The relationship of algorithmic fairness and non-discrimination regulations in the EU AI Act Kristof Meding et.al. 2501.12962 null
2025-01-22 Efficient Prompt Compression with Evaluator Heads for Long-Context Transformer Inference Weizhi Fei et.al. 2501.12959 null
2025-01-22 GANQ: GPU-Adaptive Non-Uniform Quantization for Large Language Models Pengxiang Zhao et.al. 2501.12956 null
2025-01-22 Correctness Assessment of Code Generated by Large Language Models Using Internal Representations Tuan-Dung Bui et.al. 2501.12934 link
2025-01-22 DynamicEarth: How Far are We from Open-Vocabulary Change Detection? Kaiyu Li et.al. 2501.12931 null
2025-01-22 A Functional Software Reference Architecture for LLM-Integrated Systems Alessio Bucaioni et.al. 2501.12904 null
2025-01-22 Architectural Fusion Through Contextual Partitioning in Large Language Models: A Novel Approach to Parameterized Knowledge Integration Offa Kingsleigh et.al. 2501.12901 null
2025-01-22 Test-Time Preference Optimization: On-the-Fly Alignment via Iterative Textual Feedback Yafu Li et.al. 2501.12895 link
2025-01-22 Generative AI Misuse Potential in Cyber Security Education: A Case Study of a UK Degree Program Carlton Shepherd et.al. 2501.12883 null
2025-01-22 WisdomBot: Tuning Large Language Models with Artificial Intelligence Knowledge Jingyuan Chen et.al. 2501.12877 null
2025-01-22 HierPromptLM: A Pure PLM-based Framework for Representation Learning on Heterogeneous Text-rich Networks Qiuyu Zhu et.al. 2501.12857 null
2025-01-21 InternVideo2.5: Empowering Video MLLMs with Long and Rich Context Modeling Yi Wang et.al. 2501.12386 link
2025-01-21 MMVU: Measuring Expert-Level Multi-Discipline Video Understanding Yilun Zhao et.al. 2501.12380 link
2025-01-21 Expertise elevates AI usage: experimental evidence comparing laypeople and professional artists Thomas F. Eisenmann et.al. 2501.12374 link
2025-01-21 Is Long Context All You Need? Leveraging LLM's Extended Context for NL2SQL Yeounoh Chung et.al. 2501.12372 link
2025-01-21 Parameters vs FLOPs: Scaling Laws for Optimal Sparsity for Mixture-of-Experts Language Models Samira Abnar et.al. 2501.12370 null
2025-01-21 InternLM-XComposer2.5-Reward: A Simple Yet Effective Multi-Modal Reward Model Yuhang Zang et.al. 2501.12368 link
2025-01-21 Vision-Language Models for Automated Chest X-ray Interpretation: Leveraging ViT and GPT-2 Md. Rakibul Islam et.al. 2501.12356 null
2025-01-21 Automatic Labelling with Open-source LLMs using Dynamic Label Schema Integration Thomas Walshe et.al. 2501.12332 null
2025-01-21 Cinepro: Robust Training of Foundation Models for Cancer Detection in Prostate Ultrasound Cineloops Mohamed Harmanani et.al. 2501.12331 link
2025-01-21 VARGPT: Unified Understanding and Generation in a Visual Autoregressive Multimodal Large Language Model Xianwei Zhuang et.al. 2501.12327 link
2025-01-21 LLM-Assisted Knowledge Graph Completion for Curriculum and Domain Modelling in Personalized Higher Education Recommendations Hasan Abu-Rasheed et.al. 2501.12300 null
2025-01-21 MoGERNN: An Inductive Traffic Predictor for Unobserved Locations in Dynamic Sensing Networks Qishen Zhou et.al. 2501.12281 link
2025-01-21 Condor: Enhance LLM Alignment with Knowledge-Driven Data Synthesis and Refinement Maosong Cao et.al. 2501.12273 link
2025-01-21 CBVLM: Training-free Explainable Concept-based Large Vision Language Models for Medical Image Classification Cristiano Patrício et.al. 2501.12266 null
2025-01-21 FOCUS: First Order Concentrated Updating Scheme Yizhou Liu et.al. 2501.12243 null
2025-01-21 InsTALL: Context-aware Instructional Task Assistance with Multi-modal Large Language Models Pha Nguyen et.al. 2501.12231 null
2025-01-21 CDW-CoT: Clustered Distance-Weighted Chain-of-Thoughts Reasoning Yuanheng Fang et.al. 2501.12226 null
2025-01-21 Leveraging Large Language Models for Realizing Truly Intelligent User Interfaces Allard Oelen et.al. 2501.12221 null
2025-01-21 You Can't Eat Your Cake and Have It Too: The Performance Degradation of LLMs with Jailbreak Defense Wuyuao Mai et.al. 2501.12210 null
2025-01-21 Fixing Imbalanced Attention to Mitigate In-Context Hallucination of Large Vision-Language Model Kazi Hasan Ibn Arif et.al. 2501.12206 link
2025-01-17 FaceXBench: Evaluating Multimodal LLMs on Face Understanding Kartik Narayan et.al. 2501.10360 link
2025-01-17 Agent4Edu: Generating Learner Response Data by Generative Agents for Intelligent Education Systems Weibo Gao et.al. 2501.10332 link
2025-01-17 BoK: Introducing Bag-of-Keywords Loss for Interpretable Dialogue Response Generation Suvodip Dey et.al. 2501.10328 link
2025-01-17 Large language models for automated scholarly paper review: A survey Zhenzhen Zhuang et.al. 2501.10326 null
2025-01-17 Hierarchical Autoregressive Transformers: Combining Byte- and Word-Level Processing for Robust, Adaptable Language Models Pit Neitemeier et.al. 2501.10322 null
2025-01-17 HiMix: Reducing Computational Complexity in Large Vision-Language Models Xuange Zhang et.al. 2501.10318 null
2025-01-17 Addressing Popularity Bias in Third-Party Library Recommendations Using LLMs Claudio Di Sipio et.al. 2501.10313 null
2025-01-17 Computational Protein Science in the Era of Large Language Models (LLMs) Wenqi Fan et.al. 2501.10282 null
2025-01-17 Test Wars: A Comparative Study of SBST, Symbolic Execution, and LLM-Based Approaches to Unit Test Generation Azat Abdullin et.al. 2501.10200 null
2025-01-17 Generative Artificial Intelligence: Implications for Biomedical and Health Professions Education William Hersh et.al. 2501.10186 null
2025-01-17 Multi-stage Training of Bilingual Islamic LLM for Neural Passage Retrieval Vera Pavlova et.al. 2501.10175 null
2025-01-17 Dual Debiasing: Remove Stereotypes and Keep Factual Gender for Fair Language Modeling and Translation Tomasz Limisiewicz et.al. 2501.10150 null
2025-01-17 A Vision-Language Framework for Multispectral Scene Representation Using Language-Grounded Features Enes Karanfil et.al. 2501.10144 null
2025-01-17 Exploring the Impact of Generative Artificial Intelligence in Education: A Thematic Analysis Abhishek Kaushik et.al. 2501.10134 null
2025-01-17 ComplexFuncBench: Exploring Multi-Step and Constrained Function Calling under Long-Context Scenario Lucen Zhong et.al. 2501.10132 link
2025-01-17 PaSa: An LLM Agent for Comprehensive Academic Paper Search Yichen He et.al. 2501.10120 link
2025-01-17 LLM Reasoner and Automated Planner: A new NPC approach Israel Puerta-Merino et.al. 2501.10106 null
2025-01-17 Universal Actions for Enhanced Embodied Foundation Models Jinliang Zheng et.al. 2501.10105 link
2025-01-17 Few-shot Structure-Informed Machinery Part Segmentation with Foundation Models and Graph Neural Networks Michael Schwingshackl et.al. 2501.10080 link
2025-01-17 SpatialCoT: Advancing Spatial Reasoning through Coordinate Alignment and Chain-of-Thought for Embodied Task Planning Yuecheng Liu et.al. 2501.10074 null
2025-01-16 Distilling Multi-modal Large Language Models for Autonomous Driving Deepti Hegde et.al. 2501.09757 null
2025-01-16 Lost in Translation, Found in Context: Sign Language Translation with Contextual Cues Youngjoon Jang et.al. 2501.09754 null
2025-01-16 OmniThink: Expanding Knowledge Boundaries in Machine Writing through Thinking Zekun Xi et.al. 2501.09751 link
2025-01-16 Enhancing Lexicon-Based Text Embeddings with Large Language Models Yibin Lei et.al. 2501.09749 null
2025-01-16 Suggesting Code Edits in Interactive Machine Learning Notebooks Using Large Language Models Bihui Jin et.al. 2501.09745 null
2025-01-16 Inference-Time Scaling for Diffusion Models beyond Scaling Denoising Steps Nanye Ma et.al. 2501.09732 null
2025-01-16 A Simple Aerial Detection Baseline of Multimodal Language Models Qingyun Li et.al. 2501.09720 link
2025-01-16 CyberMentor: AI Powered Learning Tool Platform to Address Diverse Student Needs in Cybersecurity Education Tianyu Wang et.al. 2501.09709 link
2025-01-16 Domain Adaptation of Foundation LLMs for e-Commerce Christian Herold et.al. 2501.09706 null
2025-01-16 Cueless EEG imagined speech for subject identification: dataset and benchmarks Ali Derakhshesh et.al. 2501.09700 link
2025-01-16 Mitigating Hallucinations in Large Vision-Language Models via DPO: On-Policy Data Hold the Key Zhihe Yang et.al. 2501.09695 link
2025-01-16 Simulated Interactive Debugging Yannic Noller et.al. 2501.09694 null
2025-01-16 Towards Large Reasoning Models: A Survey of Reinforced Reasoning with Large Language Models Fengli Xu et.al. 2501.09686 null
2025-01-16 Reward-Guided Controlled Generation for Inference-Time Alignment in Diffusion Models: Tutorial and Review Masatoshi Uehara et.al. 2501.09685 null
2025-01-16 Robin: a Suite of Multi-Scale Vision-Language Models and the CHIRP Evaluation Benchmark Alexis Roger et.al. 2501.09672 null
2025-01-16 A Survey of Research in Large Language Models for Electronic Design Automation Jingyu Pan et.al. 2501.09655 null
2025-01-16 The Heap: A Contamination-Free Multilingual Code Dataset for Evaluating Large Language Models Jonathan Katzy et.al. 2501.09653 null
2025-01-16 CarMem: Enhancing Long-Term Memory in LLM Voice Assistants through Category-Bounding Johannes Kirmayr et.al. 2501.09645 link
2025-01-16 LLM-Based Routing in Mixture of Experts: A Novel Framework for Trading Kuan-Ming Liu et.al. 2501.09636 null
2025-01-16 Empowering Large Language Models in Wireless Communication: A Novel Dataset and Fine-Tuning Framework Yushen Lin et.al. 2501.09631 null
2025-01-15 Towards Fast, Specialized Machine Learning Force Fields: Distilling Foundation Models via Energy Hessians Ishan Amin et.al. 2501.09009 link
2025-01-15 Aegis2.0: A Diverse AI Safety Dataset and Risks Taxonomy for Alignment of LLM Guardrails Shaona Ghosh et.al. 2501.09004 null
2025-01-15 Vision Foundation Models for Computed Tomography Suraj Pai et.al. 2501.09001 link
2025-01-15 CityLoc: 6 DoF Localization of Text Descriptions in Large-Scale Scenes with Gaussian Representation Qi Ma et.al. 2501.08982 null
2025-01-15 Development and Validation of the Provider Documentation Summarization Quality Instrument for Large Language Models Emma Croxford et.al. 2501.08977 null
2025-01-15 Learning to Extract Cross-Domain Aspects and Understanding Sentiments Using Large Language Models Karukriti Kaushik Ghosh et.al. 2501.08974 null
2025-01-15 Analyzing the Ethical Logic of Six Large Language Models W. Russell Neuman et.al. 2501.08951 null
2025-01-15 Applying General Turn-taking Models to Conversational Human-Robot Interaction Gabriel Skantze et.al. 2501.08946 null
2025-01-15 Disentangling Exploration of Large Language Models by Optimal Exploitation Tim Grams et.al. 2501.08925 null
2025-01-15 GenAI Content Detection Task 3: Cross-Domain Machine-Generated Text Detection Challenge Liam Dugan et.al. 2501.08913 link
2025-01-15 Leveraging Large Language Models as Knowledge-Driven Agents for Reliable Retrosynthesis Planning Qinyu Ma et.al. 2501.08897 link
2025-01-15 Generative Planning with 3D-vision Language Pre-training for End-to-End Autonomous Driving Tengpeng Li et.al. 2501.08861 link
2025-01-15 Exploring Task-Level Optimal Prompts for Visual In-Context Learning Yan Zhu et.al. 2501.08841 null
2025-01-15 IDEA: Image Description Enhanced CLIP-Adapter Zhipeng Ye et.al. 2501.08816 link
2025-01-15 How Developers Interact with AI: A Taxonomy of Human-AI Collaboration in Software Engineering Christoph Treude et.al. 2501.08774 null
2025-01-15 Admitting Ignorance Helps the Video Question Answering Models to Answer Haopeng Li et.al. 2501.08771 null
2025-01-15 Enhanced Large Language Models for Effective Screening of Depression and Anxiety June M. Liu et.al. 2501.08769 null
2025-01-15 Leveraging LLM Agents for Translating Network Configurations Yunze Wei et.al. 2501.08760 null
2025-01-15 Expanding Vietnamese SentiWordNet to Improve Performance of Vietnamese Sentiment Analysis Models Hong-Viet Tran et.al. 2501.08758 null
2025-01-15 The Inherent Limits of Pretrained LLMs: The Unexpected Convergence of Instruction Tuning and In-Context Learning Capabilities Irina Bigoulaeva et.al. 2501.08716 link
2025-01-14 PokerBench: Training Large Language Models to become Professional Poker Players Richard Zhuang et.al. 2501.08328 link
2025-01-14 Omni-RGPT: Unifying Image and Video Region-level Understanding via Token Marks Miran Heo et.al. 2501.08326 null
2025-01-14 ADAM-1: AI and Bioinformatics for Alzheimer's Detection and Microbiome-Clinical Data Integrations Ziyuan Huang et.al. 2501.08324
