Table of Contents
| Publish Date | Title | Authors | arXiv ID | Code |
|---|---|---|---|---|
| 2025-07-22 | Explicit Context Reasoning with Supervision for Visual Tracking | Fansheng Zeng et.al. | 2507.16191 | null |
| 2025-07-21 | Is Tracking really more challenging in First Person Egocentric Vision? | Matteo Dunnhofer et.al. | 2507.16015 | null |
| 2025-07-23 | EndoControlMag: Robust Endoscopic Vascular Motion Magnification with Periodic Reference Resetting and Hierarchical Tissue-aware Dual-Mask Control | An Wang et.al. | 2507.15292 | null |
| 2025-07-11 | SAM2RL: Towards Reinforcement Learning Memory Control in Segment Anything Model 2 | Alen Adamyan et.al. | 2507.08548 | null |
| 2025-07-10 | Temporal Unlearnable Examples: Preventing Personal Video Data from Unauthorized Exploitation by Object Tracking | Qiangqiang Wu et.al. | 2507.07483 | null |
| 2025-07-09 | Token Bottleneck: One Token to Remember Dynamics | Taekyung Kim et.al. | 2507.06543 | null |
| 2025-07-08 | What You Have is What You Track: Adaptive and Robust Multimodal Tracking | Yuedong Tan et.al. | 2507.05899 | null |
| 2025-07-08 | Stable Tracking-in-the-Loop Control of Cable-Driven Surgical Manipulators under Erroneous Kinematic Chains | Neelay Joglekar et.al. | 2507.05663 | null |
| 2025-07-07 | Spatial and Semantic Embedding Integration for Stereo Sound Event Localization and Detection in Regular Videos | Davide Berghi et.al. | 2507.04845 | null |
| 2025-07-05 | Sensitive and accurate femtosecond pulse characterization via two-photon absorption in Fabry-Pérot laser diodes | Adrian F. Chlebowski et.al. | 2507.03978 | null |
| 2025-07-01 | UMDATrack: Unified Multi-Domain Adaptive Tracking Under Adverse Weather Conditions | Siyuan Yao et.al. | 2507.00648 | null |
| 2025-07-01 | ATSTrack: Enhancing Visual-Language Tracking by Aligning Temporal and Spatial Scales | Yihao Zhen et.al. | 2507.00454 | null |
| 2025-06-30 | Mamba-FETrack V2: Revisiting State Space Model for Frame-Event based Visual Object Tracking | Shiao Wang et.al. | 2506.23783 | null |
| 2025-07-22 | R1-Track: Direct Application of MLLMs to Visual Object Tracking via Reinforcement Learning | Biao Wang et.al. | 2506.21980 | null |
| 2025-06-25 | Exploiting Lightweight Hierarchical ViT and Dynamic Framework for Efficient Visual Tracking | Ben Kang et.al. | 2506.20381 | null |
| 2025-06-17 | Comparison of Two Methods for Stationary Incident Detection Based on Background Image | Deepak Ghimire et.al. | 2506.14256 | null |
| 2025-06-03 | MVTD: A Benchmark Dataset for Maritime Visual Object Tracking | Ahsan Baidar Bakht et.al. | 2506.02866 | null |
| 2025-05-31 | Towards Effective and Efficient Adversarial Defense with Diffusion Models for Robust Visual Tracking | Long Xu et.al. | 2506.00325 | link |
| 2025-05-29 | CLDTracker: A Comprehensive Language Description for Visual Tracking | Mohamad Alansari et.al. | 2505.23704 | link |
| 2025-05-29 | TrackVLA: Embodied Visual Tracking in the Wild | Shaoan Wang et.al. | 2505.23189 | null |
| 2025-05-28 | TwinTrack: Bridging Vision and Contact Physics for Real-Time Tracking of Unknown Dynamic Objects | Wen Yang et.al. | 2505.22882 | null |
| 2025-05-27 | Fully Spiking Neural Networks for Unified Frame-Event Object Tracking | Jingjun Yang et.al. | 2505.20834 | null |
| 2025-05-28 | VLM Can Be a Good Assistant: Enhancing Embodied Visual Tracking with Self-Improving Vision-Language Models | Kui Wu et.al. | 2505.20718 | null |
| 2025-05-27 | Hierarchical Instruction-aware Embodied Visual Tracking | Kui Wu et.al. | 2505.20710 | null |
| 2025-06-01 | HAND Me the Data: Fast Robot Adaptation via Hand Path Retrieval | Matthew Hong et.al. | 2505.20455 | null |
| 2025-05-28 | Progressive Scaling Visual Object Tracking | Jack Hong et.al. | 2505.19990 | null |
| 2025-05-23 | Adapting SAM 2 for Visual Object Tracking: 1st Place Solution for MMVPR Challenge Multi-Modal Tracking | Cheng-Yen Yang et.al. | 2505.18111 | null |
| 2025-05-22 | Efficient Motion Prompt Learning for Robust Visual Tracking | Jie Zhao et.al. | 2505.16321 | link |
| 2025-05-19 | Towards Low-Latency Event Stream-based Visual Object Tracking: A Slow-Fast Approach | Shiao Wang et.al. | 2505.12903 | link |
| 2025-05-13 | Towards Adaptive Meta-Gradient Adversarial Examples for Visual Tracking | Wei-Long Tian et.al. | 2505.08999 | link |
| 2025-05-11 | DeepSORT-Driven Visual Tracking Approach for Gesture Recognition in Interactive Systems | Tong Zhang et.al. | 2505.07110 | null |
| 2025-05-09 | CGTrack: Cascade Gating Network with Hierarchical Feature Aggregation for UAV Tracking | Weihong Li et.al. | 2505.05936 | link |
| 2025-05-07 | Predicting Road Surface Anomalies by Visual Tracking of a Preceding Vehicle | Petr Jahoda et.al. | 2505.04392 | null |
| 2025-04-19 | Adversarial Attack for RGB-Event based Visual Object Tracking | Qiang Chen et.al. | 2504.14423 | link |
| 2025-05-05 | SAM2MOT: A Novel Paradigm of Multi-Object Tracking by Segmentation | Junjie Jiang et.al. | 2504.04519 | link |
| 2025-03-24 | SPMTrack: Spatio-Temporal Parameter-Efficient Fine-Tuning with Mixture of Experts for Scalable Visual Tracking | Wenrui Cai et.al. | 2503.18338 | link |
| 2025-03-22 | MUST: The First Dataset and Unified Framework for Multispectral UAV Single Object Tracking | Haolin Qin et.al. | 2503.17699 | link |
| 2025-03-21 | Dynamic Attention Mechanism in Spatiotemporal Memory Networks for Object Tracking | Meng Zhou et.al. | 2503.16768 | null |
| 2025-03-17 | UncTrack: Reliable Visual Object Tracking with Uncertainty-Aware Prototype Memory Network | Siyuan Yao et.al. | 2503.12888 | link |
| 2025-03-16 | A Plug-and-Play Learning-based IMU Bias Factor for Robust Visual-Inertial Odometry | Yang Yi et.al. | 2503.12527 | null |
| 2025-03-14 | Towards General Multimodal Visual Tracking | Andong Lu et.al. | 2503.11218 | null |
| 2025-03-09 | Similarity-Guided Layer-Adaptive Vision Transformer for UAV Tracking | Chaocan Xue et.al. | 2503.06625 | link |
| 2025-03-09 | Dynamic Updates for Language Adaptation in Visual-Language Tracking | Xiaohai Li et.al. | 2503.06621 | link |
| 2025-02-28 | Technical Report for ReID-SAM on SkiTB Visual Tracking Challenge 2025 | Kunjun Li et.al. | 2503.01907 | null |
| 2025-03-01 | Two-stream Beats One-stream: Asymmetric Siamese Network for Efficient Visual Tracking | Jiawen Zhu et.al. | 2503.00516 | link |
| 2025-02-27 | MITracker: Multi-View Integration for Visual Object Tracking | Mengjie Xu et.al. | 2502.20111 | null |
| 2025-02-27 | CFTrack: Enhancing Lightweight Visual Tracking through Contrastive Learning and Feature Matching | Juntao Liang et.al. | 2502.19705 | null |
| 2025-02-26 | Enhanced Transformer-Based Tracking for Skiing Events: Overcoming Multi-Camera Challenges, Scale Variations and Rapid Motion -- SkiTB Visual Tracking Challenge 2025 | Akhil Penta et.al. | 2502.18867 | null |
| 2025-02-25 | UASTrack: A Unified Adaptive Selection Framework with Modality-Customization in Single Object Tracking | He Wang et.al. | 2502.18220 | null |
| 2025-02-08 | Event Stream-based Visual Object Tracking: HDETrack V2 and A High-Definition Benchmark | Shiao Wang et.al. | 2502.05574 | link |
| 2025-01-13 | Robust Single Object Tracking in LiDAR Point Clouds under Adverse Weather Conditions | Xiantong Zhao et.al. | 2501.07133 | null |
| 2025-01-05 | DeTrack: In-model Latent Denoising Learning for Visual Object Tracking | Xinyu Zhou et.al. | 2501.02467 | null |
| 2025-01-13 | FusionSORT: Fusion Methods for Online Multi-object Visual Tracking | Nathanael L. Baisa et.al. | 2501.00843 | link |
| 2025-01-01 | Less is More: Token Context-aware Learning for Object Tracking | Chenlong Xu et.al. | 2501.00758 | link |
| 2024-12-28 | Learning Adaptive and View-Invariant Vision Transformer with Multi-Teacher Knowledge Distillation for Real-Time UAV Tracking | You Wu et.al. | 2412.20002 | link |
| 2024-12-26 | SUTrack: Towards Simple and Unified Single Object Tracking | Xin Chen et.al. | 2412.19138 | link |
| 2024-12-15 | Exploring Enhanced Contextual Information for Video-Level Object Tracking | Ben Kang et.al. | 2412.11023 | link |
| 2024-12-13 | Visual Object Tracking across Diverse Data Modalities: A Review | Mengmeng Wang et.al. | 2412.09991 | null |
| 2025-03-07 | MVCTrack: Boosting 3D Point Cloud Tracking via Multimodal-Guided Virtual Cues | Zhaofeng Hu et.al. | 2412.02734 | link |
| 2024-12-03 | GSOT3D: Towards Generic 3D Single Object Tracking in the Wild | Yifan Jiao et.al. | 2412.02129 | link |
| 2025-02-06 | Improving Accuracy and Generalization for Efficient Visual Tracking | Ram Zaveri et.al. | 2411.18855 | null |
| 2024-11-27 | A comparison of extended object tracking with multi-modal sensors in indoor environment | Jiangtao Shuai et.al. | 2411.18476 | null |
| 2024-12-04 | A Distractor-Aware Memory for Visual Object Tracking with SAM2 | Jovana Videnovic et.al. | 2411.17576 | link |
| 2024-11-23 | How Texts Help? A Fine-grained Evaluation to Reveal the Role of Language in Vision-Language Tracking | Xuchen Li et.al. | 2411.15600 | null |
| 2024-11-24 | ClickTrack: Towards Real-time Interactive Single Object Tracking | Kuiran Wang et.al. | 2411.13183 | null |
| 2024-11-30 | SAMURAI: Adapting Segment Anything Model for Zero-Shot Visual Tracking with Motion-Aware Memory | Cheng-Yen Yang et.al. | 2411.11922 | link |
| 2024-12-09 | Vision Eagle Attention: a new lens for advancing image classification | Mahmudul Hasan et.al. | 2411.10564 | link |
| 2024-11-14 | MFTIQ: Multi-Flow Tracker with Independent Matching Quality Estimation | Jonas Serych et.al. | 2411.09551 | link |
| 2024-11-12 | Visual Tracking with Intermittent Visibility: Switched Control Design and Implementation | Yangge Li et.al. | 2411.08144 | null |
| 2024-12-16 | ChatTracker: Enhancing Visual Tracking Performance via Chatting with Multimodal Large Language Model | Yiming Sun et.al. | 2411.01756 | null |
| 2024-10-30 | IP-MOT: Instance Prompt Learning for Cross-Domain Multi-Object Tracking | Run Luo et.al. | 2410.23907 | null |
| 2024-10-27 | NT-VOT211: A Large-Scale Benchmark for Night-time Visual Object Tracking | Yu Liu et.al. | 2410.20421 | link |
| 2024-10-19 | The Solution for Single Object Tracking Task of Perception Test Challenge 2024 | Zhiqiang Zhong et.al. | 2410.16329 | null |
| 2024-10-13 | Gaussian Splatting Visual MPC for Granular Media Manipulation | Wei-Cheng Tseng et.al. | 2410.09740 | null |
| 2024-10-09 | DTVLT: A Multi-modal Diverse Text Benchmark for Visual Language Tracking Based on LLM | Xuchen Li et.al. | 2410.02492 | null |
| 2024-09-30 | Opt-in Camera: Person Identification in Video via UWB Localization and Its Application to Opt-in Systems | Matthew Ishige et.al. | 2409.19891 | null |
| 2024-09-27 | Improving Visual Object Tracking through Visual Prompting | Shih-Fang Chen et.al. | 2409.18901 | link |
| 2024-09-26 | General Compression Framework for Efficient Transformer Object Tracking | Lingyi Hong et.al. | 2409.17564 | null |
| 2024-09-25 | Towards Underwater Camouflaged Object Tracking: An Experimental Evaluation of SAM and SAM 2 | Chunhui Zhang et.al. | 2409.16902 | link |
| 2024-09-25 | Conditional Generative Denoiser for Nighttime UAV Tracking | Yucheng Wang et.al. | 2409.16834 | link |
| 2024-09-25 | Progressive Representation Learning for Real-Time UAV Tracking | Changhong Fu et.al. | 2409.16652 | link |
| 2024-09-25 | Enhancing Nighttime UAV Tracking with Light Distribution Suppression | Liangliang Yao et.al. | 2409.16631 | link |
| 2024-09-19 | WeHelp: A Shared Autonomy System for Wheelchair Users | Abulikemu Abuduweili et.al. | 2409.12159 | link |
| 2024-09-18 | Distilling Channels for Efficient Deep Tracking | Shiming Ge et.al. | 2409.11785 | null |
| 2024-09-13 | Visual Language Tracking with Multi-modal Interaction: A Robust Benchmark | Xuchen Li et.al. | 2409.08887 | null |
| 2024-09-10 | VBIT: Towards Enhancing Privacy Control Over IoT Devices | Jad Al Aaraj et.al. | 2409.06233 | null |
| 2024-09-03 | Ultra-broadband room-temperature Fourier transform spectrometer with watt-level power consumption | Jakub Mnich et.al. | 2409.01875 | null |
| 2024-08-25 | Camouflaged Object Tracking: A Benchmark | Xiaoyu Guo et.al. | 2408.13877 | link |
| 2024-08-21 | Low-Light Object Tracking: A Benchmark | Pengzhi Zhong et.al. | 2408.11463 | link |
| 2024-08-20 | MambaEVT: Event Stream based Visual Object Tracking using State Space Model | Xiao Wang et.al. | 2408.10487 | link |
| 2024-08-05 | VoxelTrack: Exploring Voxel Representation for 3D Point Cloud Object Tracking | Yuxuan Lu et.al. | 2408.02263 | null |
| 2024-09-06 | 3D Single-object Tracking in Point Clouds with High Temporal Variation | Qiao Wu et.al. | 2408.02049 | null |
| 2024-09-09 | SiamMo: Siamese Motion-Centric 3D Object Tracking | Yuxiang Yang et.al. | 2408.01688 | link |
| 2024-08-02 | Visible-Thermal Multiple Object Tracking: Large-scale Video Dataset and Progressive Fusion Approach | Yabin Zhu et.al. | 2408.00969 | link |
| 2024-08-06 | Broadband THz wave generation and detection in organic crystal PNPA at MHz repetition rates | Lukasz A. Sterczewski et.al. | 2407.20745 | null |
| 2024-07-16 | Diff-Tracker: Text-to-Image Diffusion Models are Unsupervised Trackers | Zhengbo Zhang et.al. | 2407.08394 | null |
| 2024-07-11 | PINN-Ray: A Physics-Informed Neural Network to Model Soft Robotic Fin Ray Fingers | Xing Wang et.al. | 2407.08222 | null |
| 2024-07-07 | Addressing single object tracking in satellite imagery through prompt-engineered solutions | Athena Psalta et.al. | 2407.05518 | null |
| 2024-07-07 | Learning Motion Blur Robust Vision Transformers with Dynamic Early Exit for Real-Time UAV Tracking | You Wu et.al. | 2407.05383 | null |
| 2024-07-09 | P2P: Part-to-Part Motion Cues Guide a Strong Tracking Framework for LiDAR Point Clouds | Jiahao Nie et.al. | 2407.05238 | link |
| 2024-07-07 | Tracking Reflected Objects: A Benchmark | Xiaoyu Guo et.al. | 2407.05235 | null |
| 2024-07-04 | TrackPGD: A White-box Attack using Binary Masks against Robust Transformer Trackers | Fatemeh Nourilenjan Nokabadi et.al. | 2407.03946 | link |
| 2024-07-02 | FlowTrack: Point-level Flow Network for 3D Single Object Tracking | Shuo Li et.al. | 2407.01959 | null |
| 2024-09-07 | eMoE-Tracker: Environmental MoE-based Transformer for Robust Event-guided Object Tracking | Yucheng Chen et.al. | 2406.20024 | null |
| 2024-06-14 | Constrained Motion Planning for a Robotic Endoscope Holder based on Hierarchical Quadratic Programming | Jacinto Colan et.al. | 2406.09982 | null |
| 2024-06-14 | Robust compressive tracking via online weighted multiple instance learning | Sandeep Singh Sengar et.al. | 2406.09914 | null |
| 2024-07-01 | Adaptively Bypassing Vision Transformer Blocks for Efficient Visual Tracking | Xiangyang Yang et.al. | 2406.08037 | null |
| 2024-06-07 | Multi-Granularity Language-Guided Multi-Object Tracking | Yuhao Li et.al. | 2406.04844 | link |
| 2024-06-02 | Robust Visual Tracking via Iterative Gradient Descent and Threshold Selection | Zhuang Qi et.al. | 2406.00589 | null |
| 2024-05-28 | Reliable Object Tracking by Multimodal Hybrid Feature Extraction and Transformer-Based Fusion | Hongze Sun et.al. | 2405.17903 | link |
| 2024-05-27 | LoReTrack: Efficient and Accurate Low-Resolution Transformer Tracking | Shaohua Dong et.al. | 2405.17660 | null |
| 2024-05-31 | Awesome Multi-modal Object Tracking | Chunhui Zhang et.al. | 2405.14200 | link |
| 2024-05-20 | DTLLM-VLT: Diverse Text Generation for Visual Language Tracking Based on LLM | Xuchen Li et.al. | 2405.12139 | null |
| 2024-05-16 | A Novel Bounding Box Regression Method for Single Object Tracking | Omar Abdelaziz et.al. | 2405.10444 | null |
| 2024-05-16 | Beyond Traditional Single Object Tracking: A Survey | Omar Abdelaziz et.al. | 2405.10439 | null |
| 2024-05-08 | TENet: Targetness Entanglement Incorporating with Multi-Scale Pooling and Mutually-Guided Fusion for RGB-E Object Tracking | Pengcheng Shao et.al. | 2405.05004 | link |
| 2024-04-22 | 360VOTS: Visual Object Tracking and Segmentation in Omnidirectional Videos | Yinzhe Xu et.al. | 2404.13953 | link |
| 2024-05-25 | An Experimental Study on Exploring Strong Lightweight Vision Transformers via Masked Image Modeling Pre-Training | Jin Gao et.al. | 2404.12210 | link |
| 2024-04-16 | Attention-Aware Visualization: Tracking and Responding to User Perception Over Time | Arvind Srinivasan et.al. | 2404.10732 | null |
| 2024-04-15 | Empowering Embodied Visual Tracking with Visual Foundation Models and Offline RL | Fangwei Zhong et.al. | 2404.09857 | null |
| 2024-04-15 | Learning Tracking Representations from Single Point Annotations | Qiangqiang Wu et.al. | 2404.09504 | null |
| 2024-04-11 | PillarTrack: Redesigning Pillar-based Transformer Network for Single Object Tracking on Point Clouds | Weisheng Xu et.al. | 2404.07495 | link |
| 2024-05-02 | Longitudinal Analysis and Quantitative Assessment of Child Development through Mobile Interaction | Juan Carlos Ruiz-Garcia et.al. | 2404.06919 | link |
| 2024-04-09 | LRR: Language-Driven Resamplable Continuous Representation against Adversarial Tracking Attacks | Jianlang Chen et.al. | 2404.06247 | link |
| 2024-04-08 | Semi-Supervised Novelty Detection for Precise Ultra-Wideband Error Signal Prediction | Umberto Albertin et.al. | 2404.05351 | null |
| 2024-03-29 | Context-Aware Integration of Language and Visual References for Natural Language Tracking | Yanyan Shao et.al. | 2403.19975 | null |
| 2024-03-27 | TAFormer: A Unified Target-Aware Transformer for Video and Motion Joint Prediction in Aerial Scenes | Liangyu Xu et.al. | 2403.18238 | null |
| 2024-03-26 | OmniVid: A Generative Framework for Universal Video Understanding | Junke Wang et.al. | 2403.17935 | link |
| 2024-03-26 | Exploring Dynamic Transformer for Efficient Object Tracking | Jiawen Zhu et.al. | 2403.17651 | null |
| 2024-03-29 | Elysium: Exploring Object-level Perception in Videos via MLLM | Han Wang et.al. | 2403.16558 | link |
| 2024-03-25 | Multi-attention Associate Prediction Network for Visual Tracking | Xinglong Sun et.al. | 2403.16395 | null |
| 2024-03-28 | SDSTrack: Self-Distillation Symmetric Adapter Learning for Multi-Modal Visual Object Tracking | Xiaojun Hou et.al. | 2403.16002 | link |
| 2024-03-23 | Spatio-Temporal Bi-directional Cross-frame Memory for Distractor Filtering Point Cloud Single Object Tracking | Shaoyu Sun et.al. | 2403.15831 | null |
| 2024-03-19 | TON-VIO: Online Time Offset Modeling Networks for Robust Temporal Alignment in High Dynamic Motion VIO | Chaoran Xiong et.al. | 2403.12504 | link |
| 2024-03-18 | Pedestrian Tracking with Monocular Camera using Unconstrained 3D Motion Model | Jan Krejčí et.al. | 2403.11978 | null |
| 2024-03-16 | A Spectrum-based Image Denoising Method with Edge Feature Enhancement | Peter Luvton et.al. | 2403.11036 | null |
| 2024-03-15 | Autoregressive Queries for Adaptive Tracking with Spatio-Temporal Transformers | Jinxia Xie et.al. | 2403.10574 | null |
| 2024-03-14 | OneTracker: Unifying Visual Object Tracking with Foundation Models and Efficient Tuning | Lingyi Hong et.al. | 2403.09634 | null |
| 2024-02-27 | ACTrack: Adding Spatio-Temporal Condition for Visual Object Tracking | Yushan Han et.al. | 2403.07914 | null |
| 2024-04-03 | Long-term Frame-Event Visual Tracking: Benchmark Dataset and Baseline | Xiao Wang et.al. | 2403.05839 | link |
| 2024-03-08 | Tracking Meets LoRA: Faster Training, Larger Model, Stronger Performance | Liting Lin et.al. | 2403.05231 | link |
| 2024-03-08 | Motion-Guided Dual-Camera Tracker for Low-Cost Skill Evaluation of Gastric Endoscopy | Yuelin Zhang et.al. | 2403.05146 | link |
| 2024-03-06 | VastTrack: Vast Category Visual Object Tracking | Liang Peng et.al. | 2403.03493 | link |
| 2024-02-28 | Enhancing Tracking Robustness with Auxiliary Adversarial Defense Networks | Zhewei Wu et.al. | 2402.17976 | null |
| 2024-02-26 | SeqTrack3D: Exploring Sequence Information for Robust 3D Point Cloud Tracking | Yu Lin et.al. | 2402.16249 | link |
| 2024-02-26 | Reading Relevant Feature from Global Representation Memory for Visual Object Tracking | Xinyu Zhou et.al. | 2402.14392 | null |
| 2024-02-13 | Optimized Information Flow for Transformer Tracking | Janani Kugarajeevan et.al. | 2402.08195 | link |
| 2024-02-07 | BioDrone: A Bionic Drone-based Single Object Tracking Benchmark for Robust Vision | Xin Zhao et.al. | 2402.04519 | null |
| 2024-02-04 | Spatio-temporal Prompting Network for Robust Video Feature Extraction | Guanxiong Sun et.al. | 2402.02574 | link |
| 2024-01-24 | Small Object Tracking in LiDAR Point Cloud: Learning the Target-awareness Prototype and Fine-grained Search Region | Shengjing Tian et.al. | 2401.13285 | null |
| 2024-01-23 | Correlation-Embedded Transformer Tracking: A Single-Branch Framework | Fei Xie et.al. | 2401.12743 | link |
| 2024-01-20 | Unifying Visual and Vision-Language Tracking via Contrastive Learning | Yinchao Ma et.al. | 2401.11228 | link |
| 2024-01-20 | Towards Category Unification of 3D Single Object Tracking on Point Clouds | Jiahao Nie et.al. | 2401.11204 | null |
| 2024-01-18 | Multi-task Learning for Joint Re-identification, Team Affiliation, and Role Classification for Sports Visual Tracking | Amir M. Mansourian et.al. | 2401.09942 | null |
| 2024-01-12 | Dense Optical Flow Estimation Using Sparse Regularizers from Reduced Measurements | Muhammad Wasim Nawaz et.al. | 2401.06396 | null |
| 2024-01-18 | Hold 'em and Fold 'em: Towards Human-scale, Feedback-Controlled Soft Origami Robots | Immanuel Ampomah Mensah et.al. | 2401.04650 | null |
| 2024-01-06 | Explicit Visual Prompts for Visual Object Tracking | Liangtao Shi et.al. | 2401.03142 | link |
| 2024-01-03 | ODTrack: Online Dense Temporal Token Learning for Visual Tracking | Yaozong Zheng et.al. | 2401.01686 | link |
| 2023-12-27 | X Modality Assisting RGBT Object Tracking | Zhaisheng Ding et.al. | 2312.17273 | null |
| 2023-12-22 | Cross-Modal Object Tracking via Modality-Aware Fusion Network and A Large-Scale Dataset | Lei Liu et.al. | 2312.14446 | link |
| 2023-12-18 | Multi-Correlation Siamese Transformer Network with Dense Connection for 3D Single Object Tracking | Shihao Feng et.al. | 2312.11051 | link |
| 2023-12-17 | Robust 3D Tracking with Quality-Aware Shape Completion | Jingwen Zhang et.al. | 2312.10608 | null |
| 2023-12-15 | Tracking Skiers from the Top to the Bottom | Matteo Dunnhofer et.al. | 2312.09723 | null |
| 2023-12-11 | M3SOT: Multi-frame, Multi-field, Multi-space 3D Single Object Tracking | Jiaming Liu et.al. | 2312.06117 | link |
| 2023-12-07 | Instance Tracking in 3D Scenes from Egocentric Videos | Yunhan Zhao et.al. | 2312.04117 | link |
| 2024-02-19 | Beyond Visual Cues: Synchronously Exploring Target-Centric Semantics for Vision-Language Tracking | Jiawei Ge et.al. | 2311.17085 | null |
| 2023-11-21 | Visual tracking brain computer interface | Changxing Huang et.al. | 2311.12592 | null |
| 2024-01-10 | ViKi-HyCo: A Hybrid-Control approach for complex car-like maneuvers | Edison P. Velasco Sánchez et.al. | 2311.07268 | null |

| Publish Date | Title | Authors | arXiv ID | Code |
|---|---|---|---|---|
| 2025-07-23 | Pretraining on the Test Set Is No Longer All You Need: A Debate-Driven Approach to QA Benchmarks | Linbo Cao et.al. | 2507.17747 | null |
| 2025-07-23 | Rubrics as Rewards: Reinforcement Learning Beyond Verifiable Domains | Anisha Gunjal et.al. | 2507.17746 | null |
| 2025-07-23 | Megrez2 Technical Report | Boxun Li et.al. | 2507.17728 | null |
| 2025-07-23 | BetterCheck: Towards Safeguarding VLMs for Automotive Perception Systems | Malsha Ashani Mahawatta Dona et.al. | 2507.17722 | null |
| 2025-07-23 | AI Telephone Surveying: Automating Quantitative Data Collection with an AI Interviewer | Danny D. Leybzon et.al. | 2507.17718 | null |
| 2025-07-23 | HydraOpt: Navigating the Efficiency-Performance Trade-off of Adapter Merging | Taha Ceritli et.al. | 2507.17706 | null |
| 2025-07-23 | Towards Greater Leverage: Scaling Laws for Efficient Mixture-of-Experts Language Models | Changxin Tian et.al. | 2507.17702 | null |
| 2025-07-23 | Thinking Isn't an Illusion: Overcoming the Limitations of Reasoning Models via Tool Augmentations | Zhao Song et.al. | 2507.17699 | null |
| 2025-07-23 | Symbiotic Agents: A Novel Paradigm for Trustworthy AGI-driven Networks | Ilias Chatzistefanidis et.al. | 2507.17695 | null |
| 2025-07-23 | Simulating multiple human perspectives in socio-ecological systems using large language models | Yongchao Zeng et.al. | 2507.17680 | null |
| 2025-07-23 | See the Forest and the Trees: A Synergistic Reasoning Framework for Knowledge-Based Visual Question Answering | Junjie Wang et.al. | 2507.17659 | null |
| 2025-07-23 | Who Attacks, and Why? Using LLMs to Identify Negative Campaigning in 18M Tweets across 19 Countries | Victor Hartman et.al. | 2507.17636 | null |
| 2025-07-23 | A Hybrid Early-Exit Algorithm for Large Language Models Based on Space Alignment Decoding (SPADE) | Bowen Zheng et.al. | 2507.17618 | null |
| 2025-07-23 | Decoding Consumer Preferences Using Attention-Based Language Models | Joshua Foster et.al. | 2507.17564 | null |
| 2025-07-23 | BoSS: Beyond-Semantic Speech | Qing Wang et.al. | 2507.17563 | null |
| 2025-07-23 | CodeReasoner: Enhancing the Code Reasoning Ability with Reinforcement Learning | Lingxiao Tang et.al. | 2507.17548 | null |
| 2025-07-23 | Anticipate, Simulate, Reason (ASR): A Comprehensive Generative AI Framework for Combating Messaging Scams | Xue Wen Tan et.al. | 2507.17543 | null |
| 2025-07-23 | AssertFlip: Reproducing Bugs via Inversion of LLM-Generated Passing Tests | Lara Khatib et.al. | 2507.17542 | null |
| 2025-07-23 | Constructing Ophthalmic MLLM for Positioning-diagnosis Collaboration Through Clinical Cognitive Chain Reasoning | Xinyao Liu et.al. | 2507.17539 | null |
| 2025-07-23 | InstructVLA: Vision-Language-Action Instruction Tuning from Understanding to Manipulation | Shuai Yang et.al. | 2507.17520 | null |
| 2025-07-22 | Semi-off-Policy Reinforcement Learning for Vision-Language Slow-thinking Reasoning | Junhao Shen et.al. | 2507.16814 | null |
| 2025-07-22 | LingBench++: A Linguistically-Informed Benchmark and Reasoning Framework for Multi-Step and Cross-Cultural Inference with LLMs | Da-Chen Lian et.al. | 2507.16809 | null |
| 2025-07-22 | Rethinking LLM-Based RTL Code Optimization Via Timing Logic Metamorphosis | Zhihao Xu et.al. | 2507.16808 | null |
| 2025-07-22 | Beyond Binary Rewards: Training LMs to Reason About Their Uncertainty | Mehul Damani et.al. | 2507.16806 | null |
| 2025-07-23 | Agentar-Fin-R1: Enhancing Financial Intelligence through Domain Expertise, Training Efficiency, and Advanced Reasoning | Yanjun Zheng et.al. | 2507.16802 | null |
| 2025-07-23 | Test-Time-Matching: Decouple Personality, Memory, and Linguistic Style in LLM-based Role-Playing Language Agent | Xiaoyu Zhan et.al. | 2507.16799 | null |
| 2025-07-22 | Steering Out-of-Distribution Generalization with Concept Ablation Fine-Tuning | Helena Casademunt et.al. | 2507.16795 | null |
| 2025-07-22 | ChatChecker: A Framework for Dialogue System Testing and Evaluation Through Non-cooperative User Simulation | Roman Mayr et.al. | 2507.16792 | null |
| 2025-07-22 | Beyond Context Limits: Subconscious Threads for Long-Horizon Reasoning | Hongyin Luo et.al. | 2507.16784 | null |
| 2025-07-22 | Cooling Matters: Benchmarking Large Language Models and Vision-Language Models on Liquid-Cooled Versus Air-Cooled H100 GPU Systems | Imran Latif et.al. | 2507.16781 | null |
| 2025-07-22 | When LLMs Copy to Think: Uncovering Copy-Guided Attacks in Reasoning LLMs | Yue Li et.al. | 2507.16773 | null |
| 2025-07-22 | WGRAMMAR: Leverage Prior Knowledge to Accelerate Structured Decoding | Ran Wang et.al. | 2507.16768 | null |
| 2025-07-22 | Never Come Up Empty: Adaptive HyDE Retrieval for Improving LLM Developer Support | Fangjian Lei et.al. | 2507.16754 | null |
| 2025-07-22 | CMP: A Composable Meta Prompt for SAM-Based Cross-Domain Few-Shot Segmentation | Shuai Chen et.al. | 2507.16753 | null |
| 2025-07-22 | Collaborative Inference and Learning between Edge SLMs and Cloud LLMs: A Survey of Algorithms, Execution, and Open Challenges | Senyao Li et.al. | 2507.16731 | null |
| 2025-07-23 | Deliberative Searcher: Improving LLM Reliability via Reinforcement Learning with constraints | Zhenyun Yin et.al. | 2507.16727 | null |
| 2025-07-22 | SALM: Spatial Audio Language Model with Structured Embeddings for Understanding and Editing | Jinbo Hu et.al. | 2507.16724 | null |
| 2025-07-22 | Enhancing Remote Sensing Vision-Language Models Through MLLM and LLM-Based High-Quality Image-Text Dataset Generation | Yiguo He et.al. | 2507.16716 | null |
| 2025-07-22 | Experience is the Best Teacher: Grounding VLMs for Robotics through Self-Generated Memory | Guowei Lan et.al. | 2507.16713 | null |
| 2025-07-22 | Advancing Risk and Quality Assurance: A RAG Chatbot for Improved Regulatory Compliance | Lars Hillebrand et.al. | 2507.16711 | null |
| 2025-07-21 | Diffusion Beats Autoregressive in Data-Constrained Settings | Mihir Prabhudesai et.al. | 2507.15857 | null |
| 2025-07-21 | Gemini 2.5 Pro Capable of Winning Gold at IMO 2025 | Yichen Huang et.al. | 2507.15855 | null |
| 2025-07-22 | SeC: Advancing Complex Video Object Segmentation via Progressive Concept Construction | Zhixiong Zhang et.al. | 2507.15852 | null |
| 2025-07-21 | The Other Mind: How Language Models Exhibit Human Temporal Cognition | Lingyu Li et.al. | 2507.15851 | null |
| 2025-07-21 | 3LM: Bridging Arabic, STEM, and Code through Benchmarking | Basma El Amel Boussaha et.al. | 2507.15850 | null |
| 2025-07-21 | The Impact of Language Mixing on Bilingual LLM Reasoning | Yihao Li et.al. | 2507.15849 | null |
| 2025-07-21 | FASTGEN: Fast and Cost-Effective Synthetic Tabular Data Generation with LLMs | Anh Nguyen et.al. | 2507.15839 | null |
| 2025-07-21 | Just Ask for Music (JAM): Multimodal and Personalized Natural Language Music Recommendation | Alessandro B. Melchiorre et.al. | 2507.15826 | null |
| 2025-07-21 | ACS: An interactive framework for conformal selection | Yu Gui et.al. | 2507.15825 | null |
| 2025-07-21 | Can Your Model Separate Yolks with a Water Bottle? Benchmarking Physical Commonsense Understanding in Video Generation Models | Enes Sanli et.al. | 2507.15824 | null |
| 2025-07-21 | Do AI models help produce verified bug fixes? | Li Huang et.al. | 2507.15822 | null |
| 2025-07-21 | LLM Economist: Large Population Models and Mechanism Design in Multi-Agent Generative Simulacra | Seth Karten et.al. | 2507.15815 | null |
| 2025-07-21 | True Multimodal In-Context Learning Needs Attention to the Visual Context | Shuo Chen et.al. | 2507.15807 | null |
| 2025-07-21 | ConformalSAM: Unlocking the Potential of Foundational Segmentation Models in Semi-Supervised Semantic Segmentation with Conformal Prediction | Danhui Chen et.al. | 2507.15803 | null |
| 2025-07-21 | Regularized Low-Rank Adaptation for Few-Shot Organ Segmentation | Ghassen Baklouti et.al. | 2507.15793 | null |
| 2025-07-21 | Small LLMs Do Not Learn a Generalizable Theory of Mind via Reinforcement Learning | Sneheel Sarangi et.al. | 2507.15788 | null |
| 2025-07-21 | Reservoir Computing as a Language Model | Felix Köster et.al. | 2507.15779 | null |
| 2025-07-21 | Stabilizing Knowledge, Promoting Reasoning: Dual-Token Constraints for RLVR | Jiakang Wang et.al. | 2507.15778 | null |
| 2025-07-21 | Left Leaning Models: AI Assumptions on Economic Policy | Maxim Chupilkin et.al. | 2507.15771 | null |
| 2025-07-21 | A Framework for Analyzing Abnormal Emergence in Service Ecosystems Through LLM-based Agent Intention Mining | Yifan Shen et.al. | 2507.15770 | null |
| 2025-07-18 | Franca: Nested Matryoshka Clustering for Scalable Visual Representation Learning | Shashanka Venkataramanan et.al. | 2507.14137 | null |
| 2025-07-18 | CUDA-L1: Improving CUDA Optimization via Contrastive Reinforcement Learning | Xiaoya Li et.al. | 2507.14111 | null |
| 2025-07-18 | Automated Interpretation of Non-Destructive Evaluation Contour Maps Using Large Language Models for Bridge Condition Assessment | Viraj Nishesh Darji et.al. | 2507.14107 | null |
| 2025-07-18 | Generative AI-Driven High-Fidelity Human Motion Simulation | Hari Iyer et.al. | 2507.14097 | null |
| 2025-07-18 | Lessons from the TREC Plain Language Adaptation of Biomedical Abstracts (PLABA) track | Brian Ondov et.al. | 2507.14096 | null |
| 2025-07-18 | DPMT: Dual Process Multi-scale Theory of Mind Framework for Real-time Human-AI Collaboration | Xiyun Li et.al. | 2507.14088 | null |
| 2025-07-18 | DENSE: Longitudinal Progress Note Generation with Temporal Modeling of Heterogeneous Clinical Notes Across Hospital Visits | Garapati Keerthana et.al. | 2507.14079 | null |
| 2025-07-18 | VLA-Mark: A cross modal watermark for large vision-language alignment model | Shuliang Liu et.al. | 2507.14067 | null |
| 2025-07-18 | Foundation Models as Class-Incremental Learners for Dermatological Image Classification | Mohamed Elkhayat et.al. | 2507.14050 | null |
| 2025-07-18 | EdgeVLA: Efficient Vision-Language-Action Models | Paweł Budzianowski et.al. | 2507.14049 | null |
| 2025-07-18 | Evaluating the Effectiveness of Cost-Efficient Large Language Models in Benchmark Biomedical Tasks | Israt Jahan et.al. | 2507.14045 | null |
| 2025-07-18 | Architecting Human-AI Cocreation for Technical Services -- Interaction Modes and Contingency Factors | Jochen Wulf et.al. | 2507.14034 | null |
| 2025-07-18 | KROMA: Ontology Matching with Knowledge Retrieval and Large Language Models | Lam Nguyen et.al. | 2507.14032 | null |
| 2025-07-18 | Moodifier: MLLM-Enhanced Emotion-Driven Image Editing | Jiarong Ye et.al. | 2507.14024 | null |
| 2025-07-18 | Efficient Temporal Tokenization for Mobility Prediction with Large Language Models | Haoyu He et.al. | 2507.14017 | null |
| 2025-07-18 | OrthoInsight: Rib Fracture Diagnosis and Report Generation Based on Multi-Modal Large Models | Ningyong Wu et.al. | 2507.13993 | null |
| 2025-07-18 | Leveraging Pathology Foundation Models for Panoptic Segmentation of Melanoma in H&E Images | Jiaqi Lv et.al. | 2507.13974 | null |
| 2025-07-18 | Bottom-up Domain-specific Superintelligence: A Reliable Knowledge Graph is What We Need | Bhishma Dedhia et.al. | 2507.13966 | null |
| 2025-07-18 | DUALRec: A Hybrid Sequential and Language Model Framework for Context-Aware Movie Recommendation | Yitong Li et.al. | 2507.13957 | null |
| 2025-07-18 | Cross-modal Causal Intervention for Alzheimer's Disease Prediction | Yutao Jin et.al. | 2507.13956 | null |
| 2025-07-17 | VideoITG: Multimodal Video Understanding with Instructed Temporal Grounding | Shihao Wang et.al. | 2507.13353 | null |
| 2025-07-17 | VisionThink: Smart and Efficient Vision Language Model via Reinforcement Learning | Senqiao Yang et.al. | 2507.13348 | null |
| 2025-07-17 | Comparing Apples to Oranges: A Dataset & Analysis of LLM Humour Understanding from Traditional Puns to Topical Jokes | Tyler Loakman et.al. | 2507.13335 | null |
| 2025-07-17 | A Survey of Context Engineering for Large Language Models | Lingrui Mei et.al. | 2507.13334 | null |
| 2025-07-17 | The Imitation Game: Turing Machine Imitator is Length Generalizable Reasoner | Zhouqi Hua et.al. | 2507.13332 | null |
| 2025-07-17 | Vision-and-Language Training Helps Deploy Taxonomic Knowledge but Does Not Fundamentally Alter It | Yulu Qin et.al. | 2507.13328 | null |
| 2025-07-17 | GeoReg: Weight-Constrained Few-Shot Regression for Socio-Economic Estimation using LLM | Kyeongjin Ahn et.al. | 2507.13323 | null |
| 2025-07-17 | HapticCap: A Multimodal Dataset and Task for Understanding User Experience of Vibration Haptic Signals | Guimin Hu et.al. | 2507.13318 | null |
| 2025-07-17 | Revisiting Reliability in the Reasoning-based Pose Estimation Benchmark | Junsu Kim et.al. | 2507.13314 | null |
| 2025-07-17 | The Generative Energy Arena (GEA): Incorporating Energy Awareness in Large Language Model (LLM) Human Evaluations | Carlos Arriaga et.al. | 2507.13302 | null |
| 2025-07-17 | AbGen: Evaluating Large Language Models in Ablation Study Design and Evaluation for Scientific Research | Yilun Zhao et.al. | 2507.13300 | null |
| 2025-07-17 | Overview of the TalentCLEF 2025: Skill and Job Title Intelligence for Human Capital Management | Luis Gasco et.al. | 2507.13275 | null |
| 2025-07-17 | Automating Steering for Safe Multimodal Large Language Models | Lyucheng Wu et.al. | 2507.13255 | null |
| 2025-07-17 | HATS: Hindi Analogy Test Set for Evaluating Reasoning in Large Language Models | Ashray Gupta et.al. | 2507.13238 | null |
| 2025-07-17 | Enhancing Cross-task Transfer of Large Language Models via Activation Steering | Xinyu Tang et.al. | 2507.13236 | null |
| 2025-07-18 | MoTM: Towards a Foundation Model for Time Series Imputation based on Continuous Modeling | Etienne Le Naour et.al. | 2507.13207 | null |
| 2025-07-18 | Automatically assessing oral narratives of Afrikaans and isiXhosa children | Retief Louw et.al. | 2507.13205 | null |
| 2025-07-17 | GEMMAS: Graph-based Evaluation Metrics for Multi Agent Systems | Jisoo Lee et.al. | 2507.13190 | null |
| 2025-07-17 | Black Box Deployed -- Functional Criteria for Artificial Moral Agents in the LLM Era | Matthew E. Brophy et.al. | 2507.13175 | null |
| 2025-07-17 | Inverse Reinforcement Learning Meets Large Language Model Post-Training: Basics, Advances, and Opportunities | Hao Sun et.al. | 2507.13158 | null |
| 2025-07-16 | Language Models Improve When Pretraining Data Matches Target Tasks | David Mizrahi et.al. | 2507.12466 | null |
| 2025-07-16 | PhysX: Physical-Grounded 3D Asset Generation | Ziang Cao et.al. | 2507.12465 | null |
| 2025-07-16 | CytoSAE: Interpretable Cell Embeddings for Hematology | Muhammed Furkan Dasdelen et.al. | 2507.12464 | null |
| 2025-07-16 | Mitigating Object Hallucinations via Sentence-Level Early Intervention | Shangpin Peng et.al. | 2507.12455 | null |
| 2025-07-16 | Characterizing State Space Model (SSM) and SSM-Transformer Hybrid Language Model Performance with Long Context Length | Saptarshi Mitra et.al. | 2507.12442 | null |
| 2025-07-16 | Describe Anything Model for Visual Question Answering on Text-rich Images | Yen-Linh Vu et.al. | 2507.12441 | null |
| 2025-07-16 | Can We Predict Alignment Before Models Finish Thinking? Towards Monitoring Misaligned Reasoning Models | Yik Siu Chan et.al. | 2507.12428 | null |
| 2025-07-16 | Advancing Retrieval-Augmented Generation for Structured Enterprise and Internal Data | Chandana Cheerla et.al. | 2507.12425 | null |
| 2025-07-16 | SWE-Perf: Can Language Models Optimize Code Performance on Real-World Repositories? | Xinyi He et.al. | 2507.12415 | null |
| 2025-07-16 | AutoVDC: Automated Vision Data Cleaning Using Vision-Language Models | Santosh Vasa et.al. | 2507.12414 | null |
| 2025-07-16 | ROC-n-reroll: How verifier imperfection affects test-time scaling | Florian E. Dorner et.al. | 2507.12399 | null |
| 2025-07-16 | Assessing the Value of Visual Input: A Benchmark of Multimodal Large Language Models for Robotic Path Planning | Jacinto Colan et.al. | 2507.12391 | null |
| 2025-07-16 | Probing for Arithmetic Errors in Language Models | Yucheng Sun et.al. | 2507.12379 | null |
| 2025-07-16 | Developing Visual Augmented Q&A System using Scalable Vision Embedding Retrieval & Late Interaction Re-ranker | Rachna Saxena et.al. | 2507.12378 | null |
| 2025-07-16 | Web-Browsing LLMs Can Access Social Media Profiles and Infer User Demographics | Meysam Alizadeh et.al. | 2507.12372 | null |
| 2025-07-16 | Beyond Single Models: Enhancing LLM Detection of Ambiguity in Requests through Debate | Ana Davila et.al. | 2507.12370 | null |
| 2025-07-16 | GitChameleon: Evaluating AI Code Generation Against Python Library Version Incompatibilities | Diganta Misra et.al. | 2507.12367 | null |
| 2025-07-16 | Compositional Discrete Latent Code for High Fidelity, Productive Diffusion Models | Samuel Lavoie et.al. | 2507.12318 | null |
| 2025-07-16 | Thought Purity: Defense Paradigm For Chain-of-Thought Attack | Zihao Xue et.al. | 2507.12314 | null |
| 2025-07-16 | Chain-of-Descriptions: Improving Code LLMs for VHDL Code Generation and Summarization | Prashanth Vijayaraghavan et.al. | 2507.12308 | null |
| 2025-07-15 | Towards Depth Foundation Model: Recent Trends in Vision-Based Depth Estimation | Zhen Xu et.al. | 2507.11540 | null |
| 2025-07-15 | Streaming 4D Visual Geometry Transformer | Dong Zhuo et.al. | 2507.11539 | null |
| 2025-07-15 | DrafterBench: Benchmarking Large Language Models for Tasks Automation in Civil Engineering | Yinsheng Li et.al. | 2507.11527 | null |
| 2025-07-15 | LLM-based ambiguity detection in natural language instructions for collaborative surgical robots | Ana Davila et.al. | 2507.11525 | null |
| 2025-07-15 | AirLLM: Diffusion Policy-based Adaptive LoRA for Remote Fine-Tuning of LLM over the Air | Shiyi Yang et.al. | 2507.11515 | null |
| 2025-07-15 | LRMR: LLM-Driven Relational Multi-node Ranking for Lymph Node Metastasis Assessment in Rectal Cancer | Yaoxian Dong et.al. | 2507.11457 | null |
| 2025-07-16 | Reasoning Strategies in Large Language Models: Can They Follow, Prefer, and Optimize? | Yanjian Zhang et.al. | 2507.11423 | null |
| 2025-07-15 | Quantifying the Energy Consumption and Carbon Emissions of LLM Inference via Simulations | Miray Özcan et.al. | 2507.11417 | null |
| 2025-07-15 | Seq vs Seq: An Open Suite of Paired Encoders and Decoders | Orion Weller et.al. | 2507.11412 | null |
| 2025-07-15 | KisMATH: Do LLMs Have Knowledge of Implicit Structures in Mathematical Reasoning? | Soumadeep Saha et.al. | 2507.11408 | null |
| 2025-07-15 | EXAONE 4.0: Unified Large Language Models Integrating Non-reasoning and Reasoning Modes | LG AI Research et.al. | 2507.11407 | null |
| 2025-07-15 | DCR: Quantifying Data Contamination in LLMs Evaluation | Cheng Xu et.al. | 2507.11405 | null |
| 2025-07-15 | Step-wise Policy for Rare-tool Knowledge (SPaRK): Offline RL that Drives Diverse Tool Use in LLMs | Gabriel Bo et.al. | 2507.11371 | null |
| 2025-07-15 | From Chaos to Automation: Enabling the Use of Unstructured Data for Robotic Process Automation | Kelly Kurowski et.al. | 2507.11364 | null |
| 2025-07-15 | What is the Best Process Model Representation? A Comparative Analysis for Process Modeling with Large Language Models | Alexis Brissard et.al. | 2507.11356 | null |
| 2025-07-15 | Foundation Models for Logistics: Toward Certifiable, Conversational Planning Interfaces | Yunhao Yang et.al. | 2507.11352 | null |
| 2025-07-15 | RefModel: Detecting Refactorings using Foundation Models | Pedro Simões et.al. | 2507.11346 | null |
| 2025-07-15 | Guiding LLM Decision-Making with Fairness Reward Models | Zara Hall et.al. | 2507.11344 | null |
| 2025-07-15 | MonoMVSNet: Monocular Priors Guided Multi-View Stereo Network | Jianfei Jiang et.al. | 2507.11333 | null |
| 2025-07-16 | Automated Novelty Evaluation of Academic Paper: A Collaborative Approach Integrating Human and Large Language Model Knowledge | Wenqing Wu et.al. | 2507.11330 | null |
| 2025-07-14 | EmbRACE-3K: Embodied Reasoning and Action in Complex Environments | Mingxian Lin et.al. | 2507.10548 | null |
| 2025-07-14 | Fusing LLM Capabilities with Routing Data | Tao Feng et.al. | 2507.10540 | null |
| 2025-07-14 | Graph World Model | Tao Feng et.al. | 2507.10539 | null |
| 2025-07-14 | CodeJudgeBench: Benchmarking LLM-as-a-Judge for Coding Tasks | Hongchao Jiang et.al. | 2507.10535 | null |
| 2025-07-14 | Reasoning or Memorization? Unreliable Results of Reinforcement Learning Due to Data Contamination | Mingqi Wu et.al. | 2507.10532 | null |
| 2025-07-14 | Mixture-of-Recursions: Learning Dynamic Recursive Depths for Adaptive Token-Level Computation | Sangmin Bae et.al. | 2507.10524 | null |
| 2025-07-14 | Chat with AI: The Surprising Turn of Real-time Video Communication from Human to AI | Jiangkai Wu et.al. | 2507.10510 | null |
| 2025-07-14 | Scene-Aware Conversational ADAS with Generative AI for Real-Time Driver Assistance | Kyungtae Han et.al. | 2507.10500 | null |
| 2025-07-14 | Can You Detect the Difference? | İsmail Tarım et.al. | 2507.10475 | null |
| 2025-07-14 | MLAR: Multi-layer Large Language Model-based Robotic Process Automation Applicant Tracking | Mohamed T. Younes et.al. | 2507.10472 | null |
| 2025-07-14 | An Empirical Evaluation of AI-Powered Non-Player Characters' Perceived Realism and Performance in Virtual Reality Environments | Mikko Korkiakoski et.al. | 2507.10469 | null |
| 2025-07-14 | Logic layer Prompt Control Injection (LPCI): A Novel Security Vulnerability Class in Agentic Systems | Hammad Atta et.al. | 2507.10457 | null |
| 2025-07-14 | CoralVQA: A Large-Scale Visual Question Answering Dataset for Coral Reef Image Understanding | Hongyong Han et.al. | 2507.10449 | null |
| 2025-07-15 | Text-Visual Semantic Constrained AI-Generated Image Quality Assessment | Qiang Li et.al. | 2507.10432 | null |
| 2025-07-14 | Towards Emotion Co-regulation with LLM-powered Socially Assistive Robots: Integrating LLM Prompts and Robotic Behaviors to Support Parent-Neurodivergent Child Dyads | Jing Li et.al. | 2507.10427 | null |
| 2025-07-14 | Multiple Choice Learning of Low Rank Adapters for Language Modeling | Victor Letzelter et.al. | 2507.10419 | null |
| 2025-07-14 | Zorse: Optimizing LLM Training Efficiency on Heterogeneous GPU Clusters | Runsheng Benson Guo et.al. | 2507.10392 | null |
| 2025-07-14 | Extracting Important Tokens in E-Commerce Queries with a Tag Interaction-Aware Transformer Model | Md. Ahsanul Kabir et.al. | 2507.10385 | null |
| 2025-07-14 | Test-Time Canonicalization by Foundation Models for Robust Perception | Utkarsh Singhal et.al. | 2507.10375 | null |
| 2025-07-14 | Beyond Graph Model: Reliable VLM Fine-Tuning via Random Graph Adapter | Bo Jiang et.al. | 2507.10355 | null |
| 2025-07-11 | The Non-Linear Representation Dilemma: Is Causal Abstraction Enough for Mechanistic Interpretability? | Denis Sutter et.al. | 2507.08802 | null |
| 2025-07-11 | Lumos-1: On Autoregressive Video Generation from a Unified Model Perspective | Hangjie Yuan et.al. | 2507.08801 | null |
| 2025-07-11 | KV Cache Steering for Inducing Reasoning in Small Language Models | Max Belitsky et.al. | 2507.08799 | null |
| 2025-07-11 | One Token to Fool LLM-as-a-Judge | Yulai Zhao et.al. | 2507.08794 | null |
| 2025-07-11 | From One to More: Contextual Part Latents for 3D Generation | Shaocong Dong et.al. | 2507.08772 | null |
| 2025-07-11 | BlockFFN: Towards End-Side Acceleration-Friendly Mixture-of-Experts with Chunk-Level Activation Sparsity | Chenyang Song et.al. | 2507.08771 | null |
| 2025-07-11 | EqualMotion: Accessible Motion Capture for the Creative Industries | Clarice Hilton et.al. | 2507.08744 | null |
| 2025-07-11 | Multilingual Multimodal Software Developer for Code Generation | Linzheng Chai et.al. | 2507.08719 | null |
| 2025-07-11 | Unreal is all you need: Multimodal ISAC Data Simulation with Only One Engine | Kongwu Huang et.al. | 2507.08716 | null |
| 2025-07-11 | KG-Attention: Knowledge Graph-Guided Attention at Test-Time via Bidirectional Information Aggregation | Songlin Zhai et.al. | 2507.08704 | null |
| 2025-07-11 | ByDeWay: Boost Your multimodal LLM with DEpth prompting in a Training-Free Way | Rajarshi Roy et.al. | 2507.08679 | null |
| 2025-07-11 | LLMCup: Ranking-Enhanced Comment Updating with LLMs | Hua Ge et.al. | 2507.08671 | null |
| 2025-07-11 | KELPS: A Framework for Verified Multi-Language Autoformalization via Semantic-Syntactic Alignment | Jiyao Zhang et.al. | 2507.08665 | null |
| 2025-07-11 | Introspection of Thought Helps AI Agents | Haoran Sun et.al. | 2507.08664 | null |
| 2025-07-11 | Leanabell-Prover-V2: Verifier-integrated Reasoning for Formal Theorem Proving via Reinforcement Learning | Xingguang Ji et.al. | 2507.08649 | null |
| 2025-07-11 | DatasetAgent: A Novel Multi-Agent System for Auto-Constructing Datasets from Real-World Images | Haoran Sun et.al. | 2507.08648 | null |
| 2025-07-11 | NL in the Middle: Code Translation with LLMs and Intermediate Representations | Chi-en Amy Tai et.al. | 2507.08627 | null |
| 2025-07-11 | Adaptive Framework for Ambient Intelligence in Rehabilitation Assistance | Gábor Baranyi et.al. | 2507.08624 | null |
| 2025-07-11 | A comprehensive study of LLM-based argument classification: from LLAMA through GPT-4o to Deepseek-R1 | Marcin Pietroń et.al. | 2507.08621 | null |
| 2025-07-11 | Agentic Large Language Models for Conceptual Systems Engineering and Design | Soheyl Massoudi et.al. | 2507.08619 | null |
| 2025-07-10 | Skip a Layer or Loop it? Test-Time Depth Adaptation of Pretrained LLMs | Ziyue Li et.al. | 2507.07996 | null |
| 2025-07-10 | Multigranular Evaluation for Brain Visual Decoding | Weihao Xia et.al. | 2507.07993 | null |
| 2025-07-10 | Multi-Granular Spatio-Temporal Token Merging for Training-Free Acceleration of Video LLMs | Jeongseok Hyun et.al. | 2507.07990 | null |
| 2025-07-10 | Automating Expert-Level Medical Reasoning Evaluation of Large Language Models | Shuang Zhou et.al. | 2507.07988 | null |
| 2025-07-10 | CLIP Won't Learn Object-Attribute Binding from Natural Data and Here is Why | Bijay Gurung et.al. | 2507.07985 | null |
| 2025-07-10 | OST-Bench: Evaluating the Capabilities of MLLMs in Online Spatio-temporal Scene Understanding | JingLi Lin et.al. | 2507.07984 | null |
| 2025-07-10 | Performance and Practical Considerations of Large and Small Language Models in Clinical Decision Support in Rheumatology | Sabine Felde et.al. | 2507.07983 | null |
| 2025-07-10 | Geometry Forcing: Marrying Video Diffusion and 3D Representation for Consistent World Modeling | Haoyu Wu et.al. | 2507.07982 | null |
| 2025-07-10 | Why is Your Language Model a Poor Implicit Reward Model? | Noam Razin et.al. | 2507.07981 | null |
| 2025-07-10 | Defending Against Prompt Injection With a Few DefensiveTokens | Sizhe Chen et.al. | 2507.07974 | null |
| 2025-07-10 | Scaling RL to Long Videos | Yukang Chen et.al. | 2507.07966 | null |
| 2025-07-10 | MIRIX: Multi-Agent Memory System for LLM-Based Agents | Yu Wang et.al. | 2507.07957 | null |
| 2025-07-10 | Dynamic Chunking for End-to-End Hierarchical Sequence Modeling | Sukjun Hwang et.al. | 2507.07955 | null |
| 2025-07-10 | Input Conditioned Layer Dropping in Speech Foundation Models | Abdul Hannan et.al. | 2507.07954 | null |
| 2025-07-10 | SAGE: A Visual Language Model for Anomaly Detection via Fact Enhancement and Entropy-aware Alignment | Guoxin Zang et.al. | 2507.07939 | null |
| 2025-07-10 | Can Large Language Models Improve Phishing Defense? A Large-Scale Controlled Experiment on Warning Dialogue Explanations | Federico Maria Cau et.al. | 2507.07916 | null |
| 2025-07-10 | MIRA: A Novel Framework for Fusing Modalities in Medical RAG | Jinhong Wang et.al. | 2507.07902 | null |
| 2025-07-10 | An Integrated Framework of Prompt Engineering and Multidimensional Knowledge Graphs for Legal Dispute Analysis | Mingda Zhang et.al. | 2507.07893 | null |
| 2025-07-10 | Automating MD simulations for Proteins using Large language Models: NAMD-Agent | Achuth Chandrasekhar et.al. | 2507.07887 | null |
| 2025-07-10 | Opting Out of Generative AI: a Behavioral Experiment on the Role of Education in Perplexity AI Avoidance | Roberto Ulloa et.al. | 2507.07881 | null |
| 2025-07-09 | Towards Multimodal Understanding via Stable Diffusion as a Task-Aware Feature Extractor | Vatsal Agarwal et.al. | 2507.07106 | null |
| 2025-07-09 | 4KAgent: Agentic Any Image to 4K Super-Resolution | Yushen Zuo et.al. | 2507.07105 | null |
| 2025-07-09 | Vision-Language-Vision Auto-Encoder: Scalable Knowledge Distillation from Diffusion Models | Tiezheng Zhang et.al. | 2507.07104 | null |
| 2025-07-09 | Small Batch Size Training for Language Models: When Vanilla SGD Works, and Why Gradient Accumulation Is Wasteful | Martin Marek et.al. | 2507.07101 | null |
| 2025-07-09 | Evaluating Attribute Confusion in Fashion Text-to-Image Generation | Ziyue Liu et.al. | 2507.07079 | null |
| 2025-07-09 | 5C Prompt Contracts: A Minimalist, Creative-Friendly, Token-Efficient Design Framework for Individual and SME LLM Usage | Ugur Ari et.al. | 2507.07045 | null |
| 2025-07-09 | UniConv: Unifying Retrieval and Response Generation for Large Language Models in Conversations | Fengran Mo et.al. | 2507.07030 | null |
| 2025-07-09 | FlexOlmo: Open Language Models for Flexible Data Use | Weijia Shi et.al. | 2507.07024 | null |
| 2025-07-09 | First Return, Entropy-Eliciting Explore | Tianyu Zheng et.al. | 2507.07017 | null |
| 2025-07-09 | Integrating Pathology Foundation Models and Spatial Transcriptomics for Cellular Decomposition from Histology Images | Yutong Sun et.al. | 2507.07013 | null |
| 2025-07-09 | GNN-ViTCap: GNN-Enhanced Multiple Instance Learning with Vision Transformers for Whole Slide Image Classification and Captioning | S M Taslim Uddin Raju et.al. | 2507.07006 | null |
| 2025-07-09 | Learning Deliberately, Acting Intuitively: Unlocking Test-Time Reasoning in Multimodal LLMs | Yahan Yu et.al. | 2507.06999 | null |
| 2025-07-09 | MCA-RG: Enhancing LLMs with Medical Concept Alignment for Radiology Report Generation | Qilong Xing et.al. | 2507.06992 | null |
| 2025-07-09 | Are They All Good? Evaluating the Quality of CoTs in LLM-based Code Generation | Binquan Zhang et.al. | 2507.06980 | null |
| 2025-07-09 | Free on the Fly: Enhancing Flexibility in Test-Time Adaptation with Online EM | Qiyuan Dai et.al. | 2507.06973 | null |
| 2025-07-09 | Scaling Towards the Information Boundary of Instruction Set: InfinityInstruct-Subject Technical Report | Li Du et.al. | 2507.06968 | null |
| 2025-07-09 | CheXPO: Preference Optimization for Chest X-ray VLMs with Counterfactual Rationale | Xiao Liang et.al. | 2507.06959 | null |
| 2025-07-09 | Investigating the Robustness of Retrieval-Augmented Generation at the Query Level | Sezen Perçin et.al. | 2507.06956 | null |
| 2025-07-10 | What Has a Foundation Model Found? Using Inductive Bias to Probe for World Models | Keyon Vafa et.al. | 2507.06952 | null |
| 2025-07-10 | Rethinking Verification for LLM Code Generation: From Generation to Testing | Zihan Ma et.al. | 2507.06920 | null |
| 2025-07-08 | RSRefSeg 2: Decoupling Referring Remote Sensing Image Segmentation with Foundation Models | Keyan Chen et.al. | 2507.06231 | null |
| 2025-07-08 | Efficiency-Effectiveness Reranking FLOPs for LLM-based Rerankers | Zhiyuan Peng et.al. | 2507.06223 | null |
| 2025-07-08 | Aligned Textual Scoring Rules | Yuxuan Lu et.al. | 2507.06221 | null |
| 2025-07-08 | Is Diversity All You Need for Scalable Robotic Manipulation? | Modi Shi et.al. | 2507.06219 | null |
| 2025-07-08 | CultureCLIP: Empowering CLIP with Cultural Awareness through Synthetic Images and Contextualized Captions | Yuchen Huang et.al. | 2507.06210 | null |
| 2025-07-08 | Ontological differentiation as a measure of semantic accuracy | Pablo Garcia-Cuadrillero et.al. | 2507.06208 | null |
| 2025-07-08 | Differential Mamba | Nadav Schneider et.al. | 2507.06204 | null |
| 2025-07-08 | A Survey on Latent Reasoning | Rui-Jie Zhu et.al. | 2507.06203 | null |
| 2025-07-08 | UQLM: A Python Package for Uncertainty Quantification in Large Language Models | Dylan Bouchard et.al. | 2507.06196 | null |
| 2025-07-08 | SQLBarber: A System Leveraging Large Language Models to Generate Customized and Realistic SQL Workloads | Jiale Lao et.al. | 2507.06192 | null |
| 2025-07-08 | The Delta Learning Hypothesis: Preference Tuning on Weak Data can Yield Strong Gains | Scott Geng et.al. | 2507.06187 | null |
| 2025-07-08 | Hidden Prompts in Manuscripts Exploit AI-Assisted Peer Review | Zhicheng Lin et.al. | 2507.06185 | null |
| 2025-07-08 | Enhancing Scientific Visual Question Answering through Multimodal Reasoning and Ensemble Modeling | Prahitha Movva et.al. | 2507.06183 | null |
| 2025-07-08 | Data-Semantics-Aware Recommendation of Diverse Pivot Tables | Whanhee Cho et.al. | 2507.06171 | null |
| 2025-07-09 | Skywork-R1V3 Technical Report | Wei Shen et.al. | 2507.06167 | null |
| 2025-07-08 | Evaluation of Habitat Robotics using Large Language Models | William Li et.al. | 2507.06157 | null |
| 2025-07-08 | Large Language Models Predict Human Well-being -- But Not Equally Everywhere | Pat Pataranutaporn et.al. | 2507.06141 | null |
| 2025-07-08 | LangMamba: A Language-driven Mamba Framework for Low-dose CT Denoising with Vision-language Models | Zhihao Chen et.al. | 2507.06140 | null |
| 2025-07-08 | Coding Triangle: How Does Large Language Model Understand Code? | Taolin Zhang et.al. | 2507.06138 | null |
| 2025-07-08 | PrefixAgent: An LLM-Powered Design Framework for Efficient Prefix Adder Optimization | Dongsheng Zuo et.al. | 2507.06127 | null |
| 2025-07-07 | Beyond Simple Edits: X-Planner for Complex Instruction-Based Image Editing | Chun-Hsiao Yeh et.al. | 2507.05259 | null |
| 2025-07-07 | Spatio-Temporal LLM: Reasoning about Environments and Actions | Haozhen Zheng et.al. | 2507.05258 | null |
| 2025-07-07 | Evaluating Memory in LLM Agents via Incremental Multi-Turn Interactions | Yuanzhe Hu et.al. | 2507.05257 | null |
| 2025-07-07 | Open Vision Reasoner: Transferring Linguistic Cognitive Behavior for Visual Reasoning | Yana Wei et.al. | 2507.05255 | null |
| 2025-07-07 | Response Attack: Exploiting Contextual Priming to Jailbreak Large Language Models | Ziqi Miao et.al. | 2507.05248 | null |
| 2025-07-07 | When Chain of Thought is Necessary, Language Models Struggle to Evade Monitors | Scott Emmons et.al. | 2507.05246 | null |
| 2025-07-07 | StreamVLN: Streaming Vision-and-Language Navigation via SlowFast Context Modeling | Meng Wei et.al. | 2507.05240 | null |
| 2025-07-07 | Logit Reweighting for Topic-Focused Summarization | Joschka Braun et.al. | 2507.05235 | null |
| 2025-07-07 | NavigScene: Bridging Local Perception and Global Navigation for Beyond-Visual-Range Autonomous Driving | Qucheng Peng et.al. | 2507.05227 | null |
| 2025-07-07 | QuEst: Enhancing Estimates of Quantile-Based Distributional Measures Using Model Predictions | Zhun Deng et.al. | 2507.05220 | null |
| 2025-07-07 | All in One: Visual-Description-Guided Unified Point Cloud Segmentation | Zongyan Han et.al. | 2507.05211 | null |
| 2025-07-07 | MedGemma Technical Report | Andrew Sellergren et.al. | 2507.05201 | null |
| 2025-07-07 | Train-before-Test Harmonizes Language Model Rankings | Guanhua Zhang et.al. | 2507.05195 | null |
| 2025-07-07 | CREW-WILDFIRE: Benchmarking Agentic Multi-Agent Collaborations at Scale | Jonathan Hyun et.al. | 2507.05178 | null |
| 2025-07-08 | OpenS2S: Advancing Fully Open-Source End-to-End Empathetic Large Speech Language Model | Chen Wang et.al. | 2507.05177 | null |
| 2025-07-07 | Differential Attention for Multimodal Crisis Event Analysis | Nusrat Munia et.al. | 2507.05165 | null |
| 2025-07-07 | InfoSteer: Steering Information Utility in Language Model Post-Training | Chunyuan Deng et.al. | 2507.05158 | null |
| 2025-07-07 | AI Generated Text Detection Using Instruction Fine-tuned Large Language and Transformer-Based Models | Chinnappa Guggilla et.al. | 2507.05157 | null |
| 2025-07-07 | Interpretable Mnemonic Generation for Kanji Learning via Expectation-Maximization | Jaewook Lee et.al. | 2507.05137 | null |
| 2025-07-07 | LERa: Replanning with Visual Feedback in Instruction Following | Svyatoslav Pchelintsev et.al. | 2507.05135 | null |
| 2025-07-03 | Bootstrapping Grounded Chain-of-Thought in Multimodal LLMs for Data-Efficient Model Adaptation | Jiaer Xia et.al. | 2507.02859 | null |
| 2025-07-03 | Requirements Elicitation Follow-Up Question Generation | Yuchen Shen et.al. | 2507.02858 | null |
| 2025-07-03 | Answer Matching Outperforms Multiple Choice for Language Model Evaluation | Nikhil Chandak et.al. | 2507.02856 | null |
| 2025-07-03 | MOTIF: Modular Thinking via Reinforcement Fine-tuning in LLMs | Purbesh Mitra et.al. | 2507.02851 | null |
| 2025-07-03 | LLM Hypnosis: Exploiting User Feedback for Unauthorized Knowledge Injection to All Users | Almog Hilel et.al. | 2507.02850 | null |
| 2025-07-03 | Visual Contextual Attack: Jailbreaking MLLMs with Image-Driven Context Injection | Ziqi Miao et.al. | 2507.02844 | null |
| 2025-07-03 | LLM-Driven Treatment Effect Estimation Under Inference Time Text Confounding | Yuchen Ma et.al. | 2507.02843 | null |
| 2025-07-03 | StepHint: Multi-level Stepwise Hints Enhance Reinforcement Learning to Reason | Kaiyi Zhang et.al. | 2507.02841 | null |
| 2025-07-03 | ExPO: Unlocking Hard Reasoning with Self-Explanation-Guided Reinforcement Learning | Ruiyang Zhou et.al. | 2507.02834 | null |
| 2025-07-03 | Generalizing Verifiable Instruction Following | Valentina Pyatkin et.al. | 2507.02833 | null |
| 2025-07-03 | SynapseRoute: An Auto-Route Switching Framework on Dual-State Large Language Model | Wencheng Zhang et.al. | 2507.02822 | null |
| 2025-07-03 | Multimodal Mathematical Reasoning with Diverse Solving Perspective | Wenhao Shi et.al. | 2507.02804 | null |
| 2025-07-03 | Is Reasoning All You Need? Probing Bias in the Age of Reasoning Language Models | Riccardo Cantini et.al. | 2507.02799 | null |
| 2025-07-03 | No time to train! Training-Free Reference-Based Instance Segmentation | Miguel Espinosa et.al. | 2507.02798 | null |
| 2025-07-03 | From Long Videos to Engaging Clips: A Human-Inspired Video Editing Framework with Multimodal Narrative Understanding | Xiangfeng Wang et.al. | 2507.02790 | null |
| 2025-07-03 | Moral Responsibility or Obedience: What Do We Want from AI? | Joseph Boland et.al. | 2507.02788 | null |
| 2025-07-03 | Self-Correction Bench: Revealing and Addressing the Self-Correction Blind Spot in LLMs | Ken Tsui et.al. | 2507.02778 | null |
| 2025-07-03 | KERAP: A Knowledge-Enhanced Reasoning Approach for Accurate Zero-shot Diagnosis Prediction Using Multi-agent LLMs | Yuzhang Xie et.al. | 2507.02773 | null |
| 2025-07-03 | DeSTA2.5-Audio: Toward General-Purpose Large Audio Language Model with Self-Generated Cross-Modal Alignment | Ke-Han Lu et.al. | 2507.02768 | null |
| 2025-07-03 | Knowledge Protocol Engineering: A New Paradigm for AI in Domain-Specific Knowledge Work | Guangwei Zhang et.al. | 2507.02760 | null |
| 2025-07-02 | How Well Does GPT-4o Understand Vision? Evaluating Multimodal Foundation Models on Standard Computer Vision Tasks | Rahul Ramachandran et.al. | 2507.01955 | null |
| 2025-07-02 | Kwai Keye-VL Technical Report | Kwai Keye Team et.al. | 2507.01949 | null |
| 2025-07-02 | SpecCLIP: Aligning and Translating Spectroscopic Measurements for Stars | Xiaosheng Zhao et.al. | 2507.01939 | null |
| 2025-07-02 | The Thin Line Between Comprehension and Persuasion in LLMs | Adrian de Wynter et.al. | 2507.01936 | null |
| 2025-07-03 | Large Language Model-Driven Closed-Loop UAV Operation with Semantic Observations | Wenhao Wang et.al. | 2507.01930 | null |
| 2025-07-02 | A Survey on Vision-Language-Action Models: An Action Tokenization Perspective | Yifan Zhong et.al. | 2507.01925 | null |
| 2025-07-03 | Decision-Oriented Text Evaluation | Yu-Shiang Huang et.al. | 2507.01923 | null |
| 2025-07-02 | Gradient-Adaptive Policy Optimization: Towards Multi-Objective Alignment of Large Language Models | Chengao Li et.al. | 2507.01915 | null |
| 2025-07-02 | Reasoning to Edit: Hypothetical Instruction-Based Image Editing with Visual Reasoning | Qingdong He et.al. | 2507.01908 | null |
| 2025-07-02 | AI4Research: A Survey of Artificial Intelligence for Scientific Research | Qiguang Chen et.al. | 2507.01903 | null |
| 2025-07-02 | High-Layer Attention Pruning with Rescaling | Songtao Liu et.al. | 2507.01900 | null |
| 2025-07-02 | MiCoTA: Bridging the Learnability Gap with Intermediate CoT and Teacher Assistants | Dongyi Ding et.al. | 2507.01887 | null |
| 2025-07-02 | A computationally frugal open-source foundation model for thoracic disease detection in lung cancer screening programs | Niccolò McConnell et.al. | 2507.01881 | null |
| 2025-07-02 | Towards Foundation Auto-Encoders for Time-Series Anomaly Detection | Gastón García González et.al. | 2507.01875 | null |
| 2025-07-02 | DIY-MKG: An LLM-Based Polyglot Language Learning System | Kenan Tang et.al. | 2507.01872 | null |
| 2025-07-02 | Bridging UI Design and chatbot Interactions: Applying Form-Based Principles to Conversational Agents | Sanjay Krishna Anbalagan et.al. | 2507.01862 | null |
| 2025-07-02 | TypeTele: Releasing Dexterity in Teleoperation by Dexterous Manipulation Types | Yuhao Lin et.al. | 2507.01857 | null |
| 2025-07-02 | Eka-Eval: A Comprehensive Evaluation Framework for Large Language Models in Indian Languages | Samridhi Raj Sinha et.al. | 2507.01853 | null |
| 2025-07-02 | Low-Perplexity LLM-Generated Sequences and Where To Find Them | Arthur Wuhrmann et.al. | 2507.01844 | null |
| 2025-07-02 | MoIRA: Modular Instruction Routing Architecture for Multi-Task Robotics | Dmytro Kuzmenko et.al. | 2507.01843 | null |
| 2025-07-01 | Teaching Time Series to See and Speak: Forecasting with Aligned Visual and Textual Perspectives | Sixun Dong et.al. | 2506.24124 | null |
| 2025-06-30 | Calligrapher: Freestyle Text Image Customization | Yue Ma et.al. | 2506.24123 | null |
| 2025-06-30 | Data Uniformity Improves Training Efficiency and More, with a Convergence Framework Beyond the NTK Regime | Yuqing Wang et.al. | 2506.24120 | null |
| 2025-07-01 | SPIRAL: Self-Play on Zero-Sum Games Incentivizes Reasoning via Multi-Agent Multi-Turn Reinforcement Learning | Bo Liu et.al. | 2506.24119 | null |
| 2025-07-01 | Intertextual Parallel Detection in Biblical Hebrew: A Transformer-Based Benchmark | David M. Smiley et.al. | 2506.24117 | null |
| 2025-06-30 | On the Predictive Power of Representation Dispersion in Language Models | Yanhong Li et.al. | 2506.24106 | null |
| 2025-06-30 | DenseWorld-1M: Towards Detailed Dense Grounded Caption in the Real World | Xiangtai Li et.al. | 2506.24102 | null |
| 2025-06-30 | MotionGPT3: Human Motion as a Second Modality | Bingfan Zhu et.al. | 2506.24086 | null |
| 2025-06-30 | Logit-Gap Steering: Efficient Short-Suffix Jailbreaks for Aligned Large Language Models | Tung-Ling Li et.al. | 2506.24056 | null |
| 2025-06-30 | Agent.xpu: Efficient Scheduling of Agentic LLM Workloads on Heterogeneous SoC | Xinming Wei et.al. | 2506.24045 | null |
| 2025-06-30 | A Survey on Vision-Language-Action Models for Autonomous Driving | Sicong Jiang et.al. | 2506.24044 | null |
| 2025-06-30 | Foundation Models for Zero-Shot Segmentation of Scientific Images without AI-Ready Data | Shubhabrata Mukherjee et.al. | 2506.24039 | null |
| 2025-06-30 | Ella: Embodied Social Agents with Lifelong Memory | Hongxin Zhang et.al. | 2506.24019 | null |
| 2025-06-30 | EXPERT: An Explainable Image Captioning Evaluation Metric with Structured Explanations | Hyunjong Kim et.al. | 2506.24016 | null |
| 2025-06-30 | Large Language Models Don't Make Sense of Word Problems. A Scoping Review from a Mathematics Education Perspective | Anselm R. Strohmaier et.al. | 2506.24006 | null |
| 2025-06-30 | The Illusion of Progress? A Critical Look at Test-Time Adaptation for Vision-Language Models | Lijun Sheng et.al. | 2506.24000 | null |
| 2025-06-30 | Auto-TA: Towards Scalable Automated Thematic Analysis (TA) via Multi-Agent Large Language Models with Reinforcement Learning | Seungjun Yi et.al. | 2506.23998 | null |
| 2025-06-30 | StyleDrive: Towards Driving-Style Aware Benchmarking of End-To-End Autonomous Driving | Ruiyang Hao et.al. | 2506.23982 | null |
| 2025-06-30 | TaP: A Taxonomy-Guided Framework for Automated and Scalable Preference Data Generation | Renren Jin et.al. | 2506.23979 | null |
| 2025-06-30 | Visual and Memory Dual Adapter for Multi-Modal Object Tracking | Boyue Xu et.al. | 2506.23972 | null |
| 2025-06-27 | MiCo: Multi-image Contrast for Reinforcement Visual Reasoning | Xi Chen et.al. | 2506.22434 | null |
| 2025-06-27 | The Automated LLM Speedrunning Benchmark: Reproducing NanoGPT Improvements | Bingchen Zhao et.al. | 2506.22419 | null |
| 2025-06-27 | Sequential Diagnosis with Language Models | Harsha Nori et.al. | 2506.22405 | null |
| 2025-06-27 | HyperCLOVA X THINK Technical Report | NAVER Cloud HyperCLOVA X Team et.al. | 2506.22403 | null |
| 2025-06-27 | Refining Czech GEC: Insights from a Multi-Experiment Approach | Petr Pechman et.al. | 2506.22402 | null |
| 2025-06-27 | QuickSilver -- Speeding up LLM Inference through Dynamic Token Halting, KV Skipping, Contextual Token Fusion, and Adaptive Matryoshka Quantization | Danush Khanna et.al. | 2506.22396 | null |
| 2025-06-27 | Test-Time Consistency in Vision Language Models | Shih-Han Chou et.al. | 2506.22395 | null |
| 2025-06-27 | What Makes ChatGPT Effective for Software Issue Resolution? An Empirical Study of Developer-ChatGPT Conversations in GitHub | Ramtin Ehsani et.al. | 2506.22390 | null |
| 2025-06-27 | Can Video Large Multimodal Models Think Like Doubters-or Double-Down: A Study on Defeasible Video Entailment | Yue Zhang et.al. | 2506.22385 | null |
| 2025-06-27 | Probabilistic Optimality for Inference-time Scaling | Youkang Wang et.al. | 2506.22376 | null |
| 2025-06-27 | Exploiting Vision Language Model for Training-Free 3D Point Cloud OOD Detection via Graph Score Propagation | Tiankai Chen et.al. | 2506.22375 | null |
| 2025-06-27 | Towards Fair Rankings: Leveraging LLMs for Gender Bias Detection and Measurement | Maryam Mousavian et.al. | 2506.22372 | null |
| 2025-06-27 | Can Large Language Models Help Students Prove Software Correctness? An Experimental Study with Dafny | Carolina Carreira et.al. | 2506.22370 | null |
| 2025-06-27 | DiffSoundStream: Efficient Speech Tokenization via Diffusion Decoding | Yang Yang et.al. | 2506.22362 | null |
| 2025-06-27 | Concept-Level AI for Telecom: Moving Beyond Large Language Models | Viswanath Kumarskandpriya et.al. | 2506.22359 | null |
| 2025-06-27 | Optimal Estimation of Watermark Proportions in Hybrid AI-Human Texts | Xiang Li et.al. | 2506.22343 | null |
| 2025-06-27 | Evaluating Scoring Bias in LLM-as-a-Judge | Qingquan Li et.al. | 2506.22316 | null |
| 2025-06-27 | Detection of Personal Data in Structured Datasets Using a Large Language Model | Albert Agisha Ntwali et.al. | 2506.22305 | null |
| 2025-06-27 | Rethinking Visual Token Reduction in LVLMs under Cross-modal Misalignment | Rui Xu et.al. | 2506.22283 | null |
| 2025-06-27 | COOCO -- Common Objects Out-of-Context -- Semantic Violation in Scenes: Investigating Multimodal Context in Referential Communication | Filippo Merlo et.al. | 2506.22274 | null |
| 2025-06-26 | Where to find Grokking in LLM Pretraining? Monitor Memorization-to-Generalization without Test | Ziyue Li et.al. | 2506.21551 | null |
| 2025-06-26 | mTSBench: Benchmarking Multivariate Time Series Anomaly Detection and Model Selection at Scale | Xiaona Zhou et.al. | 2506.21550 | null |
| 2025-06-26 | SAM4D: Segment Anything in Camera and LiDAR Streams | Jianyun Xu et.al. | 2506.21547 | null |
| 2025-06-26 | Data Efficacy for Language Model Training | Yalun Dai et.al. | 2506.21545 | null |
| 2025-06-26 | PsyLite Technical Report | Fangjun Ding et.al. | 2506.21536 | null |
| 2025-06-26 | Exploring the Design Space of 3D MLLMs for CT Report Generation | Mohammed Baharoon et.al. | 2506.21535 | null |
| 2025-06-26 | "What's Up, Doc?": Analyzing How Users Seek Health Information in Large-Scale Conversational AI Datasets | Akshay Paruchuri et.al. | 2506.21532 | null |
| 2025-06-26 | Potemkin Understanding in Large Language Models | Marina Mancoridis et.al. | 2506.21521 | null |
| 2025-06-26 | Assessing an evolutionary search engine for small language models, prompts, and evaluation metrics | Cláudio Lúcio do Val Lopes et.al. | 2506.21512 | null |
| 2025-06-26 | Mitigating Hallucination of Large Vision-Language Models via Dynamic Logits Calibration | Jiahe Chen et.al. | 2506.21509 | null |
| 2025-06-26 | skLEP: A Slovak General Language Understanding Benchmark | Marek Šuppa et.al. | 2506.21508 | null |
| 2025-06-26 | Mind2Web 2: Evaluating Agentic Search with Agent-as-a-Judge | Boyu Gou et.al. | 2506.21506 | null |
| 2025-06-26 | Bridging Offline and Online Reinforcement Learning for LLMs | Jack Lanchantin et.al. | 2506.21495 | null |
| 2025-06-26 | Global and Local Entailment Learning for Natural World Imagery | Srikumar Sastry et.al. | 2506.21476 | null |
| 2025-06-26 | TopK Language Models | Ryosuke Takahashi et.al. | 2506.21468 | null |
| 2025-06-26 | Efficient and Reuseable Cloud Configuration Search Using Discovery Spaces | Michael Johnston et.al. | 2506.21467 | null |
| 2025-06-26 | Aligning Spoken Dialogue Models from User Interactions | Anne Wu et.al. | 2506.21463 | null |
| 2025-06-26 | Spatial Mental Modeling from Limited Views | Baiqiao Yin et.al. | 2506.21458 | null |
| 2025-06-26 | ThinkSound: Chain-of-Thought Reasoning in Multimodal Large Language Models for Audio Generation and Editing | Huadai Liu et.al. | 2506.21448 | null |
| 2025-06-26 | Text2Cypher Across Languages: Evaluating Foundational Models Beyond English | Makbule Gulcin Ozsoy et.al. | 2506.21445 | null |
| 2025-06-25 | The Decrypto Benchmark for Multi-Agent Reasoning and Theory of Mind | Andrei Lupu et.al. | 2506.20664 | null |
| 2025-06-25 | Memento: Note-Taking for Your Future Self | Chao Wan et.al. | 2506.20642 | null |
| 2025-06-25 | Towards Community-Driven Agents for Machine Learning Engineering | Sijie Li et.al. | 2506.20640 | null |
| 2025-06-26 | DiffuCoder: Understanding and Improving Masked Diffusion Models for Code Generation | Shansan Gong et.al. | 2506.20639 | null |
| 2025-06-25 | Shape2Animal: Creative Animal Generation from Natural Silhouettes | Quoc-Duy Tran et.al. | 2506.20616 | null |
| 2025-06-25 | AI Assistants to Enhance and Exploit the PETSc Knowledge Base | Barry Smith et.al. | 2506.20608 | null |
| 2025-06-25 | Model Editing as a Double-Edged Sword: Steering Agent Ethical Behavior Toward Beneficence or Harm | Baixiang Huang et.al. | 2506.20606 | null |
| 2025-06-25 | Video Perception Models for 3D Scene Synthesis | Rui Huang et.al. | 2506.20601 | null |
| 2025-06-25 | HRIBench: Benchmarking Vision-Language Models for Real-Time Human Perception in Human-Robot Interaction | Zhonghao Shi et.al. | 2506.20566 | null |
| 2025-06-25 | Large Language Model-Driven Code Compliance Checking in Building Information Modeling | Soumya Madireddy et.al. | 2506.20551 | null |
| 2025-06-25 | When Life Gives You Samples: The Benefits of Scaling up Inference Compute for Multilingual LLMs | Ammar Khairi et.al. | 2506.20544 | null |
| 2025-06-25 | WattsOnAI: Measuring, Analyzing, and Visualizing Energy and Carbon Footprint of AI Workloads | Hongzhen Huang et.al. | 2506.20535 | null |
| 2025-06-25 | Case-based Reasoning Augmented Large Language Model Framework for Decision Making in Realistic Safety-Critical Driving Scenarios | Wenbin Gan et.al. | 2506.20531 | null |
| 2025-06-25 | Asymmetric REINFORCE for off-Policy Reinforcement Learning: Balancing positive and negative rewards | Charles Arnal et.al. | 2506.20520 | null |
| 2025-06-25 | OctoThinker: Mid-training Incentivizes Reinforcement Learning Scaling | Zengzhi Wang et.al. | 2506.20512 | null |
| 2025-06-25 | BotHash: Efficient and Training-Free Bot Detection Through Approximate Nearest Neighbor | Edoardo Di Paolo et.al. | 2506.20503 | null |
| 2025-06-25 | ReCode: Updating Code API Knowledge with Reinforcement Learning | Haoze Wu et.al. | 2506.20495 | null |
| 2025-06-25 | Brains and language models converge on a shared conceptual space across different languages | Zaid Zada et.al. | 2506.20489 | null |
| 2025-06-25 | Behavior Foundation Model: Towards Next-Generation Whole-Body Control System of Humanoid Robots | Mingqi Yuan et.al. | 2506.20487 | null |
| 2025-06-25 | Counterfactual Influence as a Distributional Quantity | Matthieu Meeus et.al. | 2506.20481 | null |
| 2025-06-24 | Unified Vision-Language-Action Model | Yuqi Wang et.al. | 2506.19850 | null |
| 2025-06-24 | Orthogonal Finetuning Made Scalable | Zeju Qiu et.al. | 2506.19847 | null |
| 2025-06-24 | JoyAgents-R1: Joint Evolution Dynamics for Versatile Multi-LLM Agents with Reinforcement Learning | Ai Han et.al. | 2506.19846 | null |
| 2025-06-24 | MAM: Modular Multi-Agent Framework for Multi-Modal Medical Diagnosis via Role-Specialized Collaboration | Yucheng Zhou et.al. | 2506.19835 | null |
| 2025-06-24 | Evaluating Compliance with Visualization Guidelines in Diagrams for Scientific Publications Using Large Vision Language Models | Johannes Rückert et.al. | 2506.19825 | null |
| 2025-06-24 | Persona Features Control Emergent Misalignment | Miles Wang et.al. | 2506.19823 | null |
| 2025-06-24 | CronusVLA: Transferring Latent Motion Across Time for Multi-Frame Prediction in Manipulation | Hao Li et.al. | 2506.19816 | null |
| 2025-06-24 | Curating art exhibitions using machine learning | Eurico Covas et.al. | 2506.19813 | null |
| 2025-06-24 | KnowRL: Exploring Knowledgeable Reinforcement Learning for Factuality | Baochang Ren et.al. | 2506.19807 | null |
| 2025-06-24 | LLM-Based Social Simulations Require a Boundary | Zengqing Wu et.al. | 2506.19806 | null |
| 2025-06-24 | KnowML: Improving Generalization of ML-NIDS with Attack Knowledge Graphs | Xin Fan Guo et.al. | 2506.19802 | null |
| 2025-06-24 | Why Do Open-Source LLMs Struggle with Data Analysis? A Systematic Empirical Study | Yuqi Zhu et.al. | 2506.19794 | null |
| 2025-06-24 | SAGE: Strategy-Adaptive Generation Engine for Query Rewriting | Teng Wang et.al. | 2506.19783 | null |
| 2025-06-24 | Multi-Preference Lambda-weighted Listwise DPO for Dynamic Preference Alignment | Yuhui Sun et.al. | 2506.19780 | null |
| 2025-06-24 | SRFT: A Single-Stage Method with Supervised and Reinforcement Fine-Tuning for Reasoning | Yuqian Fu et.al. | 2506.19767 | null |
| 2025-06-24 | Arabic Dialect Classification using RNNs, Transformers, and Large Language Models: A Comparative Analysis | Omar A. Essameldin et.al. | 2506.19753 | null |
| 2025-06-24 | Breaking Barriers: Do Reinforcement Post Training Gains Transfer To Unseen Domains? | Chuxuan Hu et.al. | 2506.19733 | null |
| 2025-06-24 | LLM-Driven Medical Document Analysis: Enhancing Trustworthy Pathology and Differential Diagnosis | Lei Kang et.al. | 2506.19702 | null |
| 2025-06-24 | Outlier-Safe Pre-Training for Robust 4-Bit Quantization of Large Language Models | Jungwoo Park et.al. | 2506.19697 | null |
| 2025-06-24 | UltraAD: Fine-Grained Ultrasound Anomaly Classification via Few-Shot CLIP Adaptation | Yue Zhou et.al. | 2506.19694 | null |
| 2025-06-23 | Vision as a Dialect: Unifying Visual Understanding and Generation via Text-Aligned Representations | Jiaming Han et.al. | 2506.18898 | null |
| 2025-06-23 | ReasonFlux-PRM: Trajectory-Aware PRMs for Long Chain-of-Thought Reasoning in LLMs | Jiaru Zou et.al. | 2506.18896 | null |
| 2025-06-23 | Steering Conceptual Bias via Transformer Latent-Subspace Activation | Vansh Sharma et.al. | 2506.18887 | null |
| 2025-06-23 | Universal Video Temporal Grounding with Generative Multi-modal Large Language Models | Zeqian Li et.al. | 2506.18883 | null |
| 2025-06-23 | OMEGA: Can LLMs Reason Outside the Box in Math? Evaluating Exploratory, Compositional, and Transformative Generalization | Yiyou Sun et.al. | 2506.18880 | null |
| 2025-06-23 | CommVQ: Commutative Vector Quantization for KV Cache Compression | Junyan Li et.al. | 2506.18879 | null |
| 2025-06-23 | OmniAvatar: Efficient Audio-Driven Avatar Video Generation with Adaptive Body Animation | Qijun Gan et.al. | 2506.18866 | null |
| 2025-06-23 | TAMMs: Temporal-Aware Multimodal Model for Satellite Image Change Understanding and Forecasting | Zhongbin Guo et.al. | 2506.18862 | null |
| 2025-06-23 | LongWriter-Zero: Mastering Ultra-Long Text Generation via Reinforcement Learning | Yuhao Wu et.al. | 2506.18841 | null |
| 2025-06-23 | STU-PID: Steering Token Usage via PID Controller for Efficient Large Language Model Reasoning | Aryasomayajula Ram Bharadwaj et.al. | 2506.18831 | null |
| 2025-06-23 | Understanding Software Engineering Agents: A Study of Thought-Action-Result Trajectories | Islem Bouzenia et.al. | 2506.18824 | null |
| 2025-06-23 | RWESummary: A Framework and Test for Choosing Large Language Models to Summarize Real-World Evidence (RWE) Studies | Arjun Mukerji et.al. | 2506.18819 | null |
| 2025-06-23 | Context-Aware CodeLLM Eviction for AI-assisted Coding | Kishanthan Thangarajah et.al. | 2506.18796 | null |
| 2025-06-23 | TRIZ Agents: A Multi-Agent LLM Approach for TRIZ-Based Innovation | Kamil Szczepanik et.al. | 2506.18783 | null |
| 2025-06-23 | Existing LLMs Are Not Self-Consistent For Simple Tasks | Zhenru Lin et.al. | 2506.18781 | null |
| 2025-06-23 | Programming by Backprop: LLMs Acquire Reusable Algorithmic Abstractions During Code Training | Jonathan Cook et.al. | 2506.18777 | null |
| 2025-06-23 | Towards Group Fairness with Multiple Sensitive Attributes in Federated Foundation Models | Yuning Yang et.al. | 2506.18732 | null |
| 2025-06-23 | PARALLELPROMPT: Extracting Parallelism from Large Language Model Queries | Steven Kolawole et.al. | 2506.18728 | null |
| 2025-06-23 | Multi-modal Anchor Gated Transformer with Knowledge Distillation for Emotion Recognition in Conversation | Jie Li et.al. | 2506.18716 | link |
| 2025-06-23 | LLM-enhanced Interactions in Human-Robot Collaborative Drawing with Older Adults | Marianne Bossema et.al. | 2506.18711 | null |
| 2025-06-20 | VLN-R1: Vision-Language Navigation via Reinforcement Fine-Tuning | Zhangyang Qi et.al. | 2506.17221 | null |
| 2025-06-20 | No Free Lunch: Rethinking Internal Feedback for LLM Reasoning | Yanzhi Zhang et.al. | 2506.17219 | null |
| 2025-06-20 | Machine Mental Imagery: Empower Multimodal Reasoning with Latent Visual Tokens | Zeyuan Yang et.al. | 2506.17218 | link |
| 2025-06-20 | BREAD: Branched Rollouts from Expert Anchors Bridge SFT & RL for Reasoning | Xuechen Zhang et.al. | 2506.17211 | null |
| 2025-06-20 | Fine-Tuning Lowers Safety and Disrupts Evaluation Consistency | Kathleen C. Fraser et.al. | 2506.17209 | null |
| 2025-06-20 | Dissecting the SWE-Bench Leaderboards: Profiling Submitters and Architectures of LLM- and Agent-Based Repair Systems | Matias Martinez et.al. | 2506.17208 | null |
| 2025-06-20 | DreamCube: 3D Panorama Generation via Multi-plane Synchronization | Yukun Huang et.al. | 2506.17206 | null |
| 2025-06-20 | Confidence Scoring for LLM-Generated SQL in Supply Chain Data Extraction | Jiekai Ma et.al. | 2506.17203 | null |
| 2025-06-20 | Detecting LLM-Generated Short Answers and Effects on Learner Performance | Shambhavi Bhushan et.al. | 2506.17196 | link |
| 2025-06-20 | CLEAR-3K: Assessing Causal Explanatory Capabilities in Language Models | Naiming Liu et.al. | 2506.17180 | null |
| 2025-06-20 | The MedPerturb Dataset: What Non-Content Perturbations Reveal About Human and Clinical LLM Decision Making | Abinitha Gourabathina et.al. | 2506.17163 | null |
| 2025-06-20 | Analyzing PDFs like Binaries: Adversarially Robust PDF Malware Analysis via Intermediate Representation and Language Model | Side Liu et.al. | 2506.17162 | null |
| 2025-06-20 | Do We Need Large VLMs for Spotting Soccer Actions? | Ritabrata Chakraborty et.al. | 2506.17144 | null |
| 2025-06-20 | MeDi: Metadata-Guided Diffusion Models for Mitigating Biases in Tumor Classification | David Jacob Drexlin et.al. | 2506.17140 | null |
| 2025-06-20 | Large Language Model Unlearning for Source Code | Xue Jiang et.al. | 2506.17125 | null |
| 2025-06-20 | When Can Model-Free Reinforcement Learning be Enough for Thinking? | Josiah P. Hanna et.al. | 2506.17124 | null |
| 2025-06-20 | Cache Me If You Can: How Many KVs Do You Need for Effective Long-Context LMs? | Adithya Bhaskar et.al. | 2506.17121 | link |
| 2025-06-20 | Reassessing Code Authorship Attribution in the Era of Language Models | Atish Kumar Dipongkor et.al. | 2506.17120 | null |
| 2025-06-20 | Are Bias Evaluation Methods Biased? | Lina Berrayana et.al. | 2506.17111 | null |
| 2025-06-20 | Towards Advanced Mathematical Reasoning for LLMs via First-Order Logic Theorem Proving | Chuxue Cao et.al. | 2506.17104 | null |
| 2025-06-18 | PhantomHunter: Detecting Unseen Privately-Tuned LLM-Generated Text via Family-Aware Learning | Yuhui Shi et.al. | 2506.15683 | null |
| 2025-06-18 | GenRecal: Generation after Recalibration from Large to Small Vision-Language Models | Byung-Kwan Lee et.al. | 2506.15681 | null |
| 2025-06-18 | Dense SAE Latents Are Features, Not Bugs | Xiaoqing Sun et.al. | 2506.15679 | null |
| 2025-06-18 | SwarmAgentic: Towards Fully Automated Agentic System Generation via Swarm Intelligence | Yao Zhang et.al. | 2506.15672 | null |
| 2025-06-18 | CC-LEARN: Cohort-based Consistency Learning | Xiao Ye et.al. | 2506.15662 | null |
| 2025-06-18 | PhishDebate: An LLM-Based Multi-Agent Framework for Phishing Website Detection | Wenhao Li et.al. | 2506.15656 | null |
| 2025-06-18 | AutoRule: Reasoning Chain-of-thought Extracted Rule-based Rewards Improve Preference Learning | Tevin Wang et.al. | 2506.15651 | null |
| 2025-06-18 | Dual-Stage Value-Guided Inference with Margin-Based Reward Adjustment for Fast and Faithful VLM Captioning | Ankan Deria et.al. | 2506.15649 | null |
| 2025-06-18 | deepSURF: Detecting Memory Safety Vulnerabilities in Rust Through Fuzzing LLM-Augmented Harnesses | Georgios Androutsopoulos et.al. | 2506.15648 | null |
| 2025-06-18 | Exploring and Exploiting the Inherent Efficiency within Large Reasoning Models for Self-Guided Efficiency Enhancement | Weixiang Zhao et.al. | 2506.15647 | null |
| 2025-06-18 | Demystifying the Visual Quality Paradox in Multimodal Large Language Models | Shuo Xing et.al. | 2506.15645 | null |
| 2025-06-18 | FindingDory: A Benchmark to Evaluate Memory in Embodied Agents | Karmesh Yadav et.al. | 2506.15635 | null |
| 2025-06-18 | Revisiting Compositional Generalization Capability of Large Language Models Considering Instruction Following Ability | Yusuke Sakai et.al. | 2506.15629 | null |
| 2025-06-18 | The Effect of State Representation on LLM Agent Behavior in Dynamic Routing Games | Lyle Goodyear et.al. | 2506.15624 | null |
| 2025-06-18 | The Compositional Architecture of Regret in Large Language Models | Xiangxiang Cui et.al. | 2506.15617 | null |
| 2025-06-18 | BoxFusion: Reconstruction-Free Open-Vocabulary 3D Object Detection via Real-Time Multi-View Box Fusion | Yuqing Lan et.al. | 2506.15610 | null |
| 2025-06-18 | LoX: Low-Rank Extrapolation Robustifies LLM Safety Against Fine-tuning | Gabrel J. Perin et.al. | 2506.15606 | link |
| 2025-06-18 | LiteGD: Lightweight and dynamic GPU Dispatching for Large-scale Heterogeneous Clusters | Kunming Zhang et.al. | 2506.15595 | null |
| 2025-06-18 | WikiMixQA: A Multimodal Benchmark for Question Answering over Tables and Charts | Negar Foroutan et.al. | 2506.15594 | link |
| 2025-06-18 | DiscoSG: Towards Discourse-Level Text Scene Graph Parsing through Iterative Graph Refinement | Shaoqing Lin et.al. | 2506.15583 | link |
| 2025-06-17 | A Variational Framework for Improving Naturalness in Generative Spoken Language Models | Li-Wei Chen et.al. | 2506.14767 | link |
| 2025-06-17 | ASCD: Attention-Steerable Contrastive Decoding for Reducing Hallucination in MLLM | Yujun Wang et.al. | 2506.14766 | null |
| 2025-06-17 | Scaling-Up the Pretraining of the Earth Observation Foundation Model PhilEO to the MajorTOM Dataset | Nikolaos Dionelis et.al. | 2506.14765 | link |
| 2025-06-17 | RobotSmith: Generative Robotic Tool Design for Acquisition of Complex Manipulation Skills | Chunru Lin et.al. | 2506.14763 | null |
| 2025-06-17 | From Bytes to Ideas: Language Modeling with Autoregressive U-Nets | Mathurin Videau et.al. | 2506.14761 | link |
| 2025-06-17 | Reasoning with Exploration: An Entropy Perspective | Daixuan Cheng et.al. | 2506.14758 | null |
| 2025-06-17 | Large Language Models -- the Future of Fundamental Physics? | Caroline Heneka et.al. | 2506.14757 | null |
| 2025-06-17 | Ring-lite: Scalable Reasoning via C3PO-Stabilized Reinforcement Learning for LLMs | Ring Team et.al. | 2506.14731 | null |
| 2025-06-17 | AgentDistill: Training-Free Agent Distillation with Generalizable MCP Boxes | Jiahao Qiu et.al. | 2506.14728 | null |
| 2025-06-17 | Casper: Inferring Diverse Intents for Assistive Teleoperation with Vision Language Models | Huihan Liu et.al. | 2506.14727 | null |
| 2025-06-17 | Capacity Matters: a Proof-of-Concept for Transformer Memorization on Real-World Data | Anton Changalidis et.al. | 2506.14704 | link |
| 2025-06-17 | AGENTSAFE: Benchmarking the Safety of Embodied Agents on Hazardous Instructions | Aishan Liu et.al. | 2506.14697 | null |
| 2025-06-17 | Unified Software Engineering agent as AI Software Engineer | Leonhard Applis et.al. | 2506.14683 | null |
| 2025-06-17 | AIRTBench: Measuring Autonomous AI Red Teaming Capabilities in Language Models | Ads Dawson et.al. | 2506.14682 | link |
| 2025-06-17 | Massive Supervised Fine-tuning Experiments Reveal How Data, Layer, and Training Factors Shape LLM Alignment Quality | Yuto Harada et.al. | 2506.14681 | null |
| 2025-06-17 | Recognition through Reasoning: Reinforcing Image Geo-localization with Large Vision-Language Models | Ling Li et.al. | 2506.14674 | null |
| 2025-06-17 | StreetLens: Enabling Human-Centered AI Agents for Neighborhood Assessment from Street View Imagery | Jina Kim et.al. | 2506.14670 | null |
| 2025-06-17 | GuiLoMo: Allocating Expert Number and Rank for LoRA-MoE via Bilevel Optimization with GuidedSelection Vectors | Hengyuan Zhang et.al. | 2506.14646 | link |
| 2025-06-17 | Passing the Turing Test in Political Discourse: Fine-Tuning LLMs to Mimic Polarized Social Media Comments | . Pazzaglia et.al. | 2506.14645 | null |
| 2025-06-17 | Revisiting Chain-of-Thought Prompting: Zero-shot Can Be Stronger than Few-shot | Xiang Cheng et.al. | 2506.14641 | null |
| 2025-06-16 | Touch begins where vision ends: Generalizable policies for contact-rich manipulation | Zifan Zhao et.al. | 2506.13762 | null |
| 2025-06-16 | Prompting with the Future: Open-World Model Predictive Control with Interactive Digital Twins | Chuanruo Ning et.al. | 2506.13761 | null |
| 2025-06-16 | Discrete Diffusion in Large Language and Multimodal Models: A Survey | Runpeng Yu et.al. | 2506.13759 | link |
| 2025-06-16 | AutoVLA: A Vision-Language-Action Model for End-to-End Autonomous Driving with Adaptive Reasoning and Reinforcement Fine-Tuning | Zewei Zhou et.al. | 2506.13757 | link |
| 2025-06-16 | Steering LLM Thinking with Budget Guidance | Junyan Li et.al. | 2506.13752 | link |
| 2025-06-16 | Evaluating Large Language Models for Phishing Detection, Self-Consistency, Faithfulness, and Explainability | Shova Kuikel et.al. | 2506.13746 | link |
| 2025-06-16 | Instruction Following by Boosting Attention of Large Language Models | Vitoria Guardieiro et.al. | 2506.13734 | null |
| 2025-06-16 | Attribution-guided Pruning for Compression, Circuit Discovery, and Targeted Correction in LLMs | Sayed Mohammad Vakilzadeh Hatefi et.al. | 2506.13727 | link |
| 2025-06-16 | Weakest Link in the Chain: Security Vulnerabilities in Advanced Reasoning Models | Arjun Krishna et.al. | 2506.13726 | null |
| 2025-06-16 | OTFusion: Bridging Vision-only and Vision-Language Models via Optimal Transport for Transductive Zero-Shot Learning | Qiyu Xu et.al. | 2506.13723 | null |
| 2025-06-16 | TimeMaster: Training Time-Series Multimodal LLMs to Reason via Reinforcement Learning | Junru Zhang et.al. | 2506.13705 | link |
| 2025-06-16 | Value-Free Policy Optimization via Reward Partitioning | Bilal Faye et.al. | 2506.13702 | link |
| 2025-06-16 | Balancing Knowledge Delivery and Emotional Comfort in Healthcare Conversational Systems | Shang-Chi Tsai et.al. | 2506.13692 | null |
| 2025-06-16 | What Happens During the Loss Plateau? Understanding Abrupt Learning in Transformers | Pulkit Gopalani et.al. | 2506.13688 | link |
| 2025-06-16 | An LLM's Apology: Outsourcing Awkwardness in the Age of AI | Twm Stone et.al. | 2506.13685 | link |
| 2025-06-16 | Turning Down the Heat: A Critical Analysis of Min-p Sampling in Language Models | Rylan Schaeffer et.al. | 2506.13681 | null |
| 2025-06-16 | ROSA: Harnessing Robot States for Vision-Language and Action Alignment | Yuqing Wen et.al. | 2506.13679 | null |
| 2025-06-16 | Prefix-Tuning+: Modernizing Prefix-Tuning through Attention Independent Prefix Data | Haonan Wang et.al. | 2506.13674 | null |
| 2025-06-16 | We Should Identify and Mitigate Third-Party Safety Risks in MCP-Powered Agent Systems | Junfeng Fang et.al. | 2506.13666 | link |
| 2025-06-16 | DesignCoder: Hierarchy-Aware and Self-Correcting UI Code Generation with Large Language Models | Yunnong Chen et.al. | 2506.13663 | null |
| 2025-06-13 | EMLoC: Emulator-based Memory-efficient Fine-tuning with LoRA Correction | Hsi-Che Lin et.al. | 2506.12015 | null |
| 2025-06-13 | code_transformed: The Influence of Large Language Models on Code | Yuliang Xu et.al. | 2506.12014 | null |
| 2025-06-13 | Tracing LLM Reasoning Processes with Strategic Games: A Framework for Planning, Revision, and Resource-Constrained Decision Making | Xiaopeng Yuan et.al. | 2506.12012 | null |
| 2025-06-13 | Affogato: Learning Open-Vocabulary Affordance Grounding with Automated Data Generation at Scale | Junha Lee et.al. | 2506.12009 | null |
| 2025-06-13 | Generative Representational Learning of Foundation Models for Recommendation | Zheli Zhou et.al. | 2506.11999 | null |
| 2025-06-13 | pLSTM: parallelizable Linear Source Transition Mark networks | Korbinian Pöppel et.al. | 2506.11997 | null |
| 2025-06-13 | VGR: Visual Grounded Reasoning | Jiacong Wang et.al. | 2506.11991 | null |
| 2025-06-13 | How Visual Representations Map to Language Feature Space in Multimodal LLMs | Constantin Venhoff et.al. | 2506.11976 | null |
| 2025-06-13 | Improving Large Language Model Safety with Contrastive Representation Learning | Samuel Simko et.al. | 2506.11938 | link |
| 2025-06-13 | Feedback Friction: LLMs Struggle to Fully Incorporate External Feedback | Dongwei Jiang et.al. | 2506.11930 | null |
| 2025-06-13 | LiveCodeBench Pro: How Do Olympiad Medalists Judge LLMs in Competitive Programming? | Zihan Zheng et.al. | 2506.11928 | null |
| 2025-06-13 | GeistBERT: Breathing Life into German NLP | Raphael Scheible-Schmitt et.al. | 2506.11903 | null |
| 2025-06-13 | Beyond Homogeneous Attention: Memory-Efficient LLMs via Fourier-Approximated KV Cache | Xiaoran Liu et.al. | 2506.11886 | null |
| 2025-06-13 | Addressing Bias in LLMs: Strategies and Application to Fair AI-based Recruitment | Alejandro Peña et.al. | 2506.11880 | null |
| 2025-06-13 | A Short Survey on Formalising Software Requirements using Large Language Models | Arshad Beg et.al. | 2506.11874 | null |
| 2025-06-13 | Post Persona Alignment for Multi-Session Dialogue Generation | Yi-Pei Chen et.al. | 2506.11857 | null |
| 2025-06-13 | TrustGLM: Evaluating the Robustness of GraphLLMs Against Prompt, Text, and Structure Attacks | Qihai Zhang et.al. | 2506.11844 | null |
| 2025-06-13 | Your Ride, Your Rules: Psychology and Cognition Enabled Automated Driving Systems | Zhipeng Bao et.al. | 2506.11842 | null |
| 2025-06-13 | CLEAN-MI: A Scalable and Efficient Pipeline for Constructing High-Quality Neurodata in Motor Imagery Paradigm | Dingkun Liu et.al. | 2506.11830 | null |
| 2025-06-13 | Revealing Political Bias in LLMs through Structured Multi-Agent Debate | Aishwarya Bandaru et.al. | 2506.11825 | link |
| 2025-06-12 | AutoMind: Adaptive Knowledgeable Agent for Automated Data Science | Yixin Ou et.al. | 2506.10974 | link |
| 2025-06-12 | Farseer: A Refined Scaling Law in Large Language Models | Houyi Li et.al. | 2506.10972 | link |
| 2025-06-12 | Beyond Attention or Similarity: Maximizing Conditional Diversity for Token Pruning in MLLMs | Qizhe Zhang et.al. | 2506.10967 | link |
| 2025-06-12 | GENMANIP: LLM-driven Simulation for Generalizable Instruction-Following Manipulation | Ning Gao et.al. | 2506.10966 | null |
| 2025-06-12 | ChineseHarm-Bench: A Chinese Harmful Content Detection Benchmark | Kangwei Liu et.al. | 2506.10960 | link |
| 2025-06-12 | Distillation of atomistic foundation models across architectures and chemical domains | John L. A. Gardner et.al. | 2506.10956 | link |
| 2025-06-12 | SWE-Factory: Your Automated Factory for Issue Resolution Training Data and Evaluation Benchmarks | Lianghong Guo et.al. | 2506.10954 | link |
| 2025-06-12 | Build the web for agents, not agents for the web | Xing Han Lù et.al. | 2506.10953 | null |
| 2025-06-12 | Domain2Vec: Vectorizing Datasets to Find the Optimal Data Mixture without Training | Mozhi Zhang et.al. | 2506.10952 | null |
| 2025-06-12 | Execution Guided Line-by-Line Code Generation | Boaz Lavon et.al. | 2506.10948 | link |
| 2025-06-12 | GUARD: Guided Unlearning and Retention via Data Attribution for Large Language Models | Evelyn Ma et.al. | 2506.10946 | null |
| 2025-06-12 | Self-Adapting Language Models | Adam Zweiger et.al. | 2506.10943 | null |
| 2025-06-12 | Dynamic Epistemic Friction in Dialogue | Timothy Obiso et.al. | 2506.10934 | null |
| 2025-06-12 | The Role of Generative AI in Facilitating Social Interactions: A Scoping Review | T. T. J. E. Arets et.al. | 2506.10927 | null |
| 2025-06-12 | Robustly Improving LLM Fairness in Realistic Settings via Interpretability | Adam Karvonen et.al. | 2506.10922 | link |
| 2025-06-12 | Decomposing MLP Activations into Interpretable Features via Semi-Nonnegative Matrix Factorization | Or Shafran et.al. | 2506.10920 | link |
| 2025-06-12 | Sequential-Parallel Duality in Prefix Scannable Models | Morris Yau et.al. | 2506.10918 | null |
| 2025-06-12 | Foundation Models for Causal Inference via Prior-Data Fitted Networks | Yuchen Ma et.al. | 2506.10914 | null |
| 2025-06-12 | Breaking Bad Molecules: Are MLLMs Ready for Structure-Level Molecular Detoxification? | Fei Lin et.al. | 2506.10912 | null |
| 2025-06-12 | NoLoCo: No-all-reduce Low Communication Training Method for Large Models | Jari Kolehmainen et.al. | 2506.10911 | link |
| 2025-06-11 | Flipping Against All Odds: Reducing LLM Coin Flip Bias via Verbalized Rejection Sampling | Tim Z. Xiao et.al. | 2506.09998 | null |
| 2025-06-11 | From Judgment to Interference: Early Stopping LLM Harmful Outputs via Streaming Content Monitoring | Yang Li et.al. | 2506.09996 | null |
| 2025-06-11 | Large Language Models for Toxic Language Detection in Low-Resource Balkan Languages | Amel Muminovic et.al. | 2506.09992 | link |
| 2025-06-11 | Multiverse: Your Language Models Secretly Decide How to Parallelize and Merge Generation | Xinyu Yang et.al. | 2506.09991 | null |
| 2025-06-11 | EditInspector: A Benchmark for Evaluation of Text-Guided Image Edits | Ron Yosef et.al. | 2506.09988 | null |
| 2025-06-11 | A Shortcut-aware Video-QA Benchmark for Physical Understanding via Minimal Video Pairs | Benno Krojer et.al. | 2506.09987 | null |
| 2025-06-11 | V-JEPA 2: Self-Supervised Video Models Enable Understanding, Prediction and Planning | Mido Assran et.al. | 2506.09985 | link |
| 2025-06-11 | Step-by-step Instructions and a Simple Tabular Output Format Improve the Dependency Parsing Accuracy of LLMs | Hiroshi Matsuda et.al. | 2506.09983 | link |
| 2025-06-11 | AnimateAnyMesh: A Feed-Forward 4D Foundation Model for Text-Driven Universal Mesh Animation | Zijie Wu et.al. | 2506.09982 | null |
| 2025-06-11 | SRLAgent: Enhancing Self-Regulated Learning Skills through Gamification and LLM Assistance | Wentao Ge et.al. | 2506.09968 | null |
| 2025-06-11 | Resa: Transparent Reasoning Models via SAEs | Shangshang Wang et.al. | 2506.09967 | link |
| 2025-06-11 | Reinforcing Spatial Reasoning in Vision-Language Models with Interwoven Thinking and Visual Drawing | Junfei Wu et.al. | 2506.09965 | link |
| 2025-06-11 | Kvasir-VQA-x1: A Multimodal Dataset for Medical Reasoning and Robust MedVQA in Gastrointestinal Endoscopy | Sushant Gautam et.al. | 2506.09958 | null |
| 2025-06-11 | LLMail-Inject: A Dataset from a Realistic Adaptive Prompt Injection Challenge | Sahar Abdelnabi et.al. | 2506.09956 | link |
| 2025-06-11 | Query-Focused Retrieval Heads Improve Long-Context Reasoning and Re-ranking | Wuwei Zhang et.al. | 2506.09944 | link |
| 2025-06-11 | VerIF: Verification Engineering for Reinforcement Learning in Instruction Following | Hao Peng et.al. | 2506.09942 | link |
| 2025-06-11 | From Intention to Execution: Probing the Generalization Boundaries of Vision-Language-Action Models | Irving Fang et.al. | 2506.09930 | null |
| 2025-06-11 | PersonaLens: A Benchmark for Personalization Evaluation in Conversational AI Assistants | Zheng Zhao et.al. | 2506.09902 | link |
| 2025-06-11 | The Emergence of Abstract Thought in Large Language Models Beyond Any Language | Yuxin Chen et.al. | 2506.09890 | null |
| 2025-06-11 | Attention Head Embeddings with Trainable Deep Kernels for Hallucination Detection in LLMs | Rodion Oblovatny et.al. | 2506.09886 | null |
| 2025-06-10 | VIKI-R: Coordinating Embodied Multi-Agent Cooperation via Reinforcement Learning | Li Kang et.al. | 2506.09049 | null |
| 2025-06-10 | Same Task, Different Circuits: Disentangling Modality-Specific Mechanisms in VLMs | Yaniv Nikankin et.al. | 2506.09047 | link |
| 2025-06-10 | Agentic Neural Networks: Self-Evolving Multi-Agent Systems via Textual Backpropagation | Xiaowen Ma et.al. | 2506.09046 | null |
| 2025-06-10 | Cosmos-Drive-Dreams: Scalable Synthetic Driving Data Generation with World Foundation Models | Xuanchi Ren et.al. | 2506.09042 | link |
| 2025-06-10 | Autoregressive Semantic Visual Reconstruction Helps VLMs Understand Better | Dianyi Wang et.al. | 2506.09040 | link |
| 2025-06-10 | AbstentionBench: Reasoning LLMs Fail on Unanswerable Questions | Polina Kirichenko et.al. | 2506.09038 | link |
| 2025-06-10 | FZOO: Fast Zeroth-Order Optimizer for Fine-Tuning Large Language Models towards Adam-Scale Speed | Sizhe Dang et.al. | 2506.09034 | null |
| 2025-06-10 | Router-R1: Teaching LLMs Multi-Round Routing and Aggregation via Reinforcement Learning | Haozhen Zhang et.al. | 2506.09033 | link |
| 2025-06-10 | Do MIL Models Transfer? | Daniel Shao et.al. | 2506.09022 | link |
| 2025-06-10 | SPEED-RL: Faster Training of Reasoning Models via Online Curriculum Learning | Ruiqi Zhang et.al. | 2506.09016 | link |
| 2025-06-10 | Learning to Reason Across Parallel Samples for LLM Reasoning | Jianing Qi et.al. | 2506.09014 | null |
| 2025-06-10 | Boosting Rust Unit Test Coverage through Hybrid Program Analysis and Large Language Models | Bei Chu et.al. | 2506.09002 | null |
| 2025-06-10 | Efficient Medical Vision-Language Alignment Through Adapting Masked Vision Models | Chenyu Lian et.al. | 2506.08990 | link |
| 2025-06-10 | SwS: Self-aware Weakness-driven Problem Synthesis in Reinforcement Learning for LLM Reasoning | Xiao Liang et.al. | 2506.08989 | link |
| 2025-06-10 | On Finetuning Tabular Foundation Models | Ivan Rubachev et.al. | 2506.08982 | link |
| 2025-06-10 | AdaDec: Uncertainty-Guided Adaptive Decoding for LLM-based Code Generation | Kaifeng He et.al. | 2506.08980 | null |
| 2025-06-10 | Propositional Logic for Probing Generalization in Neural Networks | Anna Langedijk et.al. | 2506.08978 | null |
| 2025-06-10 | Atomic-to-Compositional Generalization for Mobile Agents with A New Benchmark and Scheduling System | Yuan Guo et.al. | 2506.08972 | null |
| 2025-06-10 | ADAM: Autonomous Discovery and Annotation Model using LLMs for Context-Aware Annotations | Amirreza Rouhi et.al. | 2506.08968 | null |
| 2025-06-10 | Step-Audio-AQAA: a Fully End-to-End Expressive Large Audio Language Model | Ailin Huang et.al. | 2506.08967 | null |
| 2025-06-09 | GUI-Reflection: Empowering Multimodal GUI Models with Self-Reflection Behavior | Penghao Wu et.al. | 2506.08012 | null |
| 2025-06-09 | Play to Generalize: Learning to Reason Through Game Play | Yunfei Xie et.al. | 2506.08011 | link |
| 2025-06-09 | Vision Transformers Don't Need Trained Registers | Nick Jiang et.al. | 2506.08010 | link |
| 2025-06-09 | Hidden in plain sight: VLMs overlook their visual representations | Stephanie Fu et.al. | 2506.08008 | null |
| 2025-06-09 | Reinforcement Pre-Training | Qingxiu Dong et.al. | 2506.08007 | null |
| 2025-06-09 | Reparameterized LLM Training via Orthogonal Equivalence Transformation | Zeju Qiu et.al. | 2506.08001 | null |
| 2025-06-09 | Supporting Construction Worker Well-Being with a Multi-Agent Conversational AI System | Fan Yang et.al. | 2506.07997 | null |
| 2025-06-09 | HeuriGym: An Agentic Benchmark for LLM-Crafted Heuristics in Combinatorial Optimization | Hongzheng Chen et.al. | 2506.07972 | link |
| 2025-06-09 | CyberV: Cybernetics for Test-time Scaling in Video Understanding | Jiahao Meng et.al. | 2506.07971 | link |
| 2025-06-09 | SpaCE-10: A Comprehensive Benchmark for Multimodal Large Language Models in Compositional Spatial Intelligence | Ziyang Gong et.al. | 2506.07966 | link |
| 2025-06-09 | Reinforcing Multimodal Understanding and Generation with Dual Self-rewards | Jixiang Hong et.al. | 2506.07963 | null |
| 2025-06-09 | Correlated Errors in Large Language Models | Elliot Kim et.al. | 2506.07962 | null |
| 2025-06-09 | BridgeVLA: Input-Output Alignment for Efficient 3D Manipulation Learning with Vision-Language Models | Peiyan Li et.al. | 2506.07961 | null |
| 2025-06-09 | Language Models over Canonical Byte-Pair Encodings | Tim Vieira et.al. | 2506.07956 | null |
| 2025-06-09 | TokenBreak: Bypassing Text Classification Models Through Token Manipulation | Kasimir Schulz et.al. | 2506.07948 | null |
| 2025-06-09 | Statistical Hypothesis Testing for Auditing Robustness in Language Models | Paulius Rauba et.al. | 2506.07947 | null |
| 2025-06-09 | ProtocolLLM: RTL Benchmark for SystemVerilog Generation of Communication Protocols | Arnav Sheth et.al. | 2506.07945 | link |
| 2025-06-09 | Decoupling the Image Perception and Multimodal Reasoning for Reasoning Segmentation with Digital Twin Representations | Yizhen Li et.al. | 2506.07943 | null |
| 2025-06-09 | Adversarial Attack Classification and Robustness Testing for Large Language Models for Code | Yang Liu et.al. | 2506.07942 | null |
| 2025-06-09 | Gradients: When Markets Meet Fine-tuning -- A Distributed Approach to Model Optimisation | Christopher Subia-Waud et.al. | 2506.07940 | null |
| 2025-06-06 | TerraFM: A Scalable Foundation Model for Unified Multisensor Earth Observation | Muhammad Sohail Danish et.al. | 2506.06281 | null |
| 2025-06-06 | Eigenspectrum Analysis of Neural Networks without Aspect Ratio Bias | Yuanzhe Hu et.al. | 2506.06280 | null |
| 2025-06-06 | CoMemo: LVLMs Need Image Context with Image Memory | Shi Liu et.al. | 2506.06279 | null |
| 2025-06-06 | Movie Facts and Fibs (MF | Emmanouil Zaranis et.al. | 2506.06275 | null |
| 2025-06-06 | AdvSumm: Adversarial Training for Bias Mitigation in Text Summarization | Mukur Gupta et.al. | 2506.06273 | null |
| 2025-06-06 | RecGPT: A Foundation Model for Sequential Recommendation | Yangqin Jiang et.al. | 2506.06270 | link |
| 2025-06-06 | Cartridges: Lightweight and general-purpose long context representations via self-study | Sabri Eyuboglu et.al. | 2506.06266 | null |
| 2025-06-06 | PersonaAgent: When Large Language Model Agents Meet Personalization at Test Time | Weizhi Zhang et.al. | 2506.06254 | null |
| 2025-06-06 | DesignBench: A Comprehensive Benchmark for MLLM-based Front-end Code Generation | Jingyu Xiao et.al. | 2506.06251 | link |
| 2025-06-06 | Visual Graph Arena: Evaluating Visual Conceptualization of Vision and Multimodal Large Language Models | Zahra Babaiee et.al. | 2506.06242 | null |
| 2025-06-06 | Bridging External and Parametric Knowledge: Mitigating Hallucination of LLMs with Shared-Private Semantic Synergy in Dual-Stream Knowledge | Yi Sui et.al. | 2506.06240 | null |
| 2025-06-06 | Explaining Matters: Leveraging Definitions and Semantic Expansion for Sexism Detection | Sahrish Khan et.al. | 2506.06238 | null |
| 2025-06-06 | Challenging Vision-Language Models with Surgical Data: A New Dataset and Broad Benchmarking Study | Leon Mayer et.al. | 2506.06232 | null |
| 2025-06-06 | CompilerGPT: Leveraging Large Language Models for Analyzing and Acting on Compiler Optimization Reports | Peter Pirkelbauer et.al. | 2506.06227 | null |
| 2025-06-06 | PROVSYN: Synthesizing Provenance Graphs for Data Augmentation in Intrusion Detection Systems | Yi Huang et.al. | 2506.06226 | null |
| 2025-06-06 | GenIR: Generative Visual Feedback for Mental Image Retrieval | Diji Yang et.al. | 2506.06220 | null |
| 2025-06-06 | STSBench: A Spatio-temporal Scenario Benchmark for Multi-modal Large Language Models in Autonomous Driving | Christian Fruhwirth-Reisinger et.al. | 2506.06218 | link |
| 2025-06-06 | Corrector Sampling in Language Models | Itai Gat et.al. | 2506.06215 | null |
| 2025-06-06 | Can Theoretical Physics Research Benefit from Language Agents? | Sirui Lu et.al. | 2506.06214 | null |
| 2025-06-06 | PuzzleWorld: A Benchmark for Multimodal, Open-Ended Reasoning in Puzzlehunts | Hengzhi Li et.al. | 2506.06211 | null |
| 2025-06-05 | Why LLM Safety Guardrails Collapse After Fine-tuning: A Similarity Analysis Between Alignment and Fine-tuning Datasets | Lei Hsiung et.al. | 2506.05346 | null |
| 2025-06-05 | SparseMM: Head Sparsity Emerges from Visual Concept Responses in MLLMs | Jiahui Wang et.al. | 2506.05344 | link |
| 2025-06-05 | Direct Numerical Layout Generation for 3D Indoor Scene Synthesis via Spatial Reasoning | Xingjian Ran et.al. | 2506.05341 | null |
| 2025-06-05 | Flattery, Fluff, and Fog: Diagnosing and Mitigating Idiosyncratic Biases in Preference Models | Anirudh Bharadwaj et.al. | 2506.05339 | link |
| 2025-06-05 | VideoMolmo: Spatio-Temporal Grounding Meets Pointing | Ghazi Shazan Ahmad et.al. | 2506.05336 | link |
| 2025-06-05 | Search Arena: Analyzing Search-Augmented LLMs | Mihran Miroyan et.al. | 2506.05334 | link |
| 2025-06-05 | Unleashing Hour-Scale Video Training for Long Video-Language Understanding | Jingyang Lin et.al. | 2506.05332 | null |
| 2025-06-05 | MINT-CoT: Enabling Interleaved Visual Tokens in Mathematical Chain-of-Thought Reasoning | Xinyan Chen et.al. | 2506.05331 | link |
| 2025-06-05 | LSM-2: Learning from Incomplete Wearable Sensor Data | Maxwell A. Xu et.al. | 2506.05321 | null |
| 2025-06-06 | Does Your 3D Encoder Really Work? When Pretrain-SFT from 2D VLMs Meets 3D VLMs | Haoyuan Li et.al. | 2506.05318 | null |
| 2025-06-05 | Improving Data Efficiency for LLM Reinforcement Fine-tuning Through Difficulty-targeted Online Data Selection and Rollout Replay | Yifan Sun et.al. | 2506.05316 | null |
| 2025-06-05 | Constrained Entropic Unlearning: A Primal-Dual Framework for Large Language Models | Taha Entesari et.al. | 2506.05314 | null |
| 2025-06-05 | ProRefine: Inference-time Prompt Refinement with Textual Feedback | Deepak Pandita et.al. | 2506.05305 | null |
| 2025-06-05 | Perceive Anything: Recognize, Explain, Caption, and Segment Anything in Images and Videos | Weifeng Lin et.al. | 2506.05302 | null |
| 2025-06-05 | Power Law Guided Dynamic Sifting for Efficient Attention | Nirav Koley et.al. | 2506.05300 | null |
| 2025-06-05 | Control Tax: The Price of Keeping AI in Check | Mikhail Terekhov et.al. | 2506.05296 | null |
| 2025-06-05 | Sample Complexity and Representation Ability of Test-time Scaling Paradigms | Baihe Huang et.al. | 2506.05295 | null |
| 2025-06-05 | EOC-Bench: Can MLLMs Identify, Recall, and Forecast Objects in an Egocentric World? | Yuqian Yuan et.al. | 2506.05287 | null |
| 2025-06-05 | Micro-Act: Mitigate Knowledge Conflict in Question Answering via Actionable Self-Reasoning | Nan Huo et.al. | 2506.05278 | null |
| 2025-06-05 | Teaming in the AI Era: AI-Augmented Frameworks for Forming, Simulating, and Optimizing Human Teams | Mohammed Almutairi et.al. | 2506.05265 | null |
| 2025-06-04 | OWMM-Agent: Open World Mobile Manipulation With Multi-modal Agentic Data Synthesis | Junting Chen et.al. | 2506.04217 | link |
| 2025-06-04 | Language-Image Alignment with Fixed Text Encoders | Jingfeng Yang et.al. | 2506.04209 | null |
| 2025-06-04 | Advancing Multimodal Reasoning: From Optimized Cold Start to Staged Reinforcement Learning | Shuang Chen et.al. | 2506.04207 | null |
| 2025-06-04 | EPiC: Towards Lossless Speedup for Reasoning Training through Edge-Preserving CoT Condensation | Jinghan Jia et.al. | 2506.04205 | link |
| 2025-06-04 | Cascadia: A Cascade Serving System for Large Language Models | Youhe Jiang et.al. | 2506.04203 | null |
| 2025-06-04 | TracLLM: A Generic Framework for Attributing Long Context LLMs | Yanting Wang et.al. | 2506.04202 | link |
| 2025-06-04 | R-Search: Empowering LLM Reasoning with Search via Multi-Reward Reinforcement Learning | Qingfei Zhao et.al. | 2506.04185 | link |
| 2025-06-04 | SuperWriter: Reflection-Driven Long-Form Generation with Large Language Models | Yuhao Wu et.al. | 2506.04180 | null |
| 2025-06-04 | SkipGPT: Dynamic Layer Pruning Reinvented with Token Awareness and Module Decoupling | Anhao Zhao et.al. | 2506.04179 | null |
| 2025-06-04 | Does Prompt Design Impact Quality of Data Imputation by LLMs? | Shreenidhi Srinivasan et.al. | 2506.04172 | null |
| 2025-06-04 | VISCA: Inferring Component Abstractions for Automated End-to-End Testing | Parsa Alian et.al. | 2506.04161 | null |
| 2025-06-04 | Image Editing As Programs with Diffusion Models | Yujia Hu et.al. | 2506.04158 | null |
| 2025-06-04 | A Dataset for Addressing Patient's Information Needs related to Clinical Course of Hospitalization | Sarvesh Soni et.al. | 2506.04156 | null |
| 2025-06-04 | Establishing Trustworthy LLM Evaluation via Shortcut Neuron Analysis | Kejian Zhu et.al. | 2506.04142 | null |
| 2025-06-04 | MMR-V: What's Left Unsaid? A Benchmark for Multimodal Deep Reasoning in Videos | Kejian Zhu et.al. | 2506.04141 | null |
| 2025-06-04 | TRiSM for Agentic AI: A Review of Trust, Risk, and Security Management in LLM-based Agentic Multi-Agent Systems | Shaina Raza et.al. | 2506.04133 | null |
| 2025-06-04 | Recent Advances in Medical Image Classification | Loan Dao et.al. | 2506.04129 | null |
| 2025-06-04 | Guided Speculative Inference for Efficient Test-Time Alignment of LLMs | Jonathan Geuter et.al. | 2506.04118 | link |
| 2025-06-05 | Rectified Sparse Attention | Yutao Sun et.al. | 2506.04108 | null |
| 2025-06-04 | TextAtari: 100K Frames Game Playing with Language Agents | Wenhao Li et.al. | 2506.04098 | link |
| 2025-06-03 | Causal Estimation of Tokenisation Bias | Pietro Lesci et.al. | 2506.03149 | null |
| 2025-06-03 | UniWorld: High-Resolution Semantic Encoders for Unified Visual Understanding and Generation | Bin Lin et.al. | 2506.03147 | null |
| 2025-06-03 | Entity-Augmented Neuroscience Knowledge Retrieval Using Ontology and Semantic Understanding Capability of LLM | Pralaypati Ta et.al. | 2506.03145 | null |
| 2025-06-03 | Not All Tokens Are Meant to Be Forgotten | Xiangyu Zhou et.al. | 2506.03142 | null |
| 2025-06-03 | SVGenius: Benchmarking LLMs in SVG Understanding, Editing and Generation | Siqi Chen et.al. | 2506.03139 | null |
| 2025-06-03 | OmniSpatial: Towards Comprehensive Spatial Reasoning Benchmark for Vision Language Models | Mengdi Jia et.al. | 2506.03135 | null |
| 2025-06-03 | Native-Resolution Image Synthesis | Zidong Wang et.al. | 2506.03131 | null |
| 2025-06-03 | AnimeShooter: A Multi-Shot Animation Dataset for Reference-Guided Video Generation | Lu Qiu et.al. | 2506.03126 | null |
| 2025-06-03 | AUTOCIRCUIT-RL: Reinforcement Learning-Driven LLM for Automated Circuit Topology Generation | Prashanth Vijayaraghavan et.al. | 2506.03122 | null |
| 2025-06-03 | Targeted Forgetting of Image Subgroups in CLIP Models | Zeliang Zhang et.al. | 2506.03117 | null |
| 2025-06-04 | Critique-GRPO: Advancing LLM Reasoning with Natural Language and Numerical Feedback | Xiaoying Zhang et.al. | 2506.03106 | null |
| 2025-06-03 | Beyond Text Compression: Evaluating Tokenizers Across Scales | Jonas F. Lotz et.al. | 2506.03101 | null |
| 2025-06-03 | TalkingMachines: Real-Time Audio-Driven FaceTime-Style Video via Autoregressive Diffusion Models | Chetwin Low et.al. | 2506.03099 | null |
| 2025-06-03 | EgoVLM: Policy Optimization for Egocentric Video Understanding | Ashwin Vinod et.al. | 2506.03097 | link |
| 2025-06-03 | DPO Learning with LLMs-Judge Signal for Computer Use Agents | Man Luo et.al. | 2506.03095 | null |
| 2025-06-03 | From Flat to Hierarchical: Extracting Sparse Representations with Matching Pursuit | Valérie Costa et.al. | 2506.03093 | null |
| 2025-06-03 | Literary Evidence Retrieval via Long-Context Language Models | Katherine Thai et.al. | 2506.03090 | null |
| 2025-06-03 | StreamBP: Memory-Efficient Exact Backpropagation for Long Sequence Training of LLMs | Qijun Luo et.al. | 2506.03077 | null |
| 2025-06-03 | LEG-SLAM: Real-Time Language-Enhanced Gaussian Splatting for SLAM | Roman Titkov et.al. | 2506.03073 | null |
| 2025-06-03 | EDITOR: Effective and Interpretable Prompt Inversion for Text-to-Image Diffusion Models | Mingzhe Li et.al. | 2506.03067 | null |
| 2025-05-30 | ReasonGen-R1: CoT for Autoregressive Image generation models through SFT and RL | Yu Zhang et.al. | 2505.24875 | null |
| 2025-05-30 | The Road to Generalizable Neuro-Symbolic Learning Should be Paved with Foundation Models | Adam Stein et.al. | 2505.24874 | link |
| 2025-05-30 | ProxyThinker: Test-Time Guidance through Small Visual Reasoners | Zilin Xiao et.al. | 2505.24872 | link |
| 2025-05-30 | MoDoMoDo: Multi-Domain Data Mixtures for Multimodal LLM Reinforcement Learning | Yiqing Liang et.al. | 2505.24871 | null |
| 2025-05-30 | GenSpace: Benchmarking Spatially-Aware Image Generation | Zehan Wang et.al. | 2505.24870 | null |
| 2025-05-30 | SiLVR: A Simple Language-based Video Reasoning Framework | Ce Zhang et.al. | 2505.24869 | link |
| 2025-05-30 | Time Blindness: Why Video-Language Models Can't See What Humans Can? | Ujjwal Upadhyay et.al. | 2505.24867 | null |
| 2025-05-30 | ProRL: Prolonged Reinforcement Learning Expands Reasoning Boundaries in Large Language Models | Mingjie Liu et.al. | 2505.24864 | link |
| 2025-05-30 | Beyond Multiple Choice: Evaluating Steering Vectors for Adaptive Free-Form Summarization | Joschka Braun et.al. | 2505.24859 | null |
| 2025-05-30 | Accelerated Sampling from Masked Diffusion Models via Entropy Bounded Unmasking | Heli Ben-Hamu et.al. | 2505.24857 | null |
| 2025-05-30 | MiCRo: Mixture Modeling and Context-aware Routing for Personalized Preference Learning | Jingyan Shen et.al. | 2505.24846 | null |
| 2025-05-30 | Chameleon: A Flexible Data-mixing Framework for Language Model Pretraining and Finetuning | Wanyun Xie et.al. | 2505.24844 | link |
| 2025-05-30 | Cascading Adversarial Bias from Injection to Distillation in Language Models | Harsh Chaudhari et.al. | 2505.24842 | null |
| 2025-05-30 | Vision LLMs Are Bad at Hierarchical Visual Understanding, and LLMs Are the Bottleneck | Yuwen Tan et.al. | 2505.24840 | null |
| 2025-05-30 | VideoCAD: A Large-Scale Video Dataset for Learning UI Interactions and 3D Reasoning from CAD Software | Brandon Man et.al. | 2505.24838 | link |
| 2025-06-02 | How much do language models memorize? | John X. Morris et.al. | 2505.24832 | null |
| 2025-05-30 | Improving Reliability and Explainability of Medical Question Answering through Atomic Fact Checking in Retrieval-Augmented LLMs | Juraj Vladika et.al. | 2505.24830 | null |
| 2025-05-30 | LegalEval-Q: A New Benchmark for The Quality Evaluation of LLM-Generated Legal Text | Li yunhan et.al. | 2505.24826 | link |
| 2025-05-30 | PhySense: Principle-Based Physics Reasoning Benchmarking for Large Language Models | Yinggan Xu et.al. | 2505.24823 | null |
| 2025-05-30 | Bi-Manual Joint Camera Calibration and Scene Representation | Haozhan Tang et.al. | 2505.24819 | null |
| 2025-05-29 | TextRegion: Text-Aligned Region Tokens from Frozen Image-Text Models | Yao Xiao et.al. | 2505.23769 | link |
| 2025-05-29 | Argus: Vision-Centric Reasoning with Grounded Chain-of-Thought | Yunze Man et.al. | 2505.23766 | null |
| 2025-05-29 | From Chat Logs to Collective Insights: Aggregative Question Answering | Wentao Zhang et.al. | 2505.23765 | null |
| 2025-05-29 | MMSI-Bench: A Benchmark for Multi-Image Spatial Intelligence | Sihan Yang et.al. | 2505.23764 | null |
| 2025-05-29 | ZeroGUI: Automating Online GUI Learning at Zero Human Cost | Chenyu Yang et.al. | 2505.23762 | link |
| 2025-05-29 | Differential Information: An Information-Theoretic Perspective on Preference Optimization | Yunjae Won et.al. | 2505.23761 | null |
| 2025-05-29 | Puzzled by Puzzles: When Vision-Language Models Can't Take a Hint | Heekyung Lee et.al. | 2505.23759 | link |
| 2025-05-29 | DeepTheorem: Advancing LLM Reasoning for Theorem Proving Through Natural Language and Reinforcement Learning | Ziyin Zhang et.al. | 2505.23754 | link |
| 2025-05-29 | ThinkGeo: Evaluating Tool-Augmented Agents for Remote Sensing Tasks | Akashah Shabbir et.al. | 2505.23752 | link |
| 2025-05-29 | Distortion of AI Alignment: Does Preference Optimization Optimize for Preferences? | Paul Gölz et.al. | 2505.23749 | null |
| 2025-05-29 | Spatial-MLLM: Boosting MLLM Capabilities in Visual-based Spatial Intelligence | Diankun Wu et.al. | 2505.23747 | null |
| 2025-05-29 | To Trust Or Not To Trust Your Vision-Language Model's Prediction | Hao Dong et.al. | 2505.23745 | link |
| 2025-05-29 | LayerPeeler: Autoregressive Peeling for Layer-wise Image Vectorization | Ronghuan Wu et.al. | 2505.23740 | null |
| 2025-05-29 | ATLAS: Learning to Optimally Memorize the Context at Test Time | Ali Behrouz et.al. | 2505.23735 | null |
| 2025-05-29 | Bounded Rationality for LLMs: Satisficing Alignment at Inference-Time | Mohamad Chehade et.al. | 2505.23729 | null |
| 2025-05-29 | PixelThink: Towards Efficient Chain-of-Pixel Reasoning | Song Wang et.al. | 2505.23727 | null |
| 2025-05-29 | FMG-Det: Foundation Model Guided Robust Object Detection | Darryl Hannan et.al. | 2505.23726 | null |
| 2025-05-29 | MuLoCo: Muon is a practical inner optimizer for DiLoCo | Benjamin Thérien et.al. | 2505.23725 | null |
| 2025-05-29 | SC-LoRA: Balancing Efficient Fine-tuning and Knowledge Preservation via Subspace-Constrained LoRA | Minrui Luo et.al. | 2505.23724 | null |
| 2025-05-29 | ML-Agent: Reinforcing LLM Agents for Autonomous Machine Learning Engineering | Zexi Liu et.al. | 2505.23723 | link |
| 2025-05-28 | Zero-Shot Vision Encoder Grafting via LLM Surrogates | Kaiyu Yue et.al. | 2505.22664 | link |
| 2025-05-28 | Training Free Stylized Abstraction | Aimon Rahman et.al. | 2505.22663 | null |
| 2025-05-28 | AutoL2S: Auto Long-Short Reasoning for Efficient Large Language Models | Feng Luo et.al. | 2505.22662 | null |
| 2025-05-28 | GuessArena: Guess Who I Am? A Self-Adaptive Framework for Evaluating LLMs in Domain-Specific Knowledge and Reasoning | Qingchen Yu et.al. | 2505.22661 | null |
| 2025-05-28 | Maximizing Confidence Alone Improves Reasoning | Mihir Prabhudesai et.al. | 2505.22660 | null |
| 2025-05-28 | 3DLLM-Mem: Long-Term Spatial-Temporal Memory for Embodied 3D Large Language Model | Wenbo Hu et.al. | 2505.22657 | null |
| 2025-05-28 | Position: Uncertainty Quantification Needs Reassessment for Large-language Model Agents | Michael Kirchhof et.al. | 2505.22655 | null |
| 2025-05-28 | VScan: Rethinking Visual Token Reduction for Efficient Large Vision-Language Models | Ce Zhang et.al. | 2505.22654 | null |
| 2025-05-28 | The Climb Carves Wisdom Deeper Than the Summit: On the Noisy Rewards in Learning to Reason | Ang Lv et.al. | 2505.22653 | null |
| 2025-05-28 | Sherlock: Self-Correcting Reasoning in Vision-Language Models | Yi Ding et.al. | 2505.22651 | null |
| 2025-05-28 | Characterizing Bias: Benchmarking Large Language Models in Simplified versus Traditional Chinese | Hanjia Lyu et.al. | 2505.22645 | link |
| 2025-05-28 | Understanding (Un)Reliability of Steering Vectors in Language Models | Joschka Braun et.al. | 2505.22637 | null |
| 2025-05-28 | Learning Composable Chains-of-Thought | Fangcong Yin et.al. | 2505.22635 | null |
| 2025-05-28 | Spatial Knowledge Graph-Guided Multimodal Synthesis | Yida Xue et.al. | 2505.22633 | null |
| 2025-05-28 | Stochastic Chameleons: Irrelevant Context Hallucinations Reveal Class-Based (Mis)Generalization in LLMs | Ziling Cheng et.al. | 2505.22630 | null |
| 2025-05-28 | Principled Out-of-Distribution Generalization via Simplicity | Jiawei Ge et.al. | 2505.22622 | null |
| 2025-05-28 | Fast-dLLM: Training-free Acceleration of Diffusion LLM by Enabling KV Cache and Parallel Decoding | Chengyue Wu et.al. | 2505.22618 | null |
| 2025-05-28 | The Entropy Mechanism of Reinforcement Learning for Reasoning Language Models | Ganqu Cui et.al. | 2505.22617 | null |
| 2025-05-28 | RICO: Improving Accuracy and Completeness in Image Recaptioning via Visual Reconstruction | Yuchi Wang et.al. | 2505.22613 | null |
| 2025-05-28 | Effective and Efficient One-pass Compression of Speech Foundation Models Using Sparsity-aware Self-pinching Gates | Haoning Xu et.al. | 2505.22608 | null |
| 2025-05-27 | Silence is Not Consensus: Disrupting Agreement Bias in Multi-Agent LLMs via Catfish Agent for Clinical Decision Making | Yihan Wang et.al. | 2505.21503 | null |
| 2025-05-27 | ViewSpatial-Bench: Evaluating Multi-perspective Spatial Localization in Vision-Language Models | Dingming Li et.al. | 2505.21500 | null |
| 2025-05-27 | AdInject: Real-World Black-Box Attacks on Web Agents via Advertising Delivery | Haowei Wang et.al. | 2505.21499 | link |
| 2025-05-27 | Adversarial Attacks against Closed-Source MLLMs via Feature Optimal Alignment | Xiaojun Jia et.al. | 2505.21494 | link |
| 2025-05-27 | Reinforcing General Reasoning without Verifiers | Xiangxin Zhou et.al. | 2505.21493 | link |
| 2025-05-27 | Robust Hypothesis Generation: LLM-Automated Language Bias for Inductive Logic Programming | Yang Yang et.al. | 2505.21486 | null |
| 2025-05-27 | Are Language Models Consequentialist or Deontological Moral Reasoners? | Keenan Samway et.al. | 2505.21479 | null |
| 2025-05-27 | Policy Optimized Text-to-Image Pipeline Design | Uri Gadot et.al. | 2505.21478 | null |
| 2025-05-27 | Mitigating Hallucination in Large Vision-Language Models via Adaptive Attention Calibration | Mehrdad Fazli et.al. | 2505.21472 | null |
| 2025-05-27 | Scaling External Knowledge Input Beyond Context Windows of LLMs via Multi-Agent Collaboration | Zijun Liu et.al. | 2505.21471 | link |
| 2025-05-27 | Accelerating Diffusion Language Model Inference via Efficient KV Caching and Guided Diffusion | Zhanqiu Hu et.al. | 2505.21467 | null |
| 2025-05-27 | ID-Align: RoPE-Conscious Position Remapping for Dynamic High-Resolution Adaptation in Vision-Language Models | Bozhou Li et.al. | 2505.21465 | null |
| 2025-05-27 | LazyVLM: Neuro-Symbolic Approach to Video Analytics | Xiangru Jian et.al. | 2505.21459 | null |
| 2025-05-27 | Do LLMs Need to Think in One Language? Correlation between Latent Language and Task Performance | Shintaro Ozaki et.al. | 2505.21458 | null |
| 2025-05-27 | Active-O3: Empowering Multimodal Large Language Models with Active Perception via GRPO | Muzhi Zhu et.al. | 2505.21457 | null |
| 2025-05-27 | Can Large Reasoning Models Self-Train? | Sheikh Shafayat et.al. | 2505.21444 | null |
| 2025-05-27 | Towards Better Instruction Following Retrieval Models | Yuchen Zhuang et.al. | 2505.21439 | null |
| 2025-05-27 | Hume: Introducing System-2 Thinking in Visual-Language-Action Model | Haoming Song et.al. | 2505.21432 | null |
| 2025-05-27 | Policy Induction: Predicting Startup Success via Explainable Memory-Augmented In-Context Learning | Xianling Mu et.al. | 2505.21427 | null |
| 2025-05-27 | GUARD: Dual-Agent based Backdoor Defense on Chain-of-Thought in Neural Code Generation | Naizhu Jin et.al. | 2505.21425 | null |
| 2025-05-26 | Pangu Light: Weight Re-Initialization for Pruning and Accelerating LLMs | Hanting Chen et.al. | 2505.20155 | null |
| 2025-05-26 | UORA: Uniform Orthogonal Reinitialization Adaptation in Parameter-Efficient Fine-Tuning of Large Models | Xueyan Zhang et.al. | 2505.20154 | null |
| 2025-05-26 | MineAnyBuild: Benchmarking Spatial Planning for Open-world AI Agents | Ziming Wei et.al. | 2505.20148 | link |
| 2025-05-26 | FUDOKI: Discrete Flow-based Unified Understanding and Generation via Kinetic-Optimal Velocities | Jin Wang et.al. | 2505.20147 | null |
| 2025-05-26 | SeMe: Training-Free Language Model Merging via Semantic Alignment | Jian Gu et.al. | 2505.20144 | null |
| 2025-05-26 | StructEval: Benchmarking LLMs' Capabilities to Generate Structural Outputs | Jialin Yang et.al. | 2505.20139 | null |
| 2025-05-26 | AweDist: Attention-aware Embedding Distillation for New Input Token Embeddings | Konstantin Dobler et.al. | 2505.20133 | null |
| 2025-05-26 | Agentic 3D Scene Generation with Spatially Contextualized VLMs | Xinhang Liu et.al. | 2505.20129 | null |
| 2025-05-26 | Iterative Self-Incentivization Empowers Large Language Models as Agentic Searchers | Zhengliang Shi et.al. | 2505.20128 | link |
| 2025-05-26 | Agentic AI Process Observability: Discovering Behavioral Variability | Fabiana Fournier et.al. | 2505.20127 | null |
| 2025-05-26 | MEBench: A Novel Benchmark for Understanding Mutual Exclusivity Bias in Vision-Language Models | Anh Thai et.al. | 2505.20122 | null |
| 2025-05-27 | TrojanStego: Your Language Model Can Secretly Be A Steganographic Privacy Leaking Agent | Dominik Meier et.al. | 2505.20118 | link |
| 2025-05-26 | Named Entity Recognition in Historical Italian: The Case of Giacomo Leopardi's Zibaldone | Cristian Santini et.al. | 2505.20113 | null |
| 2025-05-26 | ResSVD: Residual Compensated SVD for Large Language Model Compression | Haolei Bai et.al. | 2505.20112 | null |
| 2025-05-26 | Language-Agnostic Suicidal Risk Detection Using Large Language Models | June-Woo Kim et.al. | 2505.20109 | null |
| 2025-05-26 | Adaptive Deep Reasoning: Triggering Deep Thinking When Needed | Yunhao Wang et.al. | 2505.20101 | null |
| 2025-05-26 | AdaTP: Attention-Debiased Token Pruning for Video Large Language Models | Fengyuan Sun et.al. | 2505.20100 | null |
| 2025-05-26 | Large Language Models Meet Knowledge Graphs for Question Answering: Synthesis and Opportunities | Chuangtao Ma et.al. | 2505.20099 | link |
| 2025-05-26 | S2LPP: Small-to-Large Prompt Prediction across LLMs | Liang Cheng et.al. | 2505.20097 | null |
| 2025-05-26 | Multi-Domain Explainability of Preferences | Nitay Calderon et.al. | 2505.20088 | null |
| 2025-05-26 | Safety Through Reasoning: An Empirical Study of Reasoning Guardrail Models | Makesh Narsimhan Sreedhar et.al. | 2505.20087 | null |
| 2025-05-26 | Inference-time Alignment in Continuous Space | Yige Yuan et.al. | 2505.20081 | link |
| 2025-05-23 | Fann or Flop: A Multigenre, Multiera Benchmark for Arabic Poetry Understanding in LLMs | Wafa Alghallabi et.al. | 2505.18152 | link |
| 2025-05-23 | First Finish Search: Efficient Test-Time Scaling in Large Language Models | Aradhye Agarwal et.al. | 2505.18149 | null |
| 2025-05-23 | Lost in the Haystack: Smaller Needles are More Difficult for LLMs to Find | Owen Bianchi et.al. | 2505.18148 | null |
| 2025-05-23 | Graph-Linguistic Fusion: Using Language Models for Wikidata Vandalism Detection | Mykola Trokhymovych et.al. | 2505.18136 | null |
| 2025-05-23 | Gaming Tool Preferences in Agentic LLMs | Kazem Faghih et.al. | 2505.18135 | link |
| 2025-05-23 | VideoGameBench: Can Vision-Language Models complete popular video games? | Alex L. Zhang et.al. | 2505.18134 | null |
| 2025-05-23 | One RL to See Them All: Visual Triple Unified Reinforcement Learning | Yan Ma et.al. | 2505.18129 | null |
| 2025-05-23 | Reward Model Overoptimisation in Iterated RLHF | Lorenz Wolf et.al. | 2505.18126 | null |
| 2025-05-23 | TabSTAR: A Foundation Tabular Model With Semantically Target-Aware Representations | Alan Arazi et.al. | 2505.18125 | null |
| 2025-05-23 | UNJOIN: Enhancing Multi-Table Text-to-SQL Generation via Schema Simplification | Poojah Ganesan et.al. | 2505.18122 | null |
| 2025-05-23 | ProgRM: Build Better GUI Agents with Progress Rewards | Danyang Zhang et.al. | 2505.18121 | null |
| 2025-05-23 | Bidirectional Knowledge Distillation for Enhancing Sequential Recommendation with Large Language Models | Jiongran Wu et.al. | 2505.18120 | null |
| 2025-05-23 | Watch and Listen: Understanding Audio-Visual-Speech Moments with Multimodal LLM | Zinuo Li et.al. | 2505.18110 | null |
| 2025-05-23 | ManuSearch: Democratizing Deep Search in Large Language Models with a Transparent and Open Multi-Agent Framework | Lisheng Huang et.al. | 2505.18105 | link |
| 2025-05-23 | How Can I Publish My LLM Benchmark Without Giving the True Answers Away? | Takashi Ishida et.al. | 2505.18102 | null |
| 2025-05-23 | Planning without Search: Refining Frontier LLMs with Offline Goal-Conditioned RL | Joey Hong et.al. | 2505.18098 | null |
| 2025-05-23 | QwenLong-CPRS: Towards | Weizhou Shen et.al. | 2505.18092 | null |
| 2025-05-23 | Data Mixing Can Induce Phase Transitions in Knowledge Acquisition | Xinran Gu et.al. | 2505.18091 | null |
| 2025-05-23 | CXReasonBench: A Benchmark for Evaluating Structured Diagnostic Reasoning in Chest X-rays | Hyungyung Lee et.al. | 2505.18087 | link |
| 2025-05-23 | Deep Video Discovery: Agentic Search with Tool Use for Long-form Video Understanding | Xiaoyi Zhang et.al. | 2505.18079 | null |
| 2025-05-22 | CrossLMM: Decoupling Long Video Sequences from LMMs via Dual Cross-Attention Mechanisms | Shilin Yan et.al. | 2505.17020 | link |
| 2025-05-22 | Let Androids Dream of Electric Sheep: A Human-like Image Implication Understanding and Reasoning Framework | Chenhao Zhang et.al. | 2505.17019 | link |
| 2025-05-22 | SophiaVL-R1: Reinforcing MLLMs Reasoning with Thinking Reward | Kaixuan Fan et.al. | 2505.17018 | link |
| 2025-05-22 | Delving into RL for Image Generation with CoT: A Study on DPO vs. GRPO | Chengzhuo Tong et.al. | 2505.17017 | link |
| 2025-05-22 | Multi-SpatialMLLM: Multi-Frame Spatial Understanding with Multi-Modal Large Language Models | Runsen Xu et.al. | 2505.17015 | null |
| 2025-05-22 | SpatialScore: Towards Unified Evaluation for Multimodal Spatial Understanding | Haoning Wu et.al. | 2505.17012 | link |
| 2025-05-22 | R1-Searcher++: Incentivizing the Dynamic Knowledge Acquisition of LLMs via Reinforcement Learning | Huatong Song et.al. | 2505.17005 | link |
| 2025-05-22 | Do Large Language Models Excel in Complex Logical Reasoning with Formal Language? | Jin Jiang et.al. | 2505.16998 | link |
| 2025-05-22 | DecoupledESC: Enhancing Emotional Support Generation via Strategy-Response Decoupled Preference Optimization | Chao Zhang et.al. | 2505.16995 | null |
| 2025-05-22 | Dimple: Discrete Diffusion Multimodal Large Language Model with Parallel Decoding | Runpeng Yu et.al. | 2505.16990 | link |
| 2025-05-22 | T1: A Tool-Oriented Conversational Dataset for Multi-Turn Agentic Planning | Amartya Chakraborty et.al. | 2505.16986 | null |
| 2025-05-22 | UFT: Unifying Supervised and Reinforcement Fine-Tuning | Mingyang Liu et.al. | 2505.16984 | link |
| 2025-05-22 | LLM as Effective Streaming Processor: Bridging Streaming-Batch Mismatches with Group Position Encoding | Junlong Tong et.al. | 2505.16983 | link |
| 2025-05-22 | Beyond Correlation: Towards Causal Large Language Model Agents in Biomedicine | Adib Bazgir et.al. | 2505.16982 | null |
| 2025-05-22 | HyGenar: An LLM-Driven Hybrid Genetic Algorithm for Few-Shot Grammar Generation | Weizhi Tang et.al. | 2505.16978 | link |
| 2025-05-22 | SWE-Dev: Evaluating and Training Autonomous Feature-Driven Software Development | Yaxin Du et.al. | 2505.16975 | link |
| 2025-05-22 | CASS: Nvidia to AMD Transpilation with Data, Models, and Benchmark | Ahmed Heakl et.al. | 2505.16968 | link |
| 2025-05-22 | Invisible Prompts, Visible Threats: Malicious Font Injection in External Resources for Large Language Models | Junjie Xiong et.al. | 2505.16957 | null |
| 2025-05-22 | On Multilingual Encoder Language Model Compression for Low-Resource Languages | Daniil Gurgurov et.al. | 2505.16956 | null |
| 2025-05-22 | A Comprehensive Evaluation of Contemporary ML-Based Solvers for Combinatorial Optimization | Shengyu Feng et.al. | 2505.16952 | null |
| 2025-05-21 | InstructSAM: A Training-Free Framework for Instruction-Oriented Remote Sensing Object Recognition | Yijie Zheng et.al. | 2505.15818 | link |
| 2025-05-21 | On the creation of narrow AI: hierarchy and nonlocality of neural network skills | Eric J. Michaud et.al. | 2505.15811 | link |
| 2025-05-21 | MMaDA: Multimodal Large Diffusion Language Models | Ling Yang et.al. | 2505.15809 | link |
| 2025-05-21 | The Atlas of In-Context Learning: How Attention Heads Shape In-Context Retrieval Augmentation | Patrick Kahardipraja et.al. | 2505.15807 | link |
| 2025-05-21 | Keep Security! Benchmarking Security Policy Preservation in Large Language Model Contexts Against Indirect Attacks in Question Answering | Hwan Chang et.al. | 2505.15805 | link |
| 2025-05-21 | STAR-R1: Spacial TrAnsformation Reasoning by Reinforcing Multimodal LLMs | Zongzhao Li et.al. | 2505.15804 | link |
| 2025-05-21 | VerifyBench: Benchmarking Reference-based Reward Systems for Large Language Models | Yuchen Yan et.al. | 2505.15801 | null |
| 2025-05-21 | Model Merging is Secretly Certifiable: Non-Vacuous Generalisation Bounds for Low-Shot Learning | Taehoon Kim et.al. | 2505.15798 | null |
| 2025-05-21 | Reverse Engineering Human Preferences with Reinforcement Learning | Lisa Alazraki et.al. | 2505.15795 | null |
| 2025-05-21 | HCRMP: A LLM-Hinted Contextual Reinforcement Learning Framework for Autonomous Driving | Zhiwen Chen et.al. | 2505.15793 | null |
| 2025-05-21 | Large Language Models as Computable Approximations to Solomonoff Induction | Jun Wan et.al. | 2505.15784 | null |
| 2025-05-21 | dKV-Cache: The Cache for Diffusion Language Models | Xinyin Ma et.al. | 2505.15781 | link |
| 2025-05-21 | ConvSearch-R1: Enhancing Query Reformulation for Conversational Search with Reasoning via Reinforcement Learning | Changtai Zhu et.al. | 2505.15776 | link |
| 2025-05-21 | Beyond Hard and Soft: Hybrid Context Compression for Balancing Local and Global Information Retention | Huanxuan Liao et.al. | 2505.15774 | link |
| 2025-05-21 | MIKU-PAL: An Automated and Standardized Multi-Modal Method for Speech Paralinguistic and Affect Labeling | Cheng Yifan et.al. | 2505.15772 | null |
| 2025-05-21 | An Empirical Analysis of Vulnerability Detection Tools for Solidity Smart Contracts Using Line Level Manually Annotated Vulnerabilities | Francesco Salzano et.al. | 2505.15756 | null |
| 2025-05-21 | Exploring The Visual Feature Space for Multimodal Neural Decoding | Weihao Xia et.al. | 2505.15755 | null |
| 2025-05-21 | Scalable Defense against In-the-wild Jailbreaking Attacks with Safety Context Retrieval | Taiye Chen et.al. | 2505.15753 | null |
| 2025-05-21 | Multi-modal Integration Analysis of Alzheimer's Disease Using Large Language Models and Knowledge Graphs | Kanan Kiguchi et.al. | 2505.15747 | null |
| 2025-05-21 | Evolutionary Computation and Large Language Models: A Survey of Methods, Synergies, and Applications | Dikshit Chauhan et.al. | 2505.15741 | null |
| 2025-05-20 | Language Models use Lookbacks to Track Beliefs | Nikhil Prakash et.al. | 2505.14685 | null |
| 2025-05-20 | Mind the Gap: Bridging Thought Leap for Improved Chain-of-Thought Tuning | Haolei Xu et.al. | 2505.14684 | null |
| 2025-05-20 | Emerging Properties in Unified Multimodal Pretraining | Chaorui Deng et.al. | 2505.14683 | null |
| 2025-05-20 | UniGen: Enhanced Training & Test-Time Strategies for Unified Multimodal Understanding and Generation | Rui Tian et.al. | 2505.14682 | null |
| 2025-05-20 | UltraEdit: Training-, Subject-, and Memory-Free Lifelong Editing in Large Language Models | Xiaojie Gu et.al. | 2505.14679 | link |
| 2025-05-20 | Visionary-R1: Mitigating Shortcuts in Visual Reasoning with Reinforcement Learning | Jiaer Xia et.al. | 2505.14677 | null |
| 2025-05-20 | Reward Reasoning Model | Jiaxin Guo et.al. | 2505.14674 | null |
| 2025-05-20 | UniCTokens: Boosting Personalized Understanding and Generation via Unified Concept Tokens | Ruichuan An et.al. | 2505.14671 | link |
| 2025-05-20 | Quartet: Native FP4 Training Can Be Optimal for Large Language Models | Roberto L. Castro et.al. | 2505.14669 | link |
| 2025-05-20 | ContextAgent: Context-Aware Proactive LLM Agents with Open-World Sensory Perceptions | Bufang Yang et.al. | 2505.14668 | null |
| 2025-05-20 | Beyond Words: Multimodal LLM Knows When to Speak | Zikai Liao et.al. | 2505.14654 | null |
| 2025-05-20 | General-Reasoner: Advancing LLM Reasoning Across All Domains | Xueguang Ma et.al. | 2505.14652 | null |
| 2025-05-20 | Vox-Profile: A Speech Foundation Model Benchmark for Characterizing Diverse Speaker and Speech Traits | Tiantian Feng et.al. | 2505.14648 | link |
| 2025-05-20 | CAD-Coder: An Open-Source Vision-Language Model for Computer-Aided Design Code Generation | Anna C. Doris et.al. | 2505.14646 | link |
| 2025-05-20 | Think Only When You Need with Large Hybrid-Reasoning Models | Lingjie Jiang et.al. | 2505.14631 | null |
| 2025-05-20 | KERL: Knowledge-Enhanced Personalized Recipe Recommendation using Large Language Models | Fnu Mohbat et.al. | 2505.14629 | link |
| 2025-05-20 | Debating for Better Reasoning: An Unsupervised Multimodal Approach | Ashutosh Adhikari et.al. | 2505.14627 | null |
| 2025-05-20 | TinyV: Reducing False Negatives in Verification Improves RL for LLM Reasoning | Zhangchen Xu et.al. | 2505.14625 | link |
| 2025-05-20 | Enhancing Learned Knowledge in LoRA Adapters Through Efficient Contrastive Decoding on Ascend NPUs | Morgan Lindsay Heisler et.al. | 2505.14620 | null |
| 2025-05-20 | Linear Control of Test Awareness Reveals Differential Compliance in Reasoning Models | Sahar Abdelnabi et.al. | 2505.14617 | link |
| 2025-05-19 | CIE: Controlling Language Model Text Generations Using Continuous Signals | Vinay Samuel et.al. | 2505.13448 | link |
| 2025-05-19 | Trust, But Verify: A Self-Verification Approach to Reinforcement Learning with Verifiable Rewards | Xiaoyuan Liu et.al. | 2505.13445 | link |
| 2025-05-19 | ChartMuseum: Testing Visual Reasoning Capabilities of Large Vision-Language Models | Liyan Tang et.al. | 2505.13444 | null |
| 2025-05-19 | GraspMolmo: Generalizable Task-Oriented Grasping via Large-Scale Synthetic Data Generation | Abhay Deshpande et.al. | 2505.13441 | null |
| 2025-05-19 | Optimizing Anytime Reasoning via Budget Relative Policy Optimization | Penghui Qi et.al. | 2505.13438 | link |
| 2025-05-19 | SMOTExT: SMOTE meets Large Language Models | Mateusz Bystroński et.al. | 2505.13434 | null |
| 2025-05-19 | Fine-tuning Quantized Neural Networks with Zeroth-order Optimization | Sifeng Shang et.al. | 2505.13430 | link |
| 2025-05-19 | MM-PRM: Enhancing Multimodal Mathematical Reasoning with Scalable Step-Level Supervision | Lingxiao Du et.al. | 2505.13427 | link |
| 2025-05-19 | G1: Bootstrapping Perception and Reasoning Abilities of Vision-Language Model via Reinforcement Learning | Liang Chen et.al. | 2505.13426 | link |
| 2025-05-19 | Learnware of Language Models: Specialized Small Language Models Can Do Big | Zhi-Hao Tan et.al. | 2505.13425 | link |
| 2025-05-19 | Make Still Further Progress: Chain of Thoughts for Tabular Data Leaderboard | Si-Yang Liu et.al. | 2505.13421 | null |
| 2025-05-19 | FEALLM: Advancing Facial Emotion Analysis in Multimodal Large Language Models with Emotional Synergy and Reasoning | Zhuozhao Hu et.al. | 2505.13419 | link |
| 2025-05-19 | CoT-Kinetics: A Theoretical Modeling Assessing LRM Reasoning Process | Jinhe Bi et.al. | 2505.13408 | null |
| 2025-05-19 | AutoMathKG: The automated mathematical knowledge graph based on LLM and vector database | Rong Bian et.al. | 2505.13406 | null |
| 2025-05-19 | MR. Judge: Multimodal Reasoner as a Judge | Renjie Pi et.al. | 2505.13403 | null |
| 2025-05-19 | R3: Robust Rubric-Agnostic Reward Models | David Anugraha et.al. | 2505.13388 | link |
| 2025-05-19 | CompeteSMoE -- Statistically Guaranteed Mixture of Experts Training via Competition | Nam V. Nguyen et.al. | 2505.13380 | link |
| 2025-05-19 | Thinkless: LLM Learns When to Think | Gongfan Fang et.al. | 2505.13379 | link |
| 2025-05-19 | Seeing, Saying, Solving: An LLM-to-TL Framework for Cooperative Robots | Dan BW Choe et.al. | 2505.13376 | null |
| 2025-05-19 | Multi-Armed Bandits Meet Large Language Models | Djallel Bouneffouf et.al. | 2505.13355 | null |
| 2025-05-16 | Modeling cognitive processes of natural reading with transformer-based Language Models | Bruno Bianchi et.al. | 2505.11485 | null |
| 2025-05-16 | msf-CNN: Patch-based Multi-Stage Fusion with Convolutional Neural Networks for TinyML | Zhaolan Huang et.al. | 2505.11483 | link |
| 2025-05-16 | Improving Assembly Code Performance with Large Language Models via Reinforcement Learning | Anjiang Wei et.al. | 2505.11480 | null |
| 2025-05-16 | HelpSteer3-Preference: Open Human-Annotated Preference Data across Diverse Tasks and Languages | Zhilin Wang et.al. | 2505.11475 | null |
| 2025-05-16 | Disentangling Reasoning and Knowledge in Medical Large Language Models | Rahul Thapa et.al. | 2505.11462 | null |
| 2025-05-16 | ProxyPrompt: Securing System Prompts against Prompt Extraction Attacks | Zhixiong Zhuang et.al. | 2505.11459 | null |
| 2025-05-16 | LLMs unlock new paths to monetizing exploits | Nicholas Carlini et.al. | 2505.11449 | null |
| 2025-05-16 | Is Compression Really Linear with Code Intelligence? | Xianzhen Luo et.al. | 2505.11441 | null |
| 2025-05-16 | GODBench: A Benchmark for Multimodal Large Language Models in Video Comment Art | Chenkai Zhang et.al. | 2505.11436 | link |
| 2025-05-16 | MegaScale-MoE: Large-Scale Communication-Efficient Training of Mixture-of-Experts Models in Production | Chao Jin et.al. | 2505.11432 | null |
| 2025-05-16 | Mergenetic: a Simple Evolutionary Model Merging Library | Adrian Robert Minut et.al. | 2505.11427 | link |
| 2025-05-16 | When Thinking Fails: The Pitfalls of Reasoning for Instruction-Following in LLMs | Xiaomin Li et.al. | 2505.11423 | null |
| 2025-05-16 | Towards Cultural Bridge by Bahnaric-Vietnamese Translation Using Transfer Learning of Sequence-To-Sequence Pre-training Language Model | Phan Tran Minh Dat et.al. | 2505.11421 | null |
| 2025-05-16 | EdgeWisePersona: A Dataset for On-Device User Profiling from Natural Language Interactions | Patryk Bartkowiak et.al. | 2505.11417 | link |
| 2025-05-16 | MoE-CAP: Benchmarking Cost, Accuracy and Performance of Sparse Mixture-of-Experts Systems | Yinsicheng Jiang et.al. | 2505.11415 | null |
| 2025-05-16 | CARES: Comprehensive Evaluation of Safety and Adversarial Robustness in Medical LLMs | Sijia Chen et.al. | 2505.11413 | null |
| 2025-05-16 | Visual Planning: Let's Think Only with Images | Yi Xu et.al. | 2505.11409 | link |
| 2025-05-16 | Large Language Model Use Impact Locus of Control | Jenny Xiyu Fu et.al. | 2505.11406 | null |
| 2025-05-16 | EmotionHallucer: Evaluating Emotion Hallucinations in Multimodal Large Language Models | Bohao Xing et.al. | 2505.11405 | link |
| 2025-05-16 | Patho-R1: A Multimodal Reinforcement Learning-Based Pathology Expert Reasoner | Wenchuan Zhang et.al. | 2505.11404 | link |
| 2025-05-15 | End-to-End Vision Tokenizer Tuning | Wenxuan Wang et.al. | 2505.10562 | null |
| 2025-05-15 | Neural Thermodynamic Laws for Large Language Model Training | Ziming Liu et.al. | 2505.10559 | null |
| 2025-05-15 | Does Feasibility Matter? Understanding the Impact of Feasibility on Synthetic Training Data | Yiwen Liu et.al. | 2505.10551 | link |
| 2025-05-15 | Real-Time Out-of-Distribution Failure Prevention via Multi-Modal Reasoning | Milan Ganai et.al. | 2505.10547 | null |
| 2025-05-15 | Towards a Deeper Understanding of Reasoning Capabilities in Large Language Models | Annie Wong et.al. | 2505.10543 | link |
| 2025-05-15 | Exploring Implicit Visual Misunderstandings in Multimodal Large Language Models through Attention Analysis | Pengfei Wang et.al. | 2505.10541 | link |
| 2025-05-15 | S3C2 Summit 2024-09: Industry Secure Software Supply Chain Summit | Imranur Rahman et.al. | 2505.10538 | null |
| 2025-05-15 | WorldPM: Scaling Human Preference Modeling | Binghai Wang et.al. | 2505.10527 | link |
| 2025-05-15 | MASSV: Multimodal Adaptation and Self-Data Distillation for Speculative Decoding of Vision-Language Models | Mugilan Ganesan et.al. | 2505.10526 | null |
| 2025-05-15 | Multi-Token Prediction Needs Registers | Anastasios Gerontopoulos et.al. | 2505.10518 | link |
| 2025-05-15 | RouteNator: A Router-Based Multi-Modal Architecture for Generating Synthetic Training Data for Function Calling LLMs | Vibha Belavadi et.al. | 2505.10495 | null |
| 2025-05-15 | Can You Really Trust Code Copilots? Evaluating Large Language Models from a Code Security Perspective | Yutao Mou et.al. | 2505.10494 | link |
| 2025-05-15 | CL-RAG: Bridging the Gap in Retrieval-Augmented Generation with Curriculum Learning | Shaohan Wang et.al. | 2505.10493 | null |
| 2025-05-15 | Campus AI vs Commercial AI: A Late-Breaking Study on How LLM As-A-Service Customizations Shape Trust and Usage Patterns | Leon Hannig et.al. | 2505.10490 | null |
| 2025-05-15 | Parallel Scaling Law for Language Models | Mouxiang Chen et.al. | 2505.10475 | link |
| 2025-05-15 | Large Language Models for Cancer Communication: Evaluating Linguistic Quality, Safety, and Accessibility in Generative AI | Agnik Saha et.al. | 2505.10472 | null |
| 2025-05-15 | AI Agents vs. Agentic AI: A Conceptual Taxonomy, Applications and Challenge | Ranjan Sapkota et.al. | 2505.10468 | null |
| 2025-05-15 | Superposition Yields Robust Neural Scaling | Yizhou Liu et.al. | 2505.10465 | link |
| 2025-05-15 | Vision language models have difficulty recognizing virtual objects | Tyler Tran et.al. | 2505.10453 | null |
| 2025-05-15 | Reinforcing the Diffusion Chain of Lateral Thought with Diffusion Language Models | Zemin Huang et.al. | 2505.10446 | null |
| 2025-05-14 | Language Agents Mirror Human Causal Reasoning Biases. How Can We Help Them Think Like Scientists? | Anthony GX-Chen et.al. | 2505.09614 | null |
| 2025-05-14 | Customizing a Large Language Model for VHDL Design of High-Performance Microprocessors | Nicolas Dupuis et.al. | 2505.09610 | null |
| 2025-05-14 | Adversarial Suffix Filtering: a Defense Pipeline for LLMs | David Khachaturov et.al. | 2505.09602 | null |
| 2025-05-14 | How Hungry is AI? Benchmarking Energy, Water, and Carbon Footprint of LLM Inference | Nidhal Jegham et.al. | 2505.09598 | null |
| 2025-05-14 | WorldView-Bench: A Benchmark for Evaluating Global Cultural Perspectives in Large Language Models | Abdullah Mushtaq et.al. | 2505.09595 | null |
| 2025-05-14 | Variational Visual Question Answering | Tobias Jan Wieczorek et.al. | 2505.09591 | null |
| 2025-05-15 | Beyond Likes: How Normative Feedback Complements Engagement Signals on Social Media | Yuchen Wu et.al. | 2505.09583 | null |
| 2025-05-14 | VTLA: Vision-Tactile-Language-Action Model with Preference Learning for Insertion Manipulation | Chaofan Zhang et.al. | 2505.09577 | null |
| 2025-05-14 | Ethics and Persuasion in Reinforcement Learning from Human Feedback: A Procedural Rhetorical Approach | Shannon Lodoen et.al. | 2505.09576 | null |
| 2025-05-14 | MIGRATION-BENCH: Repository-Level Code Migration Benchmark from Java 8 | Linbo Liu et.al. | 2505.09569 | link |
| 2025-05-14 | Using Foundation Models as Pseudo-Label Generators for Pre-Clinical 4D Cardiac CT Segmentation | Anne-Marie Rickmann et.al. | 2505.09564 | null |
| 2025-05-14 | WavReward: Spoken Dialogue Models With Generalist Reward Evaluators | Shengpeng Ji et.al. | 2505.09558 | link |
| 2025-05-14 | PT-MoE: An Efficient Finetuning Framework for Integrating Mixture-of-Experts into Prompt Tuning | Zongqian Li et.al. | 2505.09519 | link |
| 2025-05-15 | Towards Fair In-Context Learning with Tabular Foundation Models | Patrik Kenfack et.al. | 2505.09503 | null |
| 2025-05-14 | Layered Unlearning for Adversarial Relearning | Timothy Qian et.al. | 2505.09500 | link |
| 2025-05-14 | Flash-VL 2B: Optimizing Vision-Language Model Performance for Ultra-Low Latency and High Throughput | Bo Zhang et.al. | 2505.09498 | null |
| 2025-05-14 | Card Sorting Simulator: Augmenting Design of Logical Information Architectures with Large Language Models | Eduard Kuric et.al. | 2505.09478 | null |
| 2025-05-14 | Deploying Foundation Model-Enabled Air and Ground Robots in the Field: Challenges and Opportunities | Zachary Ravichandran et.al. | 2505.09477 | null |
| 2025-05-14 | Evaluating GPT- and Reasoning-based Large Language Models on Physics Olympiad Problems: Surpassing Human Performance and Implications for Educational Assessment | Paul Tschisgale et.al. | 2505.09438 | null |
| 2025-05-14 | CXMArena: Unified Dataset to benchmark performance in realistic CXM Scenarios | Raghav Garg et.al. | 2505.09436 | link |
| 2025-05-13 | CodePDE: An Inference Framework for LLM-driven PDE Solver Generation | Shanda Li et.al. | 2505.08783 | link |
| 2025-05-13 | HealthBench: Evaluating Large Language Models Towards Improved Human Health | Rahul K. Arora et.al. | 2505.08775 | link |
| 2025-05-14 | Towards Autonomous UAV Visual Object Search in City Space: Benchmark and Agentic Methodology | Yatai Ji et.al. | 2505.08765 | null |
| 2025-05-13 | Aya Vision: Advancing the Frontier of Multilingual Multimodality | Saurabh Dash et.al. | 2505.08751 | null |
| 2025-05-13 | AC-Reason: Towards Theory-Guided Actual Causality Reasoning with Large Language Models | Yanxi Zhang et.al. | 2505.08750 | link |
| 2025-05-13 | DeepMath-Creative: A Benchmark for Evaluating Mathematical Creativity of Large Language Models | Xiaoyang Chen et.al. | 2505.08744 | link |
| 2025-05-13 | Probability Consistency in Large Language Models: Theoretical Foundations Meet Empirical Discrepancies | Xiaoliang Luo et.al. | 2505.08739 | link |
| 2025-05-13 | Towards Foundation Models for Experimental Readout Systems Combining Discrete and Continuous Data | James Giroux et.al. | 2505.08736 | link |
| 2025-05-13 | NurValues: Real-World Nursing Values Evaluation for Large Language Models in Clinical Context | Ben Yao et.al. | 2505.08734 | null |
| 2025-05-13 | Securing RAG: A Risk Assessment and Mitigation Framework | Lukas Ammann et.al. | 2505.08728 | null |
| 2025-05-13 | Memorization-Compression Cycles Improve Generalization | Fangyuan Yu et.al. | 2505.08727 | null |
| 2025-05-13 | Extending Large Vision-Language Model for Diverse Interactive Tasks in Autonomous Driving | Zongchuang Zhao et.al. | 2505.08725 | link |
| 2025-05-13 | TiMo: Spatiotemporal Foundation Model for Satellite Image Time Series | Xiaolei Qin et.al. | 2505.08723 | link |
| 2025-05-13 | PWC-MoE: Privacy-Aware Wireless Collaborative Mixture of Experts | Yang Su et.al. | 2505.08719 | null |
| 2025-05-13 | Controllable Image Colorization with Instance-aware Texts and Masks | Yanru An et.al. | 2505.08705 | null |
| 2025-05-13 | LLM-based Prompt Ensemble for Reliable Medical Entity Recognition from EHRs | K M Sajjadul Islam et.al. | 2505.08704 | null |
| 2025-05-14 | Granite-speech: open-source speech-aware LLMs with strong English ASR capabilities | George Saon et.al. | 2505.08699 | null |
| 2025-05-13 | VizCV: AI-assisted visualization of researchers' publications tracks | Vladimír Lazárik et.al. | 2505.08691 | null |
| 2025-05-13 | Adaptive Schema-aware Event Extraction with Retrieval-Augmented Generation | Sheng Liang et.al. | 2505.08690 | null |
| 2025-05-13 | A Social Robot with Inner Speech for Dietary Guidance | Valerio Belcamino et.al. | 2505.08664 | link |
| 2025-05-12 | DanceGRPO: Unleashing GRPO on Visual Generation | Zeyue Xue et.al. | 2505.07818 | null |
| 2025-05-12 | Imagine, Verify, Execute: Memory-Guided Agentic Exploration with Vision-Language Models | Seungjae Lee et.al. | 2505.07815 | null |
| 2025-05-12 | Learning Dynamics in Continual Pre-Training for Large Language Models | Xingjin Wang et.al. | 2505.07796 | null |
| 2025-05-12 | Domain Regeneration: How well do LLMs match syntactic properties of text domains? | Da Ju et.al. | 2505.07784 | null |
| 2025-05-12 | Relative Overfitting and Accept-Reject Framework | Yanxin Liu et.al. | 2505.07783 | null |
| 2025-05-12 | MLE-Dojo: Interactive Environments for Empowering LLM Agents in Machine Learning Engineering | Rushi Qiang et.al. | 2505.07782 | link |
| 2025-05-12 | Must Read: A Systematic Survey of Computational Persuasion | Nimet Beyza Bozdag et.al. | 2505.07775 | link |
| 2025-05-12 | Agent RL Scaling Law: Agent RL with Spontaneous Code Execution for Mathematical Problem Solving | Xinji Mai et.al. | 2505.07773 | link |
| 2025-05-12 | Enhancing Code Generation via Bidirectional Comment-Level Mutual Grounding | Yifeng Di et.al. | 2505.07768 | link |
| 2025-05-12 | BodyGPS: Anatomical Positioning System | Halid Ziya Yerebakan et.al. | 2505.07744 | null |
| 2025-05-12 | Assessing the Chemical Intelligence of Large Language Models | Nicholas T. Runcie et.al. | 2505.07735 | link |
| 2025-05-12 | Spoken Language Understanding on Unseen Tasks With In-Context Learning | Neeraj Agrawal et.al. | 2505.07731 | null |
| 2025-05-12 | Reproducibility, Replicability, and Insights into Visual Document Retrieval with Late Interaction | Jingfen Qiao et.al. | 2505.07730 | link |
| 2025-05-12 | Circuit Partitioning Using Large Language Models for Quantum Compilation and Simulations | Pranav Sinha et.al. | 2505.07711 | null |
| 2025-05-12 | Through the Looking Glass: Common Sense Consistency Evaluation of Weird Images | Elisei Rykov et.al. | 2505.07704 | null |
| 2025-05-12 | PatchTrack: A Comprehensive Analysis of ChatGPT's Influence on Pull Request Outcomes | Daniel Ogenrwot et.al. | 2505.07700 | null |
| 2025-05-12 | Beyond CLIP Generalization: Against Forward&Backward Forgetting Adapter for Continual Learning of Vision-Language Models | Songlin Dong et.al. | 2505.07690 | null |
| 2025-05-12 | S-GRPO: Early Exit via Reinforcement Learning in Reasoning Models | Muzhi Dai et.al. | 2505.07686 | null |
| 2025-05-12 | Multimodal Survival Modeling in the Age of Foundation Models | Steven Song et.al. | 2505.07683 | link |
| 2025-05-12 | SpecRouter: Adaptive Routing for Multi-Level Speculative Decoding in Large Language Models | Hang Wu et.al. | 2505.07680 | null |
| 2025-05-09 | Towards a Unified Representation Evaluation Framework Beyond Downstream Tasks | Christos Plachouras et.al. | 2505.06224 | link |
| 2025-05-09 | Adapting a Segmentation Foundation Model for Medical Image Classification | Pengfei Gu et.al. | 2505.06217 | null |
| 2025-05-09 | From Millions of Tweets to Actionable Insights: Leveraging LLMs for User Profiling | Vahid Rahimzadeh et.al. | 2505.06184 | null |
| 2025-05-09 | A Large Language Model-Enhanced Q-learning for Capacitated Vehicle Routing Problem with Time Windows | Linjiang Cao et.al. | 2505.06178 | null |
| 2025-05-09 | MonetGPT: Solving Puzzles Enhances MLLMs' Image Retouching Skills | Niladri Shekhar Dutt et.al. | 2505.06176 | null |
| 2025-05-09 | Turbo-ICL: In-Context Learning-Based Turbo Equalization | Zihang Song et.al. | 2505.06175 | null |
| 2025-05-09 | MM-Skin: Enhancing Dermatology Vision-Language Model with an Image-Text Dataset Derived from Textbooks | Wenqi Zeng et.al. | 2505.06152 | link |
| 2025-05-09 | A Scaling Law for Token Efficiency in LLM Fine-Tuning Under Fixed Compute Budgets | Ryan Lagasse et.al. | 2505.06150 | null |
| 2025-05-09 | Can Prompting LLMs Unlock Hate Speech Detection across Languages? A Zero-shot and Few-shot Study | Faeze Ghorbanpour et.al. | 2505.06149 | null |
| 2025-05-09 | LLMs Get Lost In Multi-Turn Conversation | Philippe Laban et.al. | 2505.06120 | link |
| 2025-05-09 | LLMs Outperform Experts on Challenging Biology Benchmarks | Lennart Justen et.al. | 2505.06108 | null |
| 2025-05-09 | Free and Fair Hardware: A Pathway to Copyright Infringement-Free Verilog Generation using LLMs | Sam Bush et.al. | 2505.06096 | null |
| 2025-05-09 | Assessing Tenstorrent's RISC-V MatMul Acceleration Capabilities | Hiari Pizzini Cavagna et.al. | 2505.06085 | null |
| 2025-05-09 | Healthy LLMs? Benchmarking LLM Knowledge of UK Government Public Health Information | Joshua Harris et.al. | 2505.06046 | null |
| 2025-05-09 | Short-circuiting Shortcuts: Mechanistic Investigation of Shortcuts in Text Classification | Leon Eshuijs et.al. | 2505.06032 | link |
| 2025-05-09 | Unilogit: Robust Machine Unlearning for LLMs Using Uniform-Target Self-Distillation | Stefan Vasilev et.al. | 2505.06027 | null |
| 2025-05-09 | ArtRAG: Retrieval-Augmented Generation with Structured Context for Visual Art Understanding | Shuai Wang et.al. | 2505.06020 | null |
| 2025-05-09 | Exploring the Feasibility of Multilingual Grammatical Error Correction with a Single LLM up to 9B parameters: A Comparative Study of 17 Models | Dawid Wisniewski et.al. | 2505.06004 | link |
| 2025-05-09 | Task-Adapter++: Task-specific Adaptation with Order-aware Alignment for Few-shot Action Recognition | Congqi Cao et.al. | 2505.06002 | link |
| 2025-05-09 | Towards Developmentally Plausible Rewards: Communicative Success as a Learning Signal for Interactive Language Models | Lennart Stöpler et.al. | 2505.05970 | null |
| 2025-05-08 | Mogao: An Omni Foundation Model for Interleaved Multi-Modal Generation | Chao Liao et.al. | 2505.05472 | null |
| 2025-05-08 | Generating Physically Stable and Buildable LEGO Designs from Text | Ava Pun et.al. | 2505.05469 | link |
| 2025-05-08 | StreamBridge: Turning Your Offline Video Large Language Model into a Proactive Streaming Assistant | Haibo Wang et.al. | 2505.05467 | null |
| 2025-05-08 | ComPO: Preference Alignment via Comparison Oracles | Peter Chen et.al. | 2505.05465 | null |
| 2025-05-08 | Bring Reason to Vision: Understanding Perception and Reasoning through Model Merging | Shiqi Chen et.al. | 2505.05464 | link |
| 2025-05-08 | UKElectionNarratives: A Dataset of Misleading Narratives Surrounding Recent UK General Elections | Fatima Haouari et.al. | 2505.05459 | null |
| 2025-05-08 | SITE: towards Spatial Intelligence Thorough Evaluation | Wenqi Wang et.al. | 2505.05456 | null |
| 2025-05-08 | Conversational Process Model Redesign | Nataliia Klievtsova et.al. | 2505.05453 | null |
| 2025-05-08 | clem:todd: A Framework for the Systematic Benchmarking of LLM-Based Task-Oriented Dialogue System Realisations | Chalamalasetti Kranti et.al. | 2505.05445 | null |
| 2025-05-08 | GesPrompt: Leveraging Co-Speech Gestures to Augment LLM-Based Interaction in Virtual Reality | Xiyun Hu et.al. | 2505.05441 | null |
| 2025-05-09 | EcoAgent: An Efficient Edge-Cloud Collaborative Multi-Agent Framework for Mobile Automation | Biao Yi et.al. | 2505.05440 | null |
| 2025-05-08 | Ultra-FineWeb: Efficient Data Filtering and Verification for High-Quality LLM Training Data | Yudong Wang et.al. | 2505.05427 | null |
| 2025-05-09 | LiTransProQA: an LLM-based Literary Translation evaluation metric with Professional Question Answering | Ran Zhang et.al. | 2505.05423 | link |
| 2025-05-08 | Crosslingual Reasoning through Test-Time Scaling | Zheng-Xin Yong et.al. | 2505.05408 | link |
| 2025-05-08 | Frame In, Frame Out: Do LLMs Generate More Biased News Headlines than Humans? | Valeria Pastorino et.al. | 2505.05406 | null |
| 2025-05-08 | A Pain Assessment Framework based on multimodal data and Deep Machine Learning methods | Stefanos Gkikas et.al. | 2505.05396 | null |
| 2025-05-08 | DSDrive: Distilling Large Language Model for Lightweight End-to-End Autonomous Driving with Unified Reasoning and Planning | Wenru Liu et.al. | 2505.05360 | null |
| 2025-05-08 | Hearing and Seeing Through CLIP: A Framework for Self-Supervised Sound Source Localization | Sooyoung Park et.al. | 2505.05343 | link |
| 2025-05-08 | FLAM: Frame-Wise Language-Audio Modeling | Yusong Wu et.al. | 2505.05335 | null |
| 2025-05-08 | ICon: In-Context Contribution for Automatic Data Selection | Yixin Yang et.al. | 2505.05327 | null |
| 2025-05-07 | EchoInk-R1: Exploring Audio-Visual Reasoning in Multimodal LLMs via Reinforcement Learning | Zhenghao Xing et.al. | 2505.04623 | link |
| 2025-05-07 | On Path to Multimodal Generalist: General-Level and General-Bench | Hao Fei et.al. | 2505.04620 | null |
| 2025-05-07 | OmniGIRL: A Multilingual and Multimodal Benchmark for GitHub Issue Resolution | Lianghong Guo et.al. | 2505.04606 | link |
| 2025-05-07 | OpenVision: A Fully-Open, Cost-Effective Family of Advanced Vision Encoders for Multimodal Learning | Xianhang Li et.al. | 2505.04601 | null |
| 2025-05-08 | MonoCoP: Chain-of-Prediction for Monocular 3D Object Detection | Zhihao Zhang et.al. | 2505.04594 | null |
| 2025-05-07 | ZeroSearch: Incentivize the Search Capability of LLMs without Searching | Hao Sun et.al. | 2505.04588 | link |
| 2025-05-07 | SlideItRight: Using AI to Find Relevant Slides and Provide Feedback for Open-Ended Questions | Chloe Qianhui Zhao et.al. | 2505.04584 | link |
| 2025-05-07 | Fight Fire with Fire: Defending Against Malicious RL Fine-Tuning via Reward Neutralization | Wenjun Cao et.al. | 2505.04578 | null |
| 2025-05-07 | Communication-Efficient Federated Fine-Tuning of Language Models via Dynamic Update Schedules | Michail Theologitis et.al. | 2505.04535 | link |
| 2025-05-07 | Overcoming Data Scarcity in Generative Language Modelling for Low-Resource Languages: A Systematic Review | Josh McGiff et.al. | 2505.04531 | null |
| 2025-05-07 | Comparative Analysis of Carbon Footprint in Manual vs. LLM-Assisted Code Development | Kuen Sum Cheung et.al. | 2505.04521 | null |
| 2025-05-07 | Pangu Ultra MoE: How to Train Your Big MoE on Ascend NPUs | Yehui Tang et.al. | 2505.04519 | null |
| 2025-05-07 | "I Can See Forever!": Evaluating Real-time VideoLLMs for Assisting Individuals with Visual Impairments | Ziyi Zhang et.al. | 2505.04488 | null |
| 2025-05-07 | CAD-Llama: Leveraging Large Language Models for Computer-Aided Design Parametric 3D Model Generation | Jiahao Li et.al. | 2505.04481 | null |
| 2025-05-07 | TrajEvo: Designing Trajectory Prediction Heuristics via LLM-driven Evolution | Zhikai Zhao et.al. | 2505.04480 | link |
| 2025-05-07 | Miipher-2: A Universal Speech Restoration Model for Million-Hour Scale Data Restoration | Shigeki Karita et.al. | 2505.04457 | link |
| 2025-05-07 | M2Rec: Multi-scale Mamba for Efficient Sequential Recommendation | Qianru Zhang et.al. | 2505.04445 | null |
| 2025-05-07 | Towards Effectively Leveraging Execution Traces for Program Repair with Code LLMs | Mirazul Haque et.al. | 2505.04441 | null |
| 2025-05-07 | OBLIVIATE: Robust and Practical Machine Unlearning for Large Language Models | Xiaoyu Xu et.al. | 2505.04416 | null |
| 2025-05-07 | DeCLIP: Decoupled Learning for Open-Vocabulary Dense Perception | Junjie Wang et.al. | 2505.04410 | link |
| 2025-05-06 | VITA-Audio: Fast Interleaved Cross-Modal Token Generation for Efficient Large Speech-Language Model | Zuwei Long et.al. | 2505.03739 | link |
| 2025-05-06 | Decentralized Nonconvex Optimization under Heavy-Tailed Noise: Normalization and Optimal Convergence | Shuhua Yu et.al. | 2505.03736 | null |
| 2025-05-06 | Meta-Optimization and Program Search using Language Models for Task and Motion Planning | Denis Shcherba et.al. | 2505.03725 | null |
| 2025-05-06 | Fill the Gap: Quantifying and Reducing the Modality Gap in Image-Text Representation Learning | François Role et.al. | 2505.03703 | null |
| 2025-05-06 | Fairness of Automatic Speech Recognition in Cleft Lip and Palate Speech | Susmita Bhattacharjee et.al. | 2505.03697 | null |
| 2025-05-06 | Graph Drawing for LLMs: An Empirical Evaluation | Walter Didimo et.al. | 2505.03678 | null |
| 2025-05-06 | Distribution-Conditional Generation: From Class Distribution to Creative Generation | Fu Feng et.al. | 2505.03667 | null |
| 2025-05-06 | Binding threshold units with artificial oscillatory neurons | Vladimir Fanaskov et.al. | 2505.03648 | link |
| 2025-05-06 | PhysLLM: Harnessing Large Language Models for Cross-Modal Remote Physiological Sensing | Yiping Xie et.al. | 2505.03621 | null |
| 2025-05-06 | Learning Unknown Spoof Prompts for Generalized Face Anti-Spoofing Using Only Real Face Images | Fangling Jiang et.al. | 2505.03611 | null |
| 2025-05-06 | Learning Knowledge-based Prompts for Robust 3D Mask Presentation Attack Detection | Fangling Jiang et.al. | 2505.03610 | null |
| 2025-05-06 | DyGEnc: Encoding a Sequence of Textual Scene Graphs to Reason and Answer Questions in Dynamic Scenes | Sergey Linok et.al. | 2505.03581 | link |
| 2025-05-06 | LlamaFirewall: An open source guardrail system for building secure AI agents | Sahana Chennabasappa et.al. | 2505.03574 | null |
| 2025-05-06 | Say It Another Way: A Framework for User-Grounded Paraphrasing | Cléa Chataigner et.al. | 2505.03563 | null |
| 2025-05-06 | A Comprehensive Survey of Large AI Models for Future Communications: Foundations, Applications and Challenges | Feibo Jiang et.al. | 2505.03556 | link |
| 2025-05-06 | A Hashgraph-Inspired Consensus Mechanism for Reliable Multi-Model Reasoning | Kolawole E. Ogunsina et.al. | 2505.03553 | null |
| 2025-05-06 | STORY2GAME: Generating (Almost) Everything in an Interactive Fiction Game | Eric Zhou et.al. | 2505.03547 | null |
| 2025-05-06 | Faster MoE LLM Inference for Extremely Large Models | Haoqi Yang et.al. | 2505.03531 | null |
| 2025-05-06 | Ruled by the Representation Space: On the University's Embrace of Large Language Models | Katia Schwerzmann et.al. | 2505.03513 | null |
| 2025-05-06 | BadLingual: A Novel Lingual-Backdoor Attack against Large Language Models | Zihan Wang et.al. | 2505.03501 | null |
| 2025-05-05 | Scenethesis: A Language and Vision Agentic Framework for 3D Scene Generation | Lu Ling et.al. | 2505.02836 | null |
| 2025-05-05 | R1-Reward: Training Multimodal Reward Model Through Stable Reinforcement Learning | Yi-Fan Zhang et.al. | 2505.02835 | link |
| 2025-05-05 | No Other Representation Component Is Needed: Diffusion Transformers Can Provide Representation Guidance by Themselves | Dengyang Jiang et.al. | 2505.02831 | link |
| 2025-05-05 | LISAT: Language-Instructed Segmentation Assistant for Satellite Imagery | Jerome Quenum et.al. | 2505.02829 | null |
| 2025-05-05 | ReplaceMe: Network Simplification via Layer Pruning and Linear Transformations | Dmitriy Shopkhoev et.al. | 2505.02819 | link |
| 2025-05-05 | Knowing You Don't Know: Learning When to Continue Search in Multi-round RAG through Self-Practicing | Diji Yang et.al. | 2505.02811 | link |
| 2025-05-05 | Towards Quantifying the Hessian Structure of Neural Networks | Zhaorui Dong et.al. | 2505.02809 | link |
| 2025-05-05 | Generating HomeAssistant Automations Using an LLM-based Chatbot | Mathyas Giudici et.al. | 2505.02802 | null |
| 2025-05-05 | HSplitLoRA: A Heterogeneous Split Parameter-Efficient Fine-Tuning Framework for Large Language Models | Zheng Lin et.al. | 2505.02795 | null |
| 2025-05-05 | Giving Simulated Cells a Voice: Evolving Prompt-to-Intervention Models for Cellular Control | Nam H. Le et.al. | 2505.02766 | null |
| 2025-05-05 | Bye-bye, Bluebook? Automating Legal Procedure with Large Language Models | Matthew Dahl et.al. | 2505.02763 | null |
| 2025-05-05 | Using Knowledge Graphs to harvest datasets for efficient CLIP model training | Simon Ging et.al. | 2505.02746 | link |
| 2025-05-06 | Knowledge Graphs for Enhancing Large Language Models in Entity Disambiguation | Gerard Pons et.al. | 2505.02737 | null |
| 2025-05-05 | FormalMATH: Benchmarking Formal Mathematical Reasoning of Large Language Models | Zhouliang Yu et.al. | 2505.02735 | link |
| 2025-05-05 | Enhancing LLMs' Clinical Reasoning with Real-World Data from a Nationwide Sepsis Registry | Junu Kim et.al. | 2505.02722 | link |
| 2025-05-05 | Less is More: Efficient Weight Farcasting with 1-Layer Neural Network | Xiao Shou et.al. | 2505.02714 | null |
| 2025-05-05 | Technical Report: Evaluating Goal Drift in Language Model Agents | Rauno Arike et.al. | 2505.02709 | null |
| 2025-05-05 | Voila: Voice-Language Foundation Models for Real-Time Autonomous Interaction and Voice Role-Play | Yemin Shi et.al. | 2505.02707 | link |
| 2025-05-05 | AI Standardized Patient Improves Human Conversations in Advanced Cancer Care | Kurtis Haut et.al. | 2505.02694 | link |
| 2025-05-05 | Predicting Movie Hits Before They Happen with LLMs | Shaghayegh Agah et.al. | 2505.02693 | null |
| 2025-05-02 | How Effective are Large Time Series Models in Hydrology? A Study on Water Level Forecasting in Everglades | Rahuul Rangaraj et.al. | 2505.01415 | null |
| 2025-05-02 | Dynamic Robot Tool Use with Vision Language Models | Noah Trupin et.al. | 2505.01399 | null |
| 2025-05-02 | FreeInsert: Disentangled Text-Guided Object Insertion in 3D Gaussian Scene without Spatial Priors | Chenxi Li et.al. | 2505.01322 | null |
| 2025-05-02 | Helping Big Language Models Protect Themselves: An Enhanced Filtering and Summarization System | Sheikh Samit Muhaimin et.al. | 2505.01315 | null |
| 2025-05-02 | Enhancing SPARQL Query Rewriting for Complex Ontology Alignments | Anicet Lepetit Ondo et.al. | 2505.01309 | null |
| 2025-05-02 | Document Retrieval Augmented Fine-Tuning (DRAFT) for safety-critical software assessments | Regan Bolton et.al. | 2505.01307 | null |
| 2025-05-02 | FlowDubber: Movie Dubbing with LLM-based Semantic-aware Learning and Flow Matching based Voice Enhancing | Gaoxiang Cong et.al. | 2505.01263 | null |
| 2025-05-02 | Digital Pathway Curation (DPC): a comparative pipeline to assess the reproducibility, consensus and accuracy across Gemini, PubMed, and scientific reviewers in biomedical research | Flavio Lichtenstein et.al. | 2505.01259 | null |
| 2025-05-02 | Can Foundation Models Really Segment Tumors? A Benchmarking Odyssey in Lung CT Imaging | Elena Mulero Ayllón et.al. | 2505.01239 | null |
| 2025-05-02 | CaReAQA: A Cardiac and Respiratory Audio Question Answering Model for Open-Ended Diagnostic Reasoning | Tsai-Ning Wang et.al. | 2505.01199 | null |
| 2025-05-02 | Gender Bias in Explainability: Investigating Performance Disparity in Post-hoc Methods | Mahdi Dhaini et.al. | 2505.01198 | link |
| 2025-05-02 | TSTMotion: Training-free Scene-aware Text-to-motion Generation | Ziyan Guo et.al. | 2505.01182 | null |
| 2025-05-02 | LLM Security: Vulnerabilities, Attacks, Defenses, and Countermeasures | Francisco Aguilera-Martínez et.al. | 2505.01177 | null |
| 2025-05-02 | On the Limitations of Steering in Language Model Alignment | Chebrolu Niranjan et.al. | 2505.01162 | null |
| 2025-05-02 | Methodological Foundations for AI-Driven Survey Question Generation | Ted K. Mburu et.al. | 2505.01150 | null |
| 2025-05-02 | Retrieval-Augmented Generation in Biomedicine: A Survey of Technologies, Datasets, and Clinical Applications | Jiawei He et.al. | 2505.01146 | null |
| 2025-05-02 | MateICL: Mitigating Attention Dispersion in Large-Scale In-Context Learning | Murtadha Ahmed et.al. | 2505.01110 | null |
| 2025-05-02 | Self-Supervision Enhances Instance-based Multiple Instance Learning Methods in Digital Pathology: A Benchmark Study | Ali Mammadov et.al. | 2505.01109 | link |
| 2025-05-02 | Nesterov Method for Asynchronous Pipeline Parallel Optimization | Thalaiyasingam Ajanthan et.al. | 2505.01099 | link |
| 2025-05-02 | Evaluating Vision Language Model Adaptations for Radiology Report Generation in Low-Resource Languages | Marco Salmè et.al. | 2505.01096 | null |
| 2025-05-01 | T2I-R1: Reinforcing Image Generation with Collaborative Semantic-level and Token-level CoT | Dongzhi Jiang et.al. | 2505.00703 | link |
| 2025-05-01 | Robotic Visual Instruction | Yanbang Li et.al. | 2505.00693 | null |
| 2025-05-01 | Visual Test-time Scaling for GUI Agent Grounding | Tiange Luo et.al. | 2505.00684 | link |
| 2025-05-01 | Steering Large Language Models with Register Analysis for Arbitrary Style Transfer | Xinchen Yang et.al. | 2505.00679 | null |
| 2025-05-01 | Rethinking Memory in AI: Taxonomy, Operations, Topics, and Future Directions | Yiming Du et.al. | 2505.00675 | link |
| 2025-05-01 | DeepCritic: Deliberate Critique with Large Language Models | Wenkai Yang et.al. | 2505.00662 | link |
| 2025-05-01 | On the generalization of language models from in-context learning and finetuning: a controlled study | Andrew K. Lampinen et.al. | 2505.00661 | null |
| 2025-05-01 | Large Language Models Understanding: an Inherent Ambiguity Barrier | Daniel N. Nissani et.al. | 2505.00654 | null |
| 2025-05-01 | Open-Source LLM-Driven Federated Transformer for Predictive IoV Management | Yazan Otoum et.al. | 2505.00651 | null |
| 2025-05-01 | Investigating Task Arithmetic for Zero-Shot Information Retrieval | Marco Braga et.al. | 2505.00649 | link |
| 2025-05-01 | Brain Foundation Models with Hypergraph Dynamic Adapter for Brain Disease Analysis | Zhongying Deng et.al. | 2505.00627 | null |
| 2025-05-01 | The Illusion of Role Separation: Hidden Shortcuts in LLM Role Learning (and How to Fix Them) | Zihao Wang et.al. | 2505.00626 | null |
| 2025-05-01 | FineScope: Precision Pruning for Domain-Specialized Large Language Models Using SAE-Guided Self-Data Cultivation | Chaitali Bhattacharyya et.al. | 2505.00624 | null |
| 2025-05-01 | Pixel3DMM: Versatile Screen-Space Priors for Single-Image 3D Face Reconstruction | Simon Giebenhain et.al. | 2505.00615 | null |
| 2025-05-01 | Combining LLMs with Logic-Based Framework to Explain MCTS | Ziyan An et.al. | 2505.00610 | null |
| 2025-05-01 | Can LLMs Help Improve Analogical Reasoning For Strategic Decisions? Experimental Evidence from Humans and GPT-4 | Phanish Puranam et.al. | 2505.00603 | null |
| 2025-05-02 | Fast and Low-Cost Genomic Foundation Models via Outlier Removal | Haozheng Luo et.al. | 2505.00598 | link |
| 2025-05-01 | Block Circulant Adapter for Large Language Models | Xinyu Ding et.al. | 2505.00582 | null |
| 2025-05-01 | Parameter-Efficient Fine-Tuning with Circulant and Diagonal Vectors | Xinyu Ding et.al. | 2505.00580 | null |
| 2025-05-01 | FreqKV: Frequency Domain Key-Value Compression for Efficient Context Window Extension | Jushi Kai et.al. | 2505.00570 | null |
| 2025-04-30 | TRUST: An LLM-Based Dialogue System for Trauma Understanding and Structured Assessments | Sichang Tu et.al. | 2504.21851 | null |
| 2025-04-30 | COMPACT: COMPositional Atomic-to-Complex Visual Capability Tuning | Xindi Wu et.al. | 2504.21850 | null |
| 2025-04-30 | Early Exit and Multi Stage Knowledge Distillation in VLMs for Video Summarization | Anas Anwarul Haq Khan et.al. | 2504.21831 | null |
| 2025-04-30 | Why Compress What You Can Generate? When GPT-4o Generation Ushers in Image Compression Fields | Yixin Gao et.al. | 2504.21814 | null |
| 2025-04-30 | A simple and effective approach for body part recognition on CT scans based on projection estimation | Franko Hrzic et.al. | 2504.21810 | null |
| 2025-04-30 | An Empirical Study on the Effectiveness of Large Language Models for Binary Code Understanding | Xiuwei Shang et.al. | 2504.21803 | null |
| 2025-04-30 | DeepSeek-Prover-V2: Advancing Formal Mathematical Reasoning via Reinforcement Learning for Subgoal Decomposition | Z. Z. Ren et.al. | 2504.21801 | link |
| 2025-04-30 | SWE-smith: Scaling Data for Software Engineering Agents | John Yang et.al. | 2504.21798 | null |
| 2025-04-30 | MAC-Tuning: LLM Multi-Compositional Problem Reasoning with Enhanced Knowledge Boundary Awareness | Junsheng Huang et.al. | 2504.21773 | null |
| 2025-04-30 | LASHED: LLMs And Static Hardware Analysis for Early Detection of RTL Bugs | Baleegh Ahmad et.al. | 2504.21770 | null |
| 2025-04-30 | LLM-based Interactive Imitation Learning for Robotic Manipulation | Jonas Werner et.al. | 2504.21769 | link |
| 2025-04-30 | Investigating Literary Motifs in Ancient and Medieval Novels with Large Language Models | Emelie Hallenberg et.al. | 2504.21742 | null |
| 2025-04-30 | TheraQuest: A Gamified, LLM-Powered Simulation for Massage Therapy Training | Shengqian Wang et.al. | 2504.21735 | null |
| 2025-04-30 | XBreaking: Explainable Artificial Intelligence for Jailbreaking LLMs | Marco Arazzi et.al. | 2504.21700 | null |
| 2025-04-30 | Visual Text Processing: A Comprehensive Review and Unified Evaluation | Yan Shu et.al. | 2504.21682 | link |
| 2025-04-30 | Hoist with His Own Petard: Inducing Guardrails to Facilitate Denial-of-Service Attacks on Retrieval-Augmented Generation of LLMs | Pan Suo et.al. | 2504.21680 | null |
| 2025-04-30 | Traceback of Poisoning Attacks to Retrieval-Augmented Generation | Baolei Zhang et.al. | 2504.21668 | null |
| 2025-04-30 | From Precision to Perception: User-Centred Evaluation of Keyword Extraction Algorithms for Internet-Scale Contextual Advertising | Jingwen Cai et.al. | 2504.21667 | null |
| 2025-04-30 | AdaR1: From Long-CoT to Hybrid-CoT via Bi-Level Adaptive Reasoning Optimization | Haotian Luo et.al. | 2504.21659 | link |
| 2025-04-30 | Sadeed: Advancing Arabic Diacritization Through Small Language Model | Zeina Aldallal et.al. | 2504.21635 | null |
| 2025-04-29 | Toward Efficient Exploration by Large Language Model Agents | Dilip Arumugam et.al. | 2504.20997 | null |
| 2025-04-29 | X-Fusion: Introducing New Modality to Frozen Large Language Models | Sicheng Mo et.al. | 2504.20996 | null |
| 2025-04-29 | ACE: A Security Architecture for LLM-Integrated App Systems | Evan Li et.al. | 2504.20984 | null |
| 2025-04-29 | Real-Time Wayfinding Assistant for Blind and Low-Vision Users | Dabbrata Das et.al. | 2504.20976 | null |
| 2025-04-29 | SetKE: Knowledge Editing for Knowledge Elements Overlap | Yifan Wei et.al. | 2504.20972 | null |
| 2025-04-29 | OSVBench: Benchmarking LLMs on Specification Generation Tasks for Operating System Verification | Shangyu Li et.al. | 2504.20964 | link |
| 2025-04-29 | Information Gravity: A Field-Theoretic Model for Token Selection in Large Language Models | Maryna Vyshnyvetska et.al. | 2504.20951 | null |
| 2025-04-29 | Trace-of-Thought: Enhanced Arithmetic Problem Solving via Reasoning Distillation From Large to Small Language Models | Tyler McDonald et.al. | 2504.20946 | null |
| 2025-04-29 | ChestX-Reasoner: Advancing Radiology Foundation Models with Reasoning through Step-by-Step Verification | Ziqing Fan et.al. | 2504.20930 | link |
| 2025-04-29 | An Empirical Study on the Capability of LLMs in Decomposing Bug Reports | Zhiyuan Chen et.al. | 2504.20911 | null |
| 2025-04-29 | Classifier-to-Bias: Toward Unsupervised Automatic Bias Detection for Visual Classifiers | Quentin Guimard et.al. | 2504.20902 | null |
| 2025-04-29 | LELANTE: LEveraging LLM for Automated ANdroid TEsting | Shamit Fatin et.al. | 2504.20896 | null |
| 2025-04-29 | FedMVP: Federated Multi-modal Visual Prompt Tuning for Vision-Language Models | Mainak Singha et.al. | 2504.20860 | null |
| 2025-04-29 | X-Cross: Dynamic Integration of Language Models for Cross-Domain Sequential Recommendation | Guy Hadad et.al. | 2504.20859 | null |
| 2025-04-29 | JaccDiv: A Metric and Benchmark for Quantifying Diversity of Generated Marketing Text in the Music Industry | Anum Afzal et.al. | 2504.20849 | null |
| 2025-04-29 | Language Model for Large-Text Transmission in Noisy Quantum Communications | Yuqi Li et.al. | 2504.20842 | null |
| 2025-04-29 | Universal language model with the intervention of quantum theory | D.-F. Qin et.al. | 2504.20839 | null |
| 2025-04-29 | Enhancing Non-Core Language Instruction-Following in Speech LLMs via Semi-Implicit Cross-Lingual CoT Reasoning | Hongfei Xue et.al. | 2504.20835 | null |
| 2025-04-29 | Reinforcement Learning for LLM Reasoning Under Memory Constraints | Alan Lee et.al. | 2504.20834 | null |
| 2025-04-30 | Ascendra: Dynamic Request Prioritization for Efficient LLM Serving | Azam Ikram et.al. | 2504.20828 | null |
| 2025-04-28 | Learning Streaming Video Representation via Multitask Training | Yibin Yan et.al. | 2504.20041 | null |
| 2025-04-28 | AutoJudge: Judge Decoding Without Manual Annotation | Roman Garipov et.al. | 2504.20039 | null |
| 2025-04-28 | SpatialReasoner: Towards Explicit and Generalizable 3D Spatial Reasoning | Wufei Ma et.al. | 2504.20024 | null |
| 2025-04-28 | Better To Ask in English? Evaluating Factual Accuracy of Multilingual LLMs in English and Low-Resource Languages | Pritika Rohera et.al. | 2504.20022 | null |
| 2025-04-28 | Modular Machine Learning: An Indispensable Path towards New-Generation Large Language Models | Xin Wang et.al. | 2504.20020 | null |
| 2025-04-28 | LLM-Generated Fake News Induces Truth Decay in News Ecosystem: A Case Study on Neural News Recommendation | Beizhe Hu et.al. | 2504.20013 | null |
| 2025-04-28 | Towards Automated Scoping of AI for Social Good Projects | Jacob Emmerson et.al. | 2504.20010 | null |
| 2025-04-28 | Knowledge Distillation of Domain-adapted LLMs for Question-Answering in Telecom | Rishika Sen et.al. | 2504.20000 | null |
| 2025-04-28 | HJRNO: Hamilton-Jacobi Reachability with Neural Operators | Yankai Li et.al. | 2504.19989 | null |
| 2025-04-28 | TD-EVAL: Revisiting Task-Oriented Dialogue Evaluation by Combining Turn-Level Precision with Dialogue-Level Comparisons | Emre Can Acikgoz et.al. | 2504.19982 | null |
| 2025-04-28 | Accurate and Diverse LLM Mathematical Reasoning via Automated PRM-Guided GFlowNets | Adam Younsi et.al. | 2504.19981 | null |
| 2025-04-29 | From Concept to Practice: an Automated LLM-aided UVM Machine for RTL Verification | Junhao Ye et.al. | 2504.19959 | null |
| 2025-04-28 | Enhancing Surgical Documentation through Multimodal Visual-Temporal Transformers and Generative AI | Hugo Georgenthum et.al. | 2504.19918 | null |
| 2025-04-28 | Can AI Agents Design and Implement Drug Discovery Pipelines? | Khachik Smbatyan et.al. | 2504.19912 | null |
| 2025-04-28 | GenCLS++: Pushing the Boundaries of Generative Classification in LLMs Through Comprehensive SFT and RL Studies Across Diverse Datasets | Mingqian He et.al. | 2504.19898 | null |
| 2025-04-28 | CineVerse: Consistent Keyframe Synthesis for Cinematic Scene Composition | Quynh Phung et.al. | 2504.19894 | null |
| 2025-04-28 | semi-PD: Towards Efficient LLM Serving via Phase-Wise Disaggregated Computation and Unified Storage | Ke Hong et.al. | 2504.19867 | null |
| 2025-04-28 | CoherenDream: Boosting Holistic Text Coherence in 3D Generation via Multimodal Large Language Models Feedback | Chenhan Jiang et.al. | 2504.19860 | null |
| 2025-04-28 | Efficient Domain-adaptive Continual Pretraining for the Process Industry in the German Language | Anastasia Zhukova et.al. | 2504.19856 | null |
| 2025-04-29 | The Automation Advantage in AI Red Teaming | Rob Mulla et.al. | 2504.19855 | null |
| 2025-04-25 | Generalization Capability for Imitation Learning | Yixiao Wang et.al. | 2504.18538 | null |
| 2025-04-25 | TRACE Back from the Future: A Probabilistic Reasoning Approach to Controllable Language Generation | Gwen Yidou Weng et.al. | 2504.18535 | null |
| 2025-04-25 | Eval3D: Interpretable and Fine-grained Evaluation for 3D Generation | Shivam Duggal et.al. | 2504.18509 | null |
| 2025-04-25 | Investigating Co-Constructive Behavior of Large Language Models in Explanation Dialogues | Leandra Fichtel et.al. | 2504.18483 | null |
| 2025-04-25 | Generative Induction of Dialogue Task Schemas with Streaming Refinement and Simulated Interactions | James D. Finch et.al. | 2504.18474 | null |
| 2025-04-25 | Fast-Slow Thinking for Large Vision-Language Model Reasoning | Wenyi Xiao et.al. | 2504.18458 | null |
| 2025-04-25 | Pseudo-Asynchronous Local SGD: Robust and Efficient Data-Parallel Training | Hiroki Naganuma et.al. | 2504.18454 | null |
| 2025-04-25 | Reason Like a Radiologist: Chain-of-Thought and Reinforcement Learning for Verifiable Report Generation | Peiyuan Jing et.al. | 2504.18453 | null |
| 2025-04-25 | Kimi-Audio Technical Report | KimiTeam et.al. | 2504.18425 | link |
| 2025-04-25 | LLMpatronous: Harnessing the Power of LLMs For Vulnerability Detection | Rajesh Yarra et.al. | 2504.18423 | null |
| 2025-04-25 | BitNet v2: Native 4-bit Activations with Hadamard Transformation for 1-bit LLMs | Hongyu Wang et.al. | 2504.18415 | null |
| 2025-04-25 | An Empirical Study of Evaluating Long-form Question Answering | Ning Xian et.al. | 2504.18413 | link |
| 2025-04-25 | Expressing stigma and inappropriate responses prevents LLMs from safely replacing mental health providers | Jared Moore et.al. | 2504.18412 | link |
| 2025-04-25 | HRScene: How Far Are VLMs from Effective High-Resolution Image Understanding? | Yusen Zhang et.al. | 2504.18406 | null |
| 2025-04-25 | Unsupervised Visual Chain-of-Thought Reasoning via Preference Optimization | Kesen Zhao et.al. | 2504.18397 | link |
| 2025-04-25 | Bridge the Domains: Large Language Models Enhanced Cross-domain Sequential Recommendation | Qidong Liu et.al. | 2504.18383 | null |
| 2025-04-25 | Pushing the boundary on Natural Language Inference | Pablo Miralles-González et.al. | 2504.18376 | null |
| 2025-04-25 | Auto-SLURP: A Benchmark Dataset for Evaluating Multi-Agent Frameworks in Smart Personal Assistant | Lei Shen et.al. | 2504.18373 | link |
| 2025-04-25 | ThreMoLIA: Threat Modeling of Large Language Model-Integrated Applications | Felix Viktor Jedrzejewski et.al. | 2504.18369 | null |
| 2025-04-25 | Testing Individual Fairness in Graph Neural Networks | Roya Nasiri et.al. | 2504.18353 | null |
| 2025-04-24 | Token-Shuffle: Towards High-Resolution Image Generation with Autoregressive Models | Xu Ma et.al. | 2504.17789 | null |
| 2025-04-24 | Replay to Remember: Retaining Domain Knowledge in Streaming Language Models | Sneh Pillai et.al. | 2504.17780 | null |
| 2025-04-24 | Conversational Assistants to support Heart Failure Patients: comparing a Neurosymbolic Architecture with ChatGPT | Anuja Tayal et.al. | 2504.17753 | null |
| 2025-04-24 | Towards Robust LLMs: an Adversarial Robustness Measurement Framework | Natan Levy et.al. | 2504.17723 | null |
| 2025-04-24 | Multilingual Performance Biases of Large Language Models in Education | Vansh Gupta et.al. | 2504.17720 | null |
| 2025-04-24 | PICO: Reconstructing 3D People In Contact with Objects | Alpár Cseke et.al. | 2504.17695 | null |
| 2025-04-24 | Ensemble Bayesian Inference: Leveraging Small Language Models to Achieve LLM-level Accuracy in Profile Matching Tasks | Haru-Tada Sato et.al. | 2504.17685 | null |
| 2025-04-24 | INSIGHT: Bridging the Student-Teacher Gap in Times of Large Language Models | Jarne Thys et.al. | 2504.17677 | null |
| 2025-04-24 | Energy Considerations of Large Language Model Inference and Efficiency Optimizations | Jared Fernandez et.al. | 2504.17674 | null |
| 2025-04-24 | Cross-region Model Training with Communication-Computation Overlapping and Delay Compensation | Ying Zhu et.al. | 2504.17672 | null |
| 2025-04-25 | Data-Driven Calibration of Prediction Sets in Large Vision-Language Models Based on Inductive Conformal Prediction | Yuanchang Ye et.al. | 2504.17671 | null |
| 2025-04-24 | Towards a HIPAA Compliant Agentic AI System in Healthcare | Subash Neupane et.al. | 2504.17669 | null |
| 2025-04-24 | Evaluating Grounded Reasoning by Code-Assisted Large Language Models for Mathematics | Zena Al-Khalili et.al. | 2504.17665 | null |
| 2025-04-24 | Effortless, Simulation-Efficient Bayesian Inference using Tabular Foundation Models | Julius Vetter et.al. | 2504.17660 | null |
| 2025-04-24 | Portability of Optimizations from SC to TSO | Akshay Gopalakrishnan et.al. | 2504.17646 | null |
| 2025-04-24 | L3: DIMM-PIM Integrated Architecture and Coordination for Scalable Long-Context LLM Inference | Qingyuan Liu et.al. | 2504.17584 | null |
| 2025-04-25 | DeepDistill: Enhancing LLM Reasoning Capabilities via Large-Scale Difficulty-Graded Data Training | Xiaoyu Tian et.al. | 2504.17565 | null |
| 2025-04-24 | When Does Metadata Conditioning (NOT) Work for Language Model Pre-Training? A Study with Context-Free Grammars | Rei Higuchi et.al. | 2504.17562 | null |
| 2025-04-24 | HalluLens: LLM Hallucination Benchmark | Yejin Bang et.al. | 2504.17550 | null |
| 2025-04-24 | A Comprehensive Survey of Knowledge-Based Vision Question Answering Systems: The Lifecycle of Knowledge in Visual Reasoning Task | Jiaqi Deng et.al. | 2504.17547 | null |
| 2025-04-23 | Generalized Neighborhood Attention: Multi-dimensional Sparse Attention at the Speed of Light | Ali Hassani et.al. | 2504.16922 | link |
| 2025-04-23 | IberBench: LLM Evaluation on Iberian Languages | José Ángel González et.al. | 2504.16921 | null |
| 2025-04-23 | Tracing Thought: Using Chain-of-Thought Reasoning to Identify the LLM Behind AI-Generated Text | Shifali Agrahari et.al. | 2504.16913 | null |
| 2025-04-23 | Do Large Language Models know who did what to whom? | Joseph M. Denning et.al. | 2504.16884 | null |
| 2025-04-23 | Enhancing Critical Thinking with AI: A Tailored Warning System for RAG Models | Xuyang Zhu et.al. | 2504.16883 | null |
| 2025-04-23 | Context-Enhanced Vulnerability Detection Based on Large Language Model | Yixin Yang et.al. | 2504.16877 | null |
| 2025-04-23 | Exploring How LLMs Capture and Represent Domain-Specific Knowledge | Mirian Hipolito Garcia et.al. | 2504.16871 | null |
| 2025-04-23 | Common Functional Decompositions Can Mis-attribute Differences in Outcomes Between Populations | Manuel Quintero et.al. | 2504.16864 | null |
| 2025-04-23 | Planning with Diffusion Models for Target-Oriented Dialogue Systems | Hanwen Du et.al. | 2504.16858 | null |
| 2025-04-23 | Emo Pillars: Knowledge Distillation to Support Fine-Grained Context-Aware and Context-Less Emotion Classification | Alexander Shvets et.al. | 2504.16856 | null |
| 2025-04-23 | Monte Carlo Planning with Large Language Model for Text-Based Game Agents | Zijing Shi et.al. | 2504.16855 | null |
| 2025-04-23 | Improving Significant Wave Height Prediction Using Chronos Models | Yilin Zhai et.al. | 2504.16834 | null |
| 2025-04-23 | LRASGen: LLM-based RESTful API Specification Generation | Sida Deng et.al. | 2504.16833 | null |
| 2025-04-23 | GreenMind: A Next-Generation Vietnamese Large Language Model for Structured and Logical Reasoning | Luu Quy Tung et.al. | 2504.16832 | null |
| 2025-04-23 | Decoupled Global-Local Alignment for Improving Compositional Understanding | Xiaoxing Hu et.al. | 2504.16801 | null |
| 2025-04-23 | MOOSComp: Improving Lightweight Long-Context Compressor via Mitigating Over-Smoothing and Incorporating Outlier Scores | Fengwei Zhou et.al. | 2504.16786 | null |
| 2025-04-23 | Graph2Nav: 3D Object-Relation Graph Generation to Robot Navigation | Tixiao Shan et.al. | 2504.16782 | null |
| 2025-04-23 | How Effective are Generative Large Language Models in Performing Requirements Classification? | Waad Alhoshan et.al. | 2504.16768 | null |
| 2025-04-23 | Lightweight Latent Verifiers for Efficient Meta-Generation Strategies | Bartosz Piotrowski et.al. | 2504.16760 | null |
| 2025-04-23 | HEMA: A Hippocampus-Inspired Extended Memory Architecture for Long-Context AI Conversations | Kwangseob Ahn et.al. | 2504.16754 | null |
| 2025-04-22 | TTRL: Test-Time Reinforcement Learning | Yuxin Zuo et.al. | 2504.16084 | link |
| 2025-04-22 | MMInference: Accelerating Pre-filling for Long-Context VLMs via Modality-Aware Permutation Sparse Attention | Yucheng Li et.al. | 2504.16083 | null |
| 2025-04-22 | MR. Video: "MapReduce" is the Principle for Long Video Understanding | Ziqi Pang et.al. | 2504.16082 | null |
| 2025-04-22 | From Reflection to Perfection: Scaling Inference-Time Optimization for Text-to-Image Diffusion Models via Reflection Tuning | Le Zhuo et.al. | 2504.16080 | null |
| 2025-04-22 | LLMs are Greedy Agents: Effects of RL Fine-tuning on Decision-Making Abilities | Thomas Schmied et.al. | 2504.16078 | null |
| 2025-04-22 | PHYBench: Holistic Evaluation of Physical Perception and Reasoning in Large Language Models | Shi Qiu et.al. | 2504.16074 | null |
| 2025-04-22 | Guiding VLM Agents with Process Rewards at Inference Time for GUI Navigation | Zhiyuan Hu et.al. | 2504.16073 | null |
| 2025-04-22 | Describe Anything: Detailed Localized Image and Video Captioning | Long Lian et.al. | 2504.16072 | null |
| 2025-04-22 | A Python Tool for Reconstructing Full News Text from GDELT | A. Fronzetti Colladon et.al. | 2504.16063 | link |
| 2025-04-22 | Vision language models are unreliable at trivial spatial cognition | Sangeet Khemlani et.al. | 2504.16061 | null |
| 2025-04-22 | Vision-Language Models Are Not Pragmatically Competent in Referring Expression Generation | Ziqiao Ma et.al. | 2504.16060 | link |
| 2025-04-22 | Automated Static Vulnerability Detection via a Holistic Neuro-symbolic Approach | Penghui Li et.al. | 2504.16057 | null |
| 2025-04-22 | Honey, I Shrunk the Language Model: Impact of Knowledge Distillation Methods on Performance and Explainability | Daniel Hendriks et.al. | 2504.16056 | null |
| 2025-04-22 | LongMamba: Enhancing Mamba's Long Context Capabilities via Training-Free Receptive Field Enlargement | Zhifan Ye et.al. | 2504.16053 | link |
| 2025-04-22 | Evaluating Vision Language Models (VLMs) for Radiology: A Comprehensive Analysis | Frank Li et.al. | 2504.16047 | null |
| 2025-04-22 | Certified Mitigation of Worst-Case LLM Copyright Infringement | Jingyu Zhang et.al. | 2504.16046 | null |
| 2025-04-22 | LLMs meet Federated Learning for Scalable and Secure IoT Management | Yazan Otoum et.al. | 2504.16032 | null |
| 2025-04-22 | LiveCC: Learning Video LLM with Streaming Speech Transcription at Scale | Joya Chen et.al. | 2504.16030 | null |
| 2025-04-22 | Benchmarking LLM for Code Smells Detection: OpenAI GPT-4.0 vs DeepSeek-V3 | Ahmed R. Sadik et.al. | 2504.16027 | null |
| 2025-04-22 | Efficient Temporal Consistency in Diffusion-Based Video Editing with Adaptor Modules: A Theoretical Framework | Xinyuan Song et.al. | 2504.16016 | null |
| 2025-04-21 | Seeing from Another Perspective: Evaluating Multi-View Understanding in MLLMs | Chun-Hsiao Yeh et.al. | 2504.15280 | link |
| 2025-04-21 | VisuLogic: A Benchmark for Evaluating Visual Reasoning in Multi-modal Large Language Models | Weiye Xu et.al. | 2504.15279 | null |
| 2025-04-21 | Stop Summation: Min-Form Credit Assignment Is All Process Reward Model Needs for Reasoning | Jie Cheng et.al. | 2504.15275 | link |
| 2025-04-21 | Eagle 2.5: Boosting Long-Context Post-Training for Frontier Vision-Language Models | Guo Chen et.al. | 2504.15271 | null |
| 2025-04-21 | Roll the dice & look before you leap: Going beyond the creative limits of next-token prediction | Vaishnavh Nagarajan et.al. | 2504.15266 | link |
| 2025-04-21 | Interpretable Locomotion Prediction in Construction Using a Memory-Driven LLM Agent With Chain-of-Thought Reasoning | Ehsan Ahmadi et.al. | 2504.15263 | null |
| 2025-04-21 | Leveraging Language Models for Automated Patient Record Linkage | Mohammad Beheshti et.al. | 2504.15261 | null |
| 2025-04-21 | CRUST-Bench: A Comprehensive Benchmark for C-to-safe-Rust Transpilation | Anirudh Khatry et.al. | 2504.15254 | link |
| 2025-04-21 | Evaluating Judges as Evaluators: The JETTS Benchmark of LLM-as-Judges as Test-Time Scaling Evaluators | Yilun Zhou et.al. | 2504.15253 | link |
| 2025-04-21 | MR. Guard: Multilingual Reasoning Guardrail using Curriculum Learning | Yahan Yang et.al. | 2504.15241 | null |
| 2025-04-21 | Values in the Wild: Discovering and Analyzing Values in Real-World Language Model Interactions | Saffron Huang et.al. | 2504.15236 | null |
| 2025-04-21 | A Self-Improving Coding Agent | Maxime Robeyns et.al. | 2504.15228 | null |
| 2025-04-21 | EvalAgent: Discovering Implicit Evaluation Criteria from the Web | Manya Wadhwa et.al. | 2504.15219 | null |
| 2025-04-21 | Integrating Symbolic Execution into the Fine-Tuning of Code-Generating LLMs | Marina Sakharova et.al. | 2504.15210 | null |
| 2025-04-21 | Compute-Optimal LLMs Provably Generalize Better With Scale | Marc Finzi et.al. | 2504.15208 | null |
| 2025-04-21 | Support Evaluation for the TREC 2024 RAG Track: Comparing Human versus LLM Judges | Nandan Thakur et.al. | 2504.15205 | null |
| 2025-04-22 | Synergistic Weak-Strong Collaboration by Aligning Preferences | Yizhu Jiao et.al. | 2504.15188 | null |
| 2025-04-21 | DSPO: Direct Semantic Preference Optimization for Real-World Image Super-Resolution | Miaomiao Cai et.al. | 2504.15176 | null |
| 2025-04-21 | The Synthetic Imputation Approach: Generating Optimal Synthetic Texts For Underrepresented Categories In Supervised Classification Tasks | Joan C. Timoneda et.al. | 2504.15160 | null |
| 2025-04-21 | KGMEL: Knowledge Graph-Enhanced Multimodal Entity Linking | Juyeon Kim et.al. | 2504.15135 | link |
| 2025-04-18 | Generative AI Act II: Test Time Scaling Drives Cognition Engineering | Shijie Xia et.al. | 2504.13828 | link |
| 2025-04-18 | Feature Alignment and Representation Transfer in Knowledge Distillation for Large Language Models | Junjie Yang et.al. | 2504.13825 | null |
| 2025-04-18 | CheXWorld: Exploring Image World Modeling for Radiograph Representation Learning | Yang Yue et.al. | 2504.13820 | link |
| 2025-04-18 | Not All Rollouts are Useful: Down-Sampling Rollouts in LLM Reinforcement Learning | Yixuan Even Xu et.al. | 2504.13818 | null |
| 2025-04-18 | BadApex: Backdoor Attack Based on Adaptive Optimization Mechanism of Black-box Large Language Models | Zhengxian Wu et.al. | 2504.13775 | null |
| 2025-04-18 | DP2Unlearning: An Efficient and Guaranteed Unlearning Framework for LLMs | Tamim Al Mahmud et.al. | 2504.13774 | link |
| 2025-04-18 | Detecting Malicious Source Code in PyPI Packages with LLMs: Does RAG Come in Handy? | Motunrayo Ibiyo et.al. | 2504.13769 | null |
| 2025-04-18 | Decoding Vision Transformers: the Diffusion Steering Lens | Ryota Takatsuki et.al. | 2504.13763 | link |
| 2025-04-18 | Scaling sparse feature circuit finding for in-context learning | Dmitrii Kharlapenko et.al. | 2504.13756 | null |
| 2025-04-18 | Learning to Attribute with Attention | Benjamin Cohen-Wang et.al. | 2504.13752 | link |
| 2025-04-18 | Controlled Territory and Conflict Tracking (CONTACT): (Geo-)Mapping Occupied Territory from Open Source Intelligence | Paul K. Mandal et.al. | 2504.13730 | link |
| 2025-04-18 | OpenDeception: Benchmarking and Investigating AI Deceptive Behaviors via Open-ended Interaction Simulation | Yichen Wu et.al. | 2504.13707 | null |
| 2025-04-18 | Exploring Multimodal Prompt for Visualization Authoring with Large Language Models | Zhen Wen et.al. | 2504.13700 | null |
| 2025-04-18 | Analysing the Robustness of Vision-Language-Models to Common Corruptions | Muhammad Usama et.al. | 2504.13690 | null |
| 2025-04-18 | Intelligent Interaction Strategies for Context-Aware Cognitive Augmentation | Xiangrong et.al. | 2504.13684 | null |
| 2025-04-18 | Revisiting Uncertainty Quantification Evaluation in Language Models: Spurious Interactions with Response Length Bias Results | Andrea Santilli et.al. | 2504.13677 | null |
| 2025-04-18 | Large Language Models Will Change The Way Children Think About Technology And Impact Every Interaction Paradigm | Russell Beale et.al. | 2504.13667 | null |
| 2025-04-18 | Do Prompt Patterns Affect Code Quality? A First Empirical Assessment of ChatGPT-Generated Code | Antonio Della Porta et.al. | 2504.13656 | null |
| 2025-04-18 | EyecareGPT: Boosting Comprehensive Ophthalmology Understanding with Tailored Dataset, Benchmark and Model | Sijing Li et.al. | 2504.13650 | link |
| 2025-04-18 | Exploring the Potential for Large Language Models to Demonstrate Rational Probabilistic Beliefs | Gabriel Freedman et.al. | 2504.13644 | link |
| 2025-04-17 | Perception Encoder: The best visual embeddings are not at the output of the network | Daniel Bolya et.al. | 2504.13181 | null |
| 2025-04-17 | PerceptionLM: Open-Access Data and Models for Detailed Visual Understanding | Jang Hyun Cho et.al. | 2504.13180 | link |
| 2025-04-17 | It's All Connected: A Journey Through Test-Time Memorization, Attentional Bias, Retention, and Online Optimization | Ali Behrouz et.al. | 2504.13173 | null |
| 2025-04-17 | Sleep-time Compute: Beyond Inference Scaling at Test-time | Kevin Lin et.al. | 2504.13171 | link |
| 2025-04-17 | Generate, but Verify: Reducing Hallucination in Vision-Language Models with Retrospective Resampling | Tsung-Han Wu et.al. | 2504.13169 | link |
| 2025-04-17 | CLIMB: CLustering-based Iterative Data Mixture Bootstrapping for Language Model Pre-training | Shizhe Diao et.al. | 2504.13161 | null |
| 2025-04-17 | Digital Twin Generation from Visual Data: A Survey | Andrew Melnik et.al. | 2504.13159 | link |
| 2025-04-17 | MIB: A Mechanistic Interpretability Benchmark | Aaron Mueller et.al. | 2504.13151 | link |
| 2025-04-17 | Exploring Expert Failures Improves LLM Agent Tuning | Li-Cheng Lan et.al. | 2504.13145 | null |
| 2025-04-17 | Syntactic and Semantic Control of Large Language Models via Sequential Monte Carlo | João Loula et.al. | 2504.13139 | null |
| 2025-04-17 | Energy-Based Reward Models for Robust Language Model Alignment | Anamika Lochab et.al. | 2504.13134 | link |
| 2025-04-17 | LLMs Meet Finance: Fine-Tuning Foundation Models for the Open FinLLM Leaderboard | Varun Rao et.al. | 2504.13125 | null |
| 2025-04-17 | Low-hallucination Synthetic Captions for Large-Scale Vision-Language Model Pre-training | Xinsong Zhang et.al. | 2504.13123 | null |
| 2025-04-17 | VistaDPO: Video Hierarchical Spatial-Temporal Direct Preference Optimization for Large Video Models | Haojian Huang et.al. | 2504.13122 | link |
| 2025-04-17 | Probing and Inducing Combinational Creativity in Vision-Language Models | Yongqian Peng et.al. | 2504.13120 | null |
| 2025-04-17 | Object-Driven Narrative in AR: A Scenario-Metaphor Framework with VLM Integration | Yusi Sun et.al. | 2504.13119 | null |
| 2025-04-17 | Uncertainty-Aware Trajectory Prediction via Rule-Regularized Heteroscedastic Deep Classification | Kumar Manas et.al. | 2504.13111 | link |
| 2025-04-17 | EventVAD: Training-Free Event-Aware Video Anomaly Detection | Yihua Shao et.al. | 2504.13092 | null |
| 2025-04-17 | Retrieval-Augmented Generation with Conflicting Evidence | Han Wang et.al. | 2504.13079 | link |
| 2025-04-18 | SkyReels-V2: Infinite-length Film Generative Model | Guibin Chen et.al. | 2504.13074 | link |
| 2025-04-16 | BitNet b1.58 2B4T Technical Report | Shuming Ma et.al. | 2504.12285 | null |
| 2025-04-16 | HLS-Eval: A Benchmark and Framework for Evaluating LLMs on High-Level Synthesis Design Tasks | Stefan Abi-Karam et.al. | 2504.12268 | link |
| 2025-04-16 | FLIP Reasoning Challenge | Andreas Plesner et.al. | 2504.12256 | link |
| 2025-04-16 | AnomalyGen: An Automated Semantic Log Sequence Generation Framework with LLM for Anomaly Detection | Xinyu Li et.al. | 2504.12250 | null |
| 2025-04-16 | MOS: Towards Effective Smart Contract Vulnerability Detection through Mixture-of-Experts Tuning of Large Language Models | Hang Yuan et.al. | 2504.12234 | null |
| 2025-04-16 | Watermarking Needs Input Repetition Masking | David Khachaturov et.al. | 2504.12229 | null |
| 2025-04-16 | d1: Scaling Reasoning in Diffusion Large Language Models via Reinforcement Learning | Siyan Zhao et.al. | 2504.12216 | null |
| 2025-04-16 | What Do Large Language Models Know? Tacit Knowledge as a Potential Causal-Explanatory Structure | Céline Budding et.al. | 2504.12187 | null |
| 2025-04-16 | SALAD: Improving Robustness and Generalization through Contrastive Learning with Structure-Aware and LLM-Driven Augmented Data | Suyoung Bae et.al. | 2504.12185 | null |
| 2025-04-16 | Trusting CHATGPT: how minor tweaks in the prompts lead to major differences in sentiment classification | Jaime E. Cuellar et.al. | 2504.12180 | null |
| 2025-04-16 | Multilingual Contextualization of Large Language Models for Document-Level Machine Translation | Miguel Moura Ramos et.al. | 2504.12140 | null |
| 2025-04-16 | Efficient Contrastive Decoding with Probabilistic Hallucination Detection - Mitigating Hallucinations in Large Vision Language Models - | Laura Fieback et.al. | 2504.12137 | null |
| 2025-04-16 | Clarifying Ambiguities: on the Role of Ambiguity Types in Prompting Methods for Clarification Generation | Anfu Tang et.al. | 2504.12113 | null |
| 2025-04-16 | Entropy-Guided Watermarking for LLMs: A Test-Time Framework for Robust and Traceable Text Generation | Shizhan Cai et.al. | 2504.12108 | null |
| 2025-04-16 | Logits DeConfusion with CLIP for Few-Shot Learning | Shuo Li et.al. | 2504.12104 | link |
| 2025-04-16 | Gauging Overprecision in LLMs: An Empirical Study | Adil Bahaj et.al. | 2504.12098 | null |
| 2025-04-16 | Reasoning-Based AI for Startup Evaluation (R.A.I.S.E.): A Memory-Augmented, Multi-Step Decision Framework | Jack Preuveneers et.al. | 2504.12090 | null |
| 2025-04-16 | Self-alignment of Large Video Language Models with Refined Regularized Preference Optimization | Pritam Sarkar et.al. | 2504.12083 | null |
| 2025-04-16 | Selective Demonstration Retrieval for Improved Implicit Hate Speech Detection | Yumin Kim et.al. | 2504.12082 | null |
| 2025-04-16 | Subitizing-Inspired Large Language Models for Floorplanning | Shao-Chien Lu et.al. | 2504.12076 | null |
| 2025-04-16 | Elucidating the Design Space of Multimodal Protein Language Models | Cheng-Yen Hsieh et.al. | 2504.11454 | null |
| 2025-04-15 | TextArena | Leon Guertler et.al. | 2504.11442 | link |
| 2025-04-15 | Masculine Defaults via Gendered Discourse in Podcasts and Large Language Models | Maria Teleki et.al. | 2504.11431 | link |
| 2025-04-15 | A Dual-Space Framework for General Knowledge Distillation of Large Language Models | Xue Zhang et.al. | 2504.11426 | null |
| 2025-04-15 | Reinforcing Compositional Retrieval: Retrieving Step-by-Step for Composing Informative Contexts | Quanyu Long et.al. | 2504.11420 | null |
| 2025-04-15 | Efficient Hybrid Language Model Compression through Group-Aware SSM Pruning | Ali Taghibakhshi et.al. | 2504.11409 | null |
| 2025-04-15 | DataDecide: How to Predict Best Pretraining Data with Small Experiments | Ian Magnusson et.al. | 2504.11393 | null |
| 2025-04-15 | RankAlign: A Ranking View of the Generator-Validator Gap in Large Language Models | Juan Diego Rodriguez et.al. | 2504.11381 | link |
| 2025-04-15 | Cancer-Myth: Evaluating AI Chatbot on Patient Questions with False Presuppositions | Wang Bill Zhu et.al. | 2504.11373 | link |
| 2025-04-15 | OpenTuringBench: An Open-Model-based Benchmark and Framework for Machine-Generated Text Detection and Attribution | Lucio La Cava et.al. | 2504.11369 | null |
| 2025-04-15 | From Gaze to Insight: Bridging Human Visual Attention and Vision Language Model Explanation for Weakly-Supervised Medical Image Segmentation | Jingkun Chen et.al. | 2504.11368 | null |
| 2025-04-15 | Teaching Large Language Models to Reason through Learning and Forgetting | Tianwei Ni et.al. | 2504.11364 | link |
| 2025-04-15 | Kimina-Prover Preview: Towards Large Formal Reasoning Models with Reinforcement Learning | Haiming Wang et.al. | 2504.11354 | link |
| 2025-04-15 | Seedream 3.0 Technical Report | Yu Gao et.al. | 2504.11346 | null |
| 2025-04-15 | A Minimalist Approach to LLM Reasoning: from Rejection Sampling to Reinforce | Wei Xiong et.al. | 2504.11343 | link |
| 2025-04-15 | REWARD CONSISTENCY: Improving Multi-Objective Alignment from a Data-Centric Perspective | Zhihao Xu et.al. | 2504.11337 | null |
| 2025-04-15 | Looking beyond the next token | Abitha Thankaraj et.al. | 2504.11336 | null |
| 2025-04-15 | Optimizing LLM Inference: Fluid-Guided Online Scheduling with Memory Constraints | Ruicheng Ao et.al. | 2504.11320 | link |
| 2025-04-15 | Learning to Be A Doctor: Searching for Effective Medical Agent Architectures | Yangyang Zhuang et.al. | 2504.11301 | null |
| 2025-04-15 | Automated Python Translation | Joshua Otten et.al. | 2504.11290 | null |
| 2025-04-14 | InternVL3: Exploring Advanced Training and Test-Time Recipes for Open-Source Multimodal Models | Jinguo Zhu et.al. | 2504.10479 | link |
| 2025-04-14 | Weight Ensembling Improves Reasoning in Language Models | Xingyu Dang et.al. | 2504.10478 | null |
| 2025-04-14 | MIEB: Massive Image Embedding Benchmark | Chenghao Xiao et.al. | 2504.10471 | link |
| 2025-04-14 | Pixel-SAIL: Single Transformer For Pixel-Grounded Understanding | Tao Zhang et.al. | 2504.10465 | link |
| 2025-04-14 | The Scalability of Simplicity: Empirical Analysis of Vision-Language Learning with a Single Transformer | Weixian Lei et.al. | 2504.10462 | link |
| 2025-04-14 | GUI-R1: A Generalist R1-Style Vision-Language Action Model For GUI Agents | Xiaobo Xia et.al. | 2504.10458 | null |
| 2025-04-14 | M1: Towards Scalable Test-Time Compute with Mamba Reasoning Models | Junxiong Wang et.al. | 2504.10449 | link |
| 2025-04-14 | Multimodal Long Video Modeling Based on Temporal Dynamic Context | Haoran Hao et.al. | 2504.10443 | link |
| 2025-04-14 | LLM Can be a Dangerous Persuader: Empirical Study of Persuasion Safety in Large Language Models | Minqian Liu et.al. | 2504.10430 | null |
| 2025-04-14 | Foundation models for electronic health records: representation dynamics and transferability | Michael C. Burkhart et.al. | 2504.10422 | link |
| 2025-04-14 | Can We Edit LLMs for Long-Tail Biomedical Knowledge? | Xinhao Yi et.al. | 2504.10421 | link |
| 2025-04-15 | Unchecked and Overlooked: Addressing the Checkbox Blind Spot in Large Language Models with CheckboxQA | Michał Turski et.al. | 2504.10419 | link |
| 2025-04-14 | CliniChat: A Multi-Source Knowledge-Driven Framework for Clinical Interview Dialogue Reconstruction and Evaluation | Jing Chen et.al. | 2504.10418 | null |
| 2025-04-14 | LLM-SRBench: A New Benchmark for Scientific Equation Discovery with Large Language Models | Parshin Shojaee et.al. | 2504.10415 | link |
| 2025-04-14 | Performance of Large Language Models in Supporting Medical Diagnosis and Treatment | Diogo Sousa et.al. | 2504.10405 | null |
| 2025-04-14 | Satellite Federated Fine-Tuning for Foundation Models in Space Computing Power Networks | Yan Zhu et.al. | 2504.10403 | null |
| 2025-04-14 | Can LLMs Assist Expert Elicitation for Probabilistic Causal Modeling? | Olha Shaposhnyk et.al. | 2504.10397 | null |
| 2025-04-14 | SymRTLO: Enhancing RTL Code Optimization with LLMs and Neuron-Inspired Symbolic Reasoning | Yiting Wang et.al. | 2504.10369 | null |
| 2025-04-14 | DICE: A Framework for Dimensional and Contextual Evaluation of Language Models | Aryan Shrivastava et.al. | 2504.10359 | null |
| 2025-04-14 | Pseudo-Autoregressive Neural Codec Language Models for Efficient Zero-Shot Text-to-Speech Synthesis | Yifan Yang et.al. | 2504.10352 | null |
| 2025-04-11 | Quantum Large Language Model Fine-Tuning | Sang Hyub Kim et.al. | 2504.08732 | null |
| 2025-04-11 | DocAgent: A Multi-Agent System for Automated Code Documentation Generation | Dayu Yang et.al. | 2504.08725 | link |
| 2025-04-11 | SWAN-GPT: An Efficient and Scalable Approach for Long-Context Language Modeling | Krishna C. Puvvada et.al. | 2504.08719 | null |
| 2025-04-11 | SWE-PolyBench: A multi-language benchmark for repository level evaluation of coding agents | Muhammad Shihab Rashid et.al. | 2504.08703 | link |
| 2025-04-11 | Large Language Models as Span Annotators | Zdeněk Kasner et.al. | 2504.08697 | null |
| 2025-04-11 | TP-RAG: Benchmarking Retrieval-Augmented Large Language Model Agents for Spatiotemporal-Aware Travel Planning | Hang Ni et.al. | 2504.08694 | null |
| 2025-04-11 | Fast-Slow-Thinking: Complex Task Solving with Large Language Models | Yiliu Sun et.al. | 2504.08690 | null |
| 2025-04-11 | Voice Interaction With Conversational AI Could Facilitate Thoughtful Reflection and Substantive Revision in Writing | Jiho Kim et.al. | 2504.08687 | null |
| 2025-04-11 | Seaweed-7B: Cost-Effective Training of Video Generation Foundation Model | Team Seawead et.al. | 2504.08685 | null |
| 2025-04-11 | Variability-Driven User-Story Generation using LLM and Triadic Concept Analysis | Alexandre Bazin et.al. | 2504.08666 | null |
| 2025-04-11 | Quality evaluation of Tabby coding assistant using real source code snippets | Marta Borek et.al. | 2504.08650 | link |
| 2025-04-11 | Do LLMs trust AI regulation? Emerging behaviour of game-theoretic LLM agents | Alessio Buscemi et.al. | 2504.08640 | null |
| 2025-04-11 | Latent Diffusion Autoencoders: Toward Efficient and Meaningful Unsupervised Representation Learning in Medical Imaging | Gabriele Lozupone et.al. | 2504.08635 | link |
| 2025-04-11 | MooseAgent: A LLM Based Multi-agent Framework for Automating Moose Simulation | Tao Zhang et.al. | 2504.08621 | link |
| 2025-04-11 | Analyzing 16,193 LLM Papers for Fun and Profits | Zhiqiu Xia et.al. | 2504.08619 | null |
| 2025-04-11 | Playpen: An Environment for Exploring Learning Through Conversational Interaction | Nicola Horst et.al. | 2504.08590 | link |
| 2025-04-11 | AstroLLaVA: towards the unification of astronomical data and natural language | Sharaf Zaman et.al. | 2504.08583 | null |
| 2025-04-11 | UoB-NLP at SemEval-2025 Task 11: Leveraging Adapters for Multilingual and Cross-Lingual Emotion Detection | Frances Laureano De Leon et.al. | 2504.08543 | null |
| 2025-04-11 | Embodied Image Captioning: Self-supervised Learning Agents for Spatially Coherent Image Descriptions | Tommaso Galliena et.al. | 2504.08531 | null |
| 2025-04-11 | On The Landscape of Spoken Language Models: A Comprehensive Survey | Siddhant Arora et.al. | 2504.08528 | null |
| 2025-04-10 | Cat, Rat, Meow: On the Alignment of Language Model and Human Term-Similarity Judgments | Lorenz Linhardt et.al. | 2504.07965 | null |
| 2025-04-10 | C3PO: Critical-Layer, Core-Expert, Collaborative Pathway Optimization for Test-Time Expert Re-Mixing | Zhongyang Li et.al. | 2504.07964 | link |
| 2025-04-10 | GLUS: Global-Local Reasoning Unified into A Single Large Language Model for Video Segmentation | Lang Lin et.al. | 2504.07962 | null |
| 2025-04-10 | Detect Anything 3D in the Wild | Hanxue Zhang et.al. | 2504.07958 | link |
| 2025-04-10 | MM-IFEngine: Towards Multimodal Instruction Following | Shengyuan Ding et.al. | 2504.07957 | link |
| 2025-04-10 | VCR-Bench: A Comprehensive Evaluation Framework for Video Chain-of-Thought Reasoning | Yukun Qi et.al. | 2504.07956 | null |
| 2025-04-10 | Dynamic Cheatsheet: Test-Time Learning with Adaptive Memory | Mirac Suzgun et.al. | 2504.07952 | link |
| 2025-04-10 | We Are All Creators: Generative AI, Collective Knowledge, and the Path Towards Human-AI Synergy | Jordi Linares-Pellicer et.al. | 2504.07936 | null |
| 2025-04-10 | Echo Chamber: RL Post-training Amplifies Behaviors Learned in Pretraining | Rosie Zhao et.al. | 2504.07912 | link |
| 2025-04-10 | Porting an LLM based Application from ChatGPT to an On-Premise Environment | Teemu Paloniemi et.al. | 2504.07907 | null |
| 2025-04-10 | Redefining Machine Translation on Social Network Services with Large Language Models | Hongcheng Guo et.al. | 2504.07901 | link |
| 2025-04-10 | How do Large Language Models Understand Relevance? A Mechanistic Interpretability Perspective | Qi Liu et.al. | 2504.07898 | link |
| 2025-04-10 | Fast Adaptation with Behavioral Foundation Models | Harshit Sikchi et.al. | 2504.07896 | null |
| 2025-04-10 | Benchmarking Adversarial Robustness to Bias Elicitation in Large Language Models: Scalable Automated Assessment with LLM-as-a-Judge | Riccardo Cantini et.al. | 2504.07887 | link |
| 2025-04-11 | An LLM-Driven Multi-Agent Debate System for Mendelian Diseases | Xinyang Zhou et.al. | 2504.07881 | null |
| 2025-04-10 | Token Level Routing Inference System for Edge Devices | Jianshu She et.al. | 2504.07878 | null |
| 2025-04-10 | SAMJAM: Zero-Shot Video Scene Graph Generation for Egocentric Kitchen Videos | Joshua Li et.al. | 2504.07867 | null |
| 2025-04-11 | Pangu Ultra: Pushing the Limits of Dense Large Language Models on Ascend NPUs | Yichun Yin et.al. | 2504.07866 | null |
| 2025-04-10 | Robust Hallucination Detection in LLMs via Adaptive Token Selection | Mengjia Niu et.al. | 2504.07863 | null |
| 2025-04-10 | 2D-Curri-DPO: Two-Dimensional Curriculum Learning for Direct Preference Optimization | Mengyang Li et.al. | 2504.07856 | null |
| 2025-04-09 | Sculpting Subspaces: Constrained Full Fine-Tuning in LLMs for Continual Learning | Nikhil Shivakumar Nayak et.al. | 2504.07097 | link |
| 2025-04-09 | OLMoTrace: Tracing Language Model Outputs Back to Trillions of Training Tokens | Jiacheng Liu et.al. | 2504.07096 | null |
| 2025-04-09 | Are We Done with Object-Centric Learning? | Alexander Rubinstein et.al. | 2504.07092 | link |
| 2025-04-09 | KG-LLM-Bench: A Scalable Benchmark for Evaluating LLM Reasoning on Textualized Knowledge Graphs | Elan Markowitz et.al. | 2504.07087 | null |
| 2025-04-09 | A Sober Look at Progress in Language Model Reasoning: Pitfalls and Paths to Reproducibility | Andreas Hochlehnert et.al. | 2504.07086 | null |
| 2025-04-09 | Self-Steering Language Models | Gabriel Grand et.al. | 2504.07081 | null |
| 2025-04-09 | DeduCE: Deductive Consistency as a Framework to Evaluate LLM Reasoning | Atharva Pandey et.al. | 2504.07080 | null |
| 2025-04-09 | Kaleidoscope: In-language Exams for Massively Multilingual Vision Evaluation | Israfel Salazar et.al. | 2504.07072 | null |
| 2025-04-09 | A Survey on Personalized and Pluralistic Preference Alignment in Large Language Models | Zhouhang Xie et.al. | 2504.07070 | null |
| 2025-04-09 | HalluciNot: Hallucination Detection Through Context and Common Knowledge Verification | Bibek Paudel et.al. | 2504.07069 | null |
| 2025-04-09 | Teaching pathology foundation models to accurately predict gene expression with parameter efficient knowledge transfer | Shi Pan et.al. | 2504.07061 | null |
| 2025-04-09 | TASTE: Text-Aligned Speech Tokenization and Embedding for Spoken Language Modeling | Liang-Hsuan Tseng et.al. | 2504.07053 | link |
| 2025-04-09 | To Backtrack or Not to Backtrack: When Sequential Search Limits Model Reasoning | Tian Qin et.al. | 2504.07052 | null |
| 2025-04-09 | Evaluating Retrieval Augmented Generative Models for Document Queries in Transportation Safety | Chad Melton et.al. | 2504.07022 | null |
| 2025-04-09 | LLM-IFT: LLM-Powered Information Flow Tracking for Secure Hardware | Nowfel Mashnoor et.al. | 2504.07015 | null |
| 2025-04-09 | Towards LLMs Robustness to Changes in Prompt Format Styles | Lilian Ngweta et.al. | 2504.06969 | null |
| 2025-04-09 | Efficient Self-Supervised Learning for Earth Observation via Dynamic Dataset Curation | Thomas Kerdreux et.al. | 2504.06962 | null |
| 2025-04-09 | VideoChat-R1: Enhancing Spatio-Temporal Perception via Reinforcement Fine-Tuning | Xinhao Li et.al. | 2504.06958 | null |
| 2025-04-09 | Adaptive Computation Pruning for the Forgetting Transformer | Zhixuan Lin et.al. | 2504.06949 | null |
| 2025-04-09 | RuOpinionNE-2024: Extraction of Opinion Tuples from Russian News Texts | Natalia Loukachevitch et.al. | 2504.06947 | link |
| 2025-04-08 | GOLLuM: Gaussian Process Optimized LLMs -- Reframing LLM Finetuning through Bayesian Optimization | Bojana Ranković et.al. | 2504.06265 | link |
| 2025-04-08 | OmniSVG: A Unified Scalable Vector Graphics Generation Model | Yiying Yang et.al. | 2504.06263 | null |
| 2025-04-08 | Hogwild! Inference: Parallel LLM Generation via Concurrent Attention | Gleb Rodionov et.al. | 2504.06261 | link |
| 2025-04-08 | FEABench: Evaluating Language Models on Multiphysics Reasoning Ability | Nayantara Mudur et.al. | 2504.06260 | link |
| 2025-04-08 | Orb-v3: atomistic simulation at scale | Benjamin Rhodes et.al. | 2504.06231 | link |
| 2025-04-08 | LExT: Towards Evaluating Trustworthiness of Natural Language Explanations | Krithi Shailya et.al. | 2504.06227 | null |
| 2025-04-08 | Encoder-Decoder Gemma: Improving the Quality-Efficiency Trade-Off via Adaptation | Biao Zhang et.al. | 2504.06225 | null |
| 2025-04-09 | Earth-Adapter: Bridge the Geospatial Domain Gaps with Mixture of Frequency Adaptation | Xiaoxing Hu et.al. | 2504.06220 | link |
| 2025-04-08 | Can Performant LLMs Be Ethical? Quantifying the Impact of Web Crawling Opt-Outs | Dongyang Fan et.al. | 2504.06219 | null |
| 2025-04-08 | From 128K to 4M: Efficient Training of Ultra-Long Context Large Language Models | Chejian Xu et.al. | 2504.06214 | null |
| 2025-04-08 | TxGemma: Efficient and Agentic LLMs for Therapeutics | Eric Wang et.al. | 2504.06196 | null |
| 2025-04-08 | A Self-Supervised Framework for Space Object Behaviour Characterisation | Ian Groves et.al. | 2504.06176 | null |
| 2025-04-08 | Assessing how hyperparameters impact Large Language Models' sarcasm detection performance | Montgomery Gole et.al. | 2504.06166 | null |
| 2025-04-09 | Navigating the Rabbit Hole: Emergent Biases in LLM-Generated Attack Narratives Targeting Mental Health Groups | Rijul Magu et.al. | 2504.06160 | null |
| 2025-04-08 | A Large-Scale Analysis on Contextual Self-Supervised Video Representation Learning | Akash Kumar et.al. | 2504.06153 | null |
| 2025-04-08 | V-MAGE: A Game Evaluation Framework for Assessing Visual-Centric Capabilities in Multimodal Large Language Models | Xiangxi Zheng et.al. | 2504.06148 | link |
| 2025-04-08 | ARLO: A Tailorable Approach for Transforming Natural Language Software Requirements into Architecture using LLMs | Tooraj Helmi et.al. | 2504.06143 | null |
| 2025-04-08 | Adversarial Training of Reward Models | Alexander Bukharin et.al. | 2504.06141 | null |
| 2025-04-08 | A Multimedia Analytics Model for the Foundation Model Era | Marcel Worring et.al. | 2504.06138 | null |
| 2025-04-08 | QGen Studio: An Adaptive Question-Answer Generation, Training and Evaluation Platform | Movina Moses et.al. | 2504.06136 | null |
| 2025-04-07 | URECA: Unique Region Caption Anything | Sangbeom Lim et.al. | 2504.05305 | null |
| 2025-04-07 | InteractVLM: 3D Interaction Reasoning from 2D Foundational Models | Sai Kumar Dwivedi et.al. | 2504.05303 | link |
| 2025-04-07 | SmolVLM: Redefining small and efficient multimodal models | Andrés Marafioti et.al. | 2504.05299 | null |
| 2025-04-07 | Truthful or Fabricated? Using Causal Attribution to Mitigate Reward Hacking in Explanations | Pedro Ferreira et.al. | 2504.05294 | null |
| 2025-04-07 | The challenge of uncertainty quantification of large language models in medicine | Zahra Atf et.al. | 2504.05278 | null |
| 2025-04-07 | Enhancing LLM-Based Short Answer Grading with Retrieval-Augmented Generation | Yucheng Chu et.al. | 2504.05276 | null |
| 2025-04-07 | Do PhD-level LLMs Truly Grasp Elementary Addition? Probing Rule Learning vs. Memorization in Large Language Models | Yang Yan et.al. | 2504.05262 | null |
| 2025-04-07 | Learning to Reason Over Time: Timeline Self-Reflection for Improved Temporal Reasoning in Language Models | Adrián Bazaga et.al. | 2504.05258 | null |
| 2025-04-07 | Explaining Low Perception Model Competency with High-Competency Counterfactuals | Sara Pohland et.al. | 2504.05254 | null |
| 2025-04-07 | LLM-based Automated Grading with Human-in-the-Loop | Hang Li et.al. | 2504.05239 | null |
| 2025-04-07 | NoveltyBench: Evaluating Creativity and Diversity in Language Models | Yiming Zhang et.al. | 2504.05228 | null |
| 2025-04-07 | A Reality Check of Vision-Language Pre-training in Radiology: Have We Progressed Using Text? | Julio Silva-Rodríguez et.al. | 2504.05227 | null |
| 2025-04-07 | Vision-Language Model Predictive Control for Manipulation Planning and Trajectory Generation | Jiaming Chen et.al. | 2504.05225 | link |
| 2025-04-08 | Leveraging LLMs for Utility-Focused Annotation: Reducing Manual Effort for Retrieval and RAG | Hengran Zhang et.al. | 2504.05220 | null |
| 2025-04-07 | Unleashing the Power of LLMs in Dense Retrieval with Query Likelihood Modeling | Hengran Zhang et.al. | 2504.05216 | null |
| 2025-04-07 | Post-Training Language Models for Continual Relation Extraction | Sefika Efeoglu et.al. | 2504.05214 | null |
| 2025-04-07 | Quantum Program Linting with LLMs: Emerging Results from a Comparative Study | Seung Yeob Shin et.al. | 2504.05204 | null |
| 2025-04-07 | Training state-of-the-art pathology foundation models with orders of magnitude less data | Mikhail Karasikov et.al. | 2504.05186 | null |
| 2025-04-07 | Concise Reasoning via Reinforcement Learning | Mehdi Fatemi et.al. | 2504.05185 | link |
| 2025-04-07 | BRIDGES: Bridging Graph Modality and Large Language Models within EDA Tasks | Wei Li et.al. | 2504.05180 | null |
| 2025-04-04 | Shape My Moves: Text-Driven Shape-Aware Synthesis of Human Motions | Ting-Hsuan Liao et.al. | 2504.03639 | null |
| 2025-04-04 | Do Larger Language Models Imply Better Reasoning? A Pretraining Scaling Law for Reasoning | Xinyi Wang et.al. | 2504.03635 | null |
| 2025-04-04 | Align to Structure: Aligning Large Language Models with Structural Information | Zae Myung Kim et.al. | 2504.03622 | null |
| 2025-04-04 | VISTA-OCR: Towards generative and interactive end to end OCR models | Laziz Hamdi et.al. | 2504.03621 | null |
| 2025-04-04 | Multilingual Retrieval-Augmented Generation for Knowledge-Intensive Task | Leonardo Ranaldi et.al. | 2504.03616 | null |
| 2025-04-04 | AIR: A Systematic Analysis of Annotations, Instructions, and Response Pairs in Preference Dataset | Bingxiang He et.al. | 2504.03612 | null |
| 2025-04-04 | MedSAM2: Segment Anything in 3D Medical Images and Videos | Jun Ma et.al. | 2504.03600 | link |
| 2025-04-04 | EnrichIndex: Using LLMs to Enrich Retrieval Indices Offline | Peter Baile Chen et.al. | 2504.03598 | null |
| 2025-04-04 | PF3Det: A Prompted Foundation Feature Assisted Visual LiDAR 3D Detector | Kaidong Li et.al. | 2504.03563 | null |
| 2025-04-04 | Agentic Knowledgeable Self-awareness | Shuofei Qiao et.al. | 2504.03553 | link |
| 2025-04-04 | RANa: Retrieval-Augmented Navigation | Gianluca Monaci et.al. | 2504.03524 | null |
| 2025-04-04 | Neutralizing the Narrative: AI-Powered Debiasing of Online News Articles | Chen Wei Kuo et.al. | 2504.03520 | null |
| 2025-04-04 | SpectR: Dynamically Composing LM Experts with Spectral Routing | William Fleshman et.al. | 2504.03454 | null |
| 2025-04-04 | Optimizing Specific and Shared Parameters for Efficient Parameter Tuning | Van-Anh Nguyen et.al. | 2504.03450 | null |
| 2025-04-04 | LLMSched: Uncertainty-Aware Workload Scheduling for Compound LLM Applications | Botao Zhu et.al. | 2504.03444 | null |
| 2025-04-04 | Know What You do Not Know: Verbalized Uncertainty Estimation Robustness on Corrupted Images in Vision-Language Models | Mirko Borszukovszki et.al. | 2504.03440 | null |
| 2025-04-04 | Locations of Characters in Narratives: Andersen and Persuasion Datasets | Batuhan Ozyurt et.al. | 2504.03434 | link |
| 2025-04-04 | Online Difficulty Filtering for Reasoning Oriented Reinforcement Learning | Sanghwan Bae et.al. | 2504.03380 | null |
| 2025-04-04 | MultiClear: Multimodal Soft Exoskeleton Glove for Transparent Object Grasping Assistance | Chen Hu et.al. | 2504.03379 | null |
| 2025-04-04 | Sustainable LLM Inference for Edge AI: Evaluating Quantized LLMs for Energy Efficiency, Output Accuracy, and Inference Latency | Erik Johannes Husom et.al. | 2504.03360 | null |
| 2025-04-03 | STING-BEE: Towards Vision-Language Model for Real-World X-ray Baggage Security Inspection | Divya Velayudhan et.al. | 2504.02823 | null |
| 2025-04-03 | Sparse Autoencoders Learn Monosemantic Features in Vision-Language Models | Mateusz Pach et.al. | 2504.02821 | link |
| 2025-04-03 | Generative Evaluation of Complex Reasoning in Large Language Models | Haowei Lin et.al. | 2504.02810 | link |
| 2025-04-03 | MegaMath: Pushing the Limits of Open Math Corpora | Fan Zhou et.al. | 2504.02807 | link |
| 2025-04-03 | F-ViTA: Foundation Model Guided Visible to Thermal Translation | Jay N. Paranjape et.al. | 2504.02801 | link |
| 2025-04-04 | A Survey of Large Language Models in Mental Health Disorder Detection on Social Media | Zhuohan Ge et.al. | 2504.02800 | null |
| 2025-04-03 | Systematic Evaluation of Large Vision-Language Models for Surgical Artificial Intelligence | Anita Rau et.al. | 2504.02799 | null |
| 2025-04-03 | A Framework for Situating Innovations, Opportunities, and Challenges in Advancing Vertical Systems with Large AI Models | Gaurav Verma et.al. | 2504.02793 | null |
| 2025-04-03 | Unified World Models: Coupling Video and Action Diffusion for Pretraining on Large Robotic Datasets | Chuning Zhu et.al. | 2504.02792 | null |
| 2025-04-03 | A Framework for Robust Cognitive Evaluation of LLMs | Karin de Langis et.al. | 2504.02789 | null |
| 2025-04-03 | From Consumption to Collaboration: Measuring Interaction Patterns to Augment Human Cognition in Open-Ended Tasks | Joshua Holstein et.al. | 2504.02780 | null |
| 2025-04-03 | BT-ACTION: A Test-Driven Approach for Modular Understanding of User Instruction Leveraging Behaviour Trees and LLMs | Alexander Leszczynski et.al. | 2504.02779 | link |
| 2025-04-03 | How Deep Do Large Language Models Internalize Scientific Literature and Citation Practices? | Andres Algaba et.al. | 2504.02767 | link |
| 2025-04-03 | Robot-Led Vision Language Model Wellbeing Assessment of Children | Nida Itrat Abbasi et.al. | 2504.02765 | null |
| 2025-04-03 | Enhancing LLM Robustness to Perturbed Instructions: An Empirical Study | Aryan Agrawal et.al. | 2504.02733 | link |
| 2025-04-04 | Why do LLMs attend to the first token? | Federico Barbero et.al. | 2504.02732 | null |
| 2025-04-03 | ERPO: Advancing Safety Alignment via Ex-Ante Reasoning Preference Optimization | Kehua Feng et.al. | 2504.02725 | null |
| 2025-04-03 | TeleMoM: Consensus-Driven Telecom Intelligence via Mixture of Models | Xinquan Wang et.al. | 2504.02712 | null |
| 2025-04-03 | The Hidden Space of Safety: Understanding Preference-Tuned LLMs in Multilingual context | Nikhil Verma et.al. | 2504.02708 | null |
| 2025-04-03 | LLM for Complex Reasoning Task: An Exploratory Study in Fermi Problems | Zishuo Liu et.al. | 2504.02671 | null |
| 2025-04-02 | Slot-Level Robotic Placement via Visual Imitation from Single Human Video | Dandan Shan et.al. | 2504.01959 | null |
| 2025-04-02 | Towards Unified Referring Expression Segmentation Across Omni-Level Visual Target Granularities | Jing Liu et.al. | 2504.01954 | null |
| 2025-04-02 | The LLM Wears Prada: Analysing Gender Bias and Stereotypes through Online Shopping Data | Massimiliano Luca et.al. | 2504.01951 | null |
| 2025-04-02 | Efficient Federated Learning Tiny Language Models for Mobile Network Feature Prediction | Daniel Becking et.al. | 2504.01947 | null |
| 2025-04-02 | OpenCodeReasoning: Advancing Data Distillation for Competitive Coding | Wasi Uddin Ahmad et.al. | 2504.01943 | null |
| 2025-04-02 | Critical Thinking: Which Kinds of Complexity Govern Optimal Reasoning Length? | Celine Lee et.al. | 2504.01935 | link |
| 2025-04-02 | A thorough benchmark of automatic text classification: From traditional approaches to large language models | Washington Cunha et.al. | 2504.01930 | link |
| 2025-04-02 | Gen-C: Populating Virtual Worlds with Generative Crowds | Andreas Panayiotou et.al. | 2504.01924 | null |
| 2025-04-02 | Is Less Really More? Fake News Detection with Limited Information | Zhaoyang Cao et.al. | 2504.01922 | link |
| 2025-04-02 | Bridging the Linguistic Divide: A Survey on Leveraging Large Language Models for Machine Translation | Baban Gain et.al. | 2504.01919 | null |
| 2025-04-02 | FineLIP: Extending CLIP's Reach via Fine-Grained Alignment with Longer Text Inputs | Mothilal Asokan et.al. | 2504.01916 | link |
| 2025-04-02 | Advancing AI-Scientist Understanding: Making LLM Think Like a Physicist with Interpretable Reasoning | Yinggan Xu et.al. | 2504.01911 | null |
| 2025-04-02 | Is Temporal Prompting All We Need For Limited Labeled Action Recognition? | Shreyank N Gowda et.al. | 2504.01890 | null |
| 2025-04-02 | TransientTables: Evaluating LLMs' Reasoning on Temporally Evolving Semi-structured Tables | Abhilash Shankarampeta et.al. | 2504.01879 | null |
| 2025-04-02 | From Code Generation to Software Testing: AI Copilot with Context-Based RAG | Yuchen Wang et.al. | 2504.01866 | null |
| 2025-04-02 | Cross-Lingual Consistency: A Novel Inference Framework for Advancing Reasoning in Large Language Models | Zhiwei Yu et.al. | 2504.01857 | null |
| 2025-04-02 | Code Red! On the Harmfulness of Applying Off-the-shelf Large Language Models to Programming Tasks | Ali Al-Kaswan et.al. | 2504.01850 | null |
| 2025-04-02 | LARGE: Legal Retrieval Augmented Generation Evaluation Tool | Minhu Park et.al. | 2504.01840 | link |
| 2025-04-02 | Prompting Medical Vision-Language Models to Mitigate Diagnosis Bias by Generating Realistic Dermoscopic Images | Nusrat Munia et.al. | 2504.01838 | link |
| 2025-04-02 | YourBench: Easy Custom Evaluation Sets for Everyone | Sumuk Shashidhar et.al. | 2504.01833 | link |
| 2025-03-31 | Any2Caption: Interpreting Any Condition to Caption for Controllable Video Generation | Shengqiong Wu et.al. | 2503.24379 | null |
| 2025-03-31 | ACPBench Hard: Unrestrained Reasoning about Action, Change, and Planning | Harsha Kokel et.al. | 2503.24378 | null |
| 2025-03-31 | Harnessing the Reasoning Economy: A Survey of Efficient Reasoning for Large Language Models | Rui Wang et.al. | 2503.24377 | link |
| 2025-03-31 | Exploring the Effect of Reinforcement Learning on Video Understanding: Insights from SEED-Bench-R1 | Yi Chen et.al. | 2503.24376 | link |
| 2025-03-31 | Effectively Controlling Reasoning Models through Thinking Intervention | Tong Wu et.al. | 2503.24370 | null |
| 2025-03-31 | Adapting Vision Foundation Models for Real-time Ultrasound Image Segmentation | Xiaoran Zhang et.al. | 2503.24368 | null |
| 2025-03-31 | ORAL: Prompting Your Large-Scale LoRAs via Conditional Recurrent Diffusion | Rana Muhammad Shahroz Khan et.al. | 2503.24354 | null |
| 2025-03-31 | PathOrchestra: A Comprehensive Foundation Model for Computational Pathology with Over 100 Diverse Clinical-Grade Tasks | Fang Yan et.al. | 2503.24345 | null |
| 2025-03-31 | Can Test-Time Scaling Improve World Foundation Model? | Wenyan Cong et.al. | 2503.24320 | link |
| 2025-03-31 | BEATS: Bias Evaluation and Assessment Test Suite for Large Language Models | Alok Abhishek et.al. | 2503.24310 | null |
| 2025-03-31 | A Systematic Evaluation of LLM Strategies for Mental Health Text Analysis: Fine-tuning vs. Prompt Engineering vs. RAG | Arshia Kermani et.al. | 2503.24307 | null |
| 2025-03-31 | Rec-R1: Bridging Generative Large Language Models and User-Centric Recommendation Systems via Reinforcement Learning | Jiacheng Lin et.al. | 2503.24289 | link |
| 2025-03-31 | Style Quantization for Data-Efficient GAN Training | Jian Wang et.al. | 2503.24282 | null |
| 2025-03-31 | Evaluating and Designing Sparse Autoencoders by Approximating Quasi-Orthogonality | Sewoong Lee et.al. | 2503.24277 | link |
| 2025-03-31 | Enhancing Large Language Models (LLMs) for Telecommunications using Knowledge Graphs and Retrieval-Augmented Generation | Dun Yuan et.al. | 2503.24245 | null |
| 2025-03-31 | What, How, Where, and How Well? A Survey on Test-Time Scaling in Large Language Models | Qiyuan Zhang et.al. | 2503.24235 | link |
| 2025-03-31 | Synthetic News Generation for Fake News Classification | Abdul Sittar et.al. | 2503.24206 | null |
| 2025-03-31 | TwT: Thinking without Tokens by Habitual Reasoning Distillation with Multi-Teachers' Guidance | Jingxian Xu et.al. | 2503.24198 | null |
| 2025-03-31 | Text2Tracks: Prompt-based Music Recommendation via Generative Retrieval | Enrico Palumbo et.al. | 2503.24193 | null |
| 2025-03-31 | Output Constraints as Attack Surface: Exploiting Structured Generation to Bypass LLM Safety Mechanisms | Shuoming Zhang et.al. | 2503.24191 | null |
| 2025-03-28 | Q-Insight: Understanding Image Quality via Visual Reinforcement Learning | Weiqi Li et.al. | 2503.22679 | link |
| 2025-03-28 | QuestBench: Can LLMs ask the right question to acquire information in reasoning tasks? | Belinda Z. Li et.al. | 2503.22674 | link |
| 2025-03-28 | Exploring the Effectiveness of Multi-stage Fine-tuning for Cross-encoder Re-rankers | Francesca Pezzuti et.al. | 2503.22672 | link |
| 2025-03-28 | Understanding Co-speech Gestures in-the-wild | Sindhu B Hegde et.al. | 2503.22668 | null |
| 2025-03-28 | Unicorn: Text-Only Data Synthesis for Vision Language Model Training | Xiaomin Yu et.al. | 2503.22655 | link |
| 2025-03-28 | Evaluating Multimodal Language Models as Visual Assistants for Visually Impaired Users | Antonia Karamolegkou et.al. | 2503.22610 | null |
| 2025-03-28 | On the Alignment of Post-Publication Reviews & Bibliometric and Altmetric Impact -- A Case Study on Expert Statements from the Science Media Center Germany | Dirk Tunger et.al. | 2503.22594 | null |
| 2025-03-28 | LLM-enabled Instance Model Generation | Fengjunjie Pan et.al. | 2503.22587 | null |
| 2025-03-28 | Historical Ink: Exploring Large Language Models for Irony Detection in 19th-Century Spanish | Kevin Cohen et.al. | 2503.22585 | link |
| 2025-03-28 | Beyond Vanilla Fine-Tuning: Leveraging Multistage, Multilingual, and Domain-Specific Methods for Low-Resource Machine Translation | Sarubi Thillainathan et.al. | 2503.22582 | null |
| 2025-03-28 | Breaking Language Barriers in Visual Language Models via Multilingual Textual Regularization | Iñigo Pikabea et.al. | 2503.22577 | null |
| 2025-03-28 | Niyama: Breaking the Silos of LLM Inference Serving | Kanishk Goel et.al. | 2503.22562 | null |
| 2025-03-28 | Bridging the Dimensional Chasm: Uncover Layer-wise Dimensional Reduction in Transformers through Token Correlation | Zhuo-Yang Song et.al. | 2503.22547 | null |
| 2025-03-28 | Exploiting Mixture-of-Experts Redundancy Unlocks Multimodal Generative Abilities | Raman Dutt et.al. | 2503.22517 | null |
| 2025-03-28 | Assessing Foundation Models for Sea Ice Type Segmentation in Sentinel-1 SAR Imagery | Samira Alkaee Taleghan et.al. | 2503.22516 | null |
| 2025-03-28 | Probabilistic Uncertain Reward Model: A Natural Generalization of Bradley-Terry Reward Model | Wangtao Sun et.al. | 2503.22480 | null |
| 2025-03-28 | WorkTeam: Constructing Workflows from Natural Language with Multi-Agents | Hanchao Liu et.al. | 2503.22473 | null |
| 2025-03-28 | Evaluating LLM-based Agents for Multi-Turn Conversations: A Survey | Shengyue Guan et.al. | 2503.22458 | null |
| 2025-03-28 | Entropy-guided sequence weighting for efficient exploration in RL-based LLM fine-tuning | Abdullah Vanlioglu et.al. | 2503.22456 | null |
| 2025-03-28 | STADE: Standard Deviation as a Pruning Metric | Diego Coello de Portugal Mecke et.al. | 2503.22451 | link |
| 2025-03-27 | Mobile-VideoGPT: Fast and Accurate Video Understanding Language Model | Abdelrahman Shaker et.al. | 2503.21782 | link |
| 2025-03-27 | Video-R1: Reinforcing Video Reasoning in MLLMs | Kaituo Feng et.al. | 2503.21776 | link |
| 2025-03-27 | Stable-SCore: A Stable Registration-based Framework for 3D Shape Correspondence | Haolin Liu et.al. | 2503.21766 | null |
| 2025-03-27 | Uni4D: Unifying Visual Foundation Models for 4D Modeling from a Single Video | David Yifan Yao et.al. | 2503.21761 | link |
| 2025-03-27 | MemInsight: Autonomous Memory Augmentation for LLM Agents | Rana Salama et.al. | 2503.21760 | null |
| 2025-03-27 | Fwd2Bot: LVLM Visual Token Compression with Double Forward Bottleneck | Adrian Bulat et.al. | 2503.21757 | null |
| 2025-03-27 | GateLens: A Reasoning-Enhanced LLM Agent for Automotive Software Release Analytics | Arsham Gholamzadeh Khoee et.al. | 2503.21735 | null |
| 2025-03-27 | Effective Skill Unlearning through Intervention and Abstention | Yongce Li et.al. | 2503.21730 | link |
| 2025-03-27 | Collab: Controlled Decoding using Mixture of Agents for LLM Alignment | Souradip Chakraborty et.al. | 2503.21720 | null |
| 2025-03-27 | Outlier dimensions favor frequent tokens in language model | Iuri Macocco et.al. | 2503.21718 | null |
| 2025-03-27 | As easy as PIE: understanding when pruning causes language models to disagree | Pietro Tropeano et.al. | 2503.21714 | link |
| 2025-03-27 | Enhancing Repository-Level Software Repair via Repository-Aware Knowledge Graphs | Boyang Yang et.al. | 2503.21710 | null |
| 2025-03-27 | LLM-Gomoku: A Large Language Model-Based System for Strategic Gomoku with Self-Play and Reinforcement Learning | Hui Wang et.al. | 2503.21683 | null |
| 2025-03-27 | JiraiBench: A Bilingual Benchmark for Evaluating Large Language Models' Detection of Human Self-Destructive Behavior Content in Jirai Community | Yunze Xiao et.al. | 2503.21679 | null |
| 2025-03-27 | How do language models learn facts? Dynamics, curricula and hallucinations | Nicolas Zucchet et.al. | 2503.21676 | null |
| 2025-03-27 | Intelligent IoT Attack Detection Design via ODLLM with Feature Ranking-based Knowledge Base | Satvik Verma et.al. | 2503.21674 | link |
| 2025-03-27 | Model Assembly Learning with Heterogeneous Layer Weight Merging | Yi-Kai Zhang et.al. | 2503.21657 | null |
| 2025-03-27 | UI-R1: Enhancing Action Prediction of GUI Agents by Reinforcement Learning | Zhengxi Lu et.al. | 2503.21620 | link |
| 2025-03-27 | Leveraging Language Models for Analyzing Longitudinal Experiential Data in Education | Ahatsham Hayat et.al. | 2503.21617 | null |
| 2025-03-27 | Evaluating book summaries from internal knowledge in Large Language Models: a cross-model and semantic consistency approach | Javier Coronado-Blázquez et.al. | 2503.21613 | null |
| 2025-03-26 | Mobile-MMLU: A Mobile Intelligence Language Understanding Benchmark | Sondos Mahmoud Bsharat et.al. | 2503.20786 | link |
| 2025-03-26 | Free4D: Tuning-free 4D Scene Generation with Spatial-Temporal Consistency | Tianqi Liu et.al. | 2503.20785 | link |
| 2025-03-26 | Feature4X: Bridging Any Monocular Video to 4D Agentic AI with Versatile Gaussian Feature Fields | Shijie Zhou et.al. | 2503.20776 | null |
| 2025-03-26 | ASGO: Adaptive Structured Gradient Optimization | Kang An et.al. | 2503.20762 | null |
| 2025-03-26 | MCTS-RAG: Enhancing Retrieval-Augmented Generation with Monte Carlo Tree Search | Yunhai Hu et.al. | 2503.20757 | null |
| 2025-03-27 | Reason-RFT: Reinforcement Fine-Tuning for Visual Reasoning | Huajie Tan et.al. | 2503.20752 | null |
| 2025-03-26 | UniSTD: Towards Unified Spatio-Temporal Learning across Diverse Disciplines | Chen Tang et.al. | 2503.20748 | null |
| 2025-03-26 | MATHGLANCE: Multimodal Large Language Models Do Not Know Where to Look in Mathematical Diagrams | Yanpeng Sun et.al. | 2503.20745 | null |
| 2025-03-26 | Dynamic Motion Blending for Versatile Motion Editing | Nan Jiang et.al. | 2503.20724 | null |
| 2025-03-26 | From Annotation to Adaptation: Metrics, Synthetic Data, and Aspect Extraction for Aspect-Based Sentiment Analysis with Large Language Models | Nikita Neveditsin et.al. | 2503.20715 | null |
| 2025-03-26 | MMMORRF: Multimodal Multilingual Modularized Reciprocal Rank Fusion | Saron Samuel et.al. | 2503.20698 | null |
| 2025-03-26 | Graph-Enhanced Model-Free Reinforcement Learning Agents for Efficient Power Grid Topological Control | Eloy Anguiano Batanero et.al. | 2503.20688 | null |
| 2025-03-27 | Flip Learning: Weakly Supervised Erase to Segment Nodules in Breast Ultrasound | Yuhao Huang et.al. | 2503.20685 | null |
| 2025-03-27 | Mitigating Low-Level Visual Hallucinations Requires Self-Awareness: Database, Model and Training Strategy | Yinan Sun et.al. | 2503.20673 | null |
| 2025-03-26 | TAMA: A Human-AI Collaborative Thematic Analysis Framework Using Multi-Agent LLMs for Clinical Interviews | Huimin Xu et.al. | 2503.20666 | null |
| 2025-03-26 | AutoRad-Lung: A Radiomic-Guided Prompting Autoregressive Vision-Language Model for Lung Nodule Malignancy Prediction | Sadaf Khademi et.al. | 2503.20662 | null |
| 2025-03-26 | AccidentSim: Generating Physically Realistic Vehicle Collision Videos from Real-World Accident Reports | Xiangwen Zhang et.al. | 2503.20654 | null |
| 2025-03-26 | Unlocking Efficient Long-to-Short LLM Reasoning with Model Merging | Han Wu et.al. | 2503.20641 | link |
| 2025-03-26 | Collaborative Storytelling and LLM: A Linguistic Analysis of Automatically-Generated Role-Playing Game Sessions | Alessandro Maisto et.al. | 2503.20623 | null |
| 2025-03-26 | IAP: Improving Continual Learning of Vision-Language Models via Instance-Aware Prompting | Hao Fu et.al. | 2503.20612 | link |
| 2025-03-25 | SuperFlow++: Enhanced Spatiotemporal Consistency for Cross-Modal Data Pretraining | Xiang Xu et.al. | 2503.19912 | link |
| 2025-03-25 | CoLLM: A Large Language Model for Composed Image Retrieval | Chuong Huynh et.al. | 2503.19910 | link |
| 2025-03-25 | FullDiT: Multi-Task Video Generative Foundation Model with Full Attention | Xuan Ju et.al. | 2503.19907 | null |
| 2025-03-25 | CAFe: Unifying Representation and Generation with Contrastive-Autoregressive Finetuning | Hao Yu et.al. | 2503.19900 | link |
| 2025-03-25 | A Multi-Agent Framework Integrating Large Language Models and Generative AI for Accelerated Metamaterial Design | Jie Tian et.al. | 2503.19889 | null |
| 2025-03-25 | CausalRAG: Integrating Causal Graphs into Retrieval-Augmented Generation | Nengbo Wang et.al. | 2503.19878 | null |
| 2025-03-25 | Scaling Evaluation-time Compute with Reasoning Models as Process Evaluators | Seungone Kim et.al. | 2503.19877 | null |
| 2025-03-25 | SLA-Awareness for AI-assisted coding | Kishanthan Thangarajah et.al. | 2503.19876 | null |
| 2025-03-25 | Think Twice: Enhancing LLM Reasoning by Scaling Multi-round Test-time Thinking | Xiaoyu Tian et.al. | 2503.19855 | null |
| 2025-03-25 | Towards Online Multi-Modal Social Interaction Understanding | Xinpeng Li et.al. | 2503.19851 | link |
| 2025-03-25 | FALCONEye: Finding Answers and Localizing Content in ONE-hour-long videos with multi-modal LLMs | Carlos Plou et.al. | 2503.19850 | null |
| 2025-03-25 | A Comparative Analysis of Word Segmentation, Part-of-Speech Tagging, and Named Entity Recognition for Historical Chinese Sources, 1900-1950 | Zhao Fang et.al. | 2503.19844 | null |
| 2025-03-25 | FireEdit: Fine-grained Instruction-based Image Editing via Region-aware Vision Language Model | Jun Zhou et.al. | 2503.19839 | null |
| 2025-03-25 | Domain-incremental White Blood Cell Classification with Privacy-aware Continual Learning | Pratibha Kumari et.al. | 2503.19819 | null |
| 2025-03-25 | SeLIP: Similarity Enhanced Contrastive Language Image Pretraining for Multi-modal Head MRI | Zhiyang Liu et.al. | 2503.19801 | null |
| 2025-03-25 | SemEval-2025 Task 9: The Food Hazard Detection Challenge | Korbinian Randl et.al. | 2503.19800 | null |
| 2025-03-25 | PAVE: Patching and Adapting Video Large Language Models | Zhuoming Liu et.al. | 2503.19794 | link |
| 2025-03-25 | Fine-Grained Erasure in Text-to-Image Diffusion-based Foundation Models | Kartik Thakral et.al. | 2503.19783 | null |
| 2025-03-25 | LPOSS: Label Propagation Over Patches and Pixels for Open-vocabulary Semantic Segmentation | Vladan Stojnić et.al. | 2503.19777 | link |
| 2025-03-25 | OpenLex3D: A New Evaluation Benchmark for Open-Vocabulary 3D Scene Representations | Christina Kassab et.al. | 2503.19764 | null |
| 2025-03-24 | DINO in the Room: Leveraging 2D Foundation Models for 3D Segmentation | Karim Abou Zeid et.al. | 2503.18944 | link |
| 2025-03-24 | SlowFast-LLaVA-1.5: A Family of Token-Efficient Video Large Language Models for Long-Form Video Understanding | Mingze Xu et.al. | 2503.18943 | null |
| 2025-03-24 | Video-T1: Test-Time Scaling for Video Generation | Fangfu Liu et.al. | 2503.18942 | null |
| 2025-03-24 | Exploring Training and Inference Scaling Laws in Generative Retrieval | Hongru Cai et.al. | 2503.18941 | link |
| 2025-03-24 | CoMP: Continual Multimodal Pre-training for Vision Foundation Models | Yitong Chen et.al. | 2503.18931 | link |
| 2025-03-24 | Trajectory Balance with Asynchrony: Decoupling Exploration and Learning for Fast, Scalable LLM Post-Training | Brian R. Bartoldson et.al. | 2503.18929 | null |
| 2025-03-24 | Video SimpleQA: Towards Factuality Evaluation in Large Video Language Models | Meng Cao et.al. | 2503.18923 | null |
| 2025-03-24 | FFN Fusion: Rethinking Sequential Computation in Large Language Models | Akhiad Bercovich et.al. | 2503.18908 | null |
| 2025-03-24 | xKV: Cross-Layer SVD for KV-Cache Compression | Chi-Chih Chang et.al. | 2503.18893 | link |
| 2025-03-24 | AgentDropout: Dynamic Agent Elimination for Token-Efficient and High-Performance LLM-Based Multi-Agent Collaboration | Zhexuan Wang et.al. | 2503.18891 | link |
| 2025-03-24 | Toward building next-generation Geocoding systems: a systematic review | Zhengcong Yin et.al. | 2503.18888 | null |
| 2025-03-24 | I Have Covered All the Bases Here: Interpreting Reasoning Features in Large Language Models via Sparse Autoencoders | Andrey Galichin et.al. | 2503.18878 | link |
| 2025-03-24 | Efficient Self-Supervised Adaptation for Medical Image Analysis | Moein Sorkhei et.al. | 2503.18873 | link |
| 2025-03-24 | Reimagining Memory Access for LLM Inference: Compression-Aware Memory Controller Design | Rui Xie et.al. | 2503.18869 | null |
| 2025-03-24 | Reasoning to Learn from Latent Thoughts | Yangjun Ruan et.al. | 2503.18866 | null |
| 2025-03-24 | Structuring Scientific Innovation: A Framework for Modeling and Discovering Impactful Knowledge Combinations | Junlan Chen et.al. | 2503.18865 | null |
| 2025-03-24 | MC-LLaVA: Multi-Concept Personalized Vision-Language Model | Ruichuan An et.al. | 2503.18854 | link |
| 2025-03-24 | Enhanced OoD Detection through Cross-Modal Alignment of Multi-Modal Representations | Jeonghyeon Kim et.al. | 2503.18817 | link |
| 2025-03-24 | Defeating Prompt Injections by Design | Edoardo Debenedetti et.al. | 2503.18813 | null |
| 2025-03-24 | SKDU at De-Factify 4.0: Vision Transformer with Data Augmentation for AI-Generated Image Detection | Shrikant Malviya et.al. | 2503.18812 | link |
| 2025-03-21 | Dancing with Critiques: Enhancing LLM Reasoning with Stepwise Natural Language Self-Critique | Yansi Li et.al. | 2503.17363 | null |
| 2025-03-21 | HCAST: Human-Calibrated Autonomy Software Tasks | David Rein et.al. | 2503.17354 | link |
| 2025-03-21 | NdLinear Is All You Need for Representation Learning | Alex Reneau et.al. | 2503.17353 | link |
| 2025-03-21 | OpenVLThinker: An Early Exploration to Complex Vision-Language Reasoning via Iterative Self-Improvement | Yihe Deng et.al. | 2503.17352 | link |
| 2025-03-21 | Beyond Semantics: Rediscovering Spatial Awareness in Vision-Language Models | Jianing Qi et.al. | 2503.17349 | null |
| 2025-03-21 | Capturing Individual Human Preferences with Reward Features | André Barreto et.al. | 2503.17338 | null |
| 2025-03-21 | Efficient Intent-Based Filtering for Multi-Party Conversations Using Knowledge Distillation from LLMs | Reem Gody et.al. | 2503.17336 | null |
| 2025-03-21 | CVE-Bench: A Benchmark for AI Agents' Ability to Exploit Real-World Web Application Vulnerabilities | Yuxuan Zhu et.al. | 2503.17332 | link |
| 2025-03-21 | LLM+MAP: Bimanual Robot Task Planning using Large Language Models and Planning Domain Definition Language | Kun Chu et.al. | 2503.17309 | link |
| 2025-03-21 | Bugdar: AI-Augmented Secure Code Review for GitHub Pull Requests | John Naulty et.al. | 2503.17302 | null |
| 2025-03-21 | FastCuRL: Curriculum Reinforcement Learning with Progressive Context Extension for Efficient Training R1-like Reasoning Models | Mingyang Song et.al. | 2503.17287 | link |
| 2025-03-21 | CASE -- Condition-Aware Sentence Embeddings for Conditional Semantic Textual Similarity Measurement | Gaifan Zhang et.al. | 2503.17279 | null |
| 2025-03-21 | Revisiting End To End Sparse Autoencoder Training -- A Short Finetune is All You Need | Adam Karvonen et.al. | 2503.17272 | link |
| 2025-03-21 | SafeMERGE: Preserving Safety Alignment in Fine-Tuned Large Language Models via Selective Layer-Wise Model Merging | Aladin Djuhera et.al. | 2503.17239 | link |
| 2025-03-21 | Slide-Level Prompt Learning with Vision Language Models for Few-Shot Multiple Instance Learning in Histopathology | Devavrat Tomar et.al. | 2503.17238 | link |
| 2025-03-21 | FactSelfCheck: Fact-Level Black-Box Hallucination Detection for LLMs | Albert Sawczyn et.al. | 2503.17229 | null |
| 2025-03-21 | Automating Adjudication of Cardiovascular Events Using Large Language Models | Sonish Sivarajkumar et.al. | 2503.17222 | null |
| 2025-03-21 | A Language Anchor-Guided Method for Robust Noisy Domain Generalization | Zilin Dai et.al. | 2503.17211 | null |
| 2025-03-21 | TreeSynth: Synthesizing Diverse Data from Scratch via Tree-Guided Subspace Partitioning | Sheng Wang et.al. | 2503.17195 | null |
| 2025-03-21 | LLMs Love Python: A Study of LLMs' Bias for Programming Languages and Libraries | Lukas Twist et.al. | 2503.17181 | link |
| 2025-03-20 | DynamicVis: An Efficient and General Visual Foundation Model for Remote Sensing Image Understanding | Keyan Chen et.al. | 2503.16426 | link |
| 2025-03-20 | Stop Overthinking: A Survey on Efficient Reasoning for Large Language Models | Yang Sui et.al. | 2503.16419 | link |
| 2025-03-20 | M3: 3D-Spatial MultiModal Memory | Xueyan Zou et.al. | 2503.16413 | link |
| 2025-03-20 | The Emperor's New Clothes in Benchmarking? A Rigorous Examination of Mitigation Strategies for LLM Benchmark Data Contamination | Yifan Sun et.al. | 2503.16402 | link |
| 2025-03-20 | Exploring the Hidden Reasoning Process of Large Language Models by Misleading Them | Guanyu Chen et.al. | 2503.16401 | null |
| 2025-03-20 | Deconstructing Long Chain-of-Thought: A Structured Reasoning Optimization Framework for Long CoT Distillation | Yijia Luo et.al. | 2503.16385 | link |
| 2025-03-20 | LaPIG: Cross-Modal Generation of Paired Thermal and Visible Facial Images | Leyang Wang et.al. | 2503.16376 | null |
| 2025-03-20 | JARVIS-VLA: Post-Training Large-Scale Vision Language Models to Play Visual Games with Keyboards and Mouse | Muyao Li et.al. | 2503.16365 | null |
| 2025-03-20 | CaKE: Circuit-aware Editing Enables Generalizable Knowledge Learners | Yunzhi Yao et.al. | 2503.16356 | link |
| 2025-03-20 | Lyra: An Efficient and Expressive Subquadratic Architecture for Modeling Biological Sequences | Krithik Ramesh et.al. | 2503.16351 | null |
| 2025-03-20 | LLM Braces: Straightening Out LLM Predictions with Relevant Sub-Updates | Ying Shen et.al. | 2503.16334 | null |
| 2025-03-20 | OmniGeo: Towards a Multimodal Large Language Models for Geospatial Artificial Intelligence | Long Yuan et.al. | 2503.16326 | null |
| 2025-03-20 | Issue2Test: Generating Reproducing Test Cases from Issue Reports | Noor Nashid et.al. | 2503.16320 | null |
| 2025-03-20 | Bridging Technology and Humanities: Evaluating the Impact of Large Language Models on Social Sciences Research with DeepSeek-R1 | Peiran Gu et.al. | 2503.16304 | null |
| 2025-03-20 | Generalized Few-shot 3D Point Cloud Segmentation with Vision-Language Model | Zhaochong An et.al. | 2503.16282 | link |
| 2025-03-20 | Uni-3DAR: Unified 3D Generation and Understanding via Autoregression on Compressed Spatial Tokens | Shuqi Lu et.al. | 2503.16278 | link |
| 2025-03-20 | Chain of Functions: A Programmatic Pipeline for Fine-Grained Chart Reasoning Data | Zijian Li et.al. | 2503.16260 | null |
| 2025-03-20 | Plug-and-Play 1.x-Bit KV Cache Quantization for Video Large Language Models | Keda Tao et.al. | 2503.16257 | null |
| 2025-03-21 | Fin-R1: A Large Language Model for Financial Reasoning through Reinforcement Learning | Zhaowei Liu et.al. | 2503.16252 | link |
| 2025-03-20 | Reinforcement Learning for Reasoning in Small LLMs: What Works and What Doesn't | Quy-Anh Dang et.al. | 2503.16219 | link |
| 2025-03-19 | TULIP: Towards Unified Language-Image Pretraining | Zineng Tang et.al. | 2503.15485 | null |
| 2025-03-19 | SWEET-RL: Training Multi-Turn LLM Agents on Collaborative Reasoning Tasks | Yifei Zhou et.al. | 2503.15478 | link |
| 2025-03-19 | What Makes a Reward Model a Good Teacher? An Optimization Perspective | Noam Razin et.al. | 2503.15477 | link |
| 2025-03-19 | Cube: A Roblox View of 3D Intelligence | Foundation AI Team et.al. | 2503.15475 | link |
| 2025-03-19 | EgoDTM: Towards 3D-Aware Egocentric Video-Language Pretraining | Boshen Xu et.al. | 2503.15470 | link |
| 2025-03-19 | From 1,000,000 Users to Every User: Scaling Up Personalized Preference for User-level Alignment | Jia-Nan Li et.al. | 2503.15463 | link |
| 2025-03-19 | SkyLadder: Better and Faster Pretraining via Context Window Scheduling | Tongyao Zhu et.al. | 2503.15450 | link |
| 2025-03-19 | VenusFactory: A Unified Platform for Protein Engineering Data Retrieval and Language Model Fine-Tuning | Yang Tan et.al. | 2503.15438 | link |
| 2025-03-19 | Visual Position Prompt for MLLM based Visual Grounding | Wei Tang et.al. | 2503.15426 | link |
| 2025-03-19 | Probing the topology of the space of tokens with structured prompts | Michael Robinson et.al. | 2503.15421 | null |
| 2025-03-19 | Visual Persona: Foundation Model for Full-Body Human Customization | Jisu Nam et.al. | 2503.15406 | null |
| 2025-03-19 | FedSCA: Federated Tuning with Similarity-guided Collaborative Aggregation for Heterogeneous Medical Image Segmentation | Yumin Zhang et.al. | 2503.15390 | null |
| 2025-03-19 | EfficientLLaVA: Generalizable Auto-Pruning for Large Vision-language Models | Yinan Liang et.al. | 2503.15369 | null |
| 2025-03-19 | SemEval-2025 Task 1: AdMIRe -- Advancing Multimodal Idiomaticity Representation | Thomas Pickard et.al. | 2503.15358 | null |
| 2025-03-19 | SPILL: Domain-Adaptive Intent Clustering based on Selection and Pooling with Large Language Models | I-Fan Lin et.al. | 2503.15351 | null |
| 2025-03-19 | TruthLens: A Training-Free Paradigm for DeepFake Detection | Ritabrata Chakraborty et.al. | 2503.15342 | null |
| 2025-03-19 | Uncertainty-Guided Chain-of-Thought for Code Generation with LLMs | Yuqi Zhu et.al. | 2503.15341 | null |
| 2025-03-19 | Solla: Towards a Speech-Oriented LLM That Hears Acoustic Context | Junyi Ao et.al. | 2503.15338 | link |
| 2025-03-19 | Recover and Match: Open-Vocabulary Multi-Label Recognition through Knowledge-Constrained Optimal Transport | Hao Tan et.al. | 2503.15337 | link |
| 2025-03-19 | Euclid Quick Data Release (Q1) Exploring galaxy properties with a multi-modal foundation model | Euclid Collaboration et.al. | 2503.15312 | link |
| 2025-03-18 | Aligning Multimodal LLM with Human Preference: A Survey | Tao Yu et.al. | 2503.14504 | link |
| 2025-03-18 | Engineering Scientific Assistants using Interactive Structured Induction of Programs | Shraddha Surana et.al. | 2503.14488 | null |
| 2025-03-18 | Gricean Norms as a Basis for Effective Collaboration | Fardin Saad et.al. | 2503.14484 | link |
| 2025-03-18 | Creation-MMBench: Assessing Context-Aware Creative Intelligence in MLLM | Xinyu Fang et.al. | 2503.14478 | link |
| 2025-03-18 | Characterizing Data Visualization Literacy: a Systematic Literature Review | Sara Beschi et.al. | 2503.14468 | null |
| 2025-03-18 | RWKV-7 "Goose" with Expressive Dynamic State Evolution | Bo Peng et.al. | 2503.14456 | link |
| 2025-03-18 | EnvBench: A Benchmark for Automated Environment Setup | Aleksandra Eliseeva et.al. | 2503.14443 | link |
| 2025-03-18 | LLM-FE: Automated Feature Engineering for Tabular Data with LLMs as Evolutionary Optimizers | Nikhil Abhyankar et.al. | 2503.14434 | link |
| 2025-03-18 | PLAY2PROMPT: Zero-shot Tool Instruction Optimization for LLM Agents via Tool Play | Wei Fang et.al. | 2503.14432 | null |
| 2025-03-18 | ExDDV: A New Dataset for Explainable Deepfake Detection in Video | Vlad Hondru et.al. | 2503.14421 | link |
| 2025-03-18 | Unifying Text Semantics and Graph Structures for Temporal Text-attributed Graphs with Large Language Models | Siwei Zhang et.al. | 2503.14411 | null |
| 2025-03-18 | Large Language Models for Virtual Human Gesture Selection | Parisa Ghanad Torshizi et.al. | 2503.14408 | null |
| 2025-03-18 | DUNE: Distilling a Universal Encoder from Heterogeneous 2D and 3D Teachers | Mert Bulent Sariyildiz et.al. | 2503.14405 | null |
| 2025-03-18 | From "Hallucination" to "Suture": Insights from Language Philosophy to Enhance Large Language Models | Qiantong Wang et.al. | 2503.14392 | null |
| 2025-03-18 | How much do LLMs learn from negative examples? | Shadi Hamdan et.al. | 2503.14391 | link |
| 2025-03-18 | Good/Evil Reputation Judgment of Celebrities by LLMs via Retrieval Augmented Generation | Rikuto Tsuchida et.al. | 2503.14382 | null |
| 2025-03-18 | On the Standard Performance Criteria for Applied Control Design: PID, MPC or Machine Learning Controller? | Pouria Sarhadi et.al. | 2503.14379 | link |
| 2025-03-18 | Tiled Flash Linear Attention: More Efficient Linear RNN and xLSTM Kernels | Maximilian Beck et.al. | 2503.14376 | link |
| 2025-03-18 | MAST-Pro: Dynamic Mixture-of-Experts for Adaptive Segmentation of Pan-Tumors with Knowledge-Driven Prompts | Runqi Meng et.al. | 2503.14355 | null |
| 2025-03-19 | MoonCast: High-Quality Zero-Shot Podcast Generation | Zeqian Ju et.al. | 2503.14345 | link |
| 2025-03-17 | MetaScale: Test-Time Scaling with Evolving Meta-Thoughts | Qin Liu et.al. | 2503.13447 | null |
| 2025-03-17 | MoManipVLA: Transferring Vision-language-action Models for General Mobile Manipulation | Zhenyu Wu et.al. | 2503.13446 | null |
| 2025-03-17 | Faithfulness of LLM Self-Explanations for Commonsense Tasks: Larger Is Better, and Instruction-Tuning Allows Trade-Offs but Not Pareto Dominance | Noah Y. Siegel et.al. | 2503.13445 | null |
| 2025-03-17 | VideoMind: A Chain-of-LoRA Agent for Long Video Reasoning | Ye Liu et.al. | 2503.13444 | link |
| 2025-03-17 | DPC: Dual-Prompt Collaboration for Tuning Vision-Language Models | Haoyang Li et.al. | 2503.13443 | link |
| 2025-03-18 | MaTVLM: Hybrid Mamba-Transformer for Efficient Vision-Language Modeling | Yingyue Li et.al. | 2503.13440 | link |
| 2025-03-17 | xLSTM 7B: A Recurrent LLM for Fast and Efficient Inference | Maximilian Beck et.al. | 2503.13427 | link |
| 2025-03-17 | SuperBPE: Space Travel for Language Models | Alisa Liu et.al. | 2503.13423 | null |
| 2025-03-17 | A Comprehensive Survey on Multi-Agent Cooperative Decision-Making: Scenarios, Approaches, Challenges and Perspectives | Weiqiang Jin et.al. | 2503.13415 | null |
| 2025-03-18 | DLPO: Towards a Robust, Efficient, and Generalizable Prompt Optimization Framework from a Deep-Learning Perspective | Dengyun Peng et.al. | 2503.13413 | link |
| 2025-03-17 | Using the Tools of Cognitive Science to Understand Large Language Models at Different Levels of Analysis | Alexander Ku et.al. | 2503.13401 | null |
| 2025-03-17 | MicroVQA: A Multimodal Reasoning Benchmark for Microscopy-Based Scientific Research | James Burgess et.al. | 2503.13399 | link |
| 2025-03-17 | Aligned Probing: Relating Toxic Behavior and Model Internals | Andreas Waldis et.al. | 2503.13390 | null |
| 2025-03-17 | Cream of the Crop: Harvesting Rich, Scalable and Transferable Multi-Modal Data for Instruction Fine-Tuning | Mengyao Lyu et.al. | 2503.13383 | null |
| 2025-03-17 | Sightation Counts: Leveraging Sighted User Feedback in Building a BLV-aligned Dataset of Diagram Descriptions | Wan Ju Kang et.al. | 2503.13369 | null |
| 2025-03-17 | Mitigating Visual Forgetting via Take-along Visual Conditioning for Multi-modal Long CoT Reasoning | Hai-Long Sun et.al. | 2503.13360 | null |
| 2025-03-17 | Agents Play Thousands of 3D Video Games | Zhongwen Xu et.al. | 2503.13356 | null |
| 2025-03-17 | Valid Text-to-SQL Generation with Unification-based DeepStochLog | Ying Jiao et.al. | 2503.13342 | link |
| 2025-03-17 | LearnMate: Enhancing Online Education with LLM-Powered Personalized Learning Plans and Support | Xinyu Jessica Wang et.al. | 2503.13340 | null |
| 2025-03-17 | Reliable and Efficient Amortized Model-based Evaluation | Sang Truong et.al. | 2503.13335 | null |
| 2025-03-14 | Tit-for-Tat: Safeguarding Large Vision-Language Models Against Jailbreak Attacks via Adversarial Defense | Shuyang Hao et.al. | 2503.11619 | null |
| 2025-03-14 | ASMA-Tune: Unlocking LLMs' Assembly Code Comprehension via Structural-Semantic Instruction Tuning | Xinyi Wang et.al. | 2503.11617 | link |
| 2025-03-14 | Rethinking Few-Shot Adaptation of Vision-Language Models in Two Stages | Matteo Farina et.al. | 2503.11609 | link |
| 2025-03-14 | Do Construction Distributions Shape Formal Language Learning In German BabyLMs? | Bastian Bunzeck et.al. | 2503.11593 | null |
| 2025-03-14 | Pathology Image Compression with Pre-trained Autoencoders | Srikar Yellapragada et.al. | 2503.11591 | null |
| 2025-03-14 | Broaden your SCOPE! Efficient Multi-turn Conversation Planning for LLMs using Semantic Space | Zhiliang Chen et.al. | 2503.11586 | link |
| 2025-03-14 | SmolDocling: An ultra-compact vision-language model for end-to-end multi-modal document conversion | Ahmed Nassar et.al. | 2503.11576 | null |
| 2025-03-14 | Synthesizing Access Control Policies using Large Language Models | Adarsh Vatsa et.al. | 2503.11573 | null |
| 2025-03-14 | Implicit Bias-Like Patterns in Reasoning Models | Messi H. J. Lee et.al. | 2503.11572 | null |
| 2025-03-14 | VERIFY: A Benchmark of Visual Explanation and Reasoning for Investigating Multimodal Reasoning Fidelity | Jing Bi et.al. | 2503.11557 | null |
| 2025-03-14 | Similarity-Aware Token Pruning: Your VLM but Faster | Ahmadreza Jeddi et.al. | 2503.11549 | link |
| 2025-03-14 | Potential of large language model-powered nudges for promoting daily water and energy conservation | Zonghan Li et.al. | 2503.11531 | null |
| 2025-03-14 | Exploring Typographic Visual Prompts Injection Threats in Cross-Modality Generation Models | Hao Cheng et.al. | 2503.11519 | null |
| 2025-03-14 | HiTVideo: Hierarchical Tokenizers for Enhancing Text-to-Video Generation with Autoregressive Large Language Models | Ziqin Zhou et.al. | 2503.11513 | null |
| 2025-03-14 | V-STaR: Benchmarking Video-LLMs on Video Spatio-Temporal Reasoning | Zixu Cheng et.al. | 2503.11495 | null |
| 2025-03-14 | A Review of DeepSeek Models' Key Innovative Techniques | Chengen Wang et.al. | 2503.11486 | null |
| 2025-03-14 | Integrating LLMs in Gamified Systems | Carlos J. Costa et.al. | 2503.11458 | null |
| 2025-03-14 | D3: Diversity, Difficulty, and Dependability-Aware Data Selection for Sample-Efficient LLM Instruction Tuning | Jia Zhang et.al. | 2503.11441 | null |
| 2025-03-14 | Text Compression for Efficient Language Generation | David Gu et.al. | 2503.11426 | null |
| 2025-03-14 | Empowering Time Series Analysis with Synthetic Data: A Survey and Outlook in the Era of Foundation Models | Xu Liu et.al. | 2503.11411 | null |
| 2025-03-13 | GoT: Unleashing Reasoning Capability of Multimodal Large Language Model for Visual Generation and Editing | Rongyao Fang et.al. | 2503.10639 | link |
| 2025-03-13 | A Frustratingly Simple Yet Highly Effective Attack Baseline: Over 90% Success Rate Against the Strong Black-box Models of GPT-4.5/4o/o1 | Zhaoyi Li et.al. | 2503.10635 | link |
| 2025-03-13 | HybridVLA: Collaborative Diffusion and Autoregression in a Unified Vision-Language-Action Model | Jiaming Liu et.al. | 2503.10631 | null |
| 2025-03-13 | UniGoal: Towards Universal Zero-shot Goal-oriented Navigation | Hang Yin et.al. | 2503.10630 | null |
| 2025-03-13 | Transformers without Normalization | Jiachen Zhu et.al. | 2503.10622 | null |
| 2025-03-13 | From TOWER to SPIRE: Adding the Speech Modality to a Text-Only LLM | Kshitij Ambilduke et.al. | 2503.10620 | link |
| 2025-03-13 | Siege: Autonomous Multi-Turn Jailbreaking of Large Language Models with Tree Search | Andy Zhou et.al. | 2503.10619 | null |
| 2025-03-13 | Compositional Subspace Representation Fine-tuning for Adaptive Large Language Models | Andy Zhou et.al. | 2503.10617 | null |
| 2025-03-13 | R1-Onevision: Advancing Generalized Multimodal Reasoning through Cross-Modal Formalization | Yi Yang et.al. | 2503.10615 | link |
| 2025-03-13 | CoSTA | Advait Gupta et.al. | 2503.10613 | link |
| 2025-03-13 | TruthPrInt: Mitigating LVLM Object Hallucination Via Latent Truthful-Guided Pre-Intervention | Jinhao Duan et.al. | 2503.10602 | link |
| 2025-03-13 | GroundingSuite: Measuring Complex Multi-Granular Pixel Grounding | Rui Hu et.al. | 2503.10596 | link |
| 2025-03-13 | Unlock the Power of Unlabeled Data in Language Driving Model | Chaoqun Wang et.al. | 2503.10586 | null |
| 2025-03-13 | VisualWebInstruct: Scaling up Multimodal Instruction Data through Web Search | Yiming Jia et.al. | 2503.10582 | null |
| 2025-03-13 | Unveiling the Mathematical Reasoning in DeepSeek Models: A Comparative Study of Large Language Models | Afrar Jahin et.al. | 2503.10573 | null |
| 2025-03-13 | ASIDE: Architectural Separation of Instructions and Data in Language Models | Egor Zverev et.al. | 2503.10566 | null |
| 2025-03-13 | Short-term AI literacy intervention does not reduce over-reliance on incorrect ChatGPT recommendations | Brett Puppart et.al. | 2503.10556 | null |
| 2025-03-13 | KUDA: Keypoints to Unify Dynamics Learning and Visual Prompting for Open-Vocabulary Robotic Manipulation | Zixian Liu et.al. | 2503.10546 | null |
| 2025-03-13 | DP-GPL: Differentially Private Graph Prompt Learning | Jing Xu et.al. | 2503.10544 | null |
| 2025-03-13 | Language Models, Graph Searching, and Supervision Adulteration: When More Supervision is Less and How to Make More More | Arvid Frydenlund et.al. | 2503.10542 | null |
| 2025-03-12 | MoC: Mixtures of Text Chunking Learners for Retrieval-Augmented Generation System | Jihao Zhao et.al. | 2503.09600 | link |
| 2025-03-12 | How to Protect Yourself from 5G Radiation? Investigating LLM Responses to Implicit Misinformation | Ruohao Guo et.al. | 2503.09598 | link |
| 2025-03-12 | SimLingo: Vision-Only Closed-Loop Autonomous Driving with Language-Action Alignment | Katrin Renz et.al. | 2503.09594 | null |
| 2025-03-12 | BIMBA: Selective-Scan Compression for Long-Range Video Question Answering | Md Mohaiminul Islam et.al. | 2503.09590 | link |
| 2025-03-12 | Cost-Optimal Grouped-Query Attention for Long-Context LLMs | Yingfa Chen et.al. | 2503.09579 | link |
| 2025-03-12 | Block Diffusion: Interpolating Between Autoregressive and Diffusion Language Models | Marianne Arriola et.al. | 2503.09573 | link |
| 2025-03-12 | Plan-and-Act: Improving Planning of Agents for Long-Horizon Tasks | Lutfi Eren Erdogan et.al. | 2503.09572 | null |
| 2025-03-13 | Towards Reasoning Era: A Survey of Long Chain-of-Thought for Reasoning Large Language Models | Qiguang Chen et.al. | 2503.09567 | null |
| 2025-03-12 | PolyPythias: Stability and Outliers across Fifty Language Model Pre-Training Runs | Oskar van der Wal et.al. | 2503.09543 | link |
| 2025-03-13 | Large Language Models for Multi-Facility Location Mechanism Design | Nguyen Thach et.al. | 2503.09533 | null |
| 2025-03-13 | SAEBench: A Comprehensive Benchmark for Sparse Autoencoders in Language Model Interpretability | Adam Karvonen et.al. | 2503.09532 | null |
| 2025-03-12 | Search-R1: Training LLMs to Reason and Leverage Search Engines with Reinforcement Learning | Bowen Jin et.al. | 2503.09516 | link |
| 2025-03-12 | Reinforcement Learning is all You Need | Yongsheng Lian et.al. | 2503.09512 | null |
| 2025-03-12 | ReMA: Learning to Meta-think for LLMs with Multi-Agent Reinforcement Learning | Ziyu Wan et.al. | 2503.09501 | link |
| 2025-03-12 | MindGYM: Enhancing Vision-Language Models via Synthetic Self-Challenging Questions | Zhe Xu et.al. | 2503.09499 | link |
| 2025-03-12 | Parameter-Efficient Adaptation of Geospatial Foundation Models through Embedding Deflection | Romain Thoreau et.al. | 2503.09493 | null |
| 2025-03-12 | Project-Probe-Aggregate: Efficient Fine-Tuning for Group Robustness | Beier Zhu et.al. | 2503.09487 | null |
| 2025-03-12 | BAMBI: Developing Baby Language Models for Italian | Alice Suozzi et.al. | 2503.09481 | null |
| 2025-03-12 | SurgicalVLM-Agent: Towards an Interactive AI Co-Pilot for Pituitary Surgery | Jiayuan Huang et.al. | 2503.09474 | null |
| 2025-03-12 | Explicit Learning and the LLM in Machine Translation | Malik Marmonier et.al. | 2503.09454 | link |
| 2025-03-11 | QuoTA: Query-oriented Token Assignment via CoT Query Decouple for Long Video Comprehension | Yongdong Luo et.al. | 2503.08689 | link |
| 2025-03-11 | Randomness, Not Representation: The Unreliability of Evaluating Cultural Alignment in LLMs | Ariba Khan et.al. | 2503.08688 | link |
| 2025-03-11 | Perplexity Trap: PLM-Based Retrievers Overrate Low Perplexity Documents | Haoyu Wang et.al. | 2503.08684 | link |
| 2025-03-11 | Self-Taught Self-Correction for Small Language Models | Viktor Moskvoretskii et.al. | 2503.08681 | null |
| 2025-03-11 | Understanding and Mitigating Distribution Shifts For Machine Learning Force Fields | Tobias Kreiman et.al. | 2503.08674 | null |
| 2025-03-11 | Generating Robot Constitutions & Benchmarks for Semantic Safety | Pierre Sermanet et.al. | 2503.08663 | null |
| 2025-03-11 | Exploring the Word Sense Disambiguation Capabilities of Large Language Models | Pierpaolo Basile et.al. | 2503.08662 | null |
| 2025-03-11 | YuE: Scaling Open Foundation Models for Long-Form Music Generation | Ruibin Yuan et.al. | 2503.08638 | link |
| 2025-03-11 | LightGen: Efficient Image Generation through Knowledge Distillation and Direct Preference Optimization | Xianfeng Wu et.al. | 2503.08619 | link |
| 2025-03-11 | EMMOE: A Comprehensive Benchmark for Embodied Mobile Manipulation in Open Environments | Dongping Li et.al. | 2503.08604 | link |
| 2025-03-11 | NSF-SciFy: Mining the NSF Awards Database for Scientific Claims | Delip Rao et.al. | 2503.08600 | null |
| 2025-03-11 | Proc4Gem: Foundation models for physical agency through procedural generation | Yixin Lin et.al. | 2503.08593 | null |
| 2025-03-11 | BiasEdit: Debiasing Stereotyped Language Models via Model Editing | Xin Xu et.al. | 2503.08588 | link |
| 2025-03-11 | HierarQ: Task-Aware Hierarchical Q-Former for Enhanced Video Understanding | Shehreen Azad et.al. | 2503.08585 | null |
| 2025-03-11 | RAG-Adapter: A Plug-and-Play RAG-enhanced Framework for Long Video Understanding | Xichen Tan et.al. | 2503.08576 | null |
| 2025-03-11 | DeepReview: Improving LLM-based Paper Review with Human-like Deep Thinking Process | Minjun Zhu et.al. | 2503.08569 | null |
| 2025-03-11 | Reasoning and Sampling-Augmented MCQ Difficulty Prediction via LLMs | Wanyong Feng et.al. | 2503.08551 | null |
| 2025-03-11 | Transferring Extreme Subword Style Using Ngram Model-Based Logit Scaling | Craig Messner et.al. | 2503.08550 | null |
| 2025-03-11 | Graph of AI Ideas: Leveraging Knowledge Graphs and LLMs for AI Research Idea Generation | Xian Gao et.al. | 2503.08549 | null |
| 2025-03-11 | TLA: Tactile-Language-Action Model for Contact-Rich Manipulation | Peng Hao et.al. | 2503.08548 | null |
| 2025-03-10 | Robusto-1 Dataset: Comparing Humans and VLMs on real out-of-distribution Autonomous Driving VQA from Peru | Dunant Cusipuma et.al. | 2503.07587 | null |
| 2025-03-10 | Talking to GDELT Through Knowledge Graphs | Audun Myers et.al. | 2503.07584 | null |
| 2025-03-10 | VisBias: Measuring Explicit and Implicit Social Biases in Vision Language Models | Jen-tse Huang et.al. | 2503.07575 | link |
| 2025-03-10 | AutoSpatial: Visual-Language Reasoning for Social Robot Navigation through Efficient Spatial Reasoning Learning | Yangzhe Kong et.al. | 2503.07557 | null |
| 2025-03-10 | Junior Software Developers' Perspectives on Adopting LLMs for Software Engineering: a Systematic Literature Review | Samuel Ferino et.al. | 2503.07556 | null |
| 2025-03-10 | KSOD: Knowledge Supplement for LLMs On Demand | Haoran Li et.al. | 2503.07550 | null |
| 2025-03-10 | Bi-Directional Mental Model Reconciliation for Human-Robot Interaction with Large Language Models | Nina Moorman et.al. | 2503.07547 | null |
| 2025-03-10 | Queueing, Predictions, and LLMs: Challenges and Open Problems | Michael Mitzenmacher et.al. | 2503.07545 | null |
| 2025-03-10 | XIFBench: Evaluating Large Language Models on Multilingual Instruction Following | Zhenyu Li et.al. | 2503.07539 | null |
| 2025-03-10 | Building English ASR model with regional language support | Purvi Agrawal et.al. | 2503.07522 | null |
| 2025-03-10 | GRITHopper: Decomposition-Free Multi-Hop Dense Retrieval | Justus-Jonas Erker et.al. | 2503.07519 | link |
| 2025-03-10 | TokenButler: Token Importance is Predictable | Yash Akhauri et.al. | 2503.07518 | link |
| 2025-03-10 | Language Models Fail to Introspect About Their Knowledge of Language | Siyuan Song et.al. | 2503.07513 | link |
| 2025-03-10 | Plume: Scaffolding Text Composition in Dashboards | Maxim Lisnic et.al. | 2503.07512 | null |
| 2025-03-10 | Sometimes the Model doth Preach: Quantifying Religious Bias in Open LLMs through Demographic Analysis in Asian Nations | Hari Shankar et.al. | 2503.07510 | link |
| 2025-03-10 | Think Before You Segment: High-Quality Reasoning Segmentation with GPT Chain of Thoughts | Shiu-hong Kao et.al. | 2503.07503 | null |
| 2025-03-10 | V2Flow: Unifying Visual Tokenization and Large Language Model Vocabularies for Autoregressive Image Generation | Guiwei Zhang et.al. | 2503.07493 | link |
| 2025-03-10 | LLaVA-RadZ: Can Multimodal Large Language Models Effectively Tackle Zero-shot Radiology Recognition? | Bangyan Li et.al. | 2503.07487 | null |
| 2025-03-10 | Chameleon: Fast-slow Neuro-symbolic Lane Topology Extraction | Zongzheng Zhang et.al. | 2503.07485 | link |
| 2025-03-10 | VLRMBench: A Comprehensive and Challenging Benchmark for Vision-Language Reward Models | Jiacheng Ruan et.al. | 2503.07478 | link |
| 2025-03-10 | Advancing Vietnamese Information Retrieval with Learning Objective and Benchmark | Phu-Vinh Nguyen et.al. | 2503.07470 | null |
| 2025-03-10 | YOLOE: Real-Time Seeing Anything | Ao Wang et.al. | 2503.07465 | link |
| 2025-03-10 | GenAIReading: Augmenting Human Cognition with Interactive Digital Textbooks Using Large Language Models and Image Generation Models | Ryugo Morita et.al. | 2503.07463 | null |
| 2025-03-10 | MedAgentsBench: Benchmarking Thinking Models and Agent Frameworks for Complex Medical Reasoning | Xiangru Tang et.al. | 2503.07459 | link |
| 2025-03-10 | LLMs syntactically adapt their language use to their conversational partner | Florian Kandra et.al. | 2503.07457 | null |
| 2025-03-10 | Is a Good Foundation Necessary for Efficient Reinforcement Learning? The Computational Role of the Base Model in Exploration | Dylan J. Foster et.al. | 2503.07453 | null |
| 2025-03-10 | From Idea to Implementation: Evaluating the Influence of Large Language Models in Software Development -- An Opinion Paper | Sargam Yadav et.al. | 2503.07450 | null |
| 2025-03-10 | From Text to Visuals: Using LLMs to Generate Math Diagrams with Vector Graphics | Jaewook Lee et.al. | 2503.07429 | null |
| 2025-03-10 | RePO: ReLU-based Preference Optimization | Junkang Wu et.al. | 2503.07426 | link |
| 2025-03-10 | REF-VLM: Triplet-Based Referring Paradigm for Unified Visual Decoding | Yan Tai et.al. | 2503.07413 | link |
| 2025-03-10 | Towards Safe Robot Foundation Models | Maximilian Tölle et.al. | 2503.07404 | null |
| 2025-03-10 | Keeping Representation Similarity in Finetuning for Medical Image Analysis | Wenqiang Zu et.al. | 2503.07399 | null |
| 2025-03-10 | Revisiting Noise in Natural Language Processing for Computational Social Science | Nadav Borenstein et.al. | 2503.07395 | null |
| 2025-03-10 | Is My Text in Your AI Model? Gradient-based Membership Inference Test applied to LLMs | Gonzalo Mancera et.al. | 2503.07384 | null |
| 2025-03-10 | Process-Supervised LLM Recommenders via Flow-guided Tuning | Chongming Gao et.al. | 2503.07377 | link |
| 2025-03-10 | Artificial Utopia: Simulation and Intelligent Agents for a Democratised Future | Yannick Oswald et.al. | 2503.07364 | null |
| 2025-03-07 | Fairness-Aware Low-Rank Adaptation Under Demographic Privacy Constraints | Parameswaran Kamalaruban et.al. | 2503.05684 | null |
| 2025-03-07 | Understanding the Limits of Lifelong Knowledge Editing in LLMs | Lukas Thede et.al. | 2503.05683 | null |
| 2025-03-07 | A Survey of Large Language Model Empowered Agents for Recommendation and Search: Towards Next-Generation Information Retrieval | Yu Zhang et.al. | 2503.05659 | link |
| 2025-03-07 | Learning LLM Preference over Intra-Dialogue Pairs: A Framework for Utterance-level Understandings | Xuanqing Liu et.al. | 2503.05620 | null |
| 2025-03-07 | A Survey on Sparse Autoencoders: Interpreting the Internal Mechanisms of Large Language Models | Dong Shu et.al. | 2503.05613 | null |
| 2025-03-07 | From Theory to Application: A Practical Introduction to Neural Operators in Scientific Computing | Prashant K. Jha et.al. | 2503.05598 | link |
| 2025-03-07 | R1-Searcher: Incentivizing the Search Capability in LLMs via Reinforcement Learning | Huatong Song et.al. | 2503.05592 | null |
| 2025-03-07 | Quantifying the Robustness of Retrieval-Augmented Language Models Against Spurious Features in Grounding Data | Shiping Yang et.al. | 2503.05587 | null |
| 2025-03-07 | Evaluating open-source Large Language Models for automated fact-checking | Nicolo' Fontana et.al. | 2503.05565 | null |
| 2025-03-07 | Revitalizing Saturated Benchmarks: A Weighted Metric Approach for Differentiating Large Language Model Performance | Bryan Etzine et.al. | 2503.05551 | null |
| 2025-03-07 | Leveraging Approximate Caching for Faster Retrieval-Augmented Generation | Shai Bergman et.al. | 2503.05530 | null |
| 2025-03-07 | PoSSUM: A Protocol for Surveying Social-media Users with Multimodal LLMs | Roberto Cerina et.al. | 2503.05529 | null |
| 2025-03-07 | Cognitive Bias Detection Using Advanced Prompt Engineering | Frederic Lemieux et.al. | 2503.05516 | null |
| 2025-03-07 | Grammar-Based Code Representation: Is It a Worthy Pursuit for LLMs? | Qingyuan Liang et.al. | 2503.05507 | null |
| 2025-03-07 | Statistical Guarantees of Correctness Coverage for Medical Multiple-Choice Question Answering | Yusong Ke et.al. | 2503.05505 | null |
| 2025-03-07 | Benchmarking LLMs in Recommendation Tasks: A Comparative Evaluation with Conventional Recommenders | Qijiong Liu et.al. | 2503.05493 | null |
| 2025-03-07 | Maximum Hallucination Standards for Domain-Specific Large Language Models | Tingmingke Lu et.al. | 2503.05481 | null |
| 2025-03-07 | The Society of HiveMind: Multi-Agent Optimization of Foundation Model Swarms to Unlock the Potential of Collective Intelligence | Noah Mamie et.al. | 2503.05473 | null |
| 2025-03-07 | Soft Policy Optimization: Online Off-Policy RL for Sequence Models | Taco Cohen et.al. | 2503.05453 | null |
| 2025-03-07 | LLM-based Iterative Approach to Metamodeling in Automotive | Nenad Petrovic et.al. | 2503.05449 | null |
| 2025-03-06 | L | Zhuo Chen et.al. | 2503.04725 | link |
| 2025-03-06 | LLMVoX: Autoregressive Streaming Text-to-Speech Model for Any LLM | Sambal Shikhar et.al. | 2503.04724 | null |
| 2025-03-07 | Shifting Long-Context LLMs Research from Input to Output | Yuhao Wu et.al. | 2503.04723 | null |
| 2025-03-06 | Enough Coin Flips Can Make LLMs Act Bayesian | Ritwik Gupta et.al. | 2503.04722 | null |
| 2025-03-06 | Full-Duplex-Bench: A Benchmark to Evaluate Full-duplex Spoken Dialogue Models on Turn-taking Capabilities | Guan-Ting Lin et.al. | 2503.04721 | link |
| 2025-03-06 | Predictable Scale: Part I -- Optimal Hyperparameter Scaling Law in Large Language Model Pretraining | Houyi Li et.al. | 2503.04715 | null |
| 2025-03-06 | Scaling Rich Style-Prompted Text-to-Speech Datasets | Anuj Diwan et.al. | 2503.04713 | link |
| 2025-03-06 | Universality of Layer-Level Entropy-Weighted Quantization Beyond Model Architecture and Size | Alireza Behtash et.al. | 2503.04704 | null |
| 2025-03-06 | L1: Controlling How Long A Reasoning Model Thinks With Reinforcement Learning | Pranjal Aggarwal et.al. | 2503.04697 | null |
| 2025-03-06 | UIPE: Enhancing LLM Unlearning by Removing Knowledge Related to Forgetting Targets | Wenyu Wang et.al. | 2503.04693 | null |
| 2025-03-06 | Quantifying the Reasoning Abilities of LLMs on Real-world Clinical Cases | Pengcheng Qiu et.al. | 2503.04691 | null |
| 2025-03-06 | LLM-guided Plan and Retrieval: A Strategic Alignment for Interpretable User Satisfaction Estimation in Dialogue | Sangyeop Kim et.al. | 2503.04675 | null |
| 2025-03-06 | An Information-theoretic Multi-task Representation Learning Framework for Natural Language Understanding | Dou Hu et.al. | 2503.04667 | link |
| 2025-03-06 | CLDyB: Towards Dynamic Benchmarking for Continual Learning with Pre-trained Models | Shengzhuang Chen et.al. | 2503.04655 | link |
| 2025-03-06 | Transferable Foundation Models for Geometric Tasks on Point Cloud Representations: Geometric Neural Operators | Blaine Quackenbush et.al. | 2503.04649 | link |
| 2025-03-06 | Implicit Cross-Lingual Rewarding for Efficient Multilingual Preference Alignment | Wen Yang et.al. | 2503.04647 | link |
| 2025-03-06 | Enhancing SAM with Efficient Prompting and Preference Optimization for Semi-supervised Medical Image Segmentation | Aishik Konwer et.al. | 2503.04639 | null |
| 2025-03-06 | Mark Your LLM: Detecting the Misuse of Open-Source Large Language Models via Watermarking | Yijie Xu et.al. | 2503.04636 | null |
| 2025-03-06 | Better Process Supervision with Bi-directional Rewarding Signals | Wenxiang Chen et.al. | 2503.04618 | null |
| 2025-03-06 | Towards Data-Efficient Language Models: A Child-Inspired Approach to Language Learning | Mohammad Amin Ghanizadeh et.al. | 2503.04611 | null |
| 2025-03-05 | The MASK Benchmark: Disentangling Honesty From Accuracy in AI Systems | Richard Ren et.al. | 2503.03750 | null |
| 2025-03-05 | Process-based Self-Rewarding Language Models | Shimao Zhang et.al. | 2503.03746 | link |
| 2025-03-05 | CHOP: Mobile Operating Assistant with Constrained High-frequency Optimized Subtask Planning | Yuqi Zhou et.al. | 2503.03743 | link |
| 2025-03-05 | Towards Understanding Distilled Reasoning Models: A Representational Approach | David D. Baek et.al. | 2503.03730 | null |
| 2025-03-05 | Improving LLM Safety Alignment with Dual-Objective Optimization | Xuandong Zhao et.al. | 2503.03710 | link |
| 2025-03-05 | Effective LLM Knowledge Learning via Model Generalization | Mingkang Zhu et.al. | 2503.03705 | null |
| 2025-03-05 | A Practical Memory Injection Attack against LLM Agents | Shen Dong et.al. | 2503.03704 | null |
| 2025-03-05 | Developing and Utilizing a Large-Scale Cantonese Dataset for Multi-Tasking in Large Language Models | Jiyue Jiang et.al. | 2503.03702 | null |
| 2025-03-05 | Addressing Overprescribing Challenges: Fine-Tuning Large Language Models for Medication Recommendation Tasks | Zihao Zhao et.al. | 2503.03687 | link |
| 2025-03-05 | Attentive Reasoning Queries: A Systematic Method for Optimizing Instruction-Following in Large Language Models | Bar Karov et.al. | 2503.03669 | link |
| 2025-03-05 | Analogical Reasoning Inside Large Language Models: Concept Vectors and the Limits of Abstraction | Gustaw Opiełka et.al. | 2503.03666 | link |
| 2025-03-05 | Robust Learning of Diverse Code Edits | Tushar Aggarwal et.al. | 2503.03656 | null |
| 2025-03-05 | Improving Neutral Point of View Text Generation through Parameter-Efficient Reinforcement Learning and a Small-Scale High-Quality Dataset | Jessica Hoffmann et.al. | 2503.03654 | null |
| 2025-03-05 | Token-Level Privacy in Large Language Models | Re'em Harel et.al. | 2503.03652 | null |
| 2025-03-05 | Psy-Copilot: Visual Chain of Thought for Counseling | Keqi Chen et.al. | 2503.03645 | null |
| 2025-03-05 | Large language models in finance: estimating financial sentiment for stock prediction | Kemal Kirtac et.al. | 2503.03612 | null |
| 2025-03-05 | Enhancing the Accuracy and Comprehensibility in Architectural Tactics Detection via Small Model-Augmented Prompt Engineering | Lingli Cao et.al. | 2503.03609 | link |
| 2025-03-05 | Psy-Insight: Explainable Multi-turn Bilingual Dataset for Mental Health Counseling | Keqi Chen et.al. | 2503.03607 | null |
| 2025-03-05 | Feature-Level Insights into Artificial Text Detection with Sparse Autoencoders | Kristian Kuznetsov et.al. | 2503.03601 | null |
| 2025-03-05 | Small but Mighty: Enhancing Time Series Forecasting with Lightweight LLMs | Haoran Fan et.al. | 2503.03594 | link |
| 2025-03-04 | Wikipedia in the Era of LLMs: Evolution and Risks | Siming Huang et.al. | 2503.02879 | link |
| 2025-03-04 | Language Models can Self-Improve at State-Value Estimation for Better Search | Ethan Mendes et.al. | 2503.02878 | link |
| 2025-03-04 | SPIDER: A Comprehensive Multi-Organ Supervised Pathology Dataset and Baseline Models | Dmitry Nechaev et.al. | 2503.02876 | link |
| 2025-03-04 | The First Few Tokens Are All You Need: An Efficient and Effective Unsupervised Prefix Fine-Tuning Method for Reasoning Models | Ke Ji et.al. | 2503.02875 | null |
| 2025-03-04 | Prompting Generative AI with Interaction-Augmented Instructions | Leixian Shen et.al. | 2503.02874 | null |
| 2025-03-04 | FairSense-AI: Responsible AI Meets Sustainability | Shaina Raza et.al. | 2503.02865 | null |
| 2025-03-04 | Calibrating LLM Confidence with Semantic Steering: A Multi-Prompt Aggregation Framework | Ziang Zhou et.al. | 2503.02863 | null |
| 2025-03-04 | Privacy and Accuracy-Aware AI/ML Model Deduplication | Hong Guan et.al. | 2503.02862 | null |
| 2025-03-04 | (How) Do Language Models Track State? | Belinda Z. Li et.al. | 2503.02854 | null |
| 2025-03-04 | Shakespearean Sparks: The Dance of Hallucination and Creativity in LLMs' Decoding Layers | Zicong He et.al. | 2503.02851 | link |
| 2025-03-04 | Mask-DPO: Generalizable Fine-grained Factuality Alignment of LLMs | Yuzhe Gu et.al. | 2503.02846 | link |
| 2025-03-04 | Beyond Cosine Decay: On the effectiveness of Infinite Learning Rate Schedule for Continual Pre-training | Paul Janson et.al. | 2503.02844 | null |
| 2025-03-04 | AlignDistil: Token-Level Language Model Alignment as Adaptive Policy Distillation | Songming Zhang et.al. | 2503.02832 | null |
| 2025-03-04 | Developing a PET/CT Foundation Model for Cross-Modal Anatomical and Functional Imaging | Yujin Oh et.al. | 2503.02824 | null |
| 2025-03-04 | "What If Smart Homes Could See Our Homes?": Exploring DIY Smart Home Building Experiences with VLM-Based Camera Sensors | Sojeong Yun et.al. | 2503.02816 | null |
| 2025-03-04 | Q-Filters: Leveraging QK Geometry for Efficient KV Cache Compression | Nathan Godey et.al. | 2503.02812 | link |
| 2025-03-04 | RAAD-LLM: Adaptive Anomaly Detection Using LLMs and RAG Integration | Alicia Russell-Gilbert et.al. | 2503.02800 | null |
| 2025-03-04 | Multimodal AI predicts clinical outcomes of drug combinations from preclinical data | Yepeng Huang et.al. | 2503.02781 | link |
| 2025-03-04 | Implicit Bias in LLMs: A Survey | Xinru Lin et.al. | 2503.02776 | null |
| 2025-03-04 | InSerter: Speech Instruction Following with Unsupervised Interleaved Pre-training | Dingdong Wang et.al. | 2503.02769 | null |
| 2025-02-28 | LLM Post-Training: A Deep Dive into Reasoning Large Language Models | Komal Kumar et.al. | 2502.21321 | link |
| 2025-02-28 | Raccoon: Multi-stage Diffusion Training with Coarse-to-Fine Curating Videos | Zhiyu Tan et.al. | 2502.21314 | null |
| 2025-02-28 | FANformer: Improving Large Language Models Through Effective Periodicity Modeling | Yihong Dong et.al. | 2502.21309 | link |
| 2025-02-28 | Contextualizing biological perturbation experiments through language | Menghua Wu et.al. | 2502.21290 | link |
| 2025-02-28 | Adaptive Keyframe Sampling for Long Video Understanding | Xi Tang et.al. | 2502.21271 | null |
| 2025-03-03 | Foundation Models -- A Panacea for Artificial Intelligence in Pathology? | Nita Mulliqi et.al. | 2502.21264 | null |
| 2025-02-28 | Modeling Human Beliefs about AI Behavior for Scalable Oversight | Leon Lang et.al. | 2502.21262 | null |
| 2025-02-28 | PET Image Denoising via Text-Guided Diffusion: Integrating Anatomical Priors through Text Prompts | Boxiao Yu et.al. | 2502.21260 | null |
| 2025-02-28 | RoboBrain: A Unified Brain Model for Robotic Manipulation from Abstract to Concrete | Yuheng Ji et.al. | 2502.21257 | null |
| 2025-02-28 | TimesBERT: A BERT-Style Foundation Model for Time Series Understanding | Haoran Zhang et.al. | 2502.21245 | null |
| 2025-03-04 | Semantic Volume: Quantifying and Detecting both External and Internal Uncertainty in LLMs | Xiaomin Li et.al. | 2502.21239 | null |
| 2025-02-28 | Transforming Tuberculosis Care: Optimizing Large Language Models For Enhanced Clinician-Patient Communication | Daniil Filienko et.al. | 2502.21236 | null |
| 2025-02-28 | ByteScale: Efficient Scaling of LLM Training with a 2048K Context Length on More Than 12,000 GPUs | Hao Ge et.al. | 2502.21231 | null |
| 2025-03-03 | ECLeKTic: a Novel Challenge Set for Evaluation of Cross-Lingual Knowledge Transfer | Omer Goldman et.al. | 2502.21228 | null |
| 2025-02-28 | Transformers Learn to Implement Multi-step Gradient Descent with Chain of Thought | Jianhao Huang et.al. | 2502.21212 | null |
| 2025-02-28 | Chronologically Consistent Large Language Models | Songrun He et.al. | 2502.21206 | null |
| 2025-02-28 | | Mads-Peter Verner Christiansen et.al. | 2502.21179 | null |
| 2025-03-03 | Causality Is Key to Understand and Balance Multiple Goals in Trustworthy ML and Foundation Models | Ruta Binkyte et.al. | 2502.21123 | null |
| 2025-02-28 | Optimizing Large Language Models for ESG Activity Detection in Financial Texts | Mattia Birti et.al. | 2502.21112 | link |
| 2025-02-28 | Large Language Model-Based Benchmarking Experiment Settings for Evolutionary Multi-Objective Optimization | Lie Meng Pang et.al. | 2502.21108 | null |
| 2025-02-27 | R2-T2: Re-Routing in Test-Time for Multimodal Mixture-of-Experts | Zhongyang Li et.al. | 2502.20395 | link |
| 2025-02-27 | Why Are Web AI Agents More Vulnerable Than Standalone LLMs? A Security Analysis | Jeffrey Yang Fan Chiang et.al. | 2502.20383 | null |
| 2025-02-27 | Multi-Agent Verification: Scaling Test-Time Compute with Multiple Verifiers | Shalev Lifshitz et.al. | 2502.20379 | null |
| 2025-02-27 | PhantomWiki: On-Demand Datasets for Reasoning and Retrieval Evaluation | Albert Gong et.al. | 2502.20377 | link |
| 2025-02-27 | Bridging Legal Knowledge and AI: Retrieval-Augmented Generation with Vector Stores, Knowledge Graphs, and Hierarchical Non-negative Matrix Factorization | Ryan C. Barron et.al. | 2502.20364 | link |
| 2025-02-27 | Bridging the Creativity Understanding Gap: Small-Scale Human Alignment Enables Expert-Level Humor Ranking in LLMs | Kuan Lok Zhou et.al. | 2502.20356 | null |
| 2025-02-27 | KEDRec-LM: A Knowledge-distilled Explainable Drug Recommendation Large Language Model | Kai Zhang et.al. | 2502.20350 | null |
| 2025-02-27 | Sparse Auto-Encoder Interprets Linguistic Features in Large Language Models | Yi Jing et.al. | 2502.20344 | null |
| 2025-02-27 | Thinking Slow, Fast: Scaling Inference Compute with Distilled Reasoners | Daniele Paliotta et.al. | 2502.20339 | null |
| 2025-02-27 | Expertise Is What We Want | Alan Ashworth et.al. | 2502.20335 | null |
| 2025-02-27 | Emergent Symbolic Mechanisms Support Abstract Reasoning in Large Language Models | Yukang Yang et.al. | 2502.20332 | null |
| 2025-02-27 | Long-Context Inference with Retrieval-Augmented Speculative Decoding | Guanzheng Chen et.al. | 2502.20330 | link |
| 2025-02-27 | LangProBe: a Language Programs Benchmark | Shangyin Tan et.al. | 2502.20315 | null |
| 2025-02-27 | EAIRA: Establishing a Methodology for Evaluating AI Models as Scientific Research Assistants | Franck Cappello et.al. | 2502.20309 | link |
| 2025-02-27 | M^3Builder: A Multi-Agent System for Automated Machine Learning in Medical Imaging | Jinghao Feng et.al. | 2502.20301 | null |
| 2025-02-27 | An exploration of features to improve the generalisability of fake news detection models | Nathaniel Hoy et.al. | 2502.20299 | null |
| 2025-02-27 | Judge a Book by its Cover: Investigating Multi-Modal LLMs for Multi-Page Handwritten Document Transcription | Benjamin Gutteridge et.al. | 2502.20295 | link |
| 2025-02-27 | Visual Adaptive Prompting for Compositional Zero-Shot Learning | Kyle Stein et.al. | 2502.20292 | null |
| 2025-02-27 | Conformal Tail Risk Control for Large Language Model Alignment | Catherine Yu-Chi Chen et.al. | 2502.20285 | null |
| 2025-02-27 | Evaluating Human Trust in LLM-Based Planners: A Preliminary Study | Shenghui Chen et.al. | 2502.20284 | null |
| 2025-02-26 | Hi Robot: Open-Ended Instruction Following with Hierarchical Vision-Language-Action Models | Lucy Xiaoyang Shi et.al. | 2502.19417 | null |
| 2025-02-26 | Norm Growth and Stability Challenges in Localized Sequential Knowledge Editing | Akshat Gupta et.al. | 2502.19416 | null |
| 2025-02-26 | Can Language Models Falsify? Evaluating Algorithmic Reasoning with Counterexample Creation | Shiven Sinha et.al. | 2502.19414 | link |
| 2025-02-26 | Project Alexandria: Towards Freeing Scientific Knowledge from Copyright Burdens via LLMs | Christoph Schuhmann et.al. | 2502.19413 | null |
| 2025-02-26 | Code to Think, Think to Code: A Survey on Code-Enhanced Reasoning and Reasoning-Driven Code Intelligence in LLMs | Dayu Yang et.al. | 2502.19411 | link |
| 2025-02-26 | Less or More: Towards Glanceable Explanations for LLM Recommendations Using Ultra-Small Devices | Xinru Wang et.al. | 2502.19410 | null |
| 2025-02-26 | ImageChain: Advancing Sequential Image-to-Text Reasoning in Multimodal Large Language Models | Danae Sánchez Villegas et.al. | 2502.19409 | null |
| 2025-02-26 | Learning Code-Edit Embedding to Model Student Debugging Behavior | Hasnain Heickal et.al. | 2502.19407 | null |
| 2025-02-26 | General Reasoning Requires Learning to Reason from the Get-go | Seungwook Han et.al. | 2502.19402 | null |
| 2025-02-26 | TheoremExplainAgent: Towards Multimodal Explanations for LLM Theorem Understanding | Max Ku et.al. | 2502.19400 | null |
| 2025-02-26 | LiDAR Registration with Visual Foundation Models | Niclas Vödisch et.al. | 2502.19374 | null |
| 2025-02-26 | Deep Learning For Time Series Analysis With Application On Human Motion | Ali Ismail-Fawaz et.al. | 2502.19364 | null |
| 2025-02-26 | DataMan: Data Manager for Pre-training Large Language Models | Ru Peng et.al. | 2502.19363 | null |
| 2025-02-26 | Can Large Language Models Detect Errors in Long Chain-of-Thought Reasoning? | Yancheng He et.al. | 2502.19361 | link |
| 2025-02-26 | Controlled Diversity: Length-optimized Natural Language Generation | Diana Marie Schenke et.al. | 2502.19347 | null |
| 2025-02-26 | Evaluating LLMs and Pre-trained Models for Text Summarization Across Diverse Datasets | Tohida Rehman et.al. | 2502.19339 | null |
| 2025-02-26 | I Know What I Don't Know: Improving Model Cascades Through Confidence Tuning | Stephan Rabanser et.al. | 2502.19335 | null |
| 2025-02-26 | Agentic Reward Modeling: Integrating Human Preferences with Verifiable Correctness Signals for Reliable Reward Systems | Hao Peng et.al. | 2502.19328 | link |
| 2025-02-26 | Shh, don't say that! Domain Certification in LLMs | Cornelius Emde et.al. | 2502.19320 | null |
| 2025-02-26 | Rethinking LLM Unlearning Objectives: A Gradient Perspective and Go Beyond | Qizhou Wang et.al. | 2502.19301 | null |
| 2025-02-25 | DRAMA: Diverse Augmentation from Large Language Models to Smaller Dense Retrievers | Xueguang Ma et.al. | 2502.18460 | link |
| 2025-02-25 | LLM-Based Design Pattern Detection | Christian Schindler et.al. | 2502.18458 | null |
| 2025-02-25 | Evaluating the Effectiveness of Small Language Models in Detecting Refactoring Bugs | Rohit Gheyi et.al. | 2502.18454 | null |
| 2025-02-25 | FRIDA to the Rescue! Analyzing Synthetic Data Effectiveness in Object-Based Common Sense Reasoning for Disaster Response | Mollie Shichman et.al. | 2502.18452 | null |
| 2025-02-25 | SWE-RL: Advancing LLM Reasoning via Reinforcement Learning on Open Software Evolution | Yuxiang Wei et.al. | 2502.18449 | null |
| 2025-02-25 | olmOCR: Unlocking Trillions of Tokens in PDFs with Vision Language Models | Jake Poznanski et.al. | 2502.18443 | link |
| 2025-02-25 | MAPoRL: Multi-Agent Post-Co-Training for Collaborative Large Language Models with Reinforcement Learning | Chanwoo Park et.al. | 2502.18439 | null |
| 2025-02-25 | Reversal Blessing: Thinking Backward May Outpace Thinking Forward in Multi-choice Questions | Yizhe Zhang et.al. | 2502.18435 | null |
| 2025-02-25 | Exploring Gender Disparities in Automatic Speech Recognition Technology | Hend ElGhazaly et.al. | 2502.18434 | null |
| 2025-02-25 | TextGames: Learning to Self-Play Text-Based Puzzle Games via Language Model Reasoning | Frederikus Hudi et.al. | 2502.18431 | link |
| 2025-02-25 | PyEvalAI: AI-assisted evaluation of Jupyter Notebooks for immediate personalized feedback | Nils Wandel et.al. | 2502.18425 | null |
| 2025-02-25 | Compressing Language Models for Specialized Domains | Miles Williams et.al. | 2502.18424 | null |
| 2025-02-25 | Rank1: Test-Time Compute for Reranking in Information Retrieval | Orion Weller et.al. | 2502.18418 | link |
| 2025-02-25 | OmniAlign-V: Towards Enhanced Alignment of MLLMs with Human Preference | Xiangyu Zhao et.al. | 2502.18411 | link |
| 2025-02-25 | Enhancing DNA Foundation Models to Address Masking Inefficiencies | Monireh Safari et.al. | 2502.18405 | null |
| 2025-02-25 | Monte Carlo Temperature: a robust sampling strategy for LLM's uncertainty quantification methods | Nicola Cecere et.al. | 2502.18389 | null |
| 2025-02-25 | How Far are LLMs from Real Search? A Comprehensive Study on Efficiency, Completeness, and Inherent Capabilities | Minhua Lin et.al. | 2502.18387 | null |
| 2025-02-25 | MindMem: Multimodal for Predicting Advertisement Memorability Using LLMs and Deep Learning | Sepehr Asgarian et.al. | 2502.18371 | null |
| 2025-02-25 | Responsible AI Agents | Deven R. Desai et.al. | 2502.18359 | null |
| 2025-02-25 | Which Contributions Deserve Credit? Perceptions of Attribution in Human-AI Co-Creation | Jessica He et.al. | 2502.18357 | null |
| 2025-02-24 | Introducing Visual Perception Token into Multimodal Large Language Model | Runpeng Yu et.al. | 2502.17425 | link |
| 2025-02-24 | MLLMs Know Where to Look: Training-free Perception of Small Visual Details with Multimodal LLMs | Jiarui Zhang et.al. | 2502.17422 | link |
| 2025-02-24 | LongSpec: Long-Context Speculative Decoding with Efficient Drafting and Verification | Penghui Yang et.al. | 2502.17421 | link |
| 2025-02-24 | The Geometry of Refusal in Large Language Models: Concept Cones and Representational Independence | Tom Wollschläger et.al. | 2502.17420 | null |
| 2025-02-24 | From System 1 to System 2: A Survey of Reasoning Large Language Models | Zhong-Zhi Li et.al. | 2502.17419 | link |
| 2025-02-24 | Reasoning with Latent Thoughts: On the Power of Looped Transformers | Nikunj Saunshi et.al. | 2502.17416 | null |
| 2025-02-24 | COSMOS: A Hybrid Adaptive Optimizer for Memory-Efficient Training of LLMs | Liming Liu et.al. | 2502.17410 | link |
| 2025-02-24 | Large Language Models are Powerful EHR Encoders | Stefan Hegselmann et.al. | 2502.17403 | link |
| 2025-02-24 | Big-Math: A Large-Scale, High-Quality Math Dataset for Reinforcement Learning in Language Models | Alon Albalak et.al. | 2502.17387 | link |
| 2025-02-24 | Bridging Gaps in Natural Language Processing for Yorùbá: A Systematic Review of a Decade of Progress and Prospects | Toheeb A. Jimoh et.al. | 2502.17364 | null |
| 2025-02-24 | A Closer Look at TabPFN v2: Strength, Limitation, and Extension | Han-Jia Ye et.al. | 2502.17361 | null |
| 2025-02-24 | RELICT: A Replica Detection Framework for Medical Image Generation | Orhun Utku Aydin et.al. | 2502.17360 | link |
| 2025-02-24 | DIS-CO: Discovering Copyrighted Content in VLMs Training Data | André V. Duarte et.al. | 2502.17358 | link |
| 2025-02-24 | Distributional Scaling Laws for Emergent Capabilities | Rosie Zhao et.al. | 2502.17356 | null |
| 2025-02-24 | On Relation-Specific Neurons in Large Language Models | Yihong Liu et.al. | 2502.17355 | link |
| 2025-02-24 | How Scientists Use Large Language Models to Program | Gabrielle O'Brien et.al. | 2502.17348 | null |
| 2025-02-24 | Time series forecasting based on optimized LLM for fault prediction in distribution power grid insulators | João Pedro Matos-Carvalho et.al. | 2502.17341 | null |
| 2025-02-24 | Tokenized SAEs: Disentangling SAE Reconstructions | Thomas Dooms et.al. | 2502.17332 | null |
| 2025-02-24 | HIPPO: Enhancing the Table Understanding Capability of Large Language Models through Hybrid-Modal Preference Optimization | Zhenghao Liu et.al. | 2502.17315 | link |
| 2025-02-24 | `Generalization is hallucination' through the lens of tensor completions | Liang Ze Wong et.al. | 2502.17305 | null |
| 2025-02-21 | ELIP: Enhanced Visual-Language Foundation Models for Image Retrieval | Guanqi Zhan et.al. | 2502.15682 | null |
| 2025-02-21 | Privacy Ripple Effects from Adding or Removing Personal Information in Language Model Training | Jaydeep Borkar et.al. | 2502.15680 | link |
| 2025-02-21 | BOSS: Benchmark for Observation Space Shift in Long-Horizon Task | Yue Yang et.al. | 2502.15679 | null |
| 2025-02-21 | Testing the limits of fine-tuning to improve reasoning in vision language models | Luca M. Schulze Buschoff et.al. | 2502.15678 | null |
| 2025-02-21 | FLEKE: Federated Locate-then-Edit Knowledge Editing | Zongkai Zhao et.al. | 2502.15677 | link |
| 2025-02-21 | AutoToM: Automated Bayesian Inverse Planning and Model Discovery for Open-ended Theory of Mind | Zhining Zhang et.al. | 2502.15676 | link |
| 2025-02-21 | Almost AI, Almost Human: The Challenge of Detecting AI-Polished Writing | Shoumik Saha et.al. | 2502.15666 | link |
| 2025-02-21 | Machine-generated text detection prevents language model collapse | George Drayson et.al. | 2502.15654 | link |
| 2025-02-21 | Empowering LLMs with Logical Reasoning: A Comprehensive Survey | Fengxiang Cheng et.al. | 2502.15652 | null |
| 2025-02-21 | Steering into New Embedding Spaces: Analyzing Cross-Lingual Alignment Induced by Model Interventions in Multilingual Language Models | Anirudh Sundar et.al. | 2502.15639 | null |
| 2025-02-21 | Mantis: Lightweight Calibrated Foundation Model for User-Friendly Time Series Classification | Vasilii Feofanov et.al. | 2502.15637 | link |
| 2025-02-21 | The Relationship Between Reasoning and Performance in Large Language Models -- o3 (mini) Thinks Harder, Not Longer | Marthe Ballon et.al. | 2502.15631 | link |
| 2025-02-21 | Extraction multi-étiquettes de relations en utilisant des couches de Transformer | Ngoc Luyen Le et.al. | 2502.15619 | null |
| 2025-02-21 | Probe Pruning: Accelerating LLMs through Dynamic Pruning via Model-Probing | Qi Le et.al. | 2502.15618 | link |
| 2025-02-21 | PDeepPP:A Deep learning framework with Pretrained Protein language for peptide classification | Jixiu Zhai et.al. | 2502.15610 | link |
| 2025-02-21 | On the Robustness of Transformers against Context Hijacking for Linear Classification | Tianle Li et.al. | 2502.15609 | null |
| 2025-02-21 | Cross-Format Retrieval-Augmented Generation in XR with LLMs for Context-Aware Maintenance Assistance | Akos Nagy et.al. | 2502.15604 | null |
| 2025-02-21 | Do Multilingual LLMs Think In English? | Lisa Schut et.al. | 2502.15603 | null |
| 2025-02-21 | WorldCraft: Photo-Realistic 3D World Creation and Customization via LLM Agents | Xinhang Liu et.al. | 2502.15601 | null |
| 2025-02-21 | SafeInt: Shielding Large Language Models from Jailbreak Attacks via Safety-Aware Representation Intervention | Jiaqi Wu et.al. | 2502.15594 | null |
| 2025-02-20 | LServe: Efficient Long-sequence LLM Serving with Unified Sparse Attention | Shang Yang et.al. | 2502.14866 | link |
| 2025-02-20 | Aligning LLMs to Ask Good Questions A Case Study in Clinical Reasoning | Shuyue Stella Li et.al. | 2502.14860 | link |
| 2025-02-20 | FR-Spec: Accelerating Large-Vocabulary Language Models via Frequency-Ranked Speculative Sampling | Weilin Zhao et.al. | 2502.14856 | null |
| 2025-02-20 | Prompt-to-Leaderboard | Evan Frick et.al. | 2502.14855 | link |
| 2025-02-20 | GATE: Graph-based Adaptive Tool Evolution Across Diverse Tasks | Jianwen Luo et.al. | 2502.14848 | link |
| 2025-02-20 | Red-Teaming LLM Multi-Agent Systems via Communication Attacks | Pengfei He et.al. | 2502.14847 | null |
| 2025-02-20 | Scaling Text-Rich Image Understanding via Code-Guided Synthetic Multimodal Data Generation | Yue Yang et.al. | 2502.14846 | null |
| 2025-02-20 | Revealing and Mitigating Over-Attention in Knowledge Editing | Pinzheng Wang et.al. | 2502.14838 | link |
| 2025-02-20 | LongWriter-V: Enabling Ultra-Long and High-Fidelity Generation in Vision-Language Models | Shangqing Tu et.al. | 2502.14834 | link |
| 2025-02-20 | Middle-Layer Representation Alignment for Cross-Lingual Transfer in Fine-Tuned LLMs | Danni Liu et.al. | 2502.14830 | link |
| 2025-02-20 | Measuring Faithfulness of Chains of Thought by Unlearning Reasoning Steps | Martin Tutek et.al. | 2502.14829 | link |
| 2025-02-20 | Exploring Advanced Techniques for Visual Question Answering: A Comprehensive Comparison | Aiswarya Baby et.al. | 2502.14827 | null |
| 2025-02-20 | A Survey of Model Architectures in Information Retrieval | Zhichao Xu et.al. | 2502.14822 | null |
| 2025-02-20 | eC-Tab2Text: Aspect-Based Text Generation from e-Commerce Product Tables | Luis Antonio Gutiérrez Guanilo et.al. | 2502.14820 | null |
| 2025-02-20 | Dynamic Low-Rank Sparse Adaptation for Large Language Models | Weizhong Huang et.al. | 2502.14816 | link |
| 2025-02-20 | FetalCLIP: A Visual-Language Foundation Model for Fetal Ultrasound Image Analysis | Fadillah Maani et.al. | 2502.14807 | link |
| 2025-02-20 | From RAG to Memory: Non-Parametric Continual Learning for Large Language Models | Bernal Jiménez Gutiérrez et.al. | 2502.14802 | link |
| 2025-02-20 | A Multi-Agent Perspective on Modern Information Retrieval | Haya Nachimovsky et.al. | 2502.14796 | null |
| 2025-02-20 | Rapid Word Learning Through Meta In-Context Learning | Wentao Wang et.al. | 2502.14791 | null |
| 2025-02-20 | SigLIP 2: Multilingual Vision-Language Encoders with Improved Semantic Understanding, Localization, and Dense Features | Michael Tschannen et.al. | 2502.14786 | link |
| 2025-02-19 | Where's the Bug? Attention Probing for Scalable Fault Localization | Adam Stein et.al. | 2502.13966 | null |
| 2025-02-19 | Autellix: An Efficient Serving Engine for LLM Agents as General Programs | Michael Luo et.al. | 2502.13965 | null |
| 2025-02-19 | MuDAF: Long-Context Multi-Document Attention Focusing through Contrastive Learning on Attention Heads | Weihao Liu et.al. | 2502.13963 | link |
| 2025-02-19 | Is That Your Final Answer? Test-Time Scaling Improves Selective Question Answering | William Jurayj et.al. | 2502.13962 | null |
| 2025-02-19 | LIDDIA: Language-based Intelligent Drug Discovery Agent | Reza Averly et.al. | 2502.13959 | null |
| 2025-02-19 | Neurosymbolic artificial intelligence via large language models and coherence-driven inference | Steve Huntsman et.al. | 2502.13953 | null |
| 2025-02-19 | Why Safeguarded Ships Run Aground? Aligned Large Language Models' Safety Mechanisms Tend to Be Anchored in The Template Region | Chak Tou Leong et.al. | 2502.13946 | null |
| 2025-02-19 | A Chain-of-Thought Subspace Meta-Learning for Few-shot Image Captioning with Large Vision and Language Models | Hao Huang et.al. | 2502.13942 | null |
| 2025-02-19 | Symmetrical Visual Contrastive Optimization: Aligning Vision-Language Models with Minimal Contrastive Images | Shengguang Wu et.al. | 2502.13928 | null |
| 2025-02-19 | Beyond Single Frames: Can LMMs Comprehend Temporal and Contextual Narratives in Image Sequences? | Xiaochen Wang et.al. | 2502.13925 | null |
| 2025-02-19 | LongPO: Long Context Self-Evolution of Large Language Models through Short-to-Long Preference Optimization | Guanzheng Chen et.al. | 2502.13922 | link |
| 2025-02-19 | Exploring Code Language Models for Automated HLS-based Hardware Generation: Benchmark, Infrastructure and Analysis | Jiahao Gai et.al. | 2502.13921 | null |
| 2025-02-19 | Exploring Personalized Health Support through Data-Driven, Theory-Guided LLMs: A Case Study in Sleep Health | Xingbo Wang et.al. | 2502.13920 | link |
| 2025-02-19 | TESS 2: A Large-Scale Generalist Diffusion Language Model | Jaesung Tae et.al. | 2502.13917 | link |
| 2025-02-19 | How Do LLMs Perform Two-Hop Reasoning in Context? | Tianyu Guo et.al. | 2502.13913 | null |
| 2025-02-19 | Lost in Sequence: Do Large Language Models Understand Sequential Recommendation? | Sein Kim et.al. | 2502.13909 | link |
| 2025-02-19 | Judging the Judges: A Collection of LLM-Generated Relevance Judgements | Hossein A. Rahmani et.al. | 2502.13908 | link |
| 2025-02-19 | DataSciBench: An LLM Agent Benchmark for Data Science | Dan Zhang et.al. | 2502.13897 | link |
| 2025-02-19 | NavigateDiff: Visual Predictors are Zero-Shot Navigation Assistants | Yiran Qin et.al. | 2502.13894 | null |
| 2025-02-19 | Refining embeddings with fill-tuning: data-efficient generalised performance improvements for materials foundation models | Matthew P. Wilson et.al. | 2502.13886 | link |
| 2025-02-18 | Re-Align: Aligning Vision Language Models via Retrieval-Augmented Direct Preference Optimization | Shuo Xing et.al. | 2502.13146 | link |
| 2025-02-18 | Multimodal Mamba: Decoder-only Multimodal State Space Model via Quadratic to Linear Distillation | Bencheng Liao et.al. | 2502.13145 | link |
| 2025-02-18 | Pre-training Auto-regressive Robotic Models with 4D Representations | Dantong Niu et.al. | 2502.13142 | null |
| 2025-02-18 | UniGuardian: A Unified Defense for Detecting Prompt Injection, Backdoor Attacks and Adversarial Attacks in Large Language Models | Huawei Lin et.al. | 2502.13141 | link |
| 2025-02-18 | AIDE: AI-Driven Exploration in the Space of Code | Zhengyao Jiang et.al. | 2502.13138 | link |
| 2025-02-18 | Theorem Prover as a Judge for Synthetic Data Generation | Joshua Ong Jun Leang et.al. | 2502.13137 | null |
| 2025-02-18 | Sleepless Nights, Sugary Days: Creating Synthetic Users with Health Conditions for Realistic Coaching Agent Interactions | Taedong Yun et.al. | 2502.13135 | null |
| 2025-02-18 | Learning to Defer for Causal Discovery with Imperfect Experts | Oscar Clivio et.al. | 2502.13132 | null |
| 2025-02-18 | Rethinking Diverse Human Preference Learning through Principal Component Analysis | Feng Luo et.al. | 2502.13131 | null |
| 2025-02-18 | Magma: A Foundation Model for Multimodal AI Agents | Jianwei Yang et.al. | 2502.13130 | link |
| 2025-02-18 | Facilitating Long Context Understanding via Supervised Chain-of-Thought Reasoning | Jingyang Lin et.al. | 2502.13127 | null |
| 2025-02-18 | RuozhiBench: Evaluating LLMs with Logical Fallacies and Misleading Premises | Zenan Zhai et.al. | 2502.13125 | link |
| 2025-02-18 | Adapting Psycholinguistic Research for LLMs: Gender-inclusive Language in a Coreference Context | Marion Bartl et.al. | 2502.13120 | null |
| 2025-02-18 | STEER-ME: Assessing the Microeconomic Reasoning of Large Language Models | Narun Raman et.al. | 2502.13119 | null |
| 2025-02-18 | Performance Evaluation of Large Language Models in Statistical Programming | Xinyi Song et.al. | 2502.13117 | link |
| 2025-02-18 | MatterChat: A Multi-Modal LLM for Material Science | Yingheng Tang et.al. | 2502.13107 | null |
| 2025-02-18 | Understanding and Rectifying Safety Perception Distortion in VLMs | Xiaohan Zou et.al. | 2502.13095 | null |
| 2025-02-18 | Text2World: Benchmarking Large Language Models for Symbolic World Model Generation | Mengkang Hu et.al. | 2502.13092 | null |
| 2025-02-18 | KAPPA: A Generic Patent Analysis Framework with Keyphrase-Based Portraits | Xin Xia et.al. | 2502.13076 | null |
| 2025-02-18 | Cramming 1568 Tokens into a Single Vector and Back Again: Exploring the Limits of Embedding Space Capacity | Yuri Kuratov et.al. | 2502.13063 | link |
| 2025-02-17 | Idiosyncrasies in Large Language Models | Mingjie Sun et.al. | 2502.12150 | link |
| 2025-02-17 | HermesFlow: Seamlessly Closing the Gap in Multimodal Understanding and Generation | Ling Yang et.al. | 2502.12148 | link |
| 2025-02-17 | Fast or Better? Balancing Accuracy and Cost in Retrieval-Augmented Generation with Flexible User Control | Jinyan Su et.al. | 2502.12145 | link |
| 2025-02-17 | Small Models Struggle to Learn from Strong Reasoners | Yuetai Li et.al. | 2502.12143 | null |
| 2025-02-17 | SoftCoT: Soft Chain-of-Thought for Efficient Reasoning with LLMs | Yige Xu et.al. | 2502.12134 | link |
| 2025-02-17 | Transformer Dynamics: A neuroscientific approach to interpretability of large language models | Jesseba Fernando et.al. | 2502.12131 | null |
| 2025-02-17 | Scaling Autonomous Agents via Automatic Reward Modeling And Planning | Zhenfang Chen et.al. | 2502.12130 | null |
| 2025-02-17 | On the Query Complexity of Verifier-Assisted Language Generation | Edoardo Botta et.al. | 2502.12123 | null |
| 2025-02-17 | Minimal Ranks, Maximum Confidence: Parameter-efficient Uncertainty Quantification for LoRA | Patryk Marszałek et.al. | 2502.12122 | link |
| 2025-02-17 | LLMs on the Line: Data Determines Loss-to-Loss Scaling Laws | Prasanna Mayilvahanan et.al. | 2502.12120 | null |
| 2025-02-17 | PRISM: Self-Pruning Intrinsic Selection Method for Training-Free Multimodal Data Selection | Jinhe Bi et.al. | 2502.12119 | null |
| 2025-02-17 | A-MEM: Agentic Memory for LLM Agents | Wujiang Xu et.al. | 2502.12110 | link |
| 2025-02-17 | Personality Structured Interview for Large Language Model Simulation in Personality Research | Pengda Wang et.al. | 2502.12109 | null |
| 2025-02-17 | Relational Norms for Human-AI Cooperation | Brian D. Earp et.al. | 2502.12102 | null |
| 2025-02-17 | Token Communications: A Unified Framework for Cross-modal Context-aware Semantic Communications | Li Qiao et.al. | 2502.12096 | null |
| 2025-02-17 | Descriminative-Generative Custom Tokens for Vision-Language Models | Pramuditha Perera et.al. | 2502.12095 | null |
| 2025-02-17 | Meta-Statistical Learning: Supervised Learning of Statistical Inference | Maxime Peyrard et.al. | 2502.12088 | null |
| 2025-02-17 | APB: Accelerating Distributed Long-Context Inference by Passing Compressed Context Blocks across GPUs | Yuxiang Huang et.al. | 2502.12085 | link |
| 2025-02-17 | VLM | Jianshu Zhang et.al. | 2502.12084 | null |
| 2025-02-17 | AdaSplash: Adaptive Sparse Flash Attention | Nuno Gonçalves et.al. | 2502.12082 | link |
| 2025-02-14 | MM-RLHF: The Next Step Forward in Multimodal LLM Alignment | Yi-Fan Zhang et.al. | 2502.10391 | null |
| 2025-02-14 | Aspect-Oriented Summarization for Psychiatric Short-Term Readmission Prediction | WonJin Yoon et.al. | 2502.10388 | null |
| 2025-02-14 | Unknown Word Detection for English as a Second Language (ESL) Learners Using Gaze and Pre-trained Language Models | Jiexin Ding et.al. | 2502.10378 | null |
| 2025-02-14 | Robustness tests for biomedical foundation models should tailor to specification | R. Patrick Xian et.al. | 2502.10374 | link |
| 2025-02-14 | Enhancing Multilingual LLM Pretraining with Model-Based Data Selection | Bettina Messmer et.al. | 2502.10361 | null |
| 2025-02-14 | Organize the Web: Constructing Domains Enhances Pre-Training Data Curation | Alexander Wettig et.al. | 2502.10341 | null |
| 2025-02-14 | Evaluating the Meta- and Object-Level Reasoning of Large Language Models for Question Answering | Nick Ferguson et.al. | 2502.10338 | null |
| 2025-02-14 | LLM-Powered Preference Elicitation in Combinatorial Assignment | Ermis Soumalias et.al. | 2502.10308 | null |
| 2025-02-14 | SPIRIT: Short-term Prediction of solar IRradIance for zero-shot Transfer learning using Foundation Models | Aditya Mishra et.al. | 2502.10307 | null |
| 2025-02-14 | Open-Source AI-Powered Optimization in Scalene: Advancing Python Performance Profiling with DeepSeek-R1 and LLaMA 3.2 | Saem Hasan et.al. | 2502.10299 | null |
| 2025-02-14 | DeltaProduct: Increasing the Expressivity of DeltaNet Through Products of Householders | Julien Siems et.al. | 2502.10297 | link |
| 2025-02-14 | Probing Perceptual Constancy in Large Vision Language Models | Haoran Sun et.al. | 2502.10273 | null |
| 2025-02-14 | Are Large Language Models the future crowd workers of Linguistics? | Iris Ferrazzo et.al. | 2502.10266 | null |
| 2025-02-14 | Large Language Models and Synthetic Data for Monitoring Dataset Mentions in Research Papers | Aivin V. Solatorio et.al. | 2502.10263 | link |
| 2025-02-14 | VisCon-100K: Leveraging Contextual Web Data for Fine-tuning Vision Language Models | Gokul Karthik Kumar et.al. | 2502.10250 | null |
| 2025-02-14 | Step-Video-T2V Technical Report: The Practice, Challenges, and Future of Video Foundation Model | Guoqing Ma et.al. | 2502.10248 | link |
| 2025-02-14 | Efficient Zero-Order Federated Finetuning of Language Models for Resource-Constrained Devices | Mohamed Aboelenien Ahmed et.al. | 2502.10239 | null |
| 2025-02-14 | AdaPTS: Adapting Univariate Foundation Models to Probabilistic Multivariate Time Series Forecasting | Abdelhakim Benechehab et.al. | 2502.10235 | link |
| 2025-02-14 | Do Large Language Models Reason Causally Like Us? Even Better? | Hanna M. Dettki et.al. | 2502.10215 | null |
| 2025-02-14 | Can Post-Training Quantization Benefit from an Additional QLoRA Integration? | Xiliang Zhu et.al. | 2502.10202 | null |
| 2025-02-13 | Theoretical Benefit and Limitation of Diffusion Language Model | Guhao Feng et.al. | 2502.09622 | null |
| 2025-02-13 | MME-CoT: Benchmarking Chain-of-Thought in Large Multimodal Models for Reasoning Quality, Robustness, and Efficiency | Dongzhi Jiang et.al. | 2502.09621 | null |
| 2025-02-13 | Exploring the Potential of Encoder-free Architectures in 3D LMMs | Yiwen Tang et.al. | 2502.09620 | link |
| 2025-02-13 | Human-LLM Coevolution: Evidence from Academic Writing | Mingmeng Geng et.al. | 2502.09606 | null |
| 2025-02-13 | SelfCite: Self-Supervised Alignment for Context Attribution in Large Language Models | Yung-Sung Chuang et.al. | 2502.09604 | link |
| 2025-02-13 | GAIA: A Global, Multi-modal, Multi-scale Vision-Language Dataset for Remote Sensing Image Analysis | Angelos Zavras et.al. | 2502.09598 | link |
| 2025-02-13 | Do LLMs Recognize Your Preferences? Evaluating Personalized Preference Following in LLMs | Siyan Zhao et.al. | 2502.09597 | link |
| 2025-02-13 | KIMAs: A Configurable Knowledge Integrated Multi-Agent System | Zitao Li et.al. | 2502.09596 | null |
| 2025-02-13 | Logical forms complement probability in understanding language model (and human) performance | Yixuan Wang et.al. | 2502.09589 | null |
| 2025-02-13 | Polymind: Parallel Visual Diagramming with Large Language Models to Support Prewriting Through Microtasks | Qian Wan et.al. | 2502.09577 | null |
| 2025-02-13 | MorphNLI: A Stepwise Approach to Natural Language Inference Using Text Morphing | Vlad Andrei Negru et.al. | 2502.09567 | null |
| 2025-02-13 | Zero-shot generation of synthetic neurosurgical data with large language models | Austin A. Barr et.al. | 2502.09566 | link |
| 2025-02-13 | MDCrow: Automating Molecular Dynamics Workflows with Large Language Models | Quintina Campbell et.al. | 2502.09565 | link |
| 2025-02-13 | EmbodiedBench: Comprehensive Benchmarking Multi-modal Large Language Models for Vision-Driven Embodied Agents | Rui Yang et.al. | 2502.09560 | null |
| 2025-02-13 | Explainable AI-assisted Optimization for Feynman Integral Reduction | Zhuo-Yang Song et.al. | 2502.09544 | null |
| 2025-02-13 | Mind the Gap! Choice Independence in Using Multilingual LLMs for Persuasive Co-Writing Tasks in Different Languages | Shreyan Biswas et.al. | 2502.09532 | null |
| 2025-02-13 | When and How Does CLIP Enable Domain and Compositional Generalization? | Elias Kempf et.al. | 2502.09507 | link |
| 2025-02-13 | Improve LLM-based Automatic Essay Scoring with Linguistic Features | Zhaoyi Joey Hou et.al. | 2502.09497 | null |
| 2025-02-13 | Foundation Neural-Network Quantum States | Riccardo Rende et.al. | 2502.09488 | null |
| 2025-02-13 | Objective quantification of mood states using large language models | Jakub Onysk et.al. | 2502.09487 | null |
| 2025-02-12 | SwiftSketch: A Diffusion Model for Image-to-Vector Sketch Generation | Ellie Arar et.al. | 2502.08642 | null |
| 2025-02-12 | Examining Multilingual Embedding Models Cross-Lingually Through LLM-Generated Adversarial Examples | Andrianos Michail et.al. | 2502.08638 | null |
| 2025-02-12 | Ensemble based approach to quantifying uncertainty of LLM based classifications | Srijith Rajamohan et.al. | 2502.08631 | null |
| 2025-02-12 | Continuous Cardiac Arrest Prediction in ICU using PPG Foundation Model | Saurabh Kataria et.al. | 2502.08612 | null |
| 2025-02-12 | Causal Analysis of ASR Errors for Children: Quantifying the Impact of Physiological, Cognitive, and Extrinsic Factors | Vishwanath Pratap Singh et.al. | 2502.08587 | null |
| 2025-02-12 | Commercial LLM Agents Are Already Vulnerable to Simple Yet Dangerous Attacks | Ang Li et.al. | 2502.08586 | null |
| 2025-02-12 | COAST: Intelligent Time-Adaptive Neural Operators | Zhikai Wu et.al. | 2502.08574 | null |
| 2025-02-12 | QA-Expand: Multi-Question Answer Generation for Enhanced Query Expansion in Information Retrieval | Wonduk Seo et.al. | 2502.08557 | null |
| 2025-02-12 | Human-Centric Foundation Models: Perception, Generation and Agentic Modeling | Shixiang Tang et.al. | 2502.08556 | link |
| 2025-02-12 | Fostering Appropriate Reliance on Large Language Models: The Role of Explanations, Sources, and Inconsistencies | Sunnie S. Y. Kim et.al. | 2502.08554 | null |
| 2025-02-12 | LLMs can implicitly learn from mistakes in-context | Lisa Alazraki et.al. | 2502.08550 | null |
| 2025-02-12 | Representation Learning to Advance Multi-institutional Studies with Electronic Health Record Data | Doudou Zhou et.al. | 2502.08547 | null |
| 2025-02-12 | Moment of Untruth: Dealing with Negative Queries in Video Moment Retrieval | Kevin Flanagan et.al. | 2502.08544 | link |
| 2025-02-12 | LLM Pretraining with Continuous Concepts | Jihoon Tack et.al. | 2502.08524 | null |
| 2025-02-12 | The Paradox of Stochasticity: Limited Creativity and Computational Decoupling in Temperature-Varied LLM Outputs of Structured Fictional Data | Evgenii Evstafev et.al. | 2502.08515 | null |
| 2025-02-12 | Faithful, Unfaithful or Ambiguous? Multi-Agent Debate with Initial Stance for Summary Evaluation | Mahnaz Koupaee et.al. | 2502.08514 | link |
| 2025-02-12 | Measuring Diversity in Synthetic Datasets | Yuchang Zhu et.al. | 2502.08512 | link |
| 2025-02-12 | Explanation based In-Context Demonstrations Retrieval for Multilingual Grammatical Error Correction | Wei Li et.al. | 2502.08507 | link |
| 2025-02-12 | Salamandra Technical Report | Aitor Gonzalez-Agirre et.al. | 2502.08489 | link |
| 2025-02-12 | One-Shot Federated Learning with Classifier-Free Diffusion Models | Obaidullah Zaland et.al. | 2502.08488 | null |
| 2025-02-11 | DarwinLM: Evolutionary Structured Pruning of Large Language Models | Shengkun Tang et.al. | 2502.07780 | link |
| 2025-02-11 | Auditing Prompt Caching in Language Model APIs | Chenchen Gu et.al. | 2502.07776 | link |
| 2025-02-11 | Automatic Robot Task Planning by Integrating Large Language Model with Genetic Programming | Azizjon Kobilov et.al. | 2502.07772 | null |
| 2025-02-11 | Breaking Down Bias: On The Limits of Generalizable Pruning Strategies | Sibo Ma et.al. | 2502.07771 | null |
| 2025-02-11 | Great Power Brings Great Responsibility: Personalizing Conversational AI for Diverse Problem-Solvers | Italo Santos et.al. | 2502.07763 | null |
| 2025-02-11 | Scalable Fingerprinting of Large Language Models | Anshul Nasery et.al. | 2502.07760 | null |
| 2025-02-11 | Towards Efficient Optimizer Design for LLM via Structured Fisher Approximation with a Low-Rank Extension | Wenbo Gong et.al. | 2502.07752 | null |
| 2025-02-11 | WHODUNIT: Evaluation benchmark for culprit detection in mystery stories | Kshitij Gupta et.al. | 2502.07747 | link |
| 2025-02-11 | The Economics of Large Language Models: Token Allocation, Fine-Tuning, and Optimal Pricing | Dirk Bergemann et.al. | 2502.07736 | null |
| 2025-02-11 | Economics of Sourcing Human Data | Sebastin Santy et.al. | 2502.07732 | null |
| 2025-02-11 | Verifying LLM-Generated Code in the Context of Software Verification with Ada/SPARK | Marcos Cramer et.al. | 2502.07728 | null |
| 2025-02-11 | Making Language Models Robust Against Negation | MohammadHossein Rezaei et.al. | 2502.07717 | link |
| 2025-02-11 | Magic 1-For-1: Generating One Minute Video Clips within One Minute | Hongwei Yi et.al. | 2502.07701 | link |
| 2025-02-11 | A Framework for LLM-powered Design Assistants | Swaroop Panda et.al. | 2502.07698 | null |
| 2025-02-11 | Large Language Models as Proxies for Theories of Human Linguistic Cognition | Imry Ziv et.al. | 2502.07687 | null |
| 2025-02-11 | SymGPT: Auditing Smart Contracts via Combining Symbolic Execution with Large Language Models | Shihao Xia et.al. | 2502.07644 | null |
| 2025-02-11 | FoQA: A Faroese Question-Answering Dataset | Annika Simonsen et.al. | 2502.07642 | null |
| 2025-02-11 | Goedel-Prover: A Frontier Model for Open-Source Automated Theorem Proving | Yong Lin et.al. | 2502.07640 | link |
| 2025-02-11 | Exploring Mobile Touch Interaction with Large Language Models | Tim Zindulka et.al. | 2502.07629 | null |
| 2025-02-11 | Scaling Pre-training to One Hundred Billion Data for Vision Language Models | Xiao Wang et.al. | 2502.07617 | null |
| 2025-02-10 | EVEv2: Improved Baselines for Encoder-Free Vision-Language Models | Haiwen Diao et.al. | 2502.06788 | link |
| 2025-02-10 | Visual Agentic AI for Spatial Reasoning with a Dynamic API | Damiano Marsili et.al. | 2502.06787 | null |
| 2025-02-10 | DeepCrossAttention: Supercharging Transformer Residual Connections | Mike Heddes et.al. | 2502.06785 | null |
| 2025-02-10 | Towards Internet-Scale Training For Agents | Brandon Trabucco et.al. | 2502.06776 | null |
| 2025-02-10 | Enhancing Trust in Language Model-Based Code Optimization through RLHF: A Research Design | Jingzhi Gong et.al. | 2502.06769 | null |
| 2025-02-10 | Exploiting Sparsity for Long Context Inference: Million Token Contexts on Commodity GPUs | Ryan Synk et.al. | 2502.06766 | link |
| 2025-02-10 | Rationalization Models for Text-to-SQL | Gaetano Rossiello et.al. | 2502.06759 | null |
| 2025-02-10 | Accelerating Data Processing and Benchmarking of AI Models for Pathology | Andrew Zhang et.al. | 2502.06750 | link |
| 2025-02-10 | Gradient Multi-Normalization for Stateless and Scalable LLM Training | Meyer Scetbon et.al. | 2502.06742 | null |
| 2025-02-10 | VersaPRM: Multi-Domain Process Reward Model via Synthetic Reasoning Data | Thomas Zeng et.al. | 2502.06737 | null |
| 2025-02-10 | Dynamic Loss-Based Sample Reweighting for Improved Large Language Model Pretraining | Daouda Sow et.al. | 2502.06733 | null |
| 2025-02-10 | Can 1B LLM Surpass 405B LLM? Rethinking Compute-Optimal Test-Time Scaling | Runze Liu et.al. | 2502.06703 | link |
| 2025-02-10 | EquiTabPFN: A Target-Permutation Equivariant Prior Fitted Networks | Michael Arbel et.al. | 2502.06684 | null |
| 2025-02-10 | Boosting Self-Efficacy and Performance of Large Language Models via Verbal Efficacy Stimulations | Rui Chen et.al. | 2502.06669 | null |
| 2025-02-10 | Automatic Evaluation of Healthcare LLMs Beyond Question-Answering | Anna Arias-Duart et.al. | 2502.06666 | null |
| 2025-02-10 | Evaluation of Deep Audio Representations for Hearables | Fabian Gröger et.al. | 2502.06664 | null |
| 2025-02-10 | EfficientLLM: Scalable Pruning-Aware Pretraining for Architecture-Agnostic Edge Language Models | Xingrun Xing et.al. | 2502.06663 | null |
| 2025-02-10 | Unbiased Evaluation of Large Language Models from a Causal Perspective | Meilin Chen et.al. | 2502.06655 | null |
| 2025-02-10 | In-Context Learning (and Unlearning) of Length Biases | Stephanie Schoch et.al. | 2502.06653 | null |
| 2025-02-10 | Transparent NLP: Using RAG and LLM Alignment for Privacy Q&A | Anna Leschanowsky et.al. | 2502.06652 | null |
| 2025-02-07 | Long-VITA: Scaling Large Multi-modal Models to 1 Million Tokens with Leading Short-Context Accuracy | Yunhang Shen et.al. | 2502.05177 | link |
| 2025-02-07 | Scaling up Test-Time Compute with Latent Reasoning: A Recurrent Depth Approach | Jonas Geiping et.al. | 2502.05171 | link |
| 2025-02-07 | NoLiMa: Long-Context Evaluation Beyond Literal Matching | Ali Modarressi et.al. | 2502.05167 | link |
| 2025-02-07 | Multitwine: Multi-Object Compositing with Text and Layout Control | Gemma Canet Tarrés et.al. | 2502.05165 | null |
| 2025-02-07 | DuoGuard: A Two-Player RL-Driven Framework for Multilingual LLM Guardrails | Yihe Deng et.al. | 2502.05163 | link |
| 2025-02-07 | A Lightweight Method to Disrupt Memorized Sequences in LLM | Parjanya Prajakta Prashant et.al. | 2502.05159 | null |
| 2025-02-07 | Transforming Science with Large Language Models: A Survey on AI-assisted Scientific Discovery, Experimentation, Content Generation, and Evaluation | Steffen Eger et.al. | 2502.05151 | link |
| 2025-02-07 | CodeSCM: Causal Analysis for Multi-Modal Code Generation | Mukur Gupta et.al. | 2502.05150 | link |
| 2025-02-07 | An Annotated Reading of 'The Singer of Tales' in the LLM Era | Kush R. Varshney et.al. | 2502.05148 | null |
| 2025-02-07 | Chest X-ray Foundation Model with Global and Local Representations Integration | Zefan Yang et.al. | 2502.05142 | link |
| 2025-02-07 | Refining Integration-by-Parts Reduction of Feynman Integrals with Machine Learning | Matt von Hippel et.al. | 2502.05121 | null |
| 2025-02-07 | Flexible and Efficient Grammar-Constrained Decoding | Kanghee Park et.al. | 2502.05111 | null |
| 2025-02-07 | Lost in Time: Clock and Calendar Understanding Challenges in Multimodal LLMs | Rohit Saxena et.al. | 2502.05092 | null |
| 2025-02-07 | DCFormer: Efficient 3D Vision-Language Modeling with Decomposed Convolutions | Gorkem Can Ates et.al. | 2502.05091 | null |
| 2025-02-07 | Mitigating Unintended Memorization with LoRA in Federated Learning for LLMs | Thierry Bossy et.al. | 2502.05087 | link |
| 2025-02-07 | Causality can systematically address the monsters under the bench(marks) | Felix Leeb et.al. | 2502.05085 | null |
| 2025-02-07 | ChallengeMe: An Adversarial Learning-enabled Text Summarization Framework | Xiaoyu Deng et.al. | 2502.05084 | null |
| 2025-02-07 | Adaptive Graph of Thoughts: Test-Time Adaptive Reasoning Unifying Chain, Tree, and Graph Structures | Tushar Pandey et.al. | 2502.05078 | link |
| 2025-02-07 | nvAgent: Automated Data Visualization from Natural Language via Collaborative Agent Workflow | Geliang Ouyang et.al. | 2502.05036 | link |
| 2025-02-07 | EnseSmells: Deep ensemble and programming language models for automated code smells detection | Anh Ho et.al. | 2502.05012 | link |
| 2025-02-06 | Ola: Pushing the Frontiers of Omni-Modal Language Model with Progressive Modality Alignment | Zuyan Liu et.al. | 2502.04328 | link |
| 2025-02-06 | Speak Easy: Eliciting Harmful Jailbreaks from LLMs with Simple Interactions | Yik Siu Chan et.al. | 2502.04322 | link |
| 2025-02-06 | ConceptAttention: Diffusion Transformers Learn Highly Interpretable Features | Alec Helbling et.al. | 2502.04320 | link |
| 2025-02-06 | sshELF: Single-Shot Hierarchical Extrapolation of Latent Features for 3D Reconstruction from Sparse-Views | Eyvaz Najafli et.al. | 2502.04318 | null |
| 2025-02-06 | ChamaleonLLM: Batch-Aware Dynamic Low-Rank Adaptation via Inference-Time Clusters | Kamer Ali Yuksel et.al. | 2502.04315 | link |
| 2025-02-06 | Great Models Think Alike and this Undermines AI Oversight | Shashwat Goel et.al. | 2502.04313 | link |
| 2025-02-06 | ScoreFlow: Mastering LLM Agent Workflows via Score-based Preference Optimization | Yinjie Wang et.al. | 2502.04306 | link |
| 2025-02-06 | Beyond Prompt Content: Enhancing LLM Performance via Content-Format Integrated Prompt Optimization | Yuanye Liu et.al. | 2502.04295 | link |
| 2025-02-06 | PILAF: Optimal Human Preference Sampling for Reward Modeling | Yunzhen Feng et.al. | 2502.04270 | null |
| 2025-02-06 | How does a Multilingual LM Handle Multiple Languages? | Santhosh Kakarla et.al. | 2502.04269 | null |
| 2025-02-06 | Cross the Gap: Exposing the Intra-modal Misalignment in CLIP via Modality Inversion | Marco Mistretta et.al. | 2502.04263 | link |
| 2025-02-06 | Efficient Randomized Experiments Using Foundation Models | Piersilvio De Bartolomeis et.al. | 2502.04262 | link |
| 2025-02-06 | MAGA: MAssive Genre-Audience Reformulation to Pretraining Corpus Expansion | Xintong Hao et.al. | 2502.04235 | null |
| 2025-02-06 | Can LLMs Hack Enterprise Networks? Autonomous Assumed Breach Penetration-Testing Active Directory Networks | Andreas Happe et.al. | 2502.04227 | link |
| 2025-02-06 | Keep It Light! Simplifying Image Clustering Via Text-Free Adapters | Yicen Li et.al. | 2502.04226 | null |
| 2025-02-06 | Éclair -- Extracting Content and Layout with Integrated Reading Order for Documents | Ilia Karmanov et.al. | 2502.04223 | null |
| 2025-02-06 | Sports and Women's Sports: Gender Bias in Text Generation with Olympic Data | Laura Biester et.al. | 2502.04218 | null |
| 2025-02-06 | Algorithmic causal structure emerging through compression | Liang Wendong et.al. | 2502.04210 | null |
| 2025-02-06 | "Short-length" Adversarial Training Helps LLMs Defend "Long-length" Jailbreak Attacks: Theoretical and Empirical Evidence | Shaopeng Fu et.al. | 2502.04204 | link |
| 2025-02-06 | The Best Instruction-Tuning Data are Those That Fit | Dylan Zhang et.al. | 2502.04194 | null |
| 2025-02-05 | Do Large Language Model Benchmarks Test Reliability? | Joshua Vendrow et.al. | 2502.03461 | link |
| 2025-02-05 | Adapt-Pruner: Adaptive Structural Pruning for Efficient Small Language Model Training | Boyao Wang et.al. | 2502.03460 | null |
| 2025-02-05 | SKI Models: Skeleton Induced Vision-Language Embeddings for Understanding Activities of Daily Living | Arkaprava Sinha et.al. | 2502.03459 | null |
| 2025-02-05 | A Schema-Guided Reason-while-Retrieve framework for Reasoning on Scene Graphs with Large-Language-Models (LLMs) | Yiye Chen et.al. | 2502.03450 | null |
| 2025-02-05 | BFS-Prover: Scalable Best-First Tree Search for LLM-based Automatic Theorem Proving | Ran Xin et.al. | 2502.03438 | null |
| 2025-02-05 | On Fairness of Unified Multimodal Large Language Model for Image Generation | Ming Liu et.al. | 2502.03429 | null |
| 2025-02-05 | Harnessing Large Language Models for Curated Code Reviews | Oussama Ben Sghaier et.al. | 2502.03425 | link |
| 2025-02-05 | Think or Step-by-Step? UnZIPping the Black Box in Zero-Shot Prompts | Nikta Gohari Sadr et.al. | 2502.03418 | null |
| 2025-02-05 | SPRI: Aligning Large Language Models with Context-Situated Principles | Hongli Zhan et.al. | 2502.03397 | null |
| 2025-02-05 | Benchmarking Time Series Forecasting Models: From Statistical Techniques to Foundation Models in Real-World Applications | Issar Arab et.al. | 2502.03395 | null |
| 2025-02-05 | LIMO: Less is More for Reasoning | Yixin Ye et.al. | 2502.03387 | link |
| 2025-02-05 | Transformers and Their Roles as Time Series Foundation Models | Dennis Wu et.al. | 2502.03383 | null |
| 2025-02-05 | High-Fidelity Simultaneous Speech-To-Speech Translation | Tom Labiausse et.al. | 2502.03382 | link |
| 2025-02-05 | Demystifying Long Chain-of-Thought Reasoning in LLMs | Edward Yeo et.al. | 2502.03373 | link |
| 2025-02-05 | PalimpChat: Declarative and Interactive AI analytics | Chunwei Liu et.al. | 2502.03368 | null |
| 2025-02-05 | Minerva: A Programmable Memory Test Benchmark for Language Models | Menglin Xia et.al. | 2502.03358 | null |
| 2025-02-05 | RadVLM: A Multitask Conversational Vision-Language Model for Radiology | Nicolas Deperrois et.al. | 2502.03333 | null |
| 2025-02-05 | ECM: A Unified Electronic Circuit Model for Explaining the Emergence of In-Context Learning and Chain-of-Thought in Large Language Model | Qiguang Chen et.al. | 2502.03325 | null |
| 2025-02-05 | Out-of-Distribution Detection using Synthetic Data Generation | Momin Abbas et.al. | 2502.03323 | null |
| 2025-02-05 | Simplifying Formal Proof-Generating Models with ChatGPT and Basic Searching Techniques | Sangjun Han et.al. | 2502.03321 | null |
| 2025-02-04 | Articulate AnyMesh: Open-Vocabulary 3D Articulated Objects Modeling | Xiaowen Qiu et.al. | 2502.02590 | null |
| 2025-02-04 | COCONut-PanCap: Joint Panoptic Segmentation and Grounded Captions for Fine-Grained Understanding and Generation | Xueqing Deng et.al. | 2502.02589 | null |
| 2025-02-04 | A comparison of translation performance between DeepL and Supertext | Alex Flückiger et.al. | 2502.02577 | link |
| 2025-02-04 | Are Language Models Up to Sequential Optimization Problems? From Evaluation to a Hegelian-Inspired Enhancement | Soheil Abbasloo et.al. | 2502.02573 | null |
| 2025-02-04 | Learning the RoPEs: Better 2D and 3D Position Encodings with STRING | Connor Schenck et.al. | 2502.02562 | null |
| 2025-02-04 | Mosaic3D: Foundation Dataset and Model for Open-Vocabulary 3D Segmentation | Junha Lee et.al. | 2502.02548 | null |
| 2025-02-04 | LLMs for Generation of Architectural Components: An Exploratory Empirical Study in the Serverless World | Shrikara Arun et.al. | 2502.02539 | null |
| 2025-02-04 | Adaptive Self-improvement LLM Agentic System for ML Library Development | Genghan Zhang et.al. | 2502.02534 | link |
| 2025-02-04 | Multi-Agent Design: Optimizing Agents with Better Prompts and Topologies | Han Zhou et.al. | 2502.02533 | null |
| 2025-02-04 | Satori: Reinforcement Learning with Chain-of-Action-Thought Enhances LLM Reasoning via Autoregressive Search | Maohao Shen et.al. | 2502.02508 | null |
| 2025-02-04 | Analyzing Similarity Metrics for Data Selection for Language Model Pretraining | Dylan Sam et.al. | 2502.02494 | null |
| 2025-02-04 | EasySpec: Layer-Parallel Speculative Decoding for Efficient Multi-GPU Utilization | Yize Wu et.al. | 2502.02493 | null |
| 2025-02-04 | Multilingual Machine Translation with Open Large Language Models at Practical Scale: An Empirical Study | Menglong Cui et.al. | 2502.02481 | null |
| 2025-02-04 | Mind the Gap: Evaluating Patch Embeddings from General-Purpose and Histopathology Foundation Models for Cell Segmentation and Classification | Valentina Vadori et.al. | 2502.02471 | link |
| 2025-02-04 | Modular Training of Neural Networks aids Interpretability | Satvik Golechha et.al. | 2502.02470 | null |
| 2025-02-04 | SAISA: Towards Multimodal Large Language Models with Both Training and Inference Efficiency | Qianhao Yuan et.al. | 2502.02458 | link |
| 2025-02-04 | IMDPrompter: Adapting SAM to Image Manipulation Detection by Cross-View Automated Prompt Learning | Quan Zhang et.al. | 2502.02454 | null |
| 2025-02-04 | Personalization Toolkit: Training Free Personalization of Large Vision Language Models | Soroush Seifi et.al. | 2502.02452 | null |
| 2025-02-04 | Beyond English: Evaluating Automated Measurement of Moral Foundations in Non-English Discourse with a Chinese Case Study | Calvin Yixiang Cheng et.al. | 2502.02451 | link |
| 2025-02-04 | Generative Psycho-Lexical Approach for Constructing Value Systems in Large Language Models | Haoran Ye et.al. | 2502.02444 | null |
| 2025-01-31 | Low-Rank Adapting Models for Sparse Autoencoders | Matthew Chen et.al. | 2501.19406 | link |
| 2025-01-31 | Vintix: Action Model via In-Context Reinforcement Learning | Andrey Polubarov et.al. | 2501.19400 | link |
| 2025-01-31 | Scalable-Softmax Is Superior for Attention | Ken M. Nakanishi et.al. | 2501.19399 | null |
| 2025-01-31 | Do LLMs Strategically Reveal, Conceal, and Infer Information? A Theoretical and Empirical Analysis in The Chameleon Game | Mustafa O. Karabag et.al. | 2501.19398 | link |
| 2025-02-03 | s1: Simple test-time scaling | Niklas Muennighoff et.al. | 2501.19393 | link |
| 2025-01-31 | Cache Me If You Must: Adaptive Key-Value Quantization for Large Language Models | Alina Shutova et.al. | 2501.19392 | link |
| 2025-01-31 | Federated Sketching LoRA: On-Device Collaborative Fine-Tuning of Large Language Models | Wenzhi Fang et.al. | 2501.19389 | link |
| 2025-01-31 | Decoding-based Regression | Xingyou Song et.al. | 2501.19383 | link |
| 2025-01-31 | TableMaster: A Recipe to Advance Table Understanding with Language Models | Lang Cao et.al. | 2501.19378 | null |
| 2025-02-03 | SELMA: A Speech-Enabled Language Model for Virtual Assistant Interactions | Dominik Wagner et.al. | 2501.19377 | null |
| 2025-01-31 | We're Different, We're the Same: Creative Homogeneity Across LLMs | Emily Wenger et.al. | 2501.19361 | null |
| 2025-01-31 | Mechanical Properties of the Meninges: Large Language Model Assisted Systematic Review of over 25,000 Studies | Brandon P. Chelstrom et.al. | 2501.19359 | null |
| 2025-01-31 | The Energy Loss Phenomenon in RLHF: A New Perspective on Mitigating Reward Hacking | Yuchun Miao et.al. | 2501.19358 | null |
| 2025-01-31 | Towards Adaptive Self-Improvement for Smarter Energy Systems | Alexander Sommer et.al. | 2501.19340 | null |
| 2025-01-31 | PixelWorld: Towards Perceiving Everything as Pixels | Zhiheng Lyu et.al. | 2501.19339 | null |
| 2025-01-31 | Homogeneity Bias as Differential Sampling Uncertainty in Language Models | Messi H. J. Lee et.al. | 2501.19337 | null |
| 2025-01-31 | Reward-Guided Speculative Decoding for Efficient LLM Reasoning | Baohao Liao et.al. | 2501.19324 | null |
| 2025-01-31 | MINDSTORES: Memory-Informed Neural Decision Synthesis for Task-Oriented Reinforcement in Embodied Systems | Anirudh Chari et.al. | 2501.19318 | null |
| 2025-01-31 | LLM-based Affective Text Generation Quality Based on Different Quantization Values | Yarik Menchaca Resendiz et.al. | 2501.19317 | null |
| 2025-01-31 | An Efficient Approach for Machine Translation on Low-resource Languages: A Case Study in Vietnamese-Chinese | Tran Ngoc Son et.al. | 2501.19314 | null |
| 2025-01-30 | Foundational Models for 3D Point Clouds: A Survey and Outlook | Vishal Thengane et.al. | 2501.18594 | null |
| 2025-01-30 | Advances in Multimodal Adaptation and Generalization: From Traditional Approaches to Foundation Models | Hao Dong et.al. | 2501.18592 | link |
| 2025-01-30 | Thoughts Are All Over the Place: On the Underthinking of o1-Like LLMs | Yue Wang et.al. | 2501.18585 | null |
| 2025-01-30 | Prediction-Powered Inference with Imputed Covariates and Nonuniform Sampling | Dan M. Kluger et.al. | 2501.18577 | link |
| 2025-01-30 | Token-Hungry, Yet Precise: DeepSeek R1 Highlights the Need for Multi-Step Reasoning Over Speed in MATH | Evgenii Evstafev et.al. | 2501.18576 | null |
| 2025-01-30 | BounTCHA: A CAPTCHA Utilizing Boundary Identification in AI-extended Videos | Lehao Lin et.al. | 2501.18565 | null |
| 2025-01-30 | SAM2Act: Integrating Visual Foundation Model with A Memory Architecture for Robotic Manipulation | Haoquan Fang et.al. | 2501.18564 | link |
| 2025-01-30 | Semantic Web and Creative AI -- A Technical Report from ISWS 2023 | Raia Abu Ahmad et.al. | 2501.18542 | null |
| 2025-01-30 | Loss Functions and Operators Generated by f-Divergences | Vincent Roulet et.al. | 2501.18537 | null |
| 2025-01-30 | Illusions of Relevance: Using Content Injection Attacks to Deceive Retrievers, Rerankers, and LLM Judges | Manveer Singh Tamber et.al. | 2501.18536 | link |
| 2025-01-30 | Rethinking Bottlenecks in Safety Fine-Tuning of Vision Language Models | Yi Ding et.al. | 2501.18533 | null |
| 2025-01-30 | Differentially Private Steering for Large Language Model Alignment | Anmol Goel et.al. | 2501.18532 | link |
| 2025-01-30 | Learn from the Past: Language-conditioned Object Rearrangement with Large Language Models | Guanqun Cao et.al. | 2501.18516 | null |
| 2025-01-30 | Streaming DiLoCo with overlapping communication: Towards a Distributed Free Lunch | Arthur Douillard et.al. | 2501.18512 | null |
| 2025-01-30 | WILDCHAT-50M: A Deep Dive Into the Role of Synthetic Data in Post-Training | Benjamin Feuer et.al. | 2501.18511 | link |
| 2025-01-30 | CLEAR: Cue Learning using Evolution for Accurate Recognition Applied to Sustainability Data Extraction | Peter J. Bentley et.al. | 2501.18504 | null |
| 2025-01-30 | A Tool for In-depth Analysis of Code Execution Reasoning of Large Language Models | Changshu Liu et.al. | 2501.18482 | null |
| 2025-01-30 | CLoQ: Enhancing Fine-Tuning of Quantized LLMs via Calibrated LoRA Initialization | Yanxia Deng et.al. | 2501.18475 | null |
| 2025-01-30 | Tuning Vision Foundation Model via Test-Time Prompt-Guided Training for VFSS Segmentations | Chengxi Zeng et.al. | 2501.18474 | null |
| 2025-01-30 | A Benchmark and Evaluation for Real-World Out-of-Distribution Detection Using Vision-Language Models | Shiho Noda et.al. | 2501.18463 | link |
| 2025-01-29 | Learning Beyond the Surface: How Far Can Continual Pre-Training with LoRA Enhance LLMs' Domain-Specific Insight Learning? | Pouya Pezeshkpour et.al. | 2501.17840 | link |
| 2025-01-29 | Matrix Product Sketching via Coordinated Sampling | Majid Daliri et.al. | 2501.17836 | null |
| 2025-01-29 | Aggregation Schemes for Single-Vector WSI Representation Learning in Digital Pathology | Sobhan Hemati et.al. | 2501.17822 | null |
| 2025-01-29 | Leveraging Multimodal LLM for Inspirational User Interface Search | Seokhyeon Park et.al. | 2501.17799 | link |
| 2025-01-29 | BreezyVoice: Adapting TTS for Taiwanese Mandarin with Enhanced Polyphone Disambiguation -- Challenges and Insights | Chan-Jan Hsu et.al. | 2501.17790 | null |
| 2025-01-29 | Reasoning Over the Glyphs: Evaluation of LLM's Decipherment of Rare Scripts | Yu-Fei Shih et.al. | 2501.17785 | null |
| 2025-01-29 | AdditiveLLM: Large Language Models Predict Defects in Additive Manufacturing | Peter Pak et.al. | 2501.17784 | null |
| 2025-01-29 | 2SSP: A Two-Stage Framework for Structured Pruning of LLMs | Fabrizio Sandri et.al. | 2501.17771 | link |
| 2025-01-29 | Hybrid Graphs for Table-and-Text based Question Answering using LLMs | Ankush Agarwal et.al. | 2501.17767 | null |
| 2025-01-29 | On the Partitioning of GPU Power among Multi-Instances | Tirth Vamja et.al. | 2501.17752 | null |
| 2025-01-29 | Early External Safety Testing of OpenAI's o3-mini: Insights from the Pre-Deployment Evaluation | Aitor Arrieta et.al. | 2501.17749 | null |
| 2025-01-29 | A technical review of multi-omics data integration methods: from classical statistical to deep generative approaches | Ana R. Baião et.al. | 2501.17729 | null |
| 2025-01-29 | Using Code Generation to Solve Open Instances of Combinatorial Design Problems | Christopher D. Rosin et.al. | 2501.17725 | link |
| 2025-01-29 | RICoTA: Red-teaming of In-the-wild Conversation with Test Attempts | Eujeong Choi et.al. | 2501.17715 | link |
| 2025-01-29 | Critique Fine-Tuning: Learning to Critique is More Effective than Learning to Imitate | Yubo Wang et.al. | 2501.17703 | null |
| 2025-01-29 | Planning with Vision-Language Models and a Use Case in Robot-Assisted Teaching | Xuzhe Dang et.al. | 2501.17665 | null |
| 2025-01-29 | Exploring Vision Language Models for Multimodal and Multilingual Stance Detection | Jake Vasilakes et.al. | 2501.17654 | null |
| 2025-01-29 | Tonguescape: Exploring Language Models Understanding of Vowel Articulation | Haruki Sakajo et.al. | 2501.17643 | link |
| 2025-01-29 | Efficient Redundancy Reduction for Open-Vocabulary Semantic Segmentation | Lin Chen et.al. | 2501.17642 | null |
| 2025-01-29 | In-Context Meta LoRA Generation | Yihua Shao et.al. | 2501.17635 | null |
| 2025-01-28 | SFT Memorizes, RL Generalizes: A Comparative Study of Foundation Model Post-training | Tianzhe Chu et.al. | 2501.17161 | null |
| 2025-01-28 | AxBench: Steering LLMs? Even Simple Baselines Outperform Sparse Autoencoders | Zhengxuan Wu et.al. | 2501.17148 | link |
| 2025-01-28 | FactCG: Enhancing Fact Checkers with Graph-Based Multi-Hop Data | Deren Lei et.al. | 2501.17144 | link |
| 2025-01-28 | ASTRAL: Automated Safety Testing of Large Language Models | Miriam Ugarte et.al. | 2501.17132 | null |
| 2025-01-28 | Scenario Understanding of Traffic Scenes Through Large Visual Language Models | Rivera Esteban et.al. | 2501.17131 | null |
| 2025-01-28 | Histoires Morales: A French Dataset for Assessing Moral Alignment | Thibaud Leteno et.al. | 2501.17117 | link |
| 2025-01-28 | Optimizing Large Language Model Training Using FP4 Quantization | Ruizhe Wang et.al. | 2501.17116 | null |
| 2025-01-28 | Unlocking Transparent Alignment Through Enhanced Inverse Constitutional AI for Principle Extraction | Carl-Leander Henneking et.al. | 2501.17112 | null |
| 2025-01-28 | COS(M+O)S: Curiosity and RL-Enhanced MCTS for Exploring Story Space via Language Models | Tobias Materzok et.al. | 2501.17104 | null |
| 2025-01-28 | Token-by-Token Regeneration and Domain Biases: A Benchmark of LLMs on Advanced Mathematical Problem-Solving | Evgenii Evstafev et.al. | 2501.17084 | null |
| 2025-01-28 | Contextual Self-paced Learning for Weakly Supervised Spatio-Temporal Video Grounding | Akash Kumar et.al. | 2501.17053 | null |
| 2025-01-28 | How Linguistics Learned to Stop Worrying and Love the Language Models | Richard Futrell et.al. | 2501.17047 | null |
| 2025-01-28 | Enhanced Retrieval of Long Documents: Leveraging Fine-Grained Block Representations with Large Language Models | Minghan Li et.al. | 2501.17039 | null |
| 2025-01-28 | Challenges in Ensuring AI Safety in DeepSeek-R1 Models: The Shortcomings of Reinforcement Learning Strategies | Manojkumar Parmar et.al. | 2501.17030 | null |
| 2025-01-28 | Automated Refactoring of Non-Idiomatic Python Code: A Differentiated Replication with LLMs | Alessandro Midolo et.al. | 2501.17024 | link |
| 2025-01-28 | Mobile Manipulation Instruction Generation from Multiple Images with Automatic Metric Enhancement | Kei Katsumata et.al. | 2501.17022 | link |
| 2025-01-28 | Large Language Models for Code Generation: The Practitioners Perspective | Zeeshan Rasheed et.al. | 2501.16998 | link |
| 2025-01-28 | Artificial Intelligence Clones | Annie Liang et.al. | 2501.16996 | null |
| 2025-01-28 | FedEFM: Federated Endovascular Foundation Model with Unseen Data | Tuong Do et.al. | 2501.16992 | null |
| 2025-01-28 | Modulating CNN Features with Pre-Trained ViT Representations for Open-Vocabulary Object Detection | Xiangyu Gao et.al. | 2501.16981 | null |
| 2025-01-27 | LUCY: Linguistic Understanding and Control Yielding Early Stage of Her | Heting Gao et.al. | 2501.16327 | link |
| 2025-01-27 | Evaluating The Performance of Using Large Language Models to Automate Summarization of CT Simulation Orders in Radiation Oncology | Meiyun Cao et.al. | 2501.16309 | null |
| 2025-01-27 | RAPID: Retrieval-Augmented Parallel Inference Drafting for Text-Based Video Event Retrieval | Long Nguyen et.al. | 2501.16303 | null |
| 2025-01-27 | Matryoshka Re-Ranker: A Flexible Re-Ranking Architecture With Configurable Depth and Width | Zheng Liu et.al. | 2501.16302 | null |
| 2025-01-27 | Large Models in Dialogue for Active Perception and Anomaly Detection | Tzoulio Chamiti et.al. | 2501.16300 | link |
| 2025-01-27 | FALCON: Resolving Visual Redundancy and Fragmentation in High-resolution Multimodal Large Language Models via Visual Registers | Renshan Zhang et.al. | 2501.16297 | null |
| 2025-01-27 | Brain-Adapter: Enhancing Neurological Disorder Analysis with Adapter-Tuning Multimodal Large Language Models | Jing Zhang et.al. | 2501.16282 | null |
| 2025-01-27 | Do LLMs Have Visualization Literacy? An Evaluation on Modified Visualizations to Test Generalization in Data Interpretation | Jiayi Hong et.al. | 2501.16277 | link |
| 2025-01-27 | URAG: Implementing a Unified Hybrid RAG for Precise Answers in University Admission Chatbots -- A Case Study at HCMUT | Long Nguyen et.al. | 2501.16276 | null |
| 2025-01-27 | Return of the Encoder: Maximizing Parameter Efficiency for SLMs | Mohamed Elfeki et.al. | 2501.16273 | link |
| 2025-01-27 | A foundation model for human-AI collaboration in medical literature mining | Zifeng Wang et.al. | 2501.16255 | null |
| 2025-01-27 | Multi-Agent Geospatial Copilots for Remote Sensing Workflows | Chaehong Lee et.al. | 2501.16254 | null |
| 2025-01-27 | Zero-Shot Decision Tree Construction via Large Language Models | Lucas Carrasco et.al. | 2501.16247 | null |
| 2025-01-27 | CLISC: Bridging clip and sam by enhanced cam for unsupervised brain tumor segmentation | Xiaochuan Ma et.al. | 2501.16246 | null |
| 2025-01-27 | Phase Transitions in Large Language Models and the | Youran Sun et.al. | 2501.16241 | null |
| 2025-01-27 | AiGet: Transforming Everyday Moments into Hidden Knowledge Discovery with AI Assistance on Smart Glasses | Runze Cai et.al. | 2501.16240 | link |
| 2025-01-27 | Distilling foundation models for robust and efficient models in digital pathology | Alexandre Filiot et.al. | 2501.16239 | null |
| 2025-01-27 | Language-Based Bayesian Optimization Research Assistant (BORA) | Abdoulatif Cissé et.al. | 2501.16224 | null |
| 2025-01-27 | Enhancing Visual Inspection Capability of Multi-Modal Large Language Models on Medical Time Series with Supportive Conformalized and Interpretable Small Specialized Models | Huayu Li et.al. | 2501.16215 | link |
| 2025-01-27 | Provence: efficient and robust context pruning for retrieval-augmented generation | Nadezhda Chirkova et.al. | 2501.16214 | null |
| 2025-01-24 | HERMES: A Unified Self-Driving World Model for Simultaneous 3D Scene Understanding and Generation | Xin Zhou et.al. | 2501.14729 | link |
| 2025-01-24 | Do LLMs Provide Consistent Answers to Health-Related Questions across Languages? | Ipek Baris Schlicht et.al. | 2501.14719 | null |
| 2025-01-24 | Towards Better Understanding Table Instruction Tuning: Decoupling the Effects from Data versus Models | Naihao Deng et.al. | 2501.14717 | null |
| 2025-01-24 | FlexiGPT: Pruning and Extending Large Language Models with Low-Rank Weight Sharing | James Seale Smith et.al. | 2501.14713 | null |
| 2025-01-24 | The Karp Dataset | Mason DiCicco et.al. | 2501.14705 | null |
| 2025-01-24 | Rethinking Table Instruction Tuning | Naihao Deng et.al. | 2501.14693 | null |
| 2025-01-24 | Rethinking Foundation Models for Medical Image Classification through a Benchmark Study on MedMNIST | Fuping Wu et.al. | 2501.14685 | null |
| 2025-01-24 | An Empirical Study on LLM-based Classification of Requirements-related Provisions in Food-safety Regulations | Shabnam Hassani et.al. | 2501.14683 | null |
| 2025-01-24 | Diffusion based Text-to-Music Generation with Global and Local Text based Conditioning | Jisi Zhang et.al. | 2501.14680 | null |
| 2025-01-24 | MedAgentBench: Dataset for Benchmarking LLMs as Agents in Medical Applications | Yixing Jiang et.al. | 2501.14654 | link |
| 2025-01-24 | Investigating the (De)Composition Capabilities of Large Language Models in Natural-to-Formal Language Conversion | Ziyao Xu et.al. | 2501.14649 | link |
| 2025-01-24 | Recommending Actionable Strategies: A Semantic Approach to Integrating Analytical Frameworks with Decision Heuristics | Renato Ghisellini et.al. | 2501.14634 | null |
| 2025-01-24 | Extracting Problem Structure with LLMs for Optimized SAT Local Search | André Schilder et.al. | 2501.14630 | null |
| 2025-01-24 | ReferDINO: Referring Video Object Segmentation with Visual Grounding Foundations | Tianming Liang et.al. | 2501.14607 | null |
| 2025-01-24 | Knowledge Graphs Construction from Criminal Court Appeals: Insights from the French Cassation Court | Alexander V. Belikov et.al. | 2501.14579 | null |
| 2025-01-24 | ZETA: Leveraging Z-order Curves for Efficient Top-k Attention | Qiuhao Zeng et.al. | 2501.14577 | null |
| 2025-01-24 | Large-scale and Fine-grained Vision-language Pre-training for Enhanced CT Image Understanding | Zhongyi Shui et.al. | 2501.14548 | link |
| 2025-01-24 | Leveraging ChatGPT's Multimodal Vision Capabilities to Rank Satellite Images by Poverty Level: Advancing Tools for Social Science Research | Hamid Sarmadi et.al. | 2501.14546 | null |
| 2025-01-24 | VERUS-LM: a Versatile Framework for Combining LLMs with Symbolic Reasoning | Benjamin Callewaert et.al. | 2501.14540 | null |
| 2025-01-24 | Design and Implementation of a Psychiatry Resident Training System Based on Large Language Models | Zhenguang Zhong et.al. | 2501.14530 | link |
| 2025-01-23 | CRPO: Confidence-Reward Driven Preference Optimization for Machine Translation | Guofeng Cui et.al. | 2501.13927 | null |
| 2025-01-23 | The Breeze 2 Herd of Models: Traditional Chinese LLMs Based on Llama with Vision-Aware and Function-Calling Capabilities | Chan-Jan Hsu et.al. | 2501.13921 | link |
| 2025-01-23 | Analysis of Indic Language Capabilities in LLMs | Aatman Vaidya et.al. | 2501.13912 | null |
| 2025-01-23 | Privacy-Preserving Personalized Federated Prompt Learning for Multimodal Large Language Models | Linh Tran et.al. | 2501.13904 | null |
| 2025-01-23 | Exploring Finetuned Audio-LLM on Heart Murmur Features | Adrian Florea et.al. | 2501.13884 | null |
| 2025-01-23 | The machine learning platform for developers of large systems | Alexey Naikov et.al. | 2501.13881 | null |
| 2025-01-23 | A RAG-Based Institutional Assistant | Gustavo Kuratomi et.al. | 2501.13880 | null |
| 2025-01-23 | Dual-Modal Prototype Joint Learning for Compositional Zero-Shot Learning | Shiyu Zhang et.al. | 2501.13859 | null |
| 2025-01-23 | Large Vision-Language Models for Knowledge-Grounded Data Annotation of Memes | Shiling Deng et.al. | 2501.13851 | link |
| 2025-01-23 | Think Outside the Data: Colonial Biases and Systemic Issues in Automated Moderation Pipelines for Low-Resource Languages | Farhana Shahid et.al. | 2501.13836 | null |
| 2025-01-23 | On the Reasoning Capacity of AI Models and How to Quantify It | Santosh Kumar Radha et.al. | 2501.13833 | null |
| 2025-01-23 | Predicting Compact Phrasal Rewrites with Large Language Models for ASR Post Editing | Hao Zhang et.al. | 2501.13831 | null |
| 2025-01-23 | Hallucinations Can Improve Large Language Models in Drug Discovery | Shuzhou Yuan et.al. | 2501.13824 | null |
| 2025-01-23 | Large Language Model driven Policy Exploration for Recommender Systems | Jie Wang et.al. | 2501.13816 | null |
| 2025-01-23 | Enhancing LLMs for Governance with Human Oversight: Evaluating and Aligning LLMs on Expert Classification of Climate Misinformation for Detecting False or Misleading Claims about Climate Change | Mowafak Allaham et.al. | 2501.13802 | null |
| 2025-01-23 | PromptMono: Cross Prompting Attention for Self-Supervised Monocular Depth Estimation in Challenging Environments | Changhao Wang et.al. | 2501.13796 | null |
| 2025-01-23 | Training-Free Zero-Shot Temporal Action Detection with Vision-Language Models | Chaolei Han et.al. | 2501.13795 | link |
| 2025-01-23 | Parameter-Efficient Fine-Tuning for Foundation Models | Dan Zhang et.al. | 2501.13787 | link |
| 2025-01-23 | Not Every AI Problem is a Data Problem: We Should Be Intentional About Data Scaling | Tanya Rodchenko et.al. | 2501.13779 | null |
| 2025-01-23 | Explainable XR: Understanding User Behaviors of XR Environments using LLM-assisted Analytics Framework | Yoonsang Kim et.al. | 2501.13778 | link |
| 2025-01-22 | VideoLLaMA 3: Frontier Multimodal Foundation Models for Image and Video Understanding | Boqiang Zhang et.al. | 2501.13106 | link |
| 2025-01-22 | Refining Input Guardrails: Enhancing LLM-as-a-Judge Efficiency Through Chain-of-Thought Fine-Tuning and Alignment | Melissa Kazemi Rad et.al. | 2501.13080 | null |
| 2025-01-22 | Autonomy-of-Experts Models | Ang Lv et.al. | 2501.13074 | null |
| 2025-01-22 | Does Table Source Matter? Benchmarking and Improving Multimodal Scientific Table Understanding and Reasoning | Bohao Yang et.al. | 2501.13042 | link |
| 2025-01-22 | Pairwise RM: Perform Best-of-N Sampling with Knockout Tournament | Yantao Liu et.al. | 2501.13007 | link |
| 2025-01-22 | Large Language Model-Based Semantic Communication System for Image Transmission | Soheyb Ribouh et.al. | 2501.12988 | null |
| 2025-01-22 | LLM4WM: Adapting LLM for Wireless Multi-Tasking | Xuanyu Liu et.al. | 2501.12983 | null |
| 2025-01-22 | OnionEval: An Unified Evaluation of Fact-conflicting Hallucination for Small-Large Language Models | Chongren Sun et.al. | 2501.12975 | link |
| 2025-01-22 | Accessible Smart Contracts Verification: Synthesizing Formal Models with Tamed LLMs | Jan Corazza et.al. | 2501.12972 | link |
| 2025-01-22 | It's complicated. The relationship of algorithmic fairness and non-discrimination regulations in the EU AI Act | Kristof Meding et.al. | 2501.12962 | null |
| 2025-01-22 | Efficient Prompt Compression with Evaluator Heads for Long-Context Transformer Inference | Weizhi Fei et.al. | 2501.12959 | null |
| 2025-01-22 | GANQ: GPU-Adaptive Non-Uniform Quantization for Large Language Models | Pengxiang Zhao et.al. | 2501.12956 | null |
| 2025-01-22 | Correctness Assessment of Code Generated by Large Language Models Using Internal Representations | Tuan-Dung Bui et.al. | 2501.12934 | link |
| 2025-01-22 | DynamicEarth: How Far are We from Open-Vocabulary Change Detection? | Kaiyu Li et.al. | 2501.12931 | null |
| 2025-01-22 | A Functional Software Reference Architecture for LLM-Integrated Systems | Alessio Bucaioni et.al. | 2501.12904 | null |
| 2025-01-22 | Architectural Fusion Through Contextual Partitioning in Large Language Models: A Novel Approach to Parameterized Knowledge Integration | Offa Kingsleigh et.al. | 2501.12901 | null |
| 2025-01-22 | Test-Time Preference Optimization: On-the-Fly Alignment via Iterative Textual Feedback | Yafu Li et.al. | 2501.12895 | link |
| 2025-01-22 | Generative AI Misuse Potential in Cyber Security Education: A Case Study of a UK Degree Program | Carlton Shepherd et.al. | 2501.12883 | null |
| 2025-01-22 | WisdomBot: Tuning Large Language Models with Artificial Intelligence Knowledge | Jingyuan Chen et.al. | 2501.12877 | null |
| 2025-01-22 | HierPromptLM: A Pure PLM-based Framework for Representation Learning on Heterogeneous Text-rich Networks | Qiuyu Zhu et.al. | 2501.12857 | null |
| 2025-01-21 | InternVideo2.5: Empowering Video MLLMs with Long and Rich Context Modeling | Yi Wang et.al. | 2501.12386 | link |
| 2025-01-21 | MMVU: Measuring Expert-Level Multi-Discipline Video Understanding | Yilun Zhao et.al. | 2501.12380 | link |
| 2025-01-21 | Expertise elevates AI usage: experimental evidence comparing laypeople and professional artists | Thomas F. Eisenmann et.al. | 2501.12374 | link |
| 2025-01-21 | Is Long Context All You Need? Leveraging LLM's Extended Context for NL2SQL | Yeounoh Chung et.al. | 2501.12372 | link |
| 2025-01-21 | Parameters vs FLOPs: Scaling Laws for Optimal Sparsity for Mixture-of-Experts Language Models | Samira Abnar et.al. | 2501.12370 | null |
| 2025-01-21 | InternLM-XComposer2.5-Reward: A Simple Yet Effective Multi-Modal Reward Model | Yuhang Zang et.al. | 2501.12368 | link |
| 2025-01-21 | Vision-Language Models for Automated Chest X-ray Interpretation: Leveraging ViT and GPT-2 | Md. Rakibul Islam et.al. | 2501.12356 | null |
| 2025-01-21 | Automatic Labelling with Open-source LLMs using Dynamic Label Schema Integration | Thomas Walshe et.al. | 2501.12332 | null |
| 2025-01-21 | Cinepro: Robust Training of Foundation Models for Cancer Detection in Prostate Ultrasound Cineloops | Mohamed Harmanani et.al. | 2501.12331 | link |
| 2025-01-21 | VARGPT: Unified Understanding and Generation in a Visual Autoregressive Multimodal Large Language Model | Xianwei Zhuang et.al. | 2501.12327 | link |
| 2025-01-21 | LLM-Assisted Knowledge Graph Completion for Curriculum and Domain Modelling in Personalized Higher Education Recommendations | Hasan Abu-Rasheed et.al. | 2501.12300 | null |
| 2025-01-21 | MoGERNN: An Inductive Traffic Predictor for Unobserved Locations in Dynamic Sensing Networks | Qishen Zhou et.al. | 2501.12281 | link |
| 2025-01-21 | Condor: Enhance LLM Alignment with Knowledge-Driven Data Synthesis and Refinement | Maosong Cao et.al. | 2501.12273 | link |
| 2025-01-21 | CBVLM: Training-free Explainable Concept-based Large Vision Language Models for Medical Image Classification | Cristiano Patrício et.al. | 2501.12266 | null |
| 2025-01-21 | FOCUS: First Order Concentrated Updating Scheme | Yizhou Liu et.al. | 2501.12243 | null |
| 2025-01-21 | InsTALL: Context-aware Instructional Task Assistance with Multi-modal Large Language Models | Pha Nguyen et.al. | 2501.12231 | null |
| 2025-01-21 | CDW-CoT: Clustered Distance-Weighted Chain-of-Thoughts Reasoning | Yuanheng Fang et.al. | 2501.12226 | null |
| 2025-01-21 | Leveraging Large Language Models for Realizing Truly Intelligent User Interfaces | Allard Oelen et.al. | 2501.12221 | null |
| 2025-01-21 | You Can't Eat Your Cake and Have It Too: The Performance Degradation of LLMs with Jailbreak Defense | Wuyuao Mai et.al. | 2501.12210 | null |
| 2025-01-21 | Fixing Imbalanced Attention to Mitigate In-Context Hallucination of Large Vision-Language Model | Kazi Hasan Ibn Arif et.al. | 2501.12206 | link |
| 2025-01-17 | FaceXBench: Evaluating Multimodal LLMs on Face Understanding | Kartik Narayan et.al. | 2501.10360 | link |
| 2025-01-17 | Agent4Edu: Generating Learner Response Data by Generative Agents for Intelligent Education Systems | Weibo Gao et.al. | 2501.10332 | link |
| 2025-01-17 | BoK: Introducing Bag-of-Keywords Loss for Interpretable Dialogue Response Generation | Suvodip Dey et.al. | 2501.10328 | link |
| 2025-01-17 | Large language models for automated scholarly paper review: A survey | Zhenzhen Zhuang et.al. | 2501.10326 | null |
| 2025-01-17 | Hierarchical Autoregressive Transformers: Combining Byte- and Word-Level Processing for Robust, Adaptable Language Models | Pit Neitemeier et.al. | 2501.10322 | null |
| 2025-01-17 | HiMix: Reducing Computational Complexity in Large Vision-Language Models | Xuange Zhang et.al. | 2501.10318 | null |
| 2025-01-17 | Addressing Popularity Bias in Third-Party Library Recommendations Using LLMs | Claudio Di Sipio et.al. | 2501.10313 | null |
| 2025-01-17 | Computational Protein Science in the Era of Large Language Models (LLMs) | Wenqi Fan et.al. | 2501.10282 | null |
| 2025-01-17 | Test Wars: A Comparative Study of SBST, Symbolic Execution, and LLM-Based Approaches to Unit Test Generation | Azat Abdullin et.al. | 2501.10200 | null |
| 2025-01-17 | Generative Artificial Intelligence: Implications for Biomedical and Health Professions Education | William Hersh et.al. | 2501.10186 | null |
| 2025-01-17 | Multi-stage Training of Bilingual Islamic LLM for Neural Passage Retrieval | Vera Pavlova et.al. | 2501.10175 | null |
| 2025-01-17 | Dual Debiasing: Remove Stereotypes and Keep Factual Gender for Fair Language Modeling and Translation | Tomasz Limisiewicz et.al. | 2501.10150 | null |
| 2025-01-17 | A Vision-Language Framework for Multispectral Scene Representation Using Language-Grounded Features | Enes Karanfil et.al. | 2501.10144 | null |
| 2025-01-17 | Exploring the Impact of Generative Artificial Intelligence in Education: A Thematic Analysis | Abhishek Kaushik et.al. | 2501.10134 | null |
| 2025-01-17 | ComplexFuncBench: Exploring Multi-Step and Constrained Function Calling under Long-Context Scenario | Lucen Zhong et.al. | 2501.10132 | link |
| 2025-01-17 | PaSa: An LLM Agent for Comprehensive Academic Paper Search | Yichen He et.al. | 2501.10120 | link |
| 2025-01-17 | LLM Reasoner and Automated Planner: A new NPC approach | Israel Puerta-Merino et.al. | 2501.10106 | null |
| 2025-01-17 | Universal Actions for Enhanced Embodied Foundation Models | Jinliang Zheng et.al. | 2501.10105 | link |
| 2025-01-17 | Few-shot Structure-Informed Machinery Part Segmentation with Foundation Models and Graph Neural Networks | Michael Schwingshackl et.al. | 2501.10080 | link |
| 2025-01-17 | SpatialCoT: Advancing Spatial Reasoning through Coordinate Alignment and Chain-of-Thought for Embodied Task Planning | Yuecheng Liu et.al. | 2501.10074 | null |
| 2025-01-16 | Distilling Multi-modal Large Language Models for Autonomous Driving | Deepti Hegde et.al. | 2501.09757 | null |
| 2025-01-16 | Lost in Translation, Found in Context: Sign Language Translation with Contextual Cues | Youngjoon Jang et.al. | 2501.09754 | null |
| 2025-01-16 | OmniThink: Expanding Knowledge Boundaries in Machine Writing through Thinking | Zekun Xi et.al. | 2501.09751 | link |
| 2025-01-16 | Enhancing Lexicon-Based Text Embeddings with Large Language Models | Yibin Lei et.al. | 2501.09749 | null |
| 2025-01-16 | Suggesting Code Edits in Interactive Machine Learning Notebooks Using Large Language Models | Bihui Jin et.al. | 2501.09745 | null |
| 2025-01-16 | Inference-Time Scaling for Diffusion Models beyond Scaling Denoising Steps | Nanye Ma et.al. | 2501.09732 | null |
| 2025-01-16 | A Simple Aerial Detection Baseline of Multimodal Language Models | Qingyun Li et.al. | 2501.09720 | link |
| 2025-01-16 | CyberMentor: AI Powered Learning Tool Platform to Address Diverse Student Needs in Cybersecurity Education | Tianyu Wang et.al. | 2501.09709 | link |
| 2025-01-16 | Domain Adaptation of Foundation LLMs for e-Commerce | Christian Herold et.al. | 2501.09706 | null |
| 2025-01-16 | Cueless EEG imagined speech for subject identification: dataset and benchmarks | Ali Derakhshesh et.al. | 2501.09700 | link |
| 2025-01-16 | Mitigating Hallucinations in Large Vision-Language Models via DPO: On-Policy Data Hold the Key | Zhihe Yang et.al. | 2501.09695 | link |
| 2025-01-16 | Simulated Interactive Debugging | Yannic Noller et.al. | 2501.09694 | null |
| 2025-01-16 | Towards Large Reasoning Models: A Survey of Reinforced Reasoning with Large Language Models | Fengli Xu et.al. | 2501.09686 | null |
| 2025-01-16 | Reward-Guided Controlled Generation for Inference-Time Alignment in Diffusion Models: Tutorial and Review | Masatoshi Uehara et.al. | 2501.09685 | null |
| 2025-01-16 | Robin: a Suite of Multi-Scale Vision-Language Models and the CHIRP Evaluation Benchmark | Alexis Roger et.al. | 2501.09672 | null |
| 2025-01-16 | A Survey of Research in Large Language Models for Electronic Design Automation | Jingyu Pan et.al. | 2501.09655 | null |
| 2025-01-16 | The Heap: A Contamination-Free Multilingual Code Dataset for Evaluating Large Language Models | Jonathan Katzy et.al. | 2501.09653 | null |
| 2025-01-16 | CarMem: Enhancing Long-Term Memory in LLM Voice Assistants through Category-Bounding | Johannes Kirmayr et.al. | 2501.09645 | link |
| 2025-01-16 | LLM-Based Routing in Mixture of Experts: A Novel Framework for Trading | Kuan-Ming Liu et.al. | 2501.09636 | null |
| 2025-01-16 | Empowering Large Language Models in Wireless Communication: A Novel Dataset and Fine-Tuning Framework | Yushen Lin et.al. | 2501.09631 | null |
| 2025-01-15 | Towards Fast, Specialized Machine Learning Force Fields: Distilling Foundation Models via Energy Hessians | Ishan Amin et.al. | 2501.09009 | link |
| 2025-01-15 | Aegis2.0: A Diverse AI Safety Dataset and Risks Taxonomy for Alignment of LLM Guardrails | Shaona Ghosh et.al. | 2501.09004 | null |
| 2025-01-15 | Vision Foundation Models for Computed Tomography | Suraj Pai et.al. | 2501.09001 | link |
| 2025-01-15 | CityLoc: 6 DoF Localization of Text Descriptions in Large-Scale Scenes with Gaussian Representation | Qi Ma et.al. | 2501.08982 | null |
| 2025-01-15 | Development and Validation of the Provider Documentation Summarization Quality Instrument for Large Language Models | Emma Croxford et.al. | 2501.08977 | null |
| 2025-01-15 | Learning to Extract Cross-Domain Aspects and Understanding Sentiments Using Large Language Models | Karukriti Kaushik Ghosh et.al. | 2501.08974 | null |
| 2025-01-15 | Analyzing the Ethical Logic of Six Large Language Models | W. Russell Neuman et.al. | 2501.08951 | null |
| 2025-01-15 | Applying General Turn-taking Models to Conversational Human-Robot Interaction | Gabriel Skantze et.al. | 2501.08946 | null |
| 2025-01-15 | Disentangling Exploration of Large Language Models by Optimal Exploitation | Tim Grams et.al. | 2501.08925 | null |
| 2025-01-15 | GenAI Content Detection Task 3: Cross-Domain Machine-Generated Text Detection Challenge | Liam Dugan et.al. | 2501.08913 | link |
| 2025-01-15 | Leveraging Large Language Models as Knowledge-Driven Agents for Reliable Retrosynthesis Planning | Qinyu Ma et.al. | 2501.08897 | link |
| 2025-01-15 | Generative Planning with 3D-vision Language Pre-training for End-to-End Autonomous Driving | Tengpeng Li et.al. | 2501.08861 | link |
| 2025-01-15 | Exploring Task-Level Optimal Prompts for Visual In-Context Learning | Yan Zhu et.al. | 2501.08841 | null |
| 2025-01-15 | IDEA: Image Description Enhanced CLIP-Adapter | Zhipeng Ye et.al. | 2501.08816 | link |
| 2025-01-15 | How Developers Interact with AI: A Taxonomy of Human-AI Collaboration in Software Engineering | Christoph Treude et.al. | 2501.08774 | null |
| 2025-01-15 | Admitting Ignorance Helps the Video Question Answering Models to Answer | Haopeng Li et.al. | 2501.08771 | null |
| 2025-01-15 | Enhanced Large Language Models for Effective Screening of Depression and Anxiety | June M. Liu et.al. | 2501.08769 | null |
| 2025-01-15 | Leveraging LLM Agents for Translating Network Configurations | Yunze Wei et.al. | 2501.08760 | null |
| 2025-01-15 | Expanding Vietnamese SentiWordNet to Improve Performance of Vietnamese Sentiment Analysis Models | Hong-Viet Tran et.al. | 2501.08758 | null |
| 2025-01-15 | The Inherent Limits of Pretrained LLMs: The Unexpected Convergence of Instruction Tuning and In-Context Learning Capabilities | Irina Bigoulaeva et.al. | 2501.08716 | link |
| 2025-01-14 | PokerBench: Training Large Language Models to become Professional Poker Players | Richard Zhuang et.al. | 2501.08328 | link |
| 2025-01-14 | Omni-RGPT: Unifying Image and Video Region-level Understanding via Token Marks | Miran Heo et.al. | 2501.08326 | null |
| 2025-01-14 | ADAM-1: AI and Bioinformatics for Alzheimer's Detection and Microbiome-Clinical Data Integrations | Ziyuan Huang et.al. | [2501.08324](http://arxiv.org/abs/2501.0 |