GitHub - Xuchen-Li/cv-arxiv-daily: Automatically update arXiv papers about SOT & VLT, Multi-modal Learning, LLM and Video Understanding using GitHub Actions.

Updated on 2026.02.11

Table of Contents

Single Object & Visual Language Tracking
Large Language Model
Video Understanding
Multi-modal Learning

Single Object & Visual Language Tracking

Publish Date	Title	Authors	PDF	Code
2025-07-22	Explicit Context Reasoning with Supervision for Visual Tracking	Fansheng Zeng et.al.	2507.16191	null
2025-07-21	Is Tracking really more challenging in First Person Egocentric Vision?	Matteo Dunnhofer et.al.	2507.16015	null
2025-07-23	EndoControlMag: Robust Endoscopic Vascular Motion Magnification with Periodic Reference Resetting and Hierarchical Tissue-aware Dual-Mask Control	An Wang et.al.	2507.15292	null
2025-07-11	SAM2RL: Towards Reinforcement Learning Memory Control in Segment Anything Model 2	Alen Adamyan et.al.	2507.08548	null
2025-07-10	Temporal Unlearnable Examples: Preventing Personal Video Data from Unauthorized Exploitation by Object Tracking	Qiangqiang Wu et.al.	2507.07483	null
2025-07-09	Token Bottleneck: One Token to Remember Dynamics	Taekyung Kim et.al.	2507.06543	null
2025-07-08	What You Have is What You Track: Adaptive and Robust Multimodal Tracking	Yuedong Tan et.al.	2507.05899	null
2025-07-08	Stable Tracking-in-the-Loop Control of Cable-Driven Surgical Manipulators under Erroneous Kinematic Chains	Neelay Joglekar et.al.	2507.05663	null
2025-07-07	Spatial and Semantic Embedding Integration for Stereo Sound Event Localization and Detection in Regular Videos	Davide Berghi et.al.	2507.04845	null
2025-07-05	Sensitive and accurate femtosecond pulse characterization via two-photon absorption in Fabry-Pérot laser diodes	Adrian F. Chlebowski et.al.	2507.03978	null
2025-07-01	UMDATrack: Unified Multi-Domain Adaptive Tracking Under Adverse Weather Conditions	Siyuan Yao et.al.	2507.00648	null
2025-07-01	ATSTrack: Enhancing Visual-Language Tracking by Aligning Temporal and Spatial Scales	Yihao Zhen et.al.	2507.00454	null
2025-06-30	Mamba-FETrack V2: Revisiting State Space Model for Frame-Event based Visual Object Tracking	Shiao Wang et.al.	2506.23783	null
2025-07-22	R1-Track: Direct Application of MLLMs to Visual Object Tracking via Reinforcement Learning	Biao Wang et.al.	2506.21980	null
2025-06-25	Exploiting Lightweight Hierarchical ViT and Dynamic Framework for Efficient Visual Tracking	Ben Kang et.al.	2506.20381	null
2025-06-17	Comparison of Two Methods for Stationary Incident Detection Based on Background Image	Deepak Ghimire et.al.	2506.14256	null
2025-06-03	MVTD: A Benchmark Dataset for Maritime Visual Object Tracking	Ahsan Baidar Bakht et.al.	2506.02866	null
2025-05-31	Towards Effective and Efficient Adversarial Defense with Diffusion Models for Robust Visual Tracking	Long Xu et.al.	2506.00325	link
2025-05-29	CLDTracker: A Comprehensive Language Description for Visual Tracking	Mohamad Alansari et.al.	2505.23704	link
2025-05-29	TrackVLA: Embodied Visual Tracking in the Wild	Shaoan Wang et.al.	2505.23189	null
2025-05-28	TwinTrack: Bridging Vision and Contact Physics for Real-Time Tracking of Unknown Dynamic Objects	Wen Yang et.al.	2505.22882	null
2025-05-27	Fully Spiking Neural Networks for Unified Frame-Event Object Tracking	Jingjun Yang et.al.	2505.20834	null
2025-05-28	VLM Can Be a Good Assistant: Enhancing Embodied Visual Tracking with Self-Improving Vision-Language Models	Kui Wu et.al.	2505.20718	null
2025-05-27	Hierarchical Instruction-aware Embodied Visual Tracking	Kui Wu et.al.	2505.20710	null
2025-06-01	HAND Me the Data: Fast Robot Adaptation via Hand Path Retrieval	Matthew Hong et.al.	2505.20455	null
2025-05-28	Progressive Scaling Visual Object Tracking	Jack Hong et.al.	2505.19990	null
2025-05-23	Adapting SAM 2 for Visual Object Tracking: 1st Place Solution for MMVPR Challenge Multi-Modal Tracking	Cheng-Yen Yang et.al.	2505.18111	null
2025-05-22	Efficient Motion Prompt Learning for Robust Visual Tracking	Jie Zhao et.al.	2505.16321	link
2025-05-19	Towards Low-Latency Event Stream-based Visual Object Tracking: A Slow-Fast Approach	Shiao Wang et.al.	2505.12903	link
2025-05-13	Towards Adaptive Meta-Gradient Adversarial Examples for Visual Tracking	Wei-Long Tian et.al.	2505.08999	link
2025-05-11	DeepSORT-Driven Visual Tracking Approach for Gesture Recognition in Interactive Systems	Tong Zhang et.al.	2505.07110	null
2025-05-09	CGTrack: Cascade Gating Network with Hierarchical Feature Aggregation for UAV Tracking	Weihong Li et.al.	2505.05936	link
2025-05-07	Predicting Road Surface Anomalies by Visual Tracking of a Preceding Vehicle	Petr Jahoda et.al.	2505.04392	null
2025-04-19	Adversarial Attack for RGB-Event based Visual Object Tracking	Qiang Chen et.al.	2504.14423	link
2025-05-05	SAM2MOT: A Novel Paradigm of Multi-Object Tracking by Segmentation	Junjie Jiang et.al.	2504.04519	link
2025-03-24	SPMTrack: Spatio-Temporal Parameter-Efficient Fine-Tuning with Mixture of Experts for Scalable Visual Tracking	Wenrui Cai et.al.	2503.18338	link
2025-03-22	MUST: The First Dataset and Unified Framework for Multispectral UAV Single Object Tracking	Haolin Qin et.al.	2503.17699	link
2025-03-21	Dynamic Attention Mechanism in Spatiotemporal Memory Networks for Object Tracking	Meng Zhou et.al.	2503.16768	null
2025-03-17	UncTrack: Reliable Visual Object Tracking with Uncertainty-Aware Prototype Memory Network	Siyuan Yao et.al.	2503.12888	link
2025-03-16	A Plug-and-Play Learning-based IMU Bias Factor for Robust Visual-Inertial Odometry	Yang Yi et.al.	2503.12527	null
2025-03-14	Towards General Multimodal Visual Tracking	Andong Lu et.al.	2503.11218	null
2025-03-09	Similarity-Guided Layer-Adaptive Vision Transformer for UAV Tracking	Chaocan Xue et.al.	2503.06625	link
2025-03-09	Dynamic Updates for Language Adaptation in Visual-Language Tracking	Xiaohai Li et.al.	2503.06621	link
2025-02-28	Technical Report for ReID-SAM on SkiTB Visual Tracking Challenge 2025	Kunjun Li et.al.	2503.01907	null
2025-03-01	Two-stream Beats One-stream: Asymmetric Siamese Network for Efficient Visual Tracking	Jiawen Zhu et.al.	2503.00516	link
2025-02-27	MITracker: Multi-View Integration for Visual Object Tracking	Mengjie Xu et.al.	2502.20111	null
2025-02-27	CFTrack: Enhancing Lightweight Visual Tracking through Contrastive Learning and Feature Matching	Juntao Liang et.al.	2502.19705	null
2025-02-26	Enhanced Transformer-Based Tracking for Skiing Events: Overcoming Multi-Camera Challenges, Scale Variations and Rapid Motion -- SkiTB Visual Tracking Challenge 2025	Akhil Penta et.al.	2502.18867	null
2025-02-25	UASTrack: A Unified Adaptive Selection Framework with Modality-Customization in Single Object Tracking	He Wang et.al.	2502.18220	null
2025-02-08	Event Stream-based Visual Object Tracking: HDETrack V2 and A High-Definition Benchmark	Shiao Wang et.al.	2502.05574	link
2025-01-13	Robust Single Object Tracking in LiDAR Point Clouds under Adverse Weather Conditions	Xiantong Zhao et.al.	2501.07133	null
2025-01-05	DeTrack: In-model Latent Denoising Learning for Visual Object Tracking	Xinyu Zhou et.al.	2501.02467	null
2025-01-13	FusionSORT: Fusion Methods for Online Multi-object Visual Tracking	Nathanael L. Baisa et.al.	2501.00843	link
2025-01-01	Less is More: Token Context-aware Learning for Object Tracking	Chenlong Xu et.al.	2501.00758	link
2024-12-28	Learning Adaptive and View-Invariant Vision Transformer with Multi-Teacher Knowledge Distillation for Real-Time UAV Tracking	You Wu et.al.	2412.20002	link
2024-12-26	SUTrack: Towards Simple and Unified Single Object Tracking	Xin Chen et.al.	2412.19138	link
2024-12-15	Exploring Enhanced Contextual Information for Video-Level Object Tracking	Ben Kang et.al.	2412.11023	link
2024-12-13	Visual Object Tracking across Diverse Data Modalities: A Review	Mengmeng Wang et.al.	2412.09991	null
2025-03-07	MVCTrack: Boosting 3D Point Cloud Tracking via Multimodal-Guided Virtual Cues	Zhaofeng Hu et.al.	2412.02734	link
2024-12-03	GSOT3D: Towards Generic 3D Single Object Tracking in the Wild	Yifan Jiao et.al.	2412.02129	link
2025-02-06	Improving Accuracy and Generalization for Efficient Visual Tracking	Ram Zaveri et.al.	2411.18855	null
2024-11-27	A comparison of extended object tracking with multi-modal sensors in indoor environment	Jiangtao Shuai et.al.	2411.18476	null
2024-12-04	A Distractor-Aware Memory for Visual Object Tracking with SAM2	Jovana Videnovic et.al.	2411.17576	link
2024-11-23	How Texts Help? A Fine-grained Evaluation to Reveal the Role of Language in Vision-Language Tracking	Xuchen Li et.al.	2411.15600	null
2024-11-24	ClickTrack: Towards Real-time Interactive Single Object Tracking	Kuiran Wang et.al.	2411.13183	null
2024-11-30	SAMURAI: Adapting Segment Anything Model for Zero-Shot Visual Tracking with Motion-Aware Memory	Cheng-Yen Yang et.al.	2411.11922	link
2024-12-09	Vision Eagle Attention: a new lens for advancing image classification	Mahmudul Hasan et.al.	2411.10564	link
2024-11-14	MFTIQ: Multi-Flow Tracker with Independent Matching Quality Estimation	Jonas Serych et.al.	2411.09551	link
2024-11-12	Visual Tracking with Intermittent Visibility: Switched Control Design and Implementation	Yangge Li et.al.	2411.08144	null
2024-12-16	ChatTracker: Enhancing Visual Tracking Performance via Chatting with Multimodal Large Language Model	Yiming Sun et.al.	2411.01756	null
2024-10-30	IP-MOT: Instance Prompt Learning for Cross-Domain Multi-Object Tracking	Run Luo et.al.	2410.23907	null
2024-10-27	NT-VOT211: A Large-Scale Benchmark for Night-time Visual Object Tracking	Yu Liu et.al.	2410.20421	link
2024-10-19	The Solution for Single Object Tracking Task of Perception Test Challenge 2024	Zhiqiang Zhong et.al.	2410.16329	null
2024-10-13	Gaussian Splatting Visual MPC for Granular Media Manipulation	Wei-Cheng Tseng et.al.	2410.09740	null
2024-10-09	DTVLT: A Multi-modal Diverse Text Benchmark for Visual Language Tracking Based on LLM	Xuchen Li et.al.	2410.02492	null
2024-09-30	Opt-in Camera: Person Identification in Video via UWB Localization and Its Application to Opt-in Systems	Matthew Ishige et.al.	2409.19891	null
2024-09-27	Improving Visual Object Tracking through Visual Prompting	Shih-Fang Chen et.al.	2409.18901	link
2024-09-26	General Compression Framework for Efficient Transformer Object Tracking	Lingyi Hong et.al.	2409.17564	null
2024-09-25	Towards Underwater Camouflaged Object Tracking: An Experimental Evaluation of SAM and SAM 2	Chunhui Zhang et.al.	2409.16902	link
2024-09-25	Conditional Generative Denoiser for Nighttime UAV Tracking	Yucheng Wang et.al.	2409.16834	link
2024-09-25	Progressive Representation Learning for Real-Time UAV Tracking	Changhong Fu et.al.	2409.16652	link
2024-09-25	Enhancing Nighttime UAV Tracking with Light Distribution Suppression	Liangliang Yao et.al.	2409.16631	link
2024-09-19	WeHelp: A Shared Autonomy System for Wheelchair Users	Abulikemu Abuduweili et.al.	2409.12159	link
2024-09-18	Distilling Channels for Efficient Deep Tracking	Shiming Ge et.al.	2409.11785	null
2024-09-13	Visual Language Tracking with Multi-modal Interaction: A Robust Benchmark	Xuchen Li et.al.	2409.08887	null
2024-09-10	VBIT: Towards Enhancing Privacy Control Over IoT Devices	Jad Al Aaraj et.al.	2409.06233	null
2024-09-03	Ultra-broadband room-temperature Fourier transform spectrometer with watt-level power consumption	Jakub Mnich et.al.	2409.01875	null
2024-08-25	Camouflaged Object Tracking: A Benchmark	Xiaoyu Guo et.al.	2408.13877	link
2024-08-21	Low-Light Object Tracking: A Benchmark	Pengzhi Zhong et.al.	2408.11463	link
2024-08-20	MambaEVT: Event Stream based Visual Object Tracking using State Space Model	Xiao Wang et.al.	2408.10487	link
2024-08-05	VoxelTrack: Exploring Voxel Representation for 3D Point Cloud Object Tracking	Yuxuan Lu et.al.	2408.02263	null
2024-09-06	3D Single-object Tracking in Point Clouds with High Temporal Variation	Qiao Wu et.al.	2408.02049	null
2024-09-09	SiamMo: Siamese Motion-Centric 3D Object Tracking	Yuxiang Yang et.al.	2408.01688	link
2024-08-02	Visible-Thermal Multiple Object Tracking: Large-scale Video Dataset and Progressive Fusion Approach	Yabin Zhu et.al.	2408.00969	link
2024-08-06	Broadband THz wave generation and detection in organic crystal PNPA at MHz repetition rates	Lukasz A. Sterczewski et.al.	2407.20745	null
2024-07-16	Diff-Tracker: Text-to-Image Diffusion Models are Unsupervised Trackers	Zhengbo Zhang et.al.	2407.08394	null
2024-07-11	PINN-Ray: A Physics-Informed Neural Network to Model Soft Robotic Fin Ray Fingers	Xing Wang et.al.	2407.08222	null
2024-07-07	Addressing single object tracking in satellite imagery through prompt-engineered solutions	Athena Psalta et.al.	2407.05518	null
2024-07-07	Learning Motion Blur Robust Vision Transformers with Dynamic Early Exit for Real-Time UAV Tracking	You Wu et.al.	2407.05383	null
2024-07-09	P2P: Part-to-Part Motion Cues Guide a Strong Tracking Framework for LiDAR Point Clouds	Jiahao Nie et.al.	2407.05238	link
2024-07-07	Tracking Reflected Objects: A Benchmark	Xiaoyu Guo et.al.	2407.05235	null
2024-07-04	TrackPGD: A White-box Attack using Binary Masks against Robust Transformer Trackers	Fatemeh Nourilenjan Nokabadi et.al.	2407.03946	link
2024-07-02	FlowTrack: Point-level Flow Network for 3D Single Object Tracking	Shuo Li et.al.	2407.01959	null
2024-09-07	eMoE-Tracker: Environmental MoE-based Transformer for Robust Event-guided Object Tracking	Yucheng Chen et.al.	2406.20024	null
2024-06-14	Constrained Motion Planning for a Robotic Endoscope Holder based on Hierarchical Quadratic Programming	Jacinto Colan et.al.	2406.09982	null
2024-06-14	Robust compressive tracking via online weighted multiple instance learning	Sandeep Singh Sengar et.al.	2406.09914	null
2024-07-01	Adaptively Bypassing Vision Transformer Blocks for Efficient Visual Tracking	Xiangyang Yang et.al.	2406.08037	null
2024-06-07	Multi-Granularity Language-Guided Multi-Object Tracking	Yuhao Li et.al.	2406.04844	link
2024-06-02	Robust Visual Tracking via Iterative Gradient Descent and Threshold Selection	Zhuang Qi et.al.	2406.00589	null
2024-05-28	Reliable Object Tracking by Multimodal Hybrid Feature Extraction and Transformer-Based Fusion	Hongze Sun et.al.	2405.17903	link
2024-05-27	LoReTrack: Efficient and Accurate Low-Resolution Transformer Tracking	Shaohua Dong et.al.	2405.17660	null
2024-05-31	Awesome Multi-modal Object Tracking	Chunhui Zhang et.al.	2405.14200	link
2024-05-20	DTLLM-VLT: Diverse Text Generation for Visual Language Tracking Based on LLM	Xuchen Li et.al.	2405.12139	null
2024-05-16	A Novel Bounding Box Regression Method for Single Object Tracking	Omar Abdelaziz et.al.	2405.10444	null
2024-05-16	Beyond Traditional Single Object Tracking: A Survey	Omar Abdelaziz et.al.	2405.10439	null
2024-05-08	TENet: Targetness Entanglement Incorporating with Multi-Scale Pooling and Mutually-Guided Fusion for RGB-E Object Tracking	Pengcheng Shao et.al.	2405.05004	link
2024-04-22	360VOTS: Visual Object Tracking and Segmentation in Omnidirectional Videos	Yinzhe Xu et.al.	2404.13953	link
2024-05-25	An Experimental Study on Exploring Strong Lightweight Vision Transformers via Masked Image Modeling Pre-Training	Jin Gao et.al.	2404.12210	link
2024-04-16	Attention-Aware Visualization: Tracking and Responding to User Perception Over Time	Arvind Srinivasan et.al.	2404.10732	null
2024-04-15	Empowering Embodied Visual Tracking with Visual Foundation Models and Offline RL	Fangwei Zhong et.al.	2404.09857	null
2024-04-15	Learning Tracking Representations from Single Point Annotations	Qiangqiang Wu et.al.	2404.09504	null
2024-04-11	PillarTrack: Redesigning Pillar-based Transformer Network for Single Object Tracking on Point Clouds	Weisheng Xu et.al.	2404.07495	link
2024-05-02	Longitudinal Analysis and Quantitative Assessment of Child Development through Mobile Interaction	Juan Carlos Ruiz-Garcia et.al.	2404.06919	link
2024-04-09	LRR: Language-Driven Resamplable Continuous Representation against Adversarial Tracking Attacks	Jianlang Chen et.al.	2404.06247	link
2024-04-08	Semi-Supervised Novelty Detection for Precise Ultra-Wideband Error Signal Prediction	Umberto Albertin et.al.	2404.05351	null
2024-03-29	Context-Aware Integration of Language and Visual References for Natural Language Tracking	Yanyan Shao et.al.	2403.19975	null
2024-03-27	TAFormer: A Unified Target-Aware Transformer for Video and Motion Joint Prediction in Aerial Scenes	Liangyu Xu et.al.	2403.18238	null
2024-03-26	OmniVid: A Generative Framework for Universal Video Understanding	Junke Wang et.al.	2403.17935	link
2024-03-26	Exploring Dynamic Transformer for Efficient Object Tracking	Jiawen Zhu et.al.	2403.17651	null
2024-03-29	Elysium: Exploring Object-level Perception in Videos via MLLM	Han Wang et.al.	2403.16558	link
2024-03-25	Multi-attention Associate Prediction Network for Visual Tracking	Xinglong Sun et.al.	2403.16395	null
2024-03-28	SDSTrack: Self-Distillation Symmetric Adapter Learning for Multi-Modal Visual Object Tracking	Xiaojun Hou et.al.	2403.16002	link
2024-03-23	Spatio-Temporal Bi-directional Cross-frame Memory for Distractor Filtering Point Cloud Single Object Tracking	Shaoyu Sun et.al.	2403.15831	null
2024-03-19	TON-VIO: Online Time Offset Modeling Networks for Robust Temporal Alignment in High Dynamic Motion VIO	Chaoran Xiong et.al.	2403.12504	link
2024-03-18	Pedestrian Tracking with Monocular Camera using Unconstrained 3D Motion Model	Jan Krejčí et.al.	2403.11978	null
2024-03-16	A Spectrum-based Image Denoising Method with Edge Feature Enhancement	Peter Luvton et.al.	2403.11036	null
2024-03-15	Autoregressive Queries for Adaptive Tracking with Spatio-Temporal Transformers	Jinxia Xie et.al.	2403.10574	null
2024-03-14	OneTracker: Unifying Visual Object Tracking with Foundation Models and Efficient Tuning	Lingyi Hong et.al.	2403.09634	null
2024-02-27	ACTrack: Adding Spatio-Temporal Condition for Visual Object Tracking	Yushan Han et.al.	2403.07914	null
2024-04-03	Long-term Frame-Event Visual Tracking: Benchmark Dataset and Baseline	Xiao Wang et.al.	2403.05839	link
2024-03-08	Tracking Meets LoRA: Faster Training, Larger Model, Stronger Performance	Liting Lin et.al.	2403.05231	link
2024-03-08	Motion-Guided Dual-Camera Tracker for Low-Cost Skill Evaluation of Gastric Endoscopy	Yuelin Zhang et.al.	2403.05146	link
2024-03-06	VastTrack: Vast Category Visual Object Tracking	Liang Peng et.al.	2403.03493	link
2024-02-28	Enhancing Tracking Robustness with Auxiliary Adversarial Defense Networks	Zhewei Wu et.al.	2402.17976	null
2024-02-26	SeqTrack3D: Exploring Sequence Information for Robust 3D Point Cloud Tracking	Yu Lin et.al.	2402.16249	link
2024-02-26	Reading Relevant Feature from Global Representation Memory for Visual Object Tracking	Xinyu Zhou et.al.	2402.14392	null
2024-02-13	Optimized Information Flow for Transformer Tracking	Janani Kugarajeevan et.al.	2402.08195	link
2024-02-07	BioDrone: A Bionic Drone-based Single Object Tracking Benchmark for Robust Vision	Xin Zhao et.al.	2402.04519	null
2024-02-04	Spatio-temporal Prompting Network for Robust Video Feature Extraction	Guanxiong Sun et.al.	2402.02574	link
2024-01-24	Small Object Tracking in LiDAR Point Cloud: Learning the Target-awareness Prototype and Fine-grained Search Region	Shengjing Tian et.al.	2401.13285	null
2024-01-23	Correlation-Embedded Transformer Tracking: A Single-Branch Framework	Fei Xie et.al.	2401.12743	link
2024-01-20	Unifying Visual and Vision-Language Tracking via Contrastive Learning	Yinchao Ma et.al.	2401.11228	link
2024-01-20	Towards Category Unification of 3D Single Object Tracking on Point Clouds	Jiahao Nie et.al.	2401.11204	null
2024-01-18	Multi-task Learning for Joint Re-identification, Team Affiliation, and Role Classification for Sports Visual Tracking	Amir M. Mansourian et.al.	2401.09942	null
2024-01-12	Dense Optical Flow Estimation Using Sparse Regularizers from Reduced Measurements	Muhammad Wasim Nawaz et.al.	2401.06396	null
2024-01-18	Hold 'em and Fold 'em: Towards Human-scale, Feedback-Controlled Soft Origami Robots	Immanuel Ampomah Mensah et.al.	2401.04650	null
2024-01-06	Explicit Visual Prompts for Visual Object Tracking	Liangtao Shi et.al.	2401.03142	link
2024-01-03	ODTrack: Online Dense Temporal Token Learning for Visual Tracking	Yaozong Zheng et.al.	2401.01686	link
2023-12-27	X Modality Assisting RGBT Object Tracking	Zhaisheng Ding et.al.	2312.17273	null
2023-12-22	Cross-Modal Object Tracking via Modality-Aware Fusion Network and A Large-Scale Dataset	Lei Liu et.al.	2312.14446	link
2023-12-18	Multi-Correlation Siamese Transformer Network with Dense Connection for 3D Single Object Tracking	Shihao Feng et.al.	2312.11051	link
2023-12-17	Robust 3D Tracking with Quality-Aware Shape Completion	Jingwen Zhang et.al.	2312.10608	null
2023-12-15	Tracking Skiers from the Top to the Bottom	Matteo Dunnhofer et.al.	2312.09723	null
2023-12-11	M3SOT: Multi-frame, Multi-field, Multi-space 3D Single Object Tracking	Jiaming Liu et.al.	2312.06117	link
2023-12-07	Instance Tracking in 3D Scenes from Egocentric Videos	Yunhan Zhao et.al.	2312.04117	link
2024-02-19	Beyond Visual Cues: Synchronously Exploring Target-Centric Semantics for Vision-Language Tracking	Jiawei Ge et.al.	2311.17085	null
2023-11-21	Visual tracking brain computer interface	Changxing Huang et.al.	2311.12592	null
2024-01-10	ViKi-HyCo: A Hybrid-Control approach for complex car-like maneuvers	Edison P. Velasco Sánchez et.al.	2311.07268	null


Large Language Model

Publish Date	Title	Authors	PDF	Code
2025-07-23	Pretraining on the Test Set Is No Longer All You Need: A Debate-Driven Approach to QA Benchmarks	Linbo Cao et.al.	2507.17747	null
2025-07-23	Rubrics as Rewards: Reinforcement Learning Beyond Verifiable Domains	Anisha Gunjal et.al.	2507.17746	null
2025-07-23	Megrez2 Technical Report	Boxun Li et.al.	2507.17728	null
2025-07-23	BetterCheck: Towards Safeguarding VLMs for Automotive Perception Systems	Malsha Ashani Mahawatta Dona et.al.	2507.17722	null
2025-07-23	AI Telephone Surveying: Automating Quantitative Data Collection with an AI Interviewer	Danny D. Leybzon et.al.	2507.17718	null
2025-07-23	HydraOpt: Navigating the Efficiency-Performance Trade-off of Adapter Merging	Taha Ceritli et.al.	2507.17706	null
2025-07-23	Towards Greater Leverage: Scaling Laws for Efficient Mixture-of-Experts Language Models	Changxin Tian et.al.	2507.17702	null
2025-07-23	Thinking Isn't an Illusion: Overcoming the Limitations of Reasoning Models via Tool Augmentations	Zhao Song et.al.	2507.17699	null
2025-07-23	Symbiotic Agents: A Novel Paradigm for Trustworthy AGI-driven Networks	Ilias Chatzistefanidis et.al.	2507.17695	null
2025-07-23	Simulating multiple human perspectives in socio-ecological systems using large language models	Yongchao Zeng et.al.	2507.17680	null
2025-07-23	See the Forest and the Trees: A Synergistic Reasoning Framework for Knowledge-Based Visual Question Answering	Junjie Wang et.al.	2507.17659	null
2025-07-23	Who Attacks, and Why? Using LLMs to Identify Negative Campaigning in 18M Tweets across 19 Countries	Victor Hartman et.al.	2507.17636	null
2025-07-23	A Hybrid Early-Exit Algorithm for Large Language Models Based on Space Alignment Decoding (SPADE)	Bowen Zheng et.al.	2507.17618	null
2025-07-23	Decoding Consumer Preferences Using Attention-Based Language Models	Joshua Foster et.al.	2507.17564	null
2025-07-23	BoSS: Beyond-Semantic Speech	Qing Wang et.al.	2507.17563	null
2025-07-23	CodeReasoner: Enhancing the Code Reasoning Ability with Reinforcement Learning	Lingxiao Tang et.al.	2507.17548	null
2025-07-23	Anticipate, Simulate, Reason (ASR): A Comprehensive Generative AI Framework for Combating Messaging Scams	Xue Wen Tan et.al.	2507.17543	null
2025-07-23	AssertFlip: Reproducing Bugs via Inversion of LLM-Generated Passing Tests	Lara Khatib et.al.	2507.17542	null
2025-07-23	Constructing Ophthalmic MLLM for Positioning-diagnosis Collaboration Through Clinical Cognitive Chain Reasoning	Xinyao Liu et.al.	2507.17539	null
2025-07-23	InstructVLA: Vision-Language-Action Instruction Tuning from Understanding to Manipulation	Shuai Yang et.al.	2507.17520	null
2025-07-22	Semi-off-Policy Reinforcement Learning for Vision-Language Slow-thinking Reasoning	Junhao Shen et.al.	2507.16814	null
2025-07-22	LingBench++: A Linguistically-Informed Benchmark and Reasoning Framework for Multi-Step and Cross-Cultural Inference with LLMs	Da-Chen Lian et.al.	2507.16809	null
2025-07-22	Rethinking LLM-Based RTL Code Optimization Via Timing Logic Metamorphosis	Zhihao Xu et.al.	2507.16808	null
2025-07-22	Beyond Binary Rewards: Training LMs to Reason About Their Uncertainty	Mehul Damani et.al.	2507.16806	null
2025-07-23	Agentar-Fin-R1: Enhancing Financial Intelligence through Domain Expertise, Training Efficiency, and Advanced Reasoning	Yanjun Zheng et.al.	2507.16802	null
2025-07-23	Test-Time-Matching: Decouple Personality, Memory, and Linguistic Style in LLM-based Role-Playing Language Agent	Xiaoyu Zhan et.al.	2507.16799	null
2025-07-22	Steering Out-of-Distribution Generalization with Concept Ablation Fine-Tuning	Helena Casademunt et.al.	2507.16795	null
2025-07-22	ChatChecker: A Framework for Dialogue System Testing and Evaluation Through Non-cooperative User Simulation	Roman Mayr et.al.	2507.16792	null
2025-07-22	Beyond Context Limits: Subconscious Threads for Long-Horizon Reasoning	Hongyin Luo et.al.	2507.16784	null
2025-07-22	Cooling Matters: Benchmarking Large Language Models and Vision-Language Models on Liquid-Cooled Versus Air-Cooled H100 GPU Systems	Imran Latif et.al.	2507.16781	null
2025-07-22	When LLMs Copy to Think: Uncovering Copy-Guided Attacks in Reasoning LLMs	Yue Li et.al.	2507.16773	null
2025-07-22	WGRAMMAR: Leverage Prior Knowledge to Accelerate Structured Decoding	Ran Wang et.al.	2507.16768	null
2025-07-22	Never Come Up Empty: Adaptive HyDE Retrieval for Improving LLM Developer Support	Fangjian Lei et.al.	2507.16754	null
2025-07-22	CMP: A Composable Meta Prompt for SAM-Based Cross-Domain Few-Shot Segmentation	Shuai Chen et.al.	2507.16753	null
2025-07-22	Collaborative Inference and Learning between Edge SLMs and Cloud LLMs: A Survey of Algorithms, Execution, and Open Challenges	Senyao Li et.al.	2507.16731	null
2025-07-23	Deliberative Searcher: Improving LLM Reliability via Reinforcement Learning with constraints	Zhenyun Yin et.al.	2507.16727	null
2025-07-22	SALM: Spatial Audio Language Model with Structured Embeddings for Understanding and Editing	Jinbo Hu et.al.	2507.16724	null
2025-07-22	Enhancing Remote Sensing Vision-Language Models Through MLLM and LLM-Based High-Quality Image-Text Dataset Generation	Yiguo He et.al.	2507.16716	null
2025-07-22	Experience is the Best Teacher: Grounding VLMs for Robotics through Self-Generated Memory	Guowei Lan et.al.	2507.16713	null
2025-07-22	Advancing Risk and Quality Assurance: A RAG Chatbot for Improved Regulatory Compliance	Lars Hillebrand et.al.	2507.16711	null
2025-07-21	Diffusion Beats Autoregressive in Data-Constrained Settings	Mihir Prabhudesai et.al.	2507.15857	null
2025-07-21	Gemini 2.5 Pro Capable of Winning Gold at IMO 2025	Yichen Huang et.al.	2507.15855	null
2025-07-22	SeC: Advancing Complex Video Object Segmentation via Progressive Concept Construction	Zhixiong Zhang et.al.	2507.15852	null
2025-07-21	The Other Mind: How Language Models Exhibit Human Temporal Cognition	Lingyu Li et.al.	2507.15851	null
2025-07-21	3LM: Bridging Arabic, STEM, and Code through Benchmarking	Basma El Amel Boussaha et.al.	2507.15850	null
2025-07-21	The Impact of Language Mixing on Bilingual LLM Reasoning	Yihao Li et.al.	2507.15849	null
2025-07-21	FASTGEN: Fast and Cost-Effective Synthetic Tabular Data Generation with LLMs	Anh Nguyen et.al.	2507.15839	null
2025-07-21	Just Ask for Music (JAM): Multimodal and Personalized Natural Language Music Recommendation	Alessandro B. Melchiorre et.al.	2507.15826	null
2025-07-21	ACS: An interactive framework for conformal selection	Yu Gui et.al.	2507.15825	null
2025-07-21	Can Your Model Separate Yolks with a Water Bottle? Benchmarking Physical Commonsense Understanding in Video Generation Models	Enes Sanli et.al.	2507.15824	null
2025-07-21	Do AI models help produce verified bug fixes?	Li Huang et.al.	2507.15822	null
2025-07-21	LLM Economist: Large Population Models and Mechanism Design in Multi-Agent Generative Simulacra	Seth Karten et.al.	2507.15815	null
2025-07-21	True Multimodal In-Context Learning Needs Attention to the Visual Context	Shuo Chen et.al.	2507.15807	null
2025-07-21	ConformalSAM: Unlocking the Potential of Foundational Segmentation Models in Semi-Supervised Semantic Segmentation with Conformal Prediction	Danhui Chen et.al.	2507.15803	null
2025-07-21	Regularized Low-Rank Adaptation for Few-Shot Organ Segmentation	Ghassen Baklouti et.al.	2507.15793	null
2025-07-21	Small LLMs Do Not Learn a Generalizable Theory of Mind via Reinforcement Learning	Sneheel Sarangi et.al.	2507.15788	null
2025-07-21	Reservoir Computing as a Language Model	Felix Köster et.al.	2507.15779	null
2025-07-21	Stabilizing Knowledge, Promoting Reasoning: Dual-Token Constraints for RLVR	Jiakang Wang et.al.	2507.15778	null
2025-07-21	Left Leaning Models: AI Assumptions on Economic Policy	Maxim Chupilkin et.al.	2507.15771	null
2025-07-21	A Framework for Analyzing Abnormal Emergence in Service Ecosystems Through LLM-based Agent Intention Mining	Yifan Shen et.al.	2507.15770	null
2025-07-18	Franca: Nested Matryoshka Clustering for Scalable Visual Representation Learning	Shashanka Venkataramanan et.al.	2507.14137	null
2025-07-18	CUDA-L1: Improving CUDA Optimization via Contrastive Reinforcement Learning	Xiaoya Li et.al.	2507.14111	null
2025-07-18	Automated Interpretation of Non-Destructive Evaluation Contour Maps Using Large Language Models for Bridge Condition Assessment	Viraj Nishesh Darji et.al.	2507.14107	null
2025-07-18	Generative AI-Driven High-Fidelity Human Motion Simulation	Hari Iyer et.al.	2507.14097	null
2025-07-18	Lessons from the TREC Plain Language Adaptation of Biomedical Abstracts (PLABA) track	Brian Ondov et.al.	2507.14096	null
2025-07-18	DPMT: Dual Process Multi-scale Theory of Mind Framework for Real-time Human-AI Collaboration	Xiyun Li et.al.	2507.14088	null
2025-07-18	DENSE: Longitudinal Progress Note Generation with Temporal Modeling of Heterogeneous Clinical Notes Across Hospital Visits	Garapati Keerthana et.al.	2507.14079	null
2025-07-18	VLA-Mark: A cross modal watermark for large vision-language alignment model	Shuliang Liu et.al.	2507.14067	null
2025-07-18	Foundation Models as Class-Incremental Learners for Dermatological Image Classification	Mohamed Elkhayat et.al.	2507.14050	null
2025-07-18	EdgeVLA: Efficient Vision-Language-Action Models	Paweł Budzianowski et.al.	2507.14049	null
2025-07-18	Evaluating the Effectiveness of Cost-Efficient Large Language Models in Benchmark Biomedical Tasks	Israt Jahan et.al.	2507.14045	null
2025-07-18	Architecting Human-AI Cocreation for Technical Services -- Interaction Modes and Contingency Factors	Jochen Wulf et.al.	2507.14034	null
2025-07-18	KROMA: Ontology Matching with Knowledge Retrieval and Large Language Models	Lam Nguyen et.al.	2507.14032	null
2025-07-18	Moodifier: MLLM-Enhanced Emotion-Driven Image Editing	Jiarong Ye et.al.	2507.14024	null
2025-07-18	Efficient Temporal Tokenization for Mobility Prediction with Large Language Models	Haoyu He et.al.	2507.14017	null
2025-07-18	OrthoInsight: Rib Fracture Diagnosis and Report Generation Based on Multi-Modal Large Models	Ningyong Wu et.al.	2507.13993	null
2025-07-18	Leveraging Pathology Foundation Models for Panoptic Segmentation of Melanoma in H&E Images	Jiaqi Lv et.al.	2507.13974	null
2025-07-18	Bottom-up Domain-specific Superintelligence: A Reliable Knowledge Graph is What We Need	Bhishma Dedhia et.al.	2507.13966	null
2025-07-18	DUALRec: A Hybrid Sequential and Language Model Framework for Context-Aware Movie Recommendation	Yitong Li et.al.	2507.13957	null
2025-07-18	Cross-modal Causal Intervention for Alzheimer's Disease Prediction	Yutao Jin et.al.	2507.13956	null
2025-07-17	VideoITG: Multimodal Video Understanding with Instructed Temporal Grounding	Shihao Wang et.al.	2507.13353	null
2025-07-17	VisionThink: Smart and Efficient Vision Language Model via Reinforcement Learning	Senqiao Yang et.al.	2507.13348	null
2025-07-17	Comparing Apples to Oranges: A Dataset & Analysis of LLM Humour Understanding from Traditional Puns to Topical Jokes	Tyler Loakman et.al.	2507.13335	null
2025-07-17	A Survey of Context Engineering for Large Language Models	Lingrui Mei et.al.	2507.13334	null
2025-07-17	The Imitation Game: Turing Machine Imitator is Length Generalizable Reasoner	Zhouqi Hua et.al.	2507.13332	null
2025-07-17	Vision-and-Language Training Helps Deploy Taxonomic Knowledge but Does Not Fundamentally Alter It	Yulu Qin et.al.	2507.13328	null
2025-07-17	GeoReg: Weight-Constrained Few-Shot Regression for Socio-Economic Estimation using LLM	Kyeongjin Ahn et.al.	2507.13323	null
2025-07-17	HapticCap: A Multimodal Dataset and Task for Understanding User Experience of Vibration Haptic Signals	Guimin Hu et.al.	2507.13318	null
2025-07-17	Revisiting Reliability in the Reasoning-based Pose Estimation Benchmark	Junsu Kim et.al.	2507.13314	null
2025-07-17	The Generative Energy Arena (GEA): Incorporating Energy Awareness in Large Language Model (LLM) Human Evaluations	Carlos Arriaga et.al.	2507.13302	null
2025-07-17	AbGen: Evaluating Large Language Models in Ablation Study Design and Evaluation for Scientific Research	Yilun Zhao et.al.	2507.13300	null
2025-07-17	Overview of the TalentCLEF 2025: Skill and Job Title Intelligence for Human Capital Management	Luis Gasco et.al.	2507.13275	null
2025-07-17	Automating Steering for Safe Multimodal Large Language Models	Lyucheng Wu et.al.	2507.13255	null
2025-07-17	HATS: Hindi Analogy Test Set for Evaluating Reasoning in Large Language Models	Ashray Gupta et.al.	2507.13238	null
2025-07-17	Enhancing Cross-task Transfer of Large Language Models via Activation Steering	Xinyu Tang et.al.	2507.13236	null
2025-07-18	MoTM: Towards a Foundation Model for Time Series Imputation based on Continuous Modeling	Etienne Le Naour et.al.	2507.13207	null
2025-07-18	Automatically assessing oral narratives of Afrikaans and isiXhosa children	Retief Louw et.al.	2507.13205	null
2025-07-17	GEMMAS: Graph-based Evaluation Metrics for Multi Agent Systems	Jisoo Lee et.al.	2507.13190	null
2025-07-17	Black Box Deployed -- Functional Criteria for Artificial Moral Agents in the LLM Era	Matthew E. Brophy et.al.	2507.13175	null
2025-07-17	Inverse Reinforcement Learning Meets Large Language Model Post-Training: Basics, Advances, and Opportunities	Hao Sun et.al.	2507.13158	null
2025-07-16	Language Models Improve When Pretraining Data Matches Target Tasks	David Mizrahi et.al.	2507.12466	null
2025-07-16	PhysX: Physical-Grounded 3D Asset Generation	Ziang Cao et.al.	2507.12465	null
2025-07-16	CytoSAE: Interpretable Cell Embeddings for Hematology	Muhammed Furkan Dasdelen et.al.	2507.12464	null
2025-07-16	Mitigating Object Hallucinations via Sentence-Level Early Intervention	Shangpin Peng et.al.	2507.12455	null
2025-07-16	Characterizing State Space Model (SSM) and SSM-Transformer Hybrid Language Model Performance with Long Context Length	Saptarshi Mitra et.al.	2507.12442	null
2025-07-16	Describe Anything Model for Visual Question Answering on Text-rich Images	Yen-Linh Vu et.al.	2507.12441	null
2025-07-16	Can We Predict Alignment Before Models Finish Thinking? Towards Monitoring Misaligned Reasoning Models	Yik Siu Chan et.al.	2507.12428	null
2025-07-16	Advancing Retrieval-Augmented Generation for Structured Enterprise and Internal Data	Chandana Cheerla et.al.	2507.12425	null
2025-07-16	SWE-Perf: Can Language Models Optimize Code Performance on Real-World Repositories?	Xinyi He et.al.	2507.12415	null
2025-07-16	AutoVDC: Automated Vision Data Cleaning Using Vision-Language Models	Santosh Vasa et.al.	2507.12414	null
2025-07-16	ROC-n-reroll: How verifier imperfection affects test-time scaling	Florian E. Dorner et.al.	2507.12399	null
2025-07-16	Assessing the Value of Visual Input: A Benchmark of Multimodal Large Language Models for Robotic Path Planning	Jacinto Colan et.al.	2507.12391	null
2025-07-16	Probing for Arithmetic Errors in Language Models	Yucheng Sun et.al.	2507.12379	null
2025-07-16	Developing Visual Augmented Q&A System using Scalable Vision Embedding Retrieval & Late Interaction Re-ranker	Rachna Saxena et.al.	2507.12378	null
2025-07-16	Web-Browsing LLMs Can Access Social Media Profiles and Infer User Demographics	Meysam Alizadeh et.al.	2507.12372	null
2025-07-16	Beyond Single Models: Enhancing LLM Detection of Ambiguity in Requests through Debate	Ana Davila et.al.	2507.12370	null
2025-07-16	GitChameleon: Evaluating AI Code Generation Against Python Library Version Incompatibilities	Diganta Misra et.al.	2507.12367	null
2025-07-16	Compositional Discrete Latent Code for High Fidelity, Productive Diffusion Models	Samuel Lavoie et.al.	2507.12318	null
2025-07-16	Thought Purity: Defense Paradigm For Chain-of-Thought Attack	Zihao Xue et.al.	2507.12314	null
2025-07-16	Chain-of-Descriptions: Improving Code LLMs for VHDL Code Generation and Summarization	Prashanth Vijayaraghavan et.al.	2507.12308	null
2025-07-15	Towards Depth Foundation Model: Recent Trends in Vision-Based Depth Estimation	Zhen Xu et.al.	2507.11540	null
2025-07-15	Streaming 4D Visual Geometry Transformer	Dong Zhuo et.al.	2507.11539	null
2025-07-15	DrafterBench: Benchmarking Large Language Models for Tasks Automation in Civil Engineering	Yinsheng Li et.al.	2507.11527	null
2025-07-15	LLM-based ambiguity detection in natural language instructions for collaborative surgical robots	Ana Davila et.al.	2507.11525	null
2025-07-15	AirLLM: Diffusion Policy-based Adaptive LoRA for Remote Fine-Tuning of LLM over the Air	Shiyi Yang et.al.	2507.11515	null
2025-07-15	LRMR: LLM-Driven Relational Multi-node Ranking for Lymph Node Metastasis Assessment in Rectal Cancer	Yaoxian Dong et.al.	2507.11457	null
2025-07-16	Reasoning Strategies in Large Language Models: Can They Follow, Prefer, and Optimize?	Yanjian Zhang et.al.	2507.11423	null
2025-07-15	Quantifying the Energy Consumption and Carbon Emissions of LLM Inference via Simulations	Miray Özcan et.al.	2507.11417	null
2025-07-15	Seq vs Seq: An Open Suite of Paired Encoders and Decoders	Orion Weller et.al.	2507.11412	null
2025-07-15	KisMATH: Do LLMs Have Knowledge of Implicit Structures in Mathematical Reasoning?	Soumadeep Saha et.al.	2507.11408	null
2025-07-15	EXAONE 4.0: Unified Large Language Models Integrating Non-reasoning and Reasoning Modes	LG AI Research et.al.	2507.11407	null
2025-07-15	DCR: Quantifying Data Contamination in LLMs Evaluation	Cheng Xu et.al.	2507.11405	null
2025-07-15	Step-wise Policy for Rare-tool Knowledge (SPaRK): Offline RL that Drives Diverse Tool Use in LLMs	Gabriel Bo et.al.	2507.11371	null
2025-07-15	From Chaos to Automation: Enabling the Use of Unstructured Data for Robotic Process Automation	Kelly Kurowski et.al.	2507.11364	null
2025-07-15	What is the Best Process Model Representation? A Comparative Analysis for Process Modeling with Large Language Models	Alexis Brissard et.al.	2507.11356	null
2025-07-15	Foundation Models for Logistics: Toward Certifiable, Conversational Planning Interfaces	Yunhao Yang et.al.	2507.11352	null
2025-07-15	RefModel: Detecting Refactorings using Foundation Models	Pedro Simões et.al.	2507.11346	null
2025-07-15	Guiding LLM Decision-Making with Fairness Reward Models	Zara Hall et.al.	2507.11344	null
2025-07-15	MonoMVSNet: Monocular Priors Guided Multi-View Stereo Network	Jianfei Jiang et.al.	2507.11333	null
2025-07-16	Automated Novelty Evaluation of Academic Paper: A Collaborative Approach Integrating Human and Large Language Model Knowledge	Wenqing Wu et.al.	2507.11330	null
2025-07-14	EmbRACE-3K: Embodied Reasoning and Action in Complex Environments	Mingxian Lin et.al.	2507.10548	null
2025-07-14	Fusing LLM Capabilities with Routing Data	Tao Feng et.al.	2507.10540	null
2025-07-14	Graph World Model	Tao Feng et.al.	2507.10539	null
2025-07-14	CodeJudgeBench: Benchmarking LLM-as-a-Judge for Coding Tasks	Hongchao Jiang et.al.	2507.10535	null
2025-07-14	Reasoning or Memorization? Unreliable Results of Reinforcement Learning Due to Data Contamination	Mingqi Wu et.al.	2507.10532	null
2025-07-14	Mixture-of-Recursions: Learning Dynamic Recursive Depths for Adaptive Token-Level Computation	Sangmin Bae et.al.	2507.10524	null
2025-07-14	Chat with AI: The Surprising Turn of Real-time Video Communication from Human to AI	Jiangkai Wu et.al.	2507.10510	null
2025-07-14	Scene-Aware Conversational ADAS with Generative AI for Real-Time Driver Assistance	Kyungtae Han et.al.	2507.10500	null
2025-07-14	Can You Detect the Difference?	İsmail Tarım et.al.	2507.10475	null
2025-07-14	MLAR: Multi-layer Large Language Model-based Robotic Process Automation Applicant Tracking	Mohamed T. Younes et.al.	2507.10472	null
2025-07-14	An Empirical Evaluation of AI-Powered Non-Player Characters' Perceived Realism and Performance in Virtual Reality Environments	Mikko Korkiakoski et.al.	2507.10469	null
2025-07-14	Logic layer Prompt Control Injection (LPCI): A Novel Security Vulnerability Class in Agentic Systems	Hammad Atta et.al.	2507.10457	null
2025-07-14	CoralVQA: A Large-Scale Visual Question Answering Dataset for Coral Reef Image Understanding	Hongyong Han et.al.	2507.10449	null
2025-07-15	Text-Visual Semantic Constrained AI-Generated Image Quality Assessment	Qiang Li et.al.	2507.10432	null
2025-07-14	Towards Emotion Co-regulation with LLM-powered Socially Assistive Robots: Integrating LLM Prompts and Robotic Behaviors to Support Parent-Neurodivergent Child Dyads	Jing Li et.al.	2507.10427	null
2025-07-14	Multiple Choice Learning of Low Rank Adapters for Language Modeling	Victor Letzelter et.al.	2507.10419	null
2025-07-14	Zorse: Optimizing LLM Training Efficiency on Heterogeneous GPU Clusters	Runsheng Benson Guo et.al.	2507.10392	null
2025-07-14	Extracting Important Tokens in E-Commerce Queries with a Tag Interaction-Aware Transformer Model	Md. Ahsanul Kabir et.al.	2507.10385	null
2025-07-14	Test-Time Canonicalization by Foundation Models for Robust Perception	Utkarsh Singhal et.al.	2507.10375	null
2025-07-14	Beyond Graph Model: Reliable VLM Fine-Tuning via Random Graph Adapter	Bo Jiang et.al.	2507.10355	null
2025-07-11	The Non-Linear Representation Dilemma: Is Causal Abstraction Enough for Mechanistic Interpretability?	Denis Sutter et.al.	2507.08802	null
2025-07-11	Lumos-1: On Autoregressive Video Generation from a Unified Model Perspective	Hangjie Yuan et.al.	2507.08801	null
2025-07-11	KV Cache Steering for Inducing Reasoning in Small Language Models	Max Belitsky et.al.	2507.08799	null
2025-07-11	One Token to Fool LLM-as-a-Judge	Yulai Zhao et.al.	2507.08794	null
2025-07-11	From One to More: Contextual Part Latents for 3D Generation	Shaocong Dong et.al.	2507.08772	null
2025-07-11	BlockFFN: Towards End-Side Acceleration-Friendly Mixture-of-Experts with Chunk-Level Activation Sparsity	Chenyang Song et.al.	2507.08771	null
2025-07-11	EqualMotion: Accessible Motion Capture for the Creative Industries	Clarice Hilton et.al.	2507.08744	null
2025-07-11	Multilingual Multimodal Software Developer for Code Generation	Linzheng Chai et.al.	2507.08719	null
2025-07-11	Unreal is all you need: Multimodal ISAC Data Simulation with Only One Engine	Kongwu Huang et.al.	2507.08716	null
2025-07-11	KG-Attention: Knowledge Graph-Guided Attention at Test-Time via Bidirectional Information Aggregation	Songlin Zhai et.al.	2507.08704	null
2025-07-11	ByDeWay: Boost Your multimodal LLM with DEpth prompting in a Training-Free Way	Rajarshi Roy et.al.	2507.08679	null
2025-07-11	LLMCup: Ranking-Enhanced Comment Updating with LLMs	Hua Ge et.al.	2507.08671	null
2025-07-11	KELPS: A Framework for Verified Multi-Language Autoformalization via Semantic-Syntactic Alignment	Jiyao Zhang et.al.	2507.08665	null
2025-07-11	Introspection of Thought Helps AI Agents	Haoran Sun et.al.	2507.08664	null
2025-07-11	Leanabell-Prover-V2: Verifier-integrated Reasoning for Formal Theorem Proving via Reinforcement Learning	Xingguang Ji et.al.	2507.08649	null
2025-07-11	DatasetAgent: A Novel Multi-Agent System for Auto-Constructing Datasets from Real-World Images	Haoran Sun et.al.	2507.08648	null
2025-07-11	NL in the Middle: Code Translation with LLMs and Intermediate Representations	Chi-en Amy Tai et.al.	2507.08627	null
2025-07-11	Adaptive Framework for Ambient Intelligence in Rehabilitation Assistance	Gábor Baranyi et.al.	2507.08624	null
2025-07-11	A comprehensive study of LLM-based argument classification: from LLAMA through GPT-4o to Deepseek-R1	Marcin Pietroń et.al.	2507.08621	null
2025-07-11	Agentic Large Language Models for Conceptual Systems Engineering and Design	Soheyl Massoudi et.al.	2507.08619	null
2025-07-10	Skip a Layer or Loop it? Test-Time Depth Adaptation of Pretrained LLMs	Ziyue Li et.al.	2507.07996	null
2025-07-10	Multigranular Evaluation for Brain Visual Decoding	Weihao Xia et.al.	2507.07993	null
2025-07-10	Multi-Granular Spatio-Temporal Token Merging for Training-Free Acceleration of Video LLMs	Jeongseok Hyun et.al.	2507.07990	null
2025-07-10	Automating Expert-Level Medical Reasoning Evaluation of Large Language Models	Shuang Zhou et.al.	2507.07988	null
2025-07-10	CLIP Won't Learn Object-Attribute Binding from Natural Data and Here is Why	Bijay Gurung et.al.	2507.07985	null
2025-07-10	OST-Bench: Evaluating the Capabilities of MLLMs in Online Spatio-temporal Scene Understanding	JingLi Lin et.al.	2507.07984	null
2025-07-10	Performance and Practical Considerations of Large and Small Language Models in Clinical Decision Support in Rheumatology	Sabine Felde et.al.	2507.07983	null
2025-07-10	Geometry Forcing: Marrying Video Diffusion and 3D Representation for Consistent World Modeling	Haoyu Wu et.al.	2507.07982	null
2025-07-10	Why is Your Language Model a Poor Implicit Reward Model?	Noam Razin et.al.	2507.07981	null
2025-07-10	Defending Against Prompt Injection With a Few DefensiveTokens	Sizhe Chen et.al.	2507.07974	null
2025-07-10	Scaling RL to Long Videos	Yukang Chen et.al.	2507.07966	null
2025-07-10	MIRIX: Multi-Agent Memory System for LLM-Based Agents	Yu Wang et.al.	2507.07957	null
2025-07-10	Dynamic Chunking for End-to-End Hierarchical Sequence Modeling	Sukjun Hwang et.al.	2507.07955	null
2025-07-10	Input Conditioned Layer Dropping in Speech Foundation Models	Abdul Hannan et.al.	2507.07954	null
2025-07-10	SAGE: A Visual Language Model for Anomaly Detection via Fact Enhancement and Entropy-aware Alignment	Guoxin Zang et.al.	2507.07939	null
2025-07-10	Can Large Language Models Improve Phishing Defense? A Large-Scale Controlled Experiment on Warning Dialogue Explanations	Federico Maria Cau et.al.	2507.07916	null
2025-07-10	MIRA: A Novel Framework for Fusing Modalities in Medical RAG	Jinhong Wang et.al.	2507.07902	null
2025-07-10	An Integrated Framework of Prompt Engineering and Multidimensional Knowledge Graphs for Legal Dispute Analysis	Mingda Zhang et.al.	2507.07893	null
2025-07-10	Automating MD simulations for Proteins using Large language Models: NAMD-Agent	Achuth Chandrasekhar et.al.	2507.07887	null
2025-07-10	Opting Out of Generative AI: a Behavioral Experiment on the Role of Education in Perplexity AI Avoidance	Roberto Ulloa et.al.	2507.07881	null
2025-07-09	Towards Multimodal Understanding via Stable Diffusion as a Task-Aware Feature Extractor	Vatsal Agarwal et.al.	2507.07106	null
2025-07-09	4KAgent: Agentic Any Image to 4K Super-Resolution	Yushen Zuo et.al.	2507.07105	null
2025-07-09	Vision-Language-Vision Auto-Encoder: Scalable Knowledge Distillation from Diffusion Models	Tiezheng Zhang et.al.	2507.07104	null
2025-07-09	Small Batch Size Training for Language Models: When Vanilla SGD Works, and Why Gradient Accumulation Is Wasteful	Martin Marek et.al.	2507.07101	null
2025-07-09	Evaluating Attribute Confusion in Fashion Text-to-Image Generation	Ziyue Liu et.al.	2507.07079	null
2025-07-09	5C Prompt Contracts: A Minimalist, Creative-Friendly, Token-Efficient Design Framework for Individual and SME LLM Usage	Ugur Ari et.al.	2507.07045	null
2025-07-09	UniConv: Unifying Retrieval and Response Generation for Large Language Models in Conversations	Fengran Mo et.al.	2507.07030	null
2025-07-09	FlexOlmo: Open Language Models for Flexible Data Use	Weijia Shi et.al.	2507.07024	null
2025-07-09	First Return, Entropy-Eliciting Explore	Tianyu Zheng et.al.	2507.07017	null
2025-07-09	Integrating Pathology Foundation Models and Spatial Transcriptomics for Cellular Decomposition from Histology Images	Yutong Sun et.al.	2507.07013	null
2025-07-09	GNN-ViTCap: GNN-Enhanced Multiple Instance Learning with Vision Transformers for Whole Slide Image Classification and Captioning	S M Taslim Uddin Raju et.al.	2507.07006	null
2025-07-09	Learning Deliberately, Acting Intuitively: Unlocking Test-Time Reasoning in Multimodal LLMs	Yahan Yu et.al.	2507.06999	null
2025-07-09	MCA-RG: Enhancing LLMs with Medical Concept Alignment for Radiology Report Generation	Qilong Xing et.al.	2507.06992	null
2025-07-09	Are They All Good? Evaluating the Quality of CoTs in LLM-based Code Generation	Binquan Zhang et.al.	2507.06980	null
2025-07-09	Free on the Fly: Enhancing Flexibility in Test-Time Adaptation with Online EM	Qiyuan Dai et.al.	2507.06973	null
2025-07-09	Scaling Towards the Information Boundary of Instruction Set: InfinityInstruct-Subject Technical Report	Li Du et.al.	2507.06968	null
2025-07-09	CheXPO: Preference Optimization for Chest X-ray VLMs with Counterfactual Rationale	Xiao Liang et.al.	2507.06959	null
2025-07-09	Investigating the Robustness of Retrieval-Augmented Generation at the Query Level	Sezen Perçin et.al.	2507.06956	null
2025-07-10	What Has a Foundation Model Found? Using Inductive Bias to Probe for World Models	Keyon Vafa et.al.	2507.06952	null
2025-07-10	Rethinking Verification for LLM Code Generation: From Generation to Testing	Zihan Ma et.al.	2507.06920	null
2025-07-08	RSRefSeg 2: Decoupling Referring Remote Sensing Image Segmentation with Foundation Models	Keyan Chen et.al.	2507.06231	null
2025-07-08	Efficiency-Effectiveness Reranking FLOPs for LLM-based Rerankers	Zhiyuan Peng et.al.	2507.06223	null
2025-07-08	Aligned Textual Scoring Rules	Yuxuan Lu et.al.	2507.06221	null
2025-07-08	Is Diversity All You Need for Scalable Robotic Manipulation?	Modi Shi et.al.	2507.06219	null
2025-07-08	CultureCLIP: Empowering CLIP with Cultural Awareness through Synthetic Images and Contextualized Captions	Yuchen Huang et.al.	2507.06210	null
2025-07-08	Ontological differentiation as a measure of semantic accuracy	Pablo Garcia-Cuadrillero et.al.	2507.06208	null
2025-07-08	Differential Mamba	Nadav Schneider et.al.	2507.06204	null
2025-07-08	A Survey on Latent Reasoning	Rui-Jie Zhu et.al.	2507.06203	null
2025-07-08	UQLM: A Python Package for Uncertainty Quantification in Large Language Models	Dylan Bouchard et.al.	2507.06196	null
2025-07-08	SQLBarber: A System Leveraging Large Language Models to Generate Customized and Realistic SQL Workloads	Jiale Lao et.al.	2507.06192	null
2025-07-08	The Delta Learning Hypothesis: Preference Tuning on Weak Data can Yield Strong Gains	Scott Geng et.al.	2507.06187	null
2025-07-08	Hidden Prompts in Manuscripts Exploit AI-Assisted Peer Review	Zhicheng Lin et.al.	2507.06185	null
2025-07-08	Enhancing Scientific Visual Question Answering through Multimodal Reasoning and Ensemble Modeling	Prahitha Movva et.al.	2507.06183	null
2025-07-08	Data-Semantics-Aware Recommendation of Diverse Pivot Tables	Whanhee Cho et.al.	2507.06171	null
2025-07-09	Skywork-R1V3 Technical Report	Wei Shen et.al.	2507.06167	null
2025-07-08	Evaluation of Habitat Robotics using Large Language Models	William Li et.al.	2507.06157	null
2025-07-08	Large Language Models Predict Human Well-being -- But Not Equally Everywhere	Pat Pataranutaporn et.al.	2507.06141	null
2025-07-08	LangMamba: A Language-driven Mamba Framework for Low-dose CT Denoising with Vision-language Models	Zhihao Chen et.al.	2507.06140	null
2025-07-08	Coding Triangle: How Does Large Language Model Understand Code?	Taolin Zhang et.al.	2507.06138	null
2025-07-08	PrefixAgent: An LLM-Powered Design Framework for Efficient Prefix Adder Optimization	Dongsheng Zuo et.al.	2507.06127	null
2025-07-07	Beyond Simple Edits: X-Planner for Complex Instruction-Based Image Editing	Chun-Hsiao Yeh et.al.	2507.05259	null
2025-07-07	Spatio-Temporal LLM: Reasoning about Environments and Actions	Haozhen Zheng et.al.	2507.05258	null
2025-07-07	Evaluating Memory in LLM Agents via Incremental Multi-Turn Interactions	Yuanzhe Hu et.al.	2507.05257	null
2025-07-07	Open Vision Reasoner: Transferring Linguistic Cognitive Behavior for Visual Reasoning	Yana Wei et.al.	2507.05255	null
2025-07-07	Response Attack: Exploiting Contextual Priming to Jailbreak Large Language Models	Ziqi Miao et.al.	2507.05248	null
2025-07-07	When Chain of Thought is Necessary, Language Models Struggle to Evade Monitors	Scott Emmons et.al.	2507.05246	null
2025-07-07	StreamVLN: Streaming Vision-and-Language Navigation via SlowFast Context Modeling	Meng Wei et.al.	2507.05240	null
2025-07-07	Logit Reweighting for Topic-Focused Summarization	Joschka Braun et.al.	2507.05235	null
2025-07-07	NavigScene: Bridging Local Perception and Global Navigation for Beyond-Visual-Range Autonomous Driving	Qucheng Peng et.al.	2507.05227	null
2025-07-07	QuEst: Enhancing Estimates of Quantile-Based Distributional Measures Using Model Predictions	Zhun Deng et.al.	2507.05220	null
2025-07-07	All in One: Visual-Description-Guided Unified Point Cloud Segmentation	Zongyan Han et.al.	2507.05211	null
2025-07-07	MedGemma Technical Report	Andrew Sellergren et.al.	2507.05201	null
2025-07-07	Train-before-Test Harmonizes Language Model Rankings	Guanhua Zhang et.al.	2507.05195	null
2025-07-07	CREW-WILDFIRE: Benchmarking Agentic Multi-Agent Collaborations at Scale	Jonathan Hyun et.al.	2507.05178	null
2025-07-08	OpenS2S: Advancing Fully Open-Source End-to-End Empathetic Large Speech Language Model	Chen Wang et.al.	2507.05177	null
2025-07-07	Differential Attention for Multimodal Crisis Event Analysis	Nusrat Munia et.al.	2507.05165	null
2025-07-07	InfoSteer: Steering Information Utility in Language Model Post-Training	Chunyuan Deng et.al.	2507.05158	null
2025-07-07	AI Generated Text Detection Using Instruction Fine-tuned Large Language and Transformer-Based Models	Chinnappa Guggilla et.al.	2507.05157	null
2025-07-07	Interpretable Mnemonic Generation for Kanji Learning via Expectation-Maximization	Jaewook Lee et.al.	2507.05137	null
2025-07-07	LERa: Replanning with Visual Feedback in Instruction Following	Svyatoslav Pchelintsev et.al.	2507.05135	null
2025-07-03	Bootstrapping Grounded Chain-of-Thought in Multimodal LLMs for Data-Efficient Model Adaptation	Jiaer Xia et.al.	2507.02859	null
2025-07-03	Requirements Elicitation Follow-Up Question Generation	Yuchen Shen et.al.	2507.02858	null
2025-07-03	Answer Matching Outperforms Multiple Choice for Language Model Evaluation	Nikhil Chandak et.al.	2507.02856	null
2025-07-03	MOTIF: Modular Thinking via Reinforcement Fine-tuning in LLMs	Purbesh Mitra et.al.	2507.02851	null
2025-07-03	LLM Hypnosis: Exploiting User Feedback for Unauthorized Knowledge Injection to All Users	Almog Hilel et.al.	2507.02850	null
2025-07-03	Visual Contextual Attack: Jailbreaking MLLMs with Image-Driven Context Injection	Ziqi Miao et.al.	2507.02844	null
2025-07-03	LLM-Driven Treatment Effect Estimation Under Inference Time Text Confounding	Yuchen Ma et.al.	2507.02843	null
2025-07-03	StepHint: Multi-level Stepwise Hints Enhance Reinforcement Learning to Reason	Kaiyi Zhang et.al.	2507.02841	null
2025-07-03	ExPO: Unlocking Hard Reasoning with Self-Explanation-Guided Reinforcement Learning	Ruiyang Zhou et.al.	2507.02834	null
2025-07-03	Generalizing Verifiable Instruction Following	Valentina Pyatkin et.al.	2507.02833	null
2025-07-03	SynapseRoute: An Auto-Route Switching Framework on Dual-State Large Language Model	Wencheng Zhang et.al.	2507.02822	null
2025-07-03	Multimodal Mathematical Reasoning with Diverse Solving Perspective	Wenhao Shi et.al.	2507.02804	null
2025-07-03	Is Reasoning All You Need? Probing Bias in the Age of Reasoning Language Models	Riccardo Cantini et.al.	2507.02799	null
2025-07-03	No time to train! Training-Free Reference-Based Instance Segmentation	Miguel Espinosa et.al.	2507.02798	null
2025-07-03	From Long Videos to Engaging Clips: A Human-Inspired Video Editing Framework with Multimodal Narrative Understanding	Xiangfeng Wang et.al.	2507.02790	null
2025-07-03	Moral Responsibility or Obedience: What Do We Want from AI?	Joseph Boland et.al.	2507.02788	null
2025-07-03	Self-Correction Bench: Revealing and Addressing the Self-Correction Blind Spot in LLMs	Ken Tsui et.al.	2507.02778	null
2025-07-03	KERAP: A Knowledge-Enhanced Reasoning Approach for Accurate Zero-shot Diagnosis Prediction Using Multi-agent LLMs	Yuzhang Xie et.al.	2507.02773	null
2025-07-03	DeSTA2.5-Audio: Toward General-Purpose Large Audio Language Model with Self-Generated Cross-Modal Alignment	Ke-Han Lu et.al.	2507.02768	null
2025-07-03	Knowledge Protocol Engineering: A New Paradigm for AI in Domain-Specific Knowledge Work	Guangwei Zhang et.al.	2507.02760	null
2025-07-02	How Well Does GPT-4o Understand Vision? Evaluating Multimodal Foundation Models on Standard Computer Vision Tasks	Rahul Ramachandran et.al.	2507.01955	null
2025-07-02	Kwai Keye-VL Technical Report	Kwai Keye Team et.al.	2507.01949	null
2025-07-02	SpecCLIP: Aligning and Translating Spectroscopic Measurements for Stars	Xiaosheng Zhao et.al.	2507.01939	null
2025-07-02	The Thin Line Between Comprehension and Persuasion in LLMs	Adrian de Wynter et.al.	2507.01936	null
2025-07-03	Large Language Model-Driven Closed-Loop UAV Operation with Semantic Observations	Wenhao Wang et.al.	2507.01930	null
2025-07-02	A Survey on Vision-Language-Action Models: An Action Tokenization Perspective	Yifan Zhong et.al.	2507.01925	null
2025-07-03	Decision-Oriented Text Evaluation	Yu-Shiang Huang et.al.	2507.01923	null
2025-07-02	Gradient-Adaptive Policy Optimization: Towards Multi-Objective Alignment of Large Language Models	Chengao Li et.al.	2507.01915	null
2025-07-02	Reasoning to Edit: Hypothetical Instruction-Based Image Editing with Visual Reasoning	Qingdong He et.al.	2507.01908	null
2025-07-02	AI4Research: A Survey of Artificial Intelligence for Scientific Research	Qiguang Chen et.al.	2507.01903	null
2025-07-02	High-Layer Attention Pruning with Rescaling	Songtao Liu et.al.	2507.01900	null
2025-07-02	MiCoTA: Bridging the Learnability Gap with Intermediate CoT and Teacher Assistants	Dongyi Ding et.al.	2507.01887	null
2025-07-02	A computationally frugal open-source foundation model for thoracic disease detection in lung cancer screening programs	Niccolò McConnell et.al.	2507.01881	null
2025-07-02	Towards Foundation Auto-Encoders for Time-Series Anomaly Detection	Gastón García González et.al.	2507.01875	null
2025-07-02	DIY-MKG: An LLM-Based Polyglot Language Learning System	Kenan Tang et.al.	2507.01872	null
2025-07-02	Bridging UI Design and chatbot Interactions: Applying Form-Based Principles to Conversational Agents	Sanjay Krishna Anbalagan et.al.	2507.01862	null
2025-07-02	TypeTele: Releasing Dexterity in Teleoperation by Dexterous Manipulation Types	Yuhao Lin et.al.	2507.01857	null
2025-07-02	Eka-Eval : A Comprehensive Evaluation Framework for Large Language Models in Indian Languages	Samridhi Raj Sinha et.al.	2507.01853	null
2025-07-02	Low-Perplexity LLM-Generated Sequences and Where To Find Them	Arthur Wuhrmann et.al.	2507.01844	null
2025-07-02	MoIRA: Modular Instruction Routing Architecture for Multi-Task Robotics	Dmytro Kuzmenko et.al.	2507.01843	null
2025-07-01	Teaching Time Series to See and Speak: Forecasting with Aligned Visual and Textual Perspectives	Sixun Dong et.al.	2506.24124	null
2025-06-30	Calligrapher: Freestyle Text Image Customization	Yue Ma et.al.	2506.24123	null
2025-06-30	Data Uniformity Improves Training Efficiency and More, with a Convergence Framework Beyond the NTK Regime	Yuqing Wang et.al.	2506.24120	null
2025-07-01	SPIRAL: Self-Play on Zero-Sum Games Incentivizes Reasoning via Multi-Agent Multi-Turn Reinforcement Learning	Bo Liu et.al.	2506.24119	null
2025-07-01	Intertextual Parallel Detection in Biblical Hebrew: A Transformer-Based Benchmark	David M. Smiley et.al.	2506.24117	null
2025-06-30	On the Predictive Power of Representation Dispersion in Language Models	Yanhong Li et.al.	2506.24106	null
2025-06-30	DenseWorld-1M: Towards Detailed Dense Grounded Caption in the Real World	Xiangtai Li et.al.	2506.24102	null
2025-06-30	MotionGPT3: Human Motion as a Second Modality	Bingfan Zhu et.al.	2506.24086	null
2025-06-30	Logit-Gap Steering: Efficient Short-Suffix Jailbreaks for Aligned Large Language Models	Tung-Ling Li et.al.	2506.24056	null
2025-06-30	Agent.xpu: Efficient Scheduling of Agentic LLM Workloads on Heterogeneous SoC	Xinming Wei et.al.	2506.24045	null
2025-06-30	A Survey on Vision-Language-Action Models for Autonomous Driving	Sicong Jiang et.al.	2506.24044	null
2025-06-30	Foundation Models for Zero-Shot Segmentation of Scientific Images without AI-Ready Data	Shubhabrata Mukherjee et.al.	2506.24039	null
2025-06-30	Ella: Embodied Social Agents with Lifelong Memory	Hongxin Zhang et.al.	2506.24019	null
2025-06-30	EXPERT: An Explainable Image Captioning Evaluation Metric with Structured Explanations	Hyunjong Kim et.al.	2506.24016	null
2025-06-30	Large Language Models Don't Make Sense of Word Problems. A Scoping Review from a Mathematics Education Perspective	Anselm R. Strohmaier et.al.	2506.24006	null
2025-06-30	The Illusion of Progress? A Critical Look at Test-Time Adaptation for Vision-Language Models	Lijun Sheng et.al.	2506.24000	null
2025-06-30	Auto-TA: Towards Scalable Automated Thematic Analysis (TA) via Multi-Agent Large Language Models with Reinforcement Learning	Seungjun Yi et.al.	2506.23998	null
2025-06-30	StyleDrive: Towards Driving-Style Aware Benchmarking of End-To-End Autonomous Driving	Ruiyang Hao et.al.	2506.23982	null
2025-06-30	TaP: A Taxonomy-Guided Framework for Automated and Scalable Preference Data Generation	Renren Jin et.al.	2506.23979	null
2025-06-30	Visual and Memory Dual Adapter for Multi-Modal Object Tracking	Boyue Xu et.al.	2506.23972	null
2025-06-27	MiCo: Multi-image Contrast for Reinforcement Visual Reasoning	Xi Chen et.al.	2506.22434	null
2025-06-27	The Automated LLM Speedrunning Benchmark: Reproducing NanoGPT Improvements	Bingchen Zhao et.al.	2506.22419	null
2025-06-27	Sequential Diagnosis with Language Models	Harsha Nori et.al.	2506.22405	null
2025-06-27	HyperCLOVA X THINK Technical Report	NAVER Cloud HyperCLOVA X Team et.al.	2506.22403	null
2025-06-27	Refining Czech GEC: Insights from a Multi-Experiment Approach	Petr Pechman et.al.	2506.22402	null
2025-06-27	QuickSilver -- Speeding up LLM Inference through Dynamic Token Halting, KV Skipping, Contextual Token Fusion, and Adaptive Matryoshka Quantization	Danush Khanna et.al.	2506.22396	null
2025-06-27	Test-Time Consistency in Vision Language Models	Shih-Han Chou et.al.	2506.22395	null
2025-06-27	What Makes ChatGPT Effective for Software Issue Resolution? An Empirical Study of Developer-ChatGPT Conversations in GitHub	Ramtin Ehsani et.al.	2506.22390	null
2025-06-27	Can Video Large Multimodal Models Think Like Doubters-or Double-Down: A Study on Defeasible Video Entailment	Yue Zhang et.al.	2506.22385	null
2025-06-27	Probabilistic Optimality for Inference-time Scaling	Youkang Wang et.al.	2506.22376	null
2025-06-27	Exploiting Vision Language Model for Training-Free 3D Point Cloud OOD Detection via Graph Score Propagation	Tiankai Chen et.al.	2506.22375	null
2025-06-27	Towards Fair Rankings: Leveraging LLMs for Gender Bias Detection and Measurement	Maryam Mousavian et.al.	2506.22372	null
2025-06-27	Can Large Language Models Help Students Prove Software Correctness? An Experimental Study with Dafny	Carolina Carreira et.al.	2506.22370	null
2025-06-27	DiffSoundStream: Efficient Speech Tokenization via Diffusion Decoding	Yang Yang et.al.	2506.22362	null
2025-06-27	Concept-Level AI for Telecom: Moving Beyond Large Language Models	Viswanath Kumarskandpriya et.al.	2506.22359	null
2025-06-27	Optimal Estimation of Watermark Proportions in Hybrid AI-Human Texts	Xiang Li et.al.	2506.22343	null
2025-06-27	Evaluating Scoring Bias in LLM-as-a-Judge	Qingquan Li et.al.	2506.22316	null
2025-06-27	Detection of Personal Data in Structured Datasets Using a Large Language Model	Albert Agisha Ntwali et.al.	2506.22305	null
2025-06-27	Rethinking Visual Token Reduction in LVLMs under Cross-modal Misalignment	Rui Xu et.al.	2506.22283	null
2025-06-27	COOCO -- Common Objects Out-of-Context -- Semantic Violation in Scenes: Investigating Multimodal Context in Referential Communication	Filippo Merlo et.al.	2506.22274	null
2025-06-26	Where to find Grokking in LLM Pretraining? Monitor Memorization-to-Generalization without Test	Ziyue Li et.al.	2506.21551	null
2025-06-26	mTSBench: Benchmarking Multivariate Time Series Anomaly Detection and Model Selection at Scale	Xiaona Zhou et.al.	2506.21550	null
2025-06-26	SAM4D: Segment Anything in Camera and LiDAR Streams	Jianyun Xu et.al.	2506.21547	null
2025-06-26	Data Efficacy for Language Model Training	Yalun Dai et.al.	2506.21545	null
2025-06-26	PsyLite Technical Report	Fangjun Ding et.al.	2506.21536	null
2025-06-26	Exploring the Design Space of 3D MLLMs for CT Report Generation	Mohammed Baharoon et.al.	2506.21535	null
2025-06-26	"What's Up, Doc?": Analyzing How Users Seek Health Information in Large-Scale Conversational AI Datasets	Akshay Paruchuri et.al.	2506.21532	null
2025-06-26	Potemkin Understanding in Large Language Models	Marina Mancoridis et.al.	2506.21521	null
2025-06-26	Assessing an evolutionary search engine for small language models, prompts, and evaluation metrics	Cláudio Lúcio do Val Lopes et.al.	2506.21512	null
2025-06-26	Mitigating Hallucination of Large Vision-Language Models via Dynamic Logits Calibration	Jiahe Chen et.al.	2506.21509	null
2025-06-26	skLEP: A Slovak General Language Understanding Benchmark	Marek Šuppa et.al.	2506.21508	null
2025-06-26	Mind2Web 2: Evaluating Agentic Search with Agent-as-a-Judge	Boyu Gou et.al.	2506.21506	null
2025-06-26	Bridging Offline and Online Reinforcement Learning for LLMs	Jack Lanchantin et.al.	2506.21495	null
2025-06-26	Global and Local Entailment Learning for Natural World Imagery	Srikumar Sastry et.al.	2506.21476	null
2025-06-26	TopK Language Models	Ryosuke Takahashi et.al.	2506.21468	null
2025-06-26	Efficient and Reuseable Cloud Configuration Search Using Discovery Spaces	Michael Johnston et.al.	2506.21467	null
2025-06-26	Aligning Spoken Dialogue Models from User Interactions	Anne Wu et.al.	2506.21463	null
2025-06-26	Spatial Mental Modeling from Limited Views	Baiqiao Yin et.al.	2506.21458	null
2025-06-26	ThinkSound: Chain-of-Thought Reasoning in Multimodal Large Language Models for Audio Generation and Editing	Huadai Liu et.al.	2506.21448	null
2025-06-26	Text2Cypher Across Languages: Evaluating Foundational Models Beyond English	Makbule Gulcin Ozsoy et.al.	2506.21445	null
2025-06-25	The Decrypto Benchmark for Multi-Agent Reasoning and Theory of Mind	Andrei Lupu et.al.	2506.20664	null
2025-06-25	Memento: Note-Taking for Your Future Self	Chao Wan et.al.	2506.20642	null
2025-06-25	Towards Community-Driven Agents for Machine Learning Engineering	Sijie Li et.al.	2506.20640	null
2025-06-26	DiffuCoder: Understanding and Improving Masked Diffusion Models for Code Generation	Shansan Gong et.al.	2506.20639	null
2025-06-25	Shape2Animal: Creative Animal Generation from Natural Silhouettes	Quoc-Duy Tran et.al.	2506.20616	null
2025-06-25	AI Assistants to Enhance and Exploit the PETSc Knowledge Base	Barry Smith et.al.	2506.20608	null
2025-06-25	Model Editing as a Double-Edged Sword: Steering Agent Ethical Behavior Toward Beneficence or Harm	Baixiang Huang et.al.	2506.20606	null
2025-06-25	Video Perception Models for 3D Scene Synthesis	Rui Huang et.al.	2506.20601	null
2025-06-25	HRIBench: Benchmarking Vision-Language Models for Real-Time Human Perception in Human-Robot Interaction	Zhonghao Shi et.al.	2506.20566	null
2025-06-25	Large Language Model-Driven Code Compliance Checking in Building Information Modeling	Soumya Madireddy et.al.	2506.20551	null
2025-06-25	When Life Gives You Samples: The Benefits of Scaling up Inference Compute for Multilingual LLMs	Ammar Khairi et.al.	2506.20544	null
2025-06-25	WattsOnAI: Measuring, Analyzing, and Visualizing Energy and Carbon Footprint of AI Workloads	Hongzhen Huang et.al.	2506.20535	null
2025-06-25	Case-based Reasoning Augmented Large Language Model Framework for Decision Making in Realistic Safety-Critical Driving Scenarios	Wenbin Gan et.al.	2506.20531	null
2025-06-25	Asymmetric REINFORCE for off-Policy Reinforcement Learning: Balancing positive and negative rewards	Charles Arnal et.al.	2506.20520	null
2025-06-25	OctoThinker: Mid-training Incentivizes Reinforcement Learning Scaling	Zengzhi Wang et.al.	2506.20512	null
2025-06-25	BotHash: Efficient and Training-Free Bot Detection Through Approximate Nearest Neighbor	Edoardo Di Paolo et.al.	2506.20503	null
2025-06-25	ReCode: Updating Code API Knowledge with Reinforcement Learning	Haoze Wu et.al.	2506.20495	null
2025-06-25	Brains and language models converge on a shared conceptual space across different languages	Zaid Zada et.al.	2506.20489	null
2025-06-25	Behavior Foundation Model: Towards Next-Generation Whole-Body Control System of Humanoid Robots	Mingqi Yuan et.al.	2506.20487	null
2025-06-25	Counterfactual Influence as a Distributional Quantity	Matthieu Meeus et.al.	2506.20481	null
2025-06-24	Unified Vision-Language-Action Model	Yuqi Wang et.al.	2506.19850	null
2025-06-24	Orthogonal Finetuning Made Scalable	Zeju Qiu et.al.	2506.19847	null
2025-06-24	JoyAgents-R1: Joint Evolution Dynamics for Versatile Multi-LLM Agents with Reinforcement Learning	Ai Han et.al.	2506.19846	null
2025-06-24	MAM: Modular Multi-Agent Framework for Multi-Modal Medical Diagnosis via Role-Specialized Collaboration	Yucheng Zhou et.al.	2506.19835	null
2025-06-24	Evaluating Compliance with Visualization Guidelines in Diagrams for Scientific Publications Using Large Vision Language Models	Johannes Rückert et.al.	2506.19825	null
2025-06-24	Persona Features Control Emergent Misalignment	Miles Wang et.al.	2506.19823	null
2025-06-24	CronusVLA: Transferring Latent Motion Across Time for Multi-Frame Prediction in Manipulation	Hao Li et.al.	2506.19816	null
2025-06-24	Curating art exhibitions using machine learning	Eurico Covas et.al.	2506.19813	null
2025-06-24	KnowRL: Exploring Knowledgeable Reinforcement Learning for Factuality	Baochang Ren et.al.	2506.19807	null
2025-06-24	LLM-Based Social Simulations Require a Boundary	Zengqing Wu et.al.	2506.19806	null
2025-06-24	KnowML: Improving Generalization of ML-NIDS with Attack Knowledge Graphs	Xin Fan Guo et.al.	2506.19802	null
2025-06-24	Why Do Open-Source LLMs Struggle with Data Analysis? A Systematic Empirical Study	Yuqi Zhu et.al.	2506.19794	null
2025-06-24	SAGE: Strategy-Adaptive Generation Engine for Query Rewriting	Teng Wang et.al.	2506.19783	null
2025-06-24	Multi-Preference Lambda-weighted Listwise DPO for Dynamic Preference Alignment	Yuhui Sun et.al.	2506.19780	null
2025-06-24	SRFT: A Single-Stage Method with Supervised and Reinforcement Fine-Tuning for Reasoning	Yuqian Fu et.al.	2506.19767	null
2025-06-24	Arabic Dialect Classification using RNNs, Transformers, and Large Language Models: A Comparative Analysis	Omar A. Essameldin et.al.	2506.19753	null
2025-06-24	Breaking Barriers: Do Reinforcement Post Training Gains Transfer To Unseen Domains?	Chuxuan Hu et.al.	2506.19733	null
2025-06-24	LLM-Driven Medical Document Analysis: Enhancing Trustworthy Pathology and Differential Diagnosis	Lei Kang et.al.	2506.19702	null
2025-06-24	Outlier-Safe Pre-Training for Robust 4-Bit Quantization of Large Language Models	Jungwoo Park et.al.	2506.19697	null
2025-06-24	UltraAD: Fine-Grained Ultrasound Anomaly Classification via Few-Shot CLIP Adaptation	Yue Zhou et.al.	2506.19694	null
2025-06-23	Vision as a Dialect: Unifying Visual Understanding and Generation via Text-Aligned Representations	Jiaming Han et.al.	2506.18898	null
2025-06-23	ReasonFlux-PRM: Trajectory-Aware PRMs for Long Chain-of-Thought Reasoning in LLMs	Jiaru Zou et.al.	2506.18896	null
2025-06-23	Steering Conceptual Bias via Transformer Latent-Subspace Activation	Vansh Sharma et.al.	2506.18887	null
2025-06-23	Universal Video Temporal Grounding with Generative Multi-modal Large Language Models	Zeqian Li et.al.	2506.18883	null
2025-06-23	OMEGA: Can LLMs Reason Outside the Box in Math? Evaluating Exploratory, Compositional, and Transformative Generalization	Yiyou Sun et.al.	2506.18880	null
2025-06-23	CommVQ: Commutative Vector Quantization for KV Cache Compression	Junyan Li et.al.	2506.18879	null
2025-06-23	OmniAvatar: Efficient Audio-Driven Avatar Video Generation with Adaptive Body Animation	Qijun Gan et.al.	2506.18866	null
2025-06-23	TAMMs: Temporal-Aware Multimodal Model for Satellite Image Change Understanding and Forecasting	Zhongbin Guo et.al.	2506.18862	null
2025-06-23	LongWriter-Zero: Mastering Ultra-Long Text Generation via Reinforcement Learning	Yuhao Wu et.al.	2506.18841	null
2025-06-23	STU-PID: Steering Token Usage via PID Controller for Efficient Large Language Model Reasoning	Aryasomayajula Ram Bharadwaj et.al.	2506.18831	null
2025-06-23	Understanding Software Engineering Agents: A Study of Thought-Action-Result Trajectories	Islem Bouzenia et.al.	2506.18824	null
2025-06-23	RWESummary: A Framework and Test for Choosing Large Language Models to Summarize Real-World Evidence (RWE) Studies	Arjun Mukerji et.al.	2506.18819	null
2025-06-23	Context-Aware CodeLLM Eviction for AI-assisted Coding	Kishanthan Thangarajah et.al.	2506.18796	null
2025-06-23	TRIZ Agents: A Multi-Agent LLM Approach for TRIZ-Based Innovation	Kamil Szczepanik et.al.	2506.18783	null
2025-06-23	Existing LLMs Are Not Self-Consistent For Simple Tasks	Zhenru Lin et.al.	2506.18781	null
2025-06-23	Programming by Backprop: LLMs Acquire Reusable Algorithmic Abstractions During Code Training	Jonathan Cook et.al.	2506.18777	null
2025-06-23	Towards Group Fairness with Multiple Sensitive Attributes in Federated Foundation Models	Yuning Yang et.al.	2506.18732	null
2025-06-23	PARALLELPROMPT: Extracting Parallelism from Large Language Model Queries	Steven Kolawole et.al.	2506.18728	null
2025-06-23	Multi-modal Anchor Gated Transformer with Knowledge Distillation for Emotion Recognition in Conversation	Jie Li et.al.	2506.18716	link
2025-06-23	LLM-enhanced Interactions in Human-Robot Collaborative Drawing with Older Adults	Marianne Bossema et.al.	2506.18711	null
2025-06-20	VLN-R1: Vision-Language Navigation via Reinforcement Fine-Tuning	Zhangyang Qi et.al.	2506.17221	null
2025-06-20	No Free Lunch: Rethinking Internal Feedback for LLM Reasoning	Yanzhi Zhang et.al.	2506.17219	null
2025-06-20	Machine Mental Imagery: Empower Multimodal Reasoning with Latent Visual Tokens	Zeyuan Yang et.al.	2506.17218	link
2025-06-20	BREAD: Branched Rollouts from Expert Anchors Bridge SFT & RL for Reasoning	Xuechen Zhang et.al.	2506.17211	null
2025-06-20	Fine-Tuning Lowers Safety and Disrupts Evaluation Consistency	Kathleen C. Fraser et.al.	2506.17209	null
2025-06-20	Dissecting the SWE-Bench Leaderboards: Profiling Submitters and Architectures of LLM- and Agent-Based Repair Systems	Matias Martinez et.al.	2506.17208	null
2025-06-20	DreamCube: 3D Panorama Generation via Multi-plane Synchronization	Yukun Huang et.al.	2506.17206	null
2025-06-20	Confidence Scoring for LLM-Generated SQL in Supply Chain Data Extraction	Jiekai Ma et.al.	2506.17203	null
2025-06-20	Detecting LLM-Generated Short Answers and Effects on Learner Performance	Shambhavi Bhushan et.al.	2506.17196	link
2025-06-20	CLEAR-3K: Assessing Causal Explanatory Capabilities in Language Models	Naiming Liu et.al.	2506.17180	null
2025-06-20	The MedPerturb Dataset: What Non-Content Perturbations Reveal About Human and Clinical LLM Decision Making	Abinitha Gourabathina et.al.	2506.17163	null
2025-06-20	Analyzing PDFs like Binaries: Adversarially Robust PDF Malware Analysis via Intermediate Representation and Language Model	Side Liu et.al.	2506.17162	null
2025-06-20	Do We Need Large VLMs for Spotting Soccer Actions?	Ritabrata Chakraborty et.al.	2506.17144	null
2025-06-20	MeDi: Metadata-Guided Diffusion Models for Mitigating Biases in Tumor Classification	David Jacob Drexlin et.al.	2506.17140	null
2025-06-20	Large Language Model Unlearning for Source Code	Xue Jiang et.al.	2506.17125	null
2025-06-20	When Can Model-Free Reinforcement Learning be Enough for Thinking?	Josiah P. Hanna et.al.	2506.17124	null
2025-06-20	Cache Me If You Can: How Many KVs Do You Need for Effective Long-Context LMs?	Adithya Bhaskar et.al.	2506.17121	link
2025-06-20	Reassessing Code Authorship Attribution in the Era of Language Models	Atish Kumar Dipongkor et.al.	2506.17120	null
2025-06-20	Are Bias Evaluation Methods Biased?	Lina Berrayana et.al.	2506.17111	null
2025-06-20	Towards Advanced Mathematical Reasoning for LLMs via First-Order Logic Theorem Proving	Chuxue Cao et.al.	2506.17104	null
2025-06-18	PhantomHunter: Detecting Unseen Privately-Tuned LLM-Generated Text via Family-Aware Learning	Yuhui Shi et.al.	2506.15683	null
2025-06-18	GenRecal: Generation after Recalibration from Large to Small Vision-Language Models	Byung-Kwan Lee et.al.	2506.15681	null
2025-06-18	Dense SAE Latents Are Features, Not Bugs	Xiaoqing Sun et.al.	2506.15679	null
2025-06-18	SwarmAgentic: Towards Fully Automated Agentic System Generation via Swarm Intelligence	Yao Zhang et.al.	2506.15672	null
2025-06-18	CC-LEARN: Cohort-based Consistency Learning	Xiao Ye et.al.	2506.15662	null
2025-06-18	PhishDebate: An LLM-Based Multi-Agent Framework for Phishing Website Detection	Wenhao Li et.al.	2506.15656	null
2025-06-18	AutoRule: Reasoning Chain-of-thought Extracted Rule-based Rewards Improve Preference Learning	Tevin Wang et.al.	2506.15651	null
2025-06-18	Dual-Stage Value-Guided Inference with Margin-Based Reward Adjustment for Fast and Faithful VLM Captioning	Ankan Deria et.al.	2506.15649	null
2025-06-18	deepSURF: Detecting Memory Safety Vulnerabilities in Rust Through Fuzzing LLM-Augmented Harnesses	Georgios Androutsopoulos et.al.	2506.15648	null
2025-06-18	Exploring and Exploiting the Inherent Efficiency within Large Reasoning Models for Self-Guided Efficiency Enhancement	Weixiang Zhao et.al.	2506.15647	null
2025-06-18	Demystifying the Visual Quality Paradox in Multimodal Large Language Models	Shuo Xing et.al.	2506.15645	null
2025-06-18	FindingDory: A Benchmark to Evaluate Memory in Embodied Agents	Karmesh Yadav et.al.	2506.15635	null
2025-06-18	Revisiting Compositional Generalization Capability of Large Language Models Considering Instruction Following Ability	Yusuke Sakai et.al.	2506.15629	null
2025-06-18	The Effect of State Representation on LLM Agent Behavior in Dynamic Routing Games	Lyle Goodyear et.al.	2506.15624	null
2025-06-18	The Compositional Architecture of Regret in Large Language Models	Xiangxiang Cui et.al.	2506.15617	null
2025-06-18	BoxFusion: Reconstruction-Free Open-Vocabulary 3D Object Detection via Real-Time Multi-View Box Fusion	Yuqing Lan et.al.	2506.15610	null
2025-06-18	LoX: Low-Rank Extrapolation Robustifies LLM Safety Against Fine-tuning	Gabriel J. Perin et.al.	2506.15606	link
2025-06-18	LiteGD: Lightweight and dynamic GPU Dispatching for Large-scale Heterogeneous Clusters	Kunming Zhang et.al.	2506.15595	null
2025-06-18	WikiMixQA: A Multimodal Benchmark for Question Answering over Tables and Charts	Negar Foroutan et.al.	2506.15594	link
2025-06-18	DiscoSG: Towards Discourse-Level Text Scene Graph Parsing through Iterative Graph Refinement	Shaoqing Lin et.al.	2506.15583	link
2025-06-17	A Variational Framework for Improving Naturalness in Generative Spoken Language Models	Li-Wei Chen et.al.	2506.14767	link
2025-06-17	ASCD: Attention-Steerable Contrastive Decoding for Reducing Hallucination in MLLM	Yujun Wang et.al.	2506.14766	null
2025-06-17	Scaling-Up the Pretraining of the Earth Observation Foundation Model PhilEO to the MajorTOM Dataset	Nikolaos Dionelis et.al.	2506.14765	link
2025-06-17	RobotSmith: Generative Robotic Tool Design for Acquisition of Complex Manipulation Skills	Chunru Lin et.al.	2506.14763	null
2025-06-17	From Bytes to Ideas: Language Modeling with Autoregressive U-Nets	Mathurin Videau et.al.	2506.14761	link
2025-06-17	Reasoning with Exploration: An Entropy Perspective	Daixuan Cheng et.al.	2506.14758	null
2025-06-17	Large Language Models -- the Future of Fundamental Physics?	Caroline Heneka et.al.	2506.14757	null
2025-06-17	Ring-lite: Scalable Reasoning via C3PO-Stabilized Reinforcement Learning for LLMs	Ring Team et.al.	2506.14731	null
2025-06-17	AgentDistill: Training-Free Agent Distillation with Generalizable MCP Boxes	Jiahao Qiu et.al.	2506.14728	null
2025-06-17	Casper: Inferring Diverse Intents for Assistive Teleoperation with Vision Language Models	Huihan Liu et.al.	2506.14727	null
2025-06-17	Capacity Matters: a Proof-of-Concept for Transformer Memorization on Real-World Data	Anton Changalidis et.al.	2506.14704	link
2025-06-17	AGENTSAFE: Benchmarking the Safety of Embodied Agents on Hazardous Instructions	Aishan Liu et.al.	2506.14697	null
2025-06-17	Unified Software Engineering agent as AI Software Engineer	Leonhard Applis et.al.	2506.14683	null
2025-06-17	AIRTBench: Measuring Autonomous AI Red Teaming Capabilities in Language Models	Ads Dawson et.al.	2506.14682	link
2025-06-17	Massive Supervised Fine-tuning Experiments Reveal How Data, Layer, and Training Factors Shape LLM Alignment Quality	Yuto Harada et.al.	2506.14681	null
2025-06-17	Recognition through Reasoning: Reinforcing Image Geo-localization with Large Vision-Language Models	Ling Li et.al.	2506.14674	null
2025-06-17	StreetLens: Enabling Human-Centered AI Agents for Neighborhood Assessment from Street View Imagery	Jina Kim et.al.	2506.14670	null
2025-06-17	GuiLoMo: Allocating Expert Number and Rank for LoRA-MoE via Bilevel Optimization with GuidedSelection Vectors	Hengyuan Zhang et.al.	2506.14646	link
2025-06-17	Passing the Turing Test in Political Discourse: Fine-Tuning LLMs to Mimic Polarized Social Media Comments	. Pazzaglia et.al.	2506.14645	null
2025-06-17	Revisiting Chain-of-Thought Prompting: Zero-shot Can Be Stronger than Few-shot	Xiang Cheng et.al.	2506.14641	null
2025-06-16	Touch begins where vision ends: Generalizable policies for contact-rich manipulation	Zifan Zhao et.al.	2506.13762	null
2025-06-16	Prompting with the Future: Open-World Model Predictive Control with Interactive Digital Twins	Chuanruo Ning et.al.	2506.13761	null
2025-06-16	Discrete Diffusion in Large Language and Multimodal Models: A Survey	Runpeng Yu et.al.	2506.13759	link
2025-06-16	AutoVLA: A Vision-Language-Action Model for End-to-End Autonomous Driving with Adaptive Reasoning and Reinforcement Fine-Tuning	Zewei Zhou et.al.	2506.13757	link
2025-06-16	Steering LLM Thinking with Budget Guidance	Junyan Li et.al.	2506.13752	link
2025-06-16	Evaluating Large Language Models for Phishing Detection, Self-Consistency, Faithfulness, and Explainability	Shova Kuikel et.al.	2506.13746	link
2025-06-16	Instruction Following by Boosting Attention of Large Language Models	Vitoria Guardieiro et.al.	2506.13734	null
2025-06-16	Attribution-guided Pruning for Compression, Circuit Discovery, and Targeted Correction in LLMs	Sayed Mohammad Vakilzadeh Hatefi et.al.	2506.13727	link
2025-06-16	Weakest Link in the Chain: Security Vulnerabilities in Advanced Reasoning Models	Arjun Krishna et.al.	2506.13726	null
2025-06-16	OTFusion: Bridging Vision-only and Vision-Language Models via Optimal Transport for Transductive Zero-Shot Learning	Qiyu Xu et.al.	2506.13723	null
2025-06-16	TimeMaster: Training Time-Series Multimodal LLMs to Reason via Reinforcement Learning	Junru Zhang et.al.	2506.13705	link
2025-06-16	Value-Free Policy Optimization via Reward Partitioning	Bilal Faye et.al.	2506.13702	link
2025-06-16	Balancing Knowledge Delivery and Emotional Comfort in Healthcare Conversational Systems	Shang-Chi Tsai et.al.	2506.13692	null
2025-06-16	What Happens During the Loss Plateau? Understanding Abrupt Learning in Transformers	Pulkit Gopalani et.al.	2506.13688	link
2025-06-16	An LLM's Apology: Outsourcing Awkwardness in the Age of AI	Twm Stone et.al.	2506.13685	link
2025-06-16	Turning Down the Heat: A Critical Analysis of Min-p Sampling in Language Models	Rylan Schaeffer et.al.	2506.13681	null
2025-06-16	ROSA: Harnessing Robot States for Vision-Language and Action Alignment	Yuqing Wen et.al.	2506.13679	null
2025-06-16	Prefix-Tuning+: Modernizing Prefix-Tuning through Attention Independent Prefix Data	Haonan Wang et.al.	2506.13674	null
2025-06-16	We Should Identify and Mitigate Third-Party Safety Risks in MCP-Powered Agent Systems	Junfeng Fang et.al.	2506.13666	link
2025-06-16	DesignCoder: Hierarchy-Aware and Self-Correcting UI Code Generation with Large Language Models	Yunnong Chen et.al.	2506.13663	null
2025-06-13	EMLoC: Emulator-based Memory-efficient Fine-tuning with LoRA Correction	Hsi-Che Lin et.al.	2506.12015	null
2025-06-13	code_transformed: The Influence of Large Language Models on Code	Yuliang Xu et.al.	2506.12014	null
2025-06-13	Tracing LLM Reasoning Processes with Strategic Games: A Framework for Planning, Revision, and Resource-Constrained Decision Making	Xiaopeng Yuan et.al.	2506.12012	null
2025-06-13	Affogato: Learning Open-Vocabulary Affordance Grounding with Automated Data Generation at Scale	Junha Lee et.al.	2506.12009	null
2025-06-13	Generative Representational Learning of Foundation Models for Recommendation	Zheli Zhou et.al.	2506.11999	null
2025-06-13	pLSTM: parallelizable Linear Source Transition Mark networks	Korbinian Pöppel et.al.	2506.11997	null
2025-06-13	VGR: Visual Grounded Reasoning	Jiacong Wang et.al.	2506.11991	null
2025-06-13	How Visual Representations Map to Language Feature Space in Multimodal LLMs	Constantin Venhoff et.al.	2506.11976	null
2025-06-13	Improving Large Language Model Safety with Contrastive Representation Learning	Samuel Simko et.al.	2506.11938	link
2025-06-13	Feedback Friction: LLMs Struggle to Fully Incorporate External Feedback	Dongwei Jiang et.al.	2506.11930	null
2025-06-13	LiveCodeBench Pro: How Do Olympiad Medalists Judge LLMs in Competitive Programming?	Zihan Zheng et.al.	2506.11928	null
2025-06-13	GeistBERT: Breathing Life into German NLP	Raphael Scheible-Schmitt et.al.	2506.11903	null
2025-06-13	Beyond Homogeneous Attention: Memory-Efficient LLMs via Fourier-Approximated KV Cache	Xiaoran Liu et.al.	2506.11886	null
2025-06-13	Addressing Bias in LLMs: Strategies and Application to Fair AI-based Recruitment	Alejandro Peña et.al.	2506.11880	null
2025-06-13	A Short Survey on Formalising Software Requirements using Large Language Models	Arshad Beg et.al.	2506.11874	null
2025-06-13	Post Persona Alignment for Multi-Session Dialogue Generation	Yi-Pei Chen et.al.	2506.11857	null
2025-06-13	TrustGLM: Evaluating the Robustness of GraphLLMs Against Prompt, Text, and Structure Attacks	Qihai Zhang et.al.	2506.11844	null
2025-06-13	Your Ride, Your Rules: Psychology and Cognition Enabled Automated Driving Systems	Zhipeng Bao et.al.	2506.11842	null
2025-06-13	CLEAN-MI: A Scalable and Efficient Pipeline for Constructing High-Quality Neurodata in Motor Imagery Paradigm	Dingkun Liu et.al.	2506.11830	null
2025-06-13	Revealing Political Bias in LLMs through Structured Multi-Agent Debate	Aishwarya Bandaru et.al.	2506.11825	link
2025-06-12	AutoMind: Adaptive Knowledgeable Agent for Automated Data Science	Yixin Ou et.al.	2506.10974	link
2025-06-12	Farseer: A Refined Scaling Law in Large Language Models	Houyi Li et.al.	2506.10972	link
2025-06-12	Beyond Attention or Similarity: Maximizing Conditional Diversity for Token Pruning in MLLMs	Qizhe Zhang et.al.	2506.10967	link
2025-06-12	GENMANIP: LLM-driven Simulation for Generalizable Instruction-Following Manipulation	Ning Gao et.al.	2506.10966	null
2025-06-12	ChineseHarm-Bench: A Chinese Harmful Content Detection Benchmark	Kangwei Liu et.al.	2506.10960	link
2025-06-12	Distillation of atomistic foundation models across architectures and chemical domains	John L. A. Gardner et.al.	2506.10956	link
2025-06-12	SWE-Factory: Your Automated Factory for Issue Resolution Training Data and Evaluation Benchmarks	Lianghong Guo et.al.	2506.10954	link
2025-06-12	Build the web for agents, not agents for the web	Xing Han Lù et.al.	2506.10953	null
2025-06-12	Domain2Vec: Vectorizing Datasets to Find the Optimal Data Mixture without Training	Mozhi Zhang et.al.	2506.10952	null
2025-06-12	Execution Guided Line-by-Line Code Generation	Boaz Lavon et.al.	2506.10948	link
2025-06-12	GUARD: Guided Unlearning and Retention via Data Attribution for Large Language Models	Evelyn Ma et.al.	2506.10946	null
2025-06-12	Self-Adapting Language Models	Adam Zweiger et.al.	2506.10943	null
2025-06-12	Dynamic Epistemic Friction in Dialogue	Timothy Obiso et.al.	2506.10934	null
2025-06-12	The Role of Generative AI in Facilitating Social Interactions: A Scoping Review	T. T. J. E. Arets et.al.	2506.10927	null
2025-06-12	Robustly Improving LLM Fairness in Realistic Settings via Interpretability	Adam Karvonen et.al.	2506.10922	link
2025-06-12	Decomposing MLP Activations into Interpretable Features via Semi-Nonnegative Matrix Factorization	Or Shafran et.al.	2506.10920	link
2025-06-12	Sequential-Parallel Duality in Prefix Scannable Models	Morris Yau et.al.	2506.10918	null
2025-06-12	Foundation Models for Causal Inference via Prior-Data Fitted Networks	Yuchen Ma et.al.	2506.10914	null
2025-06-12	Breaking Bad Molecules: Are MLLMs Ready for Structure-Level Molecular Detoxification?	Fei Lin et.al.	2506.10912	null
2025-06-12	NoLoCo: No-all-reduce Low Communication Training Method for Large Models	Jari Kolehmainen et.al.	2506.10911	link
2025-06-11	Flipping Against All Odds: Reducing LLM Coin Flip Bias via Verbalized Rejection Sampling	Tim Z. Xiao et.al.	2506.09998	null
2025-06-11	From Judgment to Interference: Early Stopping LLM Harmful Outputs via Streaming Content Monitoring	Yang Li et.al.	2506.09996	null
2025-06-11	Large Language Models for Toxic Language Detection in Low-Resource Balkan Languages	Amel Muminovic et.al.	2506.09992	link
2025-06-11	Multiverse: Your Language Models Secretly Decide How to Parallelize and Merge Generation	Xinyu Yang et.al.	2506.09991	null
2025-06-11	EditInspector: A Benchmark for Evaluation of Text-Guided Image Edits	Ron Yosef et.al.	2506.09988	null
2025-06-11	A Shortcut-aware Video-QA Benchmark for Physical Understanding via Minimal Video Pairs	Benno Krojer et.al.	2506.09987	null
2025-06-11	V-JEPA 2: Self-Supervised Video Models Enable Understanding, Prediction and Planning	Mido Assran et.al.	2506.09985	link
2025-06-11	Step-by-step Instructions and a Simple Tabular Output Format Improve the Dependency Parsing Accuracy of LLMs	Hiroshi Matsuda et.al.	2506.09983	link
2025-06-11	AnimateAnyMesh: A Feed-Forward 4D Foundation Model for Text-Driven Universal Mesh Animation	Zijie Wu et.al.	2506.09982	null
2025-06-11	SRLAgent: Enhancing Self-Regulated Learning Skills through Gamification and LLM Assistance	Wentao Ge et.al.	2506.09968	null
2025-06-11	Resa: Transparent Reasoning Models via SAEs	Shangshang Wang et.al.	2506.09967	link
2025-06-11	Reinforcing Spatial Reasoning in Vision-Language Models with Interwoven Thinking and Visual Drawing	Junfei Wu et.al.	2506.09965	link
2025-06-11	Kvasir-VQA-x1: A Multimodal Dataset for Medical Reasoning and Robust MedVQA in Gastrointestinal Endoscopy	Sushant Gautam et.al.	2506.09958	null
2025-06-11	LLMail-Inject: A Dataset from a Realistic Adaptive Prompt Injection Challenge	Sahar Abdelnabi et.al.	2506.09956	link
2025-06-11	Query-Focused Retrieval Heads Improve Long-Context Reasoning and Re-ranking	Wuwei Zhang et.al.	2506.09944	link
2025-06-11	VerIF: Verification Engineering for Reinforcement Learning in Instruction Following	Hao Peng et.al.	2506.09942	link
2025-06-11	From Intention to Execution: Probing the Generalization Boundaries of Vision-Language-Action Models	Irving Fang et.al.	2506.09930	null
2025-06-11	PersonaLens: A Benchmark for Personalization Evaluation in Conversational AI Assistants	Zheng Zhao et.al.	2506.09902	link
2025-06-11	The Emergence of Abstract Thought in Large Language Models Beyond Any Language	Yuxin Chen et.al.	2506.09890	null
2025-06-11	Attention Head Embeddings with Trainable Deep Kernels for Hallucination Detection in LLMs	Rodion Oblovatny et.al.	2506.09886	null
2025-06-10	VIKI-R: Coordinating Embodied Multi-Agent Cooperation via Reinforcement Learning	Li Kang et.al.	2506.09049	null
2025-06-10	Same Task, Different Circuits: Disentangling Modality-Specific Mechanisms in VLMs	Yaniv Nikankin et.al.	2506.09047	link
2025-06-10	Agentic Neural Networks: Self-Evolving Multi-Agent Systems via Textual Backpropagation	Xiaowen Ma et.al.	2506.09046	null
2025-06-10	Cosmos-Drive-Dreams: Scalable Synthetic Driving Data Generation with World Foundation Models	Xuanchi Ren et.al.	2506.09042	link
2025-06-10	Autoregressive Semantic Visual Reconstruction Helps VLMs Understand Better	Dianyi Wang et.al.	2506.09040	link
2025-06-10	AbstentionBench: Reasoning LLMs Fail on Unanswerable Questions	Polina Kirichenko et.al.	2506.09038	link
2025-06-10	FZOO: Fast Zeroth-Order Optimizer for Fine-Tuning Large Language Models towards Adam-Scale Speed	Sizhe Dang et.al.	2506.09034	null
2025-06-10	Router-R1: Teaching LLMs Multi-Round Routing and Aggregation via Reinforcement Learning	Haozhen Zhang et.al.	2506.09033	link
2025-06-10	Do MIL Models Transfer?	Daniel Shao et.al.	2506.09022	link
2025-06-10	SPEED-RL: Faster Training of Reasoning Models via Online Curriculum Learning	Ruiqi Zhang et.al.	2506.09016	link
2025-06-10	Learning to Reason Across Parallel Samples for LLM Reasoning	Jianing Qi et.al.	2506.09014	null
2025-06-10	Boosting Rust Unit Test Coverage through Hybrid Program Analysis and Large Language Models	Bei Chu et.al.	2506.09002	null
2025-06-10	Efficient Medical Vision-Language Alignment Through Adapting Masked Vision Models	Chenyu Lian et.al.	2506.08990	link
2025-06-10	SwS: Self-aware Weakness-driven Problem Synthesis in Reinforcement Learning for LLM Reasoning	Xiao Liang et.al.	2506.08989	link
2025-06-10	On Finetuning Tabular Foundation Models	Ivan Rubachev et.al.	2506.08982	link
2025-06-10	AdaDec: Uncertainty-Guided Adaptive Decoding for LLM-based Code Generation	Kaifeng He et.al.	2506.08980	null
2025-06-10	Propositional Logic for Probing Generalization in Neural Networks	Anna Langedijk et.al.	2506.08978	null
2025-06-10	Atomic-to-Compositional Generalization for Mobile Agents with A New Benchmark and Scheduling System	Yuan Guo et.al.	2506.08972	null
2025-06-10	ADAM: Autonomous Discovery and Annotation Model using LLMs for Context-Aware Annotations	Amirreza Rouhi et.al.	2506.08968	null
2025-06-10	Step-Audio-AQAA: a Fully End-to-End Expressive Large Audio Language Model	Ailin Huang et.al.	2506.08967	null
2025-06-09	GUI-Reflection: Empowering Multimodal GUI Models with Self-Reflection Behavior	Penghao Wu et.al.	2506.08012	null
2025-06-09	Play to Generalize: Learning to Reason Through Game Play	Yunfei Xie et.al.	2506.08011	link
2025-06-09	Vision Transformers Don't Need Trained Registers	Nick Jiang et.al.	2506.08010	link
2025-06-09	Hidden in plain sight: VLMs overlook their visual representations	Stephanie Fu et.al.	2506.08008	null
2025-06-09	Reinforcement Pre-Training	Qingxiu Dong et.al.	2506.08007	null
2025-06-09	Reparameterized LLM Training via Orthogonal Equivalence Transformation	Zeju Qiu et.al.	2506.08001	null
2025-06-09	Supporting Construction Worker Well-Being with a Multi-Agent Conversational AI System	Fan Yang et.al.	2506.07997	null
2025-06-09	HeuriGym: An Agentic Benchmark for LLM-Crafted Heuristics in Combinatorial Optimization	Hongzheng Chen et.al.	2506.07972	link
2025-06-09	CyberV: Cybernetics for Test-time Scaling in Video Understanding	Jiahao Meng et.al.	2506.07971	link
2025-06-09	SpaCE-10: A Comprehensive Benchmark for Multimodal Large Language Models in Compositional Spatial Intelligence	Ziyang Gong et.al.	2506.07966	link
2025-06-09	Reinforcing Multimodal Understanding and Generation with Dual Self-rewards	Jixiang Hong et.al.	2506.07963	null
2025-06-09	Correlated Errors in Large Language Models	Elliot Kim et.al.	2506.07962	null
2025-06-09	BridgeVLA: Input-Output Alignment for Efficient 3D Manipulation Learning with Vision-Language Models	Peiyan Li et.al.	2506.07961	null
2025-06-09	Language Models over Canonical Byte-Pair Encodings	Tim Vieira et.al.	2506.07956	null
2025-06-09	TokenBreak: Bypassing Text Classification Models Through Token Manipulation	Kasimir Schulz et.al.	2506.07948	null
2025-06-09	Statistical Hypothesis Testing for Auditing Robustness in Language Models	Paulius Rauba et.al.	2506.07947	null
2025-06-09	ProtocolLLM: RTL Benchmark for SystemVerilog Generation of Communication Protocols	Arnav Sheth et.al.	2506.07945	link
2025-06-09	Decoupling the Image Perception and Multimodal Reasoning for Reasoning Segmentation with Digital Twin Representations	Yizhen Li et.al.	2506.07943	null
2025-06-09	Adversarial Attack Classification and Robustness Testing for Large Language Models for Code	Yang Liu et.al.	2506.07942	null
2025-06-09	Gradients: When Markets Meet Fine-tuning -- A Distributed Approach to Model Optimisation	Christopher Subia-Waud et.al.	2506.07940	null
2025-06-06	TerraFM: A Scalable Foundation Model for Unified Multisensor Earth Observation	Muhammad Sohail Danish et.al.	2506.06281	null
2025-06-06	Eigenspectrum Analysis of Neural Networks without Aspect Ratio Bias	Yuanzhe Hu et.al.	2506.06280	null
2025-06-06	CoMemo: LVLMs Need Image Context with Image Memory	Shi Liu et.al.	2506.06279	null
2025-06-06	Movie Facts and Fibs (MF$^2$): A Benchmark for Long Movie Understanding	Emmanouil Zaranis et.al.	2506.06275	null
2025-06-06	AdvSumm: Adversarial Training for Bias Mitigation in Text Summarization	Mukur Gupta et.al.	2506.06273	null
2025-06-06	RecGPT: A Foundation Model for Sequential Recommendation	Yangqin Jiang et.al.	2506.06270	link
2025-06-06	Cartridges: Lightweight and general-purpose long context representations via self-study	Sabri Eyuboglu et.al.	2506.06266	null
2025-06-06	PersonaAgent: When Large Language Model Agents Meet Personalization at Test Time	Weizhi Zhang et.al.	2506.06254	null
2025-06-06	DesignBench: A Comprehensive Benchmark for MLLM-based Front-end Code Generation	Jingyu Xiao et.al.	2506.06251	link
2025-06-06	Visual Graph Arena: Evaluating Visual Conceptualization of Vision and Multimodal Large Language Models	Zahra Babaiee et.al.	2506.06242	null
2025-06-06	Bridging External and Parametric Knowledge: Mitigating Hallucination of LLMs with Shared-Private Semantic Synergy in Dual-Stream Knowledge	Yi Sui et.al.	2506.06240	null
2025-06-06	Explaining Matters: Leveraging Definitions and Semantic Expansion for Sexism Detection	Sahrish Khan et.al.	2506.06238	null
2025-06-06	Challenging Vision-Language Models with Surgical Data: A New Dataset and Broad Benchmarking Study	Leon Mayer et.al.	2506.06232	null
2025-06-06	CompilerGPT: Leveraging Large Language Models for Analyzing and Acting on Compiler Optimization Reports	Peter Pirkelbauer et.al.	2506.06227	null
2025-06-06	PROVSYN: Synthesizing Provenance Graphs for Data Augmentation in Intrusion Detection Systems	Yi Huang et.al.	2506.06226	null
2025-06-06	GenIR: Generative Visual Feedback for Mental Image Retrieval	Diji Yang et.al.	2506.06220	null
2025-06-06	STSBench: A Spatio-temporal Scenario Benchmark for Multi-modal Large Language Models in Autonomous Driving	Christian Fruhwirth-Reisinger et.al.	2506.06218	link
2025-06-06	Corrector Sampling in Language Models	Itai Gat et.al.	2506.06215	null
2025-06-06	Can Theoretical Physics Research Benefit from Language Agents?	Sirui Lu et.al.	2506.06214	null
2025-06-06	PuzzleWorld: A Benchmark for Multimodal, Open-Ended Reasoning in Puzzlehunts	Hengzhi Li et.al.	2506.06211	null
2025-06-05	Why LLM Safety Guardrails Collapse After Fine-tuning: A Similarity Analysis Between Alignment and Fine-tuning Datasets	Lei Hsiung et.al.	2506.05346	null
2025-06-05	SparseMM: Head Sparsity Emerges from Visual Concept Responses in MLLMs	Jiahui Wang et.al.	2506.05344	link
2025-06-05	Direct Numerical Layout Generation for 3D Indoor Scene Synthesis via Spatial Reasoning	Xingjian Ran et.al.	2506.05341	null
2025-06-05	Flattery, Fluff, and Fog: Diagnosing and Mitigating Idiosyncratic Biases in Preference Models	Anirudh Bharadwaj et.al.	2506.05339	link
2025-06-05	VideoMolmo: Spatio-Temporal Grounding Meets Pointing	Ghazi Shazan Ahmad et.al.	2506.05336	link
2025-06-05	Search Arena: Analyzing Search-Augmented LLMs	Mihran Miroyan et.al.	2506.05334	link
2025-06-05	Unleashing Hour-Scale Video Training for Long Video-Language Understanding	Jingyang Lin et.al.	2506.05332	null
2025-06-05	MINT-CoT: Enabling Interleaved Visual Tokens in Mathematical Chain-of-Thought Reasoning	Xinyan Chen et.al.	2506.05331	link
2025-06-05	LSM-2: Learning from Incomplete Wearable Sensor Data	Maxwell A. Xu et.al.	2506.05321	null
2025-06-06	Does Your 3D Encoder Really Work? When Pretrain-SFT from 2D VLMs Meets 3D VLMs	Haoyuan Li et.al.	2506.05318	null
2025-06-05	Improving Data Efficiency for LLM Reinforcement Fine-tuning Through Difficulty-targeted Online Data Selection and Rollout Replay	Yifan Sun et.al.	2506.05316	null
2025-06-05	Constrained Entropic Unlearning: A Primal-Dual Framework for Large Language Models	Taha Entesari et.al.	2506.05314	null
2025-06-05	ProRefine: Inference-time Prompt Refinement with Textual Feedback	Deepak Pandita et.al.	2506.05305	null
2025-06-05	Perceive Anything: Recognize, Explain, Caption, and Segment Anything in Images and Videos	Weifeng Lin et.al.	2506.05302	null
2025-06-05	Power Law Guided Dynamic Sifting for Efficient Attention	Nirav Koley et.al.	2506.05300	null
2025-06-05	Control Tax: The Price of Keeping AI in Check	Mikhail Terekhov et.al.	2506.05296	null
2025-06-05	Sample Complexity and Representation Ability of Test-time Scaling Paradigms	Baihe Huang et.al.	2506.05295	null
2025-06-05	EOC-Bench: Can MLLMs Identify, Recall, and Forecast Objects in an Egocentric World?	Yuqian Yuan et.al.	2506.05287	null
2025-06-05	Micro-Act: Mitigate Knowledge Conflict in Question Answering via Actionable Self-Reasoning	Nan Huo et.al.	2506.05278	null
2025-06-05	Teaming in the AI Era: AI-Augmented Frameworks for Forming, Simulating, and Optimizing Human Teams	Mohammed Almutairi et.al.	2506.05265	null
2025-06-04	OWMM-Agent: Open World Mobile Manipulation With Multi-modal Agentic Data Synthesis	Junting Chen et.al.	2506.04217	link
2025-06-04	Language-Image Alignment with Fixed Text Encoders	Jingfeng Yang et.al.	2506.04209	null
2025-06-04	Advancing Multimodal Reasoning: From Optimized Cold Start to Staged Reinforcement Learning	Shuang Chen et.al.	2506.04207	null
2025-06-04	EPiC: Towards Lossless Speedup for Reasoning Training through Edge-Preserving CoT Condensation	Jinghan Jia et.al.	2506.04205	link
2025-06-04	Cascadia: A Cascade Serving System for Large Language Models	Youhe Jiang et.al.	2506.04203	null
2025-06-04	TracLLM: A Generic Framework for Attributing Long Context LLMs	Yanting Wang et.al.	2506.04202	link
2025-06-04	R-Search: Empowering LLM Reasoning with Search via Multi-Reward Reinforcement Learning	Qingfei Zhao et.al.	2506.04185	link
2025-06-04	SuperWriter: Reflection-Driven Long-Form Generation with Large Language Models	Yuhao Wu et.al.	2506.04180	null
2025-06-04	SkipGPT: Dynamic Layer Pruning Reinvented with Token Awareness and Module Decoupling	Anhao Zhao et.al.	2506.04179	null
2025-06-04	Does Prompt Design Impact Quality of Data Imputation by LLMs?	Shreenidhi Srinivasan et.al.	2506.04172	null
2025-06-04	VISCA: Inferring Component Abstractions for Automated End-to-End Testing	Parsa Alian et.al.	2506.04161	null
2025-06-04	Image Editing As Programs with Diffusion Models	Yujia Hu et.al.	2506.04158	null
2025-06-04	A Dataset for Addressing Patient's Information Needs related to Clinical Course of Hospitalization	Sarvesh Soni et.al.	2506.04156	null
2025-06-04	Establishing Trustworthy LLM Evaluation via Shortcut Neuron Analysis	Kejian Zhu et.al.	2506.04142	null
2025-06-04	MMR-V: What's Left Unsaid? A Benchmark for Multimodal Deep Reasoning in Videos	Kejian Zhu et.al.	2506.04141	null
2025-06-04	TRiSM for Agentic AI: A Review of Trust, Risk, and Security Management in LLM-based Agentic Multi-Agent Systems	Shaina Raza et.al.	2506.04133	null
2025-06-04	Recent Advances in Medical Image Classification	Loan Dao et.al.	2506.04129	null
2025-06-04	Guided Speculative Inference for Efficient Test-Time Alignment of LLMs	Jonathan Geuter et.al.	2506.04118	link
2025-06-05	Rectified Sparse Attention	Yutao Sun et.al.	2506.04108	null
2025-06-04	TextAtari: 100K Frames Game Playing with Language Agents	Wenhao Li et.al.	2506.04098	link
2025-06-03	Causal Estimation of Tokenisation Bias	Pietro Lesci et.al.	2506.03149	null
2025-06-03	UniWorld: High-Resolution Semantic Encoders for Unified Visual Understanding and Generation	Bin Lin et.al.	2506.03147	null
2025-06-03	Entity-Augmented Neuroscience Knowledge Retrieval Using Ontology and Semantic Understanding Capability of LLM	Pralaypati Ta et.al.	2506.03145	null
2025-06-03	Not All Tokens Are Meant to Be Forgotten	Xiangyu Zhou et.al.	2506.03142	null
2025-06-03	SVGenius: Benchmarking LLMs in SVG Understanding, Editing and Generation	Siqi Chen et.al.	2506.03139	null
2025-06-03	OmniSpatial: Towards Comprehensive Spatial Reasoning Benchmark for Vision Language Models	Mengdi Jia et.al.	2506.03135	null
2025-06-03	Native-Resolution Image Synthesis	Zidong Wang et.al.	2506.03131	null
2025-06-03	AnimeShooter: A Multi-Shot Animation Dataset for Reference-Guided Video Generation	Lu Qiu et.al.	2506.03126	null
2025-06-03	AUTOCIRCUIT-RL: Reinforcement Learning-Driven LLM for Automated Circuit Topology Generation	Prashanth Vijayaraghavan et.al.	2506.03122	null
2025-06-03	Targeted Forgetting of Image Subgroups in CLIP Models	Zeliang Zhang et.al.	2506.03117	null
2025-06-04	Critique-GRPO: Advancing LLM Reasoning with Natural Language and Numerical Feedback	Xiaoying Zhang et.al.	2506.03106	null
2025-06-03	Beyond Text Compression: Evaluating Tokenizers Across Scales	Jonas F. Lotz et.al.	2506.03101	null
2025-06-03	TalkingMachines: Real-Time Audio-Driven FaceTime-Style Video via Autoregressive Diffusion Models	Chetwin Low et.al.	2506.03099	null
2025-06-03	EgoVLM: Policy Optimization for Egocentric Video Understanding	Ashwin Vinod et.al.	2506.03097	link
2025-06-03	DPO Learning with LLMs-Judge Signal for Computer Use Agents	Man Luo et.al.	2506.03095	null
2025-06-03	From Flat to Hierarchical: Extracting Sparse Representations with Matching Pursuit	Valérie Costa et.al.	2506.03093	null
2025-06-03	Literary Evidence Retrieval via Long-Context Language Models	Katherine Thai et.al.	2506.03090	null
2025-06-03	StreamBP: Memory-Efficient Exact Backpropagation for Long Sequence Training of LLMs	Qijun Luo et.al.	2506.03077	null
2025-06-03	LEG-SLAM: Real-Time Language-Enhanced Gaussian Splatting for SLAM	Roman Titkov et.al.	2506.03073	null
2025-06-03	EDITOR: Effective and Interpretable Prompt Inversion for Text-to-Image Diffusion Models	Mingzhe Li et.al.	2506.03067	null
2025-05-30	ReasonGen-R1: CoT for Autoregressive Image generation models through SFT and RL	Yu Zhang et.al.	2505.24875	null
2025-05-30	The Road to Generalizable Neuro-Symbolic Learning Should be Paved with Foundation Models	Adam Stein et.al.	2505.24874	link
2025-05-30	ProxyThinker: Test-Time Guidance through Small Visual Reasoners	Zilin Xiao et.al.	2505.24872	link
2025-05-30	MoDoMoDo: Multi-Domain Data Mixtures for Multimodal LLM Reinforcement Learning	Yiqing Liang et.al.	2505.24871	null
2025-05-30	GenSpace: Benchmarking Spatially-Aware Image Generation	Zehan Wang et.al.	2505.24870	null
2025-05-30	SiLVR: A Simple Language-based Video Reasoning Framework	Ce Zhang et.al.	2505.24869	link
2025-05-30	Time Blindness: Why Video-Language Models Can't See What Humans Can?	Ujjwal Upadhyay et.al.	2505.24867	null
2025-05-30	ProRL: Prolonged Reinforcement Learning Expands Reasoning Boundaries in Large Language Models	Mingjie Liu et.al.	2505.24864	link
2025-05-30	Beyond Multiple Choice: Evaluating Steering Vectors for Adaptive Free-Form Summarization	Joschka Braun et.al.	2505.24859	null
2025-05-30	Accelerated Sampling from Masked Diffusion Models via Entropy Bounded Unmasking	Heli Ben-Hamu et.al.	2505.24857	null
2025-05-30	MiCRo: Mixture Modeling and Context-aware Routing for Personalized Preference Learning	Jingyan Shen et.al.	2505.24846	null
2025-05-30	Chameleon: A Flexible Data-mixing Framework for Language Model Pretraining and Finetuning	Wanyun Xie et.al.	2505.24844	link
2025-05-30	Cascading Adversarial Bias from Injection to Distillation in Language Models	Harsh Chaudhari et.al.	2505.24842	null
2025-05-30	Vision LLMs Are Bad at Hierarchical Visual Understanding, and LLMs Are the Bottleneck	Yuwen Tan et.al.	2505.24840	null
2025-05-30	VideoCAD: A Large-Scale Video Dataset for Learning UI Interactions and 3D Reasoning from CAD Software	Brandon Man et.al.	2505.24838	link
2025-06-02	How much do language models memorize?	John X. Morris et.al.	2505.24832	null
2025-05-30	Improving Reliability and Explainability of Medical Question Answering through Atomic Fact Checking in Retrieval-Augmented LLMs	Juraj Vladika et.al.	2505.24830	null
2025-05-30	LegalEval-Q: A New Benchmark for The Quality Evaluation of LLM-Generated Legal Text	Li yunhan et.al.	2505.24826	link
2025-05-30	PhySense: Principle-Based Physics Reasoning Benchmarking for Large Language Models	Yinggan Xu et.al.	2505.24823	null
2025-05-30	Bi-Manual Joint Camera Calibration and Scene Representation	Haozhan Tang et.al.	2505.24819	null
2025-05-29	TextRegion: Text-Aligned Region Tokens from Frozen Image-Text Models	Yao Xiao et.al.	2505.23769	link
2025-05-29	Argus: Vision-Centric Reasoning with Grounded Chain-of-Thought	Yunze Man et.al.	2505.23766	null
2025-05-29	From Chat Logs to Collective Insights: Aggregative Question Answering	Wentao Zhang et.al.	2505.23765	null
2025-05-29	MMSI-Bench: A Benchmark for Multi-Image Spatial Intelligence	Sihan Yang et.al.	2505.23764	null
2025-05-29	ZeroGUI: Automating Online GUI Learning at Zero Human Cost	Chenyu Yang et.al.	2505.23762	link
2025-05-29	Differential Information: An Information-Theoretic Perspective on Preference Optimization	Yunjae Won et.al.	2505.23761	null
2025-05-29	Puzzled by Puzzles: When Vision-Language Models Can't Take a Hint	Heekyung Lee et.al.	2505.23759	link
2025-05-29	DeepTheorem: Advancing LLM Reasoning for Theorem Proving Through Natural Language and Reinforcement Learning	Ziyin Zhang et.al.	2505.23754	link
2025-05-29	ThinkGeo: Evaluating Tool-Augmented Agents for Remote Sensing Tasks	Akashah Shabbir et.al.	2505.23752	link
2025-05-29	Distortion of AI Alignment: Does Preference Optimization Optimize for Preferences?	Paul Gölz et.al.	2505.23749	null
2025-05-29	Spatial-MLLM: Boosting MLLM Capabilities in Visual-based Spatial Intelligence	Diankun Wu et.al.	2505.23747	null
2025-05-29	To Trust Or Not To Trust Your Vision-Language Model's Prediction	Hao Dong et.al.	2505.23745	link
2025-05-29	LayerPeeler: Autoregressive Peeling for Layer-wise Image Vectorization	Ronghuan Wu et.al.	2505.23740	null
2025-05-29	ATLAS: Learning to Optimally Memorize the Context at Test Time	Ali Behrouz et.al.	2505.23735	null
2025-05-29	Bounded Rationality for LLMs: Satisficing Alignment at Inference-Time	Mohamad Chehade et.al.	2505.23729	null
2025-05-29	PixelThink: Towards Efficient Chain-of-Pixel Reasoning	Song Wang et.al.	2505.23727	null
2025-05-29	FMG-Det: Foundation Model Guided Robust Object Detection	Darryl Hannan et.al.	2505.23726	null
2025-05-29	MuLoCo: Muon is a practical inner optimizer for DiLoCo	Benjamin Thérien et.al.	2505.23725	null
2025-05-29	SC-LoRA: Balancing Efficient Fine-tuning and Knowledge Preservation via Subspace-Constrained LoRA	Minrui Luo et.al.	2505.23724	null
2025-05-29	ML-Agent: Reinforcing LLM Agents for Autonomous Machine Learning Engineering	Zexi Liu et.al.	2505.23723	link
2025-05-28	Zero-Shot Vision Encoder Grafting via LLM Surrogates	Kaiyu Yue et.al.	2505.22664	link
2025-05-28	Training Free Stylized Abstraction	Aimon Rahman et.al.	2505.22663	null
2025-05-28	AutoL2S: Auto Long-Short Reasoning for Efficient Large Language Models	Feng Luo et.al.	2505.22662	null
2025-05-28	GuessArena: Guess Who I Am? A Self-Adaptive Framework for Evaluating LLMs in Domain-Specific Knowledge and Reasoning	Qingchen Yu et.al.	2505.22661	null
2025-05-28	Maximizing Confidence Alone Improves Reasoning	Mihir Prabhudesai et.al.	2505.22660	null
2025-05-28	3DLLM-Mem: Long-Term Spatial-Temporal Memory for Embodied 3D Large Language Model	Wenbo Hu et.al.	2505.22657	null
2025-05-28	Position: Uncertainty Quantification Needs Reassessment for Large-language Model Agents	Michael Kirchhof et.al.	2505.22655	null
2025-05-28	VScan: Rethinking Visual Token Reduction for Efficient Large Vision-Language Models	Ce Zhang et.al.	2505.22654	null
2025-05-28	The Climb Carves Wisdom Deeper Than the Summit: On the Noisy Rewards in Learning to Reason	Ang Lv et.al.	2505.22653	null
2025-05-28	Sherlock: Self-Correcting Reasoning in Vision-Language Models	Yi Ding et.al.	2505.22651	null
2025-05-28	Characterizing Bias: Benchmarking Large Language Models in Simplified versus Traditional Chinese	Hanjia Lyu et.al.	2505.22645	link
2025-05-28	Understanding (Un)Reliability of Steering Vectors in Language Models	Joschka Braun et.al.	2505.22637	null
2025-05-28	Learning Composable Chains-of-Thought	Fangcong Yin et.al.	2505.22635	null
2025-05-28	Spatial Knowledge Graph-Guided Multimodal Synthesis	Yida Xue et.al.	2505.22633	null
2025-05-28	Stochastic Chameleons: Irrelevant Context Hallucinations Reveal Class-Based (Mis)Generalization in LLMs	Ziling Cheng et.al.	2505.22630	null
2025-05-28	Principled Out-of-Distribution Generalization via Simplicity	Jiawei Ge et.al.	2505.22622	null
2025-05-28	Fast-dLLM: Training-free Acceleration of Diffusion LLM by Enabling KV Cache and Parallel Decoding	Chengyue Wu et.al.	2505.22618	null
2025-05-28	The Entropy Mechanism of Reinforcement Learning for Reasoning Language Models	Ganqu Cui et.al.	2505.22617	null
2025-05-28	RICO: Improving Accuracy and Completeness in Image Recaptioning via Visual Reconstruction	Yuchi Wang et.al.	2505.22613	null
2025-05-28	Effective and Efficient One-pass Compression of Speech Foundation Models Using Sparsity-aware Self-pinching Gates	Haoning Xu et.al.	2505.22608	null
2025-05-27	Silence is Not Consensus: Disrupting Agreement Bias in Multi-Agent LLMs via Catfish Agent for Clinical Decision Making	Yihan Wang et.al.	2505.21503	null
2025-05-27	ViewSpatial-Bench: Evaluating Multi-perspective Spatial Localization in Vision-Language Models	Dingming Li et.al.	2505.21500	null
2025-05-27	AdInject: Real-World Black-Box Attacks on Web Agents via Advertising Delivery	Haowei Wang et.al.	2505.21499	link
2025-05-27	Adversarial Attacks against Closed-Source MLLMs via Feature Optimal Alignment	Xiaojun Jia et.al.	2505.21494	link
2025-05-27	Reinforcing General Reasoning without Verifiers	Xiangxin Zhou et.al.	2505.21493	link
2025-05-27	Robust Hypothesis Generation: LLM-Automated Language Bias for Inductive Logic Programming	Yang Yang et.al.	2505.21486	null
2025-05-27	Are Language Models Consequentialist or Deontological Moral Reasoners?	Keenan Samway et.al.	2505.21479	null
2025-05-27	Policy Optimized Text-to-Image Pipeline Design	Uri Gadot et.al.	2505.21478	null
2025-05-27	Mitigating Hallucination in Large Vision-Language Models via Adaptive Attention Calibration	Mehrdad Fazli et.al.	2505.21472	null
2025-05-27	Scaling External Knowledge Input Beyond Context Windows of LLMs via Multi-Agent Collaboration	Zijun Liu et.al.	2505.21471	link
2025-05-27	Accelerating Diffusion Language Model Inference via Efficient KV Caching and Guided Diffusion	Zhanqiu Hu et.al.	2505.21467	null
2025-05-27	ID-Align: RoPE-Conscious Position Remapping for Dynamic High-Resolution Adaptation in Vision-Language Models	Bozhou Li et.al.	2505.21465	null
2025-05-27	LazyVLM: Neuro-Symbolic Approach to Video Analytics	Xiangru Jian et.al.	2505.21459	null
2025-05-27	Do LLMs Need to Think in One Language? Correlation between Latent Language and Task Performance	Shintaro Ozaki et.al.	2505.21458	null
2025-05-27	Active-O3: Empowering Multimodal Large Language Models with Active Perception via GRPO	Muzhi Zhu et.al.	2505.21457	null
2025-05-27	Can Large Reasoning Models Self-Train?	Sheikh Shafayat et.al.	2505.21444	null
2025-05-27	Towards Better Instruction Following Retrieval Models	Yuchen Zhuang et.al.	2505.21439	null
2025-05-27	Hume: Introducing System-2 Thinking in Visual-Language-Action Model	Haoming Song et.al.	2505.21432	null
2025-05-27	Policy Induction: Predicting Startup Success via Explainable Memory-Augmented In-Context Learning	Xianling Mu et.al.	2505.21427	null
2025-05-27	GUARD: Dual-Agent based Backdoor Defense on Chain-of-Thought in Neural Code Generation	Naizhu Jin et.al.	2505.21425	null
2025-05-26	Pangu Light: Weight Re-Initialization for Pruning and Accelerating LLMs	Hanting Chen et.al.	2505.20155	null
2025-05-26	UORA: Uniform Orthogonal Reinitialization Adaptation in Parameter-Efficient Fine-Tuning of Large Models	Xueyan Zhang et.al.	2505.20154	null
2025-05-26	MineAnyBuild: Benchmarking Spatial Planning for Open-world AI Agents	Ziming Wei et.al.	2505.20148	link
2025-05-26	FUDOKI: Discrete Flow-based Unified Understanding and Generation via Kinetic-Optimal Velocities	Jin Wang et.al.	2505.20147	null
2025-05-26	SeMe: Training-Free Language Model Merging via Semantic Alignment	Jian Gu et.al.	2505.20144	null
2025-05-26	StructEval: Benchmarking LLMs' Capabilities to Generate Structural Outputs	Jialin Yang et.al.	2505.20139	null
2025-05-26	AweDist: Attention-aware Embedding Distillation for New Input Token Embeddings	Konstantin Dobler et.al.	2505.20133	null
2025-05-26	Agentic 3D Scene Generation with Spatially Contextualized VLMs	Xinhang Liu et.al.	2505.20129	null
2025-05-26	Iterative Self-Incentivization Empowers Large Language Models as Agentic Searchers	Zhengliang Shi et.al.	2505.20128	link
2025-05-26	Agentic AI Process Observability: Discovering Behavioral Variability	Fabiana Fournier et.al.	2505.20127	null
2025-05-26	MEBench: A Novel Benchmark for Understanding Mutual Exclusivity Bias in Vision-Language Models	Anh Thai et.al.	2505.20122	null
2025-05-27	TrojanStego: Your Language Model Can Secretly Be A Steganographic Privacy Leaking Agent	Dominik Meier et.al.	2505.20118	link
2025-05-26	Named Entity Recognition in Historical Italian: The Case of Giacomo Leopardi's Zibaldone	Cristian Santini et.al.	2505.20113	null
2025-05-26	ResSVD: Residual Compensated SVD for Large Language Model Compression	Haolei Bai et.al.	2505.20112	null
2025-05-26	Language-Agnostic Suicidal Risk Detection Using Large Language Models	June-Woo Kim et.al.	2505.20109	null
2025-05-26	Adaptive Deep Reasoning: Triggering Deep Thinking When Needed	Yunhao Wang et.al.	2505.20101	null
2025-05-26	AdaTP: Attention-Debiased Token Pruning for Video Large Language Models	Fengyuan Sun et.al.	2505.20100	null
2025-05-26	Large Language Models Meet Knowledge Graphs for Question Answering: Synthesis and Opportunities	Chuangtao Ma et.al.	2505.20099	link
2025-05-26	S2LPP: Small-to-Large Prompt Prediction across LLMs	Liang Cheng et.al.	2505.20097	null
2025-05-26	Multi-Domain Explainability of Preferences	Nitay Calderon et.al.	2505.20088	null
2025-05-26	Safety Through Reasoning: An Empirical Study of Reasoning Guardrail Models	Makesh Narsimhan Sreedhar et.al.	2505.20087	null
2025-05-26	Inference-time Alignment in Continuous Space	Yige Yuan et.al.	2505.20081	link
2025-05-23	Fann or Flop: A Multigenre, Multiera Benchmark for Arabic Poetry Understanding in LLMs	Wafa Alghallabi et.al.	2505.18152	link
2025-05-23	First Finish Search: Efficient Test-Time Scaling in Large Language Models	Aradhye Agarwal et.al.	2505.18149	null
2025-05-23	Lost in the Haystack: Smaller Needles are More Difficult for LLMs to Find	Owen Bianchi et.al.	2505.18148	null
2025-05-23	Graph-Linguistic Fusion: Using Language Models for Wikidata Vandalism Detection	Mykola Trokhymovych et.al.	2505.18136	null
2025-05-23	Gaming Tool Preferences in Agentic LLMs	Kazem Faghih et.al.	2505.18135	link
2025-05-23	VideoGameBench: Can Vision-Language Models complete popular video games?	Alex L. Zhang et.al.	2505.18134	null
2025-05-23	One RL to See Them All: Visual Triple Unified Reinforcement Learning	Yan Ma et.al.	2505.18129	null
2025-05-23	Reward Model Overoptimisation in Iterated RLHF	Lorenz Wolf et.al.	2505.18126	null
2025-05-23	TabSTAR: A Foundation Tabular Model With Semantically Target-Aware Representations	Alan Arazi et.al.	2505.18125	null
2025-05-23	UNJOIN: Enhancing Multi-Table Text-to-SQL Generation via Schema Simplification	Poojah Ganesan et.al.	2505.18122	null
2025-05-23	ProgRM: Build Better GUI Agents with Progress Rewards	Danyang Zhang et.al.	2505.18121	null
2025-05-23	Bidirectional Knowledge Distillation for Enhancing Sequential Recommendation with Large Language Models	Jiongran Wu et.al.	2505.18120	null
2025-05-23	Watch and Listen: Understanding Audio-Visual-Speech Moments with Multimodal LLM	Zinuo Li et.al.	2505.18110	null
2025-05-23	ManuSearch: Democratizing Deep Search in Large Language Models with a Transparent and Open Multi-Agent Framework	Lisheng Huang et.al.	2505.18105	link
2025-05-23	How Can I Publish My LLM Benchmark Without Giving the True Answers Away?	Takashi Ishida et.al.	2505.18102	null
2025-05-23	Planning without Search: Refining Frontier LLMs with Offline Goal-Conditioned RL	Joey Hong et.al.	2505.18098	null
2025-05-23	QwenLong-CPRS: Towards $\infty$-LLMs with Dynamic Context Optimization	Weizhou Shen et.al.	2505.18092	null
2025-05-23	Data Mixing Can Induce Phase Transitions in Knowledge Acquisition	Xinran Gu et.al.	2505.18091	null
2025-05-23	CXReasonBench: A Benchmark for Evaluating Structured Diagnostic Reasoning in Chest X-rays	Hyungyung Lee et.al.	2505.18087	link
2025-05-23	Deep Video Discovery: Agentic Search with Tool Use for Long-form Video Understanding	Xiaoyi Zhang et.al.	2505.18079	null
2025-05-22	CrossLMM: Decoupling Long Video Sequences from LMMs via Dual Cross-Attention Mechanisms	Shilin Yan et.al.	2505.17020	link
2025-05-22	Let Androids Dream of Electric Sheep: A Human-like Image Implication Understanding and Reasoning Framework	Chenhao Zhang et.al.	2505.17019	link
2025-05-22	SophiaVL-R1: Reinforcing MLLMs Reasoning with Thinking Reward	Kaixuan Fan et.al.	2505.17018	link
2025-05-22	Delving into RL for Image Generation with CoT: A Study on DPO vs. GRPO	Chengzhuo Tong et.al.	2505.17017	link
2025-05-22	Multi-SpatialMLLM: Multi-Frame Spatial Understanding with Multi-Modal Large Language Models	Runsen Xu et.al.	2505.17015	null
2025-05-22	SpatialScore: Towards Unified Evaluation for Multimodal Spatial Understanding	Haoning Wu et.al.	2505.17012	link
2025-05-22	R1-Searcher++: Incentivizing the Dynamic Knowledge Acquisition of LLMs via Reinforcement Learning	Huatong Song et.al.	2505.17005	link
2025-05-22	Do Large Language Models Excel in Complex Logical Reasoning with Formal Language?	Jin Jiang et.al.	2505.16998	link
2025-05-22	DecoupledESC: Enhancing Emotional Support Generation via Strategy-Response Decoupled Preference Optimization	Chao Zhang et.al.	2505.16995	null
2025-05-22	Dimple: Discrete Diffusion Multimodal Large Language Model with Parallel Decoding	Runpeng Yu et.al.	2505.16990	link
2025-05-22	T1: A Tool-Oriented Conversational Dataset for Multi-Turn Agentic Planning	Amartya Chakraborty et.al.	2505.16986	null
2025-05-22	UFT: Unifying Supervised and Reinforcement Fine-Tuning	Mingyang Liu et.al.	2505.16984	link
2025-05-22	LLM as Effective Streaming Processor: Bridging Streaming-Batch Mismatches with Group Position Encoding	Junlong Tong et.al.	2505.16983	link
2025-05-22	Beyond Correlation: Towards Causal Large Language Model Agents in Biomedicine	Adib Bazgir et.al.	2505.16982	null
2025-05-22	HyGenar: An LLM-Driven Hybrid Genetic Algorithm for Few-Shot Grammar Generation	Weizhi Tang et.al.	2505.16978	link
2025-05-22	SWE-Dev: Evaluating and Training Autonomous Feature-Driven Software Development	Yaxin Du et.al.	2505.16975	link
2025-05-22	CASS: Nvidia to AMD Transpilation with Data, Models, and Benchmark	Ahmed Heakl et.al.	2505.16968	link
2025-05-22	Invisible Prompts, Visible Threats: Malicious Font Injection in External Resources for Large Language Models	Junjie Xiong et.al.	2505.16957	null
2025-05-22	On Multilingual Encoder Language Model Compression for Low-Resource Languages	Daniil Gurgurov et.al.	2505.16956	null
2025-05-22	A Comprehensive Evaluation of Contemporary ML-Based Solvers for Combinatorial Optimization	Shengyu Feng et.al.	2505.16952	null
2025-05-21	InstructSAM: A Training-Free Framework for Instruction-Oriented Remote Sensing Object Recognition	Yijie Zheng et.al.	2505.15818	link
2025-05-21	On the creation of narrow AI: hierarchy and nonlocality of neural network skills	Eric J. Michaud et.al.	2505.15811	link
2025-05-21	MMaDA: Multimodal Large Diffusion Language Models	Ling Yang et.al.	2505.15809	link
2025-05-21	The Atlas of In-Context Learning: How Attention Heads Shape In-Context Retrieval Augmentation	Patrick Kahardipraja et.al.	2505.15807	link
2025-05-21	Keep Security! Benchmarking Security Policy Preservation in Large Language Model Contexts Against Indirect Attacks in Question Answering	Hwan Chang et.al.	2505.15805	link
2025-05-21	STAR-R1: Spacial TrAnsformation Reasoning by Reinforcing Multimodal LLMs	Zongzhao Li et.al.	2505.15804	link
2025-05-21	VerifyBench: Benchmarking Reference-based Reward Systems for Large Language Models	Yuchen Yan et.al.	2505.15801	null
2025-05-21	Model Merging is Secretly Certifiable: Non-Vacuous Generalisation Bounds for Low-Shot Learning	Taehoon Kim et.al.	2505.15798	null
2025-05-21	Reverse Engineering Human Preferences with Reinforcement Learning	Lisa Alazraki et.al.	2505.15795	null
2025-05-21	HCRMP: A LLM-Hinted Contextual Reinforcement Learning Framework for Autonomous Driving	Zhiwen Chen et.al.	2505.15793	null
2025-05-21	Large Language Models as Computable Approximations to Solomonoff Induction	Jun Wan et.al.	2505.15784	null
2025-05-21	dKV-Cache: The Cache for Diffusion Language Models	Xinyin Ma et.al.	2505.15781	link
2025-05-21	ConvSearch-R1: Enhancing Query Reformulation for Conversational Search with Reasoning via Reinforcement Learning	Changtai Zhu et.al.	2505.15776	link
2025-05-21	Beyond Hard and Soft: Hybrid Context Compression for Balancing Local and Global Information Retention	Huanxuan Liao et.al.	2505.15774	link
2025-05-21	MIKU-PAL: An Automated and Standardized Multi-Modal Method for Speech Paralinguistic and Affect Labeling	Cheng Yifan et.al.	2505.15772	null
2025-05-21	An Empirical Analysis of Vulnerability Detection Tools for Solidity Smart Contracts Using Line Level Manually Annotated Vulnerabilities	Francesco Salzano et.al.	2505.15756	null
2025-05-21	Exploring The Visual Feature Space for Multimodal Neural Decoding	Weihao Xia et.al.	2505.15755	null
2025-05-21	Scalable Defense against In-the-wild Jailbreaking Attacks with Safety Context Retrieval	Taiye Chen et.al.	2505.15753	null
2025-05-21	Multi-modal Integration Analysis of Alzheimer's Disease Using Large Language Models and Knowledge Graphs	Kanan Kiguchi et.al.	2505.15747	null
2025-05-21	Evolutionary Computation and Large Language Models: A Survey of Methods, Synergies, and Applications	Dikshit Chauhan et.al.	2505.15741	null
2025-05-20	Language Models use Lookbacks to Track Beliefs	Nikhil Prakash et.al.	2505.14685	null
2025-05-20	Mind the Gap: Bridging Thought Leap for Improved Chain-of-Thought Tuning	Haolei Xu et.al.	2505.14684	null
2025-05-20	Emerging Properties in Unified Multimodal Pretraining	Chaorui Deng et.al.	2505.14683	null
2025-05-20	UniGen: Enhanced Training & Test-Time Strategies for Unified Multimodal Understanding and Generation	Rui Tian et.al.	2505.14682	null
2025-05-20	UltraEdit: Training-, Subject-, and Memory-Free Lifelong Editing in Large Language Models	Xiaojie Gu et.al.	2505.14679	link
2025-05-20	Visionary-R1: Mitigating Shortcuts in Visual Reasoning with Reinforcement Learning	Jiaer Xia et.al.	2505.14677	null
2025-05-20	Reward Reasoning Model	Jiaxin Guo et.al.	2505.14674	null
2025-05-20	UniCTokens: Boosting Personalized Understanding and Generation via Unified Concept Tokens	Ruichuan An et.al.	2505.14671	link
2025-05-20	Quartet: Native FP4 Training Can Be Optimal for Large Language Models	Roberto L. Castro et.al.	2505.14669	link
2025-05-20	ContextAgent: Context-Aware Proactive LLM Agents with Open-World Sensory Perceptions	Bufang Yang et.al.	2505.14668	null
2025-05-20	Beyond Words: Multimodal LLM Knows When to Speak	Zikai Liao et.al.	2505.14654	null
2025-05-20	General-Reasoner: Advancing LLM Reasoning Across All Domains	Xueguang Ma et.al.	2505.14652	null
2025-05-20	Vox-Profile: A Speech Foundation Model Benchmark for Characterizing Diverse Speaker and Speech Traits	Tiantian Feng et.al.	2505.14648	link
2025-05-20	CAD-Coder: An Open-Source Vision-Language Model for Computer-Aided Design Code Generation	Anna C. Doris et.al.	2505.14646	link
2025-05-20	Think Only When You Need with Large Hybrid-Reasoning Models	Lingjie Jiang et.al.	2505.14631	null
2025-05-20	KERL: Knowledge-Enhanced Personalized Recipe Recommendation using Large Language Models	Fnu Mohbat et.al.	2505.14629	link
2025-05-20	Debating for Better Reasoning: An Unsupervised Multimodal Approach	Ashutosh Adhikari et.al.	2505.14627	null
2025-05-20	TinyV: Reducing False Negatives in Verification Improves RL for LLM Reasoning	Zhangchen Xu et.al.	2505.14625	link
2025-05-20	Enhancing Learned Knowledge in LoRA Adapters Through Efficient Contrastive Decoding on Ascend NPUs	Morgan Lindsay Heisler et.al.	2505.14620	null
2025-05-20	Linear Control of Test Awareness Reveals Differential Compliance in Reasoning Models	Sahar Abdelnabi et.al.	2505.14617	link
2025-05-19	CIE: Controlling Language Model Text Generations Using Continuous Signals	Vinay Samuel et.al.	2505.13448	link
2025-05-19	Trust, But Verify: A Self-Verification Approach to Reinforcement Learning with Verifiable Rewards	Xiaoyuan Liu et.al.	2505.13445	link
2025-05-19	ChartMuseum: Testing Visual Reasoning Capabilities of Large Vision-Language Models	Liyan Tang et.al.	2505.13444	null
2025-05-19	GraspMolmo: Generalizable Task-Oriented Grasping via Large-Scale Synthetic Data Generation	Abhay Deshpande et.al.	2505.13441	null
2025-05-19	Optimizing Anytime Reasoning via Budget Relative Policy Optimization	Penghui Qi et.al.	2505.13438	link
2025-05-19	SMOTExT: SMOTE meets Large Language Models	Mateusz Bystroński et.al.	2505.13434	null
2025-05-19	Fine-tuning Quantized Neural Networks with Zeroth-order Optimization	Sifeng Shang et.al.	2505.13430	link
2025-05-19	MM-PRM: Enhancing Multimodal Mathematical Reasoning with Scalable Step-Level Supervision	Lingxiao Du et.al.	2505.13427	link
2025-05-19	G1: Bootstrapping Perception and Reasoning Abilities of Vision-Language Model via Reinforcement Learning	Liang Chen et.al.	2505.13426	link
2025-05-19	Learnware of Language Models: Specialized Small Language Models Can Do Big	Zhi-Hao Tan et.al.	2505.13425	link
2025-05-19	Make Still Further Progress: Chain of Thoughts for Tabular Data Leaderboard	Si-Yang Liu et.al.	2505.13421	null
2025-05-19	FEALLM: Advancing Facial Emotion Analysis in Multimodal Large Language Models with Emotional Synergy and Reasoning	Zhuozhao Hu et.al.	2505.13419	link
2025-05-19	CoT-Kinetics: A Theoretical Modeling Assessing LRM Reasoning Process	Jinhe Bi et.al.	2505.13408	null
2025-05-19	AutoMathKG: The automated mathematical knowledge graph based on LLM and vector database	Rong Bian et.al.	2505.13406	null
2025-05-19	MR. Judge: Multimodal Reasoner as a Judge	Renjie Pi et.al.	2505.13403	null
2025-05-19	R3: Robust Rubric-Agnostic Reward Models	David Anugraha et.al.	2505.13388	link
2025-05-19	CompeteSMoE -- Statistically Guaranteed Mixture of Experts Training via Competition	Nam V. Nguyen et.al.	2505.13380	link
2025-05-19	Thinkless: LLM Learns When to Think	Gongfan Fang et.al.	2505.13379	link
2025-05-19	Seeing, Saying, Solving: An LLM-to-TL Framework for Cooperative Robots	Dan BW Choe et.al.	2505.13376	null
2025-05-19	Multi-Armed Bandits Meet Large Language Models	Djallel Bouneffouf et.al.	2505.13355	null
2025-05-16	Modeling cognitive processes of natural reading with transformer-based Language Models	Bruno Bianchi et.al.	2505.11485	null
2025-05-16	msf-CNN: Patch-based Multi-Stage Fusion with Convolutional Neural Networks for TinyML	Zhaolan Huang et.al.	2505.11483	link
2025-05-16	Improving Assembly Code Performance with Large Language Models via Reinforcement Learning	Anjiang Wei et.al.	2505.11480	null
2025-05-16	HelpSteer3-Preference: Open Human-Annotated Preference Data across Diverse Tasks and Languages	Zhilin Wang et.al.	2505.11475	null
2025-05-16	Disentangling Reasoning and Knowledge in Medical Large Language Models	Rahul Thapa et.al.	2505.11462	null
2025-05-16	ProxyPrompt: Securing System Prompts against Prompt Extraction Attacks	Zhixiong Zhuang et.al.	2505.11459	null
2025-05-16	LLMs unlock new paths to monetizing exploits	Nicholas Carlini et.al.	2505.11449	null
2025-05-16	Is Compression Really Linear with Code Intelligence?	Xianzhen Luo et.al.	2505.11441	null
2025-05-16	GODBench: A Benchmark for Multimodal Large Language Models in Video Comment Art	Chenkai Zhang et.al.	2505.11436	link
2025-05-16	MegaScale-MoE: Large-Scale Communication-Efficient Training of Mixture-of-Experts Models in Production	Chao Jin et.al.	2505.11432	null
2025-05-16	Mergenetic: a Simple Evolutionary Model Merging Library	Adrian Robert Minut et.al.	2505.11427	link
2025-05-16	When Thinking Fails: The Pitfalls of Reasoning for Instruction-Following in LLMs	Xiaomin Li et.al.	2505.11423	null
2025-05-16	Towards Cultural Bridge by Bahnaric-Vietnamese Translation Using Transfer Learning of Sequence-To-Sequence Pre-training Language Model	Phan Tran Minh Dat et.al.	2505.11421	null
2025-05-16	EdgeWisePersona: A Dataset for On-Device User Profiling from Natural Language Interactions	Patryk Bartkowiak et.al.	2505.11417	link
2025-05-16	MoE-CAP: Benchmarking Cost, Accuracy and Performance of Sparse Mixture-of-Experts Systems	Yinsicheng Jiang et.al.	2505.11415	null
2025-05-16	CARES: Comprehensive Evaluation of Safety and Adversarial Robustness in Medical LLMs	Sijia Chen et.al.	2505.11413	null
2025-05-16	Visual Planning: Let's Think Only with Images	Yi Xu et.al.	2505.11409	link
2025-05-16	Large Language Model Use Impact Locus of Control	Jenny Xiyu Fu et.al.	2505.11406	null
2025-05-16	EmotionHallucer: Evaluating Emotion Hallucinations in Multimodal Large Language Models	Bohao Xing et.al.	2505.11405	link
2025-05-16	Patho-R1: A Multimodal Reinforcement Learning-Based Pathology Expert Reasoner	Wenchuan Zhang et.al.	2505.11404	link
2025-05-15	End-to-End Vision Tokenizer Tuning	Wenxuan Wang et.al.	2505.10562	null
2025-05-15	Neural Thermodynamic Laws for Large Language Model Training	Ziming Liu et.al.	2505.10559	null
2025-05-15	Does Feasibility Matter? Understanding the Impact of Feasibility on Synthetic Training Data	Yiwen Liu et.al.	2505.10551	link
2025-05-15	Real-Time Out-of-Distribution Failure Prevention via Multi-Modal Reasoning	Milan Ganai et.al.	2505.10547	null
2025-05-15	Towards a Deeper Understanding of Reasoning Capabilities in Large Language Models	Annie Wong et.al.	2505.10543	link
2025-05-15	Exploring Implicit Visual Misunderstandings in Multimodal Large Language Models through Attention Analysis	Pengfei Wang et.al.	2505.10541	link
2025-05-15	S3C2 Summit 2024-09: Industry Secure Software Supply Chain Summit	Imranur Rahman et.al.	2505.10538	null
2025-05-15	WorldPM: Scaling Human Preference Modeling	Binghai Wang et.al.	2505.10527	link
2025-05-15	MASSV: Multimodal Adaptation and Self-Data Distillation for Speculative Decoding of Vision-Language Models	Mugilan Ganesan et.al.	2505.10526	null
2025-05-15	Multi-Token Prediction Needs Registers	Anastasios Gerontopoulos et.al.	2505.10518	link
2025-05-15	RouteNator: A Router-Based Multi-Modal Architecture for Generating Synthetic Training Data for Function Calling LLMs	Vibha Belavadi et.al.	2505.10495	null
2025-05-15	Can You Really Trust Code Copilots? Evaluating Large Language Models from a Code Security Perspective	Yutao Mou et.al.	2505.10494	link
2025-05-15	CL-RAG: Bridging the Gap in Retrieval-Augmented Generation with Curriculum Learning	Shaohan Wang et.al.	2505.10493	null
2025-05-15	Campus AI vs Commercial AI: A Late-Breaking Study on How LLM As-A-Service Customizations Shape Trust and Usage Patterns	Leon Hannig et.al.	2505.10490	null
2025-05-15	Parallel Scaling Law for Language Models	Mouxiang Chen et.al.	2505.10475	link
2025-05-15	Large Language Models for Cancer Communication: Evaluating Linguistic Quality, Safety, and Accessibility in Generative AI	Agnik Saha et.al.	2505.10472	null
2025-05-15	AI Agents vs. Agentic AI: A Conceptual Taxonomy, Applications and Challenges	Ranjan Sapkota et.al.	2505.10468	null
2025-05-15	Superposition Yields Robust Neural Scaling	Yizhou Liu et.al.	2505.10465	link
2025-05-15	Vision language models have difficulty recognizing virtual objects	Tyler Tran et.al.	2505.10453	null
2025-05-15	Reinforcing the Diffusion Chain of Lateral Thought with Diffusion Language Models	Zemin Huang et.al.	2505.10446	null
2025-05-14	Language Agents Mirror Human Causal Reasoning Biases. How Can We Help Them Think Like Scientists?	Anthony GX-Chen et.al.	2505.09614	null
2025-05-14	Customizing a Large Language Model for VHDL Design of High-Performance Microprocessors	Nicolas Dupuis et.al.	2505.09610	null
2025-05-14	Adversarial Suffix Filtering: a Defense Pipeline for LLMs	David Khachaturov et.al.	2505.09602	null
2025-05-14	How Hungry is AI? Benchmarking Energy, Water, and Carbon Footprint of LLM Inference	Nidhal Jegham et.al.	2505.09598	null
2025-05-14	WorldView-Bench: A Benchmark for Evaluating Global Cultural Perspectives in Large Language Models	Abdullah Mushtaq et.al.	2505.09595	null
2025-05-14	Variational Visual Question Answering	Tobias Jan Wieczorek et.al.	2505.09591	null
2025-05-15	Beyond Likes: How Normative Feedback Complements Engagement Signals on Social Media	Yuchen Wu et.al.	2505.09583	null
2025-05-14	VTLA: Vision-Tactile-Language-Action Model with Preference Learning for Insertion Manipulation	Chaofan Zhang et.al.	2505.09577	null
2025-05-14	Ethics and Persuasion in Reinforcement Learning from Human Feedback: A Procedural Rhetorical Approach	Shannon Lodoen et.al.	2505.09576	null
2025-05-14	MIGRATION-BENCH: Repository-Level Code Migration Benchmark from Java 8	Linbo Liu et.al.	2505.09569	link
2025-05-14	Using Foundation Models as Pseudo-Label Generators for Pre-Clinical 4D Cardiac CT Segmentation	Anne-Marie Rickmann et.al.	2505.09564	null
2025-05-14	WavReward: Spoken Dialogue Models With Generalist Reward Evaluators	Shengpeng Ji et.al.	2505.09558	link
2025-05-14	PT-MoE: An Efficient Finetuning Framework for Integrating Mixture-of-Experts into Prompt Tuning	Zongqian Li et.al.	2505.09519	link
2025-05-15	Towards Fair In-Context Learning with Tabular Foundation Models	Patrik Kenfack et.al.	2505.09503	null
2025-05-14	Layered Unlearning for Adversarial Relearning	Timothy Qian et.al.	2505.09500	link
2025-05-14	Flash-VL 2B: Optimizing Vision-Language Model Performance for Ultra-Low Latency and High Throughput	Bo Zhang et.al.	2505.09498	null
2025-05-14	Card Sorting Simulator: Augmenting Design of Logical Information Architectures with Large Language Models	Eduard Kuric et.al.	2505.09478	null
2025-05-14	Deploying Foundation Model-Enabled Air and Ground Robots in the Field: Challenges and Opportunities	Zachary Ravichandran et.al.	2505.09477	null
2025-05-14	Evaluating GPT- and Reasoning-based Large Language Models on Physics Olympiad Problems: Surpassing Human Performance and Implications for Educational Assessment	Paul Tschisgale et.al.	2505.09438	null
2025-05-14	CXMArena: Unified Dataset to benchmark performance in realistic CXM Scenarios	Raghav Garg et.al.	2505.09436	link
2025-05-13	CodePDE: An Inference Framework for LLM-driven PDE Solver Generation	Shanda Li et.al.	2505.08783	link
2025-05-13	HealthBench: Evaluating Large Language Models Towards Improved Human Health	Rahul K. Arora et.al.	2505.08775	link
2025-05-14	Towards Autonomous UAV Visual Object Search in City Space: Benchmark and Agentic Methodology	Yatai Ji et.al.	2505.08765	null
2025-05-13	Aya Vision: Advancing the Frontier of Multilingual Multimodality	Saurabh Dash et.al.	2505.08751	null
2025-05-13	AC-Reason: Towards Theory-Guided Actual Causality Reasoning with Large Language Models	Yanxi Zhang et.al.	2505.08750	link
2025-05-13	DeepMath-Creative: A Benchmark for Evaluating Mathematical Creativity of Large Language Models	Xiaoyang Chen et.al.	2505.08744	link
2025-05-13	Probability Consistency in Large Language Models: Theoretical Foundations Meet Empirical Discrepancies	Xiaoliang Luo et.al.	2505.08739	link
2025-05-13	Towards Foundation Models for Experimental Readout Systems Combining Discrete and Continuous Data	James Giroux et.al.	2505.08736	link
2025-05-13	NurValues: Real-World Nursing Values Evaluation for Large Language Models in Clinical Context	Ben Yao et.al.	2505.08734	null
2025-05-13	Securing RAG: A Risk Assessment and Mitigation Framework	Lukas Ammann et.al.	2505.08728	null
2025-05-13	Memorization-Compression Cycles Improve Generalization	Fangyuan Yu et.al.	2505.08727	null
2025-05-13	Extending Large Vision-Language Model for Diverse Interactive Tasks in Autonomous Driving	Zongchuang Zhao et.al.	2505.08725	link
2025-05-13	TiMo: Spatiotemporal Foundation Model for Satellite Image Time Series	Xiaolei Qin et.al.	2505.08723	link
2025-05-13	PWC-MoE: Privacy-Aware Wireless Collaborative Mixture of Experts	Yang Su et.al.	2505.08719	null
2025-05-13	Controllable Image Colorization with Instance-aware Texts and Masks	Yanru An et.al.	2505.08705	null
2025-05-13	LLM-based Prompt Ensemble for Reliable Medical Entity Recognition from EHRs	K M Sajjadul Islam et.al.	2505.08704	null
2025-05-14	Granite-speech: open-source speech-aware LLMs with strong English ASR capabilities	George Saon et.al.	2505.08699	null
2025-05-13	VizCV: AI-assisted visualization of researchers' publications tracks	Vladimír Lazárik et.al.	2505.08691	null
2025-05-13	Adaptive Schema-aware Event Extraction with Retrieval-Augmented Generation	Sheng Liang et.al.	2505.08690	null
2025-05-13	A Social Robot with Inner Speech for Dietary Guidance	Valerio Belcamino et.al.	2505.08664	link
2025-05-12	DanceGRPO: Unleashing GRPO on Visual Generation	Zeyue Xue et.al.	2505.07818	null
2025-05-12	Imagine, Verify, Execute: Memory-Guided Agentic Exploration with Vision-Language Models	Seungjae Lee et.al.	2505.07815	null
2025-05-12	Learning Dynamics in Continual Pre-Training for Large Language Models	Xingjin Wang et.al.	2505.07796	null
2025-05-12	Domain Regeneration: How well do LLMs match syntactic properties of text domains?	Da Ju et.al.	2505.07784	null
2025-05-12	Relative Overfitting and Accept-Reject Framework	Yanxin Liu et.al.	2505.07783	null
2025-05-12	MLE-Dojo: Interactive Environments for Empowering LLM Agents in Machine Learning Engineering	Rushi Qiang et.al.	2505.07782	link
2025-05-12	Must Read: A Systematic Survey of Computational Persuasion	Nimet Beyza Bozdag et.al.	2505.07775	link
2025-05-12	Agent RL Scaling Law: Agent RL with Spontaneous Code Execution for Mathematical Problem Solving	Xinji Mai et.al.	2505.07773	link
2025-05-12	Enhancing Code Generation via Bidirectional Comment-Level Mutual Grounding	Yifeng Di et.al.	2505.07768	link
2025-05-12	BodyGPS: Anatomical Positioning System	Halid Ziya Yerebakan et.al.	2505.07744	null
2025-05-12	Assessing the Chemical Intelligence of Large Language Models	Nicholas T. Runcie et.al.	2505.07735	link
2025-05-12	Spoken Language Understanding on Unseen Tasks With In-Context Learning	Neeraj Agrawal et.al.	2505.07731	null
2025-05-12	Reproducibility, Replicability, and Insights into Visual Document Retrieval with Late Interaction	Jingfen Qiao et.al.	2505.07730	link
2025-05-12	Circuit Partitioning Using Large Language Models for Quantum Compilation and Simulations	Pranav Sinha et.al.	2505.07711	null
2025-05-12	Through the Looking Glass: Common Sense Consistency Evaluation of Weird Images	Elisei Rykov et.al.	2505.07704	null
2025-05-12	PatchTrack: A Comprehensive Analysis of ChatGPT's Influence on Pull Request Outcomes	Daniel Ogenrwot et.al.	2505.07700	null
2025-05-12	Beyond CLIP Generalization: Against Forward&Backward Forgetting Adapter for Continual Learning of Vision-Language Models	Songlin Dong et.al.	2505.07690	null
2025-05-12	S-GRPO: Early Exit via Reinforcement Learning in Reasoning Models	Muzhi Dai et.al.	2505.07686	null
2025-05-12	Multimodal Survival Modeling in the Age of Foundation Models	Steven Song et.al.	2505.07683	link
2025-05-12	SpecRouter: Adaptive Routing for Multi-Level Speculative Decoding in Large Language Models	Hang Wu et.al.	2505.07680	null
2025-05-09	Towards a Unified Representation Evaluation Framework Beyond Downstream Tasks	Christos Plachouras et.al.	2505.06224	link
2025-05-09	Adapting a Segmentation Foundation Model for Medical Image Classification	Pengfei Gu et.al.	2505.06217	null
2025-05-09	From Millions of Tweets to Actionable Insights: Leveraging LLMs for User Profiling	Vahid Rahimzadeh et.al.	2505.06184	null
2025-05-09	A Large Language Model-Enhanced Q-learning for Capacitated Vehicle Routing Problem with Time Windows	Linjiang Cao et.al.	2505.06178	null
2025-05-09	MonetGPT: Solving Puzzles Enhances MLLMs' Image Retouching Skills	Niladri Shekhar Dutt et.al.	2505.06176	null
2025-05-09	Turbo-ICL: In-Context Learning-Based Turbo Equalization	Zihang Song et.al.	2505.06175	null
2025-05-09	MM-Skin: Enhancing Dermatology Vision-Language Model with an Image-Text Dataset Derived from Textbooks	Wenqi Zeng et.al.	2505.06152	link
2025-05-09	A Scaling Law for Token Efficiency in LLM Fine-Tuning Under Fixed Compute Budgets	Ryan Lagasse et.al.	2505.06150	null
2025-05-09	Can Prompting LLMs Unlock Hate Speech Detection across Languages? A Zero-shot and Few-shot Study	Faeze Ghorbanpour et.al.	2505.06149	null
2025-05-09	LLMs Get Lost In Multi-Turn Conversation	Philippe Laban et.al.	2505.06120	link
2025-05-09	LLMs Outperform Experts on Challenging Biology Benchmarks	Lennart Justen et.al.	2505.06108	null
2025-05-09	Free and Fair Hardware: A Pathway to Copyright Infringement-Free Verilog Generation using LLMs	Sam Bush et.al.	2505.06096	null
2025-05-09	Assessing Tenstorrent's RISC-V MatMul Acceleration Capabilities	Hiari Pizzini Cavagna et.al.	2505.06085	null
2025-05-09	Healthy LLMs? Benchmarking LLM Knowledge of UK Government Public Health Information	Joshua Harris et.al.	2505.06046	null
2025-05-09	Short-circuiting Shortcuts: Mechanistic Investigation of Shortcuts in Text Classification	Leon Eshuijs et.al.	2505.06032	link
2025-05-09	Unilogit: Robust Machine Unlearning for LLMs Using Uniform-Target Self-Distillation	Stefan Vasilev et.al.	2505.06027	null
2025-05-09	ArtRAG: Retrieval-Augmented Generation with Structured Context for Visual Art Understanding	Shuai Wang et.al.	2505.06020	null
2025-05-09	Exploring the Feasibility of Multilingual Grammatical Error Correction with a Single LLM up to 9B parameters: A Comparative Study of 17 Models	Dawid Wisniewski et.al.	2505.06004	link
2025-05-09	Task-Adapter++: Task-specific Adaptation with Order-aware Alignment for Few-shot Action Recognition	Congqi Cao et.al.	2505.06002	link
2025-05-09	Towards Developmentally Plausible Rewards: Communicative Success as a Learning Signal for Interactive Language Models	Lennart Stöpler et.al.	2505.05970	null
2025-05-08	Mogao: An Omni Foundation Model for Interleaved Multi-Modal Generation	Chao Liao et.al.	2505.05472	null
2025-05-08	Generating Physically Stable and Buildable LEGO Designs from Text	Ava Pun et.al.	2505.05469	link
2025-05-08	StreamBridge: Turning Your Offline Video Large Language Model into a Proactive Streaming Assistant	Haibo Wang et.al.	2505.05467	null
2025-05-08	ComPO: Preference Alignment via Comparison Oracles	Peter Chen et.al.	2505.05465	null
2025-05-08	Bring Reason to Vision: Understanding Perception and Reasoning through Model Merging	Shiqi Chen et.al.	2505.05464	link
2025-05-08	UKElectionNarratives: A Dataset of Misleading Narratives Surrounding Recent UK General Elections	Fatima Haouari et.al.	2505.05459	null
2025-05-08	SITE: towards Spatial Intelligence Thorough Evaluation	Wenqi Wang et.al.	2505.05456	null
2025-05-08	Conversational Process Model Redesign	Nataliia Klievtsova et.al.	2505.05453	null
2025-05-08	clem:todd: A Framework for the Systematic Benchmarking of LLM-Based Task-Oriented Dialogue System Realisations	Chalamalasetti Kranti et.al.	2505.05445	null
2025-05-08	GesPrompt: Leveraging Co-Speech Gestures to Augment LLM-Based Interaction in Virtual Reality	Xiyun Hu et.al.	2505.05441	null
2025-05-09	EcoAgent: An Efficient Edge-Cloud Collaborative Multi-Agent Framework for Mobile Automation	Biao Yi et.al.	2505.05440	null
2025-05-08	Ultra-FineWeb: Efficient Data Filtering and Verification for High-Quality LLM Training Data	Yudong Wang et.al.	2505.05427	null
2025-05-09	LiTransProQA: an LLM-based Literary Translation evaluation metric with Professional Question Answering	Ran Zhang et.al.	2505.05423	link
2025-05-08	Crosslingual Reasoning through Test-Time Scaling	Zheng-Xin Yong et.al.	2505.05408	link
2025-05-08	Frame In, Frame Out: Do LLMs Generate More Biased News Headlines than Humans?	Valeria Pastorino et.al.	2505.05406	null
2025-05-08	A Pain Assessment Framework based on multimodal data and Deep Machine Learning methods	Stefanos Gkikas et.al.	2505.05396	null
2025-05-08	DSDrive: Distilling Large Language Model for Lightweight End-to-End Autonomous Driving with Unified Reasoning and Planning	Wenru Liu et.al.	2505.05360	null
2025-05-08	Hearing and Seeing Through CLIP: A Framework for Self-Supervised Sound Source Localization	Sooyoung Park et.al.	2505.05343	link
2025-05-08	FLAM: Frame-Wise Language-Audio Modeling	Yusong Wu et.al.	2505.05335	null
2025-05-08	ICon: In-Context Contribution for Automatic Data Selection	Yixin Yang et.al.	2505.05327	null
2025-05-07	EchoInk-R1: Exploring Audio-Visual Reasoning in Multimodal LLMs via Reinforcement Learning	Zhenghao Xing et.al.	2505.04623	link
2025-05-07	On Path to Multimodal Generalist: General-Level and General-Bench	Hao Fei et.al.	2505.04620	null
2025-05-07	OmniGIRL: A Multilingual and Multimodal Benchmark for GitHub Issue Resolution	Lianghong Guo et.al.	2505.04606	link
2025-05-07	OpenVision: A Fully-Open, Cost-Effective Family of Advanced Vision Encoders for Multimodal Learning	Xianhang Li et.al.	2505.04601	null
2025-05-08	MonoCoP: Chain-of-Prediction for Monocular 3D Object Detection	Zhihao Zhang et.al.	2505.04594	null
2025-05-07	ZeroSearch: Incentivize the Search Capability of LLMs without Searching	Hao Sun et.al.	2505.04588	link
2025-05-07	SlideItRight: Using AI to Find Relevant Slides and Provide Feedback for Open-Ended Questions	Chloe Qianhui Zhao et.al.	2505.04584	link
2025-05-07	Fight Fire with Fire: Defending Against Malicious RL Fine-Tuning via Reward Neutralization	Wenjun Cao et.al.	2505.04578	null
2025-05-07	Communication-Efficient Federated Fine-Tuning of Language Models via Dynamic Update Schedules	Michail Theologitis et.al.	2505.04535	link
2025-05-07	Overcoming Data Scarcity in Generative Language Modelling for Low-Resource Languages: A Systematic Review	Josh McGiff et.al.	2505.04531	null
2025-05-07	Comparative Analysis of Carbon Footprint in Manual vs. LLM-Assisted Code Development	Kuen Sum Cheung et.al.	2505.04521	null
2025-05-07	Pangu Ultra MoE: How to Train Your Big MoE on Ascend NPUs	Yehui Tang et.al.	2505.04519	null
2025-05-07	"I Can See Forever!": Evaluating Real-time VideoLLMs for Assisting Individuals with Visual Impairments	Ziyi Zhang et.al.	2505.04488	null
2025-05-07	CAD-Llama: Leveraging Large Language Models for Computer-Aided Design Parametric 3D Model Generation	Jiahao Li et.al.	2505.04481	null
2025-05-07	TrajEvo: Designing Trajectory Prediction Heuristics via LLM-driven Evolution	Zhikai Zhao et.al.	2505.04480	link
2025-05-07	Miipher-2: A Universal Speech Restoration Model for Million-Hour Scale Data Restoration	Shigeki Karita et.al.	2505.04457	link
2025-05-07	M2Rec: Multi-scale Mamba for Efficient Sequential Recommendation	Qianru Zhang et.al.	2505.04445	null
2025-05-07	Towards Effectively Leveraging Execution Traces for Program Repair with Code LLMs	Mirazul Haque et.al.	2505.04441	null
2025-05-07	OBLIVIATE: Robust and Practical Machine Unlearning for Large Language Models	Xiaoyu Xu et.al.	2505.04416	null
2025-05-07	DeCLIP: Decoupled Learning for Open-Vocabulary Dense Perception	Junjie Wang et.al.	2505.04410	link
2025-05-06	VITA-Audio: Fast Interleaved Cross-Modal Token Generation for Efficient Large Speech-Language Model	Zuwei Long et.al.	2505.03739	link
2025-05-06	Decentralized Nonconvex Optimization under Heavy-Tailed Noise: Normalization and Optimal Convergence	Shuhua Yu et.al.	2505.03736	null
2025-05-06	Meta-Optimization and Program Search using Language Models for Task and Motion Planning	Denis Shcherba et.al.	2505.03725	null
2025-05-06	Fill the Gap: Quantifying and Reducing the Modality Gap in Image-Text Representation Learning	François Role et.al.	2505.03703	null
2025-05-06	Fairness of Automatic Speech Recognition in Cleft Lip and Palate Speech	Susmita Bhattacharjee et.al.	2505.03697	null
2025-05-06	Graph Drawing for LLMs: An Empirical Evaluation	Walter Didimo et.al.	2505.03678	null
2025-05-06	Distribution-Conditional Generation: From Class Distribution to Creative Generation	Fu Feng et.al.	2505.03667	null
2025-05-06	Binding threshold units with artificial oscillatory neurons	Vladimir Fanaskov et.al.	2505.03648	link
2025-05-06	PhysLLM: Harnessing Large Language Models for Cross-Modal Remote Physiological Sensing	Yiping Xie et.al.	2505.03621	null
2025-05-06	Learning Unknown Spoof Prompts for Generalized Face Anti-Spoofing Using Only Real Face Images	Fangling Jiang et.al.	2505.03611	null
2025-05-06	Learning Knowledge-based Prompts for Robust 3D Mask Presentation Attack Detection	Fangling Jiang et.al.	2505.03610	null
2025-05-06	DyGEnc: Encoding a Sequence of Textual Scene Graphs to Reason and Answer Questions in Dynamic Scenes	Sergey Linok et.al.	2505.03581	link
2025-05-06	LlamaFirewall: An open source guardrail system for building secure AI agents	Sahana Chennabasappa et.al.	2505.03574	null
2025-05-06	Say It Another Way: A Framework for User-Grounded Paraphrasing	Cléa Chataigner et.al.	2505.03563	null
2025-05-06	A Comprehensive Survey of Large AI Models for Future Communications: Foundations, Applications and Challenges	Feibo Jiang et.al.	2505.03556	link
2025-05-06	A Hashgraph-Inspired Consensus Mechanism for Reliable Multi-Model Reasoning	Kolawole E. Ogunsina et.al.	2505.03553	null
2025-05-06	STORY2GAME: Generating (Almost) Everything in an Interactive Fiction Game	Eric Zhou et.al.	2505.03547	null
2025-05-06	Faster MoE LLM Inference for Extremely Large Models	Haoqi Yang et.al.	2505.03531	null
2025-05-06	Ruled by the Representation Space: On the University's Embrace of Large Language Models	Katia Schwerzmann et.al.	2505.03513	null
2025-05-06	BadLingual: A Novel Lingual-Backdoor Attack against Large Language Models	Zihan Wang et.al.	2505.03501	null
2025-05-05	Scenethesis: A Language and Vision Agentic Framework for 3D Scene Generation	Lu Ling et.al.	2505.02836	null
2025-05-05	R1-Reward: Training Multimodal Reward Model Through Stable Reinforcement Learning	Yi-Fan Zhang et.al.	2505.02835	link
2025-05-05	No Other Representation Component Is Needed: Diffusion Transformers Can Provide Representation Guidance by Themselves	Dengyang Jiang et.al.	2505.02831	link
2025-05-05	LISAT: Language-Instructed Segmentation Assistant for Satellite Imagery	Jerome Quenum et.al.	2505.02829	null
2025-05-05	ReplaceMe: Network Simplification via Layer Pruning and Linear Transformations	Dmitriy Shopkhoev et.al.	2505.02819	link
2025-05-05	Knowing You Don't Know: Learning When to Continue Search in Multi-round RAG through Self-Practicing	Diji Yang et.al.	2505.02811	link
2025-05-05	Towards Quantifying the Hessian Structure of Neural Networks	Zhaorui Dong et.al.	2505.02809	link
2025-05-05	Generating HomeAssistant Automations Using an LLM-based Chatbot	Mathyas Giudici et.al.	2505.02802	null
2025-05-05	HSplitLoRA: A Heterogeneous Split Parameter-Efficient Fine-Tuning Framework for Large Language Models	Zheng Lin et.al.	2505.02795	null
2025-05-05	Giving Simulated Cells a Voice: Evolving Prompt-to-Intervention Models for Cellular Control	Nam H. Le et.al.	2505.02766	null
2025-05-05	Bye-bye, Bluebook? Automating Legal Procedure with Large Language Models	Matthew Dahl et.al.	2505.02763	null
2025-05-05	Using Knowledge Graphs to harvest datasets for efficient CLIP model training	Simon Ging et.al.	2505.02746	link
2025-05-06	Knowledge Graphs for Enhancing Large Language Models in Entity Disambiguation	Gerard Pons et.al.	2505.02737	null
2025-05-05	FormalMATH: Benchmarking Formal Mathematical Reasoning of Large Language Models	Zhouliang Yu et.al.	2505.02735	link
2025-05-05	Enhancing LLMs' Clinical Reasoning with Real-World Data from a Nationwide Sepsis Registry	Junu Kim et.al.	2505.02722	link
2025-05-05	Less is More: Efficient Weight Farcasting with 1-Layer Neural Network	Xiao Shou et.al.	2505.02714	null
2025-05-05	Technical Report: Evaluating Goal Drift in Language Model Agents	Rauno Arike et.al.	2505.02709	null
2025-05-05	Voila: Voice-Language Foundation Models for Real-Time Autonomous Interaction and Voice Role-Play	Yemin Shi et.al.	2505.02707	link
2025-05-05	AI Standardized Patient Improves Human Conversations in Advanced Cancer Care	Kurtis Haut et.al.	2505.02694	link
2025-05-05	Predicting Movie Hits Before They Happen with LLMs	Shaghayegh Agah et.al.	2505.02693	null
2025-05-02	How Effective are Large Time Series Models in Hydrology? A Study on Water Level Forecasting in Everglades	Rahuul Rangaraj et.al.	2505.01415	null
2025-05-02	Dynamic Robot Tool Use with Vision Language Models	Noah Trupin et.al.	2505.01399	null
2025-05-02	FreeInsert: Disentangled Text-Guided Object Insertion in 3D Gaussian Scene without Spatial Priors	Chenxi Li et.al.	2505.01322	null
2025-05-02	Helping Big Language Models Protect Themselves: An Enhanced Filtering and Summarization System	Sheikh Samit Muhaimin et.al.	2505.01315	null
2025-05-02	Enhancing SPARQL Query Rewriting for Complex Ontology Alignments	Anicet Lepetit Ondo et.al.	2505.01309	null
2025-05-02	Document Retrieval Augmented Fine-Tuning (DRAFT) for safety-critical software assessments	Regan Bolton et.al.	2505.01307	null
2025-05-02	FlowDubber: Movie Dubbing with LLM-based Semantic-aware Learning and Flow Matching based Voice Enhancing	Gaoxiang Cong et.al.	2505.01263	null
2025-05-02	Digital Pathway Curation (DPC): a comparative pipeline to assess the reproducibility, consensus and accuracy across Gemini, PubMed, and scientific reviewers in biomedical research	Flavio Lichtenstein et.al.	2505.01259	null
2025-05-02	Can Foundation Models Really Segment Tumors? A Benchmarking Odyssey in Lung CT Imaging	Elena Mulero Ayllón et.al.	2505.01239	null
2025-05-02	CaReAQA: A Cardiac and Respiratory Audio Question Answering Model for Open-Ended Diagnostic Reasoning	Tsai-Ning Wang et.al.	2505.01199	null
2025-05-02	Gender Bias in Explainability: Investigating Performance Disparity in Post-hoc Methods	Mahdi Dhaini et.al.	2505.01198	link
2025-05-02	TSTMotion: Training-free Scene-aware Text-to-motion Generation	Ziyan Guo et.al.	2505.01182	null
2025-05-02	LLM Security: Vulnerabilities, Attacks, Defenses, and Countermeasures	Francisco Aguilera-Martínez et.al.	2505.01177	null
2025-05-02	On the Limitations of Steering in Language Model Alignment	Chebrolu Niranjan et.al.	2505.01162	null
2025-05-02	Methodological Foundations for AI-Driven Survey Question Generation	Ted K. Mburu et.al.	2505.01150	null
2025-05-02	Retrieval-Augmented Generation in Biomedicine: A Survey of Technologies, Datasets, and Clinical Applications	Jiawei He et.al.	2505.01146	null
2025-05-02	MateICL: Mitigating Attention Dispersion in Large-Scale In-Context Learning	Murtadha Ahmed et.al.	2505.01110	null
2025-05-02	Self-Supervision Enhances Instance-based Multiple Instance Learning Methods in Digital Pathology: A Benchmark Study	Ali Mammadov et.al.	2505.01109	link
2025-05-02	Nesterov Method for Asynchronous Pipeline Parallel Optimization	Thalaiyasingam Ajanthan et.al.	2505.01099	link
2025-05-02	Evaluating Vision Language Model Adaptations for Radiology Report Generation in Low-Resource Languages	Marco Salmè et.al.	2505.01096	null
2025-05-01	T2I-R1: Reinforcing Image Generation with Collaborative Semantic-level and Token-level CoT	Dongzhi Jiang et.al.	2505.00703	link
2025-05-01	Robotic Visual Instruction	Yanbang Li et.al.	2505.00693	null
2025-05-01	Visual Test-time Scaling for GUI Agent Grounding	Tiange Luo et.al.	2505.00684	link
2025-05-01	Steering Large Language Models with Register Analysis for Arbitrary Style Transfer	Xinchen Yang et.al.	2505.00679	null
2025-05-01	Rethinking Memory in AI: Taxonomy, Operations, Topics, and Future Directions	Yiming Du et.al.	2505.00675	link
2025-05-01	DeepCritic: Deliberate Critique with Large Language Models	Wenkai Yang et.al.	2505.00662	link
2025-05-01	On the generalization of language models from in-context learning and finetuning: a controlled study	Andrew K. Lampinen et.al.	2505.00661	null
2025-05-01	Large Language Models Understanding: an Inherent Ambiguity Barrier	Daniel N. Nissani et.al.	2505.00654	null
2025-05-01	Open-Source LLM-Driven Federated Transformer for Predictive IoV Management	Yazan Otoum et.al.	2505.00651	null
2025-05-01	Investigating Task Arithmetic for Zero-Shot Information Retrieval	Marco Braga et.al.	2505.00649	link
2025-05-01	Brain Foundation Models with Hypergraph Dynamic Adapter for Brain Disease Analysis	Zhongying Deng et.al.	2505.00627	null
2025-05-01	The Illusion of Role Separation: Hidden Shortcuts in LLM Role Learning (and How to Fix Them)	Zihao Wang et.al.	2505.00626	null
2025-05-01	FineScope: Precision Pruning for Domain-Specialized Large Language Models Using SAE-Guided Self-Data Cultivation	Chaitali Bhattacharyya et.al.	2505.00624	null
2025-05-01	Pixel3DMM: Versatile Screen-Space Priors for Single-Image 3D Face Reconstruction	Simon Giebenhain et.al.	2505.00615	null
2025-05-01	Combining LLMs with Logic-Based Framework to Explain MCTS	Ziyan An et.al.	2505.00610	null
2025-05-01	Can LLMs Help Improve Analogical Reasoning For Strategic Decisions? Experimental Evidence from Humans and GPT-4	Phanish Puranam et.al.	2505.00603	null
2025-05-02	Fast and Low-Cost Genomic Foundation Models via Outlier Removal	Haozheng Luo et.al.	2505.00598	link
2025-05-01	Block Circulant Adapter for Large Language Models	Xinyu Ding et.al.	2505.00582	null
2025-05-01	Parameter-Efficient Fine-Tuning with Circulant and Diagonal Vectors	Xinyu Ding et.al.	2505.00580	null
2025-05-01	FreqKV: Frequency Domain Key-Value Compression for Efficient Context Window Extension	Jushi Kai et.al.	2505.00570	null
2025-04-30	TRUST: An LLM-Based Dialogue System for Trauma Understanding and Structured Assessments	Sichang Tu et.al.	2504.21851	null
2025-04-30	COMPACT: COMPositional Atomic-to-Complex Visual Capability Tuning	Xindi Wu et.al.	2504.21850	null
2025-04-30	Early Exit and Multi Stage Knowledge Distillation in VLMs for Video Summarization	Anas Anwarul Haq Khan et.al.	2504.21831	null
2025-04-30	Why Compress What You Can Generate? When GPT-4o Generation Ushers in Image Compression Fields	Yixin Gao et.al.	2504.21814	null
2025-04-30	A simple and effective approach for body part recognition on CT scans based on projection estimation	Franko Hrzic et.al.	2504.21810	null
2025-04-30	An Empirical Study on the Effectiveness of Large Language Models for Binary Code Understanding	Xiuwei Shang et.al.	2504.21803	null
2025-04-30	DeepSeek-Prover-V2: Advancing Formal Mathematical Reasoning via Reinforcement Learning for Subgoal Decomposition	Z. Z. Ren et.al.	2504.21801	link
2025-04-30	SWE-smith: Scaling Data for Software Engineering Agents	John Yang et.al.	2504.21798	null
2025-04-30	MAC-Tuning: LLM Multi-Compositional Problem Reasoning with Enhanced Knowledge Boundary Awareness	Junsheng Huang et.al.	2504.21773	null
2025-04-30	LASHED: LLMs And Static Hardware Analysis for Early Detection of RTL Bugs	Baleegh Ahmad et.al.	2504.21770	null
2025-04-30	LLM-based Interactive Imitation Learning for Robotic Manipulation	Jonas Werner et.al.	2504.21769	link
2025-04-30	Investigating Literary Motifs in Ancient and Medieval Novels with Large Language Models	Emelie Hallenberg et.al.	2504.21742	null
2025-04-30	TheraQuest: A Gamified, LLM-Powered Simulation for Massage Therapy Training	Shengqian Wang et.al.	2504.21735	null
2025-04-30	XBreaking: Explainable Artificial Intelligence for Jailbreaking LLMs	Marco Arazzi et.al.	2504.21700	null
2025-04-30	Visual Text Processing: A Comprehensive Review and Unified Evaluation	Yan Shu et.al.	2504.21682	link
2025-04-30	Hoist with His Own Petard: Inducing Guardrails to Facilitate Denial-of-Service Attacks on Retrieval-Augmented Generation of LLMs	Pan Suo et.al.	2504.21680	null
2025-04-30	Traceback of Poisoning Attacks to Retrieval-Augmented Generation	Baolei Zhang et.al.	2504.21668	null
2025-04-30	From Precision to Perception: User-Centred Evaluation of Keyword Extraction Algorithms for Internet-Scale Contextual Advertising	Jingwen Cai et.al.	2504.21667	null
2025-04-30	AdaR1: From Long-CoT to Hybrid-CoT via Bi-Level Adaptive Reasoning Optimization	Haotian Luo et.al.	2504.21659	link
2025-04-30	Sadeed: Advancing Arabic Diacritization Through Small Language Model	Zeina Aldallal et.al.	2504.21635	null
2025-04-29	Toward Efficient Exploration by Large Language Model Agents	Dilip Arumugam et.al.	2504.20997	null
2025-04-29	X-Fusion: Introducing New Modality to Frozen Large Language Models	Sicheng Mo et.al.	2504.20996	null
2025-04-29	ACE: A Security Architecture for LLM-Integrated App Systems	Evan Li et.al.	2504.20984	null
2025-04-29	Real-Time Wayfinding Assistant for Blind and Low-Vision Users	Dabbrata Das et.al.	2504.20976	null
2025-04-29	SetKE: Knowledge Editing for Knowledge Elements Overlap	Yifan Wei et.al.	2504.20972	null
2025-04-29	OSVBench: Benchmarking LLMs on Specification Generation Tasks for Operating System Verification	Shangyu Li et.al.	2504.20964	link
2025-04-29	Information Gravity: A Field-Theoretic Model for Token Selection in Large Language Models	Maryna Vyshnyvetska et.al.	2504.20951	null
2025-04-29	Trace-of-Thought: Enhanced Arithmetic Problem Solving via Reasoning Distillation From Large to Small Language Models	Tyler McDonald et.al.	2504.20946	null
2025-04-29	ChestX-Reasoner: Advancing Radiology Foundation Models with Reasoning through Step-by-Step Verification	Ziqing Fan et.al.	2504.20930	link
2025-04-29	An Empirical Study on the Capability of LLMs in Decomposing Bug Reports	Zhiyuan Chen et.al.	2504.20911	null
2025-04-29	Classifier-to-Bias: Toward Unsupervised Automatic Bias Detection for Visual Classifiers	Quentin Guimard et.al.	2504.20902	null
2025-04-29	LELANTE: LEveraging LLM for Automated ANdroid TEsting	Shamit Fatin et.al.	2504.20896	null
2025-04-29	FedMVP: Federated Multi-modal Visual Prompt Tuning for Vision-Language Models	Mainak Singha et.al.	2504.20860	null
2025-04-29	X-Cross: Dynamic Integration of Language Models for Cross-Domain Sequential Recommendation	Guy Hadad et.al.	2504.20859	null
2025-04-29	JaccDiv: A Metric and Benchmark for Quantifying Diversity of Generated Marketing Text in the Music Industry	Anum Afzal et.al.	2504.20849	null
2025-04-29	Language Model for Large-Text Transmission in Noisy Quantum Communications	Yuqi Li et.al.	2504.20842	null
2025-04-29	Universal language model with the intervention of quantum theory	D.-F. Qin et.al.	2504.20839	null
2025-04-29	Enhancing Non-Core Language Instruction-Following in Speech LLMs via Semi-Implicit Cross-Lingual CoT Reasoning	Hongfei Xue et.al.	2504.20835	null
2025-04-29	Reinforcement Learning for LLM Reasoning Under Memory Constraints	Alan Lee et.al.	2504.20834	null
2025-04-30	Ascendra: Dynamic Request Prioritization for Efficient LLM Serving	Azam Ikram et.al.	2504.20828	null
2025-04-28	Learning Streaming Video Representation via Multitask Training	Yibin Yan et.al.	2504.20041	null
2025-04-28	AutoJudge: Judge Decoding Without Manual Annotation	Roman Garipov et.al.	2504.20039	null
2025-04-28	SpatialReasoner: Towards Explicit and Generalizable 3D Spatial Reasoning	Wufei Ma et.al.	2504.20024	null
2025-04-28	Better To Ask in English? Evaluating Factual Accuracy of Multilingual LLMs in English and Low-Resource Languages	Pritika Rohera et.al.	2504.20022	null
2025-04-28	Modular Machine Learning: An Indispensable Path towards New-Generation Large Language Models	Xin Wang et.al.	2504.20020	null
2025-04-28	LLM-Generated Fake News Induces Truth Decay in News Ecosystem: A Case Study on Neural News Recommendation	Beizhe Hu et.al.	2504.20013	null
2025-04-28	Towards Automated Scoping of AI for Social Good Projects	Jacob Emmerson et.al.	2504.20010	null
2025-04-28	Knowledge Distillation of Domain-adapted LLMs for Question-Answering in Telecom	Rishika Sen et.al.	2504.20000	null
2025-04-28	HJRNO: Hamilton-Jacobi Reachability with Neural Operators	Yankai Li et.al.	2504.19989	null
2025-04-28	TD-EVAL: Revisiting Task-Oriented Dialogue Evaluation by Combining Turn-Level Precision with Dialogue-Level Comparisons	Emre Can Acikgoz et.al.	2504.19982	null
2025-04-28	Accurate and Diverse LLM Mathematical Reasoning via Automated PRM-Guided GFlowNets	Adam Younsi et.al.	2504.19981	null
2025-04-29	From Concept to Practice: an Automated LLM-aided UVM Machine for RTL Verification	Junhao Ye et.al.	2504.19959	null
2025-04-28	Enhancing Surgical Documentation through Multimodal Visual-Temporal Transformers and Generative AI	Hugo Georgenthum et.al.	2504.19918	null
2025-04-28	Can AI Agents Design and Implement Drug Discovery Pipelines?	Khachik Smbatyan et.al.	2504.19912	null
2025-04-28	GenCLS++: Pushing the Boundaries of Generative Classification in LLMs Through Comprehensive SFT and RL Studies Across Diverse Datasets	Mingqian He et.al.	2504.19898	null
2025-04-28	CineVerse: Consistent Keyframe Synthesis for Cinematic Scene Composition	Quynh Phung et.al.	2504.19894	null
2025-04-28	semi-PD: Towards Efficient LLM Serving via Phase-Wise Disaggregated Computation and Unified Storage	Ke Hong et.al.	2504.19867	null
2025-04-28	CoherenDream: Boosting Holistic Text Coherence in 3D Generation via Multimodal Large Language Models Feedback	Chenhan Jiang et.al.	2504.19860	null
2025-04-28	Efficient Domain-adaptive Continual Pretraining for the Process Industry in the German Language	Anastasia Zhukova et.al.	2504.19856	null
2025-04-29	The Automation Advantage in AI Red Teaming	Rob Mulla et.al.	2504.19855	null
2025-04-25	Generalization Capability for Imitation Learning	Yixiao Wang et.al.	2504.18538	null
2025-04-25	TRACE Back from the Future: A Probabilistic Reasoning Approach to Controllable Language Generation	Gwen Yidou Weng et.al.	2504.18535	null
2025-04-25	Eval3D: Interpretable and Fine-grained Evaluation for 3D Generation	Shivam Duggal et.al.	2504.18509	null
2025-04-25	Investigating Co-Constructive Behavior of Large Language Models in Explanation Dialogues	Leandra Fichtel et.al.	2504.18483	null
2025-04-25	Generative Induction of Dialogue Task Schemas with Streaming Refinement and Simulated Interactions	James D. Finch et.al.	2504.18474	null
2025-04-25	Fast-Slow Thinking for Large Vision-Language Model Reasoning	Wenyi Xiao et.al.	2504.18458	null
2025-04-25	Pseudo-Asynchronous Local SGD: Robust and Efficient Data-Parallel Training	Hiroki Naganuma et.al.	2504.18454	null
2025-04-25	Reason Like a Radiologist: Chain-of-Thought and Reinforcement Learning for Verifiable Report Generation	Peiyuan Jing et.al.	2504.18453	null
2025-04-25	Kimi-Audio Technical Report	KimiTeam et.al.	2504.18425	link
2025-04-25	LLMpatronous: Harnessing the Power of LLMs For Vulnerability Detection	Rajesh Yarra et.al.	2504.18423	null
2025-04-25	BitNet v2: Native 4-bit Activations with Hadamard Transformation for 1-bit LLMs	Hongyu Wang et.al.	2504.18415	null
2025-04-25	An Empirical Study of Evaluating Long-form Question Answering	Ning Xian et.al.	2504.18413	link
2025-04-25	Expressing stigma and inappropriate responses prevents LLMs from safely replacing mental health providers	Jared Moore et.al.	2504.18412	link
2025-04-25	HRScene: How Far Are VLMs from Effective High-Resolution Image Understanding?	Yusen Zhang et.al.	2504.18406	null
2025-04-25	Unsupervised Visual Chain-of-Thought Reasoning via Preference Optimization	Kesen Zhao et.al.	2504.18397	link
2025-04-25	Bridge the Domains: Large Language Models Enhanced Cross-domain Sequential Recommendation	Qidong Liu et.al.	2504.18383	null
2025-04-25	Pushing the boundary on Natural Language Inference	Pablo Miralles-González et.al.	2504.18376	null
2025-04-25	Auto-SLURP: A Benchmark Dataset for Evaluating Multi-Agent Frameworks in Smart Personal Assistant	Lei Shen et.al.	2504.18373	link
2025-04-25	ThreMoLIA: Threat Modeling of Large Language Model-Integrated Applications	Felix Viktor Jedrzejewski et.al.	2504.18369	null
2025-04-25	Testing Individual Fairness in Graph Neural Networks	Roya Nasiri et.al.	2504.18353	null
2025-04-24	Token-Shuffle: Towards High-Resolution Image Generation with Autoregressive Models	Xu Ma et.al.	2504.17789	null
2025-04-24	Replay to Remember: Retaining Domain Knowledge in Streaming Language Models	Sneh Pillai et.al.	2504.17780	null
2025-04-24	Conversational Assistants to support Heart Failure Patients: comparing a Neurosymbolic Architecture with ChatGPT	Anuja Tayal et.al.	2504.17753	null
2025-04-24	Towards Robust LLMs: an Adversarial Robustness Measurement Framework	Natan Levy et.al.	2504.17723	null
2025-04-24	Multilingual Performance Biases of Large Language Models in Education	Vansh Gupta et.al.	2504.17720	null
2025-04-24	PICO: Reconstructing 3D People In Contact with Objects	Alpár Cseke et.al.	2504.17695	null
2025-04-24	Ensemble Bayesian Inference: Leveraging Small Language Models to Achieve LLM-level Accuracy in Profile Matching Tasks	Haru-Tada Sato et.al.	2504.17685	null
2025-04-24	INSIGHT: Bridging the Student-Teacher Gap in Times of Large Language Models	Jarne Thys et.al.	2504.17677	null
2025-04-24	Energy Considerations of Large Language Model Inference and Efficiency Optimizations	Jared Fernandez et.al.	2504.17674	null
2025-04-24	Cross-region Model Training with Communication-Computation Overlapping and Delay Compensation	Ying Zhu et.al.	2504.17672	null
2025-04-25	Data-Driven Calibration of Prediction Sets in Large Vision-Language Models Based on Inductive Conformal Prediction	Yuanchang Ye et.al.	2504.17671	null
2025-04-24	Towards a HIPAA Compliant Agentic AI System in Healthcare	Subash Neupane et.al.	2504.17669	null
2025-04-24	Evaluating Grounded Reasoning by Code-Assisted Large Language Models for Mathematics	Zena Al-Khalili et.al.	2504.17665	null
2025-04-24	Effortless, Simulation-Efficient Bayesian Inference using Tabular Foundation Models	Julius Vetter et.al.	2504.17660	null
2025-04-24	Portability of Optimizations from SC to TSO	Akshay Gopalakrishnan et.al.	2504.17646	null
2025-04-24	L3: DIMM-PIM Integrated Architecture and Coordination for Scalable Long-Context LLM Inference	Qingyuan Liu et.al.	2504.17584	null
2025-04-25	DeepDistill: Enhancing LLM Reasoning Capabilities via Large-Scale Difficulty-Graded Data Training	Xiaoyu Tian et.al.	2504.17565	null
2025-04-24	When Does Metadata Conditioning (NOT) Work for Language Model Pre-Training? A Study with Context-Free Grammars	Rei Higuchi et.al.	2504.17562	null
2025-04-24	HalluLens: LLM Hallucination Benchmark	Yejin Bang et.al.	2504.17550	null
2025-04-24	A Comprehensive Survey of Knowledge-Based Vision Question Answering Systems: The Lifecycle of Knowledge in Visual Reasoning Task	Jiaqi Deng et.al.	2504.17547	null
2025-04-23	Generalized Neighborhood Attention: Multi-dimensional Sparse Attention at the Speed of Light	Ali Hassani et.al.	2504.16922	link
2025-04-23	IberBench: LLM Evaluation on Iberian Languages	José Ángel González et.al.	2504.16921	null
2025-04-23	Tracing Thought: Using Chain-of-Thought Reasoning to Identify the LLM Behind AI-Generated Text	Shifali Agrahari et.al.	2504.16913	null
2025-04-23	Do Large Language Models know who did what to whom?	Joseph M. Denning et.al.	2504.16884	null
2025-04-23	Enhancing Critical Thinking with AI: A Tailored Warning System for RAG Models	Xuyang Zhu et.al.	2504.16883	null
2025-04-23	Context-Enhanced Vulnerability Detection Based on Large Language Model	Yixin Yang et.al.	2504.16877	null
2025-04-23	Exploring How LLMs Capture and Represent Domain-Specific Knowledge	Mirian Hipolito Garcia et.al.	2504.16871	null
2025-04-23	Common Functional Decompositions Can Mis-attribute Differences in Outcomes Between Populations	Manuel Quintero et.al.	2504.16864	null
2025-04-23	Planning with Diffusion Models for Target-Oriented Dialogue Systems	Hanwen Du et.al.	2504.16858	null
2025-04-23	Emo Pillars: Knowledge Distillation to Support Fine-Grained Context-Aware and Context-Less Emotion Classification	Alexander Shvets et.al.	2504.16856	null
2025-04-23	Monte Carlo Planning with Large Language Model for Text-Based Game Agents	Zijing Shi et.al.	2504.16855	null
2025-04-23	Improving Significant Wave Height Prediction Using Chronos Models	Yilin Zhai et.al.	2504.16834	null
2025-04-23	LRASGen: LLM-based RESTful API Specification Generation	Sida Deng et.al.	2504.16833	null
2025-04-23	GreenMind: A Next-Generation Vietnamese Large Language Model for Structured and Logical Reasoning	Luu Quy Tung et.al.	2504.16832	null
2025-04-23	Decoupled Global-Local Alignment for Improving Compositional Understanding	Xiaoxing Hu et.al.	2504.16801	null
2025-04-23	MOOSComp: Improving Lightweight Long-Context Compressor via Mitigating Over-Smoothing and Incorporating Outlier Scores	Fengwei Zhou et.al.	2504.16786	null
2025-04-23	Graph2Nav: 3D Object-Relation Graph Generation to Robot Navigation	Tixiao Shan et.al.	2504.16782	null
2025-04-23	How Effective are Generative Large Language Models in Performing Requirements Classification?	Waad Alhoshan et.al.	2504.16768	null
2025-04-23	Lightweight Latent Verifiers for Efficient Meta-Generation Strategies	Bartosz Piotrowski et.al.	2504.16760	null
2025-04-23	HEMA: A Hippocampus-Inspired Extended Memory Architecture for Long-Context AI Conversations	Kwangseob Ahn et.al.	2504.16754	null
2025-04-22	TTRL: Test-Time Reinforcement Learning	Yuxin Zuo et.al.	2504.16084	link
2025-04-22	MMInference: Accelerating Pre-filling for Long-Context VLMs via Modality-Aware Permutation Sparse Attention	Yucheng Li et.al.	2504.16083	null
2025-04-22	MR. Video: "MapReduce" is the Principle for Long Video Understanding	Ziqi Pang et.al.	2504.16082	null
2025-04-22	From Reflection to Perfection: Scaling Inference-Time Optimization for Text-to-Image Diffusion Models via Reflection Tuning	Le Zhuo et.al.	2504.16080	null
2025-04-22	LLMs are Greedy Agents: Effects of RL Fine-tuning on Decision-Making Abilities	Thomas Schmied et.al.	2504.16078	null
2025-04-22	PHYBench: Holistic Evaluation of Physical Perception and Reasoning in Large Language Models	Shi Qiu et.al.	2504.16074	null
2025-04-22	Guiding VLM Agents with Process Rewards at Inference Time for GUI Navigation	Zhiyuan Hu et.al.	2504.16073	null
2025-04-22	Describe Anything: Detailed Localized Image and Video Captioning	Long Lian et.al.	2504.16072	null
2025-04-22	A Python Tool for Reconstructing Full News Text from GDELT	A. Fronzetti Colladon et.al.	2504.16063	link
2025-04-22	Vision language models are unreliable at trivial spatial cognition	Sangeet Khemlani et.al.	2504.16061	null
2025-04-22	Vision-Language Models Are Not Pragmatically Competent in Referring Expression Generation	Ziqiao Ma et.al.	2504.16060	link
2025-04-22	Automated Static Vulnerability Detection via a Holistic Neuro-symbolic Approach	Penghui Li et.al.	2504.16057	null
2025-04-22	Honey, I Shrunk the Language Model: Impact of Knowledge Distillation Methods on Performance and Explainability	Daniel Hendriks et.al.	2504.16056	null
2025-04-22	LongMamba: Enhancing Mamba's Long Context Capabilities via Training-Free Receptive Field Enlargement	Zhifan Ye et.al.	2504.16053	link
2025-04-22	Evaluating Vision Language Models (VLMs) for Radiology: A Comprehensive Analysis	Frank Li et.al.	2504.16047	null
2025-04-22	Certified Mitigation of Worst-Case LLM Copyright Infringement	Jingyu Zhang et.al.	2504.16046	null
2025-04-22	LLMs meet Federated Learning for Scalable and Secure IoT Management	Yazan Otoum et.al.	2504.16032	null
2025-04-22	LiveCC: Learning Video LLM with Streaming Speech Transcription at Scale	Joya Chen et.al.	2504.16030	null
2025-04-22	Benchmarking LLM for Code Smells Detection: OpenAI GPT-4.0 vs DeepSeek-V3	Ahmed R. Sadik et.al.	2504.16027	null
2025-04-22	Efficient Temporal Consistency in Diffusion-Based Video Editing with Adaptor Modules: A Theoretical Framework	Xinyuan Song et.al.	2504.16016	null
2025-04-21	Seeing from Another Perspective: Evaluating Multi-View Understanding in MLLMs	Chun-Hsiao Yeh et.al.	2504.15280	link
2025-04-21	VisuLogic: A Benchmark for Evaluating Visual Reasoning in Multi-modal Large Language Models	Weiye Xu et.al.	2504.15279	null
2025-04-21	Stop Summation: Min-Form Credit Assignment Is All Process Reward Model Needs for Reasoning	Jie Cheng et.al.	2504.15275	link
2025-04-21	Eagle 2.5: Boosting Long-Context Post-Training for Frontier Vision-Language Models	Guo Chen et.al.	2504.15271	null
2025-04-21	Roll the dice & look before you leap: Going beyond the creative limits of next-token prediction	Vaishnavh Nagarajan et.al.	2504.15266	link
2025-04-21	Interpretable Locomotion Prediction in Construction Using a Memory-Driven LLM Agent With Chain-of-Thought Reasoning	Ehsan Ahmadi et.al.	2504.15263	null
2025-04-21	Leveraging Language Models for Automated Patient Record Linkage	Mohammad Beheshti et.al.	2504.15261	null
2025-04-21	CRUST-Bench: A Comprehensive Benchmark for C-to-safe-Rust Transpilation	Anirudh Khatry et.al.	2504.15254	link
2025-04-21	Evaluating Judges as Evaluators: The JETTS Benchmark of LLM-as-Judges as Test-Time Scaling Evaluators	Yilun Zhou et.al.	2504.15253	link
2025-04-21	MR. Guard: Multilingual Reasoning Guardrail using Curriculum Learning	Yahan Yang et.al.	2504.15241	null
2025-04-21	Values in the Wild: Discovering and Analyzing Values in Real-World Language Model Interactions	Saffron Huang et.al.	2504.15236	null
2025-04-21	A Self-Improving Coding Agent	Maxime Robeyns et.al.	2504.15228	null
2025-04-21	EvalAgent: Discovering Implicit Evaluation Criteria from the Web	Manya Wadhwa et.al.	2504.15219	null
2025-04-21	Integrating Symbolic Execution into the Fine-Tuning of Code-Generating LLMs	Marina Sakharova et.al.	2504.15210	null
2025-04-21	Compute-Optimal LLMs Provably Generalize Better With Scale	Marc Finzi et.al.	2504.15208	null
2025-04-21	Support Evaluation for the TREC 2024 RAG Track: Comparing Human versus LLM Judges	Nandan Thakur et.al.	2504.15205	null
2025-04-22	Synergistic Weak-Strong Collaboration by Aligning Preferences	Yizhu Jiao et.al.	2504.15188	null
2025-04-21	DSPO: Direct Semantic Preference Optimization for Real-World Image Super-Resolution	Miaomiao Cai et.al.	2504.15176	null
2025-04-21	The Synthetic Imputation Approach: Generating Optimal Synthetic Texts For Underrepresented Categories In Supervised Classification Tasks	Joan C. Timoneda et.al.	2504.15160	null
2025-04-21	KGMEL: Knowledge Graph-Enhanced Multimodal Entity Linking	Juyeon Kim et.al.	2504.15135	link
2025-04-18	Generative AI Act II: Test Time Scaling Drives Cognition Engineering	Shijie Xia et.al.	2504.13828	link
2025-04-18	Feature Alignment and Representation Transfer in Knowledge Distillation for Large Language Models	Junjie Yang et.al.	2504.13825	null
2025-04-18	CheXWorld: Exploring Image World Modeling for Radiograph Representation Learning	Yang Yue et.al.	2504.13820	link
2025-04-18	Not All Rollouts are Useful: Down-Sampling Rollouts in LLM Reinforcement Learning	Yixuan Even Xu et.al.	2504.13818	null
2025-04-18	BadApex: Backdoor Attack Based on Adaptive Optimization Mechanism of Black-box Large Language Models	Zhengxian Wu et.al.	2504.13775	null
2025-04-18	DP2Unlearning: An Efficient and Guaranteed Unlearning Framework for LLMs	Tamim Al Mahmud et.al.	2504.13774	link
2025-04-18	Detecting Malicious Source Code in PyPI Packages with LLMs: Does RAG Come in Handy?	Motunrayo Ibiyo et.al.	2504.13769	null
2025-04-18	Decoding Vision Transformers: the Diffusion Steering Lens	Ryota Takatsuki et.al.	2504.13763	link
2025-04-18	Scaling sparse feature circuit finding for in-context learning	Dmitrii Kharlapenko et.al.	2504.13756	null
2025-04-18	Learning to Attribute with Attention	Benjamin Cohen-Wang et.al.	2504.13752	link
2025-04-18	Controlled Territory and Conflict Tracking (CONTACT): (Geo-)Mapping Occupied Territory from Open Source Intelligence	Paul K. Mandal et.al.	2504.13730	link
2025-04-18	OpenDeception: Benchmarking and Investigating AI Deceptive Behaviors via Open-ended Interaction Simulation	Yichen Wu et.al.	2504.13707	null
2025-04-18	Exploring Multimodal Prompt for Visualization Authoring with Large Language Models	Zhen Wen et.al.	2504.13700	null
2025-04-18	Analysing the Robustness of Vision-Language-Models to Common Corruptions	Muhammad Usama et.al.	2504.13690	null
2025-04-18	Intelligent Interaction Strategies for Context-Aware Cognitive Augmentation	Xiangrong et.al.	2504.13684	null
2025-04-18	Revisiting Uncertainty Quantification Evaluation in Language Models: Spurious Interactions with Response Length Bias Results	Andrea Santilli et.al.	2504.13677	null
2025-04-18	Large Language Models Will Change The Way Children Think About Technology And Impact Every Interaction Paradigm	Russell Beale et.al.	2504.13667	null
2025-04-18	Do Prompt Patterns Affect Code Quality? A First Empirical Assessment of ChatGPT-Generated Code	Antonio Della Porta et.al.	2504.13656	null
2025-04-18	EyecareGPT: Boosting Comprehensive Ophthalmology Understanding with Tailored Dataset, Benchmark and Model	Sijing Li et.al.	2504.13650	link
2025-04-18	Exploring the Potential for Large Language Models to Demonstrate Rational Probabilistic Beliefs	Gabriel Freedman et.al.	2504.13644	link
2025-04-17	Perception Encoder: The best visual embeddings are not at the output of the network	Daniel Bolya et.al.	2504.13181	null
2025-04-17	PerceptionLM: Open-Access Data and Models for Detailed Visual Understanding	Jang Hyun Cho et.al.	2504.13180	link
2025-04-17	It's All Connected: A Journey Through Test-Time Memorization, Attentional Bias, Retention, and Online Optimization	Ali Behrouz et.al.	2504.13173	null
2025-04-17	Sleep-time Compute: Beyond Inference Scaling at Test-time	Kevin Lin et.al.	2504.13171	link
2025-04-17	Generate, but Verify: Reducing Hallucination in Vision-Language Models with Retrospective Resampling	Tsung-Han Wu et.al.	2504.13169	link
2025-04-17	CLIMB: CLustering-based Iterative Data Mixture Bootstrapping for Language Model Pre-training	Shizhe Diao et.al.	2504.13161	null
2025-04-17	Digital Twin Generation from Visual Data: A Survey	Andrew Melnik et.al.	2504.13159	link
2025-04-17	MIB: A Mechanistic Interpretability Benchmark	Aaron Mueller et.al.	2504.13151	link
2025-04-17	Exploring Expert Failures Improves LLM Agent Tuning	Li-Cheng Lan et.al.	2504.13145	null
2025-04-17	Syntactic and Semantic Control of Large Language Models via Sequential Monte Carlo	João Loula et.al.	2504.13139	null
2025-04-17	Energy-Based Reward Models for Robust Language Model Alignment	Anamika Lochab et.al.	2504.13134	link
2025-04-17	LLMs Meet Finance: Fine-Tuning Foundation Models for the Open FinLLM Leaderboard	Varun Rao et.al.	2504.13125	null
2025-04-17	Low-hallucination Synthetic Captions for Large-Scale Vision-Language Model Pre-training	Xinsong Zhang et.al.	2504.13123	null
2025-04-17	VistaDPO: Video Hierarchical Spatial-Temporal Direct Preference Optimization for Large Video Models	Haojian Huang et.al.	2504.13122	link
2025-04-17	Probing and Inducing Combinational Creativity in Vision-Language Models	Yongqian Peng et.al.	2504.13120	null
2025-04-17	Object-Driven Narrative in AR: A Scenario-Metaphor Framework with VLM Integration	Yusi Sun et.al.	2504.13119	null
2025-04-17	Uncertainty-Aware Trajectory Prediction via Rule-Regularized Heteroscedastic Deep Classification	Kumar Manas et.al.	2504.13111	link
2025-04-17	EventVAD: Training-Free Event-Aware Video Anomaly Detection	Yihua Shao et.al.	2504.13092	null
2025-04-17	Retrieval-Augmented Generation with Conflicting Evidence	Han Wang et.al.	2504.13079	link
2025-04-18	SkyReels-V2: Infinite-length Film Generative Model	Guibin Chen et.al.	2504.13074	link
2025-04-16	BitNet b1.58 2B4T Technical Report	Shuming Ma et.al.	2504.12285	null
2025-04-16	HLS-Eval: A Benchmark and Framework for Evaluating LLMs on High-Level Synthesis Design Tasks	Stefan Abi-Karam et.al.	2504.12268	link
2025-04-16	FLIP Reasoning Challenge	Andreas Plesner et.al.	2504.12256	link
2025-04-16	AnomalyGen: An Automated Semantic Log Sequence Generation Framework with LLM for Anomaly Detection	Xinyu Li et.al.	2504.12250	null
2025-04-16	MOS: Towards Effective Smart Contract Vulnerability Detection through Mixture-of-Experts Tuning of Large Language Models	Hang Yuan et.al.	2504.12234	null
2025-04-16	Watermarking Needs Input Repetition Masking	David Khachaturov et.al.	2504.12229	null
2025-04-16	d1: Scaling Reasoning in Diffusion Large Language Models via Reinforcement Learning	Siyan Zhao et.al.	2504.12216	null
2025-04-16	What Do Large Language Models Know? Tacit Knowledge as a Potential Causal-Explanatory Structure	Céline Budding et.al.	2504.12187	null
2025-04-16	SALAD: Improving Robustness and Generalization through Contrastive Learning with Structure-Aware and LLM-Driven Augmented Data	Suyoung Bae et.al.	2504.12185	null
2025-04-16	Trusting CHATGPT: how minor tweaks in the prompts lead to major differences in sentiment classification	Jaime E. Cuellar et.al.	2504.12180	null
2025-04-16	Multilingual Contextualization of Large Language Models for Document-Level Machine Translation	Miguel Moura Ramos et.al.	2504.12140	null
2025-04-16	Efficient Contrastive Decoding with Probabilistic Hallucination Detection - Mitigating Hallucinations in Large Vision Language Models -	Laura Fieback et.al.	2504.12137	null
2025-04-16	Clarifying Ambiguities: on the Role of Ambiguity Types in Prompting Methods for Clarification Generation	Anfu Tang et.al.	2504.12113	null
2025-04-16	Entropy-Guided Watermarking for LLMs: A Test-Time Framework for Robust and Traceable Text Generation	Shizhan Cai et.al.	2504.12108	null
2025-04-16	Logits DeConfusion with CLIP for Few-Shot Learning	Shuo Li et.al.	2504.12104	link
2025-04-16	Gauging Overprecision in LLMs: An Empirical Study	Adil Bahaj et.al.	2504.12098	null
2025-04-16	Reasoning-Based AI for Startup Evaluation (R.A.I.S.E.): A Memory-Augmented, Multi-Step Decision Framework	Jack Preuveneers et.al.	2504.12090	null
2025-04-16	Self-alignment of Large Video Language Models with Refined Regularized Preference Optimization	Pritam Sarkar et.al.	2504.12083	null
2025-04-16	Selective Demonstration Retrieval for Improved Implicit Hate Speech Detection	Yumin Kim et.al.	2504.12082	null
2025-04-16	Subitizing-Inspired Large Language Models for Floorplanning	Shao-Chien Lu et.al.	2504.12076	null
2025-04-16	Elucidating the Design Space of Multimodal Protein Language Models	Cheng-Yen Hsieh et.al.	2504.11454	null
2025-04-15	TextArena	Leon Guertler et.al.	2504.11442	link
2025-04-15	Masculine Defaults via Gendered Discourse in Podcasts and Large Language Models	Maria Teleki et.al.	2504.11431	link
2025-04-15	A Dual-Space Framework for General Knowledge Distillation of Large Language Models	Xue Zhang et.al.	2504.11426	null
2025-04-15	Reinforcing Compositional Retrieval: Retrieving Step-by-Step for Composing Informative Contexts	Quanyu Long et.al.	2504.11420	null
2025-04-15	Efficient Hybrid Language Model Compression through Group-Aware SSM Pruning	Ali Taghibakhshi et.al.	2504.11409	null
2025-04-15	DataDecide: How to Predict Best Pretraining Data with Small Experiments	Ian Magnusson et.al.	2504.11393	null
2025-04-15	RankAlign: A Ranking View of the Generator-Validator Gap in Large Language Models	Juan Diego Rodriguez et.al.	2504.11381	link
2025-04-15	Cancer-Myth: Evaluating AI Chatbot on Patient Questions with False Presuppositions	Wang Bill Zhu et.al.	2504.11373	link
2025-04-15	OpenTuringBench: An Open-Model-based Benchmark and Framework for Machine-Generated Text Detection and Attribution	Lucio La Cava et.al.	2504.11369	null
2025-04-15	From Gaze to Insight: Bridging Human Visual Attention and Vision Language Model Explanation for Weakly-Supervised Medical Image Segmentation	Jingkun Chen et.al.	2504.11368	null
2025-04-15	Teaching Large Language Models to Reason through Learning and Forgetting	Tianwei Ni et.al.	2504.11364	link
2025-04-15	Kimina-Prover Preview: Towards Large Formal Reasoning Models with Reinforcement Learning	Haiming Wang et.al.	2504.11354	link
2025-04-15	Seedream 3.0 Technical Report	Yu Gao et.al.	2504.11346	null
2025-04-15	A Minimalist Approach to LLM Reasoning: from Rejection Sampling to Reinforce	Wei Xiong et.al.	2504.11343	link
2025-04-15	REWARD CONSISTENCY: Improving Multi-Objective Alignment from a Data-Centric Perspective	Zhihao Xu et.al.	2504.11337	null
2025-04-15	Looking beyond the next token	Abitha Thankaraj et.al.	2504.11336	null
2025-04-15	Optimizing LLM Inference: Fluid-Guided Online Scheduling with Memory Constraints	Ruicheng Ao et.al.	2504.11320	link
2025-04-15	Learning to Be A Doctor: Searching for Effective Medical Agent Architectures	Yangyang Zhuang et.al.	2504.11301	null
2025-04-15	Automated Python Translation	Joshua Otten et.al.	2504.11290	null
2025-04-14	InternVL3: Exploring Advanced Training and Test-Time Recipes for Open-Source Multimodal Models	Jinguo Zhu et.al.	2504.10479	link
2025-04-14	Weight Ensembling Improves Reasoning in Language Models	Xingyu Dang et.al.	2504.10478	null
2025-04-14	MIEB: Massive Image Embedding Benchmark	Chenghao Xiao et.al.	2504.10471	link
2025-04-14	Pixel-SAIL: Single Transformer For Pixel-Grounded Understanding	Tao Zhang et.al.	2504.10465	link
2025-04-14	The Scalability of Simplicity: Empirical Analysis of Vision-Language Learning with a Single Transformer	Weixian Lei et.al.	2504.10462	link
2025-04-14	GUI-R1: A Generalist R1-Style Vision-Language Action Model For GUI Agents	Xiaobo Xia et.al.	2504.10458	null
2025-04-14	M1: Towards Scalable Test-Time Compute with Mamba Reasoning Models	Junxiong Wang et.al.	2504.10449	link
2025-04-14	Multimodal Long Video Modeling Based on Temporal Dynamic Context	Haoran Hao et.al.	2504.10443	link
2025-04-14	LLM Can be a Dangerous Persuader: Empirical Study of Persuasion Safety in Large Language Models	Minqian Liu et.al.	2504.10430	null
2025-04-14	Foundation models for electronic health records: representation dynamics and transferability	Michael C. Burkhart et.al.	2504.10422	link
2025-04-14	Can We Edit LLMs for Long-Tail Biomedical Knowledge?	Xinhao Yi et.al.	2504.10421	link
2025-04-15	Unchecked and Overlooked: Addressing the Checkbox Blind Spot in Large Language Models with CheckboxQA	Michał Turski et.al.	2504.10419	link
2025-04-14	CliniChat: A Multi-Source Knowledge-Driven Framework for Clinical Interview Dialogue Reconstruction and Evaluation	Jing Chen et.al.	2504.10418	null
2025-04-14	LLM-SRBench: A New Benchmark for Scientific Equation Discovery with Large Language Models	Parshin Shojaee et.al.	2504.10415	link
2025-04-14	Performance of Large Language Models in Supporting Medical Diagnosis and Treatment	Diogo Sousa et.al.	2504.10405	null
2025-04-14	Satellite Federated Fine-Tuning for Foundation Models in Space Computing Power Networks	Yan Zhu et.al.	2504.10403	null
2025-04-14	Can LLMs Assist Expert Elicitation for Probabilistic Causal Modeling?	Olha Shaposhnyk et.al.	2504.10397	null
2025-04-14	SymRTLO: Enhancing RTL Code Optimization with LLMs and Neuron-Inspired Symbolic Reasoning	Yiting Wang et.al.	2504.10369	null
2025-04-14	DICE: A Framework for Dimensional and Contextual Evaluation of Language Models	Aryan Shrivastava et.al.	2504.10359	null
2025-04-14	Pseudo-Autoregressive Neural Codec Language Models for Efficient Zero-Shot Text-to-Speech Synthesis	Yifan Yang et.al.	2504.10352	null
2025-04-11	Quantum Large Language Model Fine-Tuning	Sang Hyub Kim et.al.	2504.08732	null
2025-04-11	DocAgent: A Multi-Agent System for Automated Code Documentation Generation	Dayu Yang et.al.	2504.08725	link
2025-04-11	SWAN-GPT: An Efficient and Scalable Approach for Long-Context Language Modeling	Krishna C. Puvvada et.al.	2504.08719	null
2025-04-11	SWE-PolyBench: A multi-language benchmark for repository level evaluation of coding agents	Muhammad Shihab Rashid et.al.	2504.08703	link
2025-04-11	Large Language Models as Span Annotators	Zdeněk Kasner et.al.	2504.08697	null
2025-04-11	TP-RAG: Benchmarking Retrieval-Augmented Large Language Model Agents for Spatiotemporal-Aware Travel Planning	Hang Ni et.al.	2504.08694	null
2025-04-11	Fast-Slow-Thinking: Complex Task Solving with Large Language Models	Yiliu Sun et.al.	2504.08690	null
2025-04-11	Voice Interaction With Conversational AI Could Facilitate Thoughtful Reflection and Substantive Revision in Writing	Jiho Kim et.al.	2504.08687	null
2025-04-11	Seaweed-7B: Cost-Effective Training of Video Generation Foundation Model	Team Seawead et.al.	2504.08685	null
2025-04-11	Variability-Driven User-Story Generation using LLM and Triadic Concept Analysis	Alexandre Bazin et.al.	2504.08666	null
2025-04-11	Quality evaluation of Tabby coding assistant using real source code snippets	Marta Borek et.al.	2504.08650	link
2025-04-11	Do LLMs trust AI regulation? Emerging behaviour of game-theoretic LLM agents	Alessio Buscemi et.al.	2504.08640	null
2025-04-11	Latent Diffusion Autoencoders: Toward Efficient and Meaningful Unsupervised Representation Learning in Medical Imaging	Gabriele Lozupone et.al.	2504.08635	link
2025-04-11	MooseAgent: A LLM Based Multi-agent Framework for Automating Moose Simulation	Tao Zhang et.al.	2504.08621	link
2025-04-11	Analyzing 16,193 LLM Papers for Fun and Profits	Zhiqiu Xia et.al.	2504.08619	null
2025-04-11	Playpen: An Environment for Exploring Learning Through Conversational Interaction	Nicola Horst et.al.	2504.08590	link
2025-04-11	AstroLLaVA: towards the unification of astronomical data and natural language	Sharaf Zaman et.al.	2504.08583	null
2025-04-11	UoB-NLP at SemEval-2025 Task 11: Leveraging Adapters for Multilingual and Cross-Lingual Emotion Detection	Frances Laureano De Leon et.al.	2504.08543	null
2025-04-11	Embodied Image Captioning: Self-supervised Learning Agents for Spatially Coherent Image Descriptions	Tommaso Galliena et.al.	2504.08531	null
2025-04-11	On The Landscape of Spoken Language Models: A Comprehensive Survey	Siddhant Arora et.al.	2504.08528	null
2025-04-10	Cat, Rat, Meow: On the Alignment of Language Model and Human Term-Similarity Judgments	Lorenz Linhardt et.al.	2504.07965	null
2025-04-10	C3PO: Critical-Layer, Core-Expert, Collaborative Pathway Optimization for Test-Time Expert Re-Mixing	Zhongyang Li et.al.	2504.07964	link
2025-04-10	GLUS: Global-Local Reasoning Unified into A Single Large Language Model for Video Segmentation	Lang Lin et.al.	2504.07962	null
2025-04-10	Detect Anything 3D in the Wild	Hanxue Zhang et.al.	2504.07958	link
2025-04-10	MM-IFEngine: Towards Multimodal Instruction Following	Shengyuan Ding et.al.	2504.07957	link
2025-04-10	VCR-Bench: A Comprehensive Evaluation Framework for Video Chain-of-Thought Reasoning	Yukun Qi et.al.	2504.07956	null
2025-04-10	Dynamic Cheatsheet: Test-Time Learning with Adaptive Memory	Mirac Suzgun et.al.	2504.07952	link
2025-04-10	We Are All Creators: Generative AI, Collective Knowledge, and the Path Towards Human-AI Synergy	Jordi Linares-Pellicer et.al.	2504.07936	null
2025-04-10	Echo Chamber: RL Post-training Amplifies Behaviors Learned in Pretraining	Rosie Zhao et.al.	2504.07912	link
2025-04-10	Porting an LLM based Application from ChatGPT to an On-Premise Environment	Teemu Paloniemi et.al.	2504.07907	null
2025-04-10	Redefining Machine Translation on Social Network Services with Large Language Models	Hongcheng Guo et.al.	2504.07901	link
2025-04-10	How do Large Language Models Understand Relevance? A Mechanistic Interpretability Perspective	Qi Liu et.al.	2504.07898	link
2025-04-10	Fast Adaptation with Behavioral Foundation Models	Harshit Sikchi et.al.	2504.07896	null
2025-04-10	Benchmarking Adversarial Robustness to Bias Elicitation in Large Language Models: Scalable Automated Assessment with LLM-as-a-Judge	Riccardo Cantini et.al.	2504.07887	link
2025-04-11	An LLM-Driven Multi-Agent Debate System for Mendelian Diseases	Xinyang Zhou et.al.	2504.07881	null
2025-04-10	Token Level Routing Inference System for Edge Devices	Jianshu She et.al.	2504.07878	null
2025-04-10	SAMJAM: Zero-Shot Video Scene Graph Generation for Egocentric Kitchen Videos	Joshua Li et.al.	2504.07867	null
2025-04-11	Pangu Ultra: Pushing the Limits of Dense Large Language Models on Ascend NPUs	Yichun Yin et.al.	2504.07866	null
2025-04-10	Robust Hallucination Detection in LLMs via Adaptive Token Selection	Mengjia Niu et.al.	2504.07863	null
2025-04-10	2D-Curri-DPO: Two-Dimensional Curriculum Learning for Direct Preference Optimization	Mengyang Li et.al.	2504.07856	null
2025-04-09	Sculpting Subspaces: Constrained Full Fine-Tuning in LLMs for Continual Learning	Nikhil Shivakumar Nayak et.al.	2504.07097	link
2025-04-09	OLMoTrace: Tracing Language Model Outputs Back to Trillions of Training Tokens	Jiacheng Liu et.al.	2504.07096	null
2025-04-09	Are We Done with Object-Centric Learning?	Alexander Rubinstein et.al.	2504.07092	link
2025-04-09	KG-LLM-Bench: A Scalable Benchmark for Evaluating LLM Reasoning on Textualized Knowledge Graphs	Elan Markowitz et.al.	2504.07087	null
2025-04-09	A Sober Look at Progress in Language Model Reasoning: Pitfalls and Paths to Reproducibility	Andreas Hochlehnert et.al.	2504.07086	null
2025-04-09	Self-Steering Language Models	Gabriel Grand et.al.	2504.07081	null
2025-04-09	DeduCE: Deductive Consistency as a Framework to Evaluate LLM Reasoning	Atharva Pandey et.al.	2504.07080	null
2025-04-09	Kaleidoscope: In-language Exams for Massively Multilingual Vision Evaluation	Israfel Salazar et.al.	2504.07072	null
2025-04-09	A Survey on Personalized and Pluralistic Preference Alignment in Large Language Models	Zhouhang Xie et.al.	2504.07070	null
2025-04-09	HalluciNot: Hallucination Detection Through Context and Common Knowledge Verification	Bibek Paudel et.al.	2504.07069	null
2025-04-09	Teaching pathology foundation models to accurately predict gene expression with parameter efficient knowledge transfer	Shi Pan et.al.	2504.07061	null
2025-04-09	TASTE: Text-Aligned Speech Tokenization and Embedding for Spoken Language Modeling	Liang-Hsuan Tseng et.al.	2504.07053	link
2025-04-09	To Backtrack or Not to Backtrack: When Sequential Search Limits Model Reasoning	Tian Qin et.al.	2504.07052	null
2025-04-09	Evaluating Retrieval Augmented Generative Models for Document Queries in Transportation Safety	Chad Melton et.al.	2504.07022	null
2025-04-09	LLM-IFT: LLM-Powered Information Flow Tracking for Secure Hardware	Nowfel Mashnoor et.al.	2504.07015	null
2025-04-09	Towards LLMs Robustness to Changes in Prompt Format Styles	Lilian Ngweta et.al.	2504.06969	null
2025-04-09	Efficient Self-Supervised Learning for Earth Observation via Dynamic Dataset Curation	Thomas Kerdreux et.al.	2504.06962	null
2025-04-09	VideoChat-R1: Enhancing Spatio-Temporal Perception via Reinforcement Fine-Tuning	Xinhao Li et.al.	2504.06958	null
2025-04-09	Adaptive Computation Pruning for the Forgetting Transformer	Zhixuan Lin et.al.	2504.06949	null
2025-04-09	RuOpinionNE-2024: Extraction of Opinion Tuples from Russian News Texts	Natalia Loukachevitch et.al.	2504.06947	link
2025-04-08	GOLLuM: Gaussian Process Optimized LLMs -- Reframing LLM Finetuning through Bayesian Optimization	Bojana Ranković et.al.	2504.06265	link
2025-04-08	OmniSVG: A Unified Scalable Vector Graphics Generation Model	Yiying Yang et.al.	2504.06263	null
2025-04-08	Hogwild! Inference: Parallel LLM Generation via Concurrent Attention	Gleb Rodionov et.al.	2504.06261	link
2025-04-08	FEABench: Evaluating Language Models on Multiphysics Reasoning Ability	Nayantara Mudur et.al.	2504.06260	link
2025-04-08	Orb-v3: atomistic simulation at scale	Benjamin Rhodes et.al.	2504.06231	link
2025-04-08	LExT: Towards Evaluating Trustworthiness of Natural Language Explanations	Krithi Shailya et.al.	2504.06227	null
2025-04-08	Encoder-Decoder Gemma: Improving the Quality-Efficiency Trade-Off via Adaptation	Biao Zhang et.al.	2504.06225	null
2025-04-09	Earth-Adapter: Bridge the Geospatial Domain Gaps with Mixture of Frequency Adaptation	Xiaoxing Hu et.al.	2504.06220	link
2025-04-08	Can Performant LLMs Be Ethical? Quantifying the Impact of Web Crawling Opt-Outs	Dongyang Fan et.al.	2504.06219	null
2025-04-08	From 128K to 4M: Efficient Training of Ultra-Long Context Large Language Models	Chejian Xu et.al.	2504.06214	null
2025-04-08	TxGemma: Efficient and Agentic LLMs for Therapeutics	Eric Wang et.al.	2504.06196	null
2025-04-08	A Self-Supervised Framework for Space Object Behaviour Characterisation	Ian Groves et.al.	2504.06176	null
2025-04-08	Assessing how hyperparameters impact Large Language Models' sarcasm detection performance	Montgomery Gole et.al.	2504.06166	null
2025-04-09	Navigating the Rabbit Hole: Emergent Biases in LLM-Generated Attack Narratives Targeting Mental Health Groups	Rijul Magu et.al.	2504.06160	null
2025-04-08	A Large-Scale Analysis on Contextual Self-Supervised Video Representation Learning	Akash Kumar et.al.	2504.06153	null
2025-04-08	V-MAGE: A Game Evaluation Framework for Assessing Visual-Centric Capabilities in Multimodal Large Language Models	Xiangxi Zheng et.al.	2504.06148	link
2025-04-08	ARLO: A Tailorable Approach for Transforming Natural Language Software Requirements into Architecture using LLMs	Tooraj Helmi et.al.	2504.06143	null
2025-04-08	Adversarial Training of Reward Models	Alexander Bukharin et.al.	2504.06141	null
2025-04-08	A Multimedia Analytics Model for the Foundation Model Era	Marcel Worring et.al.	2504.06138	null
2025-04-08	QGen Studio: An Adaptive Question-Answer Generation, Training and Evaluation Platform	Movina Moses et.al.	2504.06136	null
2025-04-07	URECA: Unique Region Caption Anything	Sangbeom Lim et.al.	2504.05305	null
2025-04-07	InteractVLM: 3D Interaction Reasoning from 2D Foundational Models	Sai Kumar Dwivedi et.al.	2504.05303	link
2025-04-07	SmolVLM: Redefining small and efficient multimodal models	Andrés Marafioti et.al.	2504.05299	null
2025-04-07	Truthful or Fabricated? Using Causal Attribution to Mitigate Reward Hacking in Explanations	Pedro Ferreira et.al.	2504.05294	null
2025-04-07	The challenge of uncertainty quantification of large language models in medicine	Zahra Atf et.al.	2504.05278	null
2025-04-07	Enhancing LLM-Based Short Answer Grading with Retrieval-Augmented Generation	Yucheng Chu et.al.	2504.05276	null
2025-04-07	Do PhD-level LLMs Truly Grasp Elementary Addition? Probing Rule Learning vs. Memorization in Large Language Models	Yang Yan et.al.	2504.05262	null
2025-04-07	Learning to Reason Over Time: Timeline Self-Reflection for Improved Temporal Reasoning in Language Models	Adrián Bazaga et.al.	2504.05258	null
2025-04-07	Explaining Low Perception Model Competency with High-Competency Counterfactuals	Sara Pohland et.al.	2504.05254	null
2025-04-07	LLM-based Automated Grading with Human-in-the-Loop	Hang Li et.al.	2504.05239	null
2025-04-07	NoveltyBench: Evaluating Creativity and Diversity in Language Models	Yiming Zhang et.al.	2504.05228	null
2025-04-07	A Reality Check of Vision-Language Pre-training in Radiology: Have We Progressed Using Text?	Julio Silva-Rodríguez et.al.	2504.05227	null
2025-04-07	Vision-Language Model Predictive Control for Manipulation Planning and Trajectory Generation	Jiaming Chen et.al.	2504.05225	link
2025-04-08	Leveraging LLMs for Utility-Focused Annotation: Reducing Manual Effort for Retrieval and RAG	Hengran Zhang et.al.	2504.05220	null
2025-04-07	Unleashing the Power of LLMs in Dense Retrieval with Query Likelihood Modeling	Hengran Zhang et.al.	2504.05216	null
2025-04-07	Post-Training Language Models for Continual Relation Extraction	Sefika Efeoglu et.al.	2504.05214	null
2025-04-07	Quantum Program Linting with LLMs: Emerging Results from a Comparative Study	Seung Yeob Shin et.al.	2504.05204	null
2025-04-07	Training state-of-the-art pathology foundation models with orders of magnitude less data	Mikhail Karasikov et.al.	2504.05186	null
2025-04-07	Concise Reasoning via Reinforcement Learning	Mehdi Fatemi et.al.	2504.05185	link
2025-04-07	BRIDGES: Bridging Graph Modality and Large Language Models within EDA Tasks	Wei Li et.al.	2504.05180	null
2025-04-04	Shape My Moves: Text-Driven Shape-Aware Synthesis of Human Motions	Ting-Hsuan Liao et.al.	2504.03639	null
2025-04-04	Do Larger Language Models Imply Better Reasoning? A Pretraining Scaling Law for Reasoning	Xinyi Wang et.al.	2504.03635	null
2025-04-04	Align to Structure: Aligning Large Language Models with Structural Information	Zae Myung Kim et.al.	2504.03622	null
2025-04-04	VISTA-OCR: Towards generative and interactive end to end OCR models	Laziz Hamdi et.al.	2504.03621	null
2025-04-04	Multilingual Retrieval-Augmented Generation for Knowledge-Intensive Task	Leonardo Ranaldi et.al.	2504.03616	null
2025-04-04	AIR: A Systematic Analysis of Annotations, Instructions, and Response Pairs in Preference Dataset	Bingxiang He et.al.	2504.03612	null
2025-04-04	MedSAM2: Segment Anything in 3D Medical Images and Videos	Jun Ma et.al.	2504.03600	link
2025-04-04	EnrichIndex: Using LLMs to Enrich Retrieval Indices Offline	Peter Baile Chen et.al.	2504.03598	null
2025-04-04	PF3Det: A Prompted Foundation Feature Assisted Visual LiDAR 3D Detector	Kaidong Li et.al.	2504.03563	null
2025-04-04	Agentic Knowledgeable Self-awareness	Shuofei Qiao et.al.	2504.03553	link
2025-04-04	RANa: Retrieval-Augmented Navigation	Gianluca Monaci et.al.	2504.03524	null
2025-04-04	Neutralizing the Narrative: AI-Powered Debiasing of Online News Articles	Chen Wei Kuo et.al.	2504.03520	null
2025-04-04	SpectR: Dynamically Composing LM Experts with Spectral Routing	William Fleshman et.al.	2504.03454	null
2025-04-04	Optimizing Specific and Shared Parameters for Efficient Parameter Tuning	Van-Anh Nguyen et.al.	2504.03450	null
2025-04-04	LLMSched: Uncertainty-Aware Workload Scheduling for Compound LLM Applications	Botao Zhu et.al.	2504.03444	null
2025-04-04	Know What You do Not Know: Verbalized Uncertainty Estimation Robustness on Corrupted Images in Vision-Language Models	Mirko Borszukovszki et.al.	2504.03440	null
2025-04-04	Locations of Characters in Narratives: Andersen and Persuasion Datasets	Batuhan Ozyurt et.al.	2504.03434	link
2025-04-04	Online Difficulty Filtering for Reasoning Oriented Reinforcement Learning	Sanghwan Bae et.al.	2504.03380	null
2025-04-04	MultiClear: Multimodal Soft Exoskeleton Glove for Transparent Object Grasping Assistance	Chen Hu et.al.	2504.03379	null
2025-04-04	Sustainable LLM Inference for Edge AI: Evaluating Quantized LLMs for Energy Efficiency, Output Accuracy, and Inference Latency	Erik Johannes Husom et.al.	2504.03360	null
2025-04-03	STING-BEE: Towards Vision-Language Model for Real-World X-ray Baggage Security Inspection	Divya Velayudhan et.al.	2504.02823	null
2025-04-03	Sparse Autoencoders Learn Monosemantic Features in Vision-Language Models	Mateusz Pach et.al.	2504.02821	link
2025-04-03	Generative Evaluation of Complex Reasoning in Large Language Models	Haowei Lin et.al.	2504.02810	link
2025-04-03	MegaMath: Pushing the Limits of Open Math Corpora	Fan Zhou et.al.	2504.02807	link
2025-04-03	F-ViTA: Foundation Model Guided Visible to Thermal Translation	Jay N. Paranjape et.al.	2504.02801	link
2025-04-04	A Survey of Large Language Models in Mental Health Disorder Detection on Social Media	Zhuohan Ge et.al.	2504.02800	null
2025-04-03	Systematic Evaluation of Large Vision-Language Models for Surgical Artificial Intelligence	Anita Rau et.al.	2504.02799	null
2025-04-03	A Framework for Situating Innovations, Opportunities, and Challenges in Advancing Vertical Systems with Large AI Models	Gaurav Verma et.al.	2504.02793	null
2025-04-03	Unified World Models: Coupling Video and Action Diffusion for Pretraining on Large Robotic Datasets	Chuning Zhu et.al.	2504.02792	null
2025-04-03	A Framework for Robust Cognitive Evaluation of LLMs	Karin de Langis et.al.	2504.02789	null
2025-04-03	From Consumption to Collaboration: Measuring Interaction Patterns to Augment Human Cognition in Open-Ended Tasks	Joshua Holstein et.al.	2504.02780	null
2025-04-03	BT-ACTION: A Test-Driven Approach for Modular Understanding of User Instruction Leveraging Behaviour Trees and LLMs	Alexander Leszczynski et.al.	2504.02779	link
2025-04-03	How Deep Do Large Language Models Internalize Scientific Literature and Citation Practices?	Andres Algaba et.al.	2504.02767	link
2025-04-03	Robot-Led Vision Language Model Wellbeing Assessment of Children	Nida Itrat Abbasi et.al.	2504.02765	null
2025-04-03	Enhancing LLM Robustness to Perturbed Instructions: An Empirical Study	Aryan Agrawal et.al.	2504.02733	link
2025-04-04	Why do LLMs attend to the first token?	Federico Barbero et.al.	2504.02732	null
2025-04-03	ERPO: Advancing Safety Alignment via Ex-Ante Reasoning Preference Optimization	Kehua Feng et.al.	2504.02725	null
2025-04-03	TeleMoM: Consensus-Driven Telecom Intelligence via Mixture of Models	Xinquan Wang et.al.	2504.02712	null
2025-04-03	The Hidden Space of Safety: Understanding Preference-Tuned LLMs in Multilingual context	Nikhil Verma et.al.	2504.02708	null
2025-04-03	LLM for Complex Reasoning Task: An Exploratory Study in Fermi Problems	Zishuo Liu et.al.	2504.02671	null
2025-04-02	Slot-Level Robotic Placement via Visual Imitation from Single Human Video	Dandan Shan et.al.	2504.01959	null
2025-04-02	Towards Unified Referring Expression Segmentation Across Omni-Level Visual Target Granularities	Jing Liu et.al.	2504.01954	null
2025-04-02	The LLM Wears Prada: Analysing Gender Bias and Stereotypes through Online Shopping Data	Massimiliano Luca et.al.	2504.01951	null
2025-04-02	Efficient Federated Learning Tiny Language Models for Mobile Network Feature Prediction	Daniel Becking et.al.	2504.01947	null
2025-04-02	OpenCodeReasoning: Advancing Data Distillation for Competitive Coding	Wasi Uddin Ahmad et.al.	2504.01943	null
2025-04-02	Critical Thinking: Which Kinds of Complexity Govern Optimal Reasoning Length?	Celine Lee et.al.	2504.01935	link
2025-04-02	A thorough benchmark of automatic text classification: From traditional approaches to large language models	Washington Cunha et.al.	2504.01930	link
2025-04-02	Gen-C: Populating Virtual Worlds with Generative Crowds	Andreas Panayiotou et.al.	2504.01924	null
2025-04-02	Is Less Really More? Fake News Detection with Limited Information	Zhaoyang Cao et.al.	2504.01922	link
2025-04-02	Bridging the Linguistic Divide: A Survey on Leveraging Large Language Models for Machine Translation	Baban Gain et.al.	2504.01919	null
2025-04-02	FineLIP: Extending CLIP's Reach via Fine-Grained Alignment with Longer Text Inputs	Mothilal Asokan et.al.	2504.01916	link
2025-04-02	Advancing AI-Scientist Understanding: Making LLM Think Like a Physicist with Interpretable Reasoning	Yinggan Xu et.al.	2504.01911	null
2025-04-02	Is Temporal Prompting All We Need For Limited Labeled Action Recognition?	Shreyank N Gowda et.al.	2504.01890	null
2025-04-02	TransientTables: Evaluating LLMs' Reasoning on Temporally Evolving Semi-structured Tables	Abhilash Shankarampeta et.al.	2504.01879	null
2025-04-02	From Code Generation to Software Testing: AI Copilot with Context-Based RAG	Yuchen Wang et.al.	2504.01866	null
2025-04-02	Cross-Lingual Consistency: A Novel Inference Framework for Advancing Reasoning in Large Language Models	Zhiwei Yu et.al.	2504.01857	null
2025-04-02	Code Red! On the Harmfulness of Applying Off-the-shelf Large Language Models to Programming Tasks	Ali Al-Kaswan et.al.	2504.01850	null
2025-04-02	LARGE: Legal Retrieval Augmented Generation Evaluation Tool	Minhu Park et.al.	2504.01840	link
2025-04-02	Prompting Medical Vision-Language Models to Mitigate Diagnosis Bias by Generating Realistic Dermoscopic Images	Nusrat Munia et.al.	2504.01838	link
2025-04-02	YourBench: Easy Custom Evaluation Sets for Everyone	Sumuk Shashidhar et.al.	2504.01833	link
2025-03-31	Any2Caption: Interpreting Any Condition to Caption for Controllable Video Generation	Shengqiong Wu et.al.	2503.24379	null
2025-03-31	ACPBench Hard: Unrestrained Reasoning about Action, Change, and Planning	Harsha Kokel et.al.	2503.24378	null
2025-03-31	Harnessing the Reasoning Economy: A Survey of Efficient Reasoning for Large Language Models	Rui Wang et.al.	2503.24377	link
2025-03-31	Exploring the Effect of Reinforcement Learning on Video Understanding: Insights from SEED-Bench-R1	Yi Chen et.al.	2503.24376	link
2025-03-31	Effectively Controlling Reasoning Models through Thinking Intervention	Tong Wu et.al.	2503.24370	null
2025-03-31	Adapting Vision Foundation Models for Real-time Ultrasound Image Segmentation	Xiaoran Zhang et.al.	2503.24368	null
2025-03-31	ORAL: Prompting Your Large-Scale LoRAs via Conditional Recurrent Diffusion	Rana Muhammad Shahroz Khan et.al.	2503.24354	null
2025-03-31	PathOrchestra: A Comprehensive Foundation Model for Computational Pathology with Over 100 Diverse Clinical-Grade Tasks	Fang Yan et.al.	2503.24345	null
2025-03-31	Can Test-Time Scaling Improve World Foundation Model?	Wenyan Cong et.al.	2503.24320	link
2025-03-31	BEATS: Bias Evaluation and Assessment Test Suite for Large Language Models	Alok Abhishek et.al.	2503.24310	null
2025-03-31	A Systematic Evaluation of LLM Strategies for Mental Health Text Analysis: Fine-tuning vs. Prompt Engineering vs. RAG	Arshia Kermani et.al.	2503.24307	null
2025-03-31	Rec-R1: Bridging Generative Large Language Models and User-Centric Recommendation Systems via Reinforcement Learning	Jiacheng Lin et.al.	2503.24289	link
2025-03-31	Style Quantization for Data-Efficient GAN Training	Jian Wang et.al.	2503.24282	null
2025-03-31	Evaluating and Designing Sparse Autoencoders by Approximating Quasi-Orthogonality	Sewoong Lee et.al.	2503.24277	link
2025-03-31	Enhancing Large Language Models (LLMs) for Telecommunications using Knowledge Graphs and Retrieval-Augmented Generation	Dun Yuan et.al.	2503.24245	null
2025-03-31	What, How, Where, and How Well? A Survey on Test-Time Scaling in Large Language Models	Qiyuan Zhang et.al.	2503.24235	link
2025-03-31	Synthetic News Generation for Fake News Classification	Abdul Sittar et.al.	2503.24206	null
2025-03-31	TwT: Thinking without Tokens by Habitual Reasoning Distillation with Multi-Teachers' Guidance	Jingxian Xu et.al.	2503.24198	null
2025-03-31	Text2Tracks: Prompt-based Music Recommendation via Generative Retrieval	Enrico Palumbo et.al.	2503.24193	null
2025-03-31	Output Constraints as Attack Surface: Exploiting Structured Generation to Bypass LLM Safety Mechanisms	Shuoming Zhang et.al.	2503.24191	null
2025-03-28	Q-Insight: Understanding Image Quality via Visual Reinforcement Learning	Weiqi Li et.al.	2503.22679	link
2025-03-28	QuestBench: Can LLMs ask the right question to acquire information in reasoning tasks?	Belinda Z. Li et.al.	2503.22674	link
2025-03-28	Exploring the Effectiveness of Multi-stage Fine-tuning for Cross-encoder Re-rankers	Francesca Pezzuti et.al.	2503.22672	link
2025-03-28	Understanding Co-speech Gestures in-the-wild	Sindhu B Hegde et.al.	2503.22668	null
2025-03-28	Unicorn: Text-Only Data Synthesis for Vision Language Model Training	Xiaomin Yu et.al.	2503.22655	link
2025-03-28	Evaluating Multimodal Language Models as Visual Assistants for Visually Impaired Users	Antonia Karamolegkou et.al.	2503.22610	null
2025-03-28	On the Alignment of Post-Publication Reviews & Bibliometric and Altmetric Impact -- A Case Study on Expert Statements from the Science Media Center Germany	Dirk Tunger et.al.	2503.22594	null
2025-03-28	LLM-enabled Instance Model Generation	Fengjunjie Pan et.al.	2503.22587	null
2025-03-28	Historical Ink: Exploring Large Language Models for Irony Detection in 19th-Century Spanish	Kevin Cohen et.al.	2503.22585	link
2025-03-28	Beyond Vanilla Fine-Tuning: Leveraging Multistage, Multilingual, and Domain-Specific Methods for Low-Resource Machine Translation	Sarubi Thillainathan et.al.	2503.22582	null
2025-03-28	Breaking Language Barriers in Visual Language Models via Multilingual Textual Regularization	Iñigo Pikabea et.al.	2503.22577	null
2025-03-28	Niyama: Breaking the Silos of LLM Inference Serving	Kanishk Goel et.al.	2503.22562	null
2025-03-28	Bridging the Dimensional Chasm: Uncover Layer-wise Dimensional Reduction in Transformers through Token Correlation	Zhuo-Yang Song et.al.	2503.22547	null
2025-03-28	Exploiting Mixture-of-Experts Redundancy Unlocks Multimodal Generative Abilities	Raman Dutt et.al.	2503.22517	null
2025-03-28	Assessing Foundation Models for Sea Ice Type Segmentation in Sentinel-1 SAR Imagery	Samira Alkaee Taleghan et.al.	2503.22516	null
2025-03-28	Probabilistic Uncertain Reward Model: A Natural Generalization of Bradley-Terry Reward Model	Wangtao Sun et.al.	2503.22480	null
2025-03-28	WorkTeam: Constructing Workflows from Natural Language with Multi-Agents	Hanchao Liu et.al.	2503.22473	null
2025-03-28	Evaluating LLM-based Agents for Multi-Turn Conversations: A Survey	Shengyue Guan et.al.	2503.22458	null
2025-03-28	Entropy-guided sequence weighting for efficient exploration in RL-based LLM fine-tuning	Abdullah Vanlioglu et.al.	2503.22456	null
2025-03-28	STADE: Standard Deviation as a Pruning Metric	Diego Coello de Portugal Mecke et.al.	2503.22451	link
2025-03-27	Mobile-VideoGPT: Fast and Accurate Video Understanding Language Model	Abdelrahman Shaker et.al.	2503.21782	link
2025-03-27	Video-R1: Reinforcing Video Reasoning in MLLMs	Kaituo Feng et.al.	2503.21776	link
2025-03-27	Stable-SCore: A Stable Registration-based Framework for 3D Shape Correspondence	Haolin Liu et.al.	2503.21766	null
2025-03-27	Uni4D: Unifying Visual Foundation Models for 4D Modeling from a Single Video	David Yifan Yao et.al.	2503.21761	link
2025-03-27	MemInsight: Autonomous Memory Augmentation for LLM Agents	Rana Salama et.al.	2503.21760	null
2025-03-27	Fwd2Bot: LVLM Visual Token Compression with Double Forward Bottleneck	Adrian Bulat et.al.	2503.21757	null
2025-03-27	GateLens: A Reasoning-Enhanced LLM Agent for Automotive Software Release Analytics	Arsham Gholamzadeh Khoee et.al.	2503.21735	null
2025-03-27	Effective Skill Unlearning through Intervention and Abstention	Yongce Li et.al.	2503.21730	link
2025-03-27	Collab: Controlled Decoding using Mixture of Agents for LLM Alignment	Souradip Chakraborty et.al.	2503.21720	null
2025-03-27	Outlier dimensions favor frequent tokens in language model	Iuri Macocco et.al.	2503.21718	null
2025-03-27	As easy as PIE: understanding when pruning causes language models to disagree	Pietro Tropeano et.al.	2503.21714	link
2025-03-27	Enhancing Repository-Level Software Repair via Repository-Aware Knowledge Graphs	Boyang Yang et.al.	2503.21710	null
2025-03-27	LLM-Gomoku: A Large Language Model-Based System for Strategic Gomoku with Self-Play and Reinforcement Learning	Hui Wang et.al.	2503.21683	null
2025-03-27	JiraiBench: A Bilingual Benchmark for Evaluating Large Language Models' Detection of Human Self-Destructive Behavior Content in Jirai Community	Yunze Xiao et.al.	2503.21679	null
2025-03-27	How do language models learn facts? Dynamics, curricula and hallucinations	Nicolas Zucchet et.al.	2503.21676	null
2025-03-27	Intelligent IoT Attack Detection Design via ODLLM with Feature Ranking-based Knowledge Base	Satvik Verma et.al.	2503.21674	link
2025-03-27	Model Assembly Learning with Heterogeneous Layer Weight Merging	Yi-Kai Zhang et.al.	2503.21657	null
2025-03-27	UI-R1: Enhancing Action Prediction of GUI Agents by Reinforcement Learning	Zhengxi Lu et.al.	2503.21620	link
2025-03-27	Leveraging Language Models for Analyzing Longitudinal Experiential Data in Education	Ahatsham Hayat et.al.	2503.21617	null
2025-03-27	Evaluating book summaries from internal knowledge in Large Language Models: a cross-model and semantic consistency approach	Javier Coronado-Blázquez et.al.	2503.21613	null
2025-03-26	Mobile-MMLU: A Mobile Intelligence Language Understanding Benchmark	Sondos Mahmoud Bsharat et.al.	2503.20786	link
2025-03-26	Free4D: Tuning-free 4D Scene Generation with Spatial-Temporal Consistency	Tianqi Liu et.al.	2503.20785	link
2025-03-26	Feature4X: Bridging Any Monocular Video to 4D Agentic AI with Versatile Gaussian Feature Fields	Shijie Zhou et.al.	2503.20776	null
2025-03-26	ASGO: Adaptive Structured Gradient Optimization	Kang An et.al.	2503.20762	null
2025-03-26	MCTS-RAG: Enhancing Retrieval-Augmented Generation with Monte Carlo Tree Search	Yunhai Hu et.al.	2503.20757	null
2025-03-27	Reason-RFT: Reinforcement Fine-Tuning for Visual Reasoning	Huajie Tan et.al.	2503.20752	null
2025-03-26	UniSTD: Towards Unified Spatio-Temporal Learning across Diverse Disciplines	Chen Tang et.al.	2503.20748	null
2025-03-26	MATHGLANCE: Multimodal Large Language Models Do Not Know Where to Look in Mathematical Diagrams	Yanpeng Sun et.al.	2503.20745	null
2025-03-26	Dynamic Motion Blending for Versatile Motion Editing	Nan Jiang et.al.	2503.20724	null
2025-03-26	From Annotation to Adaptation: Metrics, Synthetic Data, and Aspect Extraction for Aspect-Based Sentiment Analysis with Large Language Models	Nikita Neveditsin et.al.	2503.20715	null
2025-03-26	MMMORRF: Multimodal Multilingual Modularized Reciprocal Rank Fusion	Saron Samuel et.al.	2503.20698	null
2025-03-26	Graph-Enhanced Model-Free Reinforcement Learning Agents for Efficient Power Grid Topological Control	Eloy Anguiano Batanero et.al.	2503.20688	null
2025-03-27	Flip Learning: Weakly Supervised Erase to Segment Nodules in Breast Ultrasound	Yuhao Huang et.al.	2503.20685	null
2025-03-27	Mitigating Low-Level Visual Hallucinations Requires Self-Awareness: Database, Model and Training Strategy	Yinan Sun et.al.	2503.20673	null
2025-03-26	TAMA: A Human-AI Collaborative Thematic Analysis Framework Using Multi-Agent LLMs for Clinical Interviews	Huimin Xu et.al.	2503.20666	null
2025-03-26	AutoRad-Lung: A Radiomic-Guided Prompting Autoregressive Vision-Language Model for Lung Nodule Malignancy Prediction	Sadaf Khademi et.al.	2503.20662	null
2025-03-26	AccidentSim: Generating Physically Realistic Vehicle Collision Videos from Real-World Accident Reports	Xiangwen Zhang et.al.	2503.20654	null
2025-03-26	Unlocking Efficient Long-to-Short LLM Reasoning with Model Merging	Han Wu et.al.	2503.20641	link
2025-03-26	Collaborative Storytelling and LLM: A Linguistic Analysis of Automatically-Generated Role-Playing Game Sessions	Alessandro Maisto et.al.	2503.20623	null
2025-03-26	IAP: Improving Continual Learning of Vision-Language Models via Instance-Aware Prompting	Hao Fu et.al.	2503.20612	link
2025-03-25	SuperFlow++: Enhanced Spatiotemporal Consistency for Cross-Modal Data Pretraining	Xiang Xu et.al.	2503.19912	link
2025-03-25	CoLLM: A Large Language Model for Composed Image Retrieval	Chuong Huynh et.al.	2503.19910	link
2025-03-25	FullDiT: Multi-Task Video Generative Foundation Model with Full Attention	Xuan Ju et.al.	2503.19907	null
2025-03-25	CAFe: Unifying Representation and Generation with Contrastive-Autoregressive Finetuning	Hao Yu et.al.	2503.19900	link
2025-03-25	A Multi-Agent Framework Integrating Large Language Models and Generative AI for Accelerated Metamaterial Design	Jie Tian et.al.	2503.19889	null
2025-03-25	CausalRAG: Integrating Causal Graphs into Retrieval-Augmented Generation	Nengbo Wang et.al.	2503.19878	null
2025-03-25	Scaling Evaluation-time Compute with Reasoning Models as Process Evaluators	Seungone Kim et.al.	2503.19877	null
2025-03-25	SLA-Awareness for AI-assisted coding	Kishanthan Thangarajah et.al.	2503.19876	null
2025-03-25	Think Twice: Enhancing LLM Reasoning by Scaling Multi-round Test-time Thinking	Xiaoyu Tian et.al.	2503.19855	null
2025-03-25	Towards Online Multi-Modal Social Interaction Understanding	Xinpeng Li et.al.	2503.19851	link
2025-03-25	FALCONEye: Finding Answers and Localizing Content in ONE-hour-long videos with multi-modal LLMs	Carlos Plou et.al.	2503.19850	null
2025-03-25	A Comparative Analysis of Word Segmentation, Part-of-Speech Tagging, and Named Entity Recognition for Historical Chinese Sources, 1900-1950	Zhao Fang et.al.	2503.19844	null
2025-03-25	FireEdit: Fine-grained Instruction-based Image Editing via Region-aware Vision Language Model	Jun Zhou et.al.	2503.19839	null
2025-03-25	Domain-incremental White Blood Cell Classification with Privacy-aware Continual Learning	Pratibha Kumari et.al.	2503.19819	null
2025-03-25	SeLIP: Similarity Enhanced Contrastive Language Image Pretraining for Multi-modal Head MRI	Zhiyang Liu et.al.	2503.19801	null
2025-03-25	SemEval-2025 Task 9: The Food Hazard Detection Challenge	Korbinian Randl et.al.	2503.19800	null
2025-03-25	PAVE: Patching and Adapting Video Large Language Models	Zhuoming Liu et.al.	2503.19794	link
2025-03-25	Fine-Grained Erasure in Text-to-Image Diffusion-based Foundation Models	Kartik Thakral et.al.	2503.19783	null
2025-03-25	LPOSS: Label Propagation Over Patches and Pixels for Open-vocabulary Semantic Segmentation	Vladan Stojnić et.al.	2503.19777	link
2025-03-25	OpenLex3D: A New Evaluation Benchmark for Open-Vocabulary 3D Scene Representations	Christina Kassab et.al.	2503.19764	null
2025-03-24	DINO in the Room: Leveraging 2D Foundation Models for 3D Segmentation	Karim Abou Zeid et.al.	2503.18944	link
2025-03-24	SlowFast-LLaVA-1.5: A Family of Token-Efficient Video Large Language Models for Long-Form Video Understanding	Mingze Xu et.al.	2503.18943	null
2025-03-24	Video-T1: Test-Time Scaling for Video Generation	Fangfu Liu et.al.	2503.18942	null
2025-03-24	Exploring Training and Inference Scaling Laws in Generative Retrieval	Hongru Cai et.al.	2503.18941	link
2025-03-24	CoMP: Continual Multimodal Pre-training for Vision Foundation Models	Yitong Chen et.al.	2503.18931	link
2025-03-24	Trajectory Balance with Asynchrony: Decoupling Exploration and Learning for Fast, Scalable LLM Post-Training	Brian R. Bartoldson et.al.	2503.18929	null
2025-03-24	Video SimpleQA: Towards Factuality Evaluation in Large Video Language Models	Meng Cao et.al.	2503.18923	null
2025-03-24	FFN Fusion: Rethinking Sequential Computation in Large Language Models	Akhiad Bercovich et.al.	2503.18908	null
2025-03-24	xKV: Cross-Layer SVD for KV-Cache Compression	Chi-Chih Chang et.al.	2503.18893	link
2025-03-24	AgentDropout: Dynamic Agent Elimination for Token-Efficient and High-Performance LLM-Based Multi-Agent Collaboration	Zhexuan Wang et.al.	2503.18891	link
2025-03-24	Toward building next-generation Geocoding systems: a systematic review	Zhengcong Yin et.al.	2503.18888	null
2025-03-24	I Have Covered All the Bases Here: Interpreting Reasoning Features in Large Language Models via Sparse Autoencoders	Andrey Galichin et.al.	2503.18878	link
2025-03-24	Efficient Self-Supervised Adaptation for Medical Image Analysis	Moein Sorkhei et.al.	2503.18873	link
2025-03-24	Reimagining Memory Access for LLM Inference: Compression-Aware Memory Controller Design	Rui Xie et.al.	2503.18869	null
2025-03-24	Reasoning to Learn from Latent Thoughts	Yangjun Ruan et.al.	2503.18866	null
2025-03-24	Structuring Scientific Innovation: A Framework for Modeling and Discovering Impactful Knowledge Combinations	Junlan Chen et.al.	2503.18865	null
2025-03-24	MC-LLaVA: Multi-Concept Personalized Vision-Language Model	Ruichuan An et.al.	2503.18854	link
2025-03-24	Enhanced OoD Detection through Cross-Modal Alignment of Multi-Modal Representations	Jeonghyeon Kim et.al.	2503.18817	link
2025-03-24	Defeating Prompt Injections by Design	Edoardo Debenedetti et.al.	2503.18813	null
2025-03-24	SKDU at De-Factify 4.0: Vision Transformer with Data Augmentation for AI-Generated Image Detection	Shrikant Malviya et.al.	2503.18812	link
2025-03-21	Dancing with Critiques: Enhancing LLM Reasoning with Stepwise Natural Language Self-Critique	Yansi Li et.al.	2503.17363	null
2025-03-21	HCAST: Human-Calibrated Autonomy Software Tasks	David Rein et.al.	2503.17354	link
2025-03-21	NdLinear Is All You Need for Representation Learning	Alex Reneau et.al.	2503.17353	link
2025-03-21	OpenVLThinker: An Early Exploration to Complex Vision-Language Reasoning via Iterative Self-Improvement	Yihe Deng et.al.	2503.17352	link
2025-03-21	Beyond Semantics: Rediscovering Spatial Awareness in Vision-Language Models	Jianing Qi et.al.	2503.17349	null
2025-03-21	Capturing Individual Human Preferences with Reward Features	André Barreto et.al.	2503.17338	null
2025-03-21	Efficient Intent-Based Filtering for Multi-Party Conversations Using Knowledge Distillation from LLMs	Reem Gody et.al.	2503.17336	null
2025-03-21	CVE-Bench: A Benchmark for AI Agents' Ability to Exploit Real-World Web Application Vulnerabilities	Yuxuan Zhu et.al.	2503.17332	link
2025-03-21	LLM+MAP: Bimanual Robot Task Planning using Large Language Models and Planning Domain Definition Language	Kun Chu et.al.	2503.17309	link
2025-03-21	Bugdar: AI-Augmented Secure Code Review for GitHub Pull Requests	John Naulty et.al.	2503.17302	null
2025-03-21	FastCuRL: Curriculum Reinforcement Learning with Progressive Context Extension for Efficient Training R1-like Reasoning Models	Mingyang Song et.al.	2503.17287	link
2025-03-21	CASE -- Condition-Aware Sentence Embeddings for Conditional Semantic Textual Similarity Measurement	Gaifan Zhang et.al.	2503.17279	null
2025-03-21	Revisiting End To End Sparse Autoencoder Training -- A Short Finetune is All You Need	Adam Karvonen et.al.	2503.17272	link
2025-03-21	SafeMERGE: Preserving Safety Alignment in Fine-Tuned Large Language Models via Selective Layer-Wise Model Merging	Aladin Djuhera et.al.	2503.17239	link
2025-03-21	Slide-Level Prompt Learning with Vision Language Models for Few-Shot Multiple Instance Learning in Histopathology	Devavrat Tomar et.al.	2503.17238	link
2025-03-21	FactSelfCheck: Fact-Level Black-Box Hallucination Detection for LLMs	Albert Sawczyn et.al.	2503.17229	null
2025-03-21	Automating Adjudication of Cardiovascular Events Using Large Language Models	Sonish Sivarajkumar et.al.	2503.17222	null
2025-03-21	A Language Anchor-Guided Method for Robust Noisy Domain Generalization	Zilin Dai et.al.	2503.17211	null
2025-03-21	TreeSynth: Synthesizing Diverse Data from Scratch via Tree-Guided Subspace Partitioning	Sheng Wang et.al.	2503.17195	null
2025-03-21	LLMs Love Python: A Study of LLMs' Bias for Programming Languages and Libraries	Lukas Twist et.al.	2503.17181	link
2025-03-20	DynamicVis: An Efficient and General Visual Foundation Model for Remote Sensing Image Understanding	Keyan Chen et.al.	2503.16426	link
2025-03-20	Stop Overthinking: A Survey on Efficient Reasoning for Large Language Models	Yang Sui et.al.	2503.16419	link
2025-03-20	M3: 3D-Spatial MultiModal Memory	Xueyan Zou et.al.	2503.16413	link
2025-03-20	The Emperor's New Clothes in Benchmarking? A Rigorous Examination of Mitigation Strategies for LLM Benchmark Data Contamination	Yifan Sun et.al.	2503.16402	link
2025-03-20	Exploring the Hidden Reasoning Process of Large Language Models by Misleading Them	Guanyu Chen et.al.	2503.16401	null
2025-03-20	Deconstructing Long Chain-of-Thought: A Structured Reasoning Optimization Framework for Long CoT Distillation	Yijia Luo et.al.	2503.16385	link
2025-03-20	LaPIG: Cross-Modal Generation of Paired Thermal and Visible Facial Images	Leyang Wang et.al.	2503.16376	null
2025-03-20	JARVIS-VLA: Post-Training Large-Scale Vision Language Models to Play Visual Games with Keyboards and Mouse	Muyao Li et.al.	2503.16365	null
2025-03-20	CaKE: Circuit-aware Editing Enables Generalizable Knowledge Learners	Yunzhi Yao et.al.	2503.16356	link
2025-03-20	Lyra: An Efficient and Expressive Subquadratic Architecture for Modeling Biological Sequences	Krithik Ramesh et.al.	2503.16351	null
2025-03-20	LLM Braces: Straightening Out LLM Predictions with Relevant Sub-Updates	Ying Shen et.al.	2503.16334	null
2025-03-20	OmniGeo: Towards a Multimodal Large Language Models for Geospatial Artificial Intelligence	Long Yuan et.al.	2503.16326	null
2025-03-20	Issue2Test: Generating Reproducing Test Cases from Issue Reports	Noor Nashid et.al.	2503.16320	null
2025-03-20	Bridging Technology and Humanities: Evaluating the Impact of Large Language Models on Social Sciences Research with DeepSeek-R1	Peiran Gu et.al.	2503.16304	null
2025-03-20	Generalized Few-shot 3D Point Cloud Segmentation with Vision-Language Model	Zhaochong An et.al.	2503.16282	link
2025-03-20	Uni-3DAR: Unified 3D Generation and Understanding via Autoregression on Compressed Spatial Tokens	Shuqi Lu et.al.	2503.16278	link
2025-03-20	Chain of Functions: A Programmatic Pipeline for Fine-Grained Chart Reasoning Data	Zijian Li et.al.	2503.16260	null
2025-03-20	Plug-and-Play 1.x-Bit KV Cache Quantization for Video Large Language Models	Keda Tao et.al.	2503.16257	null
2025-03-21	Fin-R1: A Large Language Model for Financial Reasoning through Reinforcement Learning	Zhaowei Liu et.al.	2503.16252	link
2025-03-20	Reinforcement Learning for Reasoning in Small LLMs: What Works and What Doesn't	Quy-Anh Dang et.al.	2503.16219	link
2025-03-19	TULIP: Towards Unified Language-Image Pretraining	Zineng Tang et.al.	2503.15485	null
2025-03-19	SWEET-RL: Training Multi-Turn LLM Agents on Collaborative Reasoning Tasks	Yifei Zhou et.al.	2503.15478	link
2025-03-19	What Makes a Reward Model a Good Teacher? An Optimization Perspective	Noam Razin et.al.	2503.15477	link
2025-03-19	Cube: A Roblox View of 3D Intelligence	Foundation AI Team et.al.	2503.15475	link
2025-03-19	EgoDTM: Towards 3D-Aware Egocentric Video-Language Pretraining	Boshen Xu et.al.	2503.15470	link
2025-03-19	From 1,000,000 Users to Every User: Scaling Up Personalized Preference for User-level Alignment	Jia-Nan Li et.al.	2503.15463	link
2025-03-19	SkyLadder: Better and Faster Pretraining via Context Window Scheduling	Tongyao Zhu et.al.	2503.15450	link
2025-03-19	VenusFactory: A Unified Platform for Protein Engineering Data Retrieval and Language Model Fine-Tuning	Yang Tan et.al.	2503.15438	link
2025-03-19	Visual Position Prompt for MLLM based Visual Grounding	Wei Tang et.al.	2503.15426	link
2025-03-19	Probing the topology of the space of tokens with structured prompts	Michael Robinson et.al.	2503.15421	null
2025-03-19	Visual Persona: Foundation Model for Full-Body Human Customization	Jisu Nam et.al.	2503.15406	null
2025-03-19	FedSCA: Federated Tuning with Similarity-guided Collaborative Aggregation for Heterogeneous Medical Image Segmentation	Yumin Zhang et.al.	2503.15390	null
2025-03-19	EfficientLLaVA: Generalizable Auto-Pruning for Large Vision-language Models	Yinan Liang et.al.	2503.15369	null
2025-03-19	SemEval-2025 Task 1: AdMIRe -- Advancing Multimodal Idiomaticity Representation	Thomas Pickard et.al.	2503.15358	null
2025-03-19	SPILL: Domain-Adaptive Intent Clustering based on Selection and Pooling with Large Language Models	I-Fan Lin et.al.	2503.15351	null
2025-03-19	TruthLens: A Training-Free Paradigm for DeepFake Detection	Ritabrata Chakraborty et.al.	2503.15342	null
2025-03-19	Uncertainty-Guided Chain-of-Thought for Code Generation with LLMs	Yuqi Zhu et.al.	2503.15341	null
2025-03-19	Solla: Towards a Speech-Oriented LLM That Hears Acoustic Context	Junyi Ao et.al.	2503.15338	link
2025-03-19	Recover and Match: Open-Vocabulary Multi-Label Recognition through Knowledge-Constrained Optimal Transport	Hao Tan et.al.	2503.15337	link
2025-03-19	Euclid Quick Data Release (Q1) Exploring galaxy properties with a multi-modal foundation model	Euclid Collaboration et.al.	2503.15312	link
2025-03-18	Aligning Multimodal LLM with Human Preference: A Survey	Tao Yu et.al.	2503.14504	link
2025-03-18	Engineering Scientific Assistants using Interactive Structured Induction of Programs	Shraddha Surana et.al.	2503.14488	null
2025-03-18	Gricean Norms as a Basis for Effective Collaboration	Fardin Saad et.al.	2503.14484	link
2025-03-18	Creation-MMBench: Assessing Context-Aware Creative Intelligence in MLLM	Xinyu Fang et.al.	2503.14478	link
2025-03-18	Characterizing Data Visualization Literacy: a Systematic Literature Review	Sara Beschi et.al.	2503.14468	null
2025-03-18	RWKV-7 "Goose" with Expressive Dynamic State Evolution	Bo Peng et.al.	2503.14456	link
2025-03-18	EnvBench: A Benchmark for Automated Environment Setup	Aleksandra Eliseeva et.al.	2503.14443	link
2025-03-18	LLM-FE: Automated Feature Engineering for Tabular Data with LLMs as Evolutionary Optimizers	Nikhil Abhyankar et.al.	2503.14434	link
2025-03-18	PLAY2PROMPT: Zero-shot Tool Instruction Optimization for LLM Agents via Tool Play	Wei Fang et.al.	2503.14432	null
2025-03-18	ExDDV: A New Dataset for Explainable Deepfake Detection in Video	Vlad Hondru et.al.	2503.14421	link
2025-03-18	Unifying Text Semantics and Graph Structures for Temporal Text-attributed Graphs with Large Language Models	Siwei Zhang et.al.	2503.14411	null
2025-03-18	Large Language Models for Virtual Human Gesture Selection	Parisa Ghanad Torshizi et.al.	2503.14408	null
2025-03-18	DUNE: Distilling a Universal Encoder from Heterogeneous 2D and 3D Teachers	Mert Bulent Sariyildiz et.al.	2503.14405	null
2025-03-18	From "Hallucination" to "Suture": Insights from Language Philosophy to Enhance Large Language Models	Qiantong Wang et.al.	2503.14392	null
2025-03-18	How much do LLMs learn from negative examples?	Shadi Hamdan et.al.	2503.14391	link
2025-03-18	Good/Evil Reputation Judgment of Celebrities by LLMs via Retrieval Augmented Generation	Rikuto Tsuchida et.al.	2503.14382	null
2025-03-18	On the Standard Performance Criteria for Applied Control Design: PID, MPC or Machine Learning Controller?	Pouria Sarhadi et.al.	2503.14379	link
2025-03-18	Tiled Flash Linear Attention: More Efficient Linear RNN and xLSTM Kernels	Maximilian Beck et.al.	2503.14376	link
2025-03-18	MAST-Pro: Dynamic Mixture-of-Experts for Adaptive Segmentation of Pan-Tumors with Knowledge-Driven Prompts	Runqi Meng et.al.	2503.14355	null
2025-03-19	MoonCast: High-Quality Zero-Shot Podcast Generation	Zeqian Ju et.al.	2503.14345	link
2025-03-17	MetaScale: Test-Time Scaling with Evolving Meta-Thoughts	Qin Liu et.al.	2503.13447	null
2025-03-17	MoManipVLA: Transferring Vision-language-action Models for General Mobile Manipulation	Zhenyu Wu et.al.	2503.13446	null
2025-03-17	Faithfulness of LLM Self-Explanations for Commonsense Tasks: Larger Is Better, and Instruction-Tuning Allows Trade-Offs but Not Pareto Dominance	Noah Y. Siegel et.al.	2503.13445	null
2025-03-17	VideoMind: A Chain-of-LoRA Agent for Long Video Reasoning	Ye Liu et.al.	2503.13444	link
2025-03-17	DPC: Dual-Prompt Collaboration for Tuning Vision-Language Models	Haoyang Li et.al.	2503.13443	link
2025-03-18	MaTVLM: Hybrid Mamba-Transformer for Efficient Vision-Language Modeling	Yingyue Li et.al.	2503.13440	link
2025-03-17	xLSTM 7B: A Recurrent LLM for Fast and Efficient Inference	Maximilian Beck et.al.	2503.13427	link
2025-03-17	SuperBPE: Space Travel for Language Models	Alisa Liu et.al.	2503.13423	null
2025-03-17	A Comprehensive Survey on Multi-Agent Cooperative Decision-Making: Scenarios, Approaches, Challenges and Perspectives	Weiqiang Jin et.al.	2503.13415	null
2025-03-18	DLPO: Towards a Robust, Efficient, and Generalizable Prompt Optimization Framework from a Deep-Learning Perspective	Dengyun Peng et.al.	2503.13413	link
2025-03-17	Using the Tools of Cognitive Science to Understand Large Language Models at Different Levels of Analysis	Alexander Ku et.al.	2503.13401	null
2025-03-17	MicroVQA: A Multimodal Reasoning Benchmark for Microscopy-Based Scientific Research	James Burgess et.al.	2503.13399	link
2025-03-17	Aligned Probing: Relating Toxic Behavior and Model Internals	Andreas Waldis et.al.	2503.13390	null
2025-03-17	Cream of the Crop: Harvesting Rich, Scalable and Transferable Multi-Modal Data for Instruction Fine-Tuning	Mengyao Lyu et.al.	2503.13383	null
2025-03-17	Sightation Counts: Leveraging Sighted User Feedback in Building a BLV-aligned Dataset of Diagram Descriptions	Wan Ju Kang et.al.	2503.13369	null
2025-03-17	Mitigating Visual Forgetting via Take-along Visual Conditioning for Multi-modal Long CoT Reasoning	Hai-Long Sun et.al.	2503.13360	null
2025-03-17	Agents Play Thousands of 3D Video Games	Zhongwen Xu et.al.	2503.13356	null
2025-03-17	Valid Text-to-SQL Generation with Unification-based DeepStochLog	Ying Jiao et.al.	2503.13342	link
2025-03-17	LearnMate: Enhancing Online Education with LLM-Powered Personalized Learning Plans and Support	Xinyu Jessica Wang et.al.	2503.13340	null
2025-03-17	Reliable and Efficient Amortized Model-based Evaluation	Sang Truong et.al.	2503.13335	null
2025-03-14	Tit-for-Tat: Safeguarding Large Vision-Language Models Against Jailbreak Attacks via Adversarial Defense	Shuyang Hao et.al.	2503.11619	null
2025-03-14	ASMA-Tune: Unlocking LLMs' Assembly Code Comprehension via Structural-Semantic Instruction Tuning	Xinyi Wang et.al.	2503.11617	link
2025-03-14	Rethinking Few-Shot Adaptation of Vision-Language Models in Two Stages	Matteo Farina et.al.	2503.11609	link
2025-03-14	Do Construction Distributions Shape Formal Language Learning In German BabyLMs?	Bastian Bunzeck et.al.	2503.11593	null
2025-03-14	Pathology Image Compression with Pre-trained Autoencoders	Srikar Yellapragada et.al.	2503.11591	null
2025-03-14	Broaden your SCOPE! Efficient Multi-turn Conversation Planning for LLMs using Semantic Space	Zhiliang Chen et.al.	2503.11586	link
2025-03-14	SmolDocling: An ultra-compact vision-language model for end-to-end multi-modal document conversion	Ahmed Nassar et.al.	2503.11576	null
2025-03-14	Synthesizing Access Control Policies using Large Language Models	Adarsh Vatsa et.al.	2503.11573	null
2025-03-14	Implicit Bias-Like Patterns in Reasoning Models	Messi H. J. Lee et.al.	2503.11572	null
2025-03-14	VERIFY: A Benchmark of Visual Explanation and Reasoning for Investigating Multimodal Reasoning Fidelity	Jing Bi et.al.	2503.11557	null
2025-03-14	Similarity-Aware Token Pruning: Your VLM but Faster	Ahmadreza Jeddi et.al.	2503.11549	link
2025-03-14	Potential of large language model-powered nudges for promoting daily water and energy conservation	Zonghan Li et.al.	2503.11531	null
2025-03-14	Exploring Typographic Visual Prompts Injection Threats in Cross-Modality Generation Models	Hao Cheng et.al.	2503.11519	null
2025-03-14	HiTVideo: Hierarchical Tokenizers for Enhancing Text-to-Video Generation with Autoregressive Large Language Models	Ziqin Zhou et.al.	2503.11513	null
2025-03-14	V-STaR: Benchmarking Video-LLMs on Video Spatio-Temporal Reasoning	Zixu Cheng et.al.	2503.11495	null
2025-03-14	A Review of DeepSeek Models' Key Innovative Techniques	Chengen Wang et.al.	2503.11486	null
2025-03-14	Integrating LLMs in Gamified Systems	Carlos J. Costa et.al.	2503.11458	null
2025-03-14	D3: Diversity, Difficulty, and Dependability-Aware Data Selection for Sample-Efficient LLM Instruction Tuning	Jia Zhang et.al.	2503.11441	null
2025-03-14	Text Compression for Efficient Language Generation	David Gu et.al.	2503.11426	null
2025-03-14	Empowering Time Series Analysis with Synthetic Data: A Survey and Outlook in the Era of Foundation Models	Xu Liu et.al.	2503.11411	null
2025-03-13	GoT: Unleashing Reasoning Capability of Multimodal Large Language Model for Visual Generation and Editing	Rongyao Fang et.al.	2503.10639	link
2025-03-13	A Frustratingly Simple Yet Highly Effective Attack Baseline: Over 90% Success Rate Against the Strong Black-box Models of GPT-4.5/4o/o1	Zhaoyi Li et.al.	2503.10635	link
2025-03-13	HybridVLA: Collaborative Diffusion and Autoregression in a Unified Vision-Language-Action Model	Jiaming Liu et.al.	2503.10631	null
2025-03-13	UniGoal: Towards Universal Zero-shot Goal-oriented Navigation	Hang Yin et.al.	2503.10630	null
2025-03-13	Transformers without Normalization	Jiachen Zhu et.al.	2503.10622	null
2025-03-13	From TOWER to SPIRE: Adding the Speech Modality to a Text-Only LLM	Kshitij Ambilduke et.al.	2503.10620	link
2025-03-13	Siege: Autonomous Multi-Turn Jailbreaking of Large Language Models with Tree Search	Andy Zhou et.al.	2503.10619	null
2025-03-13	Compositional Subspace Representation Fine-tuning for Adaptive Large Language Models	Andy Zhou et.al.	2503.10617	null
2025-03-13	R1-Onevision: Advancing Generalized Multimodal Reasoning through Cross-Modal Formalization	Yi Yang et.al.	2503.10615	link
2025-03-13	CoSTA$\ast$: Cost-Sensitive Toolpath Agent for Multi-turn Image Editing	Advait Gupta et.al.	2503.10613	link
2025-03-13	TruthPrInt: Mitigating LVLM Object Hallucination Via Latent Truthful-Guided Pre-Intervention	Jinhao Duan et.al.	2503.10602	link
2025-03-13	GroundingSuite: Measuring Complex Multi-Granular Pixel Grounding	Rui Hu et.al.	2503.10596	link
2025-03-13	Unlock the Power of Unlabeled Data in Language Driving Model	Chaoqun Wang et.al.	2503.10586	null
2025-03-13	VisualWebInstruct: Scaling up Multimodal Instruction Data through Web Search	Yiming Jia et.al.	2503.10582	null
2025-03-13	Unveiling the Mathematical Reasoning in DeepSeek Models: A Comparative Study of Large Language Models	Afrar Jahin et.al.	2503.10573	null
2025-03-13	ASIDE: Architectural Separation of Instructions and Data in Language Models	Egor Zverev et.al.	2503.10566	null
2025-03-13	Short-term AI literacy intervention does not reduce over-reliance on incorrect ChatGPT recommendations	Brett Puppart et.al.	2503.10556	null
2025-03-13	KUDA: Keypoints to Unify Dynamics Learning and Visual Prompting for Open-Vocabulary Robotic Manipulation	Zixian Liu et.al.	2503.10546	null
2025-03-13	DP-GPL: Differentially Private Graph Prompt Learning	Jing Xu et.al.	2503.10544	null
2025-03-13	Language Models, Graph Searching, and Supervision Adulteration: When More Supervision is Less and How to Make More More	Arvid Frydenlund et.al.	2503.10542	null
2025-03-12	MoC: Mixtures of Text Chunking Learners for Retrieval-Augmented Generation System	Jihao Zhao et.al.	2503.09600	link
2025-03-12	How to Protect Yourself from 5G Radiation? Investigating LLM Responses to Implicit Misinformation	Ruohao Guo et.al.	2503.09598	link
2025-03-12	SimLingo: Vision-Only Closed-Loop Autonomous Driving with Language-Action Alignment	Katrin Renz et.al.	2503.09594	null
2025-03-12	BIMBA: Selective-Scan Compression for Long-Range Video Question Answering	Md Mohaiminul Islam et.al.	2503.09590	link
2025-03-12	Cost-Optimal Grouped-Query Attention for Long-Context LLMs	Yingfa Chen et.al.	2503.09579	link
2025-03-12	Block Diffusion: Interpolating Between Autoregressive and Diffusion Language Models	Marianne Arriola et.al.	2503.09573	link
2025-03-12	Plan-and-Act: Improving Planning of Agents for Long-Horizon Tasks	Lutfi Eren Erdogan et.al.	2503.09572	null
2025-03-13	Towards Reasoning Era: A Survey of Long Chain-of-Thought for Reasoning Large Language Models	Qiguang Chen et.al.	2503.09567	null
2025-03-12	PolyPythias: Stability and Outliers across Fifty Language Model Pre-Training Runs	Oskar van der Wal et.al.	2503.09543	link
2025-03-13	Large Language Models for Multi-Facility Location Mechanism Design	Nguyen Thach et.al.	2503.09533	null
2025-03-13	SAEBench: A Comprehensive Benchmark for Sparse Autoencoders in Language Model Interpretability	Adam Karvonen et.al.	2503.09532	null
2025-03-12	Search-R1: Training LLMs to Reason and Leverage Search Engines with Reinforcement Learning	Bowen Jin et.al.	2503.09516	link
2025-03-12	Reinforcement Learning is all You Need	Yongsheng Lian et.al.	2503.09512	null
2025-03-12	ReMA: Learning to Meta-think for LLMs with Multi-Agent Reinforcement Learning	Ziyu Wan et.al.	2503.09501	link
2025-03-12	MindGYM: Enhancing Vision-Language Models via Synthetic Self-Challenging Questions	Zhe Xu et.al.	2503.09499	link
2025-03-12	Parameter-Efficient Adaptation of Geospatial Foundation Models through Embedding Deflection	Romain Thoreau et.al.	2503.09493	null
2025-03-12	Project-Probe-Aggregate: Efficient Fine-Tuning for Group Robustness	Beier Zhu et.al.	2503.09487	null
2025-03-12	BAMBI: Developing Baby Language Models for Italian	Alice Suozzi et.al.	2503.09481	null
2025-03-12	SurgicalVLM-Agent: Towards an Interactive AI Co-Pilot for Pituitary Surgery	Jiayuan Huang et.al.	2503.09474	null
2025-03-12	Explicit Learning and the LLM in Machine Translation	Malik Marmonier et.al.	2503.09454	link
2025-03-11	QuoTA: Query-oriented Token Assignment via CoT Query Decouple for Long Video Comprehension	Yongdong Luo et.al.	2503.08689	link
2025-03-11	Randomness, Not Representation: The Unreliability of Evaluating Cultural Alignment in LLMs	Ariba Khan et.al.	2503.08688	link
2025-03-11	Perplexity Trap: PLM-Based Retrievers Overrate Low Perplexity Documents	Haoyu Wang et.al.	2503.08684	link
2025-03-11	Self-Taught Self-Correction for Small Language Models	Viktor Moskvoretskii et.al.	2503.08681	null
2025-03-11	Understanding and Mitigating Distribution Shifts For Machine Learning Force Fields	Tobias Kreiman et.al.	2503.08674	null
2025-03-11	Generating Robot Constitutions & Benchmarks for Semantic Safety	Pierre Sermanet et.al.	2503.08663	null
2025-03-11	Exploring the Word Sense Disambiguation Capabilities of Large Language Models	Pierpaolo Basile et.al.	2503.08662	null
2025-03-11	YuE: Scaling Open Foundation Models for Long-Form Music Generation	Ruibin Yuan et.al.	2503.08638	link
2025-03-11	LightGen: Efficient Image Generation through Knowledge Distillation and Direct Preference Optimization	Xianfeng Wu et.al.	2503.08619	link
2025-03-11	EMMOE: A Comprehensive Benchmark for Embodied Mobile Manipulation in Open Environments	Dongping Li et.al.	2503.08604	link
2025-03-11	NSF-SciFy: Mining the NSF Awards Database for Scientific Claims	Delip Rao et.al.	2503.08600	null
2025-03-11	Proc4Gem: Foundation models for physical agency through procedural generation	Yixin Lin et.al.	2503.08593	null
2025-03-11	BiasEdit: Debiasing Stereotyped Language Models via Model Editing	Xin Xu et.al.	2503.08588	link
2025-03-11	HierarQ: Task-Aware Hierarchical Q-Former for Enhanced Video Understanding	Shehreen Azad et.al.	2503.08585	null
2025-03-11	RAG-Adapter: A Plug-and-Play RAG-enhanced Framework for Long Video Understanding	Xichen Tan et.al.	2503.08576	null
2025-03-11	DeepReview: Improving LLM-based Paper Review with Human-like Deep Thinking Process	Minjun Zhu et.al.	2503.08569	null
2025-03-11	Reasoning and Sampling-Augmented MCQ Difficulty Prediction via LLMs	Wanyong Feng et.al.	2503.08551	null
2025-03-11	Transferring Extreme Subword Style Using Ngram Model-Based Logit Scaling	Craig Messner et.al.	2503.08550	null
2025-03-11	Graph of AI Ideas: Leveraging Knowledge Graphs and LLMs for AI Research Idea Generation	Xian Gao et.al.	2503.08549	null
2025-03-11	TLA: Tactile-Language-Action Model for Contact-Rich Manipulation	Peng Hao et.al.	2503.08548	null
2025-03-10	Robusto-1 Dataset: Comparing Humans and VLMs on real out-of-distribution Autonomous Driving VQA from Peru	Dunant Cusipuma et.al.	2503.07587	null
2025-03-10	Talking to GDELT Through Knowledge Graphs	Audun Myers et.al.	2503.07584	null
2025-03-10	VisBias: Measuring Explicit and Implicit Social Biases in Vision Language Models	Jen-tse Huang et.al.	2503.07575	link
2025-03-10	AutoSpatial: Visual-Language Reasoning for Social Robot Navigation through Efficient Spatial Reasoning Learning	Yangzhe Kong et.al.	2503.07557	null
2025-03-10	Junior Software Developers' Perspectives on Adopting LLMs for Software Engineering: a Systematic Literature Review	Samuel Ferino et.al.	2503.07556	null
2025-03-10	KSOD: Knowledge Supplement for LLMs On Demand	Haoran Li et.al.	2503.07550	null
2025-03-10	Bi-Directional Mental Model Reconciliation for Human-Robot Interaction with Large Language Models	Nina Moorman et.al.	2503.07547	null
2025-03-10	Queueing, Predictions, and LLMs: Challenges and Open Problems	Michael Mitzenmacher et.al.	2503.07545	null
2025-03-10	XIFBench: Evaluating Large Language Models on Multilingual Instruction Following	Zhenyu Li et.al.	2503.07539	null
2025-03-10	Building English ASR model with regional language support	Purvi Agrawal et.al.	2503.07522	null
2025-03-10	GRITHopper: Decomposition-Free Multi-Hop Dense Retrieval	Justus-Jonas Erker et.al.	2503.07519	link
2025-03-10	TokenButler: Token Importance is Predictable	Yash Akhauri et.al.	2503.07518	link
2025-03-10	Language Models Fail to Introspect About Their Knowledge of Language	Siyuan Song et.al.	2503.07513	link
2025-03-10	Plume: Scaffolding Text Composition in Dashboards	Maxim Lisnic et.al.	2503.07512	null
2025-03-10	Sometimes the Model doth Preach: Quantifying Religious Bias in Open LLMs through Demographic Analysis in Asian Nations	Hari Shankar et.al.	2503.07510	link
2025-03-10	Think Before You Segment: High-Quality Reasoning Segmentation with GPT Chain of Thoughts	Shiu-hong Kao et.al.	2503.07503	null
2025-03-10	V2Flow: Unifying Visual Tokenization and Large Language Model Vocabularies for Autoregressive Image Generation	Guiwei Zhang et.al.	2503.07493	link
2025-03-10	LLaVA-RadZ: Can Multimodal Large Language Models Effectively Tackle Zero-shot Radiology Recognition?	Bangyan Li et.al.	2503.07487	null
2025-03-10	Chameleon: Fast-slow Neuro-symbolic Lane Topology Extraction	Zongzheng Zhang et.al.	2503.07485	link
2025-03-10	VLRMBench: A Comprehensive and Challenging Benchmark for Vision-Language Reward Models	Jiacheng Ruan et.al.	2503.07478	link
2025-03-10	Advancing Vietnamese Information Retrieval with Learning Objective and Benchmark	Phu-Vinh Nguyen et.al.	2503.07470	null
2025-03-10	YOLOE: Real-Time Seeing Anything	Ao Wang et.al.	2503.07465	link
2025-03-10	GenAIReading: Augmenting Human Cognition with Interactive Digital Textbooks Using Large Language Models and Image Generation Models	Ryugo Morita et.al.	2503.07463	null
2025-03-10	MedAgentsBench: Benchmarking Thinking Models and Agent Frameworks for Complex Medical Reasoning	Xiangru Tang et.al.	2503.07459	link
2025-03-10	LLMs syntactically adapt their language use to their conversational partner	Florian Kandra et.al.	2503.07457	null
2025-03-10	Is a Good Foundation Necessary for Efficient Reinforcement Learning? The Computational Role of the Base Model in Exploration	Dylan J. Foster et.al.	2503.07453	null
2025-03-10	From Idea to Implementation: Evaluating the Influence of Large Language Models in Software Development -- An Opinion Paper	Sargam Yadav et.al.	2503.07450	null
2025-03-10	From Text to Visuals: Using LLMs to Generate Math Diagrams with Vector Graphics	Jaewook Lee et.al.	2503.07429	null
2025-03-10	RePO: ReLU-based Preference Optimization	Junkang Wu et.al.	2503.07426	link
2025-03-10	REF-VLM: Triplet-Based Referring Paradigm for Unified Visual Decoding	Yan Tai et.al.	2503.07413	link
2025-03-10	Towards Safe Robot Foundation Models	Maximilian Tölle et.al.	2503.07404	null
2025-03-10	Keeping Representation Similarity in Finetuning for Medical Image Analysis	Wenqiang Zu et.al.	2503.07399	null
2025-03-10	Revisiting Noise in Natural Language Processing for Computational Social Science	Nadav Borenstein et.al.	2503.07395	null
2025-03-10	Is My Text in Your AI Model? Gradient-based Membership Inference Test applied to LLMs	Gonzalo Mancera et.al.	2503.07384	null
2025-03-10	Process-Supervised LLM Recommenders via Flow-guided Tuning	Chongming Gao et.al.	2503.07377	link
2025-03-10	Artificial Utopia: Simulation and Intelligent Agents for a Democratised Future	Yannick Oswald et.al.	2503.07364	null
2025-03-07	Fairness-Aware Low-Rank Adaptation Under Demographic Privacy Constraints	Parameswaran Kamalaruban et.al.	2503.05684	null
2025-03-07	Understanding the Limits of Lifelong Knowledge Editing in LLMs	Lukas Thede et.al.	2503.05683	null
2025-03-07	A Survey of Large Language Model Empowered Agents for Recommendation and Search: Towards Next-Generation Information Retrieval	Yu Zhang et.al.	2503.05659	link
2025-03-07	Learning LLM Preference over Intra-Dialogue Pairs: A Framework for Utterance-level Understandings	Xuanqing Liu et.al.	2503.05620	null
2025-03-07	A Survey on Sparse Autoencoders: Interpreting the Internal Mechanisms of Large Language Models	Dong Shu et.al.	2503.05613	null
2025-03-07	From Theory to Application: A Practical Introduction to Neural Operators in Scientific Computing	Prashant K. Jha et.al.	2503.05598	link
2025-03-07	R1-Searcher: Incentivizing the Search Capability in LLMs via Reinforcement Learning	Huatong Song et.al.	2503.05592	null
2025-03-07	Quantifying the Robustness of Retrieval-Augmented Language Models Against Spurious Features in Grounding Data	Shiping Yang et.al.	2503.05587	null
2025-03-07	Evaluating open-source Large Language Models for automated fact-checking	Nicolo' Fontana et.al.	2503.05565	null
2025-03-07	Revitalizing Saturated Benchmarks: A Weighted Metric Approach for Differentiating Large Language Model Performance	Bryan Etzine et.al.	2503.05551	null
2025-03-07	Leveraging Approximate Caching for Faster Retrieval-Augmented Generation	Shai Bergman et.al.	2503.05530	null
2025-03-07	PoSSUM: A Protocol for Surveying Social-media Users with Multimodal LLMs	Roberto Cerina et.al.	2503.05529	null
2025-03-07	Cognitive Bias Detection Using Advanced Prompt Engineering	Frederic Lemieux et.al.	2503.05516	null
2025-03-07	Grammar-Based Code Representation: Is It a Worthy Pursuit for LLMs?	Qingyuan Liang et.al.	2503.05507	null
2025-03-07	Statistical Guarantees of Correctness Coverage for Medical Multiple-Choice Question Answering	Yusong Ke et.al.	2503.05505	null
2025-03-07	Benchmarking LLMs in Recommendation Tasks: A Comparative Evaluation with Conventional Recommenders	Qijiong Liu et.al.	2503.05493	null
2025-03-07	Maximum Hallucination Standards for Domain-Specific Large Language Models	Tingmingke Lu et.al.	2503.05481	null
2025-03-07	The Society of HiveMind: Multi-Agent Optimization of Foundation Model Swarms to Unlock the Potential of Collective Intelligence	Noah Mamie et.al.	2503.05473	null
2025-03-07	Soft Policy Optimization: Online Off-Policy RL for Sequence Models	Taco Cohen et.al.	2503.05453	null
2025-03-07	LLM-based Iterative Approach to Metamodeling in Automotive	Nenad Petrovic et.al.	2503.05449	null
2025-03-06	L$^2$M: Mutual Information Scaling Law for Long-Context Language Modeling	Zhuo Chen et.al.	2503.04725	link
2025-03-06	LLMVoX: Autoregressive Streaming Text-to-Speech Model for Any LLM	Sambal Shikhar et.al.	2503.04724	null
2025-03-07	Shifting Long-Context LLMs Research from Input to Output	Yuhao Wu et.al.	2503.04723	null
2025-03-06	Enough Coin Flips Can Make LLMs Act Bayesian	Ritwik Gupta et.al.	2503.04722	null
2025-03-06	Full-Duplex-Bench: A Benchmark to Evaluate Full-duplex Spoken Dialogue Models on Turn-taking Capabilities	Guan-Ting Lin et.al.	2503.04721	link
2025-03-06	Predictable Scale: Part I -- Optimal Hyperparameter Scaling Law in Large Language Model Pretraining	Houyi Li et.al.	2503.04715	null
2025-03-06	Scaling Rich Style-Prompted Text-to-Speech Datasets	Anuj Diwan et.al.	2503.04713	link
2025-03-06	Universality of Layer-Level Entropy-Weighted Quantization Beyond Model Architecture and Size	Alireza Behtash et.al.	2503.04704	null
2025-03-06	L1: Controlling How Long A Reasoning Model Thinks With Reinforcement Learning	Pranjal Aggarwal et.al.	2503.04697	null
2025-03-06	UIPE: Enhancing LLM Unlearning by Removing Knowledge Related to Forgetting Targets	Wenyu Wang et.al.	2503.04693	null
2025-03-06	Quantifying the Reasoning Abilities of LLMs on Real-world Clinical Cases	Pengcheng Qiu et.al.	2503.04691	null
2025-03-06	LLM-guided Plan and Retrieval: A Strategic Alignment for Interpretable User Satisfaction Estimation in Dialogue	Sangyeop Kim et.al.	2503.04675	null
2025-03-06	An Information-theoretic Multi-task Representation Learning Framework for Natural Language Understanding	Dou Hu et.al.	2503.04667	link
2025-03-06	CLDyB: Towards Dynamic Benchmarking for Continual Learning with Pre-trained Models	Shengzhuang Chen et.al.	2503.04655	link
2025-03-06	Transferable Foundation Models for Geometric Tasks on Point Cloud Representations: Geometric Neural Operators	Blaine Quackenbush et.al.	2503.04649	link
2025-03-06	Implicit Cross-Lingual Rewarding for Efficient Multilingual Preference Alignment	Wen Yang et.al.	2503.04647	link
2025-03-06	Enhancing SAM with Efficient Prompting and Preference Optimization for Semi-supervised Medical Image Segmentation	Aishik Konwer et.al.	2503.04639	null
2025-03-06	Mark Your LLM: Detecting the Misuse of Open-Source Large Language Models via Watermarking	Yijie Xu et.al.	2503.04636	null
2025-03-06	Better Process Supervision with Bi-directional Rewarding Signals	Wenxiang Chen et.al.	2503.04618	null
2025-03-06	Towards Data-Efficient Language Models: A Child-Inspired Approach to Language Learning	Mohammad Amin Ghanizadeh et.al.	2503.04611	null
2025-03-05	The MASK Benchmark: Disentangling Honesty From Accuracy in AI Systems	Richard Ren et.al.	2503.03750	null
2025-03-05	Process-based Self-Rewarding Language Models	Shimao Zhang et.al.	2503.03746	link
2025-03-05	CHOP: Mobile Operating Assistant with Constrained High-frequency Optimized Subtask Planning	Yuqi Zhou et.al.	2503.03743	link
2025-03-05	Towards Understanding Distilled Reasoning Models: A Representational Approach	David D. Baek et.al.	2503.03730	null
2025-03-05	Improving LLM Safety Alignment with Dual-Objective Optimization	Xuandong Zhao et.al.	2503.03710	link
2025-03-05	Effective LLM Knowledge Learning via Model Generalization	Mingkang Zhu et.al.	2503.03705	null
2025-03-05	A Practical Memory Injection Attack against LLM Agents	Shen Dong et.al.	2503.03704	null
2025-03-05	Developing and Utilizing a Large-Scale Cantonese Dataset for Multi-Tasking in Large Language Models	Jiyue Jiang et.al.	2503.03702	null
2025-03-05	Addressing Overprescribing Challenges: Fine-Tuning Large Language Models for Medication Recommendation Tasks	Zihao Zhao et.al.	2503.03687	link
2025-03-05	Attentive Reasoning Queries: A Systematic Method for Optimizing Instruction-Following in Large Language Models	Bar Karov et.al.	2503.03669	link
2025-03-05	Analogical Reasoning Inside Large Language Models: Concept Vectors and the Limits of Abstraction	Gustaw Opiełka et.al.	2503.03666	link
2025-03-05	Robust Learning of Diverse Code Edits	Tushar Aggarwal et.al.	2503.03656	null
2025-03-05	Improving Neutral Point of View Text Generation through Parameter-Efficient Reinforcement Learning and a Small-Scale High-Quality Dataset	Jessica Hoffmann et.al.	2503.03654	null
2025-03-05	Token-Level Privacy in Large Language Models	Re'em Harel et.al.	2503.03652	null
2025-03-05	Psy-Copilot: Visual Chain of Thought for Counseling	Keqi Chen et.al.	2503.03645	null
2025-03-05	Large language models in finance: estimating financial sentiment for stock prediction	Kemal Kirtac et.al.	2503.03612	null
2025-03-05	Enhancing the Accuracy and Comprehensibility in Architectural Tactics Detection via Small Model-Augmented Prompt Engineering	Lingli Cao et.al.	2503.03609	link
2025-03-05	Psy-Insight: Explainable Multi-turn Bilingual Dataset for Mental Health Counseling	Keqi Chen et.al.	2503.03607	null
2025-03-05	Feature-Level Insights into Artificial Text Detection with Sparse Autoencoders	Kristian Kuznetsov et.al.	2503.03601	null
2025-03-05	Small but Mighty: Enhancing Time Series Forecasting with Lightweight LLMs	Haoran Fan et.al.	2503.03594	link
2025-03-04	Wikipedia in the Era of LLMs: Evolution and Risks	Siming Huang et.al.	2503.02879	link
2025-03-04	Language Models can Self-Improve at State-Value Estimation for Better Search	Ethan Mendes et.al.	2503.02878	link
2025-03-04	SPIDER: A Comprehensive Multi-Organ Supervised Pathology Dataset and Baseline Models	Dmitry Nechaev et.al.	2503.02876	link
2025-03-04	The First Few Tokens Are All You Need: An Efficient and Effective Unsupervised Prefix Fine-Tuning Method for Reasoning Models	Ke Ji et.al.	2503.02875	null
2025-03-04	Prompting Generative AI with Interaction-Augmented Instructions	Leixian Shen et.al.	2503.02874	null
2025-03-04	FairSense-AI: Responsible AI Meets Sustainability	Shaina Raza et.al.	2503.02865	null
2025-03-04	Calibrating LLM Confidence with Semantic Steering: A Multi-Prompt Aggregation Framework	Ziang Zhou et.al.	2503.02863	null
2025-03-04	Privacy and Accuracy-Aware AI/ML Model Deduplication	Hong Guan et.al.	2503.02862	null
2025-03-04	(How) Do Language Models Track State?	Belinda Z. Li et.al.	2503.02854	null
2025-03-04	Shakespearean Sparks: The Dance of Hallucination and Creativity in LLMs' Decoding Layers	Zicong He et.al.	2503.02851	link
2025-03-04	Mask-DPO: Generalizable Fine-grained Factuality Alignment of LLMs	Yuzhe Gu et.al.	2503.02846	link
2025-03-04	Beyond Cosine Decay: On the effectiveness of Infinite Learning Rate Schedule for Continual Pre-training	Paul Janson et.al.	2503.02844	null
2025-03-04	AlignDistil: Token-Level Language Model Alignment as Adaptive Policy Distillation	Songming Zhang et.al.	2503.02832	null
2025-03-04	Developing a PET/CT Foundation Model for Cross-Modal Anatomical and Functional Imaging	Yujin Oh et.al.	2503.02824	null
2025-03-04	"What If Smart Homes Could See Our Homes?": Exploring DIY Smart Home Building Experiences with VLM-Based Camera Sensors	Sojeong Yun et.al.	2503.02816	null
2025-03-04	Q-Filters: Leveraging QK Geometry for Efficient KV Cache Compression	Nathan Godey et.al.	2503.02812	link
2025-03-04	RAAD-LLM: Adaptive Anomaly Detection Using LLMs and RAG Integration	Alicia Russell-Gilbert et.al.	2503.02800	null
2025-03-04	Multimodal AI predicts clinical outcomes of drug combinations from preclinical data	Yepeng Huang et.al.	2503.02781	link
2025-03-04	Implicit Bias in LLMs: A Survey	Xinru Lin et.al.	2503.02776	null
2025-03-04	InSerter: Speech Instruction Following with Unsupervised Interleaved Pre-training	Dingdong Wang et.al.	2503.02769	null
2025-02-28	LLM Post-Training: A Deep Dive into Reasoning Large Language Models	Komal Kumar et.al.	2502.21321	link
2025-02-28	Raccoon: Multi-stage Diffusion Training with Coarse-to-Fine Curating Videos	Zhiyu Tan et.al.	2502.21314	null
2025-02-28	FANformer: Improving Large Language Models Through Effective Periodicity Modeling	Yihong Dong et.al.	2502.21309	link
2025-02-28	Contextualizing biological perturbation experiments through language	Menghua Wu et.al.	2502.21290	link
2025-02-28	Adaptive Keyframe Sampling for Long Video Understanding	Xi Tang et.al.	2502.21271	null
2025-03-03	Foundation Models -- A Panacea for Artificial Intelligence in Pathology?	Nita Mulliqi et.al.	2502.21264	null
2025-02-28	Modeling Human Beliefs about AI Behavior for Scalable Oversight	Leon Lang et.al.	2502.21262	null
2025-02-28	PET Image Denoising via Text-Guided Diffusion: Integrating Anatomical Priors through Text Prompts	Boxiao Yu et.al.	2502.21260	null
2025-02-28	RoboBrain: A Unified Brain Model for Robotic Manipulation from Abstract to Concrete	Yuheng Ji et.al.	2502.21257	null
2025-02-28	TimesBERT: A BERT-Style Foundation Model for Time Series Understanding	Haoran Zhang et.al.	2502.21245	null
2025-03-04	Semantic Volume: Quantifying and Detecting both External and Internal Uncertainty in LLMs	Xiaomin Li et.al.	2502.21239	null
2025-02-28	Transforming Tuberculosis Care: Optimizing Large Language Models For Enhanced Clinician-Patient Communication	Daniil Filienko et.al.	2502.21236	null
2025-02-28	ByteScale: Efficient Scaling of LLM Training with a 2048K Context Length on More Than 12,000 GPUs	Hao Ge et.al.	2502.21231	null
2025-03-03	ECLeKTic: a Novel Challenge Set for Evaluation of Cross-Lingual Knowledge Transfer	Omer Goldman et.al.	2502.21228	null
2025-02-28	Transformers Learn to Implement Multi-step Gradient Descent with Chain of Thought	Jianhao Huang et.al.	2502.21212	null
2025-02-28	Chronologically Consistent Large Language Models	Songrun He et.al.	2502.21206	null
2025-02-28	$Δ$-model correction of Foundation Model based on the models own understanding	Mads-Peter Verner Christiansen et.al.	2502.21179	null
2025-03-03	Causality Is Key to Understand and Balance Multiple Goals in Trustworthy ML and Foundation Models	Ruta Binkyte et.al.	2502.21123	null
2025-02-28	Optimizing Large Language Models for ESG Activity Detection in Financial Texts	Mattia Birti et.al.	2502.21112	link
2025-02-28	Large Language Model-Based Benchmarking Experiment Settings for Evolutionary Multi-Objective Optimization	Lie Meng Pang et.al.	2502.21108	null
2025-02-27	R2-T2: Re-Routing in Test-Time for Multimodal Mixture-of-Experts	Zhongyang Li et.al.	2502.20395	link
2025-02-27	Why Are Web AI Agents More Vulnerable Than Standalone LLMs? A Security Analysis	Jeffrey Yang Fan Chiang et.al.	2502.20383	null
2025-02-27	Multi-Agent Verification: Scaling Test-Time Compute with Multiple Verifiers	Shalev Lifshitz et.al.	2502.20379	null
2025-02-27	PhantomWiki: On-Demand Datasets for Reasoning and Retrieval Evaluation	Albert Gong et.al.	2502.20377	link
2025-02-27	Bridging Legal Knowledge and AI: Retrieval-Augmented Generation with Vector Stores, Knowledge Graphs, and Hierarchical Non-negative Matrix Factorization	Ryan C. Barron et.al.	2502.20364	link
2025-02-27	Bridging the Creativity Understanding Gap: Small-Scale Human Alignment Enables Expert-Level Humor Ranking in LLMs	Kuan Lok Zhou et.al.	2502.20356	null
2025-02-27	KEDRec-LM: A Knowledge-distilled Explainable Drug Recommendation Large Language Model	Kai Zhang et.al.	2502.20350	null
2025-02-27	Sparse Auto-Encoder Interprets Linguistic Features in Large Language Models	Yi Jing et.al.	2502.20344	null
2025-02-27	Thinking Slow, Fast: Scaling Inference Compute with Distilled Reasoners	Daniele Paliotta et.al.	2502.20339	null
2025-02-27	Expertise Is What We Want	Alan Ashworth et.al.	2502.20335	null
2025-02-27	Emergent Symbolic Mechanisms Support Abstract Reasoning in Large Language Models	Yukang Yang et.al.	2502.20332	null
2025-02-27	Long-Context Inference with Retrieval-Augmented Speculative Decoding	Guanzheng Chen et.al.	2502.20330	link
2025-02-27	LangProBe: a Language Programs Benchmark	Shangyin Tan et.al.	2502.20315	null
2025-02-27	EAIRA: Establishing a Methodology for Evaluating AI Models as Scientific Research Assistants	Franck Cappello et.al.	2502.20309	link
2025-02-27	M^3Builder: A Multi-Agent System for Automated Machine Learning in Medical Imaging	Jinghao Feng et.al.	2502.20301	null
2025-02-27	An exploration of features to improve the generalisability of fake news detection models	Nathaniel Hoy et.al.	2502.20299	null
2025-02-27	Judge a Book by its Cover: Investigating Multi-Modal LLMs for Multi-Page Handwritten Document Transcription	Benjamin Gutteridge et.al.	2502.20295	link
2025-02-27	Visual Adaptive Prompting for Compositional Zero-Shot Learning	Kyle Stein et.al.	2502.20292	null
2025-02-27	Conformal Tail Risk Control for Large Language Model Alignment	Catherine Yu-Chi Chen et.al.	2502.20285	null
2025-02-27	Evaluating Human Trust in LLM-Based Planners: A Preliminary Study	Shenghui Chen et.al.	2502.20284	null
2025-02-26	Hi Robot: Open-Ended Instruction Following with Hierarchical Vision-Language-Action Models	Lucy Xiaoyang Shi et.al.	2502.19417	null
2025-02-26	Norm Growth and Stability Challenges in Localized Sequential Knowledge Editing	Akshat Gupta et.al.	2502.19416	null
2025-02-26	Can Language Models Falsify? Evaluating Algorithmic Reasoning with Counterexample Creation	Shiven Sinha et.al.	2502.19414	link
2025-02-26	Project Alexandria: Towards Freeing Scientific Knowledge from Copyright Burdens via LLMs	Christoph Schuhmann et.al.	2502.19413	null
2025-02-26	Code to Think, Think to Code: A Survey on Code-Enhanced Reasoning and Reasoning-Driven Code Intelligence in LLMs	Dayu Yang et.al.	2502.19411	link
2025-02-26	Less or More: Towards Glanceable Explanations for LLM Recommendations Using Ultra-Small Devices	Xinru Wang et.al.	2502.19410	null
2025-02-26	ImageChain: Advancing Sequential Image-to-Text Reasoning in Multimodal Large Language Models	Danae Sánchez Villegas et.al.	2502.19409	null
2025-02-26	Learning Code-Edit Embedding to Model Student Debugging Behavior	Hasnain Heickal et.al.	2502.19407	null
2025-02-26	General Reasoning Requires Learning to Reason from the Get-go	Seungwook Han et.al.	2502.19402	null
2025-02-26	TheoremExplainAgent: Towards Multimodal Explanations for LLM Theorem Understanding	Max Ku et.al.	2502.19400	null
2025-02-26	LiDAR Registration with Visual Foundation Models	Niclas Vödisch et.al.	2502.19374	null
2025-02-26	Deep Learning For Time Series Analysis With Application On Human Motion	Ali Ismail-Fawaz et.al.	2502.19364	null
2025-02-26	DataMan: Data Manager for Pre-training Large Language Models	Ru Peng et.al.	2502.19363	null
2025-02-26	Can Large Language Models Detect Errors in Long Chain-of-Thought Reasoning?	Yancheng He et.al.	2502.19361	link
2025-02-26	Controlled Diversity: Length-optimized Natural Language Generation	Diana Marie Schenke et.al.	2502.19347	null
2025-02-26	Evaluating LLMs and Pre-trained Models for Text Summarization Across Diverse Datasets	Tohida Rehman et.al.	2502.19339	null
2025-02-26	I Know What I Don't Know: Improving Model Cascades Through Confidence Tuning	Stephan Rabanser et.al.	2502.19335	null
2025-02-26	Agentic Reward Modeling: Integrating Human Preferences with Verifiable Correctness Signals for Reliable Reward Systems	Hao Peng et.al.	2502.19328	link
2025-02-26	Shh, don't say that! Domain Certification in LLMs	Cornelius Emde et.al.	2502.19320	null
2025-02-26	Rethinking LLM Unlearning Objectives: A Gradient Perspective and Go Beyond	Qizhou Wang et.al.	2502.19301	null
2025-02-25	DRAMA: Diverse Augmentation from Large Language Models to Smaller Dense Retrievers	Xueguang Ma et.al.	2502.18460	link
2025-02-25	LLM-Based Design Pattern Detection	Christian Schindler et.al.	2502.18458	null
2025-02-25	Evaluating the Effectiveness of Small Language Models in Detecting Refactoring Bugs	Rohit Gheyi et.al.	2502.18454	null
2025-02-25	FRIDA to the Rescue! Analyzing Synthetic Data Effectiveness in Object-Based Common Sense Reasoning for Disaster Response	Mollie Shichman et.al.	2502.18452	null
2025-02-25	SWE-RL: Advancing LLM Reasoning via Reinforcement Learning on Open Software Evolution	Yuxiang Wei et.al.	2502.18449	null
2025-02-25	olmOCR: Unlocking Trillions of Tokens in PDFs with Vision Language Models	Jake Poznanski et.al.	2502.18443	link
2025-02-25	MAPoRL: Multi-Agent Post-Co-Training for Collaborative Large Language Models with Reinforcement Learning	Chanwoo Park et.al.	2502.18439	null
2025-02-25	Reversal Blessing: Thinking Backward May Outpace Thinking Forward in Multi-choice Questions	Yizhe Zhang et.al.	2502.18435	null
2025-02-25	Exploring Gender Disparities in Automatic Speech Recognition Technology	Hend ElGhazaly et.al.	2502.18434	null
2025-02-25	TextGames: Learning to Self-Play Text-Based Puzzle Games via Language Model Reasoning	Frederikus Hudi et.al.	2502.18431	link
2025-02-25	PyEvalAI: AI-assisted evaluation of Jupyter Notebooks for immediate personalized feedback	Nils Wandel et.al.	2502.18425	null
2025-02-25	Compressing Language Models for Specialized Domains	Miles Williams et.al.	2502.18424	null
2025-02-25	Rank1: Test-Time Compute for Reranking in Information Retrieval	Orion Weller et.al.	2502.18418	link
2025-02-25	OmniAlign-V: Towards Enhanced Alignment of MLLMs with Human Preference	Xiangyu Zhao et.al.	2502.18411	link
2025-02-25	Enhancing DNA Foundation Models to Address Masking Inefficiencies	Monireh Safari et.al.	2502.18405	null
2025-02-25	Monte Carlo Temperature: a robust sampling strategy for LLM's uncertainty quantification methods	Nicola Cecere et.al.	2502.18389	null
2025-02-25	How Far are LLMs from Real Search? A Comprehensive Study on Efficiency, Completeness, and Inherent Capabilities	Minhua Lin et.al.	2502.18387	null
2025-02-25	MindMem: Multimodal for Predicting Advertisement Memorability Using LLMs and Deep Learning	Sepehr Asgarian et.al.	2502.18371	null
2025-02-25	Responsible AI Agents	Deven R. Desai et.al.	2502.18359	null
2025-02-25	Which Contributions Deserve Credit? Perceptions of Attribution in Human-AI Co-Creation	Jessica He et.al.	2502.18357	null
2025-02-24	Introducing Visual Perception Token into Multimodal Large Language Model	Runpeng Yu et.al.	2502.17425	link
2025-02-24	MLLMs Know Where to Look: Training-free Perception of Small Visual Details with Multimodal LLMs	Jiarui Zhang et.al.	2502.17422	link
2025-02-24	LongSpec: Long-Context Speculative Decoding with Efficient Drafting and Verification	Penghui Yang et.al.	2502.17421	link
2025-02-24	The Geometry of Refusal in Large Language Models: Concept Cones and Representational Independence	Tom Wollschläger et.al.	2502.17420	null
2025-02-24	From System 1 to System 2: A Survey of Reasoning Large Language Models	Zhong-Zhi Li et.al.	2502.17419	link
2025-02-24	Reasoning with Latent Thoughts: On the Power of Looped Transformers	Nikunj Saunshi et.al.	2502.17416	null
2025-02-24	COSMOS: A Hybrid Adaptive Optimizer for Memory-Efficient Training of LLMs	Liming Liu et.al.	2502.17410	link
2025-02-24	Large Language Models are Powerful EHR Encoders	Stefan Hegselmann et.al.	2502.17403	link
2025-02-24	Big-Math: A Large-Scale, High-Quality Math Dataset for Reinforcement Learning in Language Models	Alon Albalak et.al.	2502.17387	link
2025-02-24	Bridging Gaps in Natural Language Processing for Yorùbá: A Systematic Review of a Decade of Progress and Prospects	Toheeb A. Jimoh et.al.	2502.17364	null
2025-02-24	A Closer Look at TabPFN v2: Strength, Limitation, and Extension	Han-Jia Ye et.al.	2502.17361	null
2025-02-24	RELICT: A Replica Detection Framework for Medical Image Generation	Orhun Utku Aydin et.al.	2502.17360	link
2025-02-24	DIS-CO: Discovering Copyrighted Content in VLMs Training Data	André V. Duarte et.al.	2502.17358	link
2025-02-24	Distributional Scaling Laws for Emergent Capabilities	Rosie Zhao et.al.	2502.17356	null
2025-02-24	On Relation-Specific Neurons in Large Language Models	Yihong Liu et.al.	2502.17355	link
2025-02-24	How Scientists Use Large Language Models to Program	Gabrielle O'Brien et.al.	2502.17348	null
2025-02-24	Time series forecasting based on optimized LLM for fault prediction in distribution power grid insulators	João Pedro Matos-Carvalho et.al.	2502.17341	null
2025-02-24	Tokenized SAEs: Disentangling SAE Reconstructions	Thomas Dooms et.al.	2502.17332	null
2025-02-24	HIPPO: Enhancing the Table Understanding Capability of Large Language Models through Hybrid-Modal Preference Optimization	Zhenghao Liu et.al.	2502.17315	link
2025-02-24	'Generalization is hallucination' through the lens of tensor completions	Liang Ze Wong et.al.	2502.17305	null
2025-02-21	ELIP: Enhanced Visual-Language Foundation Models for Image Retrieval	Guanqi Zhan et.al.	2502.15682	null
2025-02-21	Privacy Ripple Effects from Adding or Removing Personal Information in Language Model Training	Jaydeep Borkar et.al.	2502.15680	link
2025-02-21	BOSS: Benchmark for Observation Space Shift in Long-Horizon Task	Yue Yang et.al.	2502.15679	null
2025-02-21	Testing the limits of fine-tuning to improve reasoning in vision language models	Luca M. Schulze Buschoff et.al.	2502.15678	null
2025-02-21	FLEKE: Federated Locate-then-Edit Knowledge Editing	Zongkai Zhao et.al.	2502.15677	link
2025-02-21	AutoToM: Automated Bayesian Inverse Planning and Model Discovery for Open-ended Theory of Mind	Zhining Zhang et.al.	2502.15676	link
2025-02-21	Almost AI, Almost Human: The Challenge of Detecting AI-Polished Writing	Shoumik Saha et.al.	2502.15666	link
2025-02-21	Machine-generated text detection prevents language model collapse	George Drayson et.al.	2502.15654	link
2025-02-21	Empowering LLMs with Logical Reasoning: A Comprehensive Survey	Fengxiang Cheng et.al.	2502.15652	null
2025-02-21	Steering into New Embedding Spaces: Analyzing Cross-Lingual Alignment Induced by Model Interventions in Multilingual Language Models	Anirudh Sundar et.al.	2502.15639	null
2025-02-21	Mantis: Lightweight Calibrated Foundation Model for User-Friendly Time Series Classification	Vasilii Feofanov et.al.	2502.15637	link
2025-02-21	The Relationship Between Reasoning and Performance in Large Language Models -- o3 (mini) Thinks Harder, Not Longer	Marthe Ballon et.al.	2502.15631	link
2025-02-21	Extraction multi-étiquettes de relations en utilisant des couches de Transformer	Ngoc Luyen Le et.al.	2502.15619	null
2025-02-21	Probe Pruning: Accelerating LLMs through Dynamic Pruning via Model-Probing	Qi Le et.al.	2502.15618	link
2025-02-21	PDeepPP: A Deep learning framework with Pretrained Protein language for peptide classification	Jixiu Zhai et.al.	2502.15610	link
2025-02-21	On the Robustness of Transformers against Context Hijacking for Linear Classification	Tianle Li et.al.	2502.15609	null
2025-02-21	Cross-Format Retrieval-Augmented Generation in XR with LLMs for Context-Aware Maintenance Assistance	Akos Nagy et.al.	2502.15604	null
2025-02-21	Do Multilingual LLMs Think In English?	Lisa Schut et.al.	2502.15603	null
2025-02-21	WorldCraft: Photo-Realistic 3D World Creation and Customization via LLM Agents	Xinhang Liu et.al.	2502.15601	null
2025-02-21	SafeInt: Shielding Large Language Models from Jailbreak Attacks via Safety-Aware Representation Intervention	Jiaqi Wu et.al.	2502.15594	null
2025-02-20	LServe: Efficient Long-sequence LLM Serving with Unified Sparse Attention	Shang Yang et.al.	2502.14866	link
2025-02-20	Aligning LLMs to Ask Good Questions A Case Study in Clinical Reasoning	Shuyue Stella Li et.al.	2502.14860	link
2025-02-20	FR-Spec: Accelerating Large-Vocabulary Language Models via Frequency-Ranked Speculative Sampling	Weilin Zhao et.al.	2502.14856	null
2025-02-20	Prompt-to-Leaderboard	Evan Frick et.al.	2502.14855	link
2025-02-20	GATE: Graph-based Adaptive Tool Evolution Across Diverse Tasks	Jianwen Luo et.al.	2502.14848	link
2025-02-20	Red-Teaming LLM Multi-Agent Systems via Communication Attacks	Pengfei He et.al.	2502.14847	null
2025-02-20	Scaling Text-Rich Image Understanding via Code-Guided Synthetic Multimodal Data Generation	Yue Yang et.al.	2502.14846	null
2025-02-20	Revealing and Mitigating Over-Attention in Knowledge Editing	Pinzheng Wang et.al.	2502.14838	link
2025-02-20	LongWriter-V: Enabling Ultra-Long and High-Fidelity Generation in Vision-Language Models	Shangqing Tu et.al.	2502.14834	link
2025-02-20	Middle-Layer Representation Alignment for Cross-Lingual Transfer in Fine-Tuned LLMs	Danni Liu et.al.	2502.14830	link
2025-02-20	Measuring Faithfulness of Chains of Thought by Unlearning Reasoning Steps	Martin Tutek et.al.	2502.14829	link
2025-02-20	Exploring Advanced Techniques for Visual Question Answering: A Comprehensive Comparison	Aiswarya Baby et.al.	2502.14827	null
2025-02-20	A Survey of Model Architectures in Information Retrieval	Zhichao Xu et.al.	2502.14822	null
2025-02-20	eC-Tab2Text: Aspect-Based Text Generation from e-Commerce Product Tables	Luis Antonio Gutiérrez Guanilo et.al.	2502.14820	null
2025-02-20	Dynamic Low-Rank Sparse Adaptation for Large Language Models	Weizhong Huang et.al.	2502.14816	link
2025-02-20	FetalCLIP: A Visual-Language Foundation Model for Fetal Ultrasound Image Analysis	Fadillah Maani et.al.	2502.14807	link
2025-02-20	From RAG to Memory: Non-Parametric Continual Learning for Large Language Models	Bernal Jiménez Gutiérrez et.al.	2502.14802	link
2025-02-20	A Multi-Agent Perspective on Modern Information Retrieval	Haya Nachimovsky et.al.	2502.14796	null
2025-02-20	Rapid Word Learning Through Meta In-Context Learning	Wentao Wang et.al.	2502.14791	null
2025-02-20	SigLIP 2: Multilingual Vision-Language Encoders with Improved Semantic Understanding, Localization, and Dense Features	Michael Tschannen et.al.	2502.14786	link
2025-02-19	Where's the Bug? Attention Probing for Scalable Fault Localization	Adam Stein et.al.	2502.13966	null
2025-02-19	Autellix: An Efficient Serving Engine for LLM Agents as General Programs	Michael Luo et.al.	2502.13965	null
2025-02-19	MuDAF: Long-Context Multi-Document Attention Focusing through Contrastive Learning on Attention Heads	Weihao Liu et.al.	2502.13963	link
2025-02-19	Is That Your Final Answer? Test-Time Scaling Improves Selective Question Answering	William Jurayj et.al.	2502.13962	null
2025-02-19	LIDDIA: Language-based Intelligent Drug Discovery Agent	Reza Averly et.al.	2502.13959	null
2025-02-19	Neurosymbolic artificial intelligence via large language models and coherence-driven inference	Steve Huntsman et.al.	2502.13953	null
2025-02-19	Why Safeguarded Ships Run Aground? Aligned Large Language Models' Safety Mechanisms Tend to Be Anchored in The Template Region	Chak Tou Leong et.al.	2502.13946	null
2025-02-19	A Chain-of-Thought Subspace Meta-Learning for Few-shot Image Captioning with Large Vision and Language Models	Hao Huang et.al.	2502.13942	null
2025-02-19	Symmetrical Visual Contrastive Optimization: Aligning Vision-Language Models with Minimal Contrastive Images	Shengguang Wu et.al.	2502.13928	null
2025-02-19	Beyond Single Frames: Can LMMs Comprehend Temporal and Contextual Narratives in Image Sequences?	Xiaochen Wang et.al.	2502.13925	null
2025-02-19	LongPO: Long Context Self-Evolution of Large Language Models through Short-to-Long Preference Optimization	Guanzheng Chen et.al.	2502.13922	link
2025-02-19	Exploring Code Language Models for Automated HLS-based Hardware Generation: Benchmark, Infrastructure and Analysis	Jiahao Gai et.al.	2502.13921	null
2025-02-19	Exploring Personalized Health Support through Data-Driven, Theory-Guided LLMs: A Case Study in Sleep Health	Xingbo Wang et.al.	2502.13920	link
2025-02-19	TESS 2: A Large-Scale Generalist Diffusion Language Model	Jaesung Tae et.al.	2502.13917	link
2025-02-19	How Do LLMs Perform Two-Hop Reasoning in Context?	Tianyu Guo et.al.	2502.13913	null
2025-02-19	Lost in Sequence: Do Large Language Models Understand Sequential Recommendation?	Sein Kim et.al.	2502.13909	link
2025-02-19	Judging the Judges: A Collection of LLM-Generated Relevance Judgements	Hossein A. Rahmani et.al.	2502.13908	link
2025-02-19	DataSciBench: An LLM Agent Benchmark for Data Science	Dan Zhang et.al.	2502.13897	link
2025-02-19	NavigateDiff: Visual Predictors are Zero-Shot Navigation Assistants	Yiran Qin et.al.	2502.13894	null
2025-02-19	Refining embeddings with fill-tuning: data-efficient generalised performance improvements for materials foundation models	Matthew P. Wilson et.al.	2502.13886	link
2025-02-18	Re-Align: Aligning Vision Language Models via Retrieval-Augmented Direct Preference Optimization	Shuo Xing et.al.	2502.13146	link
2025-02-18	Multimodal Mamba: Decoder-only Multimodal State Space Model via Quadratic to Linear Distillation	Bencheng Liao et.al.	2502.13145	link
2025-02-18	Pre-training Auto-regressive Robotic Models with 4D Representations	Dantong Niu et.al.	2502.13142	null
2025-02-18	UniGuardian: A Unified Defense for Detecting Prompt Injection, Backdoor Attacks and Adversarial Attacks in Large Language Models	Huawei Lin et.al.	2502.13141	link
2025-02-18	AIDE: AI-Driven Exploration in the Space of Code	Zhengyao Jiang et.al.	2502.13138	link
2025-02-18	Theorem Prover as a Judge for Synthetic Data Generation	Joshua Ong Jun Leang et.al.	2502.13137	null
2025-02-18	Sleepless Nights, Sugary Days: Creating Synthetic Users with Health Conditions for Realistic Coaching Agent Interactions	Taedong Yun et.al.	2502.13135	null
2025-02-18	Learning to Defer for Causal Discovery with Imperfect Experts	Oscar Clivio et.al.	2502.13132	null
2025-02-18	Rethinking Diverse Human Preference Learning through Principal Component Analysis	Feng Luo et.al.	2502.13131	null
2025-02-18	Magma: A Foundation Model for Multimodal AI Agents	Jianwei Yang et.al.	2502.13130	link
2025-02-18	Facilitating Long Context Understanding via Supervised Chain-of-Thought Reasoning	Jingyang Lin et.al.	2502.13127	null
2025-02-18	RuozhiBench: Evaluating LLMs with Logical Fallacies and Misleading Premises	Zenan Zhai et.al.	2502.13125	link
2025-02-18	Adapting Psycholinguistic Research for LLMs: Gender-inclusive Language in a Coreference Context	Marion Bartl et.al.	2502.13120	null
2025-02-18	STEER-ME: Assessing the Microeconomic Reasoning of Large Language Models	Narun Raman et.al.	2502.13119	null
2025-02-18	Performance Evaluation of Large Language Models in Statistical Programming	Xinyi Song et.al.	2502.13117	link
2025-02-18	MatterChat: A Multi-Modal LLM for Material Science	Yingheng Tang et.al.	2502.13107	null
2025-02-18	Understanding and Rectifying Safety Perception Distortion in VLMs	Xiaohan Zou et.al.	2502.13095	null
2025-02-18	Text2World: Benchmarking Large Language Models for Symbolic World Model Generation	Mengkang Hu et.al.	2502.13092	null
2025-02-18	KAPPA: A Generic Patent Analysis Framework with Keyphrase-Based Portraits	Xin Xia et.al.	2502.13076	null
2025-02-18	Cramming 1568 Tokens into a Single Vector and Back Again: Exploring the Limits of Embedding Space Capacity	Yuri Kuratov et.al.	2502.13063	link
2025-02-17	Idiosyncrasies in Large Language Models	Mingjie Sun et.al.	2502.12150	link
2025-02-17	HermesFlow: Seamlessly Closing the Gap in Multimodal Understanding and Generation	Ling Yang et.al.	2502.12148	link
2025-02-17	Fast or Better? Balancing Accuracy and Cost in Retrieval-Augmented Generation with Flexible User Control	Jinyan Su et.al.	2502.12145	link
2025-02-17	Small Models Struggle to Learn from Strong Reasoners	Yuetai Li et.al.	2502.12143	null
2025-02-17	SoftCoT: Soft Chain-of-Thought for Efficient Reasoning with LLMs	Yige Xu et.al.	2502.12134	link
2025-02-17	Transformer Dynamics: A neuroscientific approach to interpretability of large language models	Jesseba Fernando et.al.	2502.12131	null
2025-02-17	Scaling Autonomous Agents via Automatic Reward Modeling And Planning	Zhenfang Chen et.al.	2502.12130	null
2025-02-17	On the Query Complexity of Verifier-Assisted Language Generation	Edoardo Botta et.al.	2502.12123	null
2025-02-17	Minimal Ranks, Maximum Confidence: Parameter-efficient Uncertainty Quantification for LoRA	Patryk Marszałek et.al.	2502.12122	link
2025-02-17	LLMs on the Line: Data Determines Loss-to-Loss Scaling Laws	Prasanna Mayilvahanan et.al.	2502.12120	null
2025-02-17	PRISM: Self-Pruning Intrinsic Selection Method for Training-Free Multimodal Data Selection	Jinhe Bi et.al.	2502.12119	null
2025-02-17	A-MEM: Agentic Memory for LLM Agents	Wujiang Xu et.al.	2502.12110	link
2025-02-17	Personality Structured Interview for Large Language Model Simulation in Personality Research	Pengda Wang et.al.	2502.12109	null
2025-02-17	Relational Norms for Human-AI Cooperation	Brian D. Earp et.al.	2502.12102	null
2025-02-17	Token Communications: A Unified Framework for Cross-modal Context-aware Semantic Communications	Li Qiao et.al.	2502.12096	null
2025-02-17	Descriminative-Generative Custom Tokens for Vision-Language Models	Pramuditha Perera et.al.	2502.12095	null
2025-02-17	Meta-Statistical Learning: Supervised Learning of Statistical Inference	Maxime Peyrard et.al.	2502.12088	null
2025-02-17	APB: Accelerating Distributed Long-Context Inference by Passing Compressed Context Blocks across GPUs	Yuxiang Huang et.al.	2502.12085	link
2025-02-17	VLM$^2$-Bench: A Closer Look at How Well VLMs Implicitly Link Explicit Matching Visual Cues	Jianshu Zhang et.al.	2502.12084	null
2025-02-17	AdaSplash: Adaptive Sparse Flash Attention	Nuno Gonçalves et.al.	2502.12082	link
2025-02-14	MM-RLHF: The Next Step Forward in Multimodal LLM Alignment	Yi-Fan Zhang et.al.	2502.10391	null
2025-02-14	Aspect-Oriented Summarization for Psychiatric Short-Term Readmission Prediction	WonJin Yoon et.al.	2502.10388	null
2025-02-14	Unknown Word Detection for English as a Second Language (ESL) Learners Using Gaze and Pre-trained Language Models	Jiexin Ding et.al.	2502.10378	null
2025-02-14	Robustness tests for biomedical foundation models should tailor to specification	R. Patrick Xian et.al.	2502.10374	link
2025-02-14	Enhancing Multilingual LLM Pretraining with Model-Based Data Selection	Bettina Messmer et.al.	2502.10361	null
2025-02-14	Organize the Web: Constructing Domains Enhances Pre-Training Data Curation	Alexander Wettig et.al.	2502.10341	null
2025-02-14	Evaluating the Meta- and Object-Level Reasoning of Large Language Models for Question Answering	Nick Ferguson et.al.	2502.10338	null
2025-02-14	LLM-Powered Preference Elicitation in Combinatorial Assignment	Ermis Soumalias et.al.	2502.10308	null
2025-02-14	SPIRIT: Short-term Prediction of solar IRradIance for zero-shot Transfer learning using Foundation Models	Aditya Mishra et.al.	2502.10307	null
2025-02-14	Open-Source AI-Powered Optimization in Scalene: Advancing Python Performance Profiling with DeepSeek-R1 and LLaMA 3.2	Saem Hasan et.al.	2502.10299	null
2025-02-14	DeltaProduct: Increasing the Expressivity of DeltaNet Through Products of Householders	Julien Siems et.al.	2502.10297	link
2025-02-14	Probing Perceptual Constancy in Large Vision Language Models	Haoran Sun et.al.	2502.10273	null
2025-02-14	Are Large Language Models the future crowd workers of Linguistics?	Iris Ferrazzo et.al.	2502.10266	null
2025-02-14	Large Language Models and Synthetic Data for Monitoring Dataset Mentions in Research Papers	Aivin V. Solatorio et.al.	2502.10263	link
2025-02-14	VisCon-100K: Leveraging Contextual Web Data for Fine-tuning Vision Language Models	Gokul Karthik Kumar et.al.	2502.10250	null
2025-02-14	Step-Video-T2V Technical Report: The Practice, Challenges, and Future of Video Foundation Model	Guoqing Ma et.al.	2502.10248	link
2025-02-14	Efficient Zero-Order Federated Finetuning of Language Models for Resource-Constrained Devices	Mohamed Aboelenien Ahmed et.al.	2502.10239	null
2025-02-14	AdaPTS: Adapting Univariate Foundation Models to Probabilistic Multivariate Time Series Forecasting	Abdelhakim Benechehab et.al.	2502.10235	link
2025-02-14	Do Large Language Models Reason Causally Like Us? Even Better?	Hanna M. Dettki et.al.	2502.10215	null
2025-02-14	Can Post-Training Quantization Benefit from an Additional QLoRA Integration?	Xiliang Zhu et.al.	2502.10202	null
2025-02-13	Theoretical Benefit and Limitation of Diffusion Language Model	Guhao Feng et.al.	2502.09622	null
2025-02-13	MME-CoT: Benchmarking Chain-of-Thought in Large Multimodal Models for Reasoning Quality, Robustness, and Efficiency	Dongzhi Jiang et.al.	2502.09621	null
2025-02-13	Exploring the Potential of Encoder-free Architectures in 3D LMMs	Yiwen Tang et.al.	2502.09620	link
2025-02-13	Human-LLM Coevolution: Evidence from Academic Writing	Mingmeng Geng et.al.	2502.09606	null
2025-02-13	SelfCite: Self-Supervised Alignment for Context Attribution in Large Language Models	Yung-Sung Chuang et.al.	2502.09604	link
2025-02-13	GAIA: A Global, Multi-modal, Multi-scale Vision-Language Dataset for Remote Sensing Image Analysis	Angelos Zavras et.al.	2502.09598	link
2025-02-13	Do LLMs Recognize Your Preferences? Evaluating Personalized Preference Following in LLMs	Siyan Zhao et.al.	2502.09597	link
2025-02-13	KIMAs: A Configurable Knowledge Integrated Multi-Agent System	Zitao Li et.al.	2502.09596	null
2025-02-13	Logical forms complement probability in understanding language model (and human) performance	Yixuan Wang et.al.	2502.09589	null
2025-02-13	Polymind: Parallel Visual Diagramming with Large Language Models to Support Prewriting Through Microtasks	Qian Wan et.al.	2502.09577	null
2025-02-13	MorphNLI: A Stepwise Approach to Natural Language Inference Using Text Morphing	Vlad Andrei Negru et.al.	2502.09567	null
2025-02-13	Zero-shot generation of synthetic neurosurgical data with large language models	Austin A. Barr et.al.	2502.09566	link
2025-02-13	MDCrow: Automating Molecular Dynamics Workflows with Large Language Models	Quintina Campbell et.al.	2502.09565	link
2025-02-13	EmbodiedBench: Comprehensive Benchmarking Multi-modal Large Language Models for Vision-Driven Embodied Agents	Rui Yang et.al.	2502.09560	null
2025-02-13	Explainable AI-assisted Optimization for Feynman Integral Reduction	Zhuo-Yang Song et.al.	2502.09544	null
2025-02-13	Mind the Gap! Choice Independence in Using Multilingual LLMs for Persuasive Co-Writing Tasks in Different Languages	Shreyan Biswas et.al.	2502.09532	null
2025-02-13	When and How Does CLIP Enable Domain and Compositional Generalization?	Elias Kempf et.al.	2502.09507	link
2025-02-13	Improve LLM-based Automatic Essay Scoring with Linguistic Features	Zhaoyi Joey Hou et.al.	2502.09497	null
2025-02-13	Foundation Neural-Network Quantum States	Riccardo Rende et.al.	2502.09488	null
2025-02-13	Objective quantification of mood states using large language models	Jakub Onysk et.al.	2502.09487	null
2025-02-12	SwiftSketch: A Diffusion Model for Image-to-Vector Sketch Generation	Ellie Arar et.al.	2502.08642	null
2025-02-12	Examining Multilingual Embedding Models Cross-Lingually Through LLM-Generated Adversarial Examples	Andrianos Michail et.al.	2502.08638	null
2025-02-12	Ensemble based approach to quantifying uncertainty of LLM based classifications	Srijith Rajamohan et.al.	2502.08631	null
2025-02-12	Continuous Cardiac Arrest Prediction in ICU using PPG Foundation Model	Saurabh Kataria et.al.	2502.08612	null
2025-02-12	Causal Analysis of ASR Errors for Children: Quantifying the Impact of Physiological, Cognitive, and Extrinsic Factors	Vishwanath Pratap Singh et.al.	2502.08587	null
2025-02-12	Commercial LLM Agents Are Already Vulnerable to Simple Yet Dangerous Attacks	Ang Li et.al.	2502.08586	null
2025-02-12	COAST: Intelligent Time-Adaptive Neural Operators	Zhikai Wu et.al.	2502.08574	null
2025-02-12	QA-Expand: Multi-Question Answer Generation for Enhanced Query Expansion in Information Retrieval	Wonduk Seo et.al.	2502.08557	null
2025-02-12	Human-Centric Foundation Models: Perception, Generation and Agentic Modeling	Shixiang Tang et.al.	2502.08556	link
2025-02-12	Fostering Appropriate Reliance on Large Language Models: The Role of Explanations, Sources, and Inconsistencies	Sunnie S. Y. Kim et.al.	2502.08554	null
2025-02-12	LLMs can implicitly learn from mistakes in-context	Lisa Alazraki et.al.	2502.08550	null
2025-02-12	Representation Learning to Advance Multi-institutional Studies with Electronic Health Record Data	Doudou Zhou et.al.	2502.08547	null
2025-02-12	Moment of Untruth: Dealing with Negative Queries in Video Moment Retrieval	Kevin Flanagan et.al.	2502.08544	link
2025-02-12	LLM Pretraining with Continuous Concepts	Jihoon Tack et.al.	2502.08524	null
2025-02-12	The Paradox of Stochasticity: Limited Creativity and Computational Decoupling in Temperature-Varied LLM Outputs of Structured Fictional Data	Evgenii Evstafev et.al.	2502.08515	null
2025-02-12	Faithful, Unfaithful or Ambiguous? Multi-Agent Debate with Initial Stance for Summary Evaluation	Mahnaz Koupaee et.al.	2502.08514	link
2025-02-12	Measuring Diversity in Synthetic Datasets	Yuchang Zhu et.al.	2502.08512	link
2025-02-12	Explanation based In-Context Demonstrations Retrieval for Multilingual Grammatical Error Correction	Wei Li et.al.	2502.08507	link
2025-02-12	Salamandra Technical Report	Aitor Gonzalez-Agirre et.al.	2502.08489	link
2025-02-12	One-Shot Federated Learning with Classifier-Free Diffusion Models	Obaidullah Zaland et.al.	2502.08488	null
2025-02-11	DarwinLM: Evolutionary Structured Pruning of Large Language Models	Shengkun Tang et.al.	2502.07780	link
2025-02-11	Auditing Prompt Caching in Language Model APIs	Chenchen Gu et.al.	2502.07776	link
2025-02-11	Automatic Robot Task Planning by Integrating Large Language Model with Genetic Programming	Azizjon Kobilov et.al.	2502.07772	null
2025-02-11	Breaking Down Bias: On The Limits of Generalizable Pruning Strategies	Sibo Ma et.al.	2502.07771	null
2025-02-11	Great Power Brings Great Responsibility: Personalizing Conversational AI for Diverse Problem-Solvers	Italo Santos et.al.	2502.07763	null
2025-02-11	Scalable Fingerprinting of Large Language Models	Anshul Nasery et.al.	2502.07760	null
2025-02-11	Towards Efficient Optimizer Design for LLM via Structured Fisher Approximation with a Low-Rank Extension	Wenbo Gong et.al.	2502.07752	null
2025-02-11	WHODUNIT: Evaluation benchmark for culprit detection in mystery stories	Kshitij Gupta et.al.	2502.07747	link
2025-02-11	The Economics of Large Language Models: Token Allocation, Fine-Tuning, and Optimal Pricing	Dirk Bergemann et.al.	2502.07736	null
2025-02-11	Economics of Sourcing Human Data	Sebastin Santy et.al.	2502.07732	null
2025-02-11	Verifying LLM-Generated Code in the Context of Software Verification with Ada/SPARK	Marcos Cramer et.al.	2502.07728	null
2025-02-11	Making Language Models Robust Against Negation	MohammadHossein Rezaei et.al.	2502.07717	link
2025-02-11	Magic 1-For-1: Generating One Minute Video Clips within One Minute	Hongwei Yi et.al.	2502.07701	link
2025-02-11	A Framework for LLM-powered Design Assistants	Swaroop Panda et.al.	2502.07698	null
2025-02-11	Large Language Models as Proxies for Theories of Human Linguistic Cognition	Imry Ziv et.al.	2502.07687	null
2025-02-11	SymGPT: Auditing Smart Contracts via Combining Symbolic Execution with Large Language Models	Shihao Xia et.al.	2502.07644	null
2025-02-11	FoQA: A Faroese Question-Answering Dataset	Annika Simonsen et.al.	2502.07642	null
2025-02-11	Goedel-Prover: A Frontier Model for Open-Source Automated Theorem Proving	Yong Lin et.al.	2502.07640	link
2025-02-11	Exploring Mobile Touch Interaction with Large Language Models	Tim Zindulka et.al.	2502.07629	null
2025-02-11	Scaling Pre-training to One Hundred Billion Data for Vision Language Models	Xiao Wang et.al.	2502.07617	null
2025-02-10	EVEv2: Improved Baselines for Encoder-Free Vision-Language Models	Haiwen Diao et.al.	2502.06788	link
2025-02-10	Visual Agentic AI for Spatial Reasoning with a Dynamic API	Damiano Marsili et.al.	2502.06787	null
2025-02-10	DeepCrossAttention: Supercharging Transformer Residual Connections	Mike Heddes et.al.	2502.06785	null
2025-02-10	Towards Internet-Scale Training For Agents	Brandon Trabucco et.al.	2502.06776	null
2025-02-10	Enhancing Trust in Language Model-Based Code Optimization through RLHF: A Research Design	Jingzhi Gong et.al.	2502.06769	null
2025-02-10	Exploiting Sparsity for Long Context Inference: Million Token Contexts on Commodity GPUs	Ryan Synk et.al.	2502.06766	link
2025-02-10	Rationalization Models for Text-to-SQL	Gaetano Rossiello et.al.	2502.06759	null
2025-02-10	Accelerating Data Processing and Benchmarking of AI Models for Pathology	Andrew Zhang et.al.	2502.06750	link
2025-02-10	Gradient Multi-Normalization for Stateless and Scalable LLM Training	Meyer Scetbon et.al.	2502.06742	null
2025-02-10	VersaPRM: Multi-Domain Process Reward Model via Synthetic Reasoning Data	Thomas Zeng et.al.	2502.06737	null
2025-02-10	Dynamic Loss-Based Sample Reweighting for Improved Large Language Model Pretraining	Daouda Sow et.al.	2502.06733	null
2025-02-10	Can 1B LLM Surpass 405B LLM? Rethinking Compute-Optimal Test-Time Scaling	Runze Liu et.al.	2502.06703	link
2025-02-10	EquiTabPFN: A Target-Permutation Equivariant Prior Fitted Networks	Michael Arbel et.al.	2502.06684	null
2025-02-10	Boosting Self-Efficacy and Performance of Large Language Models via Verbal Efficacy Stimulations	Rui Chen et.al.	2502.06669	null
2025-02-10	Automatic Evaluation of Healthcare LLMs Beyond Question-Answering	Anna Arias-Duart et.al.	2502.06666	null
2025-02-10	Evaluation of Deep Audio Representations for Hearables	Fabian Gröger et.al.	2502.06664	null
2025-02-10	EfficientLLM: Scalable Pruning-Aware Pretraining for Architecture-Agnostic Edge Language Models	Xingrun Xing et.al.	2502.06663	null
2025-02-10	Unbiased Evaluation of Large Language Models from a Causal Perspective	Meilin Chen et.al.	2502.06655	null
2025-02-10	In-Context Learning (and Unlearning) of Length Biases	Stephanie Schoch et.al.	2502.06653	null
2025-02-10	Transparent NLP: Using RAG and LLM Alignment for Privacy Q&A	Anna Leschanowsky et.al.	2502.06652	null
2025-02-07	Long-VITA: Scaling Large Multi-modal Models to 1 Million Tokens with Leading Short-Context Accuracy	Yunhang Shen et.al.	2502.05177	link
2025-02-07	Scaling up Test-Time Compute with Latent Reasoning: A Recurrent Depth Approach	Jonas Geiping et.al.	2502.05171	link
2025-02-07	NoLiMa: Long-Context Evaluation Beyond Literal Matching	Ali Modarressi et.al.	2502.05167	link
2025-02-07	Multitwine: Multi-Object Compositing with Text and Layout Control	Gemma Canet Tarrés et.al.	2502.05165	null
2025-02-07	DuoGuard: A Two-Player RL-Driven Framework for Multilingual LLM Guardrails	Yihe Deng et.al.	2502.05163	link
2025-02-07	A Lightweight Method to Disrupt Memorized Sequences in LLM	Parjanya Prajakta Prashant et.al.	2502.05159	null
2025-02-07	Transforming Science with Large Language Models: A Survey on AI-assisted Scientific Discovery, Experimentation, Content Generation, and Evaluation	Steffen Eger et.al.	2502.05151	link
2025-02-07	CodeSCM: Causal Analysis for Multi-Modal Code Generation	Mukur Gupta et.al.	2502.05150	link
2025-02-07	An Annotated Reading of 'The Singer of Tales' in the LLM Era	Kush R. Varshney et.al.	2502.05148	null
2025-02-07	Chest X-ray Foundation Model with Global and Local Representations Integration	Zefan Yang et.al.	2502.05142	link
2025-02-07	Refining Integration-by-Parts Reduction of Feynman Integrals with Machine Learning	Matt von Hippel et.al.	2502.05121	null
2025-02-07	Flexible and Efficient Grammar-Constrained Decoding	Kanghee Park et.al.	2502.05111	null
2025-02-07	Lost in Time: Clock and Calendar Understanding Challenges in Multimodal LLMs	Rohit Saxena et.al.	2502.05092	null
2025-02-07	DCFormer: Efficient 3D Vision-Language Modeling with Decomposed Convolutions	Gorkem Can Ates et.al.	2502.05091	null
2025-02-07	Mitigating Unintended Memorization with LoRA in Federated Learning for LLMs	Thierry Bossy et.al.	2502.05087	link
2025-02-07	Causality can systematically address the monsters under the bench(marks)	Felix Leeb et.al.	2502.05085	null
2025-02-07	ChallengeMe: An Adversarial Learning-enabled Text Summarization Framework	Xiaoyu Deng et.al.	2502.05084	null
2025-02-07	Adaptive Graph of Thoughts: Test-Time Adaptive Reasoning Unifying Chain, Tree, and Graph Structures	Tushar Pandey et.al.	2502.05078	link
2025-02-07	nvAgent: Automated Data Visualization from Natural Language via Collaborative Agent Workflow	Geliang Ouyang et.al.	2502.05036	link
2025-02-07	EnseSmells: Deep ensemble and programming language models for automated code smells detection	Anh Ho et.al.	2502.05012	link
2025-02-06	Ola: Pushing the Frontiers of Omni-Modal Language Model with Progressive Modality Alignment	Zuyan Liu et.al.	2502.04328	link
2025-02-06	Speak Easy: Eliciting Harmful Jailbreaks from LLMs with Simple Interactions	Yik Siu Chan et.al.	2502.04322	link
2025-02-06	ConceptAttention: Diffusion Transformers Learn Highly Interpretable Features	Alec Helbling et.al.	2502.04320	link
2025-02-06	sshELF: Single-Shot Hierarchical Extrapolation of Latent Features for 3D Reconstruction from Sparse-Views	Eyvaz Najafli et.al.	2502.04318	null
2025-02-06	ChamaleonLLM: Batch-Aware Dynamic Low-Rank Adaptation via Inference-Time Clusters	Kamer Ali Yuksel et.al.	2502.04315	link
2025-02-06	Great Models Think Alike and this Undermines AI Oversight	Shashwat Goel et.al.	2502.04313	link
2025-02-06	ScoreFlow: Mastering LLM Agent Workflows via Score-based Preference Optimization	Yinjie Wang et.al.	2502.04306	link
2025-02-06	Beyond Prompt Content: Enhancing LLM Performance via Content-Format Integrated Prompt Optimization	Yuanye Liu et.al.	2502.04295	link
2025-02-06	PILAF: Optimal Human Preference Sampling for Reward Modeling	Yunzhen Feng et.al.	2502.04270	null
2025-02-06	How does a Multilingual LM Handle Multiple Languages?	Santhosh Kakarla et.al.	2502.04269	null
2025-02-06	Cross the Gap: Exposing the Intra-modal Misalignment in CLIP via Modality Inversion	Marco Mistretta et.al.	2502.04263	link
2025-02-06	Efficient Randomized Experiments Using Foundation Models	Piersilvio De Bartolomeis et.al.	2502.04262	link
2025-02-06	MAGA: MAssive Genre-Audience Reformulation to Pretraining Corpus Expansion	Xintong Hao et.al.	2502.04235	null
2025-02-06	Can LLMs Hack Enterprise Networks? Autonomous Assumed Breach Penetration-Testing Active Directory Networks	Andreas Happe et.al.	2502.04227	link
2025-02-06	Keep It Light! Simplifying Image Clustering Via Text-Free Adapters	Yicen Li et.al.	2502.04226	null
2025-02-06	Éclair -- Extracting Content and Layout with Integrated Reading Order for Documents	Ilia Karmanov et.al.	2502.04223	null
2025-02-06	Sports and Women's Sports: Gender Bias in Text Generation with Olympic Data	Laura Biester et.al.	2502.04218	null
2025-02-06	Algorithmic causal structure emerging through compression	Liang Wendong et.al.	2502.04210	null
2025-02-06	"Short-length" Adversarial Training Helps LLMs Defend "Long-length" Jailbreak Attacks: Theoretical and Empirical Evidence	Shaopeng Fu et.al.	2502.04204	link
2025-02-06	The Best Instruction-Tuning Data are Those That Fit	Dylan Zhang et.al.	2502.04194	null
2025-02-05	Do Large Language Model Benchmarks Test Reliability?	Joshua Vendrow et.al.	2502.03461	link
2025-02-05	Adapt-Pruner: Adaptive Structural Pruning for Efficient Small Language Model Training	Boyao Wang et.al.	2502.03460	null
2025-02-05	SKI Models: Skeleton Induced Vision-Language Embeddings for Understanding Activities of Daily Living	Arkaprava Sinha et.al.	2502.03459	null
2025-02-05	A Schema-Guided Reason-while-Retrieve framework for Reasoning on Scene Graphs with Large-Language-Models (LLMs)	Yiye Chen et.al.	2502.03450	null
2025-02-05	BFS-Prover: Scalable Best-First Tree Search for LLM-based Automatic Theorem Proving	Ran Xin et.al.	2502.03438	null
2025-02-05	On Fairness of Unified Multimodal Large Language Model for Image Generation	Ming Liu et.al.	2502.03429	null
2025-02-05	Harnessing Large Language Models for Curated Code Reviews	Oussama Ben Sghaier et.al.	2502.03425	link
2025-02-05	Think or Step-by-Step? UnZIPping the Black Box in Zero-Shot Prompts	Nikta Gohari Sadr et.al.	2502.03418	null
2025-02-05	SPRI: Aligning Large Language Models with Context-Situated Principles	Hongli Zhan et.al.	2502.03397	null
2025-02-05	Benchmarking Time Series Forecasting Models: From Statistical Techniques to Foundation Models in Real-World Applications	Issar Arab et.al.	2502.03395	null
2025-02-05	LIMO: Less is More for Reasoning	Yixin Ye et.al.	2502.03387	link
2025-02-05	Transformers and Their Roles as Time Series Foundation Models	Dennis Wu et.al.	2502.03383	null
2025-02-05	High-Fidelity Simultaneous Speech-To-Speech Translation	Tom Labiausse et.al.	2502.03382	link
2025-02-05	Demystifying Long Chain-of-Thought Reasoning in LLMs	Edward Yeo et.al.	2502.03373	link
2025-02-05	PalimpChat: Declarative and Interactive AI analytics	Chunwei Liu et.al.	2502.03368	null
2025-02-05	Minerva: A Programmable Memory Test Benchmark for Language Models	Menglin Xia et.al.	2502.03358	null
2025-02-05	RadVLM: A Multitask Conversational Vision-Language Model for Radiology	Nicolas Deperrois et.al.	2502.03333	null
2025-02-05	ECM: A Unified Electronic Circuit Model for Explaining the Emergence of In-Context Learning and Chain-of-Thought in Large Language Model	Qiguang Chen et.al.	2502.03325	null
2025-02-05	Out-of-Distribution Detection using Synthetic Data Generation	Momin Abbas et.al.	2502.03323	null
2025-02-05	Simplifying Formal Proof-Generating Models with ChatGPT and Basic Searching Techniques	Sangjun Han et.al.	2502.03321	null
2025-02-04	Articulate AnyMesh: Open-Vocabulary 3D Articulated Objects Modeling	Xiaowen Qiu et.al.	2502.02590	null
2025-02-04	COCONut-PanCap: Joint Panoptic Segmentation and Grounded Captions for Fine-Grained Understanding and Generation	Xueqing Deng et.al.	2502.02589	null
2025-02-04	A comparison of translation performance between DeepL and Supertext	Alex Flückiger et.al.	2502.02577	link
2025-02-04	Are Language Models Up to Sequential Optimization Problems? From Evaluation to a Hegelian-Inspired Enhancement	Soheil Abbasloo et.al.	2502.02573	null
2025-02-04	Learning the RoPEs: Better 2D and 3D Position Encodings with STRING	Connor Schenck et.al.	2502.02562	null
2025-02-04	Mosaic3D: Foundation Dataset and Model for Open-Vocabulary 3D Segmentation	Junha Lee et.al.	2502.02548	null
2025-02-04	LLMs for Generation of Architectural Components: An Exploratory Empirical Study in the Serverless World	Shrikara Arun et.al.	2502.02539	null
2025-02-04	Adaptive Self-improvement LLM Agentic System for ML Library Development	Genghan Zhang et.al.	2502.02534	link
2025-02-04	Multi-Agent Design: Optimizing Agents with Better Prompts and Topologies	Han Zhou et.al.	2502.02533	null
2025-02-04	Satori: Reinforcement Learning with Chain-of-Action-Thought Enhances LLM Reasoning via Autoregressive Search	Maohao Shen et.al.	2502.02508	null
2025-02-04	Analyzing Similarity Metrics for Data Selection for Language Model Pretraining	Dylan Sam et.al.	2502.02494	null
2025-02-04	EasySpec: Layer-Parallel Speculative Decoding for Efficient Multi-GPU Utilization	Yize Wu et.al.	2502.02493	null
2025-02-04	Multilingual Machine Translation with Open Large Language Models at Practical Scale: An Empirical Study	Menglong Cui et.al.	2502.02481	null
2025-02-04	Mind the Gap: Evaluating Patch Embeddings from General-Purpose and Histopathology Foundation Models for Cell Segmentation and Classification	Valentina Vadori et.al.	2502.02471	link
2025-02-04	Modular Training of Neural Networks aids Interpretability	Satvik Golechha et.al.	2502.02470	null
2025-02-04	SAISA: Towards Multimodal Large Language Models with Both Training and Inference Efficiency	Qianhao Yuan et.al.	2502.02458	link
2025-02-04	IMDPrompter: Adapting SAM to Image Manipulation Detection by Cross-View Automated Prompt Learning	Quan Zhang et.al.	2502.02454	null
2025-02-04	Personalization Toolkit: Training Free Personalization of Large Vision Language Models	Soroush Seifi et.al.	2502.02452	null
2025-02-04	Beyond English: Evaluating Automated Measurement of Moral Foundations in Non-English Discourse with a Chinese Case Study	Calvin Yixiang Cheng et.al.	2502.02451	link
2025-02-04	Generative Psycho-Lexical Approach for Constructing Value Systems in Large Language Models	Haoran Ye et.al.	2502.02444	null
2025-01-31	Low-Rank Adapting Models for Sparse Autoencoders	Matthew Chen et.al.	2501.19406	link
2025-01-31	Vintix: Action Model via In-Context Reinforcement Learning	Andrey Polubarov et.al.	2501.19400	link
2025-01-31	Scalable-Softmax Is Superior for Attention	Ken M. Nakanishi et.al.	2501.19399	null
2025-01-31	Do LLMs Strategically Reveal, Conceal, and Infer Information? A Theoretical and Empirical Analysis in The Chameleon Game	Mustafa O. Karabag et.al.	2501.19398	link
2025-02-03	s1: Simple test-time scaling	Niklas Muennighoff et.al.	2501.19393	link
2025-01-31	Cache Me If You Must: Adaptive Key-Value Quantization for Large Language Models	Alina Shutova et.al.	2501.19392	link
2025-01-31	Federated Sketching LoRA: On-Device Collaborative Fine-Tuning of Large Language Models	Wenzhi Fang et.al.	2501.19389	link
2025-01-31	Decoding-based Regression	Xingyou Song et.al.	2501.19383	link
2025-01-31	TableMaster: A Recipe to Advance Table Understanding with Language Models	Lang Cao et.al.	2501.19378	null
2025-02-03	SELMA: A Speech-Enabled Language Model for Virtual Assistant Interactions	Dominik Wagner et.al.	2501.19377	null
2025-01-31	We're Different, We're the Same: Creative Homogeneity Across LLMs	Emily Wenger et.al.	2501.19361	null
2025-01-31	Mechanical Properties of the Meninges: Large Language Model Assisted Systematic Review of over 25,000 Studies	Brandon P. Chelstrom et.al.	2501.19359	null
2025-01-31	The Energy Loss Phenomenon in RLHF: A New Perspective on Mitigating Reward Hacking	Yuchun Miao et.al.	2501.19358	null
2025-01-31	Towards Adaptive Self-Improvement for Smarter Energy Systems	Alexander Sommer et.al.	2501.19340	null
2025-01-31	PixelWorld: Towards Perceiving Everything as Pixels	Zhiheng Lyu et.al.	2501.19339	null
2025-01-31	Homogeneity Bias as Differential Sampling Uncertainty in Language Models	Messi H. J. Lee et.al.	2501.19337	null
2025-01-31	Reward-Guided Speculative Decoding for Efficient LLM Reasoning	Baohao Liao et.al.	2501.19324	null
2025-01-31	MINDSTORES: Memory-Informed Neural Decision Synthesis for Task-Oriented Reinforcement in Embodied Systems	Anirudh Chari et.al.	2501.19318	null
2025-01-31	LLM-based Affective Text Generation Quality Based on Different Quantization Values	Yarik Menchaca Resendiz et.al.	2501.19317	null
2025-01-31	An Efficient Approach for Machine Translation on Low-resource Languages: A Case Study in Vietnamese-Chinese	Tran Ngoc Son et.al.	2501.19314	null
2025-01-30	Foundational Models for 3D Point Clouds: A Survey and Outlook	Vishal Thengane et.al.	2501.18594	null
2025-01-30	Advances in Multimodal Adaptation and Generalization: From Traditional Approaches to Foundation Models	Hao Dong et.al.	2501.18592	link
2025-01-30	Thoughts Are All Over the Place: On the Underthinking of o1-Like LLMs	Yue Wang et.al.	2501.18585	null
2025-01-30	Prediction-Powered Inference with Imputed Covariates and Nonuniform Sampling	Dan M. Kluger et.al.	2501.18577	link
2025-01-30	Token-Hungry, Yet Precise: DeepSeek R1 Highlights the Need for Multi-Step Reasoning Over Speed in MATH	Evgenii Evstafev et.al.	2501.18576	null
2025-01-30	BounTCHA: A CAPTCHA Utilizing Boundary Identification in AI-extended Videos	Lehao Lin et.al.	2501.18565	null
2025-01-30	SAM2Act: Integrating Visual Foundation Model with A Memory Architecture for Robotic Manipulation	Haoquan Fang et.al.	2501.18564	link
2025-01-30	Semantic Web and Creative AI -- A Technical Report from ISWS 2023	Raia Abu Ahmad et.al.	2501.18542	null
2025-01-30	Loss Functions and Operators Generated by f-Divergences	Vincent Roulet et.al.	2501.18537	null
2025-01-30	Illusions of Relevance: Using Content Injection Attacks to Deceive Retrievers, Rerankers, and LLM Judges	Manveer Singh Tamber et.al.	2501.18536	link
2025-01-30	Rethinking Bottlenecks in Safety Fine-Tuning of Vision Language Models	Yi Ding et.al.	2501.18533	null
2025-01-30	Differentially Private Steering for Large Language Model Alignment	Anmol Goel et.al.	2501.18532	link
2025-01-30	Learn from the Past: Language-conditioned Object Rearrangement with Large Language Models	Guanqun Cao et.al.	2501.18516	null
2025-01-30	Streaming DiLoCo with overlapping communication: Towards a Distributed Free Lunch	Arthur Douillard et.al.	2501.18512	null
2025-01-30	WILDCHAT-50M: A Deep Dive Into the Role of Synthetic Data in Post-Training	Benjamin Feuer et.al.	2501.18511	link
2025-01-30	CLEAR: Cue Learning using Evolution for Accurate Recognition Applied to Sustainability Data Extraction	Peter J. Bentley et.al.	2501.18504	null
2025-01-30	A Tool for In-depth Analysis of Code Execution Reasoning of Large Language Models	Changshu Liu et.al.	2501.18482	null
2025-01-30	CLoQ: Enhancing Fine-Tuning of Quantized LLMs via Calibrated LoRA Initialization	Yanxia Deng et.al.	2501.18475	null
2025-01-30	Tuning Vision Foundation Model via Test-Time Prompt-Guided Training for VFSS Segmentations	Chengxi Zeng et.al.	2501.18474	null
2025-01-30	A Benchmark and Evaluation for Real-World Out-of-Distribution Detection Using Vision-Language Models	Shiho Noda et.al.	2501.18463	link
2025-01-29	Learning Beyond the Surface: How Far Can Continual Pre-Training with LoRA Enhance LLMs' Domain-Specific Insight Learning?	Pouya Pezeshkpour et.al.	2501.17840	link
2025-01-29	Matrix Product Sketching via Coordinated Sampling	Majid Daliri et.al.	2501.17836	null
2025-01-29	Aggregation Schemes for Single-Vector WSI Representation Learning in Digital Pathology	Sobhan Hemati et.al.	2501.17822	null
2025-01-29	Leveraging Multimodal LLM for Inspirational User Interface Search	Seokhyeon Park et.al.	2501.17799	link
2025-01-29	BreezyVoice: Adapting TTS for Taiwanese Mandarin with Enhanced Polyphone Disambiguation -- Challenges and Insights	Chan-Jan Hsu et.al.	2501.17790	null
2025-01-29	Reasoning Over the Glyphs: Evaluation of LLM's Decipherment of Rare Scripts	Yu-Fei Shih et.al.	2501.17785	null
2025-01-29	AdditiveLLM: Large Language Models Predict Defects in Additive Manufacturing	Peter Pak et.al.	2501.17784	null
2025-01-29	2SSP: A Two-Stage Framework for Structured Pruning of LLMs	Fabrizio Sandri et.al.	2501.17771	link
2025-01-29	Hybrid Graphs for Table-and-Text based Question Answering using LLMs	Ankush Agarwal et.al.	2501.17767	null
2025-01-29	On the Partitioning of GPU Power among Multi-Instances	Tirth Vamja et.al.	2501.17752	null
2025-01-29	Early External Safety Testing of OpenAI's o3-mini: Insights from the Pre-Deployment Evaluation	Aitor Arrieta et.al.	2501.17749	null
2025-01-29	A technical review of multi-omics data integration methods: from classical statistical to deep generative approaches	Ana R. Baião et.al.	2501.17729	null
2025-01-29	Using Code Generation to Solve Open Instances of Combinatorial Design Problems	Christopher D. Rosin et.al.	2501.17725	link
2025-01-29	RICoTA: Red-teaming of In-the-wild Conversation with Test Attempts	Eujeong Choi et.al.	2501.17715	link
2025-01-29	Critique Fine-Tuning: Learning to Critique is More Effective than Learning to Imitate	Yubo Wang et.al.	2501.17703	null
2025-01-29	Planning with Vision-Language Models and a Use Case in Robot-Assisted Teaching	Xuzhe Dang et.al.	2501.17665	null
2025-01-29	Exploring Vision Language Models for Multimodal and Multilingual Stance Detection	Jake Vasilakes et.al.	2501.17654	null
2025-01-29	Tonguescape: Exploring Language Models Understanding of Vowel Articulation	Haruki Sakajo et.al.	2501.17643	link
2025-01-29	Efficient Redundancy Reduction for Open-Vocabulary Semantic Segmentation	Lin Chen et.al.	2501.17642	null
2025-01-29	In-Context Meta LoRA Generation	Yihua Shao et.al.	2501.17635	null
2025-01-28	SFT Memorizes, RL Generalizes: A Comparative Study of Foundation Model Post-training	Tianzhe Chu et.al.	2501.17161	null
2025-01-28	AxBench: Steering LLMs? Even Simple Baselines Outperform Sparse Autoencoders	Zhengxuan Wu et.al.	2501.17148	link
2025-01-28	FactCG: Enhancing Fact Checkers with Graph-Based Multi-Hop Data	Deren Lei et.al.	2501.17144	link
2025-01-28	ASTRAL: Automated Safety Testing of Large Language Models	Miriam Ugarte et.al.	2501.17132	null
2025-01-28	Scenario Understanding of Traffic Scenes Through Large Visual Language Models	Rivera Esteban et.al.	2501.17131	null
2025-01-28	Histoires Morales: A French Dataset for Assessing Moral Alignment	Thibaud Leteno et.al.	2501.17117	link
2025-01-28	Optimizing Large Language Model Training Using FP4 Quantization	Ruizhe Wang et.al.	2501.17116	null
2025-01-28	Unlocking Transparent Alignment Through Enhanced Inverse Constitutional AI for Principle Extraction	Carl-Leander Henneking et.al.	2501.17112	null
2025-01-28	COS(M+O)S: Curiosity and RL-Enhanced MCTS for Exploring Story Space via Language Models	Tobias Materzok et.al.	2501.17104	null
2025-01-28	Token-by-Token Regeneration and Domain Biases: A Benchmark of LLMs on Advanced Mathematical Problem-Solving	Evgenii Evstafev et.al.	2501.17084	null
2025-01-28	Contextual Self-paced Learning for Weakly Supervised Spatio-Temporal Video Grounding	Akash Kumar et.al.	2501.17053	null
2025-01-28	How Linguistics Learned to Stop Worrying and Love the Language Models	Richard Futrell et.al.	2501.17047	null
2025-01-28	Enhanced Retrieval of Long Documents: Leveraging Fine-Grained Block Representations with Large Language Models	Minghan Li et.al.	2501.17039	null
2025-01-28	Challenges in Ensuring AI Safety in DeepSeek-R1 Models: The Shortcomings of Reinforcement Learning Strategies	Manojkumar Parmar et.al.	2501.17030	null
2025-01-28	Automated Refactoring of Non-Idiomatic Python Code: A Differentiated Replication with LLMs	Alessandro Midolo et.al.	2501.17024	link
2025-01-28	Mobile Manipulation Instruction Generation from Multiple Images with Automatic Metric Enhancement	Kei Katsumata et.al.	2501.17022	link
2025-01-28	Large Language Models for Code Generation: The Practitioners Perspective	Zeeshan Rasheed et.al.	2501.16998	link
2025-01-28	Artificial Intelligence Clones	Annie Liang et.al.	2501.16996	null
2025-01-28	FedEFM: Federated Endovascular Foundation Model with Unseen Data	Tuong Do et.al.	2501.16992	null
2025-01-28	Modulating CNN Features with Pre-Trained ViT Representations for Open-Vocabulary Object Detection	Xiangyu Gao et.al.	2501.16981	null
2025-01-27	LUCY: Linguistic Understanding and Control Yielding Early Stage of Her	Heting Gao et.al.	2501.16327	link
2025-01-27	Evaluating The Performance of Using Large Language Models to Automate Summarization of CT Simulation Orders in Radiation Oncology	Meiyun Cao et.al.	2501.16309	null
2025-01-27	RAPID: Retrieval-Augmented Parallel Inference Drafting for Text-Based Video Event Retrieval	Long Nguyen et.al.	2501.16303	null
2025-01-27	Matryoshka Re-Ranker: A Flexible Re-Ranking Architecture With Configurable Depth and Width	Zheng Liu et.al.	2501.16302	null
2025-01-27	Large Models in Dialogue for Active Perception and Anomaly Detection	Tzoulio Chamiti et.al.	2501.16300	link
2025-01-27	FALCON: Resolving Visual Redundancy and Fragmentation in High-resolution Multimodal Large Language Models via Visual Registers	Renshan Zhang et.al.	2501.16297	null
2025-01-27	Brain-Adapter: Enhancing Neurological Disorder Analysis with Adapter-Tuning Multimodal Large Language Models	Jing Zhang et.al.	2501.16282	null
2025-01-27	Do LLMs Have Visualization Literacy? An Evaluation on Modified Visualizations to Test Generalization in Data Interpretation	Jiayi Hong et.al.	2501.16277	link
2025-01-27	URAG: Implementing a Unified Hybrid RAG for Precise Answers in University Admission Chatbots -- A Case Study at HCMUT	Long Nguyen et.al.	2501.16276	null
2025-01-27	Return of the Encoder: Maximizing Parameter Efficiency for SLMs	Mohamed Elfeki et.al.	2501.16273	link
2025-01-27	A foundation model for human-AI collaboration in medical literature mining	Zifeng Wang et.al.	2501.16255	null
2025-01-27	Multi-Agent Geospatial Copilots for Remote Sensing Workflows	Chaehong Lee et.al.	2501.16254	null
2025-01-27	Zero-Shot Decision Tree Construction via Large Language Models	Lucas Carrasco et.al.	2501.16247	null
2025-01-27	CLISC: Bridging clip and sam by enhanced cam for unsupervised brain tumor segmentation	Xiaochuan Ma et.al.	2501.16246	null
2025-01-27	Phase Transitions in Large Language Models and the $O(N)$ Model	Youran Sun et.al.	2501.16241	null
2025-01-27	AiGet: Transforming Everyday Moments into Hidden Knowledge Discovery with AI Assistance on Smart Glasses	Runze Cai et.al.	2501.16240	link
2025-01-27	Distilling foundation models for robust and efficient models in digital pathology	Alexandre Filiot et.al.	2501.16239	null
2025-01-27	Language-Based Bayesian Optimization Research Assistant (BORA)	Abdoulatif Cissé et.al.	2501.16224	null
2025-01-27	Enhancing Visual Inspection Capability of Multi-Modal Large Language Models on Medical Time Series with Supportive Conformalized and Interpretable Small Specialized Models	Huayu Li et.al.	2501.16215	link
2025-01-27	Provence: efficient and robust context pruning for retrieval-augmented generation	Nadezhda Chirkova et.al.	2501.16214	null
2025-01-24	HERMES: A Unified Self-Driving World Model for Simultaneous 3D Scene Understanding and Generation	Xin Zhou et.al.	2501.14729	link
2025-01-24	Do LLMs Provide Consistent Answers to Health-Related Questions across Languages?	Ipek Baris Schlicht et.al.	2501.14719	null
2025-01-24	Towards Better Understanding Table Instruction Tuning: Decoupling the Effects from Data versus Models	Naihao Deng et.al.	2501.14717	null
2025-01-24	FlexiGPT: Pruning and Extending Large Language Models with Low-Rank Weight Sharing	James Seale Smith et.al.	2501.14713	null
2025-01-24	The Karp Dataset	Mason DiCicco et.al.	2501.14705	null
2025-01-24	Rethinking Table Instruction Tuning	Naihao Deng et.al.	2501.14693	null
2025-01-24	Rethinking Foundation Models for Medical Image Classification through a Benchmark Study on MedMNIST	Fuping Wu et.al.	2501.14685	null
2025-01-24	An Empirical Study on LLM-based Classification of Requirements-related Provisions in Food-safety Regulations	Shabnam Hassani et.al.	2501.14683	null
2025-01-24	Diffusion based Text-to-Music Generation with Global and Local Text based Conditioning	Jisi Zhang et.al.	2501.14680	null
2025-01-24	MedAgentBench: Dataset for Benchmarking LLMs as Agents in Medical Applications	Yixing Jiang et.al.	2501.14654	link
2025-01-24	Investigating the (De)Composition Capabilities of Large Language Models in Natural-to-Formal Language Conversion	Ziyao Xu et.al.	2501.14649	link
2025-01-24	Recommending Actionable Strategies: A Semantic Approach to Integrating Analytical Frameworks with Decision Heuristics	Renato Ghisellini et.al.	2501.14634	null
2025-01-24	Extracting Problem Structure with LLMs for Optimized SAT Local Search	André Schilder et.al.	2501.14630	null
2025-01-24	ReferDINO: Referring Video Object Segmentation with Visual Grounding Foundations	Tianming Liang et.al.	2501.14607	null
2025-01-24	Knowledge Graphs Construction from Criminal Court Appeals: Insights from the French Cassation Court	Alexander V. Belikov et.al.	2501.14579	null
2025-01-24	ZETA: Leveraging Z-order Curves for Efficient Top-k Attention	Qiuhao Zeng et.al.	2501.14577	null
2025-01-24	Large-scale and Fine-grained Vision-language Pre-training for Enhanced CT Image Understanding	Zhongyi Shui et.al.	2501.14548	link
2025-01-24	Leveraging ChatGPT's Multimodal Vision Capabilities to Rank Satellite Images by Poverty Level: Advancing Tools for Social Science Research	Hamid Sarmadi et.al.	2501.14546	null
2025-01-24	VERUS-LM: a Versatile Framework for Combining LLMs with Symbolic Reasoning	Benjamin Callewaert et.al.	2501.14540	null
2025-01-24	Design and Implementation of a Psychiatry Resident Training System Based on Large Language Models	Zhenguang Zhong et.al.	2501.14530	link
2025-01-23	CRPO: Confidence-Reward Driven Preference Optimization for Machine Translation	Guofeng Cui et.al.	2501.13927	null
2025-01-23	The Breeze 2 Herd of Models: Traditional Chinese LLMs Based on Llama with Vision-Aware and Function-Calling Capabilities	Chan-Jan Hsu et.al.	2501.13921	link
2025-01-23	Analysis of Indic Language Capabilities in LLMs	Aatman Vaidya et.al.	2501.13912	null
2025-01-23	Privacy-Preserving Personalized Federated Prompt Learning for Multimodal Large Language Models	Linh Tran et.al.	2501.13904	null
2025-01-23	Exploring Finetuned Audio-LLM on Heart Murmur Features	Adrian Florea et.al.	2501.13884	null
2025-01-23	The machine learning platform for developers of large systems	Alexey Naikov et.al.	2501.13881	null
2025-01-23	A RAG-Based Institutional Assistant	Gustavo Kuratomi et.al.	2501.13880	null
2025-01-23	Dual-Modal Prototype Joint Learning for Compositional Zero-Shot Learning	Shiyu Zhang et.al.	2501.13859	null
2025-01-23	Large Vision-Language Models for Knowledge-Grounded Data Annotation of Memes	Shiling Deng et.al.	2501.13851	link
2025-01-23	Think Outside the Data: Colonial Biases and Systemic Issues in Automated Moderation Pipelines for Low-Resource Languages	Farhana Shahid et.al.	2501.13836	null
2025-01-23	On the Reasoning Capacity of AI Models and How to Quantify It	Santosh Kumar Radha et.al.	2501.13833	null
2025-01-23	Predicting Compact Phrasal Rewrites with Large Language Models for ASR Post Editing	Hao Zhang et.al.	2501.13831	null
2025-01-23	Hallucinations Can Improve Large Language Models in Drug Discovery	Shuzhou Yuan et.al.	2501.13824	null
2025-01-23	Large Language Model driven Policy Exploration for Recommender Systems	Jie Wang et.al.	2501.13816	null
2025-01-23	Enhancing LLMs for Governance with Human Oversight: Evaluating and Aligning LLMs on Expert Classification of Climate Misinformation for Detecting False or Misleading Claims about Climate Change	Mowafak Allaham et.al.	2501.13802	null
2025-01-23	PromptMono: Cross Prompting Attention for Self-Supervised Monocular Depth Estimation in Challenging Environments	Changhao Wang et.al.	2501.13796	null
2025-01-23	Training-Free Zero-Shot Temporal Action Detection with Vision-Language Models	Chaolei Han et.al.	2501.13795	link
2025-01-23	Parameter-Efficient Fine-Tuning for Foundation Models	Dan Zhang et.al.	2501.13787	link
2025-01-23	Not Every AI Problem is a Data Problem: We Should Be Intentional About Data Scaling	Tanya Rodchenko et.al.	2501.13779	null
2025-01-23	Explainable XR: Understanding User Behaviors of XR Environments using LLM-assisted Analytics Framework	Yoonsang Kim et.al.	2501.13778	link
2025-01-22	VideoLLaMA 3: Frontier Multimodal Foundation Models for Image and Video Understanding	Boqiang Zhang et.al.	2501.13106	link
2025-01-22	Refining Input Guardrails: Enhancing LLM-as-a-Judge Efficiency Through Chain-of-Thought Fine-Tuning and Alignment	Melissa Kazemi Rad et.al.	2501.13080	null
2025-01-22	Autonomy-of-Experts Models	Ang Lv et.al.	2501.13074	null
2025-01-22	Does Table Source Matter? Benchmarking and Improving Multimodal Scientific Table Understanding and Reasoning	Bohao Yang et.al.	2501.13042	link
2025-01-22	Pairwise RM: Perform Best-of-N Sampling with Knockout Tournament	Yantao Liu et.al.	2501.13007	link
2025-01-22	Large Language Model-Based Semantic Communication System for Image Transmission	Soheyb Ribouh et.al.	2501.12988	null
2025-01-22	LLM4WM: Adapting LLM for Wireless Multi-Tasking	Xuanyu Liu et.al.	2501.12983	null
2025-01-22	OnionEval: An Unified Evaluation of Fact-conflicting Hallucination for Small-Large Language Models	Chongren Sun et.al.	2501.12975	link
2025-01-22	Accessible Smart Contracts Verification: Synthesizing Formal Models with Tamed LLMs	Jan Corazza et.al.	2501.12972	link
2025-01-22	It's complicated. The relationship of algorithmic fairness and non-discrimination regulations in the EU AI Act	Kristof Meding et.al.	2501.12962	null
2025-01-22	Efficient Prompt Compression with Evaluator Heads for Long-Context Transformer Inference	Weizhi Fei et.al.	2501.12959	null
2025-01-22	GANQ: GPU-Adaptive Non-Uniform Quantization for Large Language Models	Pengxiang Zhao et.al.	2501.12956	null
2025-01-22	Correctness Assessment of Code Generated by Large Language Models Using Internal Representations	Tuan-Dung Bui et.al.	2501.12934	link
2025-01-22	DynamicEarth: How Far are We from Open-Vocabulary Change Detection?	Kaiyu Li et.al.	2501.12931	null
2025-01-22	A Functional Software Reference Architecture for LLM-Integrated Systems	Alessio Bucaioni et.al.	2501.12904	null
2025-01-22	Architectural Fusion Through Contextual Partitioning in Large Language Models: A Novel Approach to Parameterized Knowledge Integration	Offa Kingsleigh et.al.	2501.12901	null
2025-01-22	Test-Time Preference Optimization: On-the-Fly Alignment via Iterative Textual Feedback	Yafu Li et.al.	2501.12895	link
2025-01-22	Generative AI Misuse Potential in Cyber Security Education: A Case Study of a UK Degree Program	Carlton Shepherd et.al.	2501.12883	null
2025-01-22	WisdomBot: Tuning Large Language Models with Artificial Intelligence Knowledge	Jingyuan Chen et.al.	2501.12877	null
2025-01-22	HierPromptLM: A Pure PLM-based Framework for Representation Learning on Heterogeneous Text-rich Networks	Qiuyu Zhu et.al.	2501.12857	null
2025-01-21	InternVideo2.5: Empowering Video MLLMs with Long and Rich Context Modeling	Yi Wang et.al.	2501.12386	link
2025-01-21	MMVU: Measuring Expert-Level Multi-Discipline Video Understanding	Yilun Zhao et.al.	2501.12380	link
2025-01-21	Expertise elevates AI usage: experimental evidence comparing laypeople and professional artists	Thomas F. Eisenmann et.al.	2501.12374	link
2025-01-21	Is Long Context All You Need? Leveraging LLM's Extended Context for NL2SQL	Yeounoh Chung et.al.	2501.12372	link
2025-01-21	Parameters vs FLOPs: Scaling Laws for Optimal Sparsity for Mixture-of-Experts Language Models	Samira Abnar et.al.	2501.12370	null
2025-01-21	InternLM-XComposer2.5-Reward: A Simple Yet Effective Multi-Modal Reward Model	Yuhang Zang et.al.	2501.12368	link
2025-01-21	Vision-Language Models for Automated Chest X-ray Interpretation: Leveraging ViT and GPT-2	Md. Rakibul Islam et.al.	2501.12356	null
2025-01-21	Automatic Labelling with Open-source LLMs using Dynamic Label Schema Integration	Thomas Walshe et.al.	2501.12332	null
2025-01-21	Cinepro: Robust Training of Foundation Models for Cancer Detection in Prostate Ultrasound Cineloops	Mohamed Harmanani et.al.	2501.12331	link
2025-01-21	VARGPT: Unified Understanding and Generation in a Visual Autoregressive Multimodal Large Language Model	Xianwei Zhuang et.al.	2501.12327	link
2025-01-21	LLM-Assisted Knowledge Graph Completion for Curriculum and Domain Modelling in Personalized Higher Education Recommendations	Hasan Abu-Rasheed et.al.	2501.12300	null
2025-01-21	MoGERNN: An Inductive Traffic Predictor for Unobserved Locations in Dynamic Sensing Networks	Qishen Zhou et.al.	2501.12281	link
2025-01-21	Condor: Enhance LLM Alignment with Knowledge-Driven Data Synthesis and Refinement	Maosong Cao et.al.	2501.12273	link
2025-01-21	CBVLM: Training-free Explainable Concept-based Large Vision Language Models for Medical Image Classification	Cristiano Patrício et.al.	2501.12266	null
2025-01-21	FOCUS: First Order Concentrated Updating Scheme	Yizhou Liu et.al.	2501.12243	null
2025-01-21	InsTALL: Context-aware Instructional Task Assistance with Multi-modal Large Language Models	Pha Nguyen et.al.	2501.12231	null
2025-01-21	CDW-CoT: Clustered Distance-Weighted Chain-of-Thoughts Reasoning	Yuanheng Fang et.al.	2501.12226	null
2025-01-21	Leveraging Large Language Models for Realizing Truly Intelligent User Interfaces	Allard Oelen et.al.	2501.12221	null
2025-01-21	You Can't Eat Your Cake and Have It Too: The Performance Degradation of LLMs with Jailbreak Defense	Wuyuao Mai et.al.	2501.12210	null
2025-01-21	Fixing Imbalanced Attention to Mitigate In-Context Hallucination of Large Vision-Language Model	Kazi Hasan Ibn Arif et.al.	2501.12206	link
2025-01-17	FaceXBench: Evaluating Multimodal LLMs on Face Understanding	Kartik Narayan et.al.	2501.10360	link
2025-01-17	Agent4Edu: Generating Learner Response Data by Generative Agents for Intelligent Education Systems	Weibo Gao et.al.	2501.10332	link
2025-01-17	BoK: Introducing Bag-of-Keywords Loss for Interpretable Dialogue Response Generation	Suvodip Dey et.al.	2501.10328	link
2025-01-17	Large language models for automated scholarly paper review: A survey	Zhenzhen Zhuang et.al.	2501.10326	null
2025-01-17	Hierarchical Autoregressive Transformers: Combining Byte- and Word-Level Processing for Robust, Adaptable Language Models	Pit Neitemeier et.al.	2501.10322	null
2025-01-17	HiMix: Reducing Computational Complexity in Large Vision-Language Models	Xuange Zhang et.al.	2501.10318	null
2025-01-17	Addressing Popularity Bias in Third-Party Library Recommendations Using LLMs	Claudio Di Sipio et.al.	2501.10313	null
2025-01-17	Computational Protein Science in the Era of Large Language Models (LLMs)	Wenqi Fan et.al.	2501.10282	null
2025-01-17	Test Wars: A Comparative Study of SBST, Symbolic Execution, and LLM-Based Approaches to Unit Test Generation	Azat Abdullin et.al.	2501.10200	null
2025-01-17	Generative Artificial Intelligence: Implications for Biomedical and Health Professions Education	William Hersh et.al.	2501.10186	null
2025-01-17	Multi-stage Training of Bilingual Islamic LLM for Neural Passage Retrieval	Vera Pavlova et.al.	2501.10175	null
2025-01-17	Dual Debiasing: Remove Stereotypes and Keep Factual Gender for Fair Language Modeling and Translation	Tomasz Limisiewicz et.al.	2501.10150	null
2025-01-17	A Vision-Language Framework for Multispectral Scene Representation Using Language-Grounded Features	Enes Karanfil et.al.	2501.10144	null
2025-01-17	Exploring the Impact of Generative Artificial Intelligence in Education: A Thematic Analysis	Abhishek Kaushik et.al.	2501.10134	null
2025-01-17	ComplexFuncBench: Exploring Multi-Step and Constrained Function Calling under Long-Context Scenario	Lucen Zhong et.al.	2501.10132	link
2025-01-17	PaSa: An LLM Agent for Comprehensive Academic Paper Search	Yichen He et.al.	2501.10120	link
2025-01-17	LLM Reasoner and Automated Planner: A new NPC approach	Israel Puerta-Merino et.al.	2501.10106	null
2025-01-17	Universal Actions for Enhanced Embodied Foundation Models	Jinliang Zheng et.al.	2501.10105	link
2025-01-17	Few-shot Structure-Informed Machinery Part Segmentation with Foundation Models and Graph Neural Networks	Michael Schwingshackl et.al.	2501.10080	link
2025-01-17	SpatialCoT: Advancing Spatial Reasoning through Coordinate Alignment and Chain-of-Thought for Embodied Task Planning	Yuecheng Liu et.al.	2501.10074	null
2025-01-16	Distilling Multi-modal Large Language Models for Autonomous Driving	Deepti Hegde et.al.	2501.09757	null
2025-01-16	Lost in Translation, Found in Context: Sign Language Translation with Contextual Cues	Youngjoon Jang et.al.	2501.09754	null
2025-01-16	OmniThink: Expanding Knowledge Boundaries in Machine Writing through Thinking	Zekun Xi et.al.	2501.09751	link
2025-01-16	Enhancing Lexicon-Based Text Embeddings with Large Language Models	Yibin Lei et.al.	2501.09749	null
2025-01-16	Suggesting Code Edits in Interactive Machine Learning Notebooks Using Large Language Models	Bihui Jin et.al.	2501.09745	null
2025-01-16	Inference-Time Scaling for Diffusion Models beyond Scaling Denoising Steps	Nanye Ma et.al.	2501.09732	null
2025-01-16	A Simple Aerial Detection Baseline of Multimodal Language Models	Qingyun Li et.al.	2501.09720	link
2025-01-16	CyberMentor: AI Powered Learning Tool Platform to Address Diverse Student Needs in Cybersecurity Education	Tianyu Wang et.al.	2501.09709	link
2025-01-16	Domain Adaptation of Foundation LLMs for e-Commerce	Christian Herold et.al.	2501.09706	null
2025-01-16	Cueless EEG imagined speech for subject identification: dataset and benchmarks	Ali Derakhshesh et.al.	2501.09700	link
2025-01-16	Mitigating Hallucinations in Large Vision-Language Models via DPO: On-Policy Data Hold the Key	Zhihe Yang et.al.	2501.09695	link
2025-01-16	Simulated Interactive Debugging	Yannic Noller et.al.	2501.09694	null
2025-01-16	Towards Large Reasoning Models: A Survey of Reinforced Reasoning with Large Language Models	Fengli Xu et.al.	2501.09686	null
2025-01-16	Reward-Guided Controlled Generation for Inference-Time Alignment in Diffusion Models: Tutorial and Review	Masatoshi Uehara et.al.	2501.09685	null
2025-01-16	Robin: a Suite of Multi-Scale Vision-Language Models and the CHIRP Evaluation Benchmark	Alexis Roger et.al.	2501.09672	null
2025-01-16	A Survey of Research in Large Language Models for Electronic Design Automation	Jingyu Pan et.al.	2501.09655	null
2025-01-16	The Heap: A Contamination-Free Multilingual Code Dataset for Evaluating Large Language Models	Jonathan Katzy et.al.	2501.09653	null
2025-01-16	CarMem: Enhancing Long-Term Memory in LLM Voice Assistants through Category-Bounding	Johannes Kirmayr et.al.	2501.09645	link
2025-01-16	LLM-Based Routing in Mixture of Experts: A Novel Framework for Trading	Kuan-Ming Liu et.al.	2501.09636	null
2025-01-16	Empowering Large Language Models in Wireless Communication: A Novel Dataset and Fine-Tuning Framework	Yushen Lin et.al.	2501.09631	null
2025-01-15	Towards Fast, Specialized Machine Learning Force Fields: Distilling Foundation Models via Energy Hessians	Ishan Amin et.al.	2501.09009	link
2025-01-15	Aegis2.0: A Diverse AI Safety Dataset and Risks Taxonomy for Alignment of LLM Guardrails	Shaona Ghosh et.al.	2501.09004	null
2025-01-15	Vision Foundation Models for Computed Tomography	Suraj Pai et.al.	2501.09001	link
2025-01-15	CityLoc: 6 DoF Localization of Text Descriptions in Large-Scale Scenes with Gaussian Representation	Qi Ma et.al.	2501.08982	null
2025-01-15	Development and Validation of the Provider Documentation Summarization Quality Instrument for Large Language Models	Emma Croxford et.al.	2501.08977	null
2025-01-15	Learning to Extract Cross-Domain Aspects and Understanding Sentiments Using Large Language Models	Karukriti Kaushik Ghosh et.al.	2501.08974	null
2025-01-15	Analyzing the Ethical Logic of Six Large Language Models	W. Russell Neuman et.al.	2501.08951	null
2025-01-15	Applying General Turn-taking Models to Conversational Human-Robot Interaction	Gabriel Skantze et.al.	2501.08946	null
2025-01-15	Disentangling Exploration of Large Language Models by Optimal Exploitation	Tim Grams et.al.	2501.08925	null
2025-01-15	GenAI Content Detection Task 3: Cross-Domain Machine-Generated Text Detection Challenge	Liam Dugan et.al.	2501.08913	link
2025-01-15	Leveraging Large Language Models as Knowledge-Driven Agents for Reliable Retrosynthesis Planning	Qinyu Ma et.al.	2501.08897	link
2025-01-15	Generative Planning with 3D-vision Language Pre-training for End-to-End Autonomous Driving	Tengpeng Li et.al.	2501.08861	link
2025-01-15	Exploring Task-Level Optimal Prompts for Visual In-Context Learning	Yan Zhu et.al.	2501.08841	null
2025-01-15	IDEA: Image Description Enhanced CLIP-Adapter	Zhipeng Ye et.al.	2501.08816	link
2025-01-15	How Developers Interact with AI: A Taxonomy of Human-AI Collaboration in Software Engineering	Christoph Treude et.al.	2501.08774	null
2025-01-15	Admitting Ignorance Helps the Video Question Answering Models to Answer	Haopeng Li et.al.	2501.08771	null
2025-01-15	Enhanced Large Language Models for Effective Screening of Depression and Anxiety	June M. Liu et.al.	2501.08769	null
2025-01-15	Leveraging LLM Agents for Translating Network Configurations	Yunze Wei et.al.	2501.08760	null
2025-01-15	Expanding Vietnamese SentiWordNet to Improve Performance of Vietnamese Sentiment Analysis Models	Hong-Viet Tran et.al.	2501.08758	null
2025-01-15	The Inherent Limits of Pretrained LLMs: The Unexpected Convergence of Instruction Tuning and In-Context Learning Capabilities	Irina Bigoulaeva et.al.	2501.08716	link
2025-01-14	PokerBench: Training Large Language Models to become Professional Poker Players	Richard Zhuang et.al.	2501.08328	link
2025-01-14	Omni-RGPT: Unifying Image and Video Region-level Understanding via Token Marks	Miran Heo et.al.	2501.08326	null
2025-01-14	ADAM-1: AI and Bioinformatics for Alzheimer's Detection and Microbiome-Clinical Data Integrations	Ziyuan Huang et.al.	[2501.08324](http://arxiv.org/abs/2501.0
